CN102298575A - Method and system for copying and pasting Word file content with format - Google Patents

Info

Publication number
CN102298575A
Authority
CN
China
Prior art keywords
file
xml
word
word file
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010102113241A
Other languages
Chinese (zh)
Inventor
李彦娜
魏超鹏
尚高峰
岳永强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN2010102113241A
Publication of CN102298575A
Legal status: Pending

Abstract

The invention discloses a method and a system for copying and pasting Word file content with format, belonging to the technical field of printing and typesetting. The method comprises the following steps: first, creating a temporary Word file in docx format and pasting the Word file content to be copied into the temporary Word file; then, obtaining an XML (Extensible Markup Language) source file from the temporary Word file; converting the XML source file into an XML target file that the target software can recognize; and finally, importing the data of the XML target file into the target software. With the method and the system, all content in a Word file can be copied directly into professional typesetting software, and there is no need to set up text formatting or objects such as graphics, images and tables again in the professional typesetting software, so that the procedure of copying Word file content into professional typesetting software is greatly simplified and typesetting efficiency is improved.

Description

Method and system for copying and pasting Word file content with format
Technical field
The invention belongs to the technical field of printing and typesetting, and specifically relates to a method and system for copying and pasting Word file content with format. It is particularly suitable for copying Word file content that contains objects such as graphics, images and tables into typesetting software.
Background art
In the typesetting and printing field, most users' manuscripts are currently written with Microsoft's Word word processor. Word can present very rich content, including objects such as text, graphics, images and tables, together with the various styles and attributes set on those objects. When content in a Word file is copied into professional typesetting software, the traditional copy-and-paste method retains only the text content of the Word file, and the text formatting is lost. Moreover, if the Word file contains objects such as graphics, images or tables, these objects cannot be copied into the professional typesetting software; they must be entered again there and the text formatting must be reset, which reduces typesetting efficiency.
Summary of the invention
In view of the defects in the prior art, the object of the invention is to provide a method and system for copying and pasting Word file content with format that can improve typesetting efficiency.
To achieve this object, the invention adopts the following technical solution:
A method for copying and pasting Word file content with format comprises the following steps:
(1) creating a temporary Word file in docx format and pasting the Word file content to be copied into the temporary Word file;
(2) obtaining an XML source file of the Word file content to be copied from the temporary Word file;
(3) converting the XML source file into an XML target file that the target software can recognize;
(4) importing the data of the XML target file into the target software.
In the method described above, in step (1), the version of the Word program is determined before the temporary Word file is created. If the version is Word 2007 or higher, a temporary Word file in docx format is created directly; otherwise, a temporary Word file in doc format is created first and then converted into a temporary Word file in docx format.
In the method described above, in step (2), the XML source file is obtained by decompressing the temporary Word file in docx format and taking the document.xml file in the word folder produced by decompression as the XML source file.
In the method described above, in step (3), XSLT is used to convert the XML source file into the XML target file according to the XML format that the target software can recognize. XSLT is a language for transforming XML documents into other XML documents.
In the method described above, the XSLT conversion templates are written according to the XML Schema Definition that the XML source file conforms to and the XML Schema Definition that the XML target file conforms to, and the XML source file is converted into the XML target file.
In the method described above, while the XML source file is being converted into the XML target file, XSLT uses the XML Path Language to find the parts of the XML source file that match one or more predefined templates. Once a match is found, XSLT converts the matching part of the XML source file into content of the XML target file according to the template.
In the method described above, the data of the XML target file is imported into the target software in step (4) as follows: according to the XML Schema Definition that the XML target file conforms to, the target software traverses and processes all XML nodes of the XML target file in a loop starting from the root node, and creates the corresponding objects and sets their attributes in the target software according to the content and attribute data recorded in each XML node.
A system for copying and pasting Word file content with format comprises: a creation device for creating a temporary Word file in docx format and pasting the Word file content to be copied into the temporary Word file; an acquisition device for obtaining the XML source file of the Word file content to be copied from the temporary Word file; a conversion device A for converting the XML source file into an XML target file that the target software can recognize; and an import device for importing the data of the XML target file into the target software.
In the system described above, the system further comprises a judgment device for determining the version of the Word program.
In the system described above, the system further comprises a conversion device B. When the judgment device determines that the version of the Word program is lower than Word 2007, the creation device creates a temporary Word file in doc format, and the conversion device B converts it into a temporary Word file in docx format.
With the method and system of the invention, all content in a Word file can be copied directly into professional typesetting software, without re-entering text formatting or objects such as graphics, images and tables in the professional typesetting software. This greatly simplifies the process of copying Word file content into professional typesetting software and thereby improves typesetting efficiency.
Description of drawings
Fig. 1 is a structural block diagram of a preferred embodiment of the system of the invention;
Fig. 2 is a flowchart of a preferred embodiment of the method of the invention;
Fig. 3 is a flowchart of the method in the example;
Fig. 4 is a schematic diagram of the Word file content to be copied in the example;
Fig. 5 is a schematic diagram of the file structure after the temporary Word file in docx format is decompressed in the example;
Fig. 6 is a schematic diagram of the description of the table in the document.xml file in the example;
Fig. 7 is a schematic diagram of the description of the table in the XML target file in the example;
Fig. 8 is a schematic diagram of the XSLT corresponding to the description of the table in the XML target file in the example;
Fig. 9 is a schematic diagram of the result after the XML target file data is imported into the Feiteng Chuangyi (飞腾创艺) typesetting software in the example.
Detailed description
The invention is described below with reference to an embodiment and the accompanying drawings.
Fig. 1 shows the structure of a preferred embodiment of the system of the invention. The system comprises a creation device 11; an acquisition device 12, a judgment device 15 and a conversion device B 16, each connected to the creation device 11; a conversion device A 13 connected to the acquisition device 12; and an import device 14 connected to the conversion device A 13.
The judgment device 15 determines the version of the Word program. When the version is Word 2007 or higher, the creation device 11 directly creates a temporary Word file in docx format and pastes the Word file content to be copied into it. When the version is Word 2003 or lower, the creation device 11 first creates a temporary Word file in doc format, the conversion device B 16 converts it into a temporary Word file in docx format, and the creation device 11 then pastes the Word file content to be copied into the converted temporary Word file in docx format. The acquisition device 12 obtains the XML source file of the Word file content to be copied from the temporary Word file in docx format. The conversion device A 13 converts the XML source file into an XML target file that the target software can recognize. The import device 14 imports the data of the XML target file into the target software.
Fig. 2 shows a flowchart of a preferred embodiment of the method of the invention. The method comprises the following steps:
(1) A temporary Word file in docx format is created, and the Word file content to be copied is pasted into the temporary Word file.
Because docx is the file format of Word 2007 and later versions, the version of the Word program must be determined before the temporary Word file in docx format is created. If the version is Word 2007 or higher, a temporary Word file in docx format is created directly; otherwise, a temporary Word file in doc format is created first and then converted into a temporary Word file in docx format, as shown in Fig. 3. The conversion from doc format to docx format can be performed with Microsoft's compatibility conversion program.
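The version branch in step (1) can be sketched as a small decision function. This is only an illustration, assuming the Word version is available as a string such as "12.0" (Word 2007 reports major version 12 and Word 2003 reports 11). The function name `temp_file_plan` and the step labels are hypothetical and do not come from the patent; the ordering of the fallback branch follows the description of Fig. 1.

```python
from typing import List

WORD_2007_MAJOR = 12  # Word 2007 is version 12.0; Word 2003 is 11.0


def temp_file_plan(word_version: str) -> List[str]:
    """Return the ordered steps for preparing the temporary docx file.

    The step labels are illustrative placeholders, not real commands.
    """
    major = int(word_version.split(".")[0])
    if major >= WORD_2007_MAJOR:
        # Word 2007 or higher can create docx directly.
        return ["create temp.docx", "paste content"]
    # Older versions create a doc file first and convert it to docx.
    return ["create temp.doc", "convert temp.doc to temp.docx", "paste content"]
```

For example, `temp_file_plan("11.0")` yields the three-step plan with the conversion in the middle, while `temp_file_plan("12.0")` skips it.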
(2) The XML source file of the Word file content to be copied is obtained from the temporary Word file.
According to the Open Packaging Conventions, the temporary Word file in docx format is decompressed, and the document.xml file in the word folder produced by decompression is taken as the XML (Extensible Markup Language) source file. The document.xml file is in OOXML (Office Open XML) format. OOXML is a technical specification that Microsoft developed for its Office 2007 products and that has since become an international document format standard.
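Because a docx file is a ZIP package, step (2) can be sketched with Python's standard `zipfile` module. The helper names `read_document_xml` and `make_sample_docx` are assumptions made for this illustration, and the sample package built here is a minimal stand-in, not a document that Word itself would open.

```python
import io
import zipfile

DOCUMENT_PART = "word/document.xml"  # main document part inside the package
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_bytes: bytes) -> bytes:
    """Open the docx package and return the raw bytes of word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return package.read(DOCUMENT_PART)


def make_sample_docx() -> bytes:
    """Build a tiny in-memory ZIP that mimics the layout of a docx package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr(
            DOCUMENT_PART,
            '<w:document xmlns:w="%s"><w:body/></w:document>' % W_NS,
        )
    return buffer.getvalue()


xml_source = read_document_xml(make_sample_docx())
```

With a real temporary file, the bytes would come from reading the docx file from disk instead of `make_sample_docx()`.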
(3) The XML source file is converted into an XML target file that the target software can recognize.
According to the XML format that the target software can recognize, XSLT (Extensible Stylesheet Language Transformations) is used to convert the XML source file into the XML target file. XSLT is a language for transforming XML documents into other XML documents.
The format of the XML target file conforms to an agreed XSD (XML Schema Definition). The XSLT conversion templates are written according to the XSD that the XML source file (in OOXML format) conforms to and the XSD that the XML target file conforms to, and the XML source file is converted into the XML target file. During the conversion, XSLT uses XPath (XML Path Language) to find the parts of the XML source file that match one or more predefined templates. Once a match is found, XSLT converts the matching part of the XML source file into content of the XML target file according to the template. XPath is a language designed for locating information in XML documents and is an important component of XSLT.
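The template-matching idea can be imitated in plain Python. The sketch below is not XSLT: it uses `xml.etree.ElementTree` and a dictionary of handler functions as a stand-in for XSLT templates, with ElementTree's limited XPath support for selection. The target element names (`story`, `para`, `table`, `row`, `cell`) are hypothetical, since the text does not disclose the target schema.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

SOURCE = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:r><w:t>Hello</w:t></w:r></w:p>
<w:tbl><w:tr><w:tc><w:p><w:r><w:t>A1</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
</w:body></w:document>"""


def paragraph_template(node):
    """Map a w:p node to a hypothetical <para> holding its concatenated text."""
    out = ET.Element("para")
    out.text = "".join(t.text or "" for t in node.findall(".//w:t", NS))
    return out


def table_template(node):
    """Map a w:tbl node to a hypothetical <table>/<row>/<cell> structure."""
    out = ET.Element("table")
    for tr in node.findall("w:tr", NS):
        row = ET.SubElement(out, "row")
        for tc in tr.findall("w:tc", NS):
            cell = ET.SubElement(row, "cell")
            for p in tc.findall("w:p", NS):
                cell.append(paragraph_template(p))
    return out


# Predefined templates keyed by the qualified source tag they match.
TEMPLATES = {f"{{{W}}}p": paragraph_template, f"{{{W}}}tbl": table_template}


def transform(source_xml: str) -> ET.Element:
    """Apply the matching template to each child of w:body, in document order."""
    body = ET.fromstring(source_xml).find("w:body", NS)
    target = ET.Element("story")
    for child in body:
        template = TEMPLATES.get(child.tag)
        if template is not None:
            target.append(template(child))
    return target


result = transform(SOURCE)
```

In an actual XSLT stylesheet, each handler would correspond to an `xsl:template` with a `match` pattern, and source nodes without a matching template would fall back to XSLT's built-in rules rather than being skipped as they are here.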
(4) The data of the XML target file is imported into the target software.
According to the XSD that the XML target file conforms to, the target software traverses and processes all XML nodes of the XML target file in a loop starting from the root node, and creates the corresponding objects and sets their attributes in the target software according to the content and attribute data recorded in each XML node, thereby importing the content of the XML target file into the target software. The import recurses from the root node of the XML target file, importing the current node and then its child nodes, until all nodes in the XML target file have been imported.
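The recursive import in step (4) can be sketched as a depth-first walk that turns every XML node into an object with properties. The `LayoutObject` class, the function `import_node` and the sample target XML are all hypothetical; a real typesetting program would create its own native objects instead.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LayoutObject:
    """Hypothetical stand-in for an object created in the target software."""
    kind: str
    properties: Dict[str, str]
    text: str = ""
    children: List["LayoutObject"] = field(default_factory=list)


def import_node(node: ET.Element) -> LayoutObject:
    """Create an object for the current node, then recurse into its children."""
    obj = LayoutObject(
        kind=node.tag,
        properties=dict(node.attrib),  # attribute data becomes properties
        text=(node.text or "").strip(),
    )
    for child in node:
        obj.children.append(import_node(child))
    return obj


TARGET = (
    '<story><para align="center">Title</para>'
    '<table cols="2"><row><cell>A1</cell><cell>B1</cell></row></table></story>'
)
root = import_node(ET.fromstring(TARGET))
```

The recursion ends naturally when a node has no children, which corresponds to the condition that all nodes in the XML target file have been imported.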
Example
This example copies the content of the Word document shown in Fig. 4 and pastes it into the Feiteng Chuangyi (飞腾创艺) typesetting software. The content to be copied comprises two paragraphs of text, a table with 2 rows and 2 columns, and one picture. Following the flow shown in Fig. 3, the content to be copied is first copied and the version of the Word software is determined. The Word version in this example is Word 2007, so a temporary Word file in docx format is created directly and the content to be copied is pasted into it. The temporary Word file is then decompressed according to the Open Packaging Conventions; the file structure after decompression is shown in Fig. 5. The description of the table in document.xml is shown in Fig. 6. With document.xml as the XML source file, XSLT is used to convert the XML source file into the XML target file. The description of the table in the XML target file is shown in Fig. 7, and the corresponding XSLT is shown in Fig. 8. When the data of the XML target file is imported into the target software, the nodes processed in the loop mainly include the paragraph node <w:p>, the paragraph properties node <w:pPr>, the table node <w:tbl> and the picture node <w:drawing>. The result after the XML target file data is imported into the Feiteng Chuangyi typesetting software is shown in Fig. 9.
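To make the node types named in the example concrete, the sketch below counts `w:p`, `w:pPr`, `w:tbl` and `w:drawing` elements in a hand-written fragment shaped like the example content (two paragraphs, a 2 x 2 table and one picture). The fragment is an assumption for illustration only; a document.xml produced by Word contains many more nodes and attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
HANDLED = ("p", "pPr", "tbl", "drawing")  # local names listed in the example

SAMPLE = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:pPr/><w:r><w:t>First</w:t></w:r></w:p>
<w:p><w:r><w:t>Second</w:t></w:r></w:p>
<w:tbl>
<w:tr><w:tc><w:p/></w:tc><w:tc><w:p/></w:tc></w:tr>
<w:tr><w:tc><w:p/></w:tc><w:tc><w:p/></w:tc></w:tr>
</w:tbl>
<w:p><w:r><w:drawing/></w:r></w:p>
</w:body></w:document>"""


def count_handled_nodes(xml_text: str) -> Counter:
    """Count occurrences of the handled node types anywhere in the document."""
    counts = Counter()
    for node in ET.fromstring(xml_text).iter():
        for local in HANDLED:
            if node.tag == f"{{{W}}}{local}":
                counts["w:" + local] += 1
    return counts


counts = count_handled_nodes(SAMPLE)
```

Note that every table cell holds its own paragraph, so the paragraph count is higher than the two visible text paragraphs.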
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if such changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.

Claims (10)

1. A method for copying and pasting Word file content with format, comprising the following steps:
(1) creating a temporary Word file in docx format and pasting the Word file content to be copied into the temporary Word file;
(2) obtaining an XML source file of the Word file content to be copied from the temporary Word file;
(3) converting the XML source file into an XML target file that the target software can recognize;
(4) importing the data of the XML target file into the target software.
2. The method for copying and pasting Word file content with format according to claim 1, characterized in that: in step (1), the version of the Word program is determined before the temporary Word file is created; if the version of the Word program is Word 2007 or higher, a temporary Word file in docx format is created directly; otherwise, a temporary Word file in doc format is created first and then converted into a temporary Word file in docx format.
3. The method for copying and pasting Word file content with format according to claim 1, characterized in that the XML source file is obtained in step (2) by decompressing the temporary Word file in docx format and taking the document.xml file in the word folder produced by decompression as the XML source file.
4. The method for copying and pasting Word file content with format according to any one of claims 1 to 3, characterized in that: in step (3), XSLT is used to convert the XML source file into the XML target file according to the XML format that the target software can recognize, XSLT being a language for transforming XML documents into other XML documents.
5. The method for copying and pasting Word file content with format according to claim 4, characterized in that: the XSLT conversion templates are written according to the XML Schema Definition that the XML source file conforms to and the XML Schema Definition that the XML target file conforms to, and the XML source file is converted into the XML target file.
6. The method for copying and pasting Word file content with format according to claim 5, characterized in that: while the XML source file is being converted into the XML target file, XSLT uses the XML Path Language to find the parts of the XML source file that match one or more predefined templates; once a match is found, XSLT converts the matching part of the XML source file into content of the XML target file according to the template.
7. The method for copying and pasting Word file content with format according to any one of claims 1 to 3, characterized in that the data of the XML target file is imported into the target software in step (4) as follows: according to the XML Schema Definition that the XML target file conforms to, the target software traverses and processes all XML nodes of the XML target file in a loop starting from the root node, and creates the corresponding objects and sets their attributes in the target software according to the content and attribute data recorded in each XML node.
8. A system for copying and pasting Word file content with format, comprising: a creation device (11) for creating a temporary Word file in docx format and pasting the Word file content to be copied into the temporary Word file; an acquisition device (12) for obtaining the XML source file of the Word file content to be copied from the temporary Word file; a conversion device A (13) for converting the XML source file into an XML target file that the target software can recognize; and an import device (14) for importing the data of the XML target file into the target software.
9. The system for copying and pasting Word file content with format according to claim 8, characterized in that: the system further comprises a judgment device (15) for determining the version of the Word program.
10. The system for copying and pasting Word file content with format according to claim 9, characterized in that: the system further comprises a conversion device B (16); when the judgment device (15) determines that the version of the Word program is lower than Word 2007, the creation device (11) creates a temporary Word file in doc format, and the conversion device B (16) converts the temporary Word file in doc format into a temporary Word file in docx format.
CN2010102113241A 2010-06-28 2010-06-28 Method and system for copying and pasting Word file content with format Pending CN102298575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102113241A CN102298575A (en) 2010-06-28 2010-06-28 Method and system for copying and pasting Word file content with format

Publications (1)

Publication Number Publication Date
CN102298575A (en) 2011-12-28

Family

ID=45358998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102113241A Pending CN102298575A (en) 2010-06-28 2010-06-28 Method and system for copying and pasting Word file content with format

Country Status (1)

Country Link
CN (1) CN102298575A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102651016A (en) * 2012-03-30 2012-08-29 北京英富森信息技术有限公司 Desktop information collecting system and method based on user computer terminal
CN103186655A (en) * 2011-12-31 2013-07-03 北大方正集团有限公司 Processing method and device for layout file
CN104156207A (en) * 2014-07-31 2014-11-19 广州金山网络科技有限公司 File display method and device
CN104679726A (en) * 2013-12-03 2015-06-03 北大方正集团有限公司 Type setting method and device of word files
CN105426480A (en) * 2015-11-19 2016-03-23 中国地质大学(武汉) Method and apparatus for converting HTML into Word document
CN109344377A (en) * 2018-08-31 2019-02-15 深圳众赢维融科技有限公司 Method, apparatus, electronic equipment and the storage medium of data processing
CN110321545A (en) * 2018-03-29 2019-10-11 成都野望数码科技有限公司 A kind of method and apparatus of office documents typesetting
CN110580330A (en) * 2019-09-03 2019-12-17 中建易通科技股份有限公司 File processing method for filling electronic form
CN110705216A (en) * 2019-09-19 2020-01-17 深圳前海环融联易信息科技服务有限公司 Method and device for converting docx file into xml file based on java and computer equipment
CN111274768A (en) * 2018-12-04 2020-06-12 北大方正集团有限公司 Method, device, equipment and storage medium for converting journal paper into XML data
CN112699641A (en) * 2021-03-25 2021-04-23 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050060317A1 (en) * 2003-09-12 2005-03-17 Lott Christopher Martin Method and system for the specification of interface definitions and business rules and automatic generation of message validation and transformation software
CN101055577A (en) * 2006-04-12 2007-10-17 龙搜(北京)科技有限公司 Collector capable of extending markup language
CN101430684A (en) * 2007-11-09 2009-05-13 北大方正集团有限公司 Method and apparatus for mutual conversion between Chinese work office software document and documents with other format

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
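SIMSON L GARFINKEL et al.: "New XML-Based Files implications for Forensics", IEEE SECURITY AND PRIVACY *
Wu Xiaodan et al.: "Application of converting XML documents with XSLT", Modern Computer (Professional Edition) *
Xu Min et al.: "A new method for information hiding in Word 2007 documents", Journal of Computer Research and Development *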

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186655A (en) * 2011-12-31 2013-07-03 北大方正集团有限公司 Processing method and device for layout file
CN102651016A (en) * 2012-03-30 2012-08-29 北京英富森信息技术有限公司 Desktop information collecting system and method based on user computer terminal
CN104679726A (en) * 2013-12-03 2015-06-03 北大方正集团有限公司 Type setting method and device of word files
CN104156207A (en) * 2014-07-31 2014-11-19 广州金山网络科技有限公司 File display method and device
CN105426480A (en) * 2015-11-19 2016-03-23 中国地质大学(武汉) Method and apparatus for converting HTML into Word document
CN110321545A (en) * 2018-03-29 2019-10-11 成都野望数码科技有限公司 A kind of method and apparatus of office documents typesetting
CN109344377A (en) * 2018-08-31 2019-02-15 深圳众赢维融科技有限公司 Method, apparatus, electronic equipment and the storage medium of data processing
CN109344377B (en) * 2018-08-31 2023-11-24 简链科技(广东)有限公司 Data processing method, device, electronic equipment and storage medium
CN111274768B (en) * 2018-12-04 2022-02-22 北大方正集团有限公司 Method, device, equipment and storage medium for converting journal paper into XML data
CN111274768A (en) * 2018-12-04 2020-06-12 北大方正集团有限公司 Method, device, equipment and storage medium for converting journal paper into XML data
CN110580330B (en) * 2019-09-03 2023-04-07 中建易通科技股份有限公司 File processing method for filling electronic form
CN110580330A (en) * 2019-09-03 2019-12-17 中建易通科技股份有限公司 File processing method for filling electronic form
CN110705216A (en) * 2019-09-19 2020-01-17 深圳前海环融联易信息科技服务有限公司 Method and device for converting docx file into xml file based on java and computer equipment
CN110705216B (en) * 2019-09-19 2023-11-03 深圳前海环融联易信息科技服务有限公司 Method and device for converting docx file into xml file based on java and computer equipment
CN112699641A (en) * 2021-03-25 2021-04-23 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard
CN112699641B (en) * 2021-03-25 2021-07-20 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20111228