WO2017215253A1 - Method for tracking in Office file conversion and modification processes - Google Patents

Method for tracking in Office file conversion and modification processes

Info

Publication number
WO2017215253A1
WO2017215253A1 PCT/CN2017/000320 CN2017000320W WO2017215253A1 WO 2017215253 A1 WO2017215253 A1 WO 2017215253A1 CN 2017000320 W CN2017000320 W CN 2017000320W WO 2017215253 A1 WO2017215253 A1 WO 2017215253A1
Authority
WO
WIPO (PCT)
Prior art keywords
office
customized data
office document
document
file
Prior art date
Application number
PCT/CN2017/000320
Other languages
English (en)
French (fr)
Inventor
刘芳铭
Original Assignee
福建福昕软件开发股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 福建福昕软件开发股份有限公司
Priority to US16/305,604 (US20200117852A1)
Publication of WO2017215253A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/197 Version control

Definitions

  • The present invention relates to the field of document management, and in particular to a tracking method for the conversion and modification of Office documents.
  • Office documents can be converted from files in other formats (such as PNG, PDF, and HTML) by various computer software. This conversion re-represents the content: besides the change in how the information is presented, the content itself also changes to a greater or lesser extent.
  • Normally, the Office document produced by conversion is an independent entity with no direct association with the original data.
  • The newly generated document looks similar to the original file, but its content and presentation may differ, and further use and modification by the user gradually widen the difference.
  • The user can manually record the homologous relationship between the original document and the Office document, that is, that they were modified or converted from the same original document. In many scenarios, however, manual recording is very inconvenient or even difficult.
  • The invention provides a tracking method for the conversion and modification of Office documents. When a large number of documents are managed, the method divides them into homologous document clusters according to their homologous relationships, which facilitates system data statistics and user information searches.
  • The tracking method comprises steps S1 to S5 as set out in the claims.
  • In one embodiment, the customized data is stored in the title or comments of the Office document metadata.
  • In one embodiment, the customized data is stored in hidden text in the body of the Office document.
  • The tracking method provided by the invention automatically tracks the conversion and modification of a generated Office document without manual intervention by the user.
  • With the invention, it can be judged whether two given Office documents were modified or converted from the same original document. For documents in other formats obtained by converting these Office documents again, the same judgment can be made provided the target format supports the operations described in the invention.
  • The invention therefore facilitates system data statistics and user information searches and is highly practical.
  • FIG. 1 is a flowchart of the tracking method for the conversion and modification of Office documents provided by the present invention. As shown in the figure, the method proceeds through steps S1 to S5.
  • In one specific embodiment, the customized data may be stored in the title or comments of the Office document metadata.
  • In another specific embodiment, the customized data may also be stored in hidden text in the body of the Office document.
  • If document 1 is an Office document, the customized data containing the unique ID, represented in XML format, is obtained from the customXML mechanism of the Office document.
  • For documents in other formats that may have been generated by converting document 1, an attempt is made to obtain the customized data in a manner corresponding to the target format.
  • The modules of the apparatus in an embodiment may be distributed in the apparatus as described in the embodiment, or may be changed accordingly and located in one or more apparatuses different from that of the embodiment.
  • The modules of the above embodiments may be combined into one module or further split into multiple sub-modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

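A tracking method for the conversion and modification of Office documents, comprising the following steps: S1: generating customized data containing a unique ID; S2: when or after an Office document is generated, using the customXML mechanism of the Office document to save the customized data in the Office document in XML format; S3: after the Office document is modified, the unique ID remains unchanged; S4: after the Office document is converted into a target file in a target format, if the target format supports saving the customized data, transferring the customized data to the target file, so that a user can manage the target file according to the customized data; S5: when the Office document obtained in step S2 is updated and a new Office document is generated again, the unique ID in the new Office document remains unchanged. The method facilitates system data statistics and user information searches and is highly practical.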

Description

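Method for tracking in Office file conversion and modification processes
Technical Field
The present invention relates to the field of document management, and in particular to a tracking method for the conversion and modification of Office documents.
Background Art
Office documents can be converted from files in other formats (such as PNG, PDF, and HTML) by various computer software. This conversion re-represents the content: besides the change in how the information is presented, the content itself also changes to a greater or lesser extent. Normally, the Office document produced by conversion is an independent entity with no direct association with the original data. After the generated Office document is converted, the newly generated document looks similar to the original file, but its content and presentation may differ somewhat, and further use and modification by the user gradually widen this difference. So that the user can identify the source of a document during later use, the user can manually record the homologous relationship between the original document and the Office document, that is, that they were modified or converted from the same original document. In many scenarios, however, manual recording is very inconvenient or even difficult.
Therefore, how to divide a large number of documents into homologous document clusters according to their homologous relationships when managing them, and thereby facilitate system data statistics and user information searches, is a technical problem that those skilled in the art urgently need to solve.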
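Summary of the Invention
The present invention provides a tracking method for the conversion and modification of Office documents. When a large number of documents are managed, the method divides them into homologous document clusters according to their homologous relationships, which facilitates system data statistics and user information searches.
To achieve the above object, the present invention provides a tracking method for the conversion and modification of Office documents, comprising the following steps:
S1: generating customized data containing a unique ID;
S2: when or after an Office document is generated, using the customXML mechanism of the Office document to save the customized data in the Office document in XML format;
S3: after the Office document is modified, the unique ID remains unchanged;
S4: after the Office document is converted into a target file in a target format, if the target format supports saving the customized data, transferring the customized data to the target file, so that a user can manage the target file according to the customized data;
S5: when the Office document obtained in step S2 is updated and a new Office document is generated again, the unique ID in the new Office document remains unchanged.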
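In one embodiment of the present invention, the customized data is stored in the title or comments of the Office document metadata.
In one embodiment of the present invention, the customized data is stored in hidden text in the body of the Office document.
The tracking method provided by the present invention automatically tracks the conversion and modification of a generated Office document without manual intervention by the user. With the invention, it can be judged whether two given Office documents were modified or converted from the same original document. For documents in other formats obtained by converting these Office documents again, the same judgment can be made provided the target format supports the operations described in the invention. The invention therefore facilitates system data statistics and user information searches and is highly practical.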
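As an illustration of how homologous document clusters might be formed once each document's unique ID has been read, the following Python sketch groups documents by ID. It is only a sketch under stated assumptions: the function name `cluster_by_unique_id` and the `(name, unique_id)` input pairs are hypothetical and do not appear in the patent, and documents without customized data are simply set aside rather than assigned to any cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def cluster_by_unique_id(
    docs: Iterable[Tuple[str, Optional[str]]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group document names by unique ID (hypothetical helper).

    Each input pair is (document name, unique ID or None). Documents whose
    customized data could not be read (ID is None) are returned separately,
    because their origin cannot be judged.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    untracked: List[str] = []
    for name, unique_id in docs:
        if unique_id is None:
            untracked.append(name)
        else:
            clusters[unique_id].append(name)
    return dict(clusters), untracked
```

Grouping by a dictionary key keeps the clustering linear in the number of documents, which matters for the large document collections the description has in mind.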
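Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the tracking method for the conversion and modification of Office documents provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
FIG. 1 is a flowchart of the tracking method for the conversion and modification of Office documents provided by the present invention. As shown in the figure, the method includes the following steps: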
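S1: generating customized data containing a unique ID;
S2: when or after an Office document is generated, using the customXML mechanism of the Office document to save the customized data in the Office document in XML format;
S3: after the Office document is modified, the unique ID remains unchanged;
S4: after the Office document is converted into a target file in a target format, if the target format supports saving the customized data, transferring the customized data to the target file, so that a user can manage the target file according to the customized data;
S5: when the Office document obtained in step S2 is updated and a new Office document is generated again, the unique ID in the new Office document remains unchanged.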
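Steps S1, S2, S3 and S5 can be sketched roughly as follows in Python, using only the standard library. This is a simplified illustration, not the patented implementation: the namespace `urn:example:doc-tracking`, the part name `customXml/tracking.xml`, and all function names are assumptions made for the example. An Office Open XML file is a ZIP package, and the sketch only writes an XML part into that package; a complete implementation would also need to register the part in the package relationships, which is omitted here.

```python
import io
import uuid
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical namespace and part name, chosen only for this illustration.
TRACKING_NS = "urn:example:doc-tracking"
TRACKING_PART = "customXml/tracking.xml"


def make_customized_data(unique_id=None):
    """S1: build the customized data, an XML fragment carrying a unique ID."""
    root = ET.Element(f"{{{TRACKING_NS}}}tracking")
    id_node = ET.SubElement(root, f"{{{TRACKING_NS}}}id")
    id_node.text = unique_id or str(uuid.uuid4())
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def read_unique_id(package_bytes):
    """Return the unique ID stored in the package, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        if TRACKING_PART not in zf.namelist():
            return None
        root = ET.fromstring(zf.read(TRACKING_PART))
    node = root.find(f"{{{TRACKING_NS}}}id")
    return node.text if node is not None else None


def embed_customized_data(package_bytes):
    """S2: add the tracking part; S3/S5: never replace an existing ID."""
    if read_unique_id(package_bytes) is not None:
        return package_bytes  # already tracked, so the ID stays unchanged
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            # Copy every existing part untouched.
            dst.writestr(item, src.read(item.filename))
        dst.writestr(TRACKING_PART, make_customized_data())
    return out.getvalue()
```

The check at the top of `embed_customized_data` is what keeps the ID stable when a tracked document is processed again; only documents with no customized data receive a fresh ID.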
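In one specific embodiment of the present invention, the customized data may be stored in the title or comments of the Office document metadata.
In another specific embodiment of the present invention, the customized data may also be stored in hidden text in the body of the Office document.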
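For the metadata variant, one possible approach is to keep the ID in the comments field of the document's core properties. The Python sketch below assumes that the comments are held in the `dc:description` element of the package's core properties XML, and it uses a hypothetical `tracking-id:` prefix to tell the ID apart from any user-written comment. The prefix and the function names are illustrative assumptions and are not taken from the patent.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
MARKER = "tracking-id:"  # hypothetical prefix marking the unique ID

# Keep the conventional prefixes when the XML is serialized again.
ET.register_namespace("cp", CP_NS)
ET.register_namespace("dc", DC_NS)


def get_id_from_comment(core_xml: bytes) -> Optional[str]:
    """Read the unique ID from the comments field, or return None."""
    root = ET.fromstring(core_xml)
    desc = root.find(f"{{{DC_NS}}}description")
    text = desc.text if desc is not None and desc.text else ""
    for token in text.split():
        if token.startswith(MARKER):
            return token[len(MARKER):]
    return None


def set_id_in_comment(core_xml: bytes, unique_id: str) -> bytes:
    """Append the unique ID to the comments field unless one is present."""
    root = ET.fromstring(core_xml)
    desc = root.find(f"{{{DC_NS}}}description")
    if desc is None:
        desc = ET.SubElement(root, f"{{{DC_NS}}}description")
    if MARKER not in (desc.text or ""):
        # Preserve any existing user comment and add the ID after it.
        desc.text = f"{desc.text or ''} {MARKER}{unique_id}".strip()
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Storing the ID in a user-visible field is more fragile than a dedicated XML part, since a user can edit or clear the comments, so this variant trades robustness for compatibility with formats that only carry basic metadata.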
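The steps for using the tracking method provided by the present invention to judge whether two documents (document 1 and document 2), and any documents that may have been converted from them, are homologous are as follows:
(1) If document 1 is an Office document, obtain the customized data containing the unique ID, represented in XML format, from the customXML mechanism of the Office document. For documents in other formats that may have been generated by converting document 1, try to obtain the customized data in a manner corresponding to the target format;
(2) Perform the same operation on document 2;
(3) If the customized data have the same unique ID, the documents are homologous. If the IDs differ, they are not homologous. If the customized data cannot be obtained, the documents are outside the scope that the present invention can judge.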
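The three outcomes of step (3) can be expressed as a small three-valued comparison. The Python sketch below assumes the unique IDs have already been extracted from both documents, with `None` standing for a document whose customized data could not be obtained; the `Homology` enum and the `judge_homology` function are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Optional


class Homology(Enum):
    SAME_SOURCE = "same source"
    DIFFERENT_SOURCE = "different source"
    UNDETERMINED = "undetermined"  # outside the scope the method can judge


def judge_homology(id_1: Optional[str], id_2: Optional[str]) -> Homology:
    """Compare the unique IDs read from document 1 and document 2."""
    if id_1 is None or id_2 is None:
        # Customized data missing from at least one document.
        return Homology.UNDETERMINED
    if id_1 == id_2:
        return Homology.SAME_SOURCE
    return Homology.DIFFERENT_SOURCE
```

Returning a distinct undetermined value, rather than treating a missing ID as "not homologous", mirrors the description: a document without customized data is outside what the method can judge, not proven to come from a different source.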
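The tracking method provided by the present invention automatically tracks the conversion and modification of a generated Office document without manual intervention by the user. With the invention, it can be judged whether two given Office documents were modified or converted from the same original document. For documents in other formats obtained by converting these Office documents again, the same judgment can be made provided the target format supports the operations described in the invention. The invention therefore facilitates system data statistics and user information searches and is highly practical.
A person of ordinary skill in the art will understand that the drawing is only a schematic diagram of one embodiment, and that the modules or processes in the drawing are not necessarily required for implementing the present invention.
A person of ordinary skill in the art will understand that the modules of the apparatus in an embodiment may be distributed in the apparatus as described in the embodiment, or may be changed accordingly and located in one or more apparatuses different from that of the embodiment. The modules of the above embodiments may be combined into one module or further split into multiple sub-modules.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.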

Claims (3)

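  1. A tracking method for the conversion and modification of Office documents, characterized by comprising the following steps:
    S1: generating customized data containing a unique ID;
    S2: when or after an Office document is generated, using the customXML mechanism of the Office document to save the customized data in the Office document in XML format;
    S3: after the Office document is modified, the unique ID remains unchanged;
    S4: after the Office document is converted into a target file in a target format, if the target format supports saving the customized data, transferring the customized data to the target file, so that a user can manage the target file according to the customized data;
    S5: when the Office document obtained in step S2 is updated and a new Office document is generated again, the unique ID in the new Office document remains unchanged.
  2. The tracking method for the conversion and modification of Office documents according to claim 1, characterized in that the customized data is stored in the title or comments of the Office document metadata.
  3. The tracking method for the conversion and modification of Office documents according to claim 1, characterized in that the customized data is stored in hidden text in the body of the Office document.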
PCT/CN2017/000320 2016-06-15 2017-04-25 Method for tracking in Office file conversion and modification processes WO2017215253A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/305,604 US20200117852A1 (en) 2016-06-15 2017-04-25 Method for tracking in office file conversion and modification processes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610422318.8 2016-06-15
CN201610422318.8A CN107515846B (zh) 2016-06-15 2016-06-15 Method for tracking in Office file conversion and modification processes

Publications (1)

Publication Number Publication Date
WO2017215253A1 (zh) 2017-12-21

Family

ID=60663955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/000320 WO2017215253A1 (zh) 2016-06-15 2017-04-25 Method for tracking in Office file conversion and modification processes

Country Status (3)

Country Link
US (1) US20200117852A1 (zh)
CN (1) CN107515846B (zh)
WO (1) WO2017215253A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143302B (zh) * 2019-12-24 2023-06-16 北京明朝万达科技股份有限公司 Method and device for tracking Office document content changes

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
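CN101477517A (zh) * 2009-01-21 2009-07-08 传神联合(北京)信息技术有限公司 Method for multi-point remote synchronization of Office document editing
CN102053952A (zh) * 2009-11-10 2011-05-11 英华达(上海)电子有限公司 Method and device for e-book data format conversion, and portable e-book reader
CN102163233A (zh) * 2011-04-18 2011-08-24 北京神州数码思特奇信息技术股份有限公司 Method and system for converting web page markup language formats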
US20130080869A1 (en) * 2011-09-23 2013-03-28 Guy Le Henaff Apparatus and method for tracing a document in a publication

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090287737A1 (en) * 2007-10-31 2009-11-19 Wayne Hammerly Architecture for enabling rapid database and application development
US8818961B1 (en) * 2009-10-30 2014-08-26 Symantec Corporation User restoration of workflow objects and elements from an archived database
US20130254699A1 (en) * 2012-03-21 2013-09-26 Intertrust Technologies Corporation Systems and methods for managing documents and other electronic content
US8924443B2 (en) * 2012-10-05 2014-12-30 Gary Robin Maze Document management systems and methods
US9529799B2 (en) * 2013-03-14 2016-12-27 Open Text Sa Ulc System and method for document driven actions
CN103294796B (zh) * 2013-05-24 2017-03-01 上海申腾信息技术有限公司 XML parsing method and method for implementing custom XML structure forms in medical records
US9613190B2 (en) * 2014-04-23 2017-04-04 Intralinks, Inc. Systems and methods of secure data exchange
US10097557B2 (en) * 2015-10-01 2018-10-09 Lam Research Corporation Virtual collaboration systems and methods

Also Published As

Publication number Publication date
CN107515846A (zh) 2017-12-26
CN107515846B (zh) 2019-11-15
US20200117852A1 (en) 2020-04-16

Similar Documents

Publication Publication Date Title
US9875465B2 (en) Process control device, process control method, and non-transitory computer-readable medium
US20210216678A1 (en) Cad collaborative design system
US8601367B1 (en) Systems and methods for generating filing documents in a visual presentation context with XBRL barcode authentication
US8307008B2 (en) Creation and management of electronic files for localization project
US10353874B2 (en) Method and apparatus for associating information
CN104536987B (zh) Method and device for querying data
US20130275369A1 (en) Data record collapse and split functionality
US20140149854A1 (en) Server and method for generating object document
US20220100951A1 (en) Shareable and cross-application non-destructive content processing pipelines
Gómez et al. An approach to the co-creation of models and metamodels in Enterprise Architecture Projects.
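WO2017215253A1 (zh) Method for tracking in Office file conversion and modification processes
WO2019242298A1 (zh) Data processing method for drawing design, PLM plug-in, and computing device
CN105786925A (zh) Method and device for dynamic data modeling based on a reference model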
US20120254719A1 (en) Mapping an Object Type to a Document Type
US20140068426A1 (en) System and method of modifying order and structure of a template tree of a document type by merging components of the template tree
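CN104572730A (zh) Method and device for importing and exporting digital resources
CN107463618B (zh) Index creation method and device
JP2005339549A (ja) Method and system for wrapping data
JP2009230300A (ja) Information processing system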
US7895155B2 (en) Method and system for updating document content and metadata via plug-in chaining in a content management system
US20090319471A1 (en) Field mapping for data stream output
US9052906B2 (en) Modularized customization of a model in a model driven development environment
Norris Toward an ontology of audio preservation
US20130117330A1 (en) Retaining corporate memory
WO2014192116A1 (ja) Data linkage support device and data linkage support method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17812371

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17812371

Country of ref document: EP

Kind code of ref document: A1