JP5708372B2

JP5708372B2 - Document file difference extraction system, image processing apparatus, document file difference extraction method, and program

Info

Publication number: JP5708372B2
Application number: JP2011185344A
Authority: JP
Inventors: 潤國岡; 高橋　健一; 健一高橋; 和明友野; 松原　賢士; 賢士松原
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2011-08-26
Filing date: 2011-08-26
Publication date: 2015-04-30
Anticipated expiration: 2031-08-26
Also published as: JP2013045437A

Description

The present invention relates to a document file difference extraction system capable of comparing two document files and extracting the differences between them, an image processing apparatus suitably used in that system, and a document file difference extraction method and program executed in that system.

Documents may share the same name but be issued at different times, and it is sometimes necessary to check what changed between them. For example, when a document such as a contract is issued first as a provisional version and later as a formal version, the person in charge checks where the newly obtained document differs from the previously issued one and confirms that the content is unproblematic.

In such cases, a commonly used method is to read the provisional and formal versions with a character recognition device or the like, convert them into electronic files, and compare the full text of the two documents.

Patent Documents 1 to 3 also propose apparatuses that compare two documents and extract the differences.

JP 2007-257308 A; JP 2001-297080 A; JP H08-137643 A

However, a method that extracts differences by comparing the full text of the documents also extracts differences in cases such as the following:
・Changes to sentence-ending expressions (for example, the polite endings です and ます) or to the presence or absence of punctuation
・Changes in the order of description (with no change to the content itself)
・Wording-level corrections (for example, rewriting ｗｅｂブラウザ as ウェブブラウザ, that is, spelling "web" in katakana instead of Latin letters)
As a result, the number of places the operator must check becomes very large, which is a burden on the operator.

In addition, the difference result may be output by displaying it on the display of the operator's terminal device or by printing it on paper. When there are many differences, however, the display screen of an editor or the like becomes harder to read and more sheets must be printed, so the operator's work efficiency drops.

The same problems arise when the apparatuses described in Patent Documents 1 to 3 are used.

The present invention has been made to solve these technical problems. Its object is to provide a document file difference extraction system that can narrow the range over which differences between two document files are extracted to the necessary portions, thereby reducing the operator's burden of checking the differences, and an image processing apparatus suitably used in that system, and further to provide a document file difference extraction method executed in the system and a document file difference extraction program that causes a computer of the image processing apparatus to execute the difference extraction processing.

The above problem is solved by the following means.
(1) A document file difference extraction system in which an image processing apparatus and a document management server are connected via a network, wherein the image processing apparatus comprises: input means for inputting a first document file; table of contents creation means for creating a table of contents by analyzing the paragraph structure of the first document file input by the input means and extracting the item of each paragraph; item determination means for determining, among the items shown in the table of contents created by the table of contents creation means, the items subject to difference extraction; document file acquisition means for acquiring, from the document management server, a second document file whose differences from the first document file are to be extracted; difference extraction means for extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined by the item determination means to be subject to difference extraction; and control means for causing display means to display the differences extracted by the difference extraction means or transmitting them to an external user terminal device; the document management server comprises: storage means for storing one or more document files; and transmission means for transmitting the second document file stored in the storage means to the image processing apparatus in response to an acquisition request from the image processing apparatus; one or more template documents are stored in the storage means of the document management server; the document file acquisition means of the image processing apparatus acquires a template document related to the first document file from the document management server; the table of contents creation means extracts the items of the acquired template document; and the item determination means determines the table of contents items of the first document file that are the same as the extracted items of the template document to be the items subject to difference extraction.
(2) A document file difference extraction system in which an image processing apparatus and a document management server are connected via a network, wherein the image processing apparatus comprises: input means for inputting a first document file; table of contents creation means for creating a table of contents by analyzing the paragraph structure of the first document file input by the input means and extracting the item of each paragraph; item determination means for determining, among the items shown in the table of contents created by the table of contents creation means, the items subject to difference extraction; document file acquisition means for acquiring, from the document management server, a second document file whose differences from the first document file are to be extracted; difference extraction means for extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined by the item determination means to be subject to difference extraction; and control means for causing display means to display the differences extracted by the difference extraction means or transmitting them to an external user terminal device; the document management server comprises: storage means for storing one or more document files; and transmission means for transmitting the second document file stored in the storage means to the image processing apparatus in response to an acquisition request from the image processing apparatus; one or more sets of minutes are stored in the storage means of the document management server; the image processing apparatus further comprises date acquisition means for acquiring the issue dates of the first document file and the second document file; the document file acquisition means of the image processing apparatus acquires, from the document management server, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file; the table of contents creation means extracts the items of the acquired minutes; and the item determination means determines the table of contents items of the first document file that are the same as the extracted items of the minutes to be the items subject to difference extraction.
(3) An image processing apparatus comprising: input means for inputting a first document file; table of contents creation means for creating a table of contents by analyzing the paragraph structure of the first document file input by the input means and extracting the item of each paragraph; item determination means for determining, among the items shown in the table of contents created by the table of contents creation means, the items subject to difference extraction; document file acquisition means for acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; difference extraction means for extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined by the item determination means to be subject to difference extraction; and control means for causing display means to display the differences extracted by the difference extraction means or transmitting them to an external user terminal device; wherein one or more template documents are stored in the storage means; the document file acquisition means acquires a template document related to the first document file from the storage means; the table of contents creation means extracts the items of the acquired template document; and the item determination means determines the table of contents items of the first document file that are the same as the extracted items of the template document to be the items subject to difference extraction.
(4) An image processing apparatus comprising: input means for inputting a first document file; table of contents creation means for creating a table of contents by analyzing the paragraph structure of the first document file input by the input means and extracting the item of each paragraph; item determination means for determining, among the items shown in the table of contents created by the table of contents creation means, the items subject to difference extraction; document file acquisition means for acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; difference extraction means for extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined by the item determination means to be subject to difference extraction; and control means for causing display means to display the differences extracted by the difference extraction means or transmitting them to an external user terminal device; wherein one or more sets of minutes are stored in the storage means; the apparatus further comprises date acquisition means for acquiring the issue dates of the first document file and the second document file; the document file acquisition means acquires, from the storage means, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file; the table of contents creation means extracts the items of the acquired minutes; and the item determination means determines the table of contents items of the first document file that are the same as the extracted items of the minutes to be the items subject to difference extraction.
(5) A document file difference extraction method executed in a system in which an image processing apparatus and a document management server are connected via a network, wherein the image processing apparatus executes: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring, from the document management server, a second document file whose differences from the first document file are to be extracted; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; the document management server executes a transmission step of transmitting, in response to an acquisition request from the image processing apparatus, the second document file stored in storage means that stores one or more document files to the image processing apparatus; one or more template documents are stored in the storage means of the document management server; and the image processing apparatus, in the document file acquisition step, acquires a template document related to the first document file from the document management server, in the table of contents creation step, extracts the items of the acquired template document, and, in the item determination step, determines the table of contents items of the first document file that are the same as the extracted items of the template document to be the items subject to difference extraction.
(6) A document file difference extraction method executed in a system in which an image processing apparatus and a document management server are connected via a network, wherein the image processing apparatus executes: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring, from the document management server, a second document file whose differences from the first document file are to be extracted; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; the document management server executes a transmission step of transmitting, in response to an acquisition request from the image processing apparatus, the second document file stored in storage means that stores one or more document files to the image processing apparatus; one or more sets of minutes are stored in the storage means of the document management server; the image processing apparatus executes a date acquisition step of acquiring the issue dates of the first document file and the second document file; and the image processing apparatus, in the document file acquisition step, acquires from the document management server minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file, in the table of contents creation step, extracts the items of the acquired minutes, and, in the item determination step, determines the table of contents items of the first document file that are the same as the extracted items of the minutes to be the items subject to difference extraction.
(7) A document file difference extraction method in an image processing apparatus, the method executing: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; wherein one or more template documents are stored in the storage means; in the document file acquisition step, a template document related to the first document file is acquired from the storage means; in the table of contents creation step, the items of the acquired template document are extracted; and, in the item determination step, the table of contents items of the first document file that are the same as the extracted items of the template document are determined to be the items subject to difference extraction.
(8) A document file difference extraction method in an image processing apparatus, the method executing: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; wherein one or more sets of minutes are stored in the storage means; a date acquisition step of acquiring the issue dates of the first document file and the second document file is further executed; in the document file acquisition step, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file are acquired from the storage means; in the table of contents creation step, the items of the acquired minutes are extracted; and, in the item determination step, the table of contents items of the first document file that are the same as the extracted items of the minutes are determined to be the items subject to difference extraction.
(9) A document file difference extraction program that causes a computer of an image processing apparatus to execute: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; wherein one or more template documents are stored in the storage means, and the program causes the computer to execute processing in which, in the document file acquisition step, a template document related to the first document file is acquired from the storage means, in the table of contents creation step, the items of the acquired template document are extracted, and, in the item determination step, the table of contents items of the first document file that are the same as the extracted items of the template document are determined to be the items subject to difference extraction.
(10) A document file difference extraction program that causes a computer of an image processing apparatus to execute: an input step of inputting a first document file; a table of contents creation step of creating a table of contents by analyzing the paragraph structure of the first document file input in the input step and extracting the item of each paragraph; an item determination step of determining, among the items shown in the table of contents created in the table of contents creation step, the items subject to difference extraction; a document file acquisition step of acquiring a second document file, whose differences from the first document file are to be extracted, from storage means that is provided inside or outside the apparatus itself and stores one or more document files; a difference extraction step of extracting the differences between the first document file and the second document file for the paragraphs corresponding to the items determined in the item determination step to be subject to difference extraction; and a control step of causing display means to display the differences extracted in the difference extraction step or transmitting them to an external user terminal device; wherein one or more sets of minutes are stored in the storage means, the program further causes the computer to execute a date acquisition step of acquiring the issue dates of the first document file and the second document file, and the program causes the computer to execute processing in which, in the document file acquisition step, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file are acquired from the storage means, in the table of contents creation step, the items of the acquired minutes are extracted, and, in the item determination step, the table of contents items of the first document file that are the same as the extracted items of the minutes are determined to be the items subject to difference extraction.

According to the invention described in item (1) above, the paragraph structure of the first document file input by the input means of the image processing apparatus is analyzed, the item of each paragraph is extracted, and a table of contents is created. Among the items shown in the created table of contents, the item determination means determines the items subject to difference extraction. Meanwhile, the second document file, whose differences from the first document file are to be extracted, is acquired from the document management server. The differences between the first document file and the second document file are then extracted for the paragraphs corresponding to the items determined to be subject to difference extraction, and the extracted differences are displayed on the display means or transmitted to an external user terminal device.

Accordingly, difference extraction between the first and second document files is performed only on the paragraphs of the first document file that correspond to the items determined by the item determination means. Because differences need not be extracted from the entire document file, the total amount of extracted differences is reduced, which makes it easier for the operator to check the result, whether it is shown on a display or printed on paper.

In addition, when content written in the first document file is to be transcribed into a template, the first document file can be expected to contain content corresponding to the item of each paragraph in the template. Therefore, by extracting the items of the template document and determining the table of contents items of the first document file that are the same as the extracted items to be the items subject to difference extraction, effective difference extraction can be performed while avoiding a full-text comparison.

According to the invention described in item (2) above, when minutes relating to the first document file were created between the issue of the second document file and the issue of the first document file, the matters recorded in the minutes can be regarded as important. Therefore, by extracting the items of the minutes and determining the table of contents items of the first document file that are the same as the extracted items to be the items subject to difference extraction, effective difference extraction can be performed while avoiding a full-text comparison.
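
The minutes-based selection rule can be sketched as follows. This is only an illustration under stated assumptions: the names `Minutes` and `items_from_minutes` are hypothetical, "the same" item is taken to mean an exact string match, and the date window is treated as inclusive. The source specifies none of these details.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Minutes:
    """Hypothetical record for one set of minutes held in storage."""
    created: date
    items: List[str]


def items_from_minutes(minutes: List[Minutes], second_issued: date,
                       first_issued: date, toc_items: List[str]) -> List[str]:
    """Return the table of contents items that also appear in minutes
    created between the two issue dates (inclusive, by assumption)."""
    mentioned = set()
    for m in minutes:
        if second_issued <= m.created <= first_issued:
            mentioned.update(m.items)
    # Keep table-of-contents order so the result follows the document.
    return [item for item in toc_items if item in mentioned]


toc = ["Transaction type", "Development period", "Total amount"]
stored = [
    Minutes(date(2011, 5, 10), ["Total amount"]),
    Minutes(date(2011, 1, 5), ["Transaction type"]),  # created before the window
]
targets = items_from_minutes(stored, date(2011, 4, 1), date(2011, 6, 1), toc)
print(targets)
```

Only "Total amount" is selected here, because the other set of minutes predates the issue of the second document file.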

According to the invention described in item (3) above, difference extraction between the first and second document files is performed only on the paragraphs of the first document file that correspond to the items determined by the item determination means. Because differences need not be extracted from the entire document file, the total amount of extracted differences is reduced, which makes it easier for the operator to check the result, whether it is shown on a display or printed on paper.

In addition, by extracting the items of the template document and determining the table of contents items of the first document file that are the same as the extracted items to be the items subject to difference extraction, effective difference extraction can be performed while avoiding a full-text comparison.

According to the invention described in item (4) above, by extracting the items of the minutes and determining the table of contents items of the first document file that are the same as the extracted items to be the items subject to difference extraction, effective difference extraction can be performed while avoiding a full-text comparison.

According to the inventions described in items (5) and (6) above, the total amount of extracted differences is reduced, which makes it easier for the operator to check the result, whether it is shown on a display or printed on paper.

According to the inventions described in items (9) and (10) above, the computer of the image processing apparatus can be made to execute document file difference extraction processing that reduces the total amount of extracted differences.

FIG. 1 is a schematic configuration diagram of a document file difference extraction system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the electrical configuration of the image processing apparatus.
FIG. 3 is a block diagram showing the electrical configuration of the document management server.
FIG. 4 is a flowchart showing the processing by which the image processing apparatus registers and stores a document file in the document management server.
FIG. 5 is a diagram showing the screens displayed on the operation panel of the image processing apparatus when a user registers and stores a document file in the document management server.
FIG. 6 is a diagram showing an example of document identification codes.
FIG. 7 is a diagram for explaining the flow of the entire system.
FIG. 8 is a diagram showing the table of contents list displayed on the user's terminal device.
FIG. 9 is a display example of a difference extraction result.
FIG. 10 is a diagram showing the screen for selecting the importance of an item when the user selects it from the table of contents list as a difference extraction target.
FIG. 11 is a flowchart showing the document file difference extraction processing performed by the image processing apparatus.
FIG. 12 is a diagram showing a template format with the entry name of each entry field extracted.
FIG. 13 is a diagram showing the screen for selecting how the difference extraction range is set.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a schematic configuration diagram of a document file difference extraction system according to an embodiment of the present invention.

As shown in FIG. 1, the system includes an image processing apparatus 1, a document management server 2, and a terminal device 3, such as a personal computer, owned by an operator (user). These are connected to one another via a network 4.

The image processing apparatus 1 compares two document files and extracts their difference. In this embodiment it is an MFP (Multi Function Peripherals), a multifunction digital image forming apparatus that provides a copy function, a print function, a facsimile function, a scan function, and other functions. The image processing apparatus is hereinafter also referred to as the MFP.

FIG. 2 is a block diagram showing the electrical configuration of the MFP 1.

The MFP 1 includes a CPU 11, a ROM 12, a RAM 13, a scanner unit 14, a storage unit 15, a printer unit 16, an operation panel 17, a network controller (NIC) 18, and the like.

The CPU 11 performs overall control of the MFP 1 and makes basic functions such as the copy, printer, scan, and facsimile functions available. In this embodiment it also performs processing related to the difference comparison of two document files, as described in detail later.

The ROM 12 is a memory that stores the operation program of the CPU 11 and the like.

The RAM 13 is a memory that provides a work area for the CPU 11 when it operates in accordance with the operation program.

The scanner unit 14 is reading means that reads the image of a document set on an ADF (automatic document feeder), not shown, and outputs image data. It serves as one of the input means for document files.

The storage unit 15 is a nonvolatile storage device such as a hard disk drive (HDD) and stores, among other things, image data input by the scanner unit 14. The storage unit 15 also provides one or more storage areas called boxes 151, in which document files can be saved.

The printer unit 16 prints, in accordance with the specified mode, image data of a document read by the scanner unit 14, print data from the terminal device 3, the result of extracting the difference between two document files, and the like.

The operation panel 17 is used for various input operations. It includes a display unit 171, such as a touch-panel liquid crystal display, that shows messages, operation screens, and the like, and a key input unit 172 that has a numeric keypad, a start key, a stop key, and the like.

The network controller 18 sends and receives data by controlling communication with the document management server 2, the terminal device 3, and other devices on the network 4.

The document management server 2 stores and manages one or more document files and is implemented on a personal computer.

FIG. 3 is a block diagram showing the electrical configuration of the document management server 2. The document management server 2 includes a CPU 21, a ROM 22, a RAM 23, a storage unit 24, a display unit 25, an input device 26, a network interface unit (network I/F unit) 27, and the like, which are connected to one another via a system bus 28.

The CPU 21 controls the server 2 as a whole by executing programs stored in the ROM 22 or elsewhere. In this embodiment in particular, in response to a document file acquisition request from the MFP 1, it retrieves the specified document file from among the document files stored in the storage unit 24 and transmits it to the MFP 1.

The ROM 22 is a storage medium that stores the programs executed by the CPU 21 and other data.

The RAM 23 is a storage medium that provides a work area for the CPU 21 when it operates in accordance with the operation program.

The storage unit 24 is a storage medium such as a hard disk and stores document files transmitted from the MFP 1. In this embodiment it also stores templates for document files, such as application form templates, minutes, and the history of past difference extraction processing executed by the MFP 1, as well as various application programs and other data.

The display unit 25 is a CRT, a liquid crystal display device, or the like, and displays various messages as well as input screens, selection screens, and the like for the user.

The input device 26 is used for input operations by the user and consists of a keyboard, a mouse, and the like.

The network interface unit 27 functions as communication means that sends and receives data to and from the MFP 1 and other external devices via the network 4.

Next, the operation of the system shown in FIG. 1 is described.

In this embodiment, the description takes as an example a case in which a provisional version (previous version) of a document file obtained by the operator (user) is stored in the document management server 2, and the operator then obtains the official document (updated version) corresponding to that provisional version and extracts its difference from the provisional version.

Before having the difference between document files extracted, the operator registers and stores the provisional version of the document file in the document management server 2.

FIG. 4 is a flowchart showing the processing by which the MFP 1 registers and stores a document file in the document management server 2. This processing is executed by the CPU 11 of the MFP 1 operating in accordance with an operation program recorded on a recording medium such as the ROM 12.

When the operator obtains a document to be registered and performs an operation on the operation panel 17 to register it in the document management server 2, the operation is accepted in step S01. In step S02, the scanner unit 14 reads the document and converts it into an electronic file. Instead of scanning, a document already stored in a box 151 or the like, or a document on the network 4, may be imported as the document file to be registered.

Next, a table of contents is created from the document file in step S03. In step S04, the document identification code is embedded in the document file as a background pattern, and the document file is transmitted to the document management server 2 together with the created table of contents.

The document file and the table of contents data transmitted to the document management server 2 are associated with each other, stored in the storage unit 24 of the server 2, and registered. Through this processing, multiple document files are registered and accumulated in the document management server 2.

FIG. 5 is a diagram showing the screens displayed on the display unit 171 of the operation panel 17 of the MFP 1 when the operator registers and stores a document file in the document management server 2.

When the operator performs the registration operation, screen D1 of FIG. 5 is displayed on the display unit 171 of the operation panel 17. This screen shows the message "Enter the information for the document to be registered." together with input fields for the project code, the document classification code, and the business partner code, and "Next" and "Cancel" buttons.

The project code indicates the project responsible for managing the document file, the document classification code identifies the type of document file being registered, and the business partner code indicates the counterparty with whom the document file is exchanged. Together, these codes serve as the document identification code that identifies the document file.

Each code is defined in advance by the operator's department or the like and is entered according to those rules. FIG. 6 shows examples. In FIG. 6, the project codes include "C098" for "UI Development Project A" and "C099" for "System Development Project A"; the document classification codes include "0001" for "Contract (outsourcing)" and "0002" for "Quotation (outsourcing)"; and the business partner codes include "001" for "Company A", "002" for "Company B", and "1000" for in-house (Department A).

The operator enters each code using the numeric keypad or the like. The entered codes are joined with hyphens ("-"). The resulting document identification code is registered in the document management server 2 in association with the document file, and the MFP 1 embeds the code itself in each page of the document file as a background pattern. A registered document file can be retrieved by searching with the document identification code as the key.
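
As a minimal sketch of how such a code might be assembled, the following joins the three entered codes with hyphens. The function name `build_document_id` and the order of the codes (project, classification, business partner) are assumptions; the source states only that the entered codes are joined with hyphens.

```python
def build_document_id(project_code: str, classification_code: str,
                      partner_code: str) -> str:
    """Join the three operator-entered codes with hyphens.

    The field order is an assumption made for this sketch.
    """
    parts = [project_code, classification_code, partner_code]
    if not all(parts):
        raise ValueError("every code must be non-empty")
    return "-".join(parts)


# Example values taken from the codes listed for FIG. 6.
doc_id = build_document_id("C098", "0001", "001")
print(doc_id)  # C098-0001-001
```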

Returning to FIG. 5, when the operator finishes entering the codes on screen D1 and presses the "Next" button, the display changes to screen D2. Screen D2 shows the message "Select the medium of the document to be registered" together with selection buttons for "Paper" and "Electronic file".

When the operator selects "Paper" and presses the "Next" button, the display changes to screen D3. Screen D3 shows a message asking the operator to set the document to be registered on the ADF of the scanner unit 14. When the operator sets the document on the ADF and presses the "Start scan" button, the scanner unit 14 starts reading and the display changes to screen D4.

If, on the other hand, the operator selects "Electronic file" on screen D2, enters the folder name or other location where the electronic file is stored, and presses the "Next" button, the specified document file is fetched from that location and the display changes to screen D4.

Screen D4 shows a message indicating that registration is in progress.

Meanwhile, the MFP 1 creates a table of contents for the document file, whether it was read by the scanner unit 14 or obtained from the storage location. The table of contents is created by analyzing the paragraph structure of the document and extracting an item, such as a bookmark or title, for each paragraph. Such table of contents creation processing is known, for example from Japanese Patent Application Laid-Open No. 2011-39580, and a known technique of this kind may be used. If the document file is image data, it is first converted into text data using a character recognition function (not shown); the table of contents is then created, and the file is stored in the document management server 2 as text data.
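
To make the idea of extracting one item per paragraph concrete, here is a small sketch. It is not the method of the cited publication: it simply assumes that each paragraph starts with a numbered heading line such as "1. Development period", and the names `HEADING`, `split_into_sections`, and `table_of_contents` are hypothetical.

```python
import re
from typing import Dict, List

# Assumed heading format: a number, a period or parenthesis, then the title.
HEADING = re.compile(r"^\s*(\d+)[.)]\s+(\S.*)$")


def split_into_sections(text: str) -> Dict[str, str]:
    """Map each heading title to the body text of its paragraph, in order."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2).strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {title: "\n".join(body) for title, body in sections.items()}


def table_of_contents(text: str) -> List[str]:
    """Return the heading titles in document order."""
    return list(split_into_sections(text))


sample = """1. Transaction type
Outsourcing contract.
2. Development period
April to September.
"""
print(table_of_contents(sample))  # ['Transaction type', 'Development period']
```

Keeping the paragraph body alongside each title means the same structure can later be used to compare only the selected paragraphs.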

After the table of contents is created, the document file is transmitted from the MFP 1 to the document management server 2, as described above, where it is stored and managed. Also as described above, the document identification code entered by the operator is embedded in each page of the document file as a background pattern and is transmitted to the document management server 2. The code is embedded so that, when an updated (official) version of the document file is later obtained and the pre-update document file must be fetched from the document management server 2 to extract the differences before and after the change, the document management server 2 can easily find the corresponding document file.

Document file registration and storage processing is executed in this way.

Next, the operation of extracting the difference between the updated version of the document file (the first document file) and the provisional version registered in the document management server 2 (the second document file) is described.

FIG. 7 is a diagram for explaining the flow of the entire system.

For the updated (official) version of the document file obtained by the operator, a table of contents list is created and the file is registered in the document management server 2 by the same procedure as described with reference to FIGS. 4 to 6. In doing so, the operator assigns to the official version the same document identification code that was assigned to the provisional version. A version number may also be added to indicate that this is a new version.

Next, the MFP 1 acquires the previous (provisional) version of the newly obtained document file from the document management server 2. To do so, it sends an acquisition request to the document management server 2, using as the key the document identification code that the operator entered as document information. The document management server 2 searches its storage unit 24 with the received document identification code, retrieves the document file that contains the specified code as a background pattern, and transmits it to the MFP 1.
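
The lookup by document identification code can be pictured with a small in-memory stand-in for the server's storage. Everything here is hypothetical (the class `DocumentStore`, its methods, and the rule that the "previous version" is the one registered just before the latest); the source says only that the server searches its storage with the code as the key and that a version number may be added.

```python
from typing import Dict, List, Optional


class DocumentStore:
    """Hypothetical in-memory stand-in for the server's storage unit."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def register(self, doc_id: str, text: str) -> int:
        """Store a document under its identification code; return its version number."""
        versions = self._versions.setdefault(doc_id, [])
        versions.append(text)
        return len(versions)

    def previous_version(self, doc_id: str) -> Optional[str]:
        """Return the version registered just before the latest one, if any."""
        versions = self._versions.get(doc_id, [])
        return versions[-2] if len(versions) >= 2 else None


store = DocumentStore()
store.register("C098-0001-001", "provisional text")
store.register("C098-0001-001", "official text")
print(store.previous_version("C098-0001-001"))  # provisional text
```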

In parallel, the MFP 1 transmits the table of contents list data created from the official version of the document file to the operator's terminal device 3 via the network 4.

On receiving the table of contents list data, the operator's terminal device 3 displays the list on its display device in selectable form, as shown in FIG. 8(A).

The operator selects the items considered important from the table of contents list. FIG. 8(B) shows the selected state, with the selected items filled in black. The selection result is transmitted from the operator's terminal device 3 to the MFP 1, as indicated by arrow X in FIG. 7.

On receiving the data for the selected items from the operator's terminal device 3, the MFP 1 determines the selected items to be the items subject to difference extraction. For only the paragraphs corresponding to those items, it compares the official version of the document file with the provisional version acquired from the document management server 2 and extracts the difference between them. Difference extraction is performed by a full-text comparison of the entire paragraph containing the item, and the portions that do not match are extracted as the difference.
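
The following sketch illustrates restricting the comparison to the selected paragraphs. It uses Python's standard `difflib` module as a stand-in for the comparison, which the source does not specify beyond "full-text comparison"; the function name `extract_differences` and the dictionary representation of paragraphs (item title mapped to paragraph text) are assumptions.

```python
import difflib
from typing import Dict, List


def extract_differences(old_sections: Dict[str, str],
                        new_sections: Dict[str, str],
                        selected_items: List[str]) -> Dict[str, List[str]]:
    """Compare only the paragraphs whose item was selected.

    Returns, per selected item, the lines removed ("- ") or added ("+ ").
    """
    result: Dict[str, List[str]] = {}
    for item in selected_items:
        old_lines = old_sections.get(item, "").splitlines()
        new_lines = new_sections.get(item, "").splitlines()
        result[item] = [
            line for line in difflib.ndiff(old_lines, new_lines)
            if line.startswith(("- ", "+ "))  # drop unchanged and hint lines
        ]
    return result


provisional = {"Total amount": "1,000,000 yen", "Man-hours": "10 person-months",
               "Transaction type": "Outsourcing"}
official = {"Total amount": "1,200,000 yen", "Man-hours": "10 person-months",
            "Transaction type": "Quasi-delegation"}
diff = extract_differences(provisional, official, ["Total amount", "Man-hours"])
for item, changes in diff.items():
    print(item, changes)
```

In this example the change to "Transaction type" is never examined, because that item was not selected; this is the reduction in extracted differences that the text describes.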

The difference result obtained by the extraction processing is transmitted from the MFP 1 to the operator's terminal device 3, as indicated by arrow Y in FIG. 7. The operator's terminal device 3 displays the received difference result on its display device, and the operator reviews it to check how the document has changed from the previous version, or whether nothing has changed. The same difference result can also be printed by the printer unit 16 of the MFP 1.

FIG. 9 is a display example of a difference extraction result. In this embodiment, as shown in FIG. 8(B), the operator has set "Transaction type", "Development period", "Delivery of deliverables", "Compensation", "Man-hours", "Total amount", and "Other / special notes" as the items subject to difference extraction, and FIG. 9 shows the comparison result for that case. FIG. 9(A) shows the provisional version of the document file and FIG. 9(B) shows the official version; the portions printed darker are the changed portions. The changed portions may be marked in a different color, for example red, or their display form may be changed in some way other than color.

In this way, difference extraction between the official and provisional versions of the document file is performed only on the paragraphs of the official version that correspond to the items the user chose. Because differences need not be extracted from the entire document file, the total amount of extracted differences is reduced, which makes it easier for the operator to check the result, whether it is shown on a display or printed on paper.

As shown in FIG. 10, when the operator selects an item subject to difference extraction from the table of contents list, an importance level may also be set for the selected item by choosing, for example, "High", "Medium", or "Low" from a pull-down menu or the like. In this case, the difference results may be displayed in a different color for each paragraph according to its importance. For example, differences in paragraphs of "High" importance may be shown in red, those of "Medium" importance in yellow, and those of "Low" importance in blue, so that the display form is varied in finer detail.
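
A simple lookup table is enough to illustrate this color coding. The colors follow the example in the text; the key names and the function `highlight_color` are assumptions made for this sketch.

```python
from typing import Dict

# Colors follow the example in the text; the key spellings are assumed.
IMPORTANCE_COLORS: Dict[str, str] = {
    "high": "red",
    "medium": "yellow",
    "low": "blue",
}


def highlight_color(item: str, importance_by_item: Dict[str, str]) -> str:
    """Return the display color for differences found in the item's paragraph."""
    level = importance_by_item[item]
    return IMPORTANCE_COLORS[level]


importance = {
    "Total amount": "high",
    "Man-hours": "medium",
    "Other / special notes": "low",
}
for item in importance:
    print(item, "->", highlight_color(item, importance))
```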

FIG. 11 is a flowchart showing the document file difference extraction processing performed by the MFP 1. This processing is executed by the CPU 11 of the MFP 1 operating in accordance with an operation program recorded on a recording medium such as the ROM 12.

When the operator obtains the updated version of the document and performs an operation on the operation panel 17 to register it in the document management server 2, the operation is accepted in step S11. In step S12, the scanner unit 14 reads the document and converts it into an electronic file. Instead of scanning, a document already stored in a box 151 or the like, or a document on the network, may be imported as the document file to be registered.

Next, a table of contents is created from the document file in step S13. In step S14, the document identification code is embedded in the document file as a background pattern, and the document file is transmitted to the document management server 2 together with the created table of contents.

The document file transmitted to the document management server 2 is stored in the storage unit 24 of the server 2 and registered. Through this processing, multiple document files are registered and accumulated in the document management server 2.

Next, in step S15, the created table of contents data is transmitted to the operator's terminal device (PC). In step S16, the MFP 1 requests the previous version of the document file from the document management server 2, using the document identification code entered by the operator as the key.

In step S17, the MFP 1 waits to receive the previous version of the document file from the document management server 2. Once it is received (YES in step S17), the MFP 1 waits in step S18 to receive the data for the items selected by the operator from the operator's terminal device 3.

Once that data is received (YES in step S18), in step S19 a full-text comparison is performed on the paragraphs corresponding to the selected items, and difference extraction processing is executed.

After the extraction process, the extraction result is transmitted to the operator's terminal device 3 in step S20, and the item data selected by the operator is then transmitted to the document management server 2 in step S21.

The item data transmitted to the document management server 2 is stored in the storage unit 25 in the server 2 in association with the document file.
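The paragraph-scoped comparison of step S19 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each document version has already been reduced to a mapping from table of contents item to paragraph text; the function name `extract_differences` and this data model are hypothetical and are not taken from the patent. Python's standard `difflib.ndiff` stands in for the full-text comparison check, whose actual algorithm the description does not specify.

```python
import difflib


def extract_differences(old_doc, new_doc, selected_items):
    """Compare only the paragraphs whose item was selected.

    old_doc / new_doc: dict mapping item name -> paragraph text
    (hypothetical model of the previous and updated versions).
    Returns a dict mapping item name -> list of changed lines.
    """
    result = {}
    for item in selected_items:
        old_lines = old_doc.get(item, "").splitlines()
        new_lines = new_doc.get(item, "").splitlines()
        # Keep only removed ("- ") and added ("+ ") lines;
        # unchanged lines and "? " hint lines are dropped.
        changes = [
            line
            for line in difflib.ndiff(old_lines, new_lines)
            if line.startswith(("- ", "+ "))
        ]
        if changes:
            result[item] = changes
    return result


if __name__ == "__main__":
    previous = {"Order amount": "1,000,000 yen", "Delivery date": "2011-10-01"}
    updated = {"Order amount": "1,200,000 yen", "Delivery date": "2011-10-01"}
    print(extract_differences(previous, updated, ["Order amount", "Delivery date"]))
```

Paragraphs whose items were not selected are never compared, which is the point of restricting the execution range.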

In the above embodiment, the execution range (paragraphs) of the difference extraction process is determined based on the operator's selection of items in the table of contents list. However, the method of determining the execution range is not limited to this.

As another example, when the document obtained by the operator contains items that will be transcribed into a document to be created later, those items may be extracted from the template of the later document, and the paragraphs corresponding to them may be determined as the execution range of the difference extraction process.

For example, when certain work is outsourced to Company A, an internal document such as an outsourcing application form may be created. In such cases, information stated in the contract, such as the "order amount", "development period", and "delivery date", is often transcribed into specific fields of the application form. The parts of the contract that relate to the entries of the document to be created later are therefore judged to be important.

Templates of the various application forms are stored in the in-house document management server 2, and the MFP 1 can acquire these templates from the document management server 2 via the in-house network 4.

The operator sets in advance, in association with the contract, the type of application form or similar document that will be created from that contract. When an updated version of a document file is input and that file is, for example, a contract, the MFP 1 acquires the application form template set for that document file from the document management server 2 and converts the template into a navigation PDF. In the template format, the entry name of each entry field is treated as a heading, so the conversion to a navigation PDF extracts each heading, as shown in FIG. 12. In FIG. 12, for example, keywords such as "business content", "order amount", "development time", "rights to deliverables", and "transaction form" are extracted from the application form template.

The MFP 1 then searches the table of contents of the updated version of the document file for the extracted keywords, and treats the paragraphs in which these keywords appear as items as difference extraction targets.
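The keyword search over the table of contents can be sketched as follows. This is only an illustration: the function name `select_target_items` is hypothetical, and exact matching after trimming whitespace is an assumption (the claims speak of items that are "the same", but the description does not say how near-matches are handled).

```python
def select_target_items(toc_items, template_keywords):
    """Return the table of contents items that also appear as template keywords.

    toc_items: items from the updated document's table of contents.
    template_keywords: entry names extracted from the application form template.
    Matching is exact after stripping surrounding whitespace (an assumption).
    """
    keywords = {keyword.strip() for keyword in template_keywords}
    # Preserve table of contents order so results follow the document.
    return [item for item in toc_items if item.strip() in keywords]


if __name__ == "__main__":
    toc = ["Scope", "Order amount", "Development period", "Delivery date"]
    template = ["Order amount", "Delivery date", "Transaction form"]
    print(select_target_items(toc, template))
```

Keywords that have no counterpart in the table of contents (here "Transaction form") simply produce no target paragraph.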

Another method of determining the execution range of the difference extraction process is to use minutes (including emails and the like that record the content of meetings).

That is, the execution range of the difference extraction process is determined based on the minutes of meetings issued, and the content of emails sent and received, during the period from the previous version (provisional version) of the document file to the updated version (official version).

When documents such as minutes and emails are registered and stored in the document management server 2, their issue date and time are registered as well. During the difference extraction process, the MFP 1 acquires from the document management server 2 the issue date and time of the provisional version of the document file and that of the current official version, and calculates the period up to the issuance of the official version.

Each document is assigned a management ID, such as a project code, that identifies the project to which the document belongs. Minutes of meetings with the supplier and similar records are also stored in the document management server 2 together with their issue date and time, with the same management ID as the related document file.

Based on these dates, the MFP 1 acquires from the document management server 2 the minutes that were issued within the period from the issuance of the provisional version of the document file to the issuance of the official version and that relate to those document files.
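The selection of relevant minutes by management ID and issue period can be sketched as below. The record type `StoredDocument`, its field names, and the function `minutes_between` are hypothetical, and treating both ends of the period as inclusive is an assumption; the description only says the minutes fall within the period between the two issue dates.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StoredDocument:
    """Hypothetical record for a document held by the management server."""
    title: str
    management_id: str   # e.g. a project code
    issued_at: datetime  # issue date and time registered with the document
    kind: str            # "contract", "minutes", ...


def minutes_between(documents, management_id, provisional_issued, official_issued):
    """Return minutes for the same project issued between the two versions."""
    return [
        doc
        for doc in documents
        if doc.kind == "minutes"
        and doc.management_id == management_id
        # Inclusive bounds are an assumption, not stated in the description.
        and provisional_issued <= doc.issued_at <= official_issued
    ]


if __name__ == "__main__":
    docs = [
        StoredDocument("Kickoff", "PJ-001", datetime(2011, 6, 1), "minutes"),
        StoredDocument("Price review", "PJ-001", datetime(2011, 7, 15), "minutes"),
        StoredDocument("Other project", "PJ-002", datetime(2011, 7, 20), "minutes"),
    ]
    hits = minutes_between(docs, "PJ-001", datetime(2011, 7, 1), datetime(2011, 8, 26))
    print([doc.title for doc in hits])
```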

On acquiring the minutes, the MFP 1 converts them into a navigation PDF in the same way as the application form template, extracts each agenda item from the minutes, and sets each agenda item as an important keyword.

As yet another method of determining the execution range of the difference extraction process, items that were determined to be difference extraction targets when the process was previously performed on a similar document file may be used.

For example, if a contract was concluded with the same supplier in a different project in the past, and the difference extraction process was performed on the contract at that time, the difference extraction target items used then can be reused.

Difference extraction target items set in the past are stored in the document management server 2 as history in association with the document file. When an updated version of a document file to be subjected to the difference extraction process is input, the MFP 1 requests the document management server 2 to provide the difference extraction target items used for a similar past document file. The document management server 2 retrieves from the history information the difference extraction target items set for the similar past document file, based on information such as the document identification code, and transmits them to the MFP 1.

The MFP 1 executes the difference extraction process between the provisional version and the official version of the document file for the paragraphs corresponding to the acquired difference extraction target items.

As described above, there are several methods of setting the execution range of the difference extraction process. Therefore, when the difference extraction process is executed, a selection screen for the method of setting the difference extraction range, as shown in FIG. 13, may be displayed on the operator's terminal device 3 so that the operator can select a method, and the MFP 1 may determine the execution range using the selected method.
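Choosing among the range-setting methods can be sketched as a simple dispatch. Everything named here is hypothetical: the `RangeMethod` enumeration, the `providers` mapping of method to a callable returning candidate item names, and the function `decide_target_items`. The sketch only assumes that every method ultimately yields a list of item names to be matched against the table of contents.

```python
from enum import Enum


class RangeMethod(Enum):
    """Range-setting methods described in the text (names are hypothetical)."""
    TOC_SELECTION = "toc_selection"  # operator picks items from the table of contents
    TEMPLATE = "template"            # entry names of a related template document
    MINUTES = "minutes"              # agenda items of related minutes
    HISTORY = "history"              # items used for a similar past document


def decide_target_items(method, toc_items, providers):
    """Determine the difference extraction targets using the selected method.

    providers: dict mapping RangeMethod -> zero-argument callable that
    returns candidate item names for that method (hypothetical interface).
    """
    if method not in providers:
        raise ValueError(f"no provider registered for {method}")
    candidates = set(providers[method]())
    # Only items that actually exist in the table of contents become targets.
    return [item for item in toc_items if item in candidates]


if __name__ == "__main__":
    toc = ["Scope", "Order amount", "Delivery date"]
    providers = {
        RangeMethod.TEMPLATE: lambda: ["Order amount", "Transaction form"],
        RangeMethod.HISTORY: lambda: ["Delivery date", "Scope"],
    }
    print(decide_target_items(RangeMethod.HISTORY, toc, providers))
```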

DESCRIPTION OF SYMBOLS
1 Image processing apparatus
2 Document management server
3 User terminal device
4 Network
11 CPU
12 ROM
14 Scanner unit
15 Storage unit
17 Operation panel
171 Display unit
18 Network controller

Claims

A document file difference extraction system in which an image processing apparatus and a document management server are connected via a network,
The image processing apparatus includes:
An input means for inputting the first document file;
A table of contents creating means for creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input by the input means;
Item determination means for determining, from among the items shown in the table of contents created by the table of contents creating means, an item to be a difference extraction target;
Document file acquisition means for acquiring, from the document management server, a second document file to be compared with the first document file for difference extraction;
Difference extraction means for extracting the difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined by the item determination means; and
Control means for displaying the difference extracted by the difference extraction means on a display means or transmitting it to an external user terminal device,
The document management server includes:
Storage means for storing one or more document files; and
Transmission means for transmitting the second document file stored in the storage means to the image processing apparatus based on an acquisition request from the image processing apparatus,
One or more template documents are stored in the storage means of the document management server,
The document file acquisition means of the image processing apparatus acquires a template document related to the first document file from the document management server,
The table of contents creating means extracts items of the acquired template document,
A document file difference extraction system, wherein the item determination means determines, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the template document.

A document file difference extraction system in which an image processing apparatus and a document management server are connected via a network,
The image processing apparatus includes:
An input means for inputting the first document file;
A table of contents creating means for creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input by the input means;
Item determination means for determining, from among the items shown in the table of contents created by the table of contents creating means, an item to be a difference extraction target;
Document file acquisition means for acquiring, from the document management server, a second document file to be compared with the first document file for difference extraction;
Difference extraction means for extracting the difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined by the item determination means; and
Control means for displaying the difference extracted by the difference extraction means on a display means or transmitting it to an external user terminal device,
The document management server includes:
Storage means for storing one or more document files; and
Transmission means for transmitting the second document file stored in the storage means to the image processing apparatus based on an acquisition request from the image processing apparatus,
One or more minutes are stored in the storage means of the document management server,
The image processing apparatus further includes date acquisition means for acquiring the issue dates of the first document file and the second document file,
The document file acquisition means of the image processing apparatus acquires, from the document management server, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file,
The table of contents creating means extracts items of the acquired minutes,
A document file difference extraction system, wherein the item determination means determines, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the minutes.

An input means for inputting the first document file;
A table of contents creating means for creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input by the input means;
Item determination means for determining, from among the items shown in the table of contents created by the table of contents creating means, an item to be a difference extraction target;
Document file acquisition means for acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
Difference extraction means for extracting the difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined by the item determination means; and
Control means for displaying the difference extracted by the difference extraction means on a display means or transmitting it to an external user terminal device,
are provided,
One or more template documents are stored in the storage means,
The document file acquisition means acquires a template document related to the first document file from the storage means,
The table of contents creating means extracts items of the acquired template document,
An image processing apparatus, wherein the item determination means determines, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the template document.

An input means for inputting the first document file;
A table of contents creating means for creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input by the input means;
Item determination means for determining, from among the items shown in the table of contents created by the table of contents creating means, an item to be a difference extraction target;
Document file acquisition means for acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
Difference extraction means for extracting the difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined by the item determination means; and
Control means for displaying the difference extracted by the difference extraction means on a display means or transmitting it to an external user terminal device,
are provided,
One or more minutes are stored in the storage means,
Date acquisition means for acquiring the issue dates of the first document file and the second document file is further provided,
The document file acquisition means acquires, from the storage means, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file,
The table of contents creating means extracts items of the acquired minutes,
An image processing apparatus, wherein the item determination means determines, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the minutes.

A document file difference extraction method executed in a system in which an image processing apparatus and a document management server are connected via a network,
The image processing apparatus executes:
An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from the document management server, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
The document management server executes a transmission step of transmitting, to the image processing apparatus, the second document file stored in storage means that stores one or more document files, based on an acquisition request from the image processing apparatus,
One or more template documents are stored in the storage means of the document management server,
In the document file acquisition step, the image processing apparatus acquires a template document related to the first document file from the document management server,
In the table of contents creation step, items of the acquired template document are extracted,
A document file difference extraction method, wherein, in the item determination step, the table of contents item of the first document file that is the same as an extracted item of the template document is determined as the difference extraction target item.

A document file difference extraction method executed in a system in which an image processing apparatus and a document management server are connected via a network,
The image processing apparatus executes:
An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from the document management server, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
The document management server executes a transmission step of transmitting, to the image processing apparatus, the second document file stored in storage means that stores one or more document files, based on an acquisition request from the image processing apparatus,
One or more minutes are stored in the storage means of the document management server,
The image processing apparatus further executes a date acquisition step of acquiring the issue dates of the first document file and the second document file,
In the document file acquisition step, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file are acquired from the document management server,
In the table of contents creation step, items of the acquired minutes are extracted,
A document file difference extraction method, wherein, in the item determination step, the table of contents item of the first document file that is the same as an extracted item of the minutes is determined as the difference extraction target item.

An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
are executed,
One or more template documents are stored in the storage means,
In the document file acquisition step, a template document related to the first document file is acquired from the storage means,
In the table of contents creation step, items of the acquired template document are extracted,
A document file difference extraction method in an image processing apparatus, wherein, in the item determination step, the table of contents item of the first document file that is the same as an extracted item of the template document is determined as the difference extraction target item.

An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
are executed,
One or more minutes are stored in the storage means,
A date acquisition step of acquiring the issue dates of the first document file and the second document file is further executed,
In the document file acquisition step, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file are acquired from the storage means,
In the table of contents creation step, items of the acquired minutes are extracted,
A document file difference extraction method in an image processing apparatus, wherein, in the item determination step, the table of contents item of the first document file that is the same as an extracted item of the minutes is determined as the difference extraction target item.

An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
are executed by a computer of the image processing apparatus,
One or more template documents are stored in the storage means,
In the document file acquisition step, a template document related to the first document file is acquired from the storage means,
In the table of contents creation step, items of the acquired template document are extracted,
A document file difference extraction program for causing the computer to execute, in the item determination step, a process of determining, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the template document.

An input step of inputting a first document file;
A table of contents creation step of creating a table of contents by analyzing the paragraph structure in the document and extracting items of each paragraph from the first document file input in the input step;
An item determination step of determining, from among the items shown in the table of contents created in the table of contents creation step, an item to be a difference extraction target;
A document file acquisition step of acquiring, from storage means that is provided inside or outside the apparatus and stores one or more document files, a second document file to be compared with the first document file for difference extraction;
A difference extraction step of extracting a difference between the first document file and the second document file for the paragraph corresponding to the difference extraction target item determined in the item determination step; and
A control step of displaying the difference extracted in the difference extraction step on a display means or transmitting the difference to an external user terminal device,
are executed by a computer of the image processing apparatus,
One or more minutes are stored in the storage means,
The computer is caused to further execute a date acquisition step of acquiring the issue dates of the first document file and the second document file,
In the document file acquisition step, minutes that relate to the first document file and were created between the issue date of the second document file and the issue date of the first document file are acquired from the storage means,
In the table of contents creation step, items of the acquired minutes are extracted,
A document file difference extraction program for causing the computer to execute, in the item determination step, a process of determining, as the difference extraction target item, the table of contents item of the first document file that is the same as an extracted item of the minutes.