JP2009230450A

JP2009230450A - Document attribute information register and program

Info

Publication number: JP2009230450A
Application number: JP2008074855A
Authority: JP
Inventors: Yasuhiro Maruyama; 泰弘丸山
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2008-03-24
Filing date: 2008-03-24
Publication date: 2009-10-08

Abstract

PROBLEM TO BE SOLVED: To simplify the work of adding attribute information to a file, such as a document, that is to be registered.

SOLUTION: This attribute information registration apparatus has: a real attribute name information storage part 26 that stores, in association with each other, a document classification and, out of the real attribute names registered in a document database 42, the real attribute names to be described in each document belonging to that document classification; an attribute name conversion information storage part 27 that stores conversion records, each including as a set the document classification, a read attribute name described in a read document, and the real attribute name of the attribute item to which the read attribute name is given; an attribute name converting part 23 that converts the read attribute name into the real attribute name when the read attribute name extracted from a read document of that document classification is judged not to match a real attribute name; and a document registering part 24 that registers the attribute information of the read document in the document database 42 with its attribute item names expressed as real attribute names.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a document attribute management apparatus and a program.

In recent years, techniques have become available for automatically extracting pairs of attribute names and attribute values described in a document by reading the document with a scanner or the like and analyzing the content of the scanned document. For example, when a scanner reads a prescription used in a medical institution, the pair consisting of the attribute name "doctor in charge" and the attribute value "○山△夫" described in the prescription is extracted. With this technique, simply having a scanner or the like read a document makes it possible to automatically add the attribute information of the scanned document to that document and register it in a predetermined document database of a document management system. A technique for automatically extracting the title of a document from a scanned document has also been proposed (for example, Patent Document 1).

Medical institutions handle various types of documents in addition to prescriptions, such as referral letters, surgical consent forms, and examination request forms. Depending on the type of document, various expressions may be used for the doctor who performs a medical procedure, such as "attending physician" and "examination requester" in addition to the "doctor in charge" mentioned above.

Patent documents: JP 2006-251864 A; JP 2006-092226 A; JP 2003-030218 A; JP 2003-330959 A; JP 2004-078343 A

However, suppose that the data items registered in the database use the attribute name "attending physician" for the doctor who performs a medical procedure, and a newly read document carries the attribute name "doctor in charge". Even though both names denote the same attribute of the document, the attribute names differ, so the two are treated as different attribute items.

An object of the present invention is to simplify the work of adding attribute information to a file, such as a document, and registering it.

The program according to the present invention causes a computer to function as: real attribute name information storage means for storing, in association with each other, identification information of a document type and, among the real attribute names (the names of attribute items used when attribute information extracted from the content of a document is registered in attribute information storage means), the real attribute names of one or more attribute items to be described in each document belonging to that document type; attribute name conversion information storage means for storing conversion records, each including as a set the identification information of a document type, a read attribute name that is the name of an attribute item described in a document, and the real attribute name of the attribute item to which that read attribute name is given; reading means for reading a document; extraction means for extracting, by analyzing the document read by the reading means, the document type and pairs of the read attribute name and attribute value of the attribute items described in the document; attribute name conversion means for converting a read attribute name into a real attribute name when it determines, by referring to the attribute name conversion information storage means, that the read attribute name of the document type extracted by the extraction means does not match a real attribute name; and registration means for registering the attribute information of the document read by the reading means in the attribute information storage means under the real attribute names.

Further, the attribute name conversion means has a conversion record registration unit that, when no conversion record including the read attribute name of the document type extracted by the extraction means is stored in the attribute name conversion information storage means, has the user input the real attribute name into which the read attribute name should be converted, generates a conversion record that associates the input real attribute name with the read attribute name, and registers the record in the attribute name conversion information storage means.

The document attribute information registration apparatus according to the present invention has: real attribute name information storage means for storing, in association with each other, identification information of a document type and, among the real attribute names (the names of attribute items used when attribute information extracted from the content of a document is registered in attribute information storage means), the real attribute names of one or more attribute items to be described in each document belonging to that document type; attribute name conversion information storage means for storing conversion records, each including as a set the identification information of a document type, a read attribute name that is the name of an attribute item described in a document, and the real attribute name of the attribute item to which that read attribute name is given; reading means for reading a document; extraction means for extracting, by analyzing the document read by the reading means, the document type and pairs of the read attribute name and attribute value of the attribute items described in the document; attribute name conversion means for converting a read attribute name into a real attribute name when it determines, by referring to the attribute name conversion information storage means, that the read attribute name of the document type extracted by the extraction means does not match a real attribute name; and registration means for registering the attribute information of the document read by the reading means in the attribute information storage means under the real attribute names.

According to the invention of claim 1, even when the names of attribute items are not unified across document types, the computer can be made to convert the attribute name of an attribute item extracted from a read document (the read attribute name) into the real attribute name, which is the name of the attribute item used when registering in the attribute information storage means, and then register it.

According to the invention of claim 2, even when no conversion record including the read attribute name of the document type extracted by the extraction means is stored in the attribute name conversion information storage means, a conversion record that associates the read attribute name with the real attribute name of the attribute item to which the read attribute name is given can be registered in the attribute name conversion information storage means each time.

According to the invention of claim 3, even when the names of attribute items are not unified across document types, the attribute name of an attribute item extracted from a read document (the read attribute name) can be converted into the real attribute name, which is the name of the attribute item used when registering in the attribute information storage means, and then registered.

Hereinafter, preferred embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a hardware configuration diagram of a computer forming a document attribute information registration apparatus 20 according to the present invention. The computer forming the document attribute information registration apparatus 20 in the present embodiment can be realized with a conventional general-purpose hardware configuration such as a personal computer. That is, as shown in FIG. 1, the computer is configured by connecting the following to an internal bus 11: a CPU 1; a ROM 2; a RAM 3; an HDD controller 5 to which a hard disk drive (HDD) 4 is connected; an input/output controller 9 to which a mouse 6 and a keyboard 7 provided as input means and a display 8 provided as a display device are connected; and a network controller 10 provided as communication means. In the present embodiment, a document database installed in an external document management server 40 is accessed, and this access is performed via the network controller 10. Since the present embodiment requires a means for reading paper documents, a scanner 12 is also connected to the input/output controller 9.

In the present embodiment, the scanner 12 is used, so the document attribute information registration apparatus 20 is formed by externally connecting the scanner 12 to a computer. However, the document attribute information registration apparatus 20 may instead be realized with an image forming apparatus, such as a multifunction machine, that is already equipped with a scanner.

Although its performance may differ, the document management server 40 is also a computer, so its hardware configuration can be illustrated in the same way as in FIG. 1.

FIG. 2 is a block diagram of the document management system in the present embodiment. FIG. 2 shows the document attribute information registration apparatus 20 and the document management server 40.

The document attribute information registration apparatus 20 has a document reading unit 21, a document analysis unit 22, an attribute name conversion unit 23, a document registration unit 24, a processing control unit 25, a real attribute name information storage unit 26, and an attribute name conversion information storage unit 27.

The document reading unit 21 reads a document set on the scanner 12. The document analysis unit 22 analyzes the document read by the document reading unit 21 and performs the following processing by means of its built-in document type specifying unit 28, attribute extraction unit 29, and real attribute name information registration unit 30. The document type specifying unit 28 specifies the classification code from the read image of the document. The classification code is an identification code assigned to each type of document. The attribute extraction unit 29 extracts, as pairs, the attribute names described in the document and their attribute values from the read image. The real attribute name information registration unit 30 registers the real attribute name information for a classification code in the real attribute name information storage unit 26 when the user inputs and designates that classification code. The attribute name conversion unit 23 converts a read attribute name into a real attribute name when it determines, by referring to the attribute name conversion information storage unit 27, that the read attribute name of the document type extracted by the document analysis unit 22 does not match a real attribute name. The document registration unit 24 adds the attribute information to the document read by the document reading unit 21 and transmits it to the document management server 40, thereby instructing that the document be registered in the database. The processing control unit 25 performs overall operation control of the components 21 to 24. The real attribute name information storage unit 26 and the attribute name conversion information storage unit 27 store the real attribute name information and the attribute name conversion information, respectively, which are described later.

As noted above, two kinds of attribute names appear in the present embodiment: the real attribute name and the read attribute name. These attribute names are explained here.

The "attribute information" handled in the present embodiment refers to the information extracted from the content of a document, or a part of that information. The present embodiment is described using the example of a document management system applied to a medical institution, where various types of documents are handled, such as prescriptions, referral letters, surgical consent forms, and examination request forms. In a prescription, for example, labels identifying the items to be entered are printed, such as the patient ID, the patient name, the department that treated the patient, the attending physician, and the prescribed drugs, and the attribute value is written in the entry field adjacent to each label. The information placed on each document by printing or handwriting is read with a scanner or the like, and the read image is analyzed using existing technology to extract the name of each attribute item ("patient ID", "attending physician", and so on in this prescription example) paired with its attribute value. The information related to a document that is obtained in this way by reading the items described in the document is called "attribute information" in the present embodiment, and it contains one or more pairs of an attribute item name obtained from the read image and the attribute value of that attribute item. The name of an attribute item extracted from a read document, such as "patient ID" or "attending physician", is called the "read attribute name" in the present embodiment.
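The attribute information described above can be pictured as a list of (read attribute name, attribute value) pairs. The following is only a minimal sketch: the text relies on unspecified existing technology for the image analysis, so the function `extract_attribute_pairs`, the "label: value" line format, and the sample text are all assumptions made for illustration.

```python
from typing import List, Tuple


def extract_attribute_pairs(recognized_lines: List[str]) -> List[Tuple[str, str]]:
    """Collect (read attribute name, attribute value) pairs.

    Assumption: character recognition has already produced text lines, and
    each labeled entry field appears as "label: value". Real documents would
    need layout analysis, which is outside this sketch.
    """
    pairs = []
    for line in recognized_lines:
        name, separator, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Keep only lines that have a label, a separator, and a filled-in value.
        if separator and name and value:
            pairs.append((name, value))
    return pairs


# Hypothetical recognized text of a prescription.
sample = [
    "Prescription",
    "Patient ID: 12345",
    "Attending physician: Dr. A",
    "Notes:",
]
attribute_information = extract_attribute_pairs(sample)
```

The names on the left of each pair are read attribute names; they are not yet guaranteed to match the names used in the document database.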

Meanwhile, for the document management system to hold and manage the attribute values of each document, the attribute items must be managed under names. In the present embodiment, the documents to be managed and their attribute information are registered in the document database 42, and the name of an attribute item used when registering attribute information in the document database 42 is called the "real attribute name".

The expression used for the doctor in charge of diagnosis or treatment differs by document: "attending physician" in a prescription, "examination requester" in an examination request sheet, "name of doctor in charge" in a surgical consent form, and so on. Consequently, even for attribute items that all carry the same information, namely the doctor, the read attribute names differ: "attending physician", "examination requester", "name of doctor in charge", and so on. When attribute items carrying the same information are managed in a database, however, all of these read attribute names must be unified and registered under a single real attribute name, for example "attending physician". The present embodiment is characterized in that attribute names whose expression varies by document (read attribute names) are converted into real attribute names so that the attribute information can be registered in the document database 42.

The components 21 to 25 of the document attribute information registration apparatus 20 are realized by cooperation between the computer forming the document attribute information registration apparatus 20 and a program running on the CPU 1 installed in that computer. The real attribute name information storage unit 26 and the attribute name conversion information storage unit 27 are realized by the HDD 4 or the RAM 3 installed in the document attribute information registration apparatus 20.

The document management server 40 is a server used for overall management of the documents handled by this system. The document database 42 stores each document and the attribute information of that document. The document management unit 41 performs general management of the information stored in the document database 42, such as registering information on new documents in the database, updating and deleting information on existing documents, and returning information in response to external inquiries. The document management unit 41 is realized by cooperation between the server computer forming the document management server 40 and a program running on the CPU installed in that server computer. The document database 42 is realized by the HDD 4 installed in the document management server 40.

FIG. 3 is a diagram showing an example of the data structure of the attribute information registered in the document database 42 in the present embodiment. The attribute information registered in the document database 42 is set for each document and consists of attribute items identified by the real attribute names: document name, document classification, patient ID, patient name, department, and attending physician. Of these, the document classification is an identification code assigned to identify the type of the document.

FIG. 4 is a diagram showing an example of the data structure of the real attribute name information stored in the real attribute name information storage unit 26 in the present embodiment. The real attribute name information is set for each document classification. Each entry associates the document classification with the title commonly given to documents belonging to that classification and with the names (real attribute names) of one or more attribute items to be described in documents belonging to that classification.

In the present embodiment, documents belonging to a given document type, for example examination request sheets, may exist in several formats, but it is assumed that every examination request sheet uses "examination requester" as the expression denoting the doctor. The real attribute name information is therefore set by associating real attribute names with each classification that identifies a document type.

FIG. 5 is a diagram showing an example of the data structure of the attribute name conversion information stored in the attribute name conversion information storage unit 27 in the present embodiment. The attribute name conversion information is set for each attribute item. Each entry contains, as a set, the document classification, the name of the attribute item commonly used in documents belonging to that classification (the read attribute name), and the name of the attribute item used when the attribute item bearing that read attribute name is registered in the database (the real attribute name). The attribute name conversion information corresponding to each document classification is also called a conversion record.
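To make the two tables easier to picture, the following is a minimal sketch of the real attribute name information (FIG. 4) and the conversion records (FIG. 5) as in-memory structures. The class names, field names, classification codes "A01" and "B02", and all sample values are assumptions for illustration; the figures themselves are not reproduced in the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RealAttributeNameInfo:
    """One entry of real attribute name information (cf. FIG. 4)."""
    classification: str          # document classification code
    title: str                   # title shared by documents of this classification
    real_names: Tuple[str, ...]  # real attribute names expected in such documents


@dataclass(frozen=True)
class ConversionRecord:
    """One conversion record (cf. FIG. 5)."""
    classification: str  # document classification code
    read_name: str       # attribute name as printed on the document
    real_name: str       # attribute name used in the document database


# Hypothetical sample data; the codes and names are invented for illustration.
COMMON_NAMES = ("Patient ID", "Patient name", "Department", "Attending physician")
REAL_ATTRIBUTE_NAME_INFO: Dict[str, RealAttributeNameInfo] = {
    "A01": RealAttributeNameInfo("A01", "Prescription", COMMON_NAMES),
    "B02": RealAttributeNameInfo("B02", "Examination request sheet", COMMON_NAMES),
}

# Keyed by (classification, read name) so a lookup needs no scan.
CONVERSION_RECORDS: Dict[Tuple[str, str], ConversionRecord] = {
    ("B02", "Examination requester"): ConversionRecord(
        "B02", "Examination requester", "Attending physician"
    ),
}
```

Keying the conversion records by the pair (classification, read name) reflects the description that the same printed label is interpreted per document classification.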

The program used in the present embodiment can be provided by communication means and can also be provided stored on a computer-readable recording medium such as a CD-ROM or DVD-ROM. A program provided via communication means or a recording medium is installed on a computer, and the various processes are realized as the CPU of the computer sequentially executes the installed program.

Next, the process of registering the attribute information of a document in the database in the present embodiment will be described with reference to the flowcharts shown in FIGS. 6 and 7.

First, when the user sets the paper document to be registered on the scanner 12 and performs a predetermined reading operation, the document reading unit 21 reads the paper document using the scanner 12 (step 100). The document type specifying unit 28 in the document analysis unit 22 analyzes the read image using existing technology to extract and specify the document classification code described in the document (step 101). If the document classification cannot be specified from the read document, the user may be asked to designate the document classification code on the document type designation screen described later. The document name is extracted together with the document classification. If the document name cannot be extracted, it can be obtained by sending a document name acquisition request containing the document classification to the document management server 40. Next, the attribute extraction unit 29 in the document analysis unit 22 analyzes the read image using existing technology to extract, as pairs, the attribute names and attribute values described in the document (step 102). The document analysis unit 22 then displays a document registration confirmation screen showing the extracted information on the display 8 (step 103). FIG. 8 shows a display example of the document registration confirmation screen in the present embodiment. As in this example, the document registration confirmation screen displays the document classification and document name obtained by the above processing, together with the attribute names (read attribute names) paired with their attribute values.

Here, the user selects one of the following: the document type change button 36, when the document type that can be confirmed from the document classification and/or document name differs from the desired one and is to be changed; the attribute addition button 35, when the user knows in advance that a read attribute name differs from the real attribute name and is not registered in the attribute name conversion information storage unit 27, and therefore wishes to add a conversion record to it; or the registration button 37, when the read document may be registered in the database with the information as displayed. First, when the user selects the document type change button 36 ("document type change" in step 104), the document type specifying unit 28 displays the document type designation screen on the display 8.

FIG. 9 is a diagram showing an example of the screen that displays the document type to which the document being processed belongs in the present embodiment. This screen has a display area 51 showing the pre-change document classification and document name extracted in step 101, an input area 52 for entering the new document classification, and a display area 53 that, once a document classification has been entered, shows the corresponding document name obtained from the document management server 40. When the user enters a document classification in the input area 52 and selects the change button 54, the document type specifying unit 28 changes the document classification (step 105). The document analysis unit 22 then displays the changed document classification on the document registration confirmation screen (step 103).

When the user selects the attribute addition button 35 ("attribute addition" in step 104), the processing control unit 25 causes the attribute name conversion unit 23 to execute the attribute name addition process (step 106). This process is described with reference to the flowchart shown in FIG. 7.

The attribute name conversion unit 23 first displays an attribute addition screen on the display 8 (step 121). FIG. 8 is a diagram showing a display example of the attribute addition screen in the present embodiment. The screen in FIG. 8 has a display area 56 that lists read attribute names and a display area 57 that lists real attribute names. The read attribute names shown in the display area 56 are obtained from the document analysis unit 22. The real attribute names shown in the display area 57 can be obtained by sending a real attribute name acquisition request, containing the document classification obtained from the document analysis unit 22, to the document management server 40. The user is asked to select one read attribute name and one real attribute name to be associated with each other and then to select the attribute addition button 58. When the pair of a read attribute name and the real attribute name associated with it is accepted in this way (step 122), the conversion record registration unit 31 generates a conversion record by associating the document classification with that pair of attribute names and registers it in the attribute name conversion information storage unit 27 (step 123). On confirming that the attribute name addition process has finished, the processing control unit 25 causes the document analysis unit 22 to display the document registration confirmation screen (step 103).
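Steps 121 to 123 can be sketched as a single function that accepts the user's selection and stores a conversion record. This is only an illustrative sketch: the function name `add_conversion_record`, the dictionary-based storage, and the validation against the two displayed lists are assumptions, and the screen interaction itself is replaced by plain arguments.

```python
from typing import Dict, Iterable, Tuple

# (document classification, read attribute name) -> real attribute name
ConversionTable = Dict[Tuple[str, str], str]


def add_conversion_record(
    table: ConversionTable,
    classification: str,
    read_name: str,
    real_name: str,
    displayed_read_names: Iterable[str],
    displayed_real_names: Iterable[str],
) -> Tuple[str, str, str]:
    """Register one conversion record from the user's selection (steps 122-123).

    The two "displayed" arguments stand in for the lists shown in display
    areas 56 and 57; a selection outside them is rejected.
    """
    if read_name not in set(displayed_read_names):
        raise ValueError(f"unknown read attribute name: {read_name!r}")
    if real_name not in set(displayed_real_names):
        raise ValueError(f"unknown real attribute name: {real_name!r}")
    # Step 123: associate the document classification with the selected pair.
    table[(classification, read_name)] = real_name
    return (classification, read_name, real_name)


# Hypothetical usage with invented classification code and names.
table: ConversionTable = {}
new_record = add_conversion_record(
    table,
    "B02",
    "Examination requester",
    "Attending physician",
    displayed_read_names=["Patient ID", "Examination requester"],
    displayed_real_names=["Patient ID", "Attending physician"],
)
```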

When the user selects the registration button 48 ("Register" in step 104), the process control unit 25 causes the attribute name conversion unit 23 to execute the attribute conversion process described below. At this time, the document analysis unit 22 retrieves the real attribute names corresponding to the document classification from the real attribute name information storage unit 26 and passes those real attribute names, the read attribute names, the document classification, and the read image to the attribute name conversion unit 23.

First, the attribute name conversion unit 23 compares one of the read attribute names received from the document analysis unit 22 with the real attribute names (step 107). If a real attribute name matching the read attribute name exists (Y in step 108) and a read attribute name that has not yet been compared remains (Y in step 114), the process moves on to comparing that read attribute name (step 107).

If the comparison shows that no real attribute name matches the read attribute name (N in step 108), the attribute name conversion unit 23 searches the attribute name conversion information storage unit 27 using the document classification and the read attribute name as keys. If a conversion record containing that document classification and read attribute name exists (Y in step 110), the attribute name conversion unit 23 converts the read attribute name into the real attribute name contained in the conversion record (step 112). If no such conversion record exists (N in step 110), the conversion record registration unit 31 in the attribute name conversion unit 23 executes the attribute name addition process (step 111). This process has already been described, so the description is not repeated here. Through it, a conversion record is generated by associating the document classification with the pair of the read attribute name being processed and the real attribute name specified by the user, and the record is registered in the attribute name conversion information storage unit 27. The attribute name conversion unit 23 then converts the read attribute name into the real attribute name contained in that conversion record (step 112).

The above processing is repeated for all read attribute names (step 113). As a result, a read attribute name that matches a real attribute name is left as is without conversion, and a read attribute name that does not match any real attribute name is converted into a real attribute name according to the attribute name conversion information. Consequently, the names of the attribute items to be registered are all real attribute names.
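The conversion loop of steps 107 to 113 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation described in the specification: the function name `convert_names`, the dictionary used for conversion records, and the `ask_user` callback (which stands in for the attribute addition screen) are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# (document classification, read attribute name) -> real attribute name
ConversionRecords = Dict[Tuple[str, str], str]


def convert_names(
    doc_class: str,
    read_names: List[str],
    real_names: Set[str],
    records: ConversionRecords,
    ask_user: Callable[[str], str],
) -> List[str]:
    """Return one real attribute name for each read attribute name."""
    converted = []
    for read_name in read_names:
        if read_name in real_names:
            # Steps 107-108: the name already is a real attribute name.
            converted.append(read_name)
            continue
        key = (doc_class, read_name)
        if key not in records:
            # Steps 110-111: no conversion record yet, so ask the user
            # for the real attribute name and register a new record.
            records[key] = ask_user(read_name)
        # Step 112: convert using the conversion record.
        converted.append(records[key])
    return converted


# Illustrative data only.
real = {"doctor in charge", "patient name"}
records = {("prescription", "physician"): "doctor in charge"}
result = convert_names(
    "prescription",
    ["patient name", "physician", "name of patient"],
    real,
    records,
    ask_user=lambda name: "patient name",  # stands in for the attribute addition screen
)
```

In this sketch the user is consulted only for "name of patient", and the record created for it is reused the next time a document of the same classification carries that label.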

The document registration unit 24 then forms attribute information that includes the pairs of real attribute name and attribute value together with the document classification, attaches it to the read document, and sends it to the document management server 40, thereby issuing a document registration instruction. When the document management unit 41 in the document management server 40 receives the document registration instruction from the document attribute information registration apparatus 20, it associates the attribute information with the document and registers them in the document database 42 (step 114).
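One possible shape for the attribute information formed before registration is sketched below. The function name `build_attribute_info` and the field names `document_class` and `attributes` are assumptions made for illustration; the specification does not define a concrete data format or transport.

```python
from typing import Dict, List


def build_attribute_info(
    doc_class: str,
    real_names: List[str],
    values: List[str],
) -> Dict[str, object]:
    """Pair each real attribute name with its value and add the document classification."""
    if len(real_names) != len(values):
        raise ValueError("each attribute name needs exactly one value")
    return {
        "document_class": doc_class,                  # hypothetical field name
        "attributes": dict(zip(real_names, values)),  # real attribute name -> value
    }


# Illustrative values only.
info = build_attribute_info(
    "prescription",
    ["doctor in charge", "patient name"],
    ["Dr. A", "Patient B"],
)
```

Because the names have already been converted, every key under `attributes` is a real attribute name, so documents of the same classification are registered under the same set of names.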

In the present embodiment, the invention has been described using its application to a medical institution as an example, but it can of course be applied to other institutions and fields.

A hardware configuration diagram of the computer that forms the document attribute information registration apparatus according to the present invention.
A block configuration diagram of the document management system in the present embodiment.
A diagram showing a data configuration example of the attribute information registered in the document database in the present embodiment.
A diagram showing a data configuration example of the real attribute name information stored in the real attribute name information storage unit in the present embodiment.
A diagram showing a data configuration example of the attribute name conversion information stored in the attribute name conversion information storage unit in the present embodiment.
A flowchart showing the process of registering document attribute information in the database in the present embodiment.
A flowchart showing the attribute name addition process in the present embodiment.
A diagram showing an example of the document registration confirmation screen in the present embodiment.
A diagram showing an example of the document type designation screen in the present embodiment.
A diagram showing an example of the attribute addition screen in the present embodiment.

Explanation of symbols

1 CPU, 2 ROM, 3 RAM, 4 hard disk drive (HDD), 5 HDD controller, 6 mouse, 7 keyboard, 8 display, 9 input/output controller, 10 network controller, 11 internal bus, 12 scanner, 20 document attribute information registration apparatus, 21 document reading unit, 22 document analysis unit, 23 attribute name conversion unit, 24 document registration unit, 25 process control unit, 26 real attribute name information storage unit, 27 attribute name conversion information storage unit, 28 document type identification unit, 29 attribute extraction unit, 30 real attribute name information registration unit, 31 conversion record registration unit, 40 document management server, 41 document management unit, 42 document database.

Claims

A program for causing a computer to function as:
real attribute name information storage means for storing, in association with each other, identification information of a document type and, among real attribute names that are the names of attribute items used when attribute information extracted from the description content of a document is registered in attribute information storage means, the real attribute names of one or more attribute items to be described in each document belonging to that document type;
attribute name conversion information storage means for storing a conversion record that includes, as a set, identification information of a document type, a read attribute name that is the name of an attribute item described in a document, and the real attribute name of the attribute item to which the read attribute name is attached;
reading means for reading a document;
extracting means for extracting, by analyzing the document read by the reading means, the document type and pairs of a read attribute name and an attribute value of the attribute items described in the document;
attribute name conversion means for converting a read attribute name of the document type extracted by the extracting means into a real attribute name by referring to the attribute name conversion information storage means when the read attribute name is determined not to match any real attribute name; and
registration means for registering the attribute information of the document read by the reading means in the attribute information storage means under the real attribute names.

The program according to claim 1,
wherein the attribute name conversion means includes conversion record registration means that, when no conversion record including the read attribute name of the document type extracted by the extracting means is stored in the attribute name conversion information storage means, prompts a user to input the real attribute name into which the read attribute name is to be converted, generates a conversion record by associating the input real attribute name, and registers the conversion record in the attribute name conversion information storage means.

A document attribute information registration apparatus comprising:
real attribute name information storage means for storing, in association with each other, identification information of a document type and, among real attribute names that are the names of attribute items used when attribute information extracted from the description content of a document is registered in attribute information storage means, the real attribute names of one or more attribute items to be described in each document belonging to that document type;
attribute name conversion information storage means for storing a conversion record that includes, as a set, identification information of a document type, a read attribute name that is the name of an attribute item described in a document, and the real attribute name of the attribute item to which the read attribute name is attached;
reading means for reading a document;
extracting means for extracting, by analyzing the document read by the reading means, the document type and pairs of a read attribute name and an attribute value of the attribute items described in the document;
attribute name conversion means for converting a read attribute name of the document type extracted by the extracting means into a real attribute name by referring to the attribute name conversion information storage means when the read attribute name is determined not to match any real attribute name; and
registration means for registering the attribute information of the document read by the reading means in the attribute information storage means under the real attribute names.