WO2018003153A1 - Recognition device and recognition method - Google Patents

Recognition device and recognition method

Info

Publication number
WO2018003153A1
Authority
WO
WIPO (PCT)
Prior art keywords
line
recognition
item value
row
histogram
Prior art date
Application number
PCT/JP2017/001418
Other languages
French (fr)
Japanese (ja)
Inventor
昭 森口
Original Assignee
株式会社日立ソリューションズ
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立ソリューションズ
Publication of WO2018003153A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning

Definitions

  • the present invention relates to a recognition device that recognizes a table structure from a document such as a form.
  • Japanese Patent Laid-Open No. 2013-205974 discloses a method in which a table structure is recognized from ruled lines, item names are identified using an item name candidate database, the likelihood of correspondence between an item name and an item value is calculated from the positional relationship between the item name and the other item value candidate cells, and item names and item values are associated so that the likelihood is highest over the entire table structure.
  • Japanese Patent Laid-Open No. 2013-190993 describes a method that determines, from differences between the items written on either side of a ruled line (such as background color, font size, and font type), whether that ruled line is a boundary between an item name and an item value, and estimates the item names and item values in a table structure and their correspondence.
  • U.S. Pat. No. 8,214,733 describes a method of associating a table heading with the lines that include item values, and item names with item values, using two observations: item names and item values have similar horizontal start and end positions in the form, and the coordinate positions at which characters appear are similar between the table heading and the lines that include item values.
  • a character string that is not related to the table heading may be described between the table heading and the line including the item value (hereinafter referred to as the item value line) or between the item value lines.
  • for example, in an invoice or receipt, the item value line contains the product name and price, but if inventory is short and delivery takes longer than usual, supplementary information such as the delivery period and the reason for the delay is described above or below that item value line.
  • information on discounts for product purchases during the sales promotion period and product purchases in bulk is described near the item value line.
  • to compare the start and end positions of character strings, each line that includes a character string in a form (character string line) is converted to binary data in which coordinates where a character exists are 1 and blank coordinates are 0. The Hamming distance between the binary data of the table heading and the binary data of each character string line is then calculated to distinguish the table heading, the item value lines, and the other character string lines.
  • however, the start and end positions of the character strings are not necessarily the same in the table heading and in an item value line, and the number of character strings in the table heading may differ from the number in the item value line. As a result, the Hamming distance between the table heading and an item value line can become larger than the Hamming distance between the table heading and a line containing some other character string, which makes the association difficult.
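The failure mode described above can be reproduced with a few lines of Python. This is a minimal sketch, not the implementation of any cited patent: the function names `occupancy_bits` and `hamming_distance`, the line width of 40 coordinates, and the character spans are all made up for illustration.

```python
def occupancy_bits(spans, width):
    """Return one 0/1 flag per x coordinate: 1 where a character exists, 0 where blank."""
    bits = [0] * width
    for start, end in spans:  # each span covers x coordinates start .. end-1
        for x in range(max(start, 0), min(end, width)):
            bits[x] = 1
    return bits


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("bit lists must have the same length")
    return sum(x != y for x, y in zip(a, b))


WIDTH = 40  # hypothetical line width in coordinates
header = occupancy_bits([(0, 8), (12, 20), (24, 35)], WIDTH)
# An item value line whose strings are shorter than the heading strings above them.
item_line = occupancy_bits([(2, 4), (14, 17), (30, 35)], WIDTH)
# An unrelated note that simply covers most of the same horizontal range.
note_line = occupancy_bits([(0, 30)], WIDTH)

print(hamming_distance(header, item_line))  # distance to the real item value line
print(hamming_distance(header, note_line))  # distance to the unrelated note
```

With these made-up spans the item value line ends up farther from the heading than the unrelated note does, which is the situation the text identifies as making the association difficult.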
  • a typical example of the invention disclosed in the present application is as follows: a recognition apparatus comprising a processor that executes a program and a storage device that stores the program, and having a recognition model that determines whether a character string extracted from a form is an item value line including an item value.
  • the recognition model is generated by converting line information including character strings in a form into histograms, analyzing the histogram of a line including a table heading and the histogram of a line including an item value, and machine-learning the relevance of the line structures.
  • the recognition model extracts line information including a character string from a form to be recognized, converts the extracted line information into a histogram, and determines whether another line is an item value line, using as a feature amount the relevance of the line structures obtained by comparing the histogram of the line including the table heading with the histogram of the other line.
  • FIG. 1 is a configuration diagram of the in-form table structure recognition system according to the embodiment of the present invention.
  • the in-form table structure recognition system includes a recognition server 100 that extracts item names and item values from a form.
  • the recognition server 100 is connected to a reading device 112 that digitizes a paper form 111 received by mail from a business partner.
  • the recognition server 100 is connected to a network (for example, the Internet 114), and receives an electronic form from the customer company PC 113.
  • the recognition server 100 includes a form receiving unit 109, an item value line learning program 101, an item value line recognition program 102, and an item value recognition program 103. Further, the recognition server 100 has an item name database 105 in which item names to be acquired from the form are registered.
  • the form receiving unit 109 stores the electronic form received via the reading device 112 or the Internet 114 as a learning form 104 or a recognition target form 106 together with a supplier company name.
  • the item value line learning program 101 treats a line that includes an item name registered in the item name database 105 as a table heading, machine-learns the correspondence between the table heading and the item value lines from learning forms 104 in which the positions of the item value lines are known, and generates an item value line recognition model 107 (see FIG. 3).
  • the item value line recognition program 102 recognizes and extracts an item value line in the recognition target form 106 using the item value line recognition model 107 (see FIG. 10).
  • the item value recognition program 103 associates the item value in the item value row with the item name of the table header, and stores it in the item name / item value database 108 shown in FIG. 11 (see FIG. 10).
  • FIG. 1B is a block diagram showing a physical configuration of the recognition server 100.
  • the recognition server 100 of this embodiment is configured by a computer having a processor (CPU) 1, a memory 2, an auxiliary storage device 3, and a communication interface 4.
  • the processor 1 executes a program stored in the memory 2.
  • the memory 2 includes a ROM that is a nonvolatile storage element and a RAM that is a volatile storage element.
  • the ROM stores an immutable program (for example, BIOS).
  • the RAM is a high-speed and volatile storage element such as DRAM (Dynamic Random Access Memory), and temporarily stores a program executed by the processor 1 and data used when the program is executed.
  • the auxiliary storage device 3 is a large-capacity, non-volatile storage device such as a magnetic storage device (HDD) or a flash memory (SSD), and stores the program executed by the processor 1 and the data used when the program is executed. That is, the program is read from the auxiliary storage device 3, loaded into the memory 2, and executed by the processor 1.
  • the communication interface 4 is a network interface device that controls communication with other devices (reading device 112, customer company PC 113) according to a predetermined protocol.
  • the recognition server 100 may have an input interface 5 and an output interface 8.
  • the input interface 5 is an interface that receives input from an operator and to which a keyboard 6 and a mouse 7 are connected.
  • the output interface 8 is an interface to which a display device 9 or a printer is connected, and the execution result of the program is output in a form that can be visually recognized by the operator.
  • the program executed by the processor 1 is provided to the recognition server 100 via a removable medium (CD-ROM, flash memory, etc.) or a network, and stored in the nonvolatile auxiliary storage device 3 that is a non-temporary storage medium. For this reason, the recognition server 100 may have an interface for reading data from a removable medium.
  • the recognition server 100 is a computer system configured on one physical computer or on a plurality of logically or physically configured computers. It may operate in separate threads on the same computer, or on a virtual machine constructed on a plurality of physical computer resources.
  • all or part of the functional blocks implemented by the program may be configured by a physical integrated circuit (for example, Field-Programmable Gate Array).
  • FIG. 2 is a diagram illustrating an example of a form recognized by the recognition server 100.
  • the form shown in FIG. 2 is an invoice from Company A to Company B.
  • the products and prices purchased by Company B are listed in the form in a table structure, and the table heading 201 contains the item names for the number of products (Quantity), the product number (Item No.), the description of the product (Description), the unit price (UNIT PRICE), and the total price (PRICE).
  • in item value rows 202, 204, and 206, item values corresponding to the item names of the table heading are described.
  • supplementary information 203 and 205 for supplementing the item value line is described between the item value lines 202, 204 and 206.
  • an Invoice Number 207 that uniquely identifies the form is assigned to the form for each business partner company.
  • in the learning form 104, the rectangular coordinates of the table heading 201 and the item value lines 202, 204, and 206 are set as correct data for machine learning.
  • FIG. 3 is a flowchart of processing by the item value line learning program 101.
  • the item value line learning program 101 receives an input of the learning form 104 (step S301).
  • next, the rectangular coordinates of the character string rows are extracted from the learning form 104 (step S302).
  • in step S302, rectangles as shown in FIG. 4 are extracted from the learning form 104.
  • in step S303, the learning form 104 is subjected to OCR processing, and character information and the coordinates of the characters are extracted. Then, from the OCR result, characters that match an item name registered in the item name database 105 are identified, and their coordinates on the form are taken as the position of the table heading (step S304).
  • a histogram of character pixels in the rectangle is generated for all the character string rows extracted as a rectangle in step S302 (step S305).
  • This histogram represents the structural features of the rows in the horizontal direction. Specifically, after dividing a rectangle of a character string row by a certain number in the horizontal direction, the number of black pixels contained in characters in the divided area is set as the frequency of the histogram.
  • a histogram generated from the table heading 201 of the form shown in FIG. 2 is shown in FIG.
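The histogram of step S305 can be sketched as follows. This is only an illustration under stated assumptions: the rectangle is given as a list of pixel rows with 1 for a black pixel and 0 for white, and the function name `row_histogram` and the bin count are hypothetical (the source only says the rectangle is divided "by a certain number").

```python
def row_histogram(rect_pixels, bins):
    """Count black pixels in each of `bins` equal-width vertical strips.

    rect_pixels: list of pixel rows for one character string rectangle,
                 each a list of 0/1 values (1 = black pixel).
    """
    if not rect_pixels or bins <= 0:
        raise ValueError("need a non-empty rectangle and a positive bin count")
    width = len(rect_pixels[0])
    hist = [0] * bins
    for pixel_row in rect_pixels:
        for x, pixel in enumerate(pixel_row):
            if pixel:
                # Map the x coordinate to the strip that contains it.
                hist[x * bins // width] += 1
    return hist


# A tiny 2 x 8 rectangle divided into 4 strips of 2 pixels each.
rect = [
    [1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 1, 0],
]
print(row_histogram(rect, bins=4))  # [3, 0, 2, 1]
```

Because every rectangle is divided into the same number of strips, histograms of rows with different pixel widths have the same length and can be compared directly.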
  • the horizontal item value learning is a process in which the neural network learns the relationship between the table header and the structure of the item value row from the horizontal histogram representing the pixel distribution generated in step S305.
  • Table headings and item value rows are related as follows: (1) the number of character strings is the same or close, (2) character strings exist at common positions in the horizontal direction, and (3) item values are indicated by item names in the table headings.
  • FIG. 6 is a diagram showing a recognition model of a neural network that performs horizontal item value learning.
  • the horizontal direction item value line recognition neural network model 610 shown in FIG. 6 takes a table header histogram 601 and a character string line histogram 602 as input values.
  • the table heading histogram 601 is a histogram generated in step S305 for the rectangle of the table heading specified in step S304.
  • the character string row histogram 602 is a histogram generated in step S305 for a character string rectangle other than the table header extracted in step S302.
  • the horizontal item value row recognition neural network model 610 includes a feature amount extraction layer A611 that extracts the feature amount of the structure of the table header histogram 601, a feature amount extraction layer B612 that extracts the feature amount of the structure of the character string row histogram 602, and a comparison layer 613 that compares the two feature amounts.
  • in the feature amount extraction layer A611, learning is performed so that the position of the character string in the table header, the number of character strings, and the position of a specific item name (for example, Description) are extracted as feature amounts.
  • in the feature amount extraction layer B612, learning is performed so that the position of the character string in the character string row, the number of character strings, and the length of the character string are extracted as feature amounts.
  • the comparison layer 613 evaluates, from the two feature amounts, the likelihood that the structure of the character string row histogram 602 is the structure of an item value row corresponding to the table header histogram 601. Specifically, it learns the likelihood that the position of the character string, the number of character strings, and the length of the character string in the character string row correspond to the position of the character string, the number of character strings, and the item name in the table heading, respectively. The output of the comparison layer 613 is the item value row probability 614.
  • learning is executed by a known neural network learning method (for example, the error back-propagation method) so that the output is 1 when the table heading histogram 601 and an item value line histogram of the learning form 104 are input, and 0 when the table heading histogram 601 and the histogram of a character string row of the learning form 104 other than an item value row are input.
  • the item value row can be estimated from the structural features of the table header and the item value row.
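The two-branch structure of the model 610 (two feature extraction layers feeding one comparison layer) can be sketched as a forward pass. Everything concrete here is an assumption made for illustration: the class name `RowPairModel`, single fully connected layers with tanh, four features per branch, a sigmoid output, and random untrained weights. The source does not specify layer types or sizes, and training by back-propagation is omitted.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer followed by tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(rng, n_out, n_in):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class RowPairModel:
    """Untrained stand-in for the header/row comparison network."""

    def __init__(self, bins, features=4, seed=0):
        rng = random.Random(seed)
        self.layer_a = random_layer(rng, features, bins)    # role of layer A611
        self.layer_b = random_layer(rng, features, bins)    # role of layer B612
        self.compare = random_layer(rng, 1, 2 * features)   # role of layer 613

    def probability(self, header_hist, row_hist):
        """Return a value in (0, 1) playing the role of probability 614."""
        feat_a = dense(header_hist, *self.layer_a)
        feat_b = dense(row_hist, *self.layer_b)
        weights, biases = self.compare
        z = sum(w * x for w, x in zip(weights[0], feat_a + feat_b)) + biases[0]
        return 1.0 / (1.0 + math.exp(-z))  # sigmoid


model = RowPairModel(bins=4)
p = model.probability([3, 0, 2, 1], [2, 0, 1, 1])
print(0.0 < p < 1.0)  # True
```

Since the weights are random, the printed probability carries no meaning until the model is trained toward the 1/0 targets described above; the sketch only shows how the two histograms flow through separate branches before being compared.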
  • a neighboring line feature value generation process for generating a feature value that can be input to the neural network from information in the peripheral space of the item value line is performed (step S307).
  • the item value row can be estimated with higher accuracy.
  • the peripheral space information includes ruled lines, blanks, and similar character string rows.
  • a ruled line is described between the table heading and the item value line, or at the end of the table structure. Therefore, the ruled line is effective information for determining the existence range of the item value line.
  • a certain amount or more of space is provided between the table structure and the non-table structure.
  • the space is effective information for determining the existence range of item value rows. Further, when there are a plurality of item value rows in the table structure, row structures having similar feature quantities repeatedly exist within a certain range, and the relative position of the similar row structure is information useful for determining an item value row. Therefore, it is possible to improve the recognition accuracy of the item value line by causing the neural network to learn information in which ruled lines, blanks, and similar character string lines exist.
  • FIGS. 7A and 7B are diagrams illustrating an example of the neighborhood line feature value generation process.
  • feature amounts are generated from the top and bottom 10 lines as the space around the character string line 701 of the form 700.
  • the target range is 10 neighboring rows 702 and 703 in which each character string row is one row, a blank portion having the same height as the character string row 701 is one row, and a ruled line is one row.
  • FIG. 7B shows a neighboring row feature amount table 710, which includes the neighboring row numbers 704 and 711 assigned to each neighboring row and a feature amount 712 for each neighboring row.
  • the feature amount 712 consists of the probability that each character string line is an item value line (Possibilities), which is a value calculated by the horizontal direction item value line recognition neural network model 610 generated in step S306, and of whether the row is blank (Blank), a ruled line (Line), or a table header (Header).
  • for Possibilities, the row structures of the rows are compared, and a row having the same or similar row structure is determined to be likely to be an item value row.
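A neighboring row feature table of this kind could be assembled as in the sketch below. The data layout is an assumption: each row of the form is a `(kind, probability)` tuple, the function name `neighbor_features` is hypothetical, the window is taken as `span` rows on each side of the target row, and positions that fall outside the form are padded as blank rows. The source does not say how the edges of the form are handled.

```python
BLANK, LINE, HEADER, TEXT = "blank", "line", "header", "text"


def neighbor_features(rows, index, span=10):
    """Build (possibility, is_blank, is_line, is_header) for each neighboring row.

    rows:  list of (kind, probability) tuples in top-to-bottom order, where
           probability is the horizontal model's output for text rows.
    index: position of the target character string row.
    span:  number of neighboring rows taken on each side (assumed meaning of 10).
    """
    features = []
    for offset in range(-span, span + 1):
        if offset == 0:
            continue  # the target row itself is not a neighbor
        j = index + offset
        if 0 <= j < len(rows):
            kind, prob = rows[j]
        else:
            kind, prob = BLANK, 0.0  # assumption: pad outside the form with blanks
        features.append((
            prob if kind == TEXT else 0.0,
            1 if kind == BLANK else 0,
            1 if kind == LINE else 0,
            1 if kind == HEADER else 0,
        ))
    return features


form_rows = [
    (HEADER, 0.0), (LINE, 0.0), (TEXT, 0.9), (TEXT, 0.1), (TEXT, 0.8), (BLANK, 0.0),
]
for feature in neighbor_features(form_rows, index=2, span=2):
    print(feature)
```

Flattening the returned tuples gives a fixed-length vector for any target row, which is what a vertical-direction model would need as input.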
  • next, vertical direction item value row learning is performed using the neighboring row feature quantity generated in step S307 as an input (step S308).
  • like the horizontal item value line recognition neural network model 610, the vertical item value line recognition neural network model 802 generated by the vertical item value line learning takes the neighboring line feature 801 as input and outputs the item value row probability 803.
  • learning is performed using the error back-propagation method so that 1 is output when the character string row 701 is an item value row and 0 is output when it is not.
  • FIG. 9 is a flowchart of processing by the item value line recognition program 102 and the item value recognition program 103.
  • the item value line recognition program 102 acquires the recognition target form 106 together with the business partner company name (step S901).
  • the processing from step S902 to step S905 is the same as the processing from step S302 to step S305 by the item value line learning program 101.
  • in step S906, for each character string row of the recognition target form 106, the table header histogram 601 and the character string row histogram 602 generated by the processing up to step S905 are input to the horizontal direction item value line recognition neural network model 610 generated in step S306, which calculates the probability that the character string row is an item value row.
  • next, a neighboring line feature amount is generated for each character string line of the recognition target form 106, in the same way as in step S307 by the item value line learning program 101 (step S907).
  • the probability that the character string line is the item value line is calculated from the neighboring line feature amount generated in step S907 by the vertical direction item value line recognition neural network model generated in step S308 (step S908).
  • the character string line is an item value line. Further, it is determined that a row having the same or similar row structure is likely to be an item value row. Further, it is determined that the character string line between the two ruled lines is highly likely to be an item value line, and after the bottom ruled line, it is determined that the possibility of being an item value line is low.
  • the character string row having the probability of being the item value row calculated in step S908 is determined to be the item value row, and the item name in the table header is associated with the item value in the item value row.
  • a method of associating the item name with the item value is shown in FIG.
  • the item name database 105 includes Quantity, Item No., Description, UNIT PRICE, and PRICE.
  • the table heading 1001 includes five item names. Note that UNIT PRICE in the table heading 1001 matches both UNIT PRICE and PRICE in the item name database 105; in such a case, the item name with the longer character string is used preferentially.
  • the number of character strings is calculated by dividing the character string in the item value line by the minimum space. If the number of character strings is different from the number of item names in the table heading 1001, the length of the space separating the character strings is increased, and the number of character strings is calculated again. Until the number of item names in the table heading becomes equal to the number of character strings in the item value row, the process is repeated with the blank length increased to determine the item value. For example, in the item value row 1002, the character string is divided with a blank between Office and Chair, and the number of character strings is 6. When the blank length between P000115 and Office is used for character string division, the number of character strings is 5 (1003).
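The widening-blank procedure can be sketched as follows. The input format is an assumption: OCR tokens are given as `(text, x_start, x_end)` tuples sorted left to right, and the function name `split_item_values` and the coordinates in the example are made up. Only the token texts follow the item value row 1002 example.

```python
def split_item_values(tokens, n_items):
    """Split a row into n_items strings by progressively widening the blank length.

    tokens:  list of (text, x_start, x_end) tuples sorted left to right.
    n_items: number of item names in the table heading.
    Returns the list of item value strings, or None if no blank length works.
    """
    if not tokens:
        return None
    gaps = [tokens[i + 1][1] - tokens[i][2] for i in range(len(tokens) - 1)]
    # Try the smallest blank first, then each larger blank length in turn.
    for threshold in sorted(set(gaps)) or [0]:
        groups = [[tokens[0][0]]]
        for token, gap in zip(tokens[1:], gaps):
            if gap >= threshold:
                groups.append([token[0]])    # blank is wide enough: new string
            else:
                groups[-1].append(token[0])  # blank is too narrow: same string
        if len(groups) == n_items:
            return [" ".join(group) for group in groups]
    return None


row = [
    ("4", 0, 1), ("P000115", 6, 13), ("Office", 17, 23),
    ("Chair", 24, 29), ("40", 35, 37), ("160", 42, 45),
]
print(split_item_values(row, n_items=5))
# ['4', 'P000115', 'Office Chair', '40', '160']
```

With the smallest blank (between Office and Chair) the row yields six strings; raising the blank length to the next value, the one between P000115 and Office, merges Office and Chair and yields the five strings that match the five item names.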
  • the form number is extracted (S910).
  • the Invoice Number is extracted from the OCR result extracted in Step S903.
  • the Invoice Number is generally a character string that includes a numerical value that exists immediately to the right of or directly below the character string Invoice Number on the form, so that it can be easily distinguished from other character strings in the form. In the form shown in FIG. 2, 111111 on the right side of the character string Invoice Number is extracted.
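One way to apply this positional rule to OCR output is sketched below. The representation is assumed: each recognized string is a `(text, x, y)` tuple giving its top-left coordinates, and the function name `find_invoice_number`, the alignment tolerance `tol`, and the example coordinates are hypothetical.

```python
import re


def find_invoice_number(strings, label="Invoice Number", tol=5):
    """Return the nearest digit-bearing string right of or directly below the label.

    strings: list of (text, x, y) tuples with top-left coordinates from OCR.
    tol:     allowed misalignment, in pixels, when deciding that two strings
             share a line or a column (an assumed parameter).
    """
    anchor = next(((x, y) for text, x, y in strings if text == label), None)
    if anchor is None:
        return None
    ax, ay = anchor
    candidates = []
    for text, x, y in strings:
        if not re.search(r"\d", text):
            continue  # the value must include a numerical value
        to_the_right = abs(y - ay) <= tol and x > ax
        directly_below = abs(x - ax) <= tol and y > ay
        if to_the_right or directly_below:
            candidates.append((abs(x - ax) + abs(y - ay), text))
    return min(candidates)[1] if candidates else None


ocr_strings = [
    ("Company A", 10, 20),
    ("Invoice Number", 100, 20),
    ("111111", 220, 21),
    ("2016", 100, 200),
]
print(find_invoice_number(ocr_strings))  # 111111
```

Choosing the nearest candidate is a design choice of this sketch; a string far below the label in the same column, such as the 2016 in the example, is thereby passed over in favor of the value printed next to the label.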
  • the item value recognition program 103 stores the supplier company name acquired in step S901, the item name and item value associated in step S909, and the Invoice Number extracted in step S910 in the item name / item value database 108 (step S911).
  • FIG. 11 is a diagram showing a configuration example of the item name / item value database 108.
  • the item name / item value database 108 stores the item value 1103 corresponding to the supplier company name 1101, the Invoice Number 1102, and the item name (Quantity, Item No., Description, Unit Price, Price).
  • Company A as Company
  • 111111 as Invoice Number
  • 4 as Quantity
  • P000115 as Item No.
  • Office Chair as Description
  • 40 as Unit Price
  • 160 as Price
  • the item value line recognition model 610 extracts line information including a character string from a form to be recognized, converts the extracted line information into a histogram, analyzes the histogram of the row including the table heading and the histogram of another row, and uses the relationship of the row structures as a feature amount to determine whether the other row is an item value row. Item names and item values can therefore be accurately associated.
  • since the line information is rectangular information determined to include a character string, position information of the rectangle, and character information obtained by recognizing the character string, the area to be analyzed in the form is limited and the amount of calculation can be reduced.
  • since the histogram represents the number of black pixels included in characters in each area obtained by dividing the rectangle defined to include the character string in the row into a predetermined number in the horizontal direction, the amount and position of the characters in the row can be quantified.
  • since the item value line recognition model 610 is generated by machine learning using the histogram as a feature amount, a model for analyzing forms can be generated from quantitative values that represent the structural characteristics of a line and are suited to machine learning, rather than from the characters themselves.
  • since the item value line recognition model 610 uses at least one of the ruled lines, the blanks, and the positions of character string lines having the same structure included in the form to be recognized to determine whether the other line is an item value line, the accuracy of recognizing item value lines can be improved.
  • since the item value line recognition model 610 determines that a line is unlikely to be an item value line after a predetermined number of blank lines continue, item value lines can be recognized with high accuracy even in an unknown form.
  • since the item value line recognition model 610 determines that lines having the same line structure are highly likely to be item value lines, item value lines can be recognized with high accuracy even in an unknown form.
  • since the item value line recognition model 610 determines that a line between two ruled lines is highly likely to be an item value line, and that a line below the bottom ruled line is unlikely to be an item value line, item value lines can be recognized with high accuracy even in an unknown form.
  • the present invention is not limited to the above-described embodiments, and includes various modifications and equivalent configurations within the scope of the appended claims.
  • the above-described embodiments have been described in detail for easy understanding of the present invention, and the present invention is not necessarily limited to those having all the configurations described.
  • a part of the configuration of one embodiment may be replaced with the configuration of another embodiment.
  • another configuration may be added, deleted, or replaced.
  • each of the above-described configurations, functions, processing units, processing means, etc. may be realized in hardware by designing some or all of them as, for example, an integrated circuit, or may be realized in software by having a processor interpret and execute a program that realizes each function.
  • Information such as programs, tables, and files that realize each function can be stored in a storage device such as a memory, a hard disk, and an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, and a DVD.
  • control lines and information lines indicate what is considered necessary for the explanation, and do not necessarily indicate all control lines and information lines necessary for mounting. In practice, it can be considered that almost all the components are connected to each other.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

A recognition device is provided with: a processor for executing a program; and a storage device for storing the program, and has a recognition model for determining whether a character string extracted from a form is an item value row including an item value. The recognition model is generated by converting information relating to the rows including character strings in the form into histograms, analyzing a histogram of a row including a table heading and a histogram of a row including an item value, and machine-learning the relevance of the row structures. The recognition model extracts information relating to rows including character strings from a form to be recognized, converts the extracted information relating to the rows into histograms, and, by using, as a feature quantity, the relevance of row structures obtained by comparing the histogram of a row including a table heading and the histogram of a different row, determines whether the different row is an item value row.

Description

Recognition device and recognition method

Incorporation by reference
 This application claims priority to Japanese Patent Application No. 2016-129997, filed on June 30, 2016, the contents of which are incorporated herein by reference.
 The present invention relates to a recognition device that recognizes a table structure from a document such as a form.
 In the course of their economic activities, companies exchange forms such as sales slips, invoices, and receipts with other companies. To enter these forms into a company's business and accounting systems and to process shipments and payments, technology that converts the documents in a form into electronic data using OCR (Optical Character Recognition) is used. After a form is digitized using OCR, data in which neighboring character strings are associated with each other is registered in the system. For example, when the character string "March 29, 2016" appears near the character string "form issue date", the item name "form issue date" and the item value "March 29, 2016" are registered in the system. Furthermore, ruled lines are used to recognize the table structure in the form, that is, the item names in the table heading and the cells of the item values corresponding to the table heading, and these are associated with each other and then registered in the system.
 Japanese Patent Laid-Open No. 2013-205974 discloses a method in which a table structure is recognized from ruled lines, item names are identified using an item name candidate database, the likelihood of correspondence between an item name and an item value is calculated from the positional relationship between the item name and the other item value candidate cells, and item names and item values are associated so that the likelihood is highest over the entire table structure.
 Japanese Patent Laid-Open No. 2013-190993 describes a method that determines, from differences between the items written on either side of a ruled line (such as background color, font size, and font type), whether that ruled line is a boundary between an item name and an item value, and estimates the item names and item values in a table structure and their correspondence.
 U.S. Pat. No. 8,214,733 describes a method of associating a table heading with the lines that include item values, and item names with item values, using two observations: item names and item values have similar horizontal start and end positions in the form, and the coordinate positions at which characters appear are similar between the table heading and the lines that include item values.
 特開2013-205974号公報及び特開2013-190993号公報に記載の方法では、罫線を表構造認識の手掛かりとしているが、罫線が記載されていない帳票の表構造の認識には用いることができない。 In the methods described in Japanese Patent Laid-Open Nos. 2013-205974 and 2013-190993, ruled lines are used as a clue for recognizing the table structure, but cannot be used for recognizing the table structure of a form on which no ruled line is described. .
 さらに、帳票によっては、表見出しと項目値を含む行(以降、項目値行と記載)との間や、項目値行同士の間に、表見出しと関連しない文字列が記載される場合がある。例えば、請求書や領収書の場合、項目値行には、商品名や価格が記載されるが、在庫不足で商品の配送に通常より多くの期間が必要な場合は、期間及び配送遅延理由等の補足情報が、その項目値行の上部又は下部に記載される。また、セールスプロモーション期間での商品購入やバルクでの商品購入によるディスカウントの情報が項目値行の近くに記載される。特に、特開2013-205974号公報に記載の方法では、隣接する項目間で尤度を算出するため、無関係な文字列により項目が分断されると、正しく項目名と項目値を対応付けられなくなる。また、特開2013-190993号公報に記載の方法では、近くの項目間の特徴を用いて、項目名と項目値の境界を識別するため、補足情報による分断によって、境界の識別が困難になる。 Furthermore, depending on the form, a character string that is not related to the table heading may be described between the table heading and the line including the item value (hereinafter referred to as the item value line) or between the item value lines. . For example, in the case of invoices and receipts, the item value line contains the product name and price, but if there is a shortage of inventory and more time is required for delivery of the product, the period and reason for delay in delivery, etc. The supplementary information is described at the top or bottom of the item value line. In addition, information on discounts for product purchases during the sales promotion period and product purchases in bulk is described near the item value line. In particular, in the method described in Japanese Patent Laid-Open No. 2013-205974, since the likelihood is calculated between adjacent items, if an item is divided by an irrelevant character string, the item name cannot be correctly associated with the item value. . Further, in the method described in Japanese Patent Application Laid-Open No. 2013-190993, since the boundary between the item name and the item value is identified using the feature between nearby items, it is difficult to identify the boundary due to the division by the supplementary information. .
The method described in U.S. Pat. No. 8,214,733 compares the start and end positions of character strings, converts each row containing a character string in the form (a character string row) into binary data in which coordinates occupied by characters are 1 and blanks are 0, and distinguishes the table header, item value rows, and other character string rows by calculating the Hamming distance between the binary data of the table header and the binary data of each character string row. However, the start and end positions of character strings are not necessarily the same in the table header and in an item value row, and the number of character strings in the table header may differ from the number in an item value row. As a result, the Hamming distance between the table header and an item value row can be larger than the Hamming distance between the table header and a row containing other character strings, which makes the association difficult.
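The weakness described above can be illustrated with a small sketch. The function names, the 40-column row width, and the character spans below are hypothetical values chosen for illustration; they are not taken from the cited patent.

```python
def row_to_bits(spans, width):
    """Encode a row as bits: 1 where a character span covers the x coordinate, 0 for blank."""
    bits = [0] * width
    for start, end in spans:          # each span is [start, end) in x coordinates
        for x in range(start, end):
            bits[x] = 1
    return bits


def hamming(a, b):
    """Number of positions at which two equal-length bit rows differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical rows on a 40-column grid.
header = row_to_bits([(0, 8), (10, 18), (20, 31)], 40)    # three item names
item_row = row_to_bits([(0, 1), (10, 17), (20, 40)], 40)  # true item value row, different string lengths
note_row = row_to_bits([(0, 8), (10, 18), (20, 30)], 40)  # unrelated text that happens to align

d_item = hamming(header, item_row)
d_note = hamming(header, note_row)
```

In this constructed case the true item value row is farther from the header than the unrelated row, so a pure Hamming-distance criterion would pick the wrong row.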
It is therefore necessary to associate table headers with item values even in forms that have no ruled lines and in which character strings unrelated to the table header appear within the table structure.
A representative example of the invention disclosed in the present application is as follows. A recognition device comprises a processor that executes a program and a storage device that stores the program, and has a recognition model that determines whether a character string extracted from a form is an item value row containing item values. The recognition model is generated by converting information on rows containing character strings in a form into histograms, analyzing the histogram of a row containing a table header and the histogram of a row containing item values, and machine-learning the relationship between the row structures. The recognition model extracts information on rows containing character strings from a form to be recognized, converts the extracted row information into histograms, and determines whether another row is an item value row by using, as a feature, the relationship between row structures obtained by comparing the histogram of the row containing the table header with the histogram of that other row.
According to one aspect of the present invention, table headers and item values can be associated accurately. Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.
A configuration diagram of the in-form table structure recognition system according to an embodiment of the present invention.
A block diagram showing the physical configuration of the recognition server.
A diagram showing an example of a form recognized by the recognition server.
A flowchart of processing by the item value row learning program.
A diagram showing an example of a learning form.
A diagram showing an example of a histogram generated from the table header of a form.
A diagram showing the recognition model of the neural network that performs horizontal item value learning.
A diagram showing an example of the neighboring row feature generation processing.
A diagram showing a configuration example of the neighboring row feature table.
A diagram showing the vertical item value row recognition neural network model.
A flowchart of processing by the item value row recognition program and the item value recognition program.
A diagram showing a method of associating item names with item values.
A diagram showing a configuration example of the item name / item value database.
Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 is a configuration diagram of the in-form table structure recognition system according to an embodiment of the present invention.
The in-form table structure recognition system of this embodiment consists of a recognition server 100 that extracts item names and item values from forms. The recognition server 100 is connected to a reading device 112 that digitizes paper forms 111 received from business partners by mail or other means. The recognition server 100 is also connected to a network (for example, the Internet 114) and receives electronic forms from a business partner PC 113.
The recognition server 100 has a form receiving unit 109, an item value row learning program 101, an item value row recognition program 102, and an item value recognition program 103. The recognition server 100 also has an item name database 105 in which the item names to be acquired from forms are registered.
The form receiving unit 109 stores an electronic form received via the reading device 112 or the Internet 114, together with the business partner's company name, as a learning form 104 or a recognition target form 106. The item value row learning program 101 treats a row containing item names registered in the item name database 105 as the table header, machine-learns the correspondence between the table header and item value rows from learning forms 104 in which the positions of the item value rows are known, and generates an item value row recognition model 107 (see FIG. 3). The item value row recognition program 102 uses the item value row recognition model 107 to recognize and extract the item value rows in the recognition target form 106 (see FIG. 10). The item value recognition program 103 associates the item values in each item value row with the item names of the table header and stores them in the item name / item value database 108 shown in FIG. 11 (see FIG. 10).
FIG. 1B is a block diagram showing the physical configuration of the recognition server 100.
The recognition server 100 of this embodiment is a computer having a processor (CPU) 1, a memory 2, an auxiliary storage device 3, and a communication interface 4.
The processor 1 executes programs stored in the memory 2. The memory 2 includes a ROM, which is a nonvolatile storage element, and a RAM, which is a volatile storage element. The ROM stores immutable programs (for example, the BIOS). The RAM is a high-speed, volatile storage element such as DRAM (Dynamic Random Access Memory) and temporarily stores the programs executed by the processor 1 and the data used during their execution.
The auxiliary storage device 3 is a large-capacity, nonvolatile storage device such as a magnetic storage device (HDD) or flash memory (SSD), and stores the programs executed by the processor 1 and the data used during their execution. That is, a program is read from the auxiliary storage device 3, loaded into the memory 2, and executed by the processor 1.
The communication interface 4 is a network interface device that controls communication with other devices (the reading device 112 and the business partner PC 113) according to a predetermined protocol.
The recognition server 100 may have an input interface 5 and an output interface 8. The input interface 5 is connected to a keyboard 6, a mouse 7, and the like, and receives input from an operator. The output interface 8 is connected to a display device 9, a printer, and the like, and outputs program execution results in a form the operator can view.
The programs executed by the processor 1 are provided to the recognition server 100 via removable media (CD-ROM, flash memory, and the like) or a network, and are stored in the nonvolatile auxiliary storage device 3, which is a non-transitory storage medium. For this reason, the recognition server 100 preferably has an interface for reading data from removable media.
The recognition server 100 is a computer system configured on one physical computer or on a plurality of logically or physically configured computers. It may run in separate threads on the same computer, or on a virtual machine built on a plurality of physical computer resources.
In the recognition server 100, all or part of the functional blocks implemented by programs may instead be implemented by a physical integrated circuit (for example, a Field-Programmable Gate Array).
FIG. 2 is a diagram showing an example of a form recognized by the recognition server 100.
The form shown in FIG. 2 is an invoice from Company A to Company B. The products purchased by Company B and their prices are listed in the form as a table. The table header 201 contains the item names for the number of products (Quantity), product number (Item No.), product description (Description), unit price (UNIT PRICE), and total price (PRICE). The item value rows 202, 204, and 206 contain the item values corresponding to the item names of the table header. Supplementary information 203 and 205, which supplements the item value rows, is written between the item value rows 202, 204, and 206. The form also carries an Invoice Number 207 that uniquely identifies the form for each business partner. In a learning form 104, the rectangle coordinates of the table header 201 and of the item value rows 202, 204, and 206 of this form are set as the ground-truth data for machine learning.
FIG. 3 is a flowchart of processing by the item value row learning program 101.
First, the item value row learning program 101 receives a learning form 104 as input (step S301).
Next, the rectangle coordinates of the character string rows are extracted from the learning form 104 (step S302). In step S302, rectangles such as those shown in FIG. 4 are extracted from the learning form 104.
OCR processing is then performed on the learning form 104 to extract the character information and the coordinates of each character (step S303). From the OCR result, the characters that match item names registered in the item name database 105 are identified, and the coordinates of those characters on the form are taken as the position of the table header (step S304).
For every character string row extracted as a rectangle in step S302, a histogram of the character pixels within the rectangle is generated (step S305). This histogram represents the structural features of the row in the horizontal direction. Specifically, the rectangle of the character string row is divided horizontally into a fixed number of regions, and the number of black pixels belonging to characters in each region is used as the frequency of the corresponding histogram bin. FIG. 5 shows the histogram generated from the table header 201 of the form shown in FIG. 2.
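The histogram generation of step S305 can be sketched as follows. This is a minimal illustration that assumes the row rectangle has already been binarized into a grid of 0/1 values (1 for a black pixel); the function name, the bin count, and the tiny sample image are hypothetical.

```python
def row_histogram(row_pixels, bins):
    """Count black pixels per horizontal segment of a binarized row rectangle.

    row_pixels: list of pixel lines, each a list of 0/1 values (1 = black pixel).
    bins: number of equal-width horizontal segments (a free parameter here).
    """
    width = len(row_pixels[0])
    hist = [0] * bins
    for line in row_pixels:
        for x, value in enumerate(line):
            if value:
                hist[x * bins // width] += 1   # map x coordinate to its segment
    return hist


# Hypothetical 2 x 8 pixel row rectangle, split into 4 segments.
sample = [
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1, 1],
]
hist = row_histogram(sample, 4)   # [3, 0, 1, 4]
```

Because every row is reduced to the same number of bins, a table header and a candidate row can be compared position by position regardless of the exact characters they contain.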
Next, horizontal item value learning is performed (step S306). Horizontal item value learning is processing in which a neural network learns the relationship between the structure of the table header and the structure of item value rows from the horizontal histograms, generated in step S305, that represent the pixel distribution. A table header and its item value rows show patterns such as the following, and the neural network is trained on them: (1) the number of character strings is the same or close, (2) character strings exist at common horizontal positions, and (3) depending on the item name in the table header, the character string length of the item value tends to be at least, or at most, a certain value. For example, the item value corresponding to the item name Description tends to be long, and the item value corresponding to the item name Quantity tends to be short.
FIG. 6 is a diagram showing the recognition model of the neural network that performs horizontal item value learning.
The horizontal item value row recognition neural network model 610 shown in FIG. 6 takes a table header histogram 601 and a character string row histogram 602 as input. The table header histogram 601 is the histogram generated in step S305 for the table header rectangle identified in step S304. The character string row histogram 602 is the histogram generated in step S305 for the rectangle of a character string row, other than the table header, extracted in step S302.
The horizontal item value row recognition neural network model 610 consists of a feature extraction layer A 611 that extracts structural features of the table header histogram 601, a feature extraction layer B 612 that extracts structural features of the character string row histogram 602, and a comparison layer 613 that compares the two sets of features. The feature extraction layer A 611 is trained so that the positions of the character strings in the table header, the number of character strings, and the position of specific item names (for example, Description) are extracted as features. The feature extraction layer B 612 is trained so that the positions of the character strings in the character string row, the number of character strings, and the lengths of the character strings are extracted as features. The comparison layer 613 evaluates, from the two sets of features, how plausible the structure of the character string row histogram 602 is as the structure of an item value row corresponding to the table header histogram 601. Specifically, for the positions of the character strings, the number of character strings, and the item names of the table header, the likelihood of the positions, number, and lengths of the character strings in the character string row is learned. The output of the comparison layer 613 is the probability 614 that the row is an item value row.
The horizontal item value row recognition neural network model 610 is trained with a known neural network learning method (for example, error backpropagation) so that, for each character string row extracted from the form, the output is 1 when the table header histogram 601 of the learning form 104 and the histogram of an item value row are input, and 0 when the table header histogram 601 and the histogram of a character string row other than an item value row are input.
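The two-branch structure of model 610 and its 1/0 training targets can be sketched as a forward pass. This is only an illustration under assumptions not stated in the text: the layer sizes, the tanh and sigmoid activations, the random untrained weights, and the binary cross-entropy loss are all hypothetical choices, and the backpropagation update itself is omitted.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases for a fully connected layer (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense_tanh(x, weights, biases):
    """Fully connected layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


class TwoBranchModel:
    """Branch A reads the header histogram, branch B the candidate row histogram."""

    def __init__(self, bins, hidden=4, seed=0):
        rng = random.Random(seed)
        self.wa, self.ba = init_layer(rng, hidden, bins)    # stands in for layer A 611
        self.wb, self.bb = init_layer(rng, hidden, bins)    # stands in for layer B 612
        self.wc, self.bc = init_layer(rng, 1, 2 * hidden)   # stands in for layer 613

    def predict(self, header_hist, row_hist):
        fa = dense_tanh(header_hist, self.wa, self.ba)
        fb = dense_tanh(row_hist, self.wb, self.bb)
        z = sum(w * f for w, f in zip(self.wc[0], fa + fb)) + self.bc[0]
        return 1.0 / (1.0 + math.exp(-z))   # probability of "item value row"


def bce_loss(p, target):
    """Binary cross-entropy; target is 1 for item value rows and 0 otherwise."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


model = TwoBranchModel(bins=4)
p = model.predict([0.3, 0.0, 0.1, 0.4], [0.2, 0.0, 0.2, 0.5])
loss_if_item_row = bce_loss(p, 1)
```

Training would adjust all three weight sets to reduce this loss over the labeled header/row pairs from the learning forms.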
Through step S306, item value rows can be estimated from the structural features of the table header and the item value rows.
Next, neighboring row feature generation processing is performed, which generates features that can be input to a neural network from information on the space surrounding an item value row (step S307). Using information on the surrounding space as additional features makes it possible to estimate item value rows with higher accuracy. Specifically, the surrounding space information consists of ruled lines, blanks, and similar character string rows. In some forms, ruled lines are drawn between the table header and the item value rows or at the end of the table structure, so ruled lines are useful for determining the range in which item value rows exist. In some forms, a blank of at least a certain size separates the table structure from non-table content, so blanks are also useful for determining that range. Furthermore, when the table structure contains a plurality of item value rows, row structures with similar features appear repeatedly within a certain range, and the relative positions of such similar row structures are useful for identifying item value rows. Having the neural network learn information on the presence of ruled lines, blanks, and similar character string rows therefore improves the recognition accuracy for item value rows.
FIGS. 7A and 7B are diagrams showing an example of the neighboring row feature generation processing.
In the illustrated example, features are generated from the 10 rows above and the 10 rows below the character string row 701 of the form 700 as its surrounding space. Specifically, each character string row, each blank portion with the same height as the character string row 701, and each ruled line is counted as one row, and the 10-row neighborhoods 702 and 703 form the target range.
The neighboring row feature table 710 shown in FIG. 7B contains the neighboring row numbers 704 and 711 assigned to the neighboring rows and the features 712 of each neighboring row. The features 712 include the probability that the row is an item value row (Possibility), which is the value calculated by the horizontal item value row recognition neural network model 610 generated in step S306, and indicators of whether the row is blank (Blank), a ruled line (Line), or the table header (Header). For example, with Possibility, the row structures of the rows are compared with one another, and rows with the same or similar row structure are judged likely to be item value rows.
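A neighboring row feature table of this kind could be assembled as in the sketch below. The row representation (a kind label plus a probability), the zero padding beyond the page edge, and the function name are assumptions made for illustration; the column order follows the Possibility, Blank, Line, Header features named above.

```python
def neighbor_features(rows, index, n=10):
    """Build (offset, [Possibility, Blank, Line, Header]) for n rows above and below rows[index].

    rows: list of (kind, probability) with kind in {"text", "blank", "line", "header"};
          probability is the horizontal model's output for text rows.
    Rows outside the page are padded with zeros (an assumption, not stated in the source).
    """
    table = []
    for offset in list(range(-n, 0)) + list(range(1, n + 1)):
        j = index + offset
        if 0 <= j < len(rows):
            kind, prob = rows[j]
            features = [
                prob if kind == "text" else 0.0,    # Possibility
                1.0 if kind == "blank" else 0.0,    # Blank
                1.0 if kind == "line" else 0.0,     # Line
                1.0 if kind == "header" else 0.0,   # Header
            ]
        else:
            features = [0.0, 0.0, 0.0, 0.0]
        table.append((offset, features))
    return table


# Hypothetical page: header, ruled line, three text rows, one blank row.
page = [("header", 0.0), ("line", 0.0), ("text", 0.9),
        ("text", 0.1), ("text", 0.8), ("blank", 0.0)]
table = neighbor_features(page, index=2, n=2)
```

Flattening the feature lists in offset order gives a fixed-length input vector for the vertical model, whatever the position of the target row on the page.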
Next, vertical item value row learning is performed using the neighboring row features generated in step S307 as input (step S308). As shown in FIG. 8, the vertical item value row recognition neural network model 802 generated by vertical item value row learning takes the neighboring row features 801 as input and, like the horizontal item value row recognition neural network model 610, outputs the probability 803 that the row is an item value row. For each character string row extracted from the form, the model is trained by error backpropagation to output 1 when the character string row 701 is an item value row and 0 when it is not.
FIG. 9 is a flowchart of processing by the item value row recognition program 102 and the item value recognition program 103.
First, the item value row recognition program 102 acquires a recognition target form 106 together with the business partner's company name (step S901).
The processing in steps S902 to S905 is the same as the processing in steps S302 to S305 of the item value row learning program 101.
In step S906, for each character string row of the recognition target form 106, the table header histogram 601 and the character string row histogram 602 generated in the processing up to step S905 are input, and the horizontal item value row recognition neural network model 610 generated in step S306 calculates the probability that the character string row is an item value row.
Using the probabilities calculated in step S906, neighboring row features are generated for each character string row of the recognition target form 106 in the same way as in step S307 of the item value row learning program 101 (step S907).
The vertical item value row recognition neural network model generated in step S308 then calculates, from the neighboring row features generated in step S907, the probability that each character string row is an item value row (step S908).
Specifically, after a predetermined number of consecutive blank rows, a character string row is judged unlikely to be an item value row. Rows with the same or similar row structure are judged likely to be item value rows. A character string row between two ruled lines is judged likely to be an item value row, and rows after the lowest ruled line are judged unlikely to be item value rows.
A character string row whose probability of being an item value row, as calculated in step S908, is equal to or greater than a predetermined threshold is determined to be an item value row, and the item names in the table heading are associated with the item values in that row. FIG. 10 shows how item names and item values are associated. First, the number of item names stored in the item name database 105 that appear in the table heading is counted. Suppose the item name database 105 contains Quantity, Item No., Description, UNIT PRICE, and PRICE. The table heading 1001 is then determined to contain five item names. UNIT PRICE in the table heading 1001 matches both UNIT PRICE and PRICE in the item name database 105; in such a case the item name with the longer character string takes precedence. Next, the character string in the item value row is split at the smallest blank, and the number of resulting character strings is counted. If this number differs from the number of item names in the table heading 1001, the blank length used for splitting is increased and the number of character strings is counted again. The process is repeated with progressively longer blank lengths until the number of character strings in the item value row equals the number of item names in the table heading, at which point the item values are fixed. For example, in the item value row 1002 the character string is split at the blank between Office and Chair, giving six character strings. When the blank length between P000115 and Office is used for splitting, the number of character strings is five (1003). In other words, small blanks are progressively excluded until the number of items in the item value row matches the number of items in the table heading. In the case shown in FIG. 10, the item values are therefore 4, P000115, Office Chair, $40.00, and $160.00. The resulting item values are associated with the item names in the table heading in order from the left (step S909).
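The matching and splitting procedure of step S909 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the token format `(text, x_start, x_end)`, the function names, and the pixel coordinates in the usage example are hypothetical.

```python
def match_header_items(header, item_names):
    """Find database item names in the heading, preferring longer names.

    A matched span is blanked out so that a shorter name (PRICE) cannot
    match inside a longer one (UNIT PRICE) that was already consumed.
    """
    remaining = header.upper()
    found = []
    for name in sorted(item_names, key=len, reverse=True):
        pos = remaining.find(name.upper())
        if pos >= 0:
            found.append((pos, name))
            remaining = remaining[:pos] + " " * len(name) + remaining[pos + len(name):]
    return [name for _, name in sorted(found)]  # left-to-right order


def split_into_fields(tokens, n_fields):
    """Split OCR tokens into n_fields by raising the gap threshold step by step.

    tokens: list of (text, x_start, x_end), sorted left to right.
    Returns the list of field strings, or None if no threshold fits.
    """
    if not tokens:
        return None
    gaps = [tokens[i + 1][1] - tokens[i][2] for i in range(len(tokens) - 1)]
    # Start from the smallest blank; the final infinite threshold merges everything.
    for threshold in sorted(set(gaps)) + [float("inf")]:
        fields = [tokens[0][0]]
        for gap, (text, _, _) in zip(gaps, tokens[1:]):
            if gap >= threshold:
                fields.append(text)        # blank is wide enough: new field
            else:
                fields[-1] += " " + text   # small blank: same field
        if len(fields) == n_fields:
            return fields
    return None


def associate(header, item_names, tokens):
    """Pair heading item names with item values from left to right."""
    names = match_header_items(header, item_names)
    fields = split_into_fields(tokens, len(names))
    return dict(zip(names, fields)) if fields else None
```

With the row of FIG. 10, the smallest threshold yields six strings, and raising it past the Office/Chair blank yields the five values that line up with the five item names.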
Next, the form number is extracted (S910). Specifically, the Invoice Number is extracted from the OCR result obtained in step S903. The Invoice Number is generally a character string containing digits that is located immediately to the right of, or directly below, the label Invoice Number on the form, so it can be readily distinguished from the other character strings in the form. In the form shown in FIG. 2, the value 111111 immediately to the right of the label Invoice Number is extracted.
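A lookup of this kind might be sketched as below. The `Box` layout (top-left corner plus width and height), the function name, and the choice to try the right-hand neighbour before the one below are assumptions made for illustration; the specification only states that the value lies to the right of or directly below the label.

```python
import re
from typing import NamedTuple


class Box(NamedTuple):
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def find_value_near_label(boxes, label="Invoice Number"):
    """Return the digit-bearing string right of, or else directly below, the label."""
    label_box = next(
        (b for b in boxes if b.text.strip().lower() == label.lower()), None
    )
    if label_box is None:
        return None

    def has_digit(b):
        return re.search(r"\d", b.text) is not None

    def v_overlap(a, b):  # the two boxes share some vertical extent (same line)
        return a.y < b.y + b.h and b.y < a.y + a.h

    def h_overlap(a, b):  # the two boxes share some horizontal extent (same column)
        return a.x < b.x + b.w and b.x < a.x + a.w

    right = [
        b for b in boxes
        if has_digit(b) and v_overlap(label_box, b) and b.x >= label_box.x + label_box.w
    ]
    if right:
        return min(right, key=lambda b: b.x).text  # nearest box to the right

    below = [
        b for b in boxes
        if has_digit(b) and h_overlap(label_box, b) and b.y >= label_box.y + label_box.h
    ]
    if below:
        return min(below, key=lambda b: b.y).text  # nearest box below
    return None
```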
The item value recognition program 103 then stores the supplier company name acquired in step S901, the item names and item values associated in step S909, and the Invoice Number extracted in step S910 in the item name / item value database 108 (step S911).
FIG. 11 is a diagram showing a configuration example of the item name / item value database 108.
The item name / item value database 108 stores the supplier company name 1101, the Invoice Number 1102, and the item values 1103 corresponding to the item names (Quantity, Item No., Description, Unit Price, Price). For the form shown in FIGS. 2 and 10, as in the bottom row of FIG. 11, Company A is stored as Company, 111111 as Invoice Number, 4 as Quantity, P000115 as Item No., Office Chair as Description, 40 as Unit Price, and 160 as Price.
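One way to picture the record layout of FIG. 11 is a single relational table. The sketch below uses an in-memory SQLite database; the table name, column names, column types, and the helper `parse_amount` are assumptions for illustration, since the specification does not name a storage engine or schema.

```python
import sqlite3

# Hypothetical schema mirroring the columns of FIG. 11.
SCHEMA = """
CREATE TABLE IF NOT EXISTS item_values (
    company TEXT,
    invoice_number TEXT,
    quantity INTEGER,
    item_no TEXT,
    description TEXT,
    unit_price REAL,
    price REAL
)
"""


def parse_amount(text):
    """Strip the currency symbol and thousands separators before conversion."""
    return float(text.replace("$", "").replace(",", "").strip())


def store_record(conn, company, invoice_number, fields):
    """Store one recognized item value row (step S911)."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO item_values VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            company,
            invoice_number,
            int(fields["Quantity"]),
            fields["Item No."],
            fields["Description"],
            parse_amount(fields["Unit Price"]),
            parse_amount(fields["Price"]),
        ),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_record(
    conn,
    "Company A",
    "111111",
    {
        "Quantity": "4",
        "Item No.": "P000115",
        "Description": "Office Chair",
        "Unit Price": "$40.00",
        "Price": "$160.00",
    },
)
```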
As described above, according to the embodiment of the present invention, the item value row recognition model 610 extracts information on rows containing character strings from the form to be recognized, converts the extracted row information into histograms, analyzes the histogram of the row containing the table heading and the histograms of the other rows, and uses the relationship between their row structures as a feature to determine whether each of the other rows is an item value row. Table headings and item values can therefore be associated accurately.
The row information consists of information on a rectangle defined to contain a character string, position information for the rectangle, and character information obtained by recognizing the character string. This limits the area of the form that must be analyzed and reduces the amount of computation.
The histogram represents the number of black pixels belonging to characters in each of a predetermined number of regions obtained by dividing, in the horizontal direction, the rectangle defined to contain the character strings in a row. The amount and the position of the characters within the row can thus be quantified.
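The histogram construction can be sketched as follows, assuming a binarized image held as a list of pixel rows in which 1 denotes a black pixel. The function name, the `(x, y, width, height)` rectangle format, and the bin count are illustrative assumptions.

```python
def row_histogram(image, rect, n_bins):
    """Count black pixels in each of n_bins equal-width vertical strips of rect.

    image: list of pixel rows, 1 = black, 0 = white.
    rect:  (x, y, width, height) of the rectangle enclosing the text row.
    """
    x0, y0, w, h = rect
    hist = [0] * n_bins
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            if image[y][x]:
                # Map the horizontal offset inside the rectangle to a strip index.
                hist[(x - x0) * n_bins // w] += 1
    return hist


image = [
    [1, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1, 1],
]
print(row_histogram(image, (0, 0, 8, 2), 4))  # [3, 0, 0, 3]
```

In the small example, the text sits at the left and right edges of the row, which shows up as non-zero counts in the first and last strips only.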
The item value row recognition model 610 is generated by extracting information on rows containing character strings from forms, converting the extracted row information into histograms, analyzing the histogram of the row containing the table heading and the histograms of the rows containing item values, and performing machine learning with the relationship between their row structures as a feature. Rather than taking the characters themselves as input, the model for analyzing forms is thus built from quantitative values that represent the structural features of rows, which are well suited to machine learning.
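The specification does not state how the relationship between two histograms is encoded. As one hypothetical encoding, the sketch below concatenates the normalized heading histogram, the normalized candidate-row histogram, and their cosine similarity into a feature vector that a classifier could be trained on. The function names and the choice of cosine similarity are assumptions, not the method described in the source.

```python
import math


def normalize(hist):
    """Scale a histogram so its bins sum to 1 (all zeros stay all zeros)."""
    total = sum(hist)
    return [v / total for v in hist] if total else [0.0] * len(hist)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def row_features(header_hist, row_hist):
    """Hypothetical feature vector relating a candidate row to the heading row."""
    h = normalize(header_hist)
    r = normalize(row_hist)
    return h + r + [cosine_similarity(h, r)]
```

Each training example would pair such a vector with a label saying whether the row is an item value row; any standard binary classifier could then be fitted to those pairs.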
The item value row recognition model 610 also uses at least one of the ruled lines, the blanks, and the positions of character string rows having the same structure in the form to be recognized to determine whether each of the other rows is an item value row, which improves the accuracy of item value row recognition.
The item value row recognition model 610 determines that a row following a predetermined number of consecutive blank rows is unlikely to be an item value row, so item value rows can be recognized with high accuracy even in an unknown form.
The item value row recognition model 610 determines that rows having the same row structure are likely to be item value rows, so item value rows can be recognized with high accuracy even in an unknown form.
The item value row recognition model 610 determines that a row between two ruled lines is likely to be an item value row and that a row below the bottommost ruled line is unlikely to be one, so item value rows can be recognized with high accuracy even in an unknown form.
The present invention is not limited to the embodiments described above and includes various modifications and equivalent configurations within the spirit of the appended claims. For example, the embodiments have been described in detail to make the present invention easy to understand, and the present invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of a given embodiment. For part of the configuration of each embodiment, other configurations may be added, deleted, or substituted.
Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits, or in software, by having a processor interpret and execute programs that implement the respective functions.
Information such as the programs, tables, and files that implement each function can be stored in a storage device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.
The control lines and information lines shown are those considered necessary for the explanation and do not necessarily represent all the control lines and information lines required in an implementation. In practice, almost all the components may be considered to be interconnected.

Claims (16)

  1.  A recognition device, comprising:
     a processor that executes a program, and a storage device that stores the program,
     wherein the recognition device has a recognition model that determines whether a character string extracted from a form is an item value row containing an item value,
     the recognition model is generated by converting information on rows containing character strings in forms into histograms, analyzing the histogram of a row containing a table heading and the histograms of rows containing item values, and machine-learning the relationship between their row structures, and
     the recognition model:
     extracts information on rows containing character strings from a form to be recognized;
     converts the extracted row information into histograms; and
     determines whether another row is an item value row by using, as a feature, the row-structure relationship obtained by comparing the histogram of the row containing the table heading with the histogram of the other row.
  2.  The recognition device according to claim 1,
     wherein the row information consists of information on a rectangle defined to contain a character string, position information for the rectangle, and character information obtained by recognizing the character string.
  3.  The recognition device according to claim 2,
     wherein the histogram represents the number of black pixels belonging to characters in each of a predetermined number of regions obtained by dividing, in the horizontal direction, the rectangle defined to contain the character strings in a row.
  4.  The recognition device according to claim 1,
     wherein the recognition device generates the recognition model by extracting information on rows containing character strings from forms, converting the extracted row information into histograms, analyzing the histogram of a row containing a table heading and the histograms of rows containing item values, and performing machine learning with the relationship between their row structures as a feature.
  5.  The recognition device according to claim 1,
     wherein the recognition model determines whether the other row is an item value row by using at least one of the ruled lines, the blanks, and the positions of character string rows having the same structure in the form to be recognized.
  6.  The recognition device according to claim 5,
     wherein the recognition model determines that a row following a predetermined number of consecutive blank rows is unlikely to be an item value row.
  7.  The recognition device according to claim 5,
     wherein the recognition model determines that rows having the same row structure are likely to be item value rows.
  8.  The recognition device according to claim 5,
     wherein the recognition model determines that a row between two ruled lines is likely to be an item value row and that a row below the bottommost ruled line is unlikely to be an item value row.
  9.  A recognition method executed by a recognition device,
     the recognition device having a processor that executes a program and a storage device that stores the program,
     the recognition device having a recognition model that determines whether a character string extracted from a form is an item value row containing an item value,
     the recognition model being generated by converting information on rows containing character strings in forms into histograms, analyzing the histogram of a row containing a table heading and the histograms of rows containing item values, and machine-learning the relationship between their row structures,
     the method comprising:
     extracting, by the recognition model, information on rows containing character strings from a form to be recognized;
     converting, by the recognition model, the extracted row information into histograms; and
     determining, by the recognition model, whether another row is an item value row by using, as a feature, the row-structure relationship obtained by comparing the histogram of the row containing the table heading with the histogram of the other row.
  10.  The recognition method according to claim 9,
     wherein the row information consists of information on a rectangle defined to contain a character string, position information for the rectangle, and character information obtained by recognizing the character string.
  11.  The recognition method according to claim 10,
     wherein the histogram represents the number of black pixels belonging to characters in each of a predetermined number of regions obtained by dividing, in the horizontal direction, the rectangle defined to contain the character strings in a row.
  12.  The recognition method according to claim 9,
     wherein the recognition model is generated by extracting information on rows containing character strings from forms, converting the extracted row information into histograms, analyzing the histogram of a row containing a table heading and the histograms of rows containing item values, and performing machine learning with the relationship between their row structures as a feature.
  13.  The recognition method according to claim 9,
     wherein the recognition model determines whether the other row is an item value row by using at least one of the ruled lines, the blanks, and the positions of character string rows having the same structure in the form to be recognized.
  14.  The recognition method according to claim 13,
     wherein the recognition model determines that a row following a predetermined number of consecutive blank rows is unlikely to be an item value row.
  15.  The recognition method according to claim 13,
     wherein the recognition model determines that rows having the same row structure are likely to be item value rows.
  16.  The recognition method according to claim 13,
     wherein the recognition model determines that a row between two ruled lines is likely to be an item value row and that a row below the bottommost ruled line is unlikely to be an item value row.
PCT/JP2017/001418 2016-06-30 2017-01-17 Recognition device and recognition method WO2018003153A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016129997A JP2018005462A (en) 2016-06-30 2016-06-30 Recognition device and recognition method
JP2016-129997 2016-06-30

Publications (1)

Publication Number Publication Date
WO2018003153A1 true WO2018003153A1 (en) 2018-01-04

Family

ID=60785193

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/001418 WO2018003153A1 (en) 2016-06-30 2017-01-17 Recognition device and recognition method

Country Status (2)

Country Link
JP (1) JP2018005462A (en)
WO (1) WO2018003153A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086756B (en) * 2018-06-15 2021-08-03 众安信息技术服务有限公司 Text detection analysis method, device and equipment based on deep neural network
JP7122896B2 (en) * 2018-07-17 2022-08-22 株式会社豆蔵 Form information processing apparatus, form information structuring processing method, and form information structuring processing program
JP7383882B2 (en) * 2019-01-22 2023-11-21 富士フイルムビジネスイノベーション株式会社 Information processing device and information processing program
JP7077998B2 (en) 2019-03-07 2022-05-31 セイコーエプソン株式会社 Information processing equipment
JP7452120B2 (en) 2020-03-12 2024-03-19 富士通株式会社 Image processing method, image processing program, and image processing device
CN111709339B (en) * 2020-06-09 2023-09-19 北京百度网讯科技有限公司 Bill image recognition method, device, equipment and storage medium
JP7111143B2 (en) * 2020-10-22 2022-08-02 日本電気株式会社 Image processing device, image processing method and program

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08161436A (en) * 1994-12-06 1996-06-21 Toshiba Corp Receipt reader
JP2001092921A (en) * 1999-09-17 2001-04-06 Toshiba Corp Character line area extracting method and learning method to be used for detecting character line area

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019194052A1 (en) * 2018-04-02 2019-10-10 日本電気株式会社 Image processing device, image processing method, and storage medium storing program
JP2019185139A (en) * 2018-04-02 2019-10-24 日本電気株式会社 Image processing device, image processing method, and program
US11605219B2 (en) 2018-04-02 2023-03-14 Nec Corporation Image-processing device, image-processing method, and storage medium on which program is stored
CN116071771A (en) * 2023-03-24 2023-05-05 南京燧坤智能科技有限公司 Table reconstruction method and device, nonvolatile storage medium and electronic equipment

Also Published As

Publication number Publication date
JP2018005462A (en) 2018-01-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17819523

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17819523

Country of ref document: EP

Kind code of ref document: A1