US8131087B2 - Program and apparatus for forms processing - Google Patents
- Publication number
- US8131087B2 (application US12/216,632)
- Authority
- US
- United States
- Prior art keywords
- strings
- characters
- item
- string
- data
- Prior art date
- Legal status
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/1444—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- the present invention relates to a program and apparatus for forms processing, and more particularly to a program and apparatus for forms processing, which extract prescribed keywords from a scanned form image.
- the structured-form input operation is employed when the types of forms to be entered are known.
- the layouts of forms to be entered such as the positions of keywords and so on, are previously defined.
- the form of a scanned form image is identified, and keywords are automatically extracted based on the defined layout corresponding to the form.
- this structured-form input operation has the drawback that it cannot be used when forms are of unknown types. To process forms of unknown types, layout definitions must be made manually in advance for each form to be processed, which is costly.
- the non-structured-form input operation is employed when the types of forms to be entered are unknown.
- the input operation must be done manually, so the manual input cost is very high.
- a readout region of a form image is determined through layout recognition, strings are recognized within the determined readout region through character recognition, and strings corresponding to keywords are detected from the recognized strings through word matching.
- it is not easy to perform the layout recognition and the character recognition on non-structured form images, which do not have layout definitions, and there is always a possibility of failure.
- the forms processing in related art performs matching on strings extracted through the layout recognition and the character recognition, which causes a problem that keywords cannot be extracted if the recognition processes are not done accurately.
- FIGS. 19A and 19B are views showing a case where keywords cannot be accurately extracted due to failure in layout recognition.
- FIG. 19A is the image of a form and
- FIG. 19B shows text blocks recognized by performing the layout recognition on the form image of FIG. 19A .
- a form image 901 produced by a scanner has a noise 902 due to dirt or the like of the form.
- the “ESTIMATE (PRICE)” and “ESTIMATE (PRODUCT)” are recognized as falling into one block due to the noise 902 therebetween, with the result that a text block 903 including the noise is erroneously extracted. Therefore, “ESTIMATE” and “PRICE”, and “ESTIMATE” and “PRODUCT” are separated. If the character recognition is performed on these recognized text blocks, a text block “ESTIMATE . . . ESTIMATE” 903, a text block “PRICE” 904, a text block “PRODUCT” 905, a text block “¥120,000” 906, and a text block “PERSONAL COMPUTER” 907 are obtained.
- if keywords for the matching search include “ESTIMATE PRICE” and “ESTIMATE PRODUCT”, these strings cannot be detected from the character recognition result, and thus keywords cannot be extracted.
- a keyword is represented by two types of elements: one is an item and the other is data.
- the forms processing of related art has another drawback that appropriate linking between items and data cannot be performed.
- FIGS. 20A and 20B are views showing a case where it is difficult to link an item and data.
- FIG. 20A shows a case where two items can be linked to one piece of data.
- FIG. 20B shows a case where two pieces of data can be linked to one item.
- the layout recognition process and the character recognition process are performed on a form image 910, and items, “PRICE” 911 and “TOTAL” 915, and data, “¥40,000” 912, “¥42,000” 913, and “¥82,000” 914, are obtained.
- an item and data which have almost the same vertical or horizontal coordinate, that is, an item and data which can be regarded as being arranged in the vertical direction or horizontal direction, are linked to each other.
- “¥40,000” 912 and “¥42,000” 913 are linked to the “PRICE” 911 which is arranged with them in the vertical direction.
- “¥82,000” 914 can be linked to both “PRICE” 911, which is arranged with it in the vertical direction, and “TOTAL” 915, which is arranged with it in the horizontal direction. It cannot be determined from the positional relationships alone which item should be linked to “¥82,000” 914.
- the present invention has been made in view of foregoing and intends to provide a computer-readable recording medium storing a form processing program and a form processing apparatus, which are capable of realizing stable keyword extraction, irrespective of defects in a recognition result and noise.
- this invention intends to provide the computer-readable recording medium storing the form processing program and the form processing apparatus, which are capable of linking an item and data as keywords while taking entire integrity into consideration.
- a computer-readable recording medium containing a form processing program for extracting prescribed keywords from a scanned form image.
- the form processing program causes a computer to perform as: a layout recognizer which recognizes a layout of the form image and extracts a readout region containing character images from the form image; a character recognizer which recognizes characters from the character images of the extracted readout region through character recognition, and outputs the recognized characters as a character recognition result; a possible string extractor which extracts from the character recognition result characters which are included in strings defined as the keywords in form logical definitions, and determines as possible strings combinations of extracted characters each of which satisfies positional relationships of one of the strings, the form logical definitions defining the strings as the keywords according to logical structures which are common to forms of same type; and a linking unit which, for the keywords each represented by a plurality of elements, links the possible strings belonging to different elements of the plurality of elements according to positional relationships on the form image, and determines a combination of possible strings as
- a form processing apparatus which has the above processing units for extracting prescribed keywords from a scanned form image.
- the form processing apparatus comprises: a layout recognizer which recognizes a layout of the form image and extracts a readout region containing character images from the form image; a character recognizer which recognizes characters from the character images of the extracted readout region through character recognition, and outputs the recognized characters as a character recognition result; a form logical definition memory storing form logical definitions defining the strings as the keywords according to logical structures which are common to forms of same type; a possible string extractor which reads out the form logical definitions regarding a form to be processed, extracts from the character recognition result characters which are included in the strings defined as the keywords in the form logical definitions, and determines as possible strings combinations of extracted characters each of which satisfies positional relationships of one of the strings; and a linking unit which, for the keywords each represented by a plurality of elements, links the possible strings belonging to different elements of the plurality of elements according
- FIG. 1 shows a conceptual diagram of the invention which is implemented in one embodiment.
- FIG. 2 shows an example of the hardware configuration of a form processing apparatus according to the embodiment.
- FIG. 3 shows an example of the software configuration of the form processing apparatus according to the embodiment.
- FIG. 4 shows an example of logical definitions according to the embodiment.
- FIG. 5 shows an example of a form image to be entered into the form processing apparatus according to the embodiment.
- FIG. 6 shows extraction of characters from a character recognition result according to the embodiment.
- FIG. 7 shows a casting result in an item string matching process according to the embodiment.
- FIG. 8 shows an example of a graph created according to the embodiment.
- FIG. 9 shows an example of an integrity graph table in the form processing apparatus according to the embodiment.
- FIG. 10 shows an example of validity verification in terms of character arrangement according to the embodiment.
- FIG. 11 shows an example of an item string which is written on plural lines.
- FIG. 12 is a flowchart showing how to perform an item extraction process, according to the embodiment.
- FIG. 13 is a flowchart showing how to perform the item string matching process, according to the embodiment.
- FIG. 14 is a flowchart showing how to perform a possible item string determination process, according to the embodiment.
- FIG. 15 shows an example of extraction of a * (asterisk) portion according to the embodiment.
- FIG. 16 is a flowchart showing how to perform a data extraction process, according to the embodiment.
- FIG. 17 is a flowchart showing how to perform a data string matching process, according to the embodiment.
- FIG. 18 is a flowchart showing how to perform an item and data linking process, according to the embodiment.
- FIGS. 19A and 19B are views showing a case where keywords cannot be correctly extracted due to failure in the layout recognition:
- FIG. 19A is the image of a form
- FIG. 19B shows text blocks recognized by performing layout recognition on the form image of FIG. 19A .
- FIGS. 20A and 20B are views showing a case where it is difficult to link an item and data: FIG. 20A shows a case where two items can be linked to one piece of data, and FIG. 20B shows a case where two pieces of data can be linked to one item.
- FIG. 1 is a conceptual view of the invention which is implemented in one embodiment.
- a form processing apparatus 1 has processing units including a layout recognizer 11 for extracting a readout region, a character recognizer 13 for performing a character recognition process on the readout region, a possible string extractor 15 for extracting possible strings, and a linking unit 16 for linking possible strings, and memory units for a recognition dictionary database 12 and a form logical definition database 14 .
- these processing units of the form processing apparatus 1 are realized by a computer executing a form processing program.
- the layout recognizer 11 recognizes the layout of an entered form image, extracts a readout region containing character images, and gives a notice of the readout region to the character recognizer 13 .
- the recognition dictionary database 12 stores a recognition dictionary to be used to recognize characters of the character images.
- the character recognizer 13 consults the recognition dictionary database 12 to recognize characters from the character images of the extracted readout region through the character recognition process, and outputs the recognized characters to the possible string extractor 15 as a character recognition result.
- all character types can be recognized through the character recognition process.
- the character recognition process may search for only strings and character types which are defined in form logical definitions of the form logical definition database 14 . Limiting the character types to be searched for can enhance the accuracy in the character recognition process.
- the form logical definition database 14 stores the form logical definitions which define strings as keywords according to logical structures which are common to forms of same type.
- the logical structures of forms comprise meanings, items, data, and relationships among them.
- definitions relating to an item and data which are two elements to represent a keyword are made for each category.
- the item expresses the meaning of a keyword, and item strings which are considered to be written in forms are defined.
- the data is an actual value corresponding to the meaning of a keyword, and data region attributes including normal expressions and character types which are considered to be used in forms are defined.
- the possible string extractor 15 extracts, as possible strings, combinations of recognized characters each of which satisfies relationships of a string, the relationships being defined in the form logical definitions stored in the form logical definition database 14. More specifically, with the item strings defined in the form logical definitions as keys, the possible item string extractor 15 a extracts characters which are included in the defined item strings, from the character recognition result. Then, the possible item string extractor 15 a casts the extracted characters for (that is, assigns them to) the corresponding characters of the item strings, evaluates integrity between the cast characters in terms of their positional relationships, and then determines combinations of characters each of which satisfies the positional relationships of a string. Graph theory is used for this integrity evaluation.
- with the cast characters as nodes, positional integrity is evaluated for the two characters corresponding to each pair of nodes.
- if the integrity is verified, the corresponding nodes are connected with a path, so as to create a graph.
- cliques, which are complete subgraphs of a graph, are extracted from the graph. Every node forming a clique is connected with a path to every other node in the clique, and therefore it can be said that the nodes forming the clique satisfy integrity as a whole.
- An evaluation value of each clique is calculated and the best clique is obtained. Then, a degree of matching of an item string is determined. Then, the item string with the highest degree of matching is output as a possible item string.
- the possible data string extractor 15 b extracts a possible data string from the character recognition result based on the data region attributes defined in the form logical definitions. First, portions which can be value portions of the normal expressions of the data defined in the data region attributes are extracted from the character recognition result and collected as a value portion. Then, the possible data string extractor 15 b processes the value portion and the strings included in the normal expressions, in the same way as the possible item string extractor 15 a , so as to determine a possible data string.
- the linking unit 16 links the possible item strings determined by the possible item string extractor 15 a and the possible data strings determined by the possible data string extractor 15 b according to the positional relationships defined in the form logical definitions, and determines a combination of an item string and a data string as keywords. For example, combinations are created based on the relative positional relationships between item strings and data strings, and with created combinations as nodes, integrity of two combinations is evaluated. If the integrity is verified between the two combinations, the corresponding nodes are connected with a path. Thereby, a graph is created. Then cliques are extracted from the graph, an evaluation value of each clique is calculated, and by determining the best clique, a combination of an item string and a data string is determined.
- in the form logical definition database 14, form logical definitions defining the logical structures of forms to be processed are previously stored.
- the layout recognizer 11 extracts a readout region containing the character images through the layout recognition.
- the character recognizer 13 consults the recognition dictionary stored in the recognition dictionary database 12 to recognize the characters within the extracted readout region, thus obtaining a character recognition result from the recognized characters. By performing this process, the characters existing in the form image are recognized and output to the possible string extractor 15 as the character recognition result.
- the possible item string extractor 15 a extracts, from the character recognition result, characters which are included in the item strings defined in the form logical definitions stored in the form logical definition database 14 , and casts the extracted characters for the corresponding characters of the defined item strings. Then, the possible item string extractor 15 a evaluates positional integrity between the cast characters, obtains combinations of characters each of which satisfies relationships of a string, and determines possible item strings. On the other hand, the possible data string extractor 15 b extracts portions which can be value portions of the normal expressions, from the character recognition result, based on the data region attributes defined in the form logical definitions, and collects them as a value portion.
- the possible data string extractor 15 b processes the value portion extracted from the character recognition result and the strings of normal expressions, and determines and extracts combinations each of which satisfies relationships of a string as possible data strings, in the same way as the possible item string extractor 15 a .
- the linking unit 16 links the possible item strings and the possible data strings determined by the possible string extractor 15 according to the positional relationships between an item and data defined in the form logical definitions, and determines a combination of a possible item string and a possible data string.
- the form processing apparatus 1 selects a character group with the highest degree of matching out of character groups each of which is a combination of recognized characters which satisfies relationships of a string defined in the form logical definitions defining keywords. Therefore, even if the character recognition result contains some erroneous recognition, the matching can be done with the other correctly recognized characters, so that correct matching is achieved. Similarly, even if the layout recognition fails and the character recognition result therefore does not show the correct character arrangement, correct matching can be achieved. Further, if there are a plurality of combinations of an item string and a data string, the combination with the highest integrity can be extracted, which yields the correct result.
- FIG. 2 shows an example of the hardware configuration of a form processing apparatus according to this embodiment.
- the RAM 102 temporarily stores a part of the Operating System (OS) program and application programs to be executed by the CPU 101 .
- the RAM 102 stores various kinds of data for CPU processing.
- the HDD 103 stores the OS and application programs.
- the graphics processor 104 is connected to a monitor 108 and is designed to display images on the screen of the monitor 108 under the control of the CPU 101 .
- the input device interface 105 is connected to a keyboard 109 a and a mouse 109 b and is designed to transfer signals from the keyboard 109 a and the mouse 109 b to the CPU 101 through the bus 107 .
- the communication interface 106 is connected to a scanner 20 , and is designed to transfer form image data scanned by the scanner 20 to the CPU 101 through the bus 107 . It should be noted that the scanner 20 can be directly connected with the bus 107 .
- the form processing apparatus 100 has processing units including the layout recognizer 110 , a character recognizer 130 and a keyword extractor 140 , and databases including a recognition dictionary database 120 and a logical definition database 150 .
- the keyword extractor 140 has an item extractor 160 for extracting possible item strings, a data extractor 170 for extracting possible data strings, and a linking unit 180 for linking the possible item strings and the possible data strings.
- the recognition dictionary database 120 stores dictionary information to be used for character recognition.
- the character recognizer 130 is a character recognition means that recognizes characters within the readout region extracted by the layout recognizer 110 , through the character recognition, and outputs a character recognition result.
- the logical definition database 150 stores form logical definitions (hereinafter, referred to as logical definitions) defining logical structures which are common to forms of same type. For example, estimate forms include “date information” and “request number”. Therefore, it can be assumed that forms of same type share many common elements, such as information items, even if they have different layouts.
- the logical structures define these common elements.
- the logical structures of forms comprise a set of meaning, item, and data, and relationships among them.
- the meaning expresses a functional role in the form.
- the item shows strings which are considered to be written in forms as the functional role of a corresponding meaning.
- the data is an actual value of the functional role of a corresponding meaning.
- the relationships among them mean relationships between the sets, and may be link relationships or an equation.
- the logical definition database 150 defines item strings for an item, and data region attributes for data, which will be described in detail later.
- the item extractor 160 realizes its processing functions with modules including casting of character recognition result 161 , graph creation (integrity evaluation) 162 , maximum clique determination 163 , and possible item string determination 164 .
- the casting of character recognition result 161 compares a character recognition result to the characters included in the item strings defined in the logical definitions. When a matching character is detected in the character recognition result, it is cast for a corresponding character of the defined item strings.
- the graph creation (integrity evaluation) 162 evaluates the integrity between characters obtained by the casting of the character recognition result, and creates a graph. More specifically, with the cast characters as nodes, it is determined whether there is integrity in terms of positional relationships of a string between the characters.
- the arrangement order of the characters of an item string (if “ORDER NUMBER” is defined as an item string, “R” and “D” should come after “O”) and positional relationships between characters (the characters should be arranged on the same line) are defined in the logical definition database 150.
- the integrity is determined according to such positional relationships.
- if there is integrity between two characters, the nodes corresponding to the characters are connected with a path. This process is successively performed on every character (node), and thereby a graph is created.
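The graph creation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the coordinates, the `line_tol` tolerance, and the function names are hypothetical, and the same-line test is a simplification (the embodiment also handles item strings written on plural lines). Each node pairs a position in the defined item string with a recognized character, and a path is added only when two nodes respect both the arrangement order and the same-line condition.

```python
from itertools import combinations

# Hypothetical recognition result: id -> (character, x, y).
# Ids 0-3 spell "DATE" on one line; id 4 is a stray "E" on another line,
# id 5 is a stray "T" on the same line but to the left of "D".
CHARS = {
    0: ("D", 10, 0), 1: ("A", 20, 0), 2: ("T", 30, 0), 3: ("E", 40, 0),
    4: ("E", 15, 60), 5: ("T", 5, 0),
}

def cast_nodes(item_string, chars):
    """Cast every recognized character for each matching position of the item string."""
    return [(pos, cid)
            for pos, ch in enumerate(item_string)
            for cid, (c, _x, _y) in sorted(chars.items())
            if c == ch]

def has_integrity(a, b, chars, line_tol=5):
    """Assumed rule: same line, and left-to-right order agrees with string order."""
    (pos_a, id_a), (pos_b, id_b) = a, b
    if pos_a == pos_b or id_a == id_b:
        return False  # one position or one character cannot be used twice
    _, xa, ya = chars[id_a]
    _, xb, yb = chars[id_b]
    same_line = abs(ya - yb) <= line_tol
    in_order = (xa < xb) == (pos_a < pos_b)
    return same_line and in_order

def build_graph(item_string, chars):
    """Return the nodes and the set of paths (edges) between consistent nodes."""
    nodes = cast_nodes(item_string, chars)
    edges = {frozenset((a, b))
             for a, b in combinations(nodes, 2)
             if has_integrity(a, b, chars)}
    return nodes, edges

NODES, EDGES = build_graph("DATE", CHARS)
print(len(NODES), len(EDGES))
```

In this toy input the four correctly placed characters are mutually connected, while the stray characters pick up few or no paths, which is what lets the later clique step discard them.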
- the maximum clique determination 163 extracts cliques from the graph created by the graph creation (integrity evaluation) 162 , selects appropriate cliques and determines the maximum clique with the highest degree of matching.
- a degree of matching can be calculated based on the ratio of nodes to a character group including the clique. If a plurality of item strings are defined for one category, the maximum clique is determined for each of the item strings. Then, one possible item string is selected for the category.
- the possible item string determination 164 outputs as a possible item string one of the item strings belonging to the category, based on the maximum clique with the highest degree of matching out of the maximum cliques detected by the maximum clique determination 163 .
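As a rough sketch of the selection step, the snippet below scores each item string of a category by a degree of matching and picks the best one. The text only says the degree is based on a ratio of nodes to a character group, so the formula used here (clique size divided by the number of letters in the item string) is an assumption, and the clique sizes are invented inputs for illustration.

```python
def degree_of_matching(clique_size, item_string):
    """Assumed measure: share of the item string's letters covered by the clique."""
    n_letters = sum(1 for ch in item_string if ch.isalpha())
    return clique_size / n_letters

def select_possible_item_string(max_clique_sizes):
    """Pick the item string whose maximum clique gives the highest degree of matching."""
    return max(max_clique_sizes,
               key=lambda s: degree_of_matching(max_clique_sizes[s], s))

# Hypothetical maximum-clique sizes for three item strings of one category.
SIZES = {"YOUR RECEPTION No.": 15, "ORDER NUMBER": 4, "RECEPTION NUMBER": 9}
BEST = select_possible_item_string(SIZES)
print(BEST)
```

Normalizing by string length keeps a long item string from winning merely because it offers more positions to match.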
- a graph is an abstraction of “dots and lines connecting them” which focuses on the way a group of nodes (also called vertices) is connected by a group of paths (also called branches or edges), and graph theory studies the various properties that a graph has.
- a group of nodes in which every pair of nodes is connected by a path in the graph is called a clique
- a technique of extracting the maximum clique from cliques is called maximum clique extraction.
- a technique of extracting the maximum clique in a graph is well known, and for example, is disclosed in C. Bron and J. Kerbosch, “Finding all cliques of an undirected graph”, Commun. ACM, Vol. 16, No. 9, pp. 575 to 577, 1973.
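For reference, the basic (non-pivoting) form of the Bron and Kerbosch procedure can be sketched as below. It enumerates all maximal cliques, and the maximum clique is then the largest of them. The adjacency dictionary and function names are illustrative choices, not taken from the patent.

```python
def bron_kerbosch(r, p, x, adj, out):
    """Report every maximal clique that extends r using candidates p, excluding x."""
    if not p and not x:
        out.append(set(r))  # r cannot be extended, so it is maximal
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}  # v has been fully explored
        x = x | {v}  # exclude v from later cliques to avoid duplicates

def maximal_cliques(adj):
    """adj maps each node to the set of nodes it shares a path with."""
    out = []
    bron_kerbosch(set(), set(adj), set(), adj, out)
    return out

# Small example graph: a triangle 1-2-3 plus a pendant path 3-4.
ADJ = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
CLIQUES = maximal_cliques(ADJ)
MAXIMUM = max(CLIQUES, key=len)
print(sorted(map(sorted, CLIQUES)), sorted(MAXIMUM))
```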
- the data extractor 170 realizes its processing functions with modules including * (asterisk) portion extraction 171 , casting of character recognition result 172 , graph creation (integrity evaluation) 173 , and possible data string determination (maximum clique determination) 174 .
- a date is represented by “(month)*, *”.
- * represents any numerals or signs. This sign “*” which is usable instead of any characters is called a wild card.
- the * portion extraction 171 collectively extracts wild card portions of data from a character recognition result, and takes them as a * portion all together.
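One way to picture the * portion extraction is as collecting every run of characters that a wild card could stand for. The sketch below assumes that "numerals or signs" means digits plus "/" and "-"; that character set, the regular expression, and the function name are assumptions for illustration only.

```python
import re

# Assumed wild-card alphabet: digits, "/" and "-".
WILDCARD_RUN = re.compile(r"[0-9/\-]+")

def extract_wildcard_portions(recognized_strings):
    """Collect all runs that could fill a "*" in a data expression style."""
    return [m.group()
            for s in recognized_strings
            for m in WILDCARD_RUN.finditer(s)]

# Strings taken from the character recognition result of the example form.
RESULT = extract_wildcard_portions(
    ["ESTIMATE", "Sep. 25, 2005", "20050925-0101", "Tel."])
print(RESULT)
```

The collected runs then play the same role for data strings that single extracted characters play for item strings in the casting step.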
- the casting of character recognition result 172 performs casting on the strings and the * portions which are extracted from the character recognition result and are included in the normal expressions of the data, in the same way as the casting of character recognition result 161 .
- the graph creation (integrity evaluation) 173 creates a graph, in the same way as the graph creation (integrity evaluation) 162.
- the possible data string determination (maximum clique determination) 174 determines the maximum clique with the highest degree of matching as a possible data string, in the same way as the possible item string determination 164 .
- the linking unit 180 realizes its processing functions with modules including item and data combination 181 , graph creation (integrity evaluation) 182 , and a combination determination (maximum clique determination) 183 .
- the item and data combination 181 detects all possible combinations from among the possible item strings extracted by the item extractor 160 and the possible data strings extracted by the data extractor 170 .
- the graph creation (integrity evaluation) 182 takes the detected combinations as nodes, and creates a graph by connecting with a path nodes corresponding to combinations having integrity in terms of positional relationships.
- the combination determination 183 determines the maximum clique with the highest integrity from the graph. That is, a combination of an item string and a data string with the highest integrity is determined.
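The linking step can be illustrated with a small sketch. Everything concrete here is assumed for the example: the coordinates (y grows downward), the tolerance, the "right" and "below" tests, and the integrity rule that two combinations are consistent only when they share neither an item nor a data string. With so few nodes, the maximum clique is found by brute force rather than by a dedicated clique algorithm.

```python
from itertools import combinations

# Hypothetical positions (x, y) of possible item strings and data strings.
ITEMS = {"ORDER NUMBER": (50, 0), "DATE": (0, 20)}
DATA = {"2005/09/25": (50, 20), "20050925-0101": (100, 0)}

def candidate_links(items, data, tol=5):
    """Combine an item with each data string located to its right or below it."""
    links = []
    for iname, (ix, iy) in items.items():
        for dname, (dx, dy) in data.items():
            right = abs(dy - iy) <= tol and dx > ix
            below = abs(dx - ix) <= tol and dy > iy
            if right or below:
                links.append((iname, dname))
    return links

def compatible(a, b):
    """Assumed integrity rule: combinations share neither item nor data."""
    return a[0] != b[0] and a[1] != b[1]

def best_linking(links):
    """Largest set of mutually compatible combinations (brute-force maximum clique)."""
    for size in range(len(links), 0, -1):
        for subset in combinations(links, size):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                return set(subset)
    return set()

LINKS = candidate_links(ITEMS, DATA)
BEST = best_linking(LINKS)
print(LINKS, BEST)
```

Here "2005/09/25" lies both below "ORDER NUMBER" and to the right of "DATE", so position alone is ambiguous; considering all combinations together leaves only the assignment in which every item and every data string is used once.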
- the logical definitions define meanings, items, and data which form logical structures.
- Keywords are classified into categories 201 according to their meanings.
- a date 210 and a form number 220 are defined.
- strings expressing a meaning that is, item strings 202 are defined for each category.
- “DATE” and “ISSUE DATE” are defined for the date 210 .
- “YOUR RECEPTION” and “ORDER NUMBER” are defined for the form number 220 .
- characters 203 and normal expressions 204 to be used for the data are defined, respectively, for each category.
- as the characters to be used for data 203, character types to be used for actual values are defined. For example, for the date 210, it is defined that data is written in “numerals”.
- as the normal expressions to be used for data 204, the expression styles of data are defined. For example, for the date 210, expression styles of “*/*/*” and “(month) *,*” are used.
- positional relationships between an item and data, and characters 206 that may exist between an item and data, are defined for each category as necessary. For example, for the date 210, “right” and “below” are defined as the positional relationships. This means that data should be arranged on the right side of or below a region where an item is arranged.
- as the characters 206 that may exist between an item and data, “:” is defined. This means that, if “:” exists between an extracted possible item string and possible data string, integrity is verified therebetween.
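The logical definitions described above could be held in a structure like the following sketch. The class and field names are hypothetical. Only values stated in the text are filled in; the data attributes of the form number are left empty because they are not given here, and attaching “:” to the date category is an assumption, since the text does not say which category it belongs to.

```python
from dataclasses import dataclass, field

@dataclass
class CategoryDefinition:
    """One category of the logical definitions (field names are illustrative)."""
    name: str
    item_strings: list
    data_characters: list = field(default_factory=list)     # character types for data
    normal_expressions: list = field(default_factory=list)  # expression styles for data
    positions: list = field(default_factory=list)           # where data lies relative to item
    separators: list = field(default_factory=list)          # characters between item and data

LOGICAL_DEFINITIONS = {
    "date": CategoryDefinition(
        name="date",
        item_strings=["DATE", "ISSUE DATE"],
        data_characters=["numerals"],
        normal_expressions=["*/*/*", "(month) *,*"],
        positions=["right", "below"],
        separators=[":"],  # assumption: the category for ":" is not stated
    ),
    "form number": CategoryDefinition(
        name="form number",
        item_strings=["YOUR RECEPTION", "ORDER NUMBER"],
        # data attributes are not given in this part of the text
    ),
}

def all_item_strings(definitions):
    """Flatten the item strings of every category, e.g. to drive the casting step."""
    return [s for d in definitions.values() for s in d.item_strings]

print(all_item_strings(LOGICAL_DEFINITIONS))
```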
- FIG. 5 shows an example of a form image to be entered in the form processing apparatus according to this embodiment.
- a form image 300 is a part of “ESTIMATE” form, and the layout recognizer 110 enters the form image 300 and extracts a readout region through the layout recognition process.
- the character recognizer 130 recognizes characters of all character types within the readout region through the character recognition process.
- “ESTIMATE” 301, “Sep. 25, 2005” 302, “B Inc.” 303, “YOUR RECEPTION No.” 304, “20050925-0101” 305, “A Co. Ltd” 306, “Tel.” 307, and “044-754-2678” 308 are output as a character recognition result.
- characters to be recognized can be limited based on the logical definitions stored in the logical definition database 150 .
- the accuracy of the character recognition can thereby be enhanced.
- the keyword extractor 140 starts its processing.
- the keyword extraction process includes extraction of possible item strings by the item extractor 160 , extraction of possible data strings by the data extractor 170 , and linking of the possible item strings and possible data strings by the linking unit 180 .
- an item string expressing an item is extracted from a character recognition result based on the item strings defined in the logical definitions.
- Second the casting of character recognition result 161 extracts, from the character recognition result, characters which are included in the item strings defined in the logical definitions stored in the logical definition database 150 , and performs casting. It is assumed that “YOUR RECEPTION No.”, “ORDER NUMBER”, and “RECEPTION NUMBER” are defined as item strings for the form of the form image 300 . From the character recognition result, characters which are included in the defined item strings are extracted.
- FIG. 6 shows extraction of characters from a character recognition result according to the embodiment, that is, the portions of characters extracted from the form image 300 , and has the same reference numerals as FIG. 5 .
- Characters which are defined in the item strings are sequentially extracted from the character recognition result. At this time, the extracted characters are labeled with numerals for the sake of convenience. For example, out of “YOUR RECEPTION No.” 304 , “Y( 1 )”, “O( 2 )”, “U( 3 )”, “R( 4 )”, “R( 5 )”, “E( 6 )”, “C( 7 )”, “E( 8 )”, “P( 9 )”, “T( 10 )”, “I( 11 )”, “O( 12 )”, “N( 13 )”, “N( 14 )”, and “o( 15 )” are extracted. Similarly, “C( 16 )” and “o( 17 )” out of “A Co. Ltd.” 306 , “T( 18 )” out of “Tel.” 307 , and “I( 19 )” out of “B Inc.” 303 are extracted. Then, the extracted characters are cast for the corresponding characters of the item strings.
- FIG. 7 shows a casting result in an item string matching process according to this embodiment, and has the same reference numerals as FIG. 6 .
- ( 1 ) is cast for “Y”, ( 2 ) and ( 12 ) for “O”, ( 3 ) for “U”, ( 4 ) and ( 5 ) for “R”, ( 4 ) and ( 5 ) for “R”, ( 6 ) and ( 8 ) for “E”, ( 7 ) and ( 16 ) for “C”, ( 6 ) and ( 8 ) for “E”, ( 9 ) for “P”, ( 10 ) and ( 18 ) for “T”, ( 11 ) and ( 19 ) for “I”, ( 2 ) and ( 12 ) for “O”, ( 13 ) and ( 14 ) for “N”, ( 13 ) and ( 14 ) for “N”, and ( 15 ) and ( 17 ) for “o”.
- ( 4 ) and ( 5 ) are cast for “R”, ( 6 ) and ( 8 ) for “E”, ( 7 ) and ( 16 ) for “C”, ( 6 ) and ( 8 ) for “E”, ( 9 ) for “P”, ( 10 ) and ( 18 ) for “T”, ( 11 ) and ( 19 ) for “I”, ( 2 ) and ( 12 ) for “O”, ( 13 ) and ( 14 ) for “N”, ( 13 ) and ( 14 ) for “N”, ( 3 ) for “U”, ( 6 ) and ( 8 ) for “E”, and ( 4 ) and ( 5 ) for “R”.
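The casting step can be sketched as a lookup from each character of an item string to the labels of the recognized characters equal to it. This is a minimal illustration in Python; the function name `cast` and the `(label, character)` input format are assumptions, and the labels are the numerals ( 1 ) to ( 19 ) used in the example above.

```python
from collections import defaultdict


def cast(recognized, item_string):
    """Return, for each character of item_string that was recognized somewhere,
    the labels of the recognized characters equal to it."""
    by_char = defaultdict(list)
    for label, ch in recognized:
        by_char[ch].append(label)
    return [(ch, sorted(by_char[ch])) for ch in item_string if ch in by_char]


# Labeled characters from the example: (1)..(15) from "YOUR RECEPTION No.",
# then C(16), o(17), T(18) and I(19) from the other strings.
recognized = list(enumerate("YOURRECEPTIONNo", start=1))
recognized += [(16, "C"), (17, "o"), (18, "T"), (19, "I")]

casting = cast(recognized, "YOUR RECEPTION No.")
```

Here `casting` starts with `("Y", [1])` and `("O", [2, 12])`, which matches the first entries listed for “YOUR RECEPTION No.”.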
- the graph creation (integrity evaluation) 162 creates a graph with the cast characters as nodes, based on the casting result.
- the relationships of a string and positional integrity are checked. It can be said that two characters A and B have positional integrity when one of the characters is the i-th character and the other is the j-th character in a corresponding item string (here, i < j), the x-coordinate of A < the x-coordinate of B, and the y-coordinate of A is almost equal to the y-coordinate of B.
- x represents a horizontal coordinate axis while y represents a vertical coordinate axis.
- the average size of characters is an average of the long-side lengths of the circumscribed rectangles of all the characters.
- the nodes corresponding to the two characters are connected with a path, thereby creating a graph.
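The integrity check and graph creation can be sketched as follows. This is an illustration under stated assumptions, not the patented method: each recognized character is reduced to one point `(x, y)` plus the set of positions it was cast for, the tolerance `y_tol` stands in for “almost equal” (the source relates it to the average character size but the exact rule is not reproduced here), and the coordinates in the example are invented.

```python
from itertools import combinations


def has_integrity(a, b, y_tol):
    """True when the left character can precede the right one in the item string
    and both lie on roughly the same line."""
    if abs(a["y"] - b["y"]) > y_tol:
        return False
    if a["x"] == b["x"]:
        return False
    left, right = (a, b) if a["x"] < b["x"] else (b, a)
    # Some cast position of the left character must come before
    # some cast position of the right character (i < j).
    return min(left["positions"]) < max(right["positions"])


def build_graph(chars, y_tol):
    """Connect every pair of nodes that has positional integrity."""
    return {
        (p, q)
        for p, q in combinations(sorted(chars), 2)
        if has_integrity(chars[p], chars[q], y_tol)
    }


chars = {
    1: {"x": 0, "y": 100, "positions": {0}},       # "Y"
    2: {"x": 12, "y": 101, "positions": {1, 11}},  # "O"
    16: {"x": 40, "y": 300, "positions": {6}},     # "C" on another line
}
edges = build_graph(chars, y_tol=5)
```

Only nodes 1 and 2 end up connected, because node 16 lies on a different line.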
- FIG. 8 shows an example of a created graph according to the embodiment.
- the nodes have numerals ( 1 ), ( 2 ), ( 3 ), ( 4 ), ( 5 ), ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ), ( 11 ), ( 12 ), ( 13 ), ( 14 ), ( 15 ), ( 16 ), ( 17 ), ( 18 ), and ( 19 ) given to corresponding characters shown in FIG. 6 .
- cliques, which are partial complete graphs (complete subgraphs) of a graph, are extracted. Every node forming a clique is connected with a path to all the other nodes in the clique.
- FIG. 9 shows an example of the integrity graph table in the form processing apparatus according to the embodiment.
- nodes are arranged vertically and horizontally, and a path status is set in the column at each intersecting point.
- “ 1 ” represents a connection with a path.
- “ 0 ” represents no connection with a path.
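An integrity graph table of this kind is an adjacency matrix, and cliques can be enumerated from it. The source does not name an enumeration algorithm; the sketch below swaps in the basic Bron-Kerbosch procedure as one possible choice. The function names and the six-node example graph are hypothetical.

```python
def adjacency_table(n, edges):
    """Build an n x n table with 1 for a connection with a path and 0 otherwise."""
    table = [[0] * n for _ in range(n)]
    for a, b in edges:
        table[a][b] = table[b][a] = 1
    return table


def maximal_cliques(table):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    n = len(table)
    neighbors = [{j for j in range(n) if table[i][j]} for i in range(n)]
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & neighbors[v], x & neighbors[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return sorted(found)


table = adjacency_table(6, [(0, 1), (0, 2), (1, 2), (3, 4)])
cliques = maximal_cliques(table)
```

For this small graph the result is one three-node clique, one two-node clique, and one isolated node.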
- the maximum clique determination 163 selects only appropriate cliques from the extracted cliques. For example, only cliques which have the number of nodes greater than a prescribed threshold are selected. Thereby only cliques having acceptable matching to item strings are selected. In the case where the threshold is set to two in this example, “ 19 ” out of the extracted cliques, “ 1 , . . . , 15 ”, “ 16 , 17 , 18 ”, and “ 19 ”, is excluded. A clique which has only one-character matching to an item string cannot be considered to be the item string, and therefore this clique is excluded from cliques.
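The threshold selection in this example can be written in a few lines. This is a minimal sketch; `select_cliques` is a hypothetical name, and the cliques are the three listed above with a threshold of two.

```python
def select_cliques(cliques, threshold):
    """Keep only cliques whose number of nodes is greater than the threshold."""
    return [clique for clique in cliques if len(clique) > threshold]


cliques = [list(range(1, 16)), [16, 17, 18], [19]]
selected = select_cliques(cliques, threshold=2)
```

The single-node clique “19” is dropped, while the other two remain.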
- a center point of characters forming a region surrounding the character group (the center point of the region) is calculated by dividing the length of the region by the number of characters. Then, the difference between the calculated center point and an actual center point of characters is obtained, and if the difference exceeds a prescribed threshold, the clique is considered to be inappropriate and is deleted.
- the average size of characters is m
- the number of characters is n
- the difference of each character is d
- equation (3) is satisfied.
- the average size of characters is obtained as an average of the long-side lengths of the circumscribed rectangles of all the characters.
- FIG. 10 shows an example of validity verification in terms of character arrangement according to this embodiment.
- the possible item string determination 164 compares the maximum cliques each determined for each of the item strings, in terms of the degree of matching, and determines the clique with the highest degree of matching as a possible item string. If some cliques have the same highest degree of matching, the cliques are all output.
- the degree of matching is calculated, for example, based on the number of nodes included in a character group or a ratio of matching characters to a string.
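One way to read this is sketched below, using the ratio of matching characters to the length of the item string as the degree of matching. The function names are hypothetical, spaces are ignored when measuring the item string (an assumption), and the clique sizes for the second and third item strings are invented for illustration.

```python
def degree_of_matching(clique_size, item_string):
    """Ratio of matched characters to the non-space length of the item string."""
    length = sum(1 for ch in item_string if not ch.isspace())
    return clique_size / length


def possible_item_strings(max_clique_sizes):
    """Return every item string whose maximum clique has the highest degree of matching."""
    scores = {s: degree_of_matching(n, s) for s, n in max_clique_sizes.items()}
    best = max(scores.values())
    return [s for s, score in scores.items() if score == best]


winners = possible_item_strings({
    "YOUR RECEPTION No.": 15,  # 15 of 16 characters matched
    "RECEPTION NUMBER": 9,     # illustrative value
    "ORDER NUMBER": 3,         # illustrative value
})
```

When several item strings tie for the highest score, all of them are returned, in line with the rule that such cliques are all output.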
- character re-recognition is performed by limiting character types based on the remaining cliques. More specifically, the character re-recognition may be performed, which searches for only the characters of item strings forming the remaining cliques. Out of character re-recognition results, only results whose recognition reliability is beyond a prescribed threshold remain, and the other results are deleted. Then, the casting, the graph creation, and the clique extraction are performed on the remaining character re-recognition results. The number of nodes forming each clique is counted, and this number is taken as an evaluation value of the clique. Then a clique with the highest evaluation value is output. If some cliques have the same highest evaluation value, they are all output.
- FIG. 11 shows an example of an item string which is written on plural lines.
- When ranges 501 y and 502 y obtained by projecting the two strings in the y direction do not overlap, it can be judged that the two strings do not overlap one above the other.
- When ranges 501 x and 502 x obtained by projecting the two strings in the x direction overlap, it can be judged that the two strings overlap from side to side.
- the latter “number” 502 comes after and is arranged below “Estimate” 501 on the image.
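The projection test for an item string written on plural lines can be sketched with bounding boxes. This is an illustration only: boxes are assumed to be `(x0, y0, x1, y1)` with y increasing downward, the function names are hypothetical, and the coordinates are invented.

```python
def ranges_overlap(a, b):
    """True when closed intervals a = (lo, hi) and b = (lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


def continues_on_next_line(upper, lower):
    """True when the x projections overlap, the y projections do not,
    and the second box lies below the first."""
    x_overlap = ranges_overlap((upper[0], upper[2]), (lower[0], lower[2]))
    y_overlap = ranges_overlap((upper[1], upper[3]), (lower[1], lower[3]))
    return x_overlap and not y_overlap and upper[3] < lower[1]


estimate_box = (10, 10, 90, 30)  # stands in for "Estimate" 501
number_box = (10, 35, 80, 55)    # stands in for "number" 502
two_lines = continues_on_next_line(estimate_box, number_box)
```

Two boxes on the same line fail the test because their y projections overlap.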
- This process is started when a character recognition result is entered.
- Step S 13 : the item string matching process is performed on the item string Sj of category Ci. This process will be described in detail later. By performing the matching process, the maximum clique for the item string Sj of category Ci is determined.
- Step S 14 : in order to perform the process for the next item string, j is incremented by one.
- Step S 15 : j is compared with the number of item strings defined in the logical definitions. If j does not reach the number of item strings, the process returns to step S 13 to perform the matching process on the next item string. If j reaches the number of item strings, the process for all item strings is completed.
- Step S 16 : by repeating the process from step S 13 to step S 15 , the maximum cliques for all of the item strings of category Ci are determined, and then a possible item string determination process for category Ci is performed. This process will be described in detail later.
- Step S 17 : in order to perform the process for the next category, i is incremented by one.
- Step S 18 : i is compared with the number of categories defined in the logical definitions. If i does not reach the number of categories, the process returns to step S 12 and the process is performed for the next category. If the process for all categories has been completed, the process is completed.
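The control flow of steps S 13 to S 18 amounts to two nested loops, one over categories and one over the item strings of each category. The sketch below shows only that structure; the matching and determination steps are passed in as stand-in callables, and the category name, the callables, and the clique sizes are hypothetical.

```python
def run_item_extraction(definitions, match_item_string, decide_possible_items):
    """Loop over categories (index i) and their item strings (index j)."""
    results = {}
    for category, item_strings in definitions.items():   # steps S 17 and S 18
        max_cliques = {}
        for item_string in item_strings:                  # steps S 14 and S 15
            max_cliques[item_string] = match_item_string(item_string)  # step S 13
        results[category] = decide_possible_items(max_cliques)         # step S 16
    return results


definitions = {
    "reception number": ["YOUR RECEPTION No.", "ORDER NUMBER", "RECEPTION NUMBER"],
}
sizes = {"YOUR RECEPTION No.": 15, "ORDER NUMBER": 3, "RECEPTION NUMBER": 9}

result = run_item_extraction(
    definitions,
    match_item_string=lambda s: sizes[s],  # stand-in for the matching process
    decide_possible_items=lambda cliques: max(cliques, key=cliques.get),
)
```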
- the item string matching process will be described with reference to a flowchart shown in FIG. 13 .
- This process is started when an item string Sj is specified.
- Step S 131 : characters which are included in the item string Sj defined in the logical definitions stored in the logical definition database 150 are extracted from the character recognition result and are cast.
- a graph is created based on the casting result with the cast characters as nodes. First, with respect to all pairs of characters out of the cast characters, relationships of a string and positional integrity are checked, and when integrity is verified, the corresponding nodes are connected with a path, thereby creating a graph.
- cliques, which are partial complete graphs of a graph, are extracted from the graph created at step S 132 . Every node forming a clique is connected with a path to all the other nodes in the clique.
- Step S 134 : out of the cliques extracted at step S 133 , only cliques each of which has the number of nodes greater than a prescribed threshold are selected. Further, validity of a clique in terms of the character arrangement is checked, and inappropriate cliques are deleted. Out of the remaining cliques, a clique with the highest degree of matching is selected and is output as the maximum clique.
- Step S 162 : i is compared with the number of item strings defined in the category. If i reaches the number of item strings, this process is completed.
- Step S 163 : as i does not reach the number of item strings, the degree of matching Pi of the i-th item string is compared with the highest degree of matching Pt. More specifically, it is checked whether Pi < Pt, and when Pi is lower than Pt, the current highest degree of matching Pt is considered to be still the highest, and the process goes on to step S 167 .
- Step S 167 : i is incremented by one and the process returns to step S 162 to perform the process for the next item string.
- the number of possible item strings n and the possible item strings q[i] (0 ≤ i < n) can be obtained.
- the data extraction process is performed to extract, from a character recognition result, data which is defined in data normal expressions, based on the data normal expressions and data types of the logical definitions.
- the logical definitions define such data attributes as normal expressions. For example, “*/*/*”, etc. are defined for the data of date keyword, “* yen”, “¥*”, etc. are defined for an amount of money.
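For intuition, a normal expression such as “*/*/*” behaves like a simple regular-expression template in which each * stands for a run of the defined character type. The sketch below illustrates that reading with Python's `re` module; it is not how the apparatus matches data (which works on recognized characters and their positions), and the function name and the choice of `\d+` for numerals are assumptions.

```python
import re


def pattern_to_regex(pattern, star=r"\d+"):
    """Turn a '*' template into a compiled regular expression.
    Literal parts are escaped; each '*' becomes the `star` sub-pattern."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(star.join(literal_parts))


date_style = pattern_to_regex("*/*/*")
amount_style = pattern_to_regex("* yen")

is_date = date_style.fullmatch("9/30/2004") is not None
is_amount = amount_style.fullmatch("1200 yen") is not None
```

Using `fullmatch` keeps a longer string such as “9/30/2004-01” from being accepted as a date.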
- the * portion extractor 171 reads out the types (numerals, etc.) of a * portion and other characters (“/ (slash)”, “. (dot)”, “- (hyphen)”, etc.), which are defined as normal expressions, from the logical definition database 150 , and extracts matching characters from the character recognition result. At this time, a condition that characters other than dot and hyphen should be larger than a certain size is imposed. A group of the extracted characters is taken as A.
- a surrounding region is created by adding margins (margin of m on the right and left and margin of n at the top and bottom) to the circumscribed rectangle of a character belonging to A. Then the character is linked to another character which belongs to A, is the closest to the character, and has an overlapping y coordinate with the character. The linked characters in A are combined, and are extracted as a * portion of the region.
- FIG. 15 shows one example of extraction of a * portion according to this embodiment. This figure shows a case of extracting data which is defined in a normal expression of “*/*/*” (* is a numeral).
- numerals are extracted from a character recognition result.
- a character group A ( 601 ) of “9/30/2004” is taken out.
- the characters belonging to A neighboring characters are linked to each other.
- a surrounding region 603 obtained by adding a margin of m on the right and left and a margin of n at the top and bottom to the circumscribed rectangle 602 of “9” is set, and then “/” is linked to a character which belongs to A, exists within the surrounding region, is closest to “9”, and has almost the same y-coordinate.
- “9” is linked to “/”.
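The linking of neighboring characters can be sketched as follows. This is a simplified illustration, not the patented procedure: characters are assumed to be `(text, x0, y0, x1, y1)` tuples, closeness is measured between horizontal centers, a small union-find merges linked characters into groups, and the coordinates are invented.

```python
def link_characters(chars, m, n):
    """Group characters by linking each one to its closest neighbor that lies
    inside its surrounding region and overlaps it in the y direction."""
    parent = list(range(len(chars)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, x0, y0, x1, y1) in enumerate(chars):
        rx0, ry0, rx1, ry1 = x0 - m, y0 - n, x1 + m, y1 + n  # surrounding region
        best, best_gap = None, None
        for j, (_, u0, v0, u1, v1) in enumerate(chars):
            if j == i:
                continue
            inside = u0 <= rx1 and u1 >= rx0 and v0 <= ry1 and v1 >= ry0
            y_overlap = v0 <= y1 and v1 >= y0
            if inside and y_overlap:
                gap = abs((u0 + u1) / 2 - (x0 + x1) / 2)
                if best_gap is None or gap < best_gap:
                    best, best_gap = j, gap
        if best is not None:
            parent[find(i)] = find(best)

    groups = {}
    for i in sorted(range(len(chars)), key=lambda k: chars[k][1]):
        groups.setdefault(find(i), []).append(chars[i][0])
    return sorted("".join(group) for group in groups.values())


# "9/30/2004" laid out left to right, plus one distant numeral.
chars = [(c, 10 * k, 0, 10 * k + 8, 12) for k, c in enumerate("9/30/2004")]
chars.append(("7", 200, 0, 208, 12))
linked = link_characters(chars, m=4, n=2)
```

The nine adjacent characters are combined into one group, while the distant “7” stays on its own.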
- the casting of character recognition result 172 casts the character portions and the * portion which is indicated by * in the normal expressions, in the same way as the casting of character recognition result 161 of the item extractor 160 .
- the casting is done for each character and for the * portion. In the case of the character group A ( 601 ), the casting of the characters “/”, and the * portion “9 30 2004” is done.
- the graph creation (integrity evaluation) 173 and the possible data string determination (maximum clique determination) 174 create a graph, extract cliques, and determine a maximum clique, in the same way as the graph creation (integrity evaluation) 162 and the maximum clique determination 163 of the item extractor 160 .
- the graph creation with each of the cast characters and the * portion as a node, when two nodes have integrity in terms of positional relationships, the two nodes are connected with a path.
- the maximum clique extraction only cliques which have the number of nodes greater than a prescribed threshold are selected. If there are one or more other characters between two characters corresponding to two neighboring nodes of a clique on the image, the clique is judged inappropriate and is deleted. With respect to the remaining cliques having the number of nodes greater than the prescribed threshold, their regions and characters are all output.
- This process is started when a character recognition result is entered.
- Step S 24 : in order to perform the process for the next normal expression, j is incremented by one.
- Step S 25 : j is compared with the number of normal expressions defined in the logical definitions. If j does not reach the number of normal expressions, the process returns to step S 23 to perform the matching process for the next normal expression. If j does, the process for all the normal expressions is completed.
- Step S 26 : in order to perform the process for the next category, i is incremented by one.
- This process is started when a normal expression Rj is specified.
- Step S 231 : a * portion of defined type and other strings which are defined in the data normal expressions stored in the logical definition database 150 are taken out from a character recognition result. Neighboring characters are linked to one another, and the collected character group is extracted as a * portion.
- Step S 232 : the characters and a portion represented by * (numerals or the like) in the normal expressions, which are included in the character group extracted as the * portion, are cast.
- Step S 233 : with each of the cast characters and * portion as a node, when two nodes have integrity, the two nodes are connected with a path, so as to thereby create a graph.
- cliques, which are partial complete graphs of a graph, are extracted from the graph created at step S 233 . Every node forming a clique is connected with a path to all the other nodes in the clique.
- Step S 235 : out of the cliques extracted at step S 234 , only cliques which have the number of nodes greater than a prescribed threshold are selected. Further, validity of the cliques in terms of character arrangement is checked, and inappropriate cliques are deleted. With respect to only the remaining cliques having the number of nodes greater than the prescribed threshold, the regions and characters corresponding to the cliques are all output.
- the item and data linking 181 sets a surrounding region of the possible item string I based on the relationships (the relative position 205 of data with respect to an item) between an item and data for category C defined in the logical definitions. For example, if “below” is defined, the surrounding region is set below the possible item string I. If “right” is defined, the surrounding region is set on the right side of the possible item string I. If there is a possible data string which is within the set surrounding region and satisfies the normal expressions of data for category C, the possible data string and the possible item string are combined. In this connection, if any character other than characters that may exist between an item and data for category C as defined in the logical definitions exists within a circumscribed rectangle including both the possible item string and the possible data string, the possible item string and the possible data string are not combined.
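The linking rule can be sketched with bounding boxes. Several simplifications are assumed here: boxes are `(x0, y0, x1, y1)` with y increasing downward, the size of the surrounding region is controlled by a hypothetical `reach` parameter, “within the region” is approximated by box intersection, the data string is taken to have already satisfied its normal expression, and the caller supplies the characters found between the two strings.

```python
def surrounding_region(item_box, position, reach):
    """Region to the right of or below the item box, extended by `reach`."""
    x0, y0, x1, y1 = item_box
    if position == "right":
        return (x1, y0, x1 + reach, y1)
    if position == "below":
        return (x0, y1, x1, y1 + reach)
    raise ValueError(f"unknown relative position: {position}")


def intersects(a, b):
    """True when two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def can_link(item_box, data_box, positions, between, separators, reach):
    """Combine item and data only if the data lies in an allowed region and
    every character between them is an allowed separator."""
    in_region = any(
        intersects(surrounding_region(item_box, p, reach), data_box)
        for p in positions
    )
    return in_region and all(ch in separators for ch in between)


item_box = (0, 0, 100, 20)
data_box = (110, 0, 200, 20)
linked = can_link(item_box, data_box, ("right", "below"), [":"], {":"}, reach=150)
```

A stray character between the two strings, or a data string far outside both regions, prevents the combination.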
- the graph creation (integrity evaluation) 182 creates a graph with each combination of an item string and a data string extracted by the item and data combiner 181 as a node.
- integrity of all pairs of combinations is checked. Having integrity between two combinations A and B means that possible item strings and possible data strings forming the combinations do not overlap.
- nodes corresponding to the two combinations are connected with a path, so as to thereby create a graph.
- the combination determination (maximum clique determination) 183 extracts cliques and determines the maximum clique, in the same way as the item string matching process. In the maximum clique extraction, a clique which has the greatest number of nodes is output.
- the process is started when an item string and a data string are extracted.
- a possible item string and a possible data string of a same category are combined according to the relative position of data with respect to an item defined in the logical definitions, and all combinations of a possible item string and a possible data string are detected.
- Step S 32 : with each combination of an item string and a data string extracted at step S 31 as a node, integrity between all pairs of combinations is evaluated (it is checked whether the possible item strings and the possible data strings forming the combinations do not overlap), and if integrity is verified, the corresponding nodes are connected with a path, so as to thereby create a graph.
- Step S 33 : cliques, which are partial complete graphs of a graph, are extracted from the graph created at step S 32 .
- the maximum clique is extracted from the cliques extracted at step S 33 .
- a clique having the greatest number of nodes is output as the maximum clique.
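The combination step can be illustrated with a brute-force search for the largest set of mutually compatible item and data pairs. This sketch interprets “do not overlap” as “do not reuse the same item string or the same data string”, which is an assumption; the exhaustive search is only practical for a handful of candidates, and the example pairs are invented.

```python
from itertools import combinations


def compatible(a, b):
    """Two (item, data) combinations have integrity when they share neither string."""
    return a[0] != b[0] and a[1] != b[1]


def maximum_clique(pairs):
    """Return a largest subset of pairs in which every two pairs are compatible."""
    for size in range(len(pairs), 0, -1):
        for subset in combinations(pairs, size):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                return list(subset)
    return []


pairs = [
    ("DATE", "9/30/2004"),
    ("DATE", "20050925-0101"),
    ("YOUR RECEPTION No.", "20050925-0101"),
]
best = maximum_clique(pairs)
```

Because the first and second pairs compete for the same item string, and the second and third for the same data string, the largest consistent selection keeps the first and third.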
- a combination of a possible item string and a possible data string is determined taking the overall integrity into consideration. Thereby an appropriate combination can be selected even if there are a plurality of possible combinations.
- Computer-readable recording media include magnetic recording devices, optical discs, magneto-optical recording media, semiconductor memories, etc.
- the magnetic recording devices include Hard Disk Drives (HDD), Flexible Disks (FD), magnetic tapes, etc.
- the optical discs include Digital Versatile Discs (DVDs), DVD-Random Access Memory (RAM), Compact Disc Read Only Memory (CD-ROM), CD-Recordable (CD-R)/ReWritable (RW), etc.
- the magneto-optical recording media include Magneto-Optical disks (MOs) etc.
- portable recording media such as DVDs and CD-ROMs, on which the program is recorded may be put on sale.
- the program may be stored in the storage device of the server computer and may be transferred from the server computer to the computer through a network.
- the computer which is to execute the program stores in its storage device the server program recorded on a portable recording medium.
- the computer reads the server program from the storage device and executes a process according to the program.
- the computer may run the program directly from the portable recording medium to execute the process according to the program. Also, while receiving the program being transferred from the server computer, the computer may sequentially run this program.
- the forms processing according to this invention automatically extracts keywords based on form logical definitions defining logical structures that forms have, not based on layout definitions corresponding to the layouts of forms. Therefore, even if forms have different layouts, keywords can be automatically extracted from the forms having the same logical structures. Further, in order to extract keywords, characters which are included in strings defined as keywords are extracted from a character recognition result, combinations of characters each of which satisfies the relationships of a string in the form logical definitions are extracted as possible strings, and possible strings forming keywords are linked. Therefore, even if a string does not completely match a string defined as a keyword, when the string can be judged as having relationships of a string, it is output as a keyword. As a result, stable keyword extraction can be realized, regardless of failure in the layout recognition and the character recognition.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Character Discrimination (AREA)
- Character Input (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2006/300325 WO2007080642A1 (ja) | 2006-01-13 | 2006-01-13 | Form processing program and form processing apparatus |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2006/300325 Continuation WO2007080642A1 (ja) | Form processing program and form processing apparatus | 2006-01-13 | 2006-01-13 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20080273802A1 (en) | 2008-11-06 |
US8131087B2 (en) | 2012-03-06 |
Family
ID=38256057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/216,632 Expired - Fee Related US8131087B2 (en) | 2006-01-13 | 2008-07-08 | Program and apparatus for forms processing |
Country Status (4)
Country | Link |
---|---|
US (1) | US8131087B2 (zh) |
JP (1) | JP4750802B2 (zh) |
CN (1) | CN101356541B (zh) |
WO (1) | WO2007080642A1 (zh) |
-
2006
- 2006-01-13 WO PCT/JP2006/300325 patent/WO2007080642A1/ja active Application Filing
- 2006-01-13 JP JP2007553802A patent/JP4750802B2/ja not_active Expired - Fee Related
- 2006-01-13 CN CN2006800509316A patent/CN101356541B/zh not_active Expired - Fee Related
-
2008
- 2008-07-08 US US12/216,632 patent/US8131087B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JPWO2007080642A1 (ja) | 2009-06-11 |
US20080273802A1 (en) | 2008-11-06 |
CN101356541B (zh) | 2012-05-30 |
JP4750802B2 (ja) | 2011-08-17 |
WO2007080642A1 (ja) | 2007-07-19 |
CN101356541A (zh) | 2009-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8131087B2 (en) | Program and apparatus for forms processing | |
US11816888B2 (en) | Accurate tag relevance prediction for image search | |
US10430649B2 (en) | Text region detection in digital images using image tag filtering | |
US7529748B2 (en) | Information classification paradigm | |
US7599926B2 (en) | Reputation information processing program, method, and apparatus | |
US8938385B2 (en) | Method and apparatus for named entity recognition in chinese character strings utilizing an optimal path in a named entity candidate lattice | |
US7801392B2 (en) | Image search system, image search method, and storage medium | |
JP4443443B2 (ja) | 文書画像レイアウト解析プログラム、文書画像レイアウト解析装置、および文書画像レイアウト解析方法 | |
US8208765B2 (en) | Search and retrieval of documents indexed by optical character recognition | |
US8224090B2 (en) | Apparatus and method for analyzing and determining correlation of information in a document | |
US20070043690A1 (en) | Method and apparatus of supporting creation of classification rules | |
US20070038937A1 (en) | Method, Program, and Device for Analyzing Document Structure | |
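JP6003705B2 (ja) | Information processing apparatus and information processing program | |
CN103902993A (zh) | Document image recognition method and device | |
KR20160149050A (ko) | Apparatus and method for selecting pure-play companies using text mining | |
CN107844531B (zh) | Answer output method, apparatus, and computer device | |
CN118035416A (zh) | Streaming question-answering image-matching method and system | |
KR101118628B1 (ko) | Method for recognizing and processing image data of old documents using an intelligent recognition library and management tool | |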
US11645332B2 (en) | System and method for clustering documents | |
JP2006309347A (ja) | Method, system, and program for extracting keywords from a target document | |
US7756872B2 (en) | Searching device and program product | |
US20170293863A1 (en) | Data analysis system, and control method, program, and recording medium therefor | |
KR20220041336A (ko) | Graph generation system for recommending important keywords and extracting core documents, and graph generation method using the same | |
EP4202714A1 (en) | Text similarity determination method and apparatus and industrial diagnosis method and system | |
JP2017068799A (ja) | Information extraction apparatus, information extraction method, and information extraction program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: FUJITSU LIMITED, JAPAN; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TAKEBE, HIROAKI;FUJIMOTO, KATSUHITO;REEL/FRAME:021272/0035; Effective date: 20080702 |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 8 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20240306 |