WO2022044067A1 - Document image recognition system - Google Patents

Document image recognition system

Info

Publication number
WO2022044067A1
Authority
WO
WIPO (PCT)
Prior art keywords
character recognition
document image
cloud api
api
processed
Prior art date
Application number
PCT/JP2020/031792
Other languages
French (fr)
Japanese (ja)
Inventor
光貴 岩村
守真 横田
剛久 三輪
康次 長谷川
仁己 小田
誠司 奥村
孝之 小平
啓太 齊藤
嵩久 榎本
Original Assignee
三菱電機ビルテクノサービス株式会社
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機ビルテクノサービス株式会社, 三菱電機株式会社
Priority to JP2022534682A (JP7134380B2)
Priority to PCT/JP2020/031792 (WO2022044067A1)
Priority to CN202080103301.0A (CN116569225B)
Publication of WO2022044067A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition

Definitions

  • A document image recognition system that uses a character recognition application program interface provided by a cloud service (hereinafter referred to as a character recognition cloud API) is known (Patent Document 1).
  • In such a system, a character recognition cloud API is selected by evaluating the correct answer rate and processing speed of multiple character recognition cloud APIs on test images prepared in advance, and character recognition processing is then executed on the selected character recognition cloud API.
  • However, the character recognition accuracy rate of a character recognition cloud API may differ depending on the characteristics of the document image. Therefore, when a document image is input whose characteristics differ from those of the test images used in the prior evaluation, a character recognition cloud API other than the one chosen by that evaluation may be optimal. As a result, the character recognition accuracy of the document image recognition system may decrease.
  • an object of the present invention is to provide a document image recognition system with high character recognition accuracy.
  • The document image recognition system of the present invention includes a user terminal that acquires a document image, a center server that is connected to the user terminal by a communication line, and a plurality of character recognition cloud APIs that are connected to the center server by a communication line, perform character recognition processing on an input document image, and output character recognition results.
  • The center server is provided with a selection database that stores sets, each consisting of the features of an input document image and the character recognition cloud API that, among the plurality of character recognition cloud APIs, maximizes the correct answer rate of character recognition when character recognition processing is performed on that input document image.
  • The user terminal transmits the acquired document image to the center server as a processing target document image.
  • The center server extracts the features of the processing target document image received from the user terminal, selects, from among the features of the input document images stored in the selection database, the features most similar to those of the processing target document image, and selects the one character recognition cloud API that is paired with the selected features. The center server then transmits the processing target document image to the selected character recognition cloud API, receives the character recognition result from it, and transmits the received character recognition result to the user terminal.
  • In this way, the character recognition cloud API that is most suitable for the character recognition processing of the processing target document image received from the user terminal is selected and made to perform the character recognition processing. Therefore, the character recognition accuracy of the document image recognition system can be improved.
  • When the user terminal receives the character recognition result from the center server, the user terminal may output to the center server the correct character string, input by the user, that is contained in the processing target document image.
  • The center server may then transmit the processing target document image to each character recognition cloud API and receive a character recognition result from each of them. According to the degree of correctness of the received character recognition results, the center server may update the features of the input document images that are paired with the character recognition cloud APIs in the selection database, add a new set of input document image features and a character recognition cloud API to the selection database, or do both.
  • the character recognition result received from the selected one character recognition cloud API is correct, and the character recognition cloud API other than the selected one character recognition cloud API is correct.
  • In that case, if the similarity value between the features of the processing target document image and the features of the input document image paired with the selected character recognition cloud API is equal to or greater than a predetermined threshold, the features of the input document image paired with the selected character recognition cloud API may be updated based on the features of the processing target document image.
  • When the character recognition result received from the selected character recognition cloud API is correct and at least one of the character recognition results received from the other character recognition cloud APIs is also correct, the center server may proceed as follows. If the similarity value between the features of the processing target document image and the features of the input document image paired with the other character recognition cloud API whose result is correct is equal to or greater than a predetermined threshold, the center server may update the features of the input document image paired with that other character recognition cloud API based on the features of the processing target document image.
  • In the same situation, if that similarity value is less than the predetermined threshold, the center server may add to the selection database a set consisting of the features of the processing target document image and the other character recognition cloud API whose character recognition result is correct.
  • When the character recognition result received from the selected character recognition cloud API is correct and none of the character recognition results received from the other character recognition cloud APIs is correct, the center server may proceed as follows. If the similarity value between the features of the processing target document image and the features of the input document image paired with the selected character recognition cloud API is equal to or greater than a predetermined threshold, the center server may update the features of the input document image paired with the selected character recognition cloud API based on the features of the processing target document image.
  • In the same situation, if that similarity value is less than the predetermined threshold, the center server may add to the selection database a set consisting of the features of the processing target document image and the selected character recognition cloud API.
  • When the character recognition result received from the selected character recognition cloud API is incorrect and at least one of the character recognition results received from the other character recognition cloud APIs is correct, the center server may proceed as follows. If the similarity value between the features of the processing target document image and the features of the input document image paired with the other character recognition cloud API whose result is correct is equal to or greater than a predetermined threshold, the center server may update the features of the input document image paired with that other character recognition cloud API based on the features of the processing target document image.
  • In the same situation, if that similarity value is less than the predetermined threshold, the center server may add to the selection database a set consisting of the features of the processing target document image and the other character recognition cloud API whose character recognition result is correct.
  • When the character recognition result received from the selected character recognition cloud API is incorrect and none of the character recognition results received from the other character recognition cloud APIs is correct, the center server may transmit the processing target document image to a further character recognition cloud API that is not stored in the selection database as part of any set with input document image features. If the character recognition result received from that further character recognition cloud API is correct, the center server may add to the selection database a set consisting of the features of the processing target document image and that further character recognition cloud API.
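  • The update cases above can be summarized as a small decision function. The following is only a sketch under stated assumptions: the function name `decide_update`, the action labels, and the default threshold of 0.8 are hypothetical, and acting on every other API whose result is correct is an interpretation of "at least one". How a stored feature is actually updated is not specified here, so the sketch only returns which action would apply.

```python
from typing import Dict, Iterable, List, Optional, Tuple

def decide_update(
    selected: str,
    correct_apis: Iterable[str],
    sim_by_api: Dict[str, float],
    threshold: float = 0.8,  # hypothetical value; the text only says "predetermined"
) -> List[Tuple[str, Optional[str]]]:
    """Return (action, api) decisions for one processing target document image.

    selected     : API chosen from the selection database
    correct_apis : APIs whose result matched the user's correct string
    sim_by_api   : similarity between the target's features and the
                   stored features paired with each API (assumed given)
    """
    correct = set(correct_apis)
    others_correct = sorted(a for a in correct if a != selected)

    if others_correct:
        # At least one other API was correct: act on those APIs.
        targets = others_correct
    elif selected in correct:
        # Only the selected API was correct: act on the selected API.
        targets = [selected]
    else:
        # Nothing was correct: try an API not yet in the selection database.
        return [("try_unregistered_api", None)]

    decisions = []
    for api in targets:
        # Similar enough: update the stored features; otherwise add a new set.
        action = "update" if sim_by_api.get(api, 0.0) >= threshold else "add"
        decisions.append((action, api))
    return decisions
```

  • For instance, calling `decide_update("A", {"B"}, {"B": 0.85})` corresponds to the case where the selected API was wrong, another API was right, and the stored features paired with that API are already close to the new image.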
  • The features of the document image may include at least one of an image feature amount calculated from the pixel information of the document image, an image attribute indicating the situation when the document image was acquired by the user terminal, and a learning feature value calculated using a learning machine.
  • The image attribute is information acquired by the user terminal when the document image is acquired, and may include at least one of the brightness, illuminance, acquisition location, and acquisition time of the document image.
  • The character recognition cloud API stored in the selection database may be determined as follows: the features of a plurality of setting document images whose contained character strings are known are extracted, the setting document images are grouped so that images with mutually similar features fall into the same setting document image group, and, for each group, the character recognition cloud API that maximizes the correct answer rate when character recognition is performed on the setting document images in that group is chosen.
  • The features of the input document image paired with that character recognition cloud API may be representative features representing the features of the corresponding setting document image group.
  • the present invention can provide a document image recognition system with high character recognition accuracy.
  • the character recognition cloud API will be described as a cloud API 31 or a cloud API 32.
  • the document image recognition system 100 includes a user terminal 10, a center server 20, and a cloud API group 30 including a plurality of cloud APIs 31.
  • the user terminal 10 acquires a document image and transmits it to the center server 20.
  • the center server 20 transmits a document image to the cloud API 31 selected from the cloud API group 30, receives a character recognition result from the cloud API 31, and transmits the character recognition result to the user terminal 10.
  • the user terminal 10 displays the character recognition result received from the center server 20.
  • the user terminal 10 is composed of a smartphone with a camera or a tablet terminal with a camera, and is connected to the center server 20 by a communication line such as the Internet or a telephone line.
  • the user terminal 10 includes three functional blocks of a document image acquisition unit 11, a character string display unit 12, and a correct answer character string input unit 13.
  • the user terminal 10 acquires a document image by imaging or the like by the document image acquisition unit 11, and transmits the acquired document image to the center server 20 as a processing target document image 80 (see FIG. 12). Further, the user terminal 10 receives the character recognition result of the document image 80 to be processed from the center server 20 and displays it on the character string display unit 12.
  • The correct answer character string input unit 13 of the user terminal 10 accepts the user's approval input when the character string displayed on the character string display unit 12 is correct, and accepts the user's input of the correct character string when it is incorrect.
  • the document image acquisition unit 11 of the user terminal 10 is realized by a camera attached to the user terminal 10.
  • the character string display unit 12 is realized by the screen of a smartphone or a tablet terminal.
  • the correct answer character string input unit 13 is realized by an input device such as an icon, a touch key, or a keyboard displayed on the screen of a smartphone or a tablet terminal, a character conversion function, or a voice input function.
  • the center server 20 is connected to the user terminal 10 by a communication line, and is also connected to each cloud API 31 included in the cloud API group 30 by a communication line such as the Internet or a telephone line.
  • the center server 20 includes three functional blocks of a character recognition processing unit 21, a selection database 24, and a selection database update unit 25. Further, the character recognition processing unit 21 includes two functional blocks, a data transmission / reception unit 22 and a cloud API selection unit 23, inside.
  • the data transmission / reception unit 22 receives the processing target document image 80 from the user terminal 10 and transmits the received processing target document image 80 to one cloud API 31 selected by the cloud API selection unit 23. Further, the data transmission / reception unit 22 receives the character recognition result from one selected cloud API 31, and transmits the received character recognition result to the user terminal 10.
  • the cloud API selection unit 23 selects the cloud API 31 most suitable for character recognition based on the characteristics of the document image 80 to be processed while referring to the selection database 24, and outputs the selected result to the data transmission / reception unit 22.
  • The selection database 24 is a database that stores sets, each consisting of the features of an input document image and the cloud API 31 that, among the plurality of cloud APIs 31, achieves the highest correct answer rate when character recognition processing is performed on that input document image. The details of the operation of the cloud API selection unit 23 will be described later.
  • The selection database update unit 25 transmits the processing target document image 80 to each cloud API 31 of the cloud API group 30, receives the character recognition result from each cloud API 31, and updates the contents of the selection database 24 according to the degree of correctness, that is, whether each character recognition result is correct or incorrect.
  • the operation of the selection database update unit 25 will be described in detail later.
  • The general-purpose computer 150 includes a CPU 151, which is a processor that performs information processing; a ROM 152 and a RAM 153 that temporarily store data during information processing; a hard disk drive (HDD) 154 that stores programs, user data, and the like; a mouse 155 and a keyboard 156 provided as input means; and a display 157 provided as a display device.
  • the CPU 151, the ROM 152, the RAM 153, and the HDD 154 are connected by a data bus 160.
  • the mouse 155, the keyboard 156, and the display 157 are connected to the data bus 160 via the input / output controller 158.
  • a network controller 159 provided as a communication means is connected to the data bus 160.
  • the data transmission / reception unit 22, the cloud API selection unit 23, and the selection database update unit 25 of the center server 20 are realized by the cooperative operation of the hardware of the general-purpose computer 150 shown in FIG. 2 and the program running on the CPU 151.
  • the selection database 24 is realized by storing a set of the characteristics of the input document image and the cloud API 31 in the HDD 154 of the general-purpose computer 150 shown in FIG.
  • Instead of the HDD 154, the selection database 24 may be realized by using an external storage means accessed via a network.
  • the plurality of cloud APIs 31 are character recognition function application program interfaces (character recognition cloud APIs) provided by cloud services. Each cloud API 31 performs character recognition processing of a document image input from the outside, and outputs the character recognition result to the outside. Each cloud API 31 is connected to the center server 20 by a communication line such as the Internet or a telephone line.
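  • In the following, the setting document images, image feature data sets, image feature data set groups, setting document image groups, and representative image feature data sets are denoted by the reference numerals 50, 51, 55, 60, and 70, respectively. Where individual items need to be distinguished, a number is added in parentheses after the numeral, such as (1), (2), or (J).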
  • N setting document images 50 used for setting the selection database 24 are prepared.
  • the setting document image 50 is a document image in which the contained character string contained in the image is known.
  • N setting document images 50 are input to the center server 20.
  • the processor of the center server 20 extracts the characteristics of the image of each setting document image 50.
  • the image features are extracted as an image feature data set 51 composed of a plurality of parameters indicating the image features and data of each parameter.
  • The parameters of the image feature data set 51 consist of a plurality of image feature amounts calculated from the pixel information of the document image, a plurality of image attributes indicating the situation when the document image was acquired by the user terminal 10, and learning feature values calculated using a learning machine.
  • the image feature data set 51 does not have to include all of the image feature amount, the image attribute, and the learning feature value, and may include at least one of them.
  • The external margin ratio is an index showing what percentage of the area of the document image is occupied by the outer margin.
  • The internal margin ratio is an index showing what percentage of the document image, excluding the outer peripheral margin, is occupied by white portions.
  • the chromaticity distribution rate is an index showing the distribution of colorful parts. Similar to the chromaticity distribution rate, the saturation distribution rate is an index showing the distribution status of colorful parts.
  • the chromatic aberration distribution rate is an index indicating the distribution of image deviation, bleeding, and blurring.
  • the formatting rate is an index that quantifies the regular arrangement of characters.
  • the image attributes are, for example, the brightness, illuminance, acquisition location, and acquisition time of the document image when the document image is captured by the camera of the user terminal 10.
  • the learning feature value is, for example, a feature value extracted using a convolutional neural network (CNN).
  • each image feature data set group 55 includes a plurality of image feature data sets 51.
  • For example, the image feature data set group 55 (1) includes the image feature data sets 51 (1), 51 (4), ... 51 (N-1), and the image feature data set group 55 (K) includes the image feature data sets 51 (2), 51 (3), ... 51 (N).
  • the similarity value is a numerical value indicating mutual similarity, and is 1.0 when they match and 0 when they do not match at all.
  • the predetermined threshold value can be freely determined, but may be, for example, about 0.7 to 0.9. Further, the classification may be performed with a higher threshold value, and if the classification cannot be performed well, the threshold value may be sequentially lowered to perform the classification.
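  • The grouping by similarity value can be illustrated with a minimal sketch. The text does not specify how the similarity value is computed or how groups are formed, so both are assumptions here: `similarity` is a hypothetical measure over parameters normalized to [0, 1] (1.0 for identical sets, 0 for maximally different ones), and `group_by_similarity` is a simple greedy pass that compares each data set with the first member of each existing group.

```python
from typing import List, Sequence

def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical similarity value in [0, 1].

    Assumes every parameter is already normalized to [0, 1], so the
    mean absolute difference is 0 for a match and 1 for no match at all.
    """
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def group_by_similarity(
    feature_sets: Sequence[Sequence[float]],
    threshold: float = 0.8,  # within the 0.7 to 0.9 range mentioned in the text
) -> List[List[int]]:
    """Greedily assign each feature data set to the first group whose
    first member is at least `threshold` similar; otherwise open a new group.
    Returns groups as lists of indices into `feature_sets`."""
    groups: List[List[int]] = []
    for i, fs in enumerate(feature_sets):
        for group in groups:
            if similarity(fs, feature_sets[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

# Four toy feature data sets with two normalized parameters each.
example = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8], [0.85, 0.9]]
groups = group_by_similarity(example, threshold=0.8)
```

  • Lowering the threshold step by step when the grouping is unsatisfactory, as the text suggests, would amount to calling `group_by_similarity` again with a smaller `threshold`.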
  • In step S104, the processor of the center server 20 groups the setting document images 50 corresponding to the image feature data sets 51 included in each image feature data set group 55, thereby generating K setting document image groups 60.
  • For example, the setting document images 50 (1), 50 (4), ... 50 (N-1), which correspond to the image feature data sets 51 (1), 51 (4), ... 51 (N-1) included in the image feature data set group 55 (1), are grouped to generate the setting document image group 60 (1).
  • Likewise, the setting document images 50 (2), 50 (3), ... 50 (N), which correspond to the image feature data sets 51 (2), 51 (3), ... 51 (N) included in the image feature data set group 55 (K), are grouped to generate the setting document image group 60 (K).
  • In step S105 of FIG. 4, the processor of the center server 20 sets the counter J to the initial value of 1. The process then proceeds to step S106 of FIG. 4, in which, as shown in FIG. 7, each setting document image included in the setting document image group 60 (J) is transmitted to the M cloud APIs 31. Then, as shown in step S107 of FIG. 4, the center server 20 receives the character recognition results from each of the M cloud APIs 31 (A) to 31 (M).
  • In step S108 of FIG. 4, the processor of the center server 20 compares the character recognition results for the setting document images 50 included in the setting document image group 60 (J), as received from one cloud API 31 (A), with the known contained character string of each setting document image 50. A result that completely matches the known contained character string is regarded as a correct answer, and any other result is regarded as an incorrect answer. The processor then counts the number of setting document images 50 that were recognized correctly.
  • In step S109 of FIG. 4, the processor of the center server 20 divides the number of correct answers by the total number of setting document images 50 included in the setting document image group 60 (J), thereby calculating the correct answer rate obtained when the cloud API 31 (A) recognizes the setting document images 50 of the setting document image group 60 (J).
  • Similarly, the processor of the center server 20 compares the character recognition results for the setting document images 50 included in the setting document image group 60 (J), as received from the other cloud APIs 31 (B) to 31 (M), with the known contained character strings, and calculates the correct answer rate obtained when each of the cloud APIs 31 (B) to 31 (M) recognizes the setting document images 50 of the setting document image group 60 (J).
  • In step S110, the processor of the center server 20 extracts the cloud API 31 (A), which has the highest correct answer rate calculated in step S109.
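  • Steps S108 to S110 can be sketched as follows. This is a minimal illustration rather than the actual implementation: the function names and the dictionary layout keyed by API name are hypothetical, and the recognition results are assumed to have been collected already. Only exact string matches count as correct, as described above.

```python
from typing import Dict, List, Tuple

def correct_answer_rate(results: List[str], known_strings: List[str]) -> float:
    """Fraction of setting document images whose recognition result
    completely matches the known contained character string."""
    correct = sum(1 for r, k in zip(results, known_strings) if r == k)
    return correct / len(known_strings)

def best_cloud_api(
    results_by_api: Dict[str, List[str]],
    known_strings: List[str],
) -> Tuple[str, Dict[str, float]]:
    """Compute the correct answer rate of every API for one setting
    document image group and return the API with the highest rate."""
    rates = {
        api: correct_answer_rate(results, known_strings)
        for api, results in results_by_api.items()
    }
    best = max(rates, key=rates.get)
    return best, rates

# Toy data: three setting document images, two hypothetical APIs.
known = ["ABC", "123", "XYZ"]
results = {
    "API_A": ["ABC", "123", "XYZ"],
    "API_B": ["ABC", "1Z3", "XYZ"],
}
best, rates = best_cloud_api(results, known)
```

  • If two APIs tie, `max` returns the first one encountered; the text does not say how ties should be broken.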
  • In step S111 of FIG. 4, the processor of the center server 20 generates the representative image feature data set 70 (J), in which the data of each parameter is the representative value of that parameter across one image feature data set group 55 (J).
  • the image feature data set group 55 (1) includes image feature data sets 51 (1), 51 (4), ... 51 (N-1).
  • The image feature data set 51 (4) also stores data for each parameter, such as an image feature amount (1), an image feature amount (2), an image attribute (1), an image attribute (2), and a learning feature value.
  • For the representative image feature data set 70 (J), the processor of the center server 20 stores, as the data of each parameter, the representative value of that parameter's data.
  • As the representative value, for example, an average value or a median value may be used.
  • For example, the representative value of the image feature amount (1) is the average of the image feature amount (1) values of the image feature data sets 51 (1) to 51 (N-1) included in the group.
  • a term of a superordinate concept including each image attribute (1) of each image feature data set 51 may be used as a representative value.
  • the average value or the median value of latitude and longitude may be used as a representative value.
  • the representative image feature data set 70 (J) is a representative feature representing the features of the image of the setting document image group 60 (J) including the plurality of setting document images 50.
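  • The generation of a representative image feature data set from a group can be sketched as below, assuming all parameters are numeric. The function name `representative_feature_set` and the parameter names in the example are hypothetical; non-numeric attributes (for which the text suggests a superordinate term) are not handled in this sketch.

```python
from statistics import mean, median
from typing import Dict, List

def representative_feature_set(
    group: List[Dict[str, float]],
    use_median: bool = False,
) -> Dict[str, float]:
    """Build one representative data set for a group by taking the
    average (or median) of each parameter over all members."""
    aggregate = median if use_median else mean
    parameters = group[0].keys()
    return {p: aggregate([fs[p] for fs in group]) for p in parameters}

# Toy group of three image feature data sets with two parameters each.
group_j = [
    {"external_margin_ratio": 0.25, "formatting_rate": 0.5},
    {"external_margin_ratio": 0.75, "formatting_rate": 1.0},
    {"external_margin_ratio": 0.5, "formatting_rate": 0.75},
]
rep_mean = representative_feature_set(group_j)
rep_median = representative_feature_set(group_j, use_median=True)
```

  • The median is less sensitive to a single outlying image in the group, which may matter when groups are small.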
  • The similarity value between the generated representative image feature data set 70 (J) and each image feature data set 51 included in the image feature data set group 55 (J) is about 0.7 to 0.9, which is comparable to the threshold value. Therefore, the cloud API 31 (A), which has the highest correct answer rate when the setting document images 50 included in the setting document image group 60 (J) are recognized, is also the cloud API 31 with the highest correct answer rate when character recognition is performed on a document image whose image feature data set 51 is similar to the representative image feature data set 70.
  • In step S112 of FIG. 4, the processor of the center server 20 pairs the representative image feature data set 70 (J) generated in step S111 with the cloud API 31 (A) extracted in step S110 as having the highest correct answer rate, and stores the pair in the selection database 24.
  • In step S113 of FIG. 4, the processor of the center server 20 increments the counter J by 1, and in step S114 of FIG. 4 determines whether the counter J exceeds K, the number of image feature data set groups 55 (equivalently, the number of setting document image groups 60). If the determination in step S114 is NO, the process returns to step S106.
  • The processor of the center server 20 thus repeatedly executes steps S106 to S112 of FIG. 4 and, as shown in FIG. 10, generates K sets, each consisting of a representative image feature data set 70 and the cloud API 31 with the highest correct answer rate for document images similar to that representative image feature data set 70, and stores them in the selection database 24. Note that one cloud API 31 may be paired with a plurality of representative image feature data sets 70.
  • the setting operation of the selection database 24 described above is an example, and the selection database 24 may be set by another operation.
  • As shown in step S201, the data transmission / reception unit 22 of the center server 20 receives the processing target document image 80 from the user terminal 10.
  • the data transmission / reception unit 22 outputs the received document image 80 to be processed to the cloud API selection unit 23.
  • The cloud API selection unit 23 extracts the features of the processing target document image 80 in the same manner as described earlier for the selection database setting operation, and generates the image feature data set 81 of the processing target document image 80.
  • As shown in step S203 and FIG. 13, the cloud API selection unit 23 calculates the similarity value between the image feature data set 81 and each of the representative image feature data sets 70 stored in the selection database 24, and selects the representative image feature data set 70 (1) having the largest similarity value.
  • The maximum similarity value differs depending on the image feature data set 81 of the processing target document image 80. When the image feature data set 81 is close to the features of the setting document images 50 used when setting the selection database 24, the maximum similarity value is high, for example about 0.7 or 0.8.
  • Conversely, when the image feature data set 81 differs from the features of the setting document images 50 used when setting the selection database 24, the maximum similarity value is low, about 0.2 to 0.3.
  • The cloud API selection unit 23 selects the cloud API 31 (A) that is paired with the representative image feature data set 70 (1) selected in step S203, and outputs the selection to the data transmission / reception unit 22.
  • the data transmission / reception unit 22 transmits the processing target document image 80 to the selected cloud API 31 (A) input from the cloud API selection unit 23. Then, in step S206, the data transmission / reception unit 22 receives the character recognition result from the cloud API 31 (A).
  • the data transmission / reception unit 22 transmits the character recognition result received from the cloud API 31 (A) to the user terminal 10.
  • the user terminal 10 displays the character string of the character recognition result transmitted from the data transmission / reception unit 22 of the center server 20 on the character string display unit 12.
  • the document image recognition system 100 of the embodiment selects the cloud API 31 most suitable for the character recognition processing of the processing target document image 80 received from the user terminal 10, and causes the cloud API 31 to perform the character recognition processing. Therefore, character recognition processing can be performed with high accuracy.
  • as described above, the cloud API selection unit 23 calculates the similarity value between the image feature data set 81 of the document image 80 to be processed and each of the plurality of representative image feature data sets 70 stored in the selection database 24, and selects the representative image feature data set 70 with the largest similarity value. If the image feature data set 81 is close to the features of the setting document images 50 used when setting the selection database 24, the maximum similarity value is high, for example 0.7 or 0.8. On the other hand, when the image feature data set 81 differs from the features of the setting document images 50 used when setting the selection database 24, the value is low, about 0.2 to 0.3.
  • when the similarity value is low, the character recognition result may not be correct even if the character recognition process is performed using the cloud API 31 paired with the selected representative image feature data set 70. Therefore, it is necessary to update the selection database 24 so that the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data sets 70 stored in the selection database 24 becomes as high as possible.
  • the update operation is started when the user terminal 10 receives the character recognition result from the center server 20 and displays its character string on the character string display unit 12, and the user who sees this inputs the correct answer character string contained in the processing target document image 80 into the correct answer character string input unit 13. When the correct answer character string is input, the user terminal 10 transmits it to the center server 20. The center server 20 transmits the document image 80 to be processed to each cloud API 31, and updates the selection database 24 according to whether each received character recognition result is correct or incorrect. This is described in detail below.
  • in the following description, a correct answer means that the entire character string of the received character recognition result is correct; if the received character string contains even one incorrect character, the result is treated as an incorrect answer. Further, in the following description, it is assumed that the cloud API 31 (A) was selected in the character recognition operation.
  • the user confirms the character string of the character recognition result displayed on the character string display unit 12 of the user terminal 10.
  • the approval icon and the character input area are displayed on the screen of the user terminal 10.
  • the approval icon and the character input area constitute the correct answer character string input unit 13.
  • the user presses the approval icon displayed on the screen of the user terminal 10. Then, the user terminal 10 transmits the character recognition result transmitted from the center server 20 in step S207 of FIG. 11 as a correct character string to the selection database update unit 25 of the center server 20.
  • the user inputs the correct character string of the document image 80 to be processed into the character input area displayed on the screen of the user terminal 10.
  • the user terminal 10 transmits the input correct answer character string to the selection database update unit 25 of the center server 20.
  • the user may input the approval or the correct character string by voice. At this time, the voice input function constitutes the correct answer character string input unit 13.
  • the selection database update unit 25 of the center server 20 waits until the correct answer character string of the document image 80 to be processed is input from the user terminal 10. When the correct answer character string is input, the process proceeds to step S302 of FIG. 14 and, as shown in FIG. 19, the document image 80 to be processed is transmitted to all M cloud APIs 31 (A) to 31 (M).
  • the selection database update unit 25 receives the character recognition results from the M cloud APIs 31 (A) to 31 (M).
  • the selection database update unit 25 compares the correct answer character string with the character recognition result received from the cloud API 31 (A) that was selected by the cloud API selection unit 23 in the previous character recognition operation. If the character recognition result of the selected cloud API 31 (A) is correct, the process proceeds to step S305 of FIG. 14.
  • in step S305 of FIG. 14, the selection database update unit 25 compares the correct answer character string with the character recognition results received from the cloud APIs 31 (B) to 31 (M) other than the previously selected cloud API 31 (A). If at least one of the character recognition results received from the other cloud APIs 31 (B) to 31 (M) is correct, the process proceeds to step S306 of FIG. 15.
  • in step S306 of FIG. 15, the selection database update unit 25 determines whether or not the similarity value between the image feature data set 81 of the document image 80 to be processed shown in FIG. 12 and the representative image feature data set 70 (1) shown in FIG. 13, which is paired with the previously selected cloud API 31 (A), is equal to or greater than a predetermined threshold value.
  • a predetermined threshold value can be freely selected, but may be set to, for example, about 0.8 or 0.7.
  • if the selection database update unit 25 determines YES in step S306 of FIG. 15, it proceeds to step S307 of FIG. 15 and updates the representative image feature data set 70 (1) paired with the previously selected cloud API 31 (A), based on the image feature data set 81 of the document image 80 to be processed. For example, the update may be performed by weighting the difference between each data item of each parameter of the representative image feature data set 70 (1) and the corresponding data item of the image feature data set 81 of the document image 80 to be processed, and increasing or decreasing each data item of the representative image feature data set 70 (1) accordingly. Alternatively, each data item of each parameter of the representative image feature data set 70 (1) may be replaced with the corresponding data item of the image feature data set 81 of the document image 80 to be processed.
  • if the selection database update unit 25 determines NO in step S306 of FIG. 15, it proceeds to step S308 of FIG. 15 and adds the pair of the image feature data set 81 of the document image 80 to be processed and the previously selected cloud API 31 (A) to the selection database 24. However, if that pair already exists in the selection database 24, it is not added.
  • the selection database update unit 25 then proceeds to step S309 of FIG. 15 and determines whether the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data set 70 paired with the cloud API 31, among the other cloud APIs 31, whose character recognition result was correct in step S305 of FIG. 14 is equal to or greater than a predetermined threshold value.
  • if the selection database update unit 25 determines YES in step S309 of FIG. 15, it proceeds to step S310 of FIG. 15 and updates, based on the image feature data set 81 of the document image 80 to be processed, the representative image feature data set 70 paired with the cloud API 31, among the other cloud APIs 31, whose character recognition result was correct. As before, the update may be performed by weighting the difference between each data item of each parameter of the representative image feature data set 70 and the corresponding data item of the image feature data set 81 of the document image 80 to be processed, and increasing or decreasing each data item of the representative image feature data set 70 accordingly. Alternatively, each data item of each parameter of the representative image feature data set 70 may be replaced with the corresponding data item of the image feature data set 81 of the document image 80 to be processed.
  • if the selection database update unit 25 determines NO in step S309 of FIG. 15, it proceeds to step S311 of FIG. 15 and adds to the selection database 24 the pair of the image feature data set 81 of the document image 80 to be processed and the cloud API 31, among the other cloud APIs 31, whose character recognition result was correct. If that pair already exists in the selection database 24, it is not added.
  • the processes of steps S309 to S311 of FIG. 15 are performed for each of the other cloud APIs 31 whose character recognition result was correct.
  • the selection database update unit 25 ends the update operation when the process of step S310 or S311 in FIG. 15 is completed.
  • if the selection database update unit 25 determines NO in step S305 of FIG. 14, steps S401 to S403 of FIG. 16 are executed. Since the operation of steps S401 to S403 in FIG. 16 is the same as the operation of steps S306 to S308 shown in FIG. 15, the description thereof will be omitted.
  • if the selection database update unit 25 determines NO in step S304 of FIG. 14, it proceeds to step S501 of FIG. 17 and determines whether any of the character recognition results of the other cloud APIs 31 (B) to 31 (M) is correct. If the selection database update unit 25 determines YES in step S501 of FIG. 17, the operation of steps S502 to S504 of FIG. 17 is executed. Since the operation of steps S502 to S504 in FIG. 17 is the same as the operation of steps S309 to S311 shown in FIG. 15, the description thereof will be omitted.
  • if the selection database update unit 25 determines NO in step S501 of FIG. 17, the process proceeds to step S505 of FIG. 18 and, as shown in FIG. 19, the processing target document image 80 is transmitted to another cloud API 32 other than the cloud APIs 31 stored in the selection database 24 in pairs with the representative image feature data sets 70.
  • in step S506 of FIG. 18, the selection database update unit 25 receives the character recognition result from the other cloud API 32, and in step S507 it checks whether the received character recognition result is correct. If YES is determined in step S507 of FIG. 18, the selection database update unit 25 proceeds to step S508 and adds the pair of the image feature data set 81 of the document image 80 to be processed and the other cloud API 32 to the selection database 24.
  • in this way, the representative image feature data set 70 paired with the cloud API 31 whose character recognition result is correct is brought closer to the image feature data set 81 of the document image 80 to be processed, so the selection database 24 can be updated such that the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data sets 70 stored in the selection database 24 gradually increases. Further, when none of the character recognition results is correct, another cloud API 32 whose character recognition result is correct is stored in the selection database 24 as a pair with the image feature data set 81 of the document image 80 to be processed, which expands the range of document images whose characters can be recognized accurately.
  • in the above description, a correct answer means that the entire character string of the received character recognition result is correct, and a result containing even one incorrect character is treated as an incorrect answer. However, the update operation may instead be executed by regarding a result as correct when the ratio of the number of correct characters to the total number of characters in the received character recognition result is equal to or greater than a predetermined threshold, for example 90%, and as incorrect when the ratio is below that threshold.
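The ratio-based judgment of correctness described above can be illustrated with a minimal sketch. The function names, the position-by-position comparison, and the choice of denominator are assumptions made for illustration; the text only states that the ratio of correct characters is compared with a threshold such as 90%.

```python
def correct_ratio(recognized: str, correct: str) -> float:
    """Ratio of correctly recognized characters (hypothetical helper).

    Assumption: characters are compared position by position, and the
    denominator is the longer of the two strings so that missing or extra
    characters also count as errors. The text itself divides by the number
    of characters in the received result.
    """
    longest = max(len(recognized), len(correct))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    matches = sum(1 for r, c in zip(recognized, correct) if r == c)
    return matches / longest


def is_correct(recognized: str, correct: str,
               threshold: float = 0.9, strict: bool = False) -> bool:
    """Judge a recognition result as correct or incorrect.

    strict=True reproduces the all-characters-must-match rule;
    strict=False applies the ratio threshold (0.9 is the example value).
    """
    if strict:
        return recognized == correct
    return correct_ratio(recognized, correct) >= threshold
```

With the strict rule, a single wrong character makes the whole result incorrect; with the ratio rule, the same result can still be accepted and used in the update operation.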

Abstract

Provided is a document image recognition system (100) including a user terminal (10), a center server (20), and cloud APIs (31), wherein the center server (20) comprises a selection database (24) storing character recognition cloud APIs (31) with which the highest accuracy rate of character recognition is achieved when a character recognition process is performed on an input document image, the user terminal (10) transmits an acquired document image to the center server (20) as a document image to be processed, and the center server (20) extracts features from the document image to be processed, selects a character recognition cloud API (31) on the basis of the extracted features, and transmits the document image to be processed to the selected character recognition cloud API (31).

Description

Document image recognition system
The present invention relates to a document image recognition system that uses a character recognition cloud API.
Document image recognition systems that use a character recognition function application program interface provided by a cloud service (hereinafter referred to as a character recognition cloud API) are known. Such systems often evaluate the correct answer rate and processing speed of a plurality of character recognition cloud APIs using test images prepared in advance, select a character recognition cloud API based on that evaluation, and have the selected character recognition cloud API execute the character recognition processing (see, for example, Patent Document 1).
Japanese Unexamined Patent Publication No. 2008-293354
However, the correct answer rate of a character recognition cloud API may differ depending on the features of the document image. Consequently, when a document image whose features differ from those of the test images used in the prior evaluation is input, a character recognition cloud API different from the one chosen in the prior evaluation may be optimal. As a result, the character recognition accuracy of the document image recognition system could decrease.
Therefore, an object of the present invention is to provide a document image recognition system with high character recognition accuracy.
The document image recognition system of the present invention includes a user terminal that acquires a document image, a center server connected to the user terminal by a communication line, and a plurality of character recognition cloud APIs that are connected to the center server by a communication line, perform character recognition processing on an input document image, and output a character recognition result. The center server includes a selection database that stores pairs, each consisting of a feature of an input document image and the character recognition cloud API whose correct answer rate is the highest among the plurality of character recognition cloud APIs when character recognition processing is performed on that input document image. The user terminal transmits the acquired document image to the center server as a processing target document image. The center server extracts the features of the processing target document image from the processing target document image received from the user terminal, selects, from among the input document image features stored in the selection database, the input document image feature most similar to the features of the processing target document image, selects the one character recognition cloud API paired with the selected input document image feature, transmits the processing target document image to the selected character recognition cloud API, receives the character recognition result from that character recognition cloud API, and transmits the received character recognition result to the user terminal.
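The selection step can be sketched as a nearest-match lookup over the selection database. This is only an illustration under stated assumptions: the text does not name a similarity measure, so cosine similarity is used here, and `Entry`, `api_name`, and `select_api` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    """One selection-database row: a stored feature vector and its paired API."""
    features: List[float]
    api_name: str  # hypothetical identifier of a character recognition cloud API


def cosine_similarity(a: List[float], b: List[float]) -> float:
    # Assumed similarity measure; the text only requires "a similarity value".
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_api(target: List[float], database: List[Entry]) -> Tuple[str, float]:
    """Return the API paired with the most similar stored feature, and that similarity."""
    if not database:
        raise ValueError("selection database is empty")
    best = max(database, key=lambda e: cosine_similarity(target, e.features))
    return best.api_name, cosine_similarity(target, best.features)
```

The returned similarity value is kept alongside the API name because the later update operation compares it with a threshold.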
In this way, the character recognition cloud API best suited to the character recognition processing of the processing target document image received from the user terminal is selected and made to perform the character recognition processing, so the character recognition accuracy of the document image recognition system can be improved.
In the document image recognition system of the present invention, the user terminal may, upon receiving the character recognition result from the center server, output to the center server a correct answer character string that is contained in the processing target document image and input by the user. When the correct answer character string is input from the user terminal, the center server may transmit the processing target document image to each character recognition cloud API, receive a character recognition result from each character recognition cloud API, and, according to the degree of correctness of the received character recognition results, perform either or both of the following: updating the input document image features paired with the character recognition cloud APIs in the selection database, and adding a pair of an input document image feature and a character recognition cloud API to the selection database.
This makes it possible to optimize the selection database and improve the character recognition accuracy of the document image recognition system.
In the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is also correct, and the similarity value between the features of the processing target document image and the input document image feature paired with the selected character recognition cloud API is equal to or greater than a predetermined threshold, the center server may update the input document image feature paired with the selected character recognition cloud API on the basis of the features of the processing target document image.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is also correct, and the similarity value between the features of the processing target document image and the input document image feature paired with the selected character recognition cloud API is less than the predetermined threshold, the center server may add the pair of the features of the processing target document image and the selected character recognition cloud API to the selection database.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is also correct, and the similarity value between the features of the processing target document image and the input document image feature paired with a character recognition cloud API, among the other character recognition cloud APIs, whose character recognition result was correct is equal to or greater than the predetermined threshold, the center server may update that input document image feature on the basis of the features of the processing target document image.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is also correct, and the similarity value between the features of the processing target document image and the input document image feature paired with a character recognition cloud API, among the other character recognition cloud APIs, whose character recognition result was correct is less than the predetermined threshold, the center server may add the pair of the features of the processing target document image and that character recognition cloud API to the selection database.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, none of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the processing target document image and the input document image feature paired with the selected character recognition cloud API is equal to or greater than the predetermined threshold, the center server may update the input document image feature paired with the selected character recognition cloud API on the basis of the features of the processing target document image.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is correct, none of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the processing target document image and the input document image feature paired with the selected character recognition cloud API is less than the predetermined threshold, the center server may add the pair of the features of the processing target document image and the selected character recognition cloud API to the selection database.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is incorrect, at least one of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the processing target document image and the input document image feature paired with a character recognition cloud API, among the other character recognition cloud APIs, whose character recognition result was correct is equal to or greater than the predetermined threshold, the center server may update that input document image feature on the basis of the features of the processing target document image.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is incorrect, at least one of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the processing target document image and the input document image feature paired with a character recognition cloud API, among the other character recognition cloud APIs, whose character recognition result was correct is less than the predetermined threshold, the center server may add the pair of the features of the processing target document image and that character recognition cloud API to the selection database.
Further, in the document image recognition system of the present invention, when the character recognition result received from the selected character recognition cloud API is incorrect and none of the character recognition results received from the other character recognition cloud APIs is correct, the center server may transmit the processing target document image to another character recognition cloud API other than the character recognition cloud APIs stored in the selection database in pairs with input document image features, and, when the character recognition result received from that other character recognition cloud API is correct, add the pair of the features of the processing target document image and that other character recognition cloud API to the selection database.
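The cases above share one pattern: for every character recognition cloud API whose result was correct, the stored feature is updated when it is similar enough to the processing target feature, and otherwise a new pair is added. The sketch below illustrates that reading. It is not the patented implementation: cosine similarity, the blending weight, the rule of picking the most similar stored entry when an API has several, and all names (`Entry`, `update_or_add`, `update_database`) are assumptions. The weighted-difference blend follows the update example given in the embodiment.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    features: List[float]  # stored (representative) feature vector
    api_name: str          # hypothetical cloud API identifier


def similarity(a: List[float], b: List[float]) -> float:
    # Assumed measure (cosine); the text only speaks of a "similarity value".
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def update_or_add(db: List[Entry], target: List[float], api_name: str,
                  threshold: float = 0.7, weight: float = 0.5) -> str:
    """Apply the threshold rule for one API whose recognition result was correct."""
    candidates = [e for e in db if e.api_name == api_name]
    if candidates:
        # Assumption: compare against the most similar entry paired with this API.
        best = max(candidates, key=lambda e: similarity(target, e.features))
        if similarity(target, best.features) >= threshold:
            # Move the stored feature toward the target by a weighted difference.
            best.features = [r + weight * (t - r)
                             for r, t in zip(best.features, target)]
            return "updated"
    # Below the threshold (or API not yet stored): add the pair unless it exists.
    if any(e.api_name == api_name and e.features == list(target) for e in db):
        return "unchanged"
    db.append(Entry(list(target), api_name))
    return "added"


def update_database(db: List[Entry], target: List[float],
                    correct_apis: List[str], threshold: float = 0.7,
                    weight: float = 0.5) -> Dict[str, str]:
    """Run the rule for every API (selected or not) that answered correctly."""
    return {api: update_or_add(db, target, api, threshold, weight)
            for api in correct_apis}
```

When `correct_apis` is empty, the caller would try a character recognition cloud API that is not yet in the database and, if its result is correct, call `update_or_add` with that API's name, which adds the new pair.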
 また、本発明の文書画像認識システムにおいて、文書画像の特徴は、文書画像の画素情報から算出される画像特徴量と、前記ユーザ端末で文書画像を取得した際の状況を示す画像属性と、学習機を用いて算出される学習特徴値と、の少なくとも1つを含んでもよい。 Further, in the document image recognition system of the present invention, the features of the document image are the image feature amount calculated from the pixel information of the document image, the image attribute indicating the situation when the document image is acquired by the user terminal, and learning. It may include at least one of the learning feature values calculated using the machine.
 また、本発明の文書画像認識システムにおいて、前記画像属性は、前記ユーザ端末で文書画像を取得する際に前記ユーザ端末で取得した情報で、文書画像の輝度、照度、取得場所、取得時間の少なくとも1つを含んでもよい。 Further, in the document image recognition system of the present invention, the image attribute is information acquired by the user terminal when the document image is acquired by the user terminal, and is at least the brightness, illuminance, acquisition location, and acquisition time of the document image. One may be included.
Further, in the document image recognition system of the present invention, the character recognition cloud API stored in the selection database may be determined as follows: features are extracted from a plurality of setting document images whose contained character strings are known, setting document images with mutually similar features are grouped, and the character recognition cloud API that yields the highest correct answer rate when recognizing the setting document images in each group is stored. The feature of the input document image paired with the character recognition cloud API may be a representative feature that represents the features of the corresponding group of setting document images.
The present invention can provide a document image recognition system with high character recognition accuracy.
FIG. 1 is a system diagram showing the configuration of the document image recognition system of the embodiment.
FIG. 2 is a system diagram showing the configuration of a general-purpose computer.
FIG. 3 is a flowchart showing the first half of the selection database setting operation of the document image recognition system of the embodiment.
FIG. 4 is a flowchart showing the second half of the selection database setting operation of the document image recognition system of the embodiment.
FIG. 5 is an explanatory diagram showing the extraction of the features of the setting document images in the selection database setting operation.
FIG. 6 is an explanatory diagram showing the classification of the image feature data sets and the grouping of the setting document images in the selection database setting operation.
FIG. 7 is an explanatory diagram showing the calculation of the correct answer rates of the character recognition cloud APIs and the extraction of the character recognition cloud API with the highest correct answer rate in the selection database setting operation.
FIG. 8 is an explanatory diagram showing the generation of a representative image feature data set in the selection database setting operation.
FIG. 9 is an explanatory diagram showing the pairs of representative image feature data sets and character recognition cloud APIs, and the correspondence between the representative image feature data sets and the setting document image groups.
FIG. 10 is an explanatory diagram showing the structure of the selection database.
FIG. 11 is a flowchart showing the character recognition operation of the document image recognition system of the embodiment.
FIG. 12 is an explanatory diagram showing the extraction of the features of the processing target document image in the character recognition operation.
FIG. 13 is an explanatory diagram showing the selection of the character recognition cloud API in the character recognition operation.
FIG. 14 is a flowchart showing the selection database update operation performed when the correct character string of the processing target document image is input from the user terminal.
FIG. 15 is a flowchart showing the processing for connector 2 shown in FIG. 14.
FIG. 16 is a flowchart showing the processing for connector 3 shown in FIG. 14.
FIG. 17 is a flowchart showing the processing for connector 4 shown in FIG. 14.
FIG. 18 is a flowchart showing the processing for connector 5 shown in FIG. 17.
FIG. 19 is an explanatory diagram showing the selection database update operation performed when the correct character string of the processing target document image is input from the user terminal.
Hereinafter, the document image recognition system 100 of the embodiment will be described with reference to the drawings. In the following description, a character recognition cloud API is referred to as a cloud API 31 or a cloud API 32. As shown in FIG. 1, the document image recognition system 100 includes a user terminal 10, a center server 20, and a cloud API group 30 containing a plurality of cloud APIs 31. The user terminal 10 acquires a document image and transmits it to the center server 20. The center server 20 transmits the document image to a cloud API 31 selected from the cloud API group 30, receives the character recognition result from that cloud API 31, and transmits the result to the user terminal 10. The user terminal 10 displays the character recognition result received from the center server 20. In the following description, the reference numeral 31 is used when the cloud APIs 31 are not distinguished from one another. When they are distinguished, a letter in parentheses is appended to the numeral, as in cloud API 31(A) to cloud API 31(M).
The user terminal 10 is a smartphone or tablet terminal equipped with a camera, and is connected to the center server 20 by a communication line such as the Internet or a telephone line. The user terminal 10 includes three functional blocks: a document image acquisition unit 11, a character string display unit 12, and a correct character string input unit 13. The user terminal 10 acquires a document image with the document image acquisition unit 11, for example by capturing it, and transmits the acquired document image to the center server 20 as a processing target document image 80 (see FIG. 12). The user terminal 10 also receives the character recognition result for the processing target document image 80 from the center server 20 and displays it on the character string display unit 12. When the character string displayed on the character string display unit 12 is correct, the correct character string input unit 13 accepts the user's approval input. When it is incorrect, the correct character string input unit 13 accepts the correct character string entered by the user.
The document image acquisition unit 11 of the user terminal 10 is realized by the camera attached to the user terminal 10. The character string display unit 12 is realized by the screen of the smartphone or tablet terminal. The correct character string input unit 13 is realized by an input device such as icons, touch keys, or a keyboard displayed on the screen of the smartphone or tablet terminal together with a character conversion function, or by a voice input function.
The center server 20 is connected to the user terminal 10 by a communication line, and is connected to each cloud API 31 in the cloud API group 30 by a communication line such as the Internet or a telephone line. The center server 20 includes three functional blocks: a character recognition processing unit 21, a selection database 24, and a selection database update unit 25. The character recognition processing unit 21 in turn contains two functional blocks: a data transmission/reception unit 22 and a cloud API selection unit 23.
The data transmission/reception unit 22 receives the processing target document image 80 from the user terminal 10 and transmits it to the one cloud API 31 selected by the cloud API selection unit 23. The data transmission/reception unit 22 also receives the character recognition result from the selected cloud API 31 and transmits it to the user terminal 10. The cloud API selection unit 23 refers to the selection database 24, selects the cloud API 31 best suited to character recognition based on the features of the processing target document image 80, and outputs the selection to the data transmission/reception unit 22. Here, the selection database 24 is a database that stores pairs, each consisting of a feature of an input document image and the cloud API 31 that gives the highest correct answer rate among the plurality of cloud APIs 31 when character recognition is performed on that input document image. The operation of the cloud API selection unit 23 is described in detail later.
When the correct character string for the processing target document image 80 is input from the user terminal 10, the selection database update unit 25 transmits the processing target document image 80 to each cloud API 31 in the cloud API group 30, receives the character recognition result from each cloud API 31, and updates the contents of the selection database 24 according to the degree of correctness, that is, the extent to which each character recognition result is correct or incorrect. The operation of the selection database update unit 25 is described in detail later.
Each functional block of the center server 20 can be realized by a general-purpose computer 150 as shown in FIG. 2. As shown in FIG. 2, the general-purpose computer 150 includes a CPU 151, which is a processor that performs information processing; a ROM 152 and a RAM 153 that temporarily store data during information processing; a hard disk drive (HDD) 154 that stores programs, user data, and the like; a mouse 155 and a keyboard 156 provided as input means; and a display 157 provided as a display device. The CPU 151, the ROM 152, the RAM 153, and the HDD 154 are connected by a data bus 160. The mouse 155, the keyboard 156, and the display 157 are connected to the data bus 160 via an input/output controller 158. A network controller 159 provided as communication means is also connected to the data bus 160.
The data transmission/reception unit 22, the cloud API selection unit 23, and the selection database update unit 25 of the center server 20 are realized by the hardware of the general-purpose computer 150 shown in FIG. 2 operating in cooperation with a program running on the CPU 151. The selection database 24 is realized by storing the pairs of input document image features and cloud APIs 31 in the HDD 154 of the general-purpose computer 150 shown in FIG. 2. Instead of the HDD 154, external storage means accessed over a network may be used.
The plurality of cloud APIs 31 are character recognition function application program interfaces (character recognition cloud APIs) provided by cloud services. Each cloud API 31 performs character recognition on a document image input from outside and outputs the character recognition result. Each cloud API 31 is connected to the center server 20 by a communication line such as the Internet or a telephone line.
Next, an example of the setting operation of the selection database 24 will be described with reference to FIGS. 3 to 10. In the following description, the reference numerals 50, 51, 55, 60, and 70 are used when the plurality of setting document images 50, image feature data sets 51, image feature data set groups 55, setting document image groups 60, and representative image feature data sets 70 are not individually distinguished. When individual items are distinguished, a number in parentheses is appended to the numeral, as in (1), (2), and (J).
First, as shown in step S101 of FIG. 3 and in FIG. 5, N setting document images 50 to be used for setting the selection database 24 are prepared. A setting document image 50 is a document image for which the character string contained in the image is known.
Next, as shown in step S102 of FIG. 3 and in FIG. 5, the N setting document images 50 are input to the center server 20. The processor of the center server 20 extracts the image features of each setting document image 50. As shown in FIG. 5, the image features are extracted as an image feature data set 51 consisting of a plurality of parameters that describe the image and the data for each parameter. The parameters of the image feature data set 51 consist of a plurality of image feature quantities calculated from the pixel information of the document image, a plurality of image attributes indicating the circumstances in which the document image was acquired at the user terminal 10, and a learned feature value calculated using a learning machine. The image feature data set 51 need not include all of the image feature quantities, image attributes, and learned feature value; it is sufficient that it includes at least one of them.
Various parameters can be used as image feature quantities, for example, an outer margin ratio, an inner margin ratio, a chromaticity distribution ratio, a saturation distribution ratio, a chromatic aberration distribution ratio, and a formatting ratio. Here, the outer margin ratio is an index showing what percentage of the area of the document image is occupied by the margin around its periphery. The inner margin ratio is an index showing what percentage of the document image, excluding the outer margin, is occupied by white areas. The chromaticity distribution ratio is an index showing how colorful areas are distributed. The saturation distribution ratio, like the chromaticity distribution ratio, is an index showing how colorful areas are distributed. The chromatic aberration distribution ratio is an index showing how misalignment, bleeding, and blur are distributed in the image. The formatting ratio is an index that quantifies how regularly the characters are arranged.
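As one illustration of an image feature quantity, the outer margin ratio can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the embodiment: the image is assumed to be a grid of grayscale values, a pixel is assumed to be "white" at or above a hypothetical `white_threshold`, the outer margin is assumed to be everything outside the bounding box of the non-white pixels, and the result is returned as a fraction rather than a percentage.

```python
def outer_margin_ratio(pixels, white_threshold=250):
    """Return the share (0.0 to 1.0) of the image area taken up by the outer margin.

    pixels: list of rows, each a list of grayscale values (0 to 255).
    white_threshold is an assumed cutoff; values at or above it count as white.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if height == 0 or width == 0:
        return 0.0

    # Rows and columns that contain at least one non-white pixel.
    rows = [y for y, row in enumerate(pixels)
            if any(v < white_threshold for v in row)]
    cols = [x for x in range(width)
            if any(row[x] < white_threshold for row in pixels)]
    if not rows:
        return 1.0  # a blank page is treated as all margin

    # Area of the bounding box around the content; the rest is outer margin.
    content_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    return 1.0 - content_area / (width * height)
```

The other quantities listed above would each need their own definition; the text does not give formulas for them, so none are sketched here.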
The image attributes are, for example, the luminance, illuminance, acquisition location, and acquisition time of the document image when it is captured by the camera of the user terminal 10. The learned feature value is, for example, a feature value extracted using a convolutional neural network (CNN).
Next, as shown in step S103 of FIG. 3 and in FIG. 6, the processor of the center server 20 classifies the N image feature data sets 51(1) to 51(N) extracted in step S102 of FIG. 3 into K image feature data set groups 55(1) to 55(K), such that the similarity values among the members of each group are equal to or greater than a predetermined threshold. As shown in FIG. 6, each image feature data set group 55 contains a plurality of image feature data sets 51. For example, the image feature data set group 55(1) contains the image feature data sets 51(1), 51(4), ... 51(N-1), and the image feature data set group 55(K) contains the image feature data sets 51(2), 51(3), ... 51(N). Here, the similarity value is a numerical value indicating mutual similarity: it is 1.0 when two data sets match and 0 when they are not similar at all. The predetermined threshold can be chosen freely and may be, for example, about 0.7 to 0.9. Classification may also start with a relatively high threshold, and if the data sets cannot be classified well, the threshold may be lowered step by step.
In step S104 of FIG. 3, as shown in FIG. 6, the processor of the center server 20 generates K setting document image groups 60 by grouping the setting document images 50 that correspond to the image feature data sets 51 contained in each image feature data set group 55. For example, the setting document images 50(1), 50(4), ... 50(N-1), which correspond to the image feature data sets 51(1), 51(4), ... 51(N-1) in the image feature data set group 55(1), are grouped to generate the setting document image group 60(1). Likewise, the setting document images 50(2), 50(3), ... 50(N), which correspond to the image feature data sets 51(2), 51(3), ... 51(N) in the image feature data set group 55(K), are grouped to generate the setting document image group 60(K).
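The classification and grouping of steps S103 and S104 can be sketched as below. The text does not specify how the similarity value is computed or which clustering procedure is used, so both are assumptions here: `similarity` is a hypothetical measure that is 1.0 for identical numeric feature vectors and approaches 0 as they diverge, and the grouping is a simple greedy pass that compares each data set only with the first member of each existing group, which is looser than requiring every pair in a group to meet the threshold.

```python
import math


def similarity(a, b):
    """Assumed similarity value: 1.0 for identical vectors, near 0 for distant ones."""
    return 1.0 / (1.0 + math.dist(a, b))


def group_by_similarity(feature_sets, threshold=0.8):
    """Greedily group feature vectors; returns lists of indices into feature_sets."""
    groups = []  # each group is a list of indices; its first index acts as the seed
    for i, features in enumerate(feature_sets):
        for group in groups:
            if similarity(feature_sets[group[0]], features) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # nothing similar enough: start a new group
    return groups
```

Because the groups hold indices, the same index lists identify both an image feature data set group 55 and the corresponding setting document image group 60. If too many single-member groups result, the function could be called again with a lower `threshold`, in line with the stepwise lowering described above.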
Next, as shown in step S105 of FIG. 4, the processor of the center server 20 sets a counter J to its initial value of 1. The process then proceeds to step S106 of FIG. 4, and as shown in FIG. 7, each setting document image contained in the setting document image group 60(J) is transmitted to the M cloud APIs 31. Then, as shown in step S107 of FIG. 4, the center server 20 receives the character recognition results from each of the M cloud APIs 31(A) to 31(M).
In step S108 of FIG. 4, the processor of the center server 20 compares the character recognition results received from one cloud API 31(A) for the setting document images 50 in the setting document image group 60(J) with the known contained character string of each setting document image 50. A result is counted as correct when the character recognition result exactly matches the known contained character string, and as incorrect otherwise. The processor of the center server 20 then counts the number of setting document images 50 that were recognized correctly.
Then, in step S109 of FIG. 4, the processor of the center server 20 divides the number of correct answers by the total number of setting document images 50 in the setting document image group 60(J) to calculate the correct answer rate obtained when the cloud API 31(A) performs character recognition on the setting document images 50 of the setting document image group 60(J).
Similarly, the processor of the center server 20 compares the character recognition results received from the other cloud APIs 31(B) to 31(M) for the setting document images 50 in the setting document image group 60(J) with the known contained character string of each setting document image 50, and calculates the correct answer rate obtained when each of the cloud APIs 31(B) to 31(M) performs character recognition on the setting document images 50 of the setting document image group 60(J).
Then, in step S110 of FIG. 4, the processor of the center server 20 extracts the cloud API 31 with the highest correct answer rate calculated in step S109, in this example the cloud API 31(A).
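The correct answer rate calculation of steps S108 and S109 and the extraction in step S110 can be sketched as follows. The function and variable names are hypothetical, the recognition results are assumed to have already been received as plain strings, and the tie-breaking behavior (the first API in iteration order wins) is an assumption, since the text does not say how ties are resolved.

```python
def correct_answer_rate(results, expected):
    """Fraction of images whose recognized string exactly matches the known string."""
    correct = sum(1 for got, want in zip(results, expected) if got == want)
    return correct / len(expected)


def best_api(results_by_api, expected):
    """Return (api_name, rate) for the API with the highest correct answer rate.

    results_by_api maps an API name to its list of recognized strings, in the
    same order as `expected`. Ties go to the first API in iteration order.
    """
    rates = {name: correct_answer_rate(results, expected)
             for name, results in results_by_api.items()}
    name = max(rates, key=rates.get)
    return name, rates[name]
```

For a group of three setting document images where API "A" reads all three strings exactly and API "B" misreads one character in one of them, `best_api` would return "A" with a rate of 1.0, since only exact matches count as correct.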
Next, in step S111 of FIG. 4, as shown in FIG. 8, the processor of the center server 20 generates a representative image feature data set 70(J) whose data for each parameter is the representative value of that parameter within one image feature data set group 55(J). As shown in FIG. 8, the image feature data set group 55(1) contains the image feature data sets 51(1), 51(4), ... 51(N-1). The image feature data set 51(4), like the others, stores data for each parameter, such as image feature quantity (1), image feature quantity (2), image attribute (1), image attribute (2), and the learned feature value. The processor of the center server 20 stores the representative value of the data for each parameter as the data of the corresponding parameter in the representative image feature data set 70(J). The representative value may be, for example, the mean or the median. When the mean is used, the representative value of image feature quantity (1) is the mean of image feature quantity (1) over the image feature data sets 51(1) to 51(N-1). For image attribute (1), a broader term that encompasses the image attribute (1) of every image feature data set 51 in the group may be used as the representative value. When image attribute (1) is the location at which the document image was captured by the user terminal 10, the mean or median of the latitude and longitude may be used as the representative value.
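The generation of a representative image feature data set in step S111 can be sketched for numeric parameters as below. This is a sketch under assumptions: each image feature data set is represented as a dictionary with the same hypothetical parameter names, and every value is numeric. Non-numeric attributes, for which the text suggests a broader encompassing term, are not handled.

```python
from statistics import mean, median


def representative_data_set(group, use_median=False):
    """Combine a group of feature dicts into one dict of representative values.

    group: list of dicts that all share the same numeric parameters (assumed).
    The representative value of each parameter is its mean, or its median
    when use_median is True.
    """
    pick = median if use_median else mean
    parameters = group[0].keys()
    return {name: pick([data_set[name] for data_set in group])
            for name in parameters}
```

The median is less sensitive than the mean to a single outlying image in the group, which may matter for attributes such as luminance; the text leaves the choice open.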
As shown in FIG. 9, the representative image feature data set 70(J) is a representative feature that represents the image features of the setting document image group 60(J), which contains a plurality of setting document images 50.
When the threshold used for classification in step S103 of FIG. 3 is about 0.7 to 0.9, the similarity value between the generated representative image feature data set 70(J) and each of the image feature data sets 51 in the image feature data set group 55(J) is also about 0.7 to 0.9, comparable to the threshold. Accordingly, the cloud API 31(A) that gives the highest correct answer rate when recognizing the setting document images 50 in the setting document image group 60(J) is the cloud API 31 that gives the highest correct answer rate when recognizing a document image whose image feature data set 51 is similar to that representative image feature data set 70.
In step S112 of FIG. 4, the processor of the center server 20 pairs the representative image feature data set 70(J) generated in step S111 with the cloud API 31(A) with the highest correct answer rate extracted in step S110 of FIG. 4, and stores the pair in the selection database 24.
The processor of the center server 20 increments the counter J by 1 in step S113 of FIG. 4, and in step S114 of FIG. 4 determines whether the counter J has exceeded K, which is the number of image feature data set groups 55 and equally the number of setting document image groups 60. If the determination in step S114 of FIG. 4 is NO, the process returns to step S106 of FIG. 4.
The processor of the center server 20 thus repeats steps S106 to S112 of FIG. 4 and, as shown in FIG. 10, generates K pairs, each consisting of one of the K representative image feature data sets 70 and the cloud API 31 that gives the highest correct answer rate when recognizing a document image whose image feature data set 51 is similar to that representative image feature data set 70, and stores the pairs in the selection database 24. One cloud API 31 may be paired with a plurality of representative image feature data sets 70.
When the processor of the center server 20 determines YES in step S114 of FIG. 4, it ends the setting operation of the selection database 24.
The setting operation of the selection database 24 described above is only an example, and the selection database 24 may be set by other operations.
Next, the character recognition operation using the document image recognition system 100 will be described with reference to FIG. 1 and FIGS. 11 to 13.
As shown in FIG. 1, when the user transmits a document image acquired with the user terminal 10 to the center server 20 as the processing target document image 80, the data transmission/reception unit 22 of the center server 20 receives the processing target document image 80, as shown in step S201 of FIG. 11. The data transmission/reception unit 22 outputs the received processing target document image 80 to the cloud API selection unit 23.
As shown in step S202 of FIG. 11 and in FIG. 12, the cloud API selection unit 23 extracts the features of the processing target document image 80 in the same way as described above for the selection database setting operation, and generates an image feature data set 81 for the processing target document image 80.
Next, as shown in step S203 of FIG. 11 and in FIG. 13, the cloud API selection unit 23 calculates the similarity value between the image feature data set 81 and each of the representative image feature data sets 70 stored in the selection database 24. It then selects the representative image feature data set with the largest similarity value, in this example the representative image feature data set 70(1). The largest similarity value depends on the image feature data set 81 of the processing target document image 80. When the image feature data set 81 is close to the features of the setting document images 50 used to set the selection database 24, the value is high, for example 0.8 or 0.7. When the image feature data set 81 is far from the features of the setting document images 50 used to set the selection database 24, the value is low, for example about 0.2 to 0.3.
Then, in step S204 of FIG. 11, the cloud API selection unit 23 selects the cloud API 31(A) paired with the representative image feature data set 70(1) selected in step S203, and outputs the selection to the data transmission/reception unit 22.
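Steps S203 and S204 can be illustrated with a minimal sketch. The passage does not state how the similarity value is computed, so cosine similarity is used here purely as an assumption. The function names, the list-of-pairs layout of the selection database, and the API names `api_A` and `api_B` are likewise hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors (an assumed measure, not the patent's)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_cloud_api(features, selection_db):
    """Return (api_name, similarity) for the entry most similar to `features`.

    selection_db is a list of (representative_features, api_name) pairs.
    """
    best = max(selection_db, key=lambda entry: cosine_similarity(features, entry[0]))
    return best[1], cosine_similarity(features, best[0])

# Hypothetical database with two representative feature sets.
selection_db = [
    ([0.9, 0.1, 0.3], "api_A"),
    ([0.1, 0.8, 0.5], "api_B"),
]
api, score = select_cloud_api([0.8, 0.2, 0.3], selection_db)
```

Returning the similarity alongside the API name is a design choice of this sketch. It keeps the value available for the threshold comparison made later during the update operation.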
As shown in step S205 of FIG. 11, the data transmission/reception unit 22 transmits the document image 80 to be processed to the selected cloud API 31(A) indicated by the cloud API selection unit 23. Then, in step S206 of FIG. 11, the data transmission/reception unit 22 receives the character recognition result from the cloud API 31(A).
The data transmission/reception unit 22 then transmits the character recognition result received from the cloud API 31(A) to the user terminal 10.
As shown in FIG. 1, the user terminal 10 displays, on the character string display unit 12, the character string of the character recognition result transmitted from the data transmission/reception unit 22 of the center server 20.
As described above, the document image recognition system 100 of the embodiment selects the cloud API 31 best suited to character recognition of the document image 80 to be processed received from the user terminal 10, and has that cloud API 31 perform the character recognition processing. Character recognition can therefore be performed with high accuracy.
Next, the update operation of the selection database 24 will be described with reference to FIGS. 14 to 19.
As described above, the cloud API selection unit 23 calculates the similarity value between the image feature data set 81 of the document image 80 to be processed and each of the plurality of representative image feature data sets 70 stored in the selection database 24, and selects the representative image feature data set 70 having the largest similarity value. However, the maximum similarity value is high, for example 0.8 or 0.7, only when the image feature data set 81 is close to the features of the setting document images 50 used when setting up the selection database 24. When the image feature data set 81 is far from those features, the value is low, about 0.2 to 0.3. Consequently, even when the representative image feature data set 70 with the largest similarity value is selected and character recognition is performed using the cloud API 31 paired with it, the character recognition result may not be correct. It is therefore necessary to keep updating the selection database 24 so that the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data sets 70 stored in the selection database 24 becomes as high as possible.
Updating of the selection database 24 starts when the user terminal 10 receives the character recognition result from the center server 20 and displays the recognized character string on the character string display unit 12, and the user, having seen it, inputs the correct character string contained in the document image 80 to be processed into the correct character string input unit 13. When the correct character string is input, the user terminal 10 transmits it to the center server 20. The center server 20 transmits the document image 80 to be processed to each cloud API 31 and updates the selection database 24 according to the degree of correctness, that is, the extent to which each received character recognition result is correct or incorrect. The details are described below. In the following description, a result is "correct" when every character of the received character recognition result is right, and "incorrect" when the received character string contains even one wrong character. The following description also assumes that the cloud API 31(A) was selected in the character recognition operation.
As shown in FIG. 1, the user checks the character string of the character recognition result displayed on the character string display unit 12 of the user terminal 10. At this time, an approval icon and a character input area are displayed on the screen of the user terminal 10. The approval icon and the character input area constitute the correct character string input unit 13.
If the character recognition result displayed on the character string display unit 12 is the correct character string, the user presses the approval icon displayed on the screen of the user terminal 10. The user terminal 10 then transmits the character recognition result that the center server 20 sent in step S207 of FIG. 11 to the selection database update unit 25 of the center server 20 as the correct character string. If, on the other hand, the user checks the character string displayed on the character string display unit 12 and judges that the character recognition result is not the correct character string, the user inputs the correct character string of the document image 80 to be processed into the character input area displayed on the screen of the user terminal 10. When a correct character string is input into the character input area, the user terminal 10 transmits the input correct character string to the selection database update unit 25 of the center server 20. The user may also give the approval or input the correct character string by voice. In that case, the voice input function constitutes the correct character string input unit 13.
As shown in step S301 of FIG. 14, the selection database update unit 25 of the center server 20 waits until the correct character string of the document image 80 to be processed is input from the user terminal 10. When the correct character string is input, the selection database update unit 25 proceeds to step S302 of FIG. 14 and, as shown in FIG. 19, transmits the document image 80 to be processed to all M cloud APIs 31(A) to 31(M). Then, as shown in step S303 of FIG. 14, the selection database update unit 25 receives the character recognition results from the M cloud APIs 31(A) to 31(M).
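Steps S302 and S303 amount to sending one image to every registered cloud API and collecting the results. The sketch below shows one possible way to do this concurrently with Python's standard thread pool. The callables stand in for real vendor clients, and all names are hypothetical. The passage does not say whether the requests are issued in parallel or one after another.

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_with_all(image_bytes, apis):
    """Send the same image to every API and collect {name: recognized_text}.

    `apis` maps a name to a callable taking image bytes and returning text.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(apis))) as pool:
        futures = {name: pool.submit(call, image_bytes) for name, call in apis.items()}
        return {name: future.result() for name, future in futures.items()}

# Stand-in callables; a real system would wrap each vendor's client here.
fake_apis = {
    "api_A": lambda img: "INVOICE 1234",
    "api_B": lambda img: "INV0ICE 1234",
}
results = recognize_with_all(b"...image bytes...", fake_apis)
```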
As shown in step S304 of FIG. 14 and in FIG. 19, the selection database update unit 25 compares the character recognition result received from the cloud API 31(A), which the cloud API selection unit 23 selected in the preceding character recognition operation, with the correct character string. If the character recognition result of the selected cloud API 31(A) is correct, the process proceeds to step S305 of FIG. 14.
In step S305 of FIG. 14, the selection database update unit 25 compares the character recognition results received from the cloud APIs 31(B) to 31(M), that is, those other than the previously selected cloud API 31(A), with the correct character string. If at least one of the character recognition results received from the other cloud APIs 31(B) to 31(M) is correct, the process proceeds to step S306 of FIG. 15.
In step S306 of FIG. 15, the selection database update unit 25 determines whether the similarity value between the image feature data set 81 of the document image 80 to be processed shown in FIG. 12 and the representative image feature data set 70(1) shown in FIG. 13, which is paired with the previously selected cloud API 31(A), is equal to or greater than a predetermined threshold. The threshold can be chosen freely and may be set to, for example, about 0.8 or 0.7.
If the determination in step S306 of FIG. 15 is YES, the selection database update unit 25 proceeds to step S307 of FIG. 15 and updates the representative image feature data set 70(1) paired with the previously selected cloud API 31(A) on the basis of the image feature data set 81 of the document image 80 to be processed. For the update, each data item of each parameter of the representative image feature data set 70(1) may, for example, be increased or decreased by a weighted amount of the difference between that data item and the corresponding data item of the image feature data set 81 of the document image 80 to be processed. Alternatively, each data item of each parameter of the representative image feature data set 70(1) may be replaced with the corresponding data item of the image feature data set 81 of the document image 80 to be processed.
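The weighted update of step S307 can be written as moving each element of the representative feature vector toward the new feature vector by a fraction of their difference. The sketch below assumes the features are flat lists of numbers. The function name and the default weight of 0.1 are illustrative choices, not values given in the text.

```python
def update_representative(rep, feat, weight=0.1):
    """Move each element of `rep` toward `feat` by weight * (feat - rep).

    weight=1.0 reproduces the replacement variant described in the text.
    """
    return [r + weight * (f - r) for r, f in zip(rep, feat)]

halfway = update_representative([0.0, 1.0], [1.0, 0.0], weight=0.5)
replaced = update_representative([0.0, 1.0], [1.0, 0.0], weight=1.0)
```

With a weight of 1.0 the result equals the new feature vector, so the two update variants described above become the two ends of a single parameter.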
If the determination in step S306 of FIG. 15 is NO, the selection database update unit 25 proceeds to step S308 of FIG. 15 and adds the pair consisting of the image feature data set 81 of the document image 80 to be processed and the previously selected cloud API 31(A) to the selection database 24. If that pair already exists in the selection database 24, however, it is not added.
When the processing of step S307 or step S308 of FIG. 15 is finished, the selection database update unit 25 proceeds to step S309 of FIG. 15. There it determines whether the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data set 70 paired with the cloud API 31 that, among the other cloud APIs 31, was found in step S305 of FIG. 14 to have produced a correct character recognition result is equal to or greater than the predetermined threshold.
If the determination in step S309 of FIG. 15 is YES, the selection database update unit 25 proceeds to step S310 of FIG. 15 and, on the basis of the image feature data set 81 of the document image 80 to be processed, updates the representative image feature data set 70 paired with the cloud API 31 that, among the other cloud APIs 31, produced a correct character recognition result. As described above, each data item of each parameter of the representative image feature data set 70 may be increased or decreased by a weighted amount of the difference between that data item and the corresponding data item of the image feature data set 81 of the document image 80 to be processed. Alternatively, each data item of each parameter of the representative image feature data set 70 may be replaced with the corresponding data item of the image feature data set 81 of the document image 80 to be processed.
If the determination in step S309 of FIG. 15 is NO, the selection database update unit 25 proceeds to step S311 of FIG. 15 and adds to the selection database 24 the pair consisting of the image feature data set 81 of the document image 80 to be processed and the cloud API 31 that, among the other cloud APIs 31, produced a correct character recognition result. If that pair already exists in the selection database 24, it is not added.
If, in step S305 of FIG. 14, more than one of the character recognition results received from the other cloud APIs 31(B) to 31(M) is correct, the processing of steps S309 to S311 of FIG. 15 is performed for each of those other cloud APIs 31.
When the processing of step S310 or S311 of FIG. 15 is finished, the selection database update unit 25 ends the update operation.
If the determination in step S305 of FIG. 14 is NO, the selection database update unit 25 executes steps S401 to S403 of FIG. 16. Steps S401 to S403 of FIG. 16 are the same as steps S306 to S308 shown in FIG. 15, so their description is omitted.
If the determination in step S304 of FIG. 14 is NO, the selection database update unit 25 proceeds to step S501 of FIG. 17 and determines whether any of the character recognition results of the other cloud APIs 31(B) to 31(M) is correct. If the determination in step S501 of FIG. 17 is YES, the selection database update unit 25 executes steps S502 to S504 of FIG. 17. Steps S502 to S504 of FIG. 17 are the same as steps S309 to S311 shown in FIG. 15, so their description is omitted.
If the determination in step S501 of FIG. 17 is NO, the selection database update unit 25 proceeds to step S505 of FIG. 18 and, as shown in FIG. 19, transmits the document image 80 to be processed to another cloud API 32, that is, one other than the cloud APIs 31 stored in the selection database 24 in pairs with representative image feature data sets 70. Then, as shown in step S506 of FIG. 18, upon receiving the character recognition result from the other cloud API 32, the selection database update unit 25 checks in step S507 whether the received character recognition result is correct. If the determination in step S507 of FIG. 18 is YES, the selection database update unit 25 proceeds to step S508 and adds the pair consisting of the image feature data set 81 of the document image 80 to be processed and the other cloud API 32 to the selection database 24.
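Read together, the branches of FIGS. 14 to 18 appear to reduce to one rule. Every registered cloud API whose result is correct has its paired entry updated (similarity at or above the threshold) or gains a new pair (similarity below it), and only when no registered API is correct are the other cloud APIs 32 tried. The sketch below follows that reading and is not the flowchart itself. The database layout, the cosine similarity, the choice of the closest entry when one API has several entries, and all names are assumptions.

```python
import math

def cosine(a, b):
    """Assumed similarity measure; the text does not fix one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def refresh_entry(db, feat, api, threshold=0.8, weight=0.1):
    """Steps S306-S308 (and S309-S311): update the API's closest entry or add a pair."""
    own = [e for e in db if e["api"] == api]
    closest = max(own, key=lambda e: cosine(feat, e["features"]), default=None)
    if closest is not None and cosine(feat, closest["features"]) >= threshold:
        # Weighted update toward the new features.
        closest["features"] = [r + weight * (f - r)
                               for r, f in zip(closest["features"], feat)]
    elif all(e["features"] != list(feat) for e in own):
        # Add the pair only if it is not already stored.
        db.append({"features": list(feat), "api": api})

def update_selection_db(db, feat, results, truth, fallback=None):
    """results: {api: text} from all registered APIs; fallback: () -> {api: text}."""
    correct = [api for api, text in results.items() if text == truth]
    if not correct and fallback is not None:
        # Step S505 onward: try cloud APIs that are not yet in the database.
        correct = [api for api, text in fallback().items() if text == truth]
    for api in correct:
        refresh_entry(db, feat, api)
    return correct

db = [{"features": [1.0, 0.0], "api": "A"}, {"features": [0.0, 1.0], "api": "B"}]
first = update_selection_db(db, [1.0, 0.0], {"A": "abc", "B": "abx"}, "abc")
```

Here `fallback` is a zero-argument callable, so the extra requests to the other cloud APIs are made only when none of the registered APIs returns the correct string.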
In the update operation described above, the representative image feature data set 70 paired with a cloud API 31 whose character recognition result was correct is brought closer to the image feature data set 81 of the document image 80 to be processed. The selection database 24 can therefore be updated so that the similarity value between the image feature data set 81 of the document image 80 to be processed and the representative image feature data sets 70 stored in the selection database 24 gradually increases. In addition, when none of the character recognition results of the cloud APIs 31 is correct, another cloud API 32 whose character recognition result was correct is stored in the selection database 24 in a pair with the image feature data set 81 of the document image 80 to be processed, so the range of documents that can be recognized accurately can be widened.
In this way, the character recognition accuracy of the document image recognition system 100 of the embodiment can be improved over time.
In the above description, a result is correct when every character of the received character recognition result is right, and incorrect when the received character string contains even one wrong character. The determination is not limited to this. For example, a result may be regarded as correct when the proportion of correct characters among all characters contained in the received character recognition result is equal to or greater than a predetermined threshold, such as 90%, and as incorrect when the proportion is below that threshold, and the update operation described above may then be executed on that basis.
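Both notions of correctness can be expressed in one helper. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to count the characters of the recognized string that align with the correct string. That alignment method is an assumption, since the text does not say how correct characters are counted, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

def correct_ratio(recognized, truth):
    """Fraction of the characters in `recognized` that align with `truth`."""
    if not recognized:
        return 1.0 if not truth else 0.0
    matcher = SequenceMatcher(None, recognized, truth, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(recognized)

def is_correct(recognized, truth, min_ratio=1.0):
    """Exact match when min_ratio is 1.0, otherwise a ratio threshold such as 0.9."""
    if min_ratio >= 1.0:
        return recognized == truth
    return correct_ratio(recognized, truth) >= min_ratio

strict = is_correct("INV0ICE 1234", "INVOICE 1234")
lenient = is_correct("INV0ICE 1234", "INVOICE 1234", min_ratio=0.9)
```

Because the ratio is taken over the characters of the recognized string, as the text describes, a recognized string that merely omits characters can still score highly. A stricter variant could divide by the longer of the two lengths instead.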
10 user terminal, 11 document image acquisition unit, 12 character string display unit, 13 correct character string input unit, 20 center server, 21 character recognition processing unit, 22 data transmission/reception unit, 23 cloud API selection unit, 24 selection database, 25 selection database update unit, 30 cloud API group, 31, 32 cloud API, 50 setting document image, 51, 81 image feature data set, 55 image feature data set group, 60 setting document image group, 70 representative image feature data set, 80 document image to be processed, 100 document image recognition system, 150 general-purpose computer, 151 CPU, 152 ROM, 153 RAM, 154 HDD, 155 mouse, 156 keyboard, 157 display, 158 input/output controller, 159 network controller, 160 data bus.

Claims (14)

  1.  A document image recognition system comprising:
     a user terminal that acquires a document image;
     a center server connected to the user terminal via a communication line; and
     a plurality of character recognition cloud APIs that are connected to the center server via a communication line, perform character recognition processing on an input document image, and output a character recognition result,
     wherein the center server includes a selection database storing pairs each consisting of features of an input document image and the character recognition cloud API that, among the plurality of character recognition cloud APIs, achieves the highest character recognition accuracy rate when character recognition processing is performed on that input document image,
     the user terminal transmits the acquired document image to the center server as a document image to be processed, and
     the center server extracts features of the document image to be processed from the document image to be processed received from the user terminal, selects, from among the features of the input document images stored in the selection database, the features of the input document image most similar to the features of the document image to be processed, selects the one character recognition cloud API paired with the selected features of the input document image, transmits the document image to be processed to the selected one character recognition cloud API, receives a character recognition result from the one character recognition cloud API, and transmits the received character recognition result to the user terminal.
  2.  The document image recognition system according to claim 1, wherein
     the user terminal, upon receiving the character recognition result from the center server, outputs to the center server a correct character string that is contained in the document image to be processed and has been input by the user, and
     the center server,
     when the correct character string is input from the user terminal, transmits the document image to be processed to each character recognition cloud API,
     receives a character recognition result from each character recognition cloud API, and,
     according to the degree of correctness of the received character recognition results, performs one or both of updating the features of the input document images paired with the respective character recognition cloud APIs in the selection database and adding a pair of features of an input document image and a character recognition cloud API to the selection database.
  3.  The document image recognition system according to claim 2, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, at least one of the character recognition results received from the character recognition cloud APIs other than the selected one character recognition cloud API is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the selected one character recognition cloud API is equal to or greater than a predetermined threshold,
     the center server updates the features of the input document image paired with the selected one character recognition cloud API on the basis of the features of the document image to be processed.
  4.  The document image recognition system according to claim 3, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the selected one character recognition cloud API is less than the predetermined threshold,
     the center server adds the pair consisting of the features of the document image to be processed and the selected one character recognition cloud API to the selection database.
  5.  The document image recognition system according to claim 2, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, at least one of the character recognition results received from the character recognition cloud APIs other than the selected one character recognition cloud API is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct is equal to or greater than a predetermined threshold,
     the center server updates, on the basis of the features of the document image to be processed, the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct.
  6.  The document image recognition system according to claim 5, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, at least one of the character recognition results received from the other character recognition cloud APIs is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct is less than the predetermined threshold,
     the center server adds the pair consisting of the features of the document image to be processed and the other character recognition cloud API whose character recognition result was correct to the selection database.
  7.  The document image recognition system according to claim 2, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, none of the character recognition results received from the character recognition cloud APIs other than the selected one character recognition cloud API is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the selected one character recognition cloud API is equal to or greater than a predetermined threshold,
     the center server updates the features of the input document image paired with the selected one character recognition cloud API on the basis of the features of the document image to be processed.
  8.  The document image recognition system according to claim 7, wherein
     when the character recognition result received from the selected one character recognition cloud API is correct, none of the character recognition results received from the character recognition cloud APIs other than the selected one character recognition cloud API is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the selected one character recognition cloud API is less than the predetermined threshold,
     the center server adds the pair consisting of the features of the document image to be processed and the selected one character recognition cloud API to the selection database.
  9.  The document image recognition system according to claim 2, wherein,
    when the character recognition result received from the selected one character recognition cloud API is incorrect, at least one of the character recognition results received from the character recognition cloud APIs other than the selected one is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct is equal to or greater than a predetermined threshold,
    the center server updates, based on the features of the document image to be processed, the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct.
  10.  The document image recognition system according to claim 9, wherein,
    when the character recognition result received from the selected one character recognition cloud API is incorrect, at least one of the character recognition results received from the character recognition cloud APIs other than the selected one is correct, and the similarity value between the features of the document image to be processed and the features of the input document image paired with the other character recognition cloud API whose character recognition result was correct is less than the predetermined threshold,
    the center server adds to the selection database a pair of the features of the document image to be processed and the other character recognition cloud API whose character recognition result was correct.
  11.  The document image recognition system according to claim 2, wherein,
    when the character recognition result received from the selected one character recognition cloud API is incorrect and none of the character recognition results received from the character recognition cloud APIs other than the selected one is correct,
    the center server transmits the document image to be processed to a further character recognition cloud API other than the character recognition cloud APIs stored in the selection database in pairs with the features of input document images, and, when the character recognition result received from the further character recognition cloud API is correct,
    adds a pair of the features of the document image to be processed and the further character recognition cloud API to the selection database.
  12.  The document image recognition system according to any one of claims 1 to 11, wherein
    the features of a document image include at least one of: an image feature amount calculated from pixel information of the document image; an image attribute indicating the circumstances under which the document image was acquired at the user terminal; and a learning feature value calculated using a learning machine.
  13.  The document image recognition system according to claim 12, wherein
    the image attribute is information acquired by the user terminal when the user terminal acquires the document image, and includes at least one of the luminance of the document image, the illuminance, the acquisition location, and the acquisition time.
  14.  The document image recognition system according to any one of claims 1 to 11, wherein
    each character recognition cloud API stored in the selection database is determined by extracting the features of a plurality of setting document images whose contained character strings are known, grouping the setting document images whose features are similar to one another, and selecting, for each group, the character recognition cloud API that achieves the highest correct answer rate when character recognition is performed on the plurality of setting document images included in that group, and
    the features of the input document image paired with the character recognition cloud API are representative features that represent the features of the corresponding group of setting document images.
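
The selection database maintenance recited in claims 7 to 11 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the claims leave the similarity measure, the update formula, and the threshold value open, so the cosine similarity, the averaging `blend` rule, the default threshold of 0.9, and every name used here (`Entry`, `update_selection_db`, `try_unlisted`) are hypothetical. The sketch also assumes that when one API is stored with several input-image features, the stored entry closest to the document image being processed is the one compared against the threshold.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, List, Optional

Feature = List[float]


@dataclass
class Entry:
    """One selection-database row: input-image features paired with an API."""
    feature: Feature
    api: str


def cosine(a: Feature, b: Feature) -> float:
    """Assumed similarity measure (the claims do not name one)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def blend(old: Feature, new: Feature, rate: float = 0.5) -> Feature:
    """Assumed update rule: move the stored features toward the new ones."""
    return [(1 - rate) * o + rate * n for o, n in zip(old, new)]


def update_selection_db(
    db: List[Entry],
    target: Feature,                  # features of the image being processed
    selected: Entry,                  # entry whose API was selected
    correct: Dict[str, bool],         # API name -> was its result correct?
    threshold: float = 0.9,
    try_unlisted: Optional[Callable[[], Optional[str]]] = None,
) -> List[str]:
    """Apply the rules of claims 7 to 11 and return the actions taken."""
    actions: List[str] = []
    others_ok = [api for api, ok in correct.items() if ok and api != selected.api]

    if correct.get(selected.api, False):
        if others_ok:
            return actions            # case not covered by claims 7 to 11
        winners = [selected]          # claims 7 and 8
    elif others_ok:
        winners = []                  # claims 9 and 10
        for api in others_ok:
            rows = [e for e in db if e.api == api]
            if rows:
                winners.append(max(rows, key=lambda e: cosine(target, e.feature)))
    else:
        # Claim 11: try an API that is not in the database; the callback
        # returns its name if it recognized the image correctly, else None.
        api = try_unlisted() if try_unlisted else None
        if api is not None:
            db.append(Entry(list(target), api))
            actions.append(f"add:{api}")
        return actions

    for entry in winners:
        if cosine(target, entry.feature) >= threshold:
            entry.feature = blend(entry.feature, target)   # claims 7 and 9
            actions.append(f"update:{entry.api}")
        else:
            db.append(Entry(list(target), entry.api))      # claims 8 and 10
            actions.append(f"add:{entry.api}")
    return actions
```

Returning the list of actions keeps the sketch easy to check: a call such as `update_selection_db(db, target, db[0], {"api_a": True, "api_b": False})` either refines the stored features in place or appends a new pair, depending on which side of the threshold the similarity falls.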
PCT/JP2020/031792 2020-08-24 2020-08-24 Document image recognition system WO2022044067A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2022534682A JP7134380B2 (en) 2020-08-24 2020-08-24 Document image recognition system
PCT/JP2020/031792 WO2022044067A1 (en) 2020-08-24 2020-08-24 Document image recognition system
CN202080103301.0A CN116569225B (en) 2020-08-24 2020-08-24 Document image recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/031792 WO2022044067A1 (en) 2020-08-24 2020-08-24 Document image recognition system

Publications (1)

Publication Number Publication Date
WO2022044067A1 (en) 2022-03-03

Family

ID=80352890

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/031792 WO2022044067A1 (en) 2020-08-24 2020-08-24 Document image recognition system

Country Status (3)

Country Link
JP (1) JP7134380B2 (en)
CN (1) CN116569225B (en)
WO (1) WO2022044067A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016207039A (en) * 2015-04-24 2016-12-08 富士通フロンテック株式会社 Image processing system, image processing device, image processing method, and image processing program
JP2019040417A (en) * 2017-08-25 2019-03-14 富士ゼロックス株式会社 Information processing device and program
JP2019164687A (en) * 2018-03-20 2019-09-26 富士ゼロックス株式会社 Information processing device
JP2019169025A (en) * 2018-03-26 2019-10-03 株式会社Pfu Information processing device, character recognition engine selection method, and program

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008293354A (en) 2007-05-25 2008-12-04 Canon Inc Document image recognition system
CN105159870B (en) * 2015-06-26 2018-06-29 徐信 A kind of accurate processing system and method for completing continuous natural-sounding textual
CN111202663B (en) * 2019-12-31 2022-12-27 浙江工业大学 Vision training learning system based on VR technique

Also Published As

Publication number Publication date
CN116569225A (en) 2023-08-08
JP7134380B2 (en) 2022-09-09
JPWO2022044067A1 (en) 2022-03-03
CN116569225B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
TWI703458B (en) Data processing model construction method, device, server and client
EP3989104A1 (en) Facial feature extraction model training method and apparatus, facial feature extraction method and apparatus, device, and storage medium
US20070109302A1 (en) Link relationship display apparatus, and control method and program for the link relationship display apparatus
US20220189189A1 (en) Method of training cycle generative networks model, and method of building character library
US11748452B2 (en) Method for data processing by performing different non-linear combination processing
CN110795542A (en) Dialogue method and related device and equipment
CN110880324A (en) Voice data processing method and device, storage medium and electronic equipment
EP3905672A1 (en) Color space mapping method and device, computer readable storage medium and apparatus
CN112950640A (en) Video portrait segmentation method and device, electronic equipment and storage medium
CN114037003A (en) Question-answer model training method and device and electronic equipment
CN113378911A (en) Image classification model training method, image classification method and related device
CN106557178B (en) Method and device for updating entries of input method
WO2022044067A1 (en) Document image recognition system
CN111831134B (en) Multi-character structure self-adaptive input method and layout generation method thereof
WO2021057062A1 (en) Method and apparatus for optimizing attractiveness judgment model, electronic device, and storage medium
CN113313066A (en) Image recognition method, image recognition device, storage medium and terminal
CN116229188B (en) Image processing display method, classification model generation method and equipment thereof
JP2019152727A (en) Information processing device, information processing system, and program
CN114268625B (en) Feature selection method, device, equipment and storage medium
KR20220034077A (en) Training method for character generation model, character generation method, apparatus and device
CN114186039A (en) Visual question answering method and device and electronic equipment
JP2019046388A (en) Chat system, server, screen generation method and computer program
CN112202985A (en) Information processing method, client device, server device and information processing system
KR102627950B1 (en) System for recommending the Local and using the same
CN118070156B (en) Air pressure identification method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 20951323; Country of ref document: EP; Kind code of ref document: A1

ENP Entry into the national phase
    Ref document number: 2022534682; Country of ref document: JP; Kind code of ref document: A

WWE WIPO information: entry into national phase
    Ref document number: 202080103301.0; Country of ref document: CN

NENP Non-entry into the national phase
    Ref country code: DE

122 EP: PCT application non-entry in European phase
    Ref document number: 20951323; Country of ref document: EP; Kind code of ref document: A1