WO2023181149A1 - Character recognition system, character recognition method, and recording medium - Google Patents

Character recognition system, character recognition method, and recording medium

Info

Publication number
WO2023181149A1
WO2023181149A1 (PCT/JP2022/013389)
Authority
WO
WIPO (PCT)
Prior art keywords
preprint
image
characters written
recognition
characters
Prior art date
Application number
PCT/JP2022/013389
Other languages
English (en)
Japanese (ja)
Inventor
中谷 裕一 (Yuichi Nakatani)
Original Assignee
日本電気株式会社 (NEC Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社 (NEC Corporation)
Priority to PCT/JP2022/013389 priority Critical patent/WO2023181149A1/fr
Publication of WO2023181149A1 publication Critical patent/WO2023181149A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning

Definitions

  • the present invention relates to a character recognition system and the like.
  • OCR (Optical Character Recognition)
  • Character recognition by OCR is performed, for example, by using a learning model generated by machine learning to recognize characters written on a preprint of a form.
  • the shape of the characters written on the preprint of the form and the position where the characters are written on the preprint vary depending on the person writing the characters.
  • the preprint and characters coexist in the image.
  • a learning model that recognizes handwritten characters on a form may therefore be required to accurately recognize images in which the preprint and characters, written in various shapes and at various positions, coexist. For this reason, it is desirable to have a technology that can accurately recognize characters on preprinted forms.
  • Patent Document 1 uses a learning model to extract handwritten characters written within the frame of a preprint.
  • the image processing system disclosed in Patent Document 1 extracts handwritten characters from an image of handwritten characters written within the preprint frame by erasing the preprint frame through image processing.
  • the main object of the present invention is to provide a character recognition system etc. that can improve the recognition accuracy of characters written on preprints.
  • the character recognition system of the present invention includes an acquisition means for acquiring an image of characters written on the preprint of a form including a preprint, a recognition means for recognizing the characters written on the preprint of the acquired image from the acquired image and a preprint image capturing the preprint, using a recognition model that recognizes characters written on a preprint from an image of the characters and the preprint image, and an output means for outputting the recognition result.
  • the character recognition method of the present invention acquires an image of characters written on the preprint of a form including a preprint, recognizes the characters written on the preprint of the acquired image from the acquired image and a preprint image capturing the preprint, using a recognition model that recognizes characters written on a preprint from an image of the characters and the preprint image, and outputs the recognition result.
  • the recording medium of the present invention non-transitorily records a character recognition program that causes a computer to execute a process of acquiring an image of characters written on the preprint of a form including a preprint, a process of recognizing the characters written on the preprint of the acquired image from the acquired image and a preprint image capturing the preprint, using a recognition model that recognizes characters written on a preprint from an image of the characters and the preprint image, and a process of outputting the recognition result.
  • the recognition accuracy of characters written on preprints can be improved.
  • FIG. 1 is a diagram showing an example of the configuration of the first embodiment of the present invention.
  • FIG. 2 is a diagram showing an example of a form in the first embodiment of the present invention.
  • FIG. 3 is a diagram showing an example of an image with characters written on a preprint in the first embodiment of the present invention.
  • FIG. 4 is a diagram showing an example of a preprint image in the first embodiment of the present invention.
  • FIG. 5 is a diagram showing an example of an image with characters written on a preprint in the first embodiment of the present invention.
  • FIG. 6 is a diagram showing an example of a preprint image in the first embodiment of the present invention.
  • FIG. 7 is a diagram showing an example of the configuration of the character recognition system in the first embodiment of the present invention.
  • FIG. 8 is a diagram showing an example of an image with characters written on a preprint in the first embodiment of the present invention.
  • FIG. 9 is a diagram showing an example of a preprint image in the first embodiment of the present invention.
  • FIG. 10 is a diagram showing an example of an image with characters written on a preprint in the first embodiment of the present invention.
  • FIG. 11 is a diagram showing an example of a preprint image in the first embodiment of the present invention.
  • FIG. 12 is a diagram showing an example of an image with characters written on a preprint in the first embodiment of the present invention.
  • FIG. 13 is a diagram showing an example of a preprint image in the first embodiment of the present invention.
  • FIG. 14 is a diagram showing an example of the operation flow of the character recognition system in the first embodiment of the present invention.
  • FIG. 15 is a diagram showing an example of the operation flow of the character recognition system in the first embodiment of the present invention.
  • FIG. 16 is a diagram showing an example of the configuration of the second embodiment of the present invention.
  • FIG. 17 is a diagram showing an example of the configuration of the character recognition system of the second embodiment of the present invention.
  • FIG. 18 is a diagram schematically showing the flow of data processing in the second embodiment of the present invention.
  • FIG. 19 is a diagram showing an example of the operation flow of the character recognition system of the second embodiment of the present invention.
  • FIG. 20 is a diagram showing an example of the operation flow of the character recognition system of the second embodiment of the present invention.
  • FIG. 21 is a diagram showing an example of the operation flow of the character recognition system of the second embodiment of the present invention.
  • FIG. 22 is a diagram showing an example of the configuration of another embodiment of the present invention.
  • FIG. 1 is a diagram showing an example of the configuration of a form processing system according to this embodiment.
  • the form processing system includes, for example, a character recognition system 10, a scanner 20, and an information processing server 30.
  • the character recognition system 10 is connected to a scanner 20 via a network, for example. Further, the character recognition system 10 is connected to an information processing server 30 via a network.
  • the character recognition system 10 acquires an image obtained by reading a form by the scanner 20.
  • a preprint for writing characters is printed on the paper of the form.
  • a preprint is, for example, a frame or a line on a form that indicates the position where characters are written.
  • the character recognition system 10 acquires, for example, an image of handwritten characters written on a preprint.
  • the characters written on the preprint may be printed.
  • the characters written on the preprint are not limited to the above examples.
  • the character recognition system 10 uses a recognition model to recognize the characters written on the preprint from an image, obtained from the scanner 20, of the characters written on the preprint and from a preprint image capturing the preprint.
  • the recognition model is a learning model that recognizes characters written on a preprint from an image of the characters written on the preprint and the preprint image.
  • the character recognition system 10 outputs the recognition results of characters written on the preprint to the information processing server 30, for example.
  • the information processing server 30 is a server that performs processing according to the purpose of the recognition results of characters written on the preprint.
  • by using the preprint image in addition to the image of the characters written on the preprint to be recognized, the character recognition system 10 can suppress the influence of the preprint on character recognition.
  • FIG. 2 is a diagram showing an example of a form.
  • the name of the form is written as "payment slip" at the top.
  • the example of the form in FIG. 2 is, for example, a document submitted to a financial institution when depositing money into an account at the financial institution.
  • entry columns for "account number” and “amount” are set.
  • the frames in which numbers are entered in the "account number” and “amount” fields are preprints.
  • the characters written on the preprint are, for example, the characters written within the frame of the preprint.
  • the characters written on the preprint may be written so as to overlap with the frame of the preprint.
  • An image with characters written on a preprint is an image that includes both the preprint and the characters written on the preprint.
  • the preprint image is an image of only a preprint without any characters written on it.
  • numbers are written on the preprint, but the characters written on the preprint are not limited to numbers.
  • the characters written on the preprint may include symbols.
  • FIG. 3 is a diagram showing an example of an image of characters written on a preprint.
  • FIG. 3 is an extracted image of the "account number" entry field in the example of the form shown in FIG. 2.
  • FIG. 4 is a preprint image of the entry field for "account number" in the example of the form shown in FIG. 2.
  • the characters "01778543" are handwritten on the preprint shown in FIG.
  • An image that is only a preprint may include characters as a preprint.
  • the characters as the preprint are, for example, characters that indicate the digit of the amount, characters that indicate the item, or characters that indicate the unit.
  • the characters as a preprint are not limited to those mentioned above, as long as they are printed on paper as a preprint.
  • FIG. 5 is a diagram showing an example of an image of characters written on a preprint.
  • FIG. 5 is an image in which the "amount" entry field is extracted from the example of the form shown in FIG. 2.
  • FIG. 6 is a preprint image of the "amount" entry field in the example of the form shown in FIG. 2.
  • "yen” indicating the unit of monetary amount is printed as part of the preprint at the bottom of the frame on the right.
  • the characters "40000" are handwritten on the preprint shown in FIG.
  • a form is a document used for procedures at, for example, financial institutions, government offices, educational institutions, hospitals, transportation facilities, or companies. Further, the form may be a document attached to an item to be managed. Examples of forms are not limited to the above.
  • the preprint indicates, for example, a position on the form where the date, name, affiliation, address, telephone number, e-mail address, age, gender, occupation, or amount is to be written.
  • a preprint is composed of, for example, items to be filled in and a frame in which characters are written. When multiple characters are entered in one item, the preprint may be a series of multiple frames.
  • preprints for a plurality of items may be printed on one form. For example, when a preprint is printed on a sheet of paper as an entry column with a plurality of consecutive frames, the character recognition system 10 outputs the recognized characters as character string data according to the order of the frames.
  • FIG. 7 is a diagram showing an example of the configuration of the character recognition system 10.
  • the character recognition system 10 includes an acquisition section 11, a recognition section 13, and an output section 14 as basic components.
  • the character recognition system 10 further includes an image extraction section 12, a generation section 15, and a storage section 16.
  • the acquisition unit 11, the image extraction unit 12, the recognition unit 13, the output unit 14, and the storage unit 16 recognize the characters written on the preprint from an image of the characters written on the preprint. Further, the acquisition unit 11, the generation unit 15, and the storage unit 16 generate a recognition model, for example.
  • the acquisition unit 11 acquires an image of the characters written on the preprint.
  • the acquisition unit 11 acquires, for example, from the scanner 20 an image of a form with characters written on a preprint.
  • the acquisition unit 11 may acquire an image in which the portion of the preprint with characters written on it has already been extracted from the form.
  • the image of the portion where the characters are written on the preprint is, for example, the image shown in the examples of FIGS. 3 and 5.
  • the acquisition unit 11 may acquire an image of the form without any characters written on the preprint.
  • the acquisition unit 11 acquires, for example, from the scanner 20 an image of a form with no characters written on the preprint.
  • the acquisition unit 11 may acquire learning data used to generate the recognition model.
  • the generation unit 15 acquires, as learning data, an image of the characters written on the preprint, and data in which the preprint image is associated with the characters written on the preprint.
  • the learning data is input into the character recognition system 10 or another terminal device connected to the character recognition system 10, for example, by an operator's operation.
  • the image extraction unit 12 extracts a preprint image corresponding to the image, acquired by the acquisition unit 11, of the characters written on the preprint.
  • the image extraction unit 12 extracts a preprint image from the form data stored in the storage unit 16, for example.
  • the form data includes, for example, an image of a form and definition data.
  • the definition data includes, for example, information about the items to be written on the form and the position, on the form, of the preprint corresponding to each item. The information on the position of the preprint is, for example, information indicating the range where the preprint is printed on the form.
  • the items to be described include, for example, one or more of name, postal code, address, telephone number, age, personal identification number, account number, amount, and date. The items to be described are not limited to the above examples.
  • the image extraction unit 12 identifies the position of the preprint on the form, for example, based on the information on the position of the preprint included in the definition data. Then, the image extraction unit 12 extracts a preprint image from the image stored in the storage unit 16 by cutting out the image at the specified preprint position. The image extraction unit 12 may extract the preprint image from the image of the form with no characters written on the preprint, which is acquired by the acquisition unit 11.
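As a rough illustration of this extraction step, the preprint position from the definition data can be used to slice the region out of the form image. The sketch below assumes a grayscale numpy image and a hypothetical `region` dictionary standing in for one definition-data entry; it is not the patent's actual implementation.

```python
import numpy as np

def extract_preprint_image(form_image: np.ndarray, region: dict) -> np.ndarray:
    """Cut out the preprint region from a form image.

    `region` is a hypothetical stand-in for one definition-data entry:
    the range where the preprint is printed on the form.
    """
    y, x = region["top"], region["left"]
    h, w = region["height"], region["width"]
    return form_image[y:y + h, x:x + w].copy()

# Toy blank-form image (grayscale) with a dark frame drawn at (40, 100).
form = np.full((200, 400), 255, dtype=np.uint8)
form[40:80, 100:300] = 0  # pretend this is the printed frame

crop = extract_preprint_image(
    form, {"top": 40, "left": 100, "height": 40, "width": 200})
print(crop.shape)  # (40, 200)
```

In practice the same slicing would be applied both to a blank form (to obtain the preprint image) and to a filled-in form (to obtain the image with characters written on the preprint).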
  • the recognition unit 13 uses the recognition model to recognize the characters in the image from the image of the characters written on the preprint acquired by the acquisition unit 11 and the preprint image.
  • the recognition model is a learning model that recognizes the characters written on the preprint from an image of the characters written on the preprint and the preprint image.
  • the recognition unit 13 inputs, for example, the image of the characters written on the preprint and the preprint image acquired by the acquisition unit 11 into the recognition model. Then, the recognition unit 13 recognizes the characters written on the preprint using the recognition model.
  • the recognition unit 13 may recognize characters written on the preprint using a preprint image extracted in advance. Further, the recognition unit 13 may recognize characters written on the preprint using a preprint image generated in advance as an image of the preprint portion.
  • the recognition unit 13 uses, for example, a preprint image stored in the storage unit 16 to recognize characters written on the preprint.
  • the recognition unit 13 extracts an image showing the characters written on the preprint by specifying the position of the preprint, for example, based on the information on the position of the preprint included in the definition data. Then, the recognition unit 13 uses the recognition model to recognize the characters written on the preprint from the extracted image and the preprint image extracted by the image extraction unit 12.
  • the recognition unit 13 combines an image of the characters written on the preprint and the preprint image into one data and inputs the data into the recognition model.
  • combining an image of characters written on a preprint with a preprint image means generating image data by superimposing the two images. If the image with characters written on the preprint and the preprint image each have three RGB channels per pixel, the recognition unit 13, for example, combines the data of the two images into a single image with six channels per pixel. Then, the recognition unit 13 inputs the combined six-channel image data into the recognition model.
  • the recognition unit 13 combines, for example, an image of characters written on a preprint and a preprint image based on preset conditions.
  • the recognition unit 13, for example, combines the two images by overlapping the image of the characters written on the preprint, extracted at the same size as the preprint image, with the preprint image, aligning them on the outer periphery of the preprint image.
  • the recognition unit 13 combines image data of corresponding pixels. Then, the recognition unit 13 inputs the combined data into a recognition model and recognizes the characters written on the preprint.
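The per-pixel combination described above amounts to channel-wise stacking of the two aligned RGB images. The sketch below is a minimal illustration of that idea; the function name `combine_images` and the array shapes are assumptions for the example, not taken from the patent.

```python
import numpy as np

def combine_images(written: np.ndarray, preprint: np.ndarray) -> np.ndarray:
    """Stack an H x W x 3 written image and an H x W x 3 preprint image
    into a single H x W x 6 array, combining corresponding pixels."""
    assert written.shape == preprint.shape and written.shape[-1] == 3
    return np.concatenate([written, preprint], axis=-1)

written = np.zeros((32, 64, 3), dtype=np.uint8)       # image with characters
preprint = np.full((32, 64, 3), 255, dtype=np.uint8)  # preprint only
combined = combine_images(written, preprint)
print(combined.shape)  # (32, 64, 6)
```

The resulting six-channel array is what would be fed to the recognition model in place of a plain three-channel image.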
  • the recognition unit 13 may recognize characters other than those written on the preprint in the image of the form. For example, the recognition unit 13 may identify the type of the form from the image of the form acquired by the acquisition unit 11. Then, the recognition unit 13 recognizes the characters written on the preprint by specifying the position of the preprint based on the definition data included in the form data corresponding to the specified type of form. The recognition unit 13 identifies the type of the form, for example, by recognizing the form name or form number printed on the form in the image of the form. The relationship between the form name or form number printed on the form and the type of form is set in advance. Further, the recognition model used by the recognition unit 13 may be a learning model generated outside the character recognition system 10.
  • FIG. 8 is a diagram showing an example of an image of characters written on a preprint.
  • FIG. 8 differs from the example of the image in FIG. 3 in the aspect of the preprint.
  • the preprint of the example image in FIG. 8 differs from the example image in FIG. 3 in the thickness and type of lines, for example.
  • FIG. 9 is a preprint image for the example image of FIG. 8. In the example image shown in FIG. 8, the characters "13758047" are handwritten on the preprint shown in FIG. 9.
  • the recognition model outputs "13758047" as a recognition result when the example image in FIG. 8 and the example image in FIG. 9 are input. For example, even if the recognition model is a learning model generated using the preprint of the example image in FIG.
  • the recognition model can recognize characters written on a preprint that has not been trained.
  • FIG. 10 is a diagram showing an example of an image in which characters describing the year in Western calendar notation are printed on a preprint.
  • the characters "A.D.” and “Year” are printed in advance within the frame of the preprint.
  • the characters "2022” are handwritten on the preprint image.
  • FIG. 11 is a preprint image in the example of the image in FIG. 10.
  • the recognition model outputs "2022" as a recognition result when the example image in FIG. 10 and the example image in FIG. 11 are input.
  • FIG. 12 shows an example of an image in which, in the image example of FIG. 10, the upper two digits "20" of the year in Western calendar notation are printed in advance as part of the preprint. That is, in the example image of FIG. 12, "A.D.", "20", and "Year" are printed in advance as the preprint. In the example image of FIG. 12, the "22" of "2022" is handwritten on the preprint.
  • FIG. 13 is a preprint image in the example of the image of FIG. 12.
  • the recognition model outputs "22" as a recognition result when the example image in FIG. 12 and the example image in FIG. 13 are input. For example, even if the recognition model is a learning model that does not use the preprints of the example images of FIGS.
  • the recognition model can recognize characters written on various forms of preprints by inputting images of the characters written on the preprints and preprint images.
  • the recognition model can also be used for preprints with different frame shapes and colors. Recognition can be performed in the same way even if the model has not been trained with preprints of every aspect as learning data.
  • the output unit 14 outputs the recognition result by the recognition unit 13.
  • the output unit 14 outputs the characters recognized by the recognition unit 13 to the information processing server 30, for example.
  • the output unit 14 outputs, for example, an item corresponding to the preprint and the recognized characters in association with each other.
  • when the recognition target is an account number, as in the example image of FIG. 3, the output unit 14 outputs, for example, information indicating that the result is an account number in association with the recognized character string.
  • the output unit 14 may output the recognition result to a display device (not shown) connected to the character recognition system 10.
  • when a recognition model is generated in the character recognition system 10, the generation unit 15 performs processing related to the generation of the recognition model. The generation unit 15 learns the relationship between an image of the characters written on the preprint together with the preprint image, and the characters written on the preprint. Then, the generation unit 15 generates a recognition model that recognizes the characters in the image from the image of the characters written on the preprint and the preprint image.
  • the generation unit 15 generates a recognition model by learning the relationship between data obtained by combining an image of the characters written on the preprint with the preprint image, and the characters written on the preprint.
  • for example, the generation unit 15 combines the data of the two images into a single image with six channels per pixel. The generation unit 15 then generates a recognition model by learning the relationship between the combined six-channel image data and the characters written on the preprint.
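Accepting the combined input only requires the first layer of the network to take six input channels instead of three. The following minimal numpy sketch shows one hand-rolled 6-channel convolution step with a single hypothetical filter; it illustrates the input-shape change only and is not the patent's actual DNN architecture.

```python
import numpy as np

def conv2d_6ch(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2D convolution over a 6-channel input.

    x: (H, W, 6), kernel: (kH, kW, 6) -> output (H - kH + 1, W - kW + 1).
    """
    kh, kw, _ = kernel.shape
    h, w, _ = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sum over the window and over all six channels.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw, :] * kernel)
    return out

x = np.random.rand(8, 8, 6)   # combined written + preprint image
k = np.random.rand(3, 3, 6)   # one learnable 6-channel filter
feature_map = conv2d_6ch(x, k)
print(feature_map.shape)  # (6, 6)
```

In a real framework this is a one-line change (e.g. setting the input-channel count of the first convolution to 6); everything downstream of the first layer is unaffected.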
  • the generation unit 15 may perform learning using randomly shaped figures as preprints.
  • when using randomly shaped figures as preprints, the generation unit 15 generates, for example, an image of characters written on a randomly shaped figure and an image of the same figure without the characters, and generates a recognition model using those images as learning data.
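One way such random-figure learning data could be produced is to draw randomly sized rectangular frames on a blank canvas and then render characters over them. The generator below sketches only the figure-drawing half, under the assumption that frames are rows of rectangles; the patent does not specify how the random figures are shaped.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_preprint(h: int = 48, w: int = 160, n_frames: int = 8) -> np.ndarray:
    """Draw a row of randomly positioned rectangular frames on a white
    canvas, as a stand-in for a randomly shaped preprint figure."""
    img = np.full((h, w), 255, dtype=np.uint8)
    cell = w // n_frames
    top = int(rng.integers(2, h // 4))
    bottom = int(rng.integers(3 * h // 4, h - 2))
    for i in range(n_frames):
        left, right = i * cell + 1, (i + 1) * cell - 2
        img[top, left:right] = 0     # top edge
        img[bottom, left:right] = 0  # bottom edge
        img[top:bottom, left] = 0    # left edge
        img[top:bottom, right] = 0   # right edge
    return img

preprint = random_preprint()
print(preprint.shape)  # (48, 160)
```

Pairing each generated figure with (a) the same figure overlaid with rendered characters and (b) the character labels would yield one learning sample of the kind described above.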
  • the generation unit 15 generates a recognition model by deep learning using, for example, DNN (Deep Neural Network).
  • Machine learning algorithms for generating recognition models are not limited to deep learning using DNN.
  • the storage unit 16 stores, for example, a recognition model used by the recognition unit 13 to recognize characters in an image.
  • the storage unit 16 stores, for example, preprint images.
  • the storage unit 16 stores, for example, form data.
  • the form data includes, for example, image data of a form and definition data.
  • the form data may include a preprint image extracted in advance.
  • the storage unit 16 stores, for example, an image of characters written on a preprint, a preprint image, and characters written on the preprint as learning data.
  • the recognition model used by the recognition unit 13 may be stored in a storage means other than the storage unit 16.
  • the scanner 20, for example, optically reads a form and generates an image of the form.
  • the scanner 20 then outputs the image of the form to the character recognition system 10.
  • the scanner 20 may extract the image of the preprint portion from among the images of the form.
  • the scanner 20 outputs the extracted preprint image to the character recognition system 10.
  • the scanner 20 may generate an image of the form by photographing the form.
  • the information processing server 30 acquires, for example, the recognition results of characters written on the form from the character recognition system 10.
  • the information processing server 30 uses the recognition results to perform processing according to the purpose.
  • the information processing server 30 uses the recognition results, for example, in processing related to application and deposit/withdrawal related to account management at a financial institution.
  • the information processing server 30 may use the recognition results, for example, to process application documents in government offices, educational institutions, hospitals, or transportation facilities.
  • the information processing server 30 may use the recognition results for slip processing at a company. Further, the information processing server 30 may use the identification results for managing the goods in distribution. Examples of identification results are not limited to the above.
  • FIG. 14 is a diagram showing an example of an operation flow when the character recognition system 10 recognizes characters written on a preprint.
  • the acquisition unit 11 acquires an image showing the characters written on the preprint (step S11).
  • the acquisition unit 11 acquires, for example, from the scanner 20 an image of a form showing characters written on the preprint.
  • the image extraction unit 12 extracts a preprint image corresponding to the image acquired by the acquisition unit 11 (step S12).
  • the image extraction unit 12 extracts, for example, a preprint image corresponding to the image acquired by the acquisition unit 11 from the data stored in the storage unit 16.
  • the recognition unit 13 uses the recognition model to recognize characters in the image from the image acquired by the acquisition unit 11 and the preprint image (step S13).
  • the recognition model recognizes the characters written on the preprint from the image of the characters written on the preprint and the preprint image.
  • when the characters in the image are recognized, the output unit 14 outputs the recognition result (step S14). The output unit 14 outputs the recognition result to the information processing server 30, for example.
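The operation flow of steps S11 to S14 can be sketched as a thin pipeline. The stub `recognize_with_model` below only reports the combined input shape in place of a trained model; all names here are hypothetical illustrations of the flow, not the patent's implementation.

```python
import numpy as np

def recognize_with_model(written: np.ndarray, preprint: np.ndarray) -> str:
    """Stand-in for the trained recognition model (a real model would map
    the 6-channel input to a character string)."""
    combined = np.concatenate([written, preprint], axis=-1)
    return f"input shape {combined.shape}"

def run_recognition_flow(written: np.ndarray, preprint: np.ndarray) -> str:
    # Step S11: the acquisition unit acquires the written image.
    # Step S12: the image extraction unit provides the matching preprint image.
    # Step S13: the recognition unit feeds both into the recognition model.
    result = recognize_with_model(written, preprint)
    # Step S14: the output unit outputs the recognition result.
    return result

written = np.zeros((32, 64, 3), dtype=np.uint8)
preprint = np.zeros((32, 64, 3), dtype=np.uint8)
print(run_recognition_flow(written, preprint))  # input shape (32, 64, 6)
```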
  • FIG. 15 is a diagram showing an example of an operation flow when the character recognition system 10 generates a recognition model.
  • the acquisition unit 11 acquires, as learning data, an image of the characters written on the preprint, the preprint image, and the characters written on the preprint (step S21).
  • upon acquiring the learning data, the generation unit 15 learns the relationship between the image of the characters written on the preprint together with the preprint image, and the characters written on the preprint, and generates a recognition model (step S22). For example, the generation unit 15 combines the image of characters written on the preprint with the preprint image. Then, the generation unit 15 learns the relationship between the combined data and the characters written on the preprint, included as correct data in the learning data, and generates a recognition model.
  • after generating the recognition model, the generation unit 15 saves the generated recognition model (step S23).
  • the generation unit 15 stores the generated recognition model in the storage unit 16, for example.
  • the character recognition system 10 of the form processing system of this embodiment uses a recognition model to recognize characters written on a preprint from an image of the characters written on the preprint and a preprint image.
  • by further using the preprint image in addition to the written image showing the characters to be recognized, the character recognition system 10 can suppress the influence of the preprint on the recognition of the characters written on it. As a result, the character recognition system 10 can improve the accuracy of recognizing characters written on preprints.
  • the recognition model used by the character recognition system 10 takes as input an image of the characters written on the preprint and the preprint image, and can therefore recognize characters written on a preprint of an aspect that was not used in learning. Therefore, by inputting an image showing the characters written on the preprint together with the preprint image, the character recognition system 10 can recognize characters written on preprints of various aspects. Furthermore, when generating a recognition model, the character recognition system 10 does not need to prepare learning data for each aspect of preprint actually used for recognition.
  • since the character recognition system 10 does not need to learn data for each aspect of preprint actually used for recognition when generating a recognition model, the amount of learning required to generate the model can be suppressed. The computer resources necessary for generating a recognition model can therefore also be suppressed, and the character recognition system 10 can generate a recognition model efficiently.
  • By using randomly shaped figures as preprints in the learning data, the character recognition system 10 can generate a recognition model capable of recognizing characters written on a wide variety of preprint images. That is, with a recognition model generated using randomly shaped figures as preprints, the character recognition system 10 can accurately recognize the written characters even when the shape of the preprint image differs from form to form.
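The idea of producing randomly shaped figures for use as training preprints can be sketched as follows. This is a hypothetical illustration: the shape type (rectangles), the list-of-rows grayscale image representation, and all parameter names are assumptions not taken from the specification.

```python
import random

def random_preprint(width, height, n_shapes, seed=None):
    """Generate a blank image and stamp randomly placed rectangles on it.

    A hypothetical sketch of producing randomly shaped preprint figures
    for learning data; the specification does not fix the shape type or
    parameters used.
    """
    rng = random.Random(seed)
    # Image as a list of rows of grayscale pixel values (0 = background)
    img = [[0] * width for _ in range(height)]
    for _ in range(n_shapes):
        x0 = rng.randrange(width)
        y0 = rng.randrange(height)
        x1 = min(width - 1, x0 + rng.randrange(1, width))
        y1 = min(height - 1, y0 + rng.randrange(1, height))
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                img[y][x] = 255  # preprint pixel
    return img
```

Pairing such generated figures with character images would give training samples whose preprints differ from any real form, matching the idea that the model need not see the actual preprints used in recognition.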
  • Because the character recognition system 10 of this embodiment recognizes characters by inputting to the recognition model data that combines an image showing characters written on a preprint with the preprint image, there is no need to erase the preprint as preprocessing for recognition. Since no preprint-erasing process is performed, its potential influence on character recognition is also avoided. The character recognition system 10 of this embodiment can therefore improve recognition accuracy while limiting the resources needed to recognize characters written on preprints.
  • FIG. 16 is a diagram showing an example of the configuration of the form processing system of this embodiment.
  • the form processing system includes, for example, a character recognition system 40, a scanner 20, and an information processing server 30.
  • the character recognition system 40 is connected to the scanner 20 via a network, for example. Further, the character recognition system 40 is connected to the information processing server 30 via a network.
  • The character recognition system 10 of the first embodiment inputs, for example, data that combines an image showing characters written on a preprint with the preprint image into a recognition model, and recognizes the characters on the preprint. The character recognition system 10 then outputs the recognition result.
  • The character recognition system 40 of the present embodiment improves the accuracy with which the two images are superimposed when combining an image showing characters written on a preprint with the preprint image.
  • To do so, a conversion model is used to transform the preprint image before the two images are combined.
  • the conversion model is a learning model that estimates conversion parameters used when performing conversion processing on a preprint image.
  • FIG. 17 is a diagram showing an example of the configuration of the character recognition system 40.
  • the character recognition system 40 includes an acquisition section 11 , an image extraction section 12 , a recognition section 41 , an output section 14 , a generation section 42 , and a storage section 16 .
  • the recognition unit 41 also includes a conversion unit 51 and an image recognition unit 52.
  • The configurations and functions of the acquisition unit 11, image extraction unit 12, output unit 14, and storage unit 16 of the character recognition system 40 are the same as those of the acquisition unit 11, image extraction unit 12, output unit 14, and storage unit 16 of the character recognition system 10 of the first embodiment, respectively.
  • the conversion unit 51 of the recognition unit 41 converts the preprint image using, for example, a conversion model.
  • the transformation model for example, performs an affine transformation on the preprint image.
  • The recognition unit 41 transforms the preprint image so that it overlaps the image it is to be superimposed on, for example by rotating, resizing, and translating the preprint image.
  • the transformation model estimates transformation parameters used when rotating, adjusting size, and translating a preprint image, for example.
  • The conversion unit 51 uses the conversion model to estimate affine transformation parameters from data that combines the image showing the characters written on the preprint with the preprint image according to preset conditions. The conversion unit 51 then applies an affine transformation to the preprint image using the estimated parameters. As the preset condition, the conversion unit 51 combines the two images so that, for example, the outer periphery of the preprint image is aligned with that of the image showing the written characters. The conversion unit 51 then uses the conversion model to estimate the transformation parameters from the data combined under these preset conditions.
  • The transformation parameters are parameters for transforming the preprint image so that the superimposition accuracy is higher than when the images are combined only under the preset conditions. After estimating the transformation parameters, the conversion unit 51 applies the affine transformation to the preprint image using them, thereby improving the superimposition accuracy.
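For illustration, the rotation, resizing, and translation described above can be expressed as an affine transformation of pixel coordinates. The following is a hypothetical sketch; the function and parameter names are assumptions, not taken from the specification.

```python
import math

def affine_transform(point, angle_rad, scale, tx, ty):
    """Apply rotation, uniform scaling, and translation to a 2D point.

    These correspond to the kinds of parameters the conversion model is
    described as estimating (rotation, size adjustment, parallel movement).
    """
    x, y = point
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    # Rotation-and-scale matrix applied first, then translation
    new_x = scale * (cos_a * x - sin_a * y) + tx
    new_y = scale * (sin_a * x + cos_a * y) + ty
    return (new_x, new_y)
```

Applying this mapping to every pixel coordinate of the preprint image (with interpolation, in a real implementation) would realize the transformation the conversion unit 51 performs.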
  • The conversion model is, for example, a learning model that uses a DNN architecture called a Spatial Transformer Network (STN).
  • The image transformation method using STNs is described, for example, in Max Jaderberg et al., "Spatial Transformer Networks", NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems, Volume 2, December 2015, pp. 2017-2025.
  • the image recognition unit 52 of the recognition unit 41 uses a recognition model to recognize the characters written on the preprint from the image of the characters written on the preprint and the preprint image.
  • The image recognition unit 52 combines the image showing the characters written on the preprint with the preprint image that the conversion unit 51 has affine-transformed. The image recognition unit 52 then uses the recognition model to recognize the characters written on the preprint from the combined data.
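One plausible reading of "combining" the two images for the recognition model's input is a channel-wise pairing of pixel values, sketched below. The list-of-rows grayscale image representation is an assumption made for illustration.

```python
def combine_images(written_img, preprint_img):
    """Combine a written-character image and a preprint image channel-wise.

    Each image is assumed to be a list of rows of grayscale pixel values;
    the combined result pairs the two pixel values at each position. This
    is one plausible interpretation of combining the images, not the
    specification's definitive method.
    """
    if len(written_img) != len(preprint_img):
        raise ValueError("images must have the same height")
    combined = []
    for w_row, p_row in zip(written_img, preprint_img):
        if len(w_row) != len(p_row):
            raise ValueError("images must have the same width")
        combined.append([(w, p) for w, p in zip(w_row, p_row)])
    return combined
```

The recognition model would then receive both the written strokes and the (transformed) preprint at every pixel, letting it separate the two during recognition.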
  • the conversion model and the recognition model may be learning models generated outside the character recognition system 40.
  • FIG. 18 is a diagram schematically showing the flow of processing when the recognition unit 41 recognizes characters written on a preprint.
  • The conversion unit 51 first combines, for example, the image showing the characters written on the preprint with the preprint image according to preset conditions.
  • the preset conditions are, for example, set so that the outer peripheries of the two images are aligned.
  • the conversion unit 51 estimates affine transformation parameters using the transformation model. Then, the conversion unit 51 performs affine transformation on the preprint image using the estimated affine transformation parameters.
  • the converting unit 51 outputs the preprint image that has undergone affine transformation to the image recognizing unit 52.
  • the image recognition unit 52 combines the image in which characters are written on the preprint and the image that has been subjected to affine transformation.
  • the image recognition unit 52 uses the recognition model to recognize characters written on the preprint from the combined data.
  • In one case, the character recognition system 40 generates only the recognition model out of the conversion model and the recognition model. In this case, a learning model generated outside the character recognition system 40 is used as the conversion model, for example.
  • When generating only the recognition model, the generation unit 42 combines, for example, the image showing the characters written on the preprint and the preprint image included in the learning data, using the conversion model. The generation unit 42 then learns the relationship between the combined data and the characters written on the preprint, which are included in the learning data as correct data, and generates the recognition model.
  • The generation unit 42 stores the generated recognition model in the storage unit 16.
  • the character recognition system 40 may generate both a conversion model and a recognition model.
  • When generating both models, the generation unit 42 uses the conversion model to estimate the transformation parameters from data that combines the image showing the characters written on the preprint with the preprint image according to preset conditions. The generation unit 42 also uses the recognition model to recognize the characters written on the preprint from the combined data.
  • the generation unit 42 updates the parameters of the transformation model so that the difference between the affine transformation parameters estimated by the transformation model and the affine transformation parameters included in the learning data becomes smaller.
  • the generation unit 42 also updates the parameters of the recognition model so that the difference between the identification result and the correct data becomes smaller.
  • The generation unit 42 repeats the above process using the updated models. For example, the generation unit 42 generates the conversion model and the recognition model by repeating the above process until the accuracy of the conversion model's parameter estimates and of the recognition model's recognition results satisfies a preset standard. The generation unit 42 stores the generated conversion model and recognition model in the storage unit 16, for example.
  • FIG. 19 is a diagram showing an example of an operation flow when the character recognition system 40 recognizes characters written on a preprint.
  • the acquisition unit 11 acquires an image showing the characters written on the preprint (step S31).
  • the acquisition unit 11 acquires, for example, from the scanner 20 an image of a form showing characters written on the preprint.
  • the image extraction unit 12 extracts a preprint image corresponding to the image acquired by the acquisition unit 11 (step S32).
  • the image extraction unit 12 extracts, for example, a preprint image corresponding to the image acquired by the acquisition unit 11 from the data stored in the storage unit 16.
  • the conversion unit 51 of the recognition unit 41 uses the conversion model to estimate conversion parameters to be used when converting the preprint image. Then, the conversion unit 51 converts the preprint image using the estimated conversion parameters (step S33).
  • the image recognition unit 52 combines the image with the characters written on the preprint and the converted preprint image. Then, the image recognition unit 52 uses the recognition model to recognize characters in the image from the combined data (step S34).
  • When the characters in the image are recognized, the output unit 14 outputs the recognition result (step S35). The output unit 14 outputs the recognition result to the information processing server 30, for example.
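The operation flow of steps S31 to S35 can be sketched as a single pipeline function. The callable parameters below stand in for the conversion model and the recognition model; their names and signatures are assumptions made for illustration, not taken from the specification.

```python
def recognize_form(written_img, form_id, preprint_store,
                   estimate_params, convert, recognize):
    """Sketch of the operation flow in FIG. 19 (steps S31-S35)."""
    # S32: extract the preprint image corresponding to the acquired image
    preprint_img = preprint_store[form_id]
    # S33: estimate transformation parameters and convert the preprint image
    params = estimate_params(written_img, preprint_img)
    converted = convert(preprint_img, params)
    # S34: combine the two images and recognize the characters
    combined = (written_img, converted)
    result = recognize(combined)
    # S35: output (here, return) the recognition result
    return result
```

In the actual system the acquired image (S31) would come from the scanner 20 and the result would be sent to the information processing server 30; here both ends are reduced to function arguments and a return value.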
  • FIG. 20 is a diagram showing an example of an operation flow when the character recognition system 40 generates only a recognition model.
  • the acquisition unit 11 acquires, as learning data, an image of the characters written on the preprint, the preprint image, and the characters written on the preprint (step S41).
  • the generation unit 42 uses the conversion model to estimate conversion parameters to be used when converting the preprint image. Then, the generation unit 42 converts the preprint image using the estimated conversion parameters and the conversion model (step S42).
  • After converting the preprint image, the generation unit 42 combines the image showing the characters written on the preprint with the converted preprint image. The generation unit 42 then learns the relationship between the combined data and the characters written on the preprint, and generates the recognition model (step S43).
  • After generating the recognition model, the generation unit 42 saves it (step S44).
  • the generation unit 42 stores the generated recognition model in the storage unit 16, for example.
  • FIG. 21 is a diagram showing an example of an operation flow when the character recognition system 40 generates a conversion model and a recognition model.
  • the acquisition unit 11 acquires, as learning data, data obtained by combining an image of the characters written on the preprint and the preprint image, a conversion parameter, and the characters written on the preprint (step S51 ).
  • When the learning data is acquired, the generation unit 42 generates the conversion model by learning the relationship between the combined data included in the learning data, which combines the image showing the characters written on the preprint with the preprint image, and the transformation parameters included in the learning data. In addition, the generation unit 42 generates the recognition model by learning the relationship between the combined data and the characters written on the preprint (step S52).
  • After generating the conversion model and the recognition model, the generation unit 42 saves them (step S53).
  • the generation unit 42 stores the generated conversion model and recognition model in the storage unit 16, for example.
  • The character recognition system 40 of this embodiment uses a conversion model when combining an image showing characters written on a preprint with the preprint image, and uses a recognition model to recognize the written characters from the combined data. By using a preprint image transformed with the conversion model, the character recognition system 40 can improve the superimposition accuracy when the two images are combined. Using data combined in this way, the recognition model recognizes the written characters while variations in the misalignment between the image showing the written characters and the preprint image are suppressed. Recognizing the characters under suppressed misalignment in this manner improves the accuracy of recognizing characters written on preprints.
  • Furthermore, by generating the conversion model with learning data that reflects the misalignment between the image showing the written characters and the preprint image that can occur in actual use, the character recognition system 40 can suppress variations in this misalignment under real usage conditions. When the conversion model is generated from such learning data, the character recognition system 40 can therefore further improve the accuracy of recognizing characters written on preprints.
  • FIG. 22 shows an example of the configuration of a computer 200 that executes a computer program that performs each process in the character recognition system 10 of the first embodiment and the character recognition system 40 of the second embodiment.
  • the computer 200 includes a CPU (Central Processing Unit) 201, a memory 202, a storage device 203, an input/output I/F (Interface) 204, and a communication I/F 205.
  • the CPU 201 reads computer programs for performing each process from the storage device 203 and executes them.
  • the CPU 201 may be configured by a combination of multiple CPUs. Further, the CPU 201 may be configured by a combination of a CPU and other types of processors. For example, the CPU 201 may be configured by a combination of a CPU and a GPU (Graphics Processing Unit).
  • the memory 202 is configured with a DRAM (Dynamic Random Access Memory) or the like, and temporarily stores computer programs executed by the CPU 201 and data being processed.
  • the storage device 203 stores computer programs executed by the CPU 201.
  • the storage device 203 is configured by, for example, a nonvolatile semiconductor storage device. Other storage devices such as a hard disk drive may be used as the storage device 203.
  • The input/output I/F 204 is an interface that receives input from an operator and outputs display data and the like.
  • the communication I/F 205 is an interface that transmits and receives data between the scanner 20 and the information processing server 30. Furthermore, the information processing server 30 may also have a similar configuration.
  • The computer program used to execute each process can also be stored and distributed on a computer-readable recording medium that records data non-transitorily.
  • a computer-readable recording medium for example, a magnetic tape for data recording or a magnetic disk such as a hard disk can be used.
  • an optical disc such as a CD-ROM (Compact Disc Read Only Memory) can also be used.
  • a nonvolatile semiconductor memory device may be used as the recording medium.
  • Reference signs: 10 Character recognition system; 11 Acquisition unit; 12 Image extraction unit; 13 Recognition unit; 14 Output unit; 15 Generation unit; 16 Storage unit; 20 Scanner; 30 Information processing server; 40 Character recognition system; 41 Recognition unit; 42 Generation unit; 51 Conversion unit; 52 Image recognition unit; 100 Computer; 101 CPU; 102 Memory; 103 Storage device; 104 Input/output I/F; 105 Communication I/F

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to a character recognition system that includes an acquisition unit, a recognition unit, and an output unit. The acquisition unit acquires an image showing a character written on a preprint of a form that includes the preprint. The recognition unit uses a recognition model, which recognizes a character written on a preprint from an image showing the character written on the preprint and a preprint image showing the preprint, to recognize, from the acquired image and the preprint image, the character written on the preprint in the acquired image. The output unit outputs the recognition result.
PCT/JP2022/013389 2022-03-23 2022-03-23 Character recognition system, character recognition method, and recording medium WO2023181149A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/013389 WO2023181149A1 (fr) 2022-03-23 2022-03-23 Character recognition system, character recognition method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/013389 WO2023181149A1 (fr) 2022-03-23 2022-03-23 Character recognition system, character recognition method, and recording medium

Publications (1)

Publication Number Publication Date
WO2023181149A1 true WO2023181149A1 (fr) 2023-09-28

Family

ID=88100226

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/013389 WO2023181149A1 (fr) 2022-03-23 2022-03-23 Character recognition system, character recognition method, and recording medium

Country Status (1)

Country Link
WO (1) WO2023181149A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05266247A (ja) * 1992-03-19 1993-10-15 Toshiba Corp Image data processing system
JP2007148846A (ja) * 2005-11-29 2007-06-14 NEC Corp OCR device, form-out method, and form-out program
JP2021043650A (ja) * 2019-09-10 2021-03-18 Canon Inc. Image processing apparatus, image processing system, image processing method, and program

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05266247A (ja) * 1992-03-19 1993-10-15 Toshiba Corp Image data processing system
JP2007148846A (ja) * 2005-11-29 2007-06-14 NEC Corp OCR device, form-out method, and form-out program
JP2021043650A (ja) * 2019-09-10 2021-03-18 Canon Inc. Image processing apparatus, image processing system, image processing method, and program

Similar Documents

Publication Publication Date Title
US20190279170A1 (en) Dynamic resource management associated with payment instrument exceptions processing
US9652671B2 (en) Data lifting for exception processing
US9342741B2 (en) Systems, methods and computer program products for determining document validity
CA2502811C (fr) Systeme et procede de capture, de stockage et de traitement de recepisses et de donnees associees
US9098765B2 (en) Systems and methods for capturing and storing image data from a negotiable instrument
US9824288B1 (en) Programmable overlay for negotiable instrument electronic image processing
JP2008259156A (ja) 情報処理装置、情報処理システム、情報処理方法、プログラムおよび記録媒体
US10528807B2 (en) System and method for processing and identifying content in form documents
US10229395B2 (en) Predictive determination and resolution of a value of indicia located in a negotiable instrument electronic image
CA2619873A1 (fr) Integration de flux de travaux pour guichet et arriere-guichet
US9031308B2 (en) Systems and methods for recreating an image using white space and check element capture
US20160379186A1 (en) Element level confidence scoring of elements of a payment instrument for exceptions processing
US20170076070A1 (en) Methods for securely processing non-public, personal health information having handwritten data
WO2023181149A1 (fr) Character recognition system, character recognition method, and recording medium
JP2019191665A (ja) 財務諸表読取装置、財務諸表読取方法及びプログラム
JP2008005219A (ja) 帳票画像処理システム
CN110135218A (zh) 用于识别图像的方法、装置、设备和计算机存储介质
US20150120548A1 (en) Data lifting for stop payment requests
KR101516684B1 (ko) Ocr을 이용한 문서 변환 서비스 방법
JP5998090B2 (ja) 画像照合装置、画像照合方法、画像照合プログラム
US20150120517A1 (en) Data lifting for duplicate elimination
CN101727572A (zh) 使用文档特征来确保图像完整性
US11055528B2 (en) Real-time image capture correction device
US10115081B2 (en) Monitoring module usage in a data processing system
Bogahawatte et al. Online Digital Cheque Clearance and Verification System using Block Chain

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22933295

Country of ref document: EP

Kind code of ref document: A1