WO2022190900A1 - Image processing apparatus, program, and system - Google Patents

Image processing apparatus, program, and system

Info

Publication number
WO2022190900A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
processor
character
string
information processing
Prior art date
Application number
PCT/JP2022/007871
Other languages
French (fr)
Japanese (ja)
Inventor
杜朗 鳥居
泰弘 大川
一隆 朝日
和也 藤井
翔太 永渕
和久 吉田
裕之 堺
崇 青木
琢磨 赤木
賢太郎 瀬崎
昌孝 佐藤
瑛央 高田
Original Assignee
株式会社 東芝
東芝インフラシステムズ株式会社
Application filed by 株式会社 東芝 and 東芝インフラシステムズ株式会社
Publication of WO2022190900A1 publication Critical patent/WO2022190900A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/24 Character recognition characterised by the processing or recognition method
    • G06V30/242 Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244 Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/42 Document-oriented image-based pattern recognition based on the type of document
    • G06V30/424 Postal images, e.g. labels or addresses on parcels or postal envelopes

Definitions

  • the embodiments of the present invention relate to information processing devices, programs and systems.
  • a system uses a cloud server to perform character recognition on images acquired by terminals.
  • the terminal sends the acquired image to the cloud server and acquires the result of character recognition from the cloud server.
  • such a system transfers a large amount of data because it must send images from the terminal to the cloud server.
  • in order to solve the above problem, an information processing device, a program, and a system that can reduce the amount of data transferred to the device that performs character recognition are provided.
  • an information processing device includes an image interface, a communication interface, and a processor.
  • An image interface acquires a captured image containing a character string.
  • the communication interface connects to external devices.
  • the processor generates intermediate information composed of information generated in the course of character recognition processing from the captured image, and transmits the intermediate information to the external device through the communication interface.
  • FIG. 1 is a block diagram showing a configuration example of a recognition system according to an embodiment.
  • FIG. 2 is a block diagram showing a configuration example of the OCR device according to the embodiment.
  • FIG. 3 is a block diagram showing a configuration example of a server according to the embodiment.
  • FIG. 4 is a diagram illustrating an operation example of the OCR device according to the embodiment.
  • FIG. 5 is a diagram illustrating a configuration example of intermediate information according to the embodiment.
  • FIG. 6 is a diagram illustrating an operation example of the server according to the embodiment.
  • FIG. 7 is a flowchart illustrating an operation example of the OCR device according to the embodiment.
  • FIG. 8 is a flowchart illustrating an operation example of the server according to the embodiment.
  • FIG. 9 is a diagram showing another operation example of the OCR device according to the embodiment.
  • the recognition system recognizes a character string from an image using character recognition processing (OCR (Optical Character Recognition) processing).
  • the recognition system recognizes the destination of the parcel from an image such as a slip attached to the parcel.
  • a recognition system sorts the packages based on the recognized destination.
  • FIG. 1 shows a configuration example of a recognition system 1 according to an embodiment.
  • the recognition system 1 includes a sorting device 2, a camera 3, a network 6, an OCR device 10, a server 20, and the like.
  • the OCR device 10 connects to the sorting device 2 and camera 3 . Also, the OCR device 10 and the server 20 are connected to the network 6 .
  • the recognition system 1 may further include a configuration according to need, or a specific configuration may be excluded from the recognition system 1.
  • the sorting device 2 sorts the packages thrown in by an operator, a conveyor belt, a robot, or the like.
  • the sorting device 2 receives destination information (character string information) related to the destination (character string) of the parcel from the OCR device 10 .
  • the sorting device 2 sorts packages based on the destination information. For example, the sorting device 2 sorts packages into chutes, pockets, carts, trays, or the like as sorting destinations.
  • the sorting device 2 is composed of a sorter, a conveyor belt, a robot, or the like.
  • the camera 3 shoots the packages that are put into the sorting device 2.
  • the camera 3 photographs the surface on which the destination is displayed.
  • the camera 3 takes an image of the side to which the slip is attached.
  • the camera 3 supplies the captured image (captured image) to the OCR device 10 .
  • the camera 3 is a CCD (Charge Coupled Device) camera.
  • the camera 3 may be provided with a light source for illuminating the baggage.
  • the OCR device 10 (information processing device, first information processing device, external device) acquires the captured image from the camera 3 .
  • the OCR device 10 generates intermediate information related to OCR processing from the captured image.
  • the OCR device 10 transmits the intermediate information to the server 20 and receives from the server 20 destination information related to the destination of the parcel shown in the captured image.
  • the OCR device 10 inputs the received destination information to the sorting device 2 .
  • the OCR device 10 and intermediate information will be detailed later.
  • the network 6 relays communication between the OCR device 10 and the server 20.
  • network 6 is the Internet.
  • the server 20 (information processing device, second information processing device, external device) receives intermediate information from the OCR device 10 .
  • the server 20 generates destination information based on the received intermediate information.
  • the server 20 supplies the generated destination information to the OCR device 10 .
  • the server 20 will be detailed later.
  • FIG. 2 is a block diagram showing a configuration example of the OCR device 10 according to the embodiment.
  • the OCR device 10 includes a processor 11, a ROM 12, a RAM 13, an NVM 14, a communication section 15, an operation section 16, a display section 17, a sorting device interface 18, a camera interface 19, and the like.
  • the processor 11, ROM 12, RAM 13, NVM 14, communication section 15, operation section 16, display section 17, sorting device interface 18 and camera interface 19 are connected to each other via a data bus or the like. It should be noted that the OCR device 10 may have a configuration other than the configuration shown in FIG. 2 as necessary, or a specific configuration may be excluded from it.
  • the processor 11 (first processor) has a function of controlling the operation of the OCR device 10 as a whole.
  • Processor 11 may include an internal cache, various interfaces, and the like.
  • the processor 11 implements various processes by executing programs pre-stored in the internal memory, ROM 12 or NVM 14 .
  • processor 11 controls the functions performed by the hardware circuits.
  • the ROM 12 is a non-volatile memory in which control programs, control data, etc. are stored in advance.
  • the control program and control data stored in the ROM 12 are preinstalled according to the specifications of the OCR device 10 .
  • the RAM 13 is a volatile memory.
  • the RAM 13 temporarily stores data being processed by the processor 11 .
  • RAM 13 stores various application programs based on instructions from processor 11 .
  • the RAM 13 may store data necessary for executing the application program, execution results of the application program, and the like.
  • the NVM 14 is a non-volatile memory in which data can be written and rewritten.
  • the NVM 14 is composed of, for example, a HDD (Hard Disk Drive), SSD (Solid State Drive), flash memory, or the like.
  • the NVM 14 stores control programs, applications, various data, etc. according to the operational use of the OCR device 10 .
  • the communication unit 15 (communication interface, first communication interface) is an interface for connecting to the network 6 . That is, the communication unit 15 is an interface for transmitting/receiving data to/from the server 20 or the like through the network 6 .
  • the communication unit 15 is an interface that supports wired or wireless LAN (Local Area Network) connection.
  • the operation unit 16 receives inputs for various operations from the operator.
  • the operation unit 16 transmits a signal indicating the input operation to the processor 11 .
  • the operation unit 16 may be composed of a touch panel.
  • the display unit 17 displays image data from the processor 11 .
  • the display unit 17 is composed of a liquid crystal monitor.
  • the display section 17 may be formed integrally with the operating section 16 .
  • the sorting device interface 18 is an interface for connecting to the sorting device 2.
  • the sorting device interface 18 transmits signals (eg, destination information) from the processor 11 to the sorting device 2 .
  • the sorting device interface 18 also transmits signals from the sorting device 2 to the processor 11 .
  • the camera interface 19 (image interface) is an interface for connecting to the camera 3.
  • Camera interface 19 transmits signals from processor 11 to camera 3 .
  • the camera interface 19 also transmits signals (such as captured images) from the camera 3 to the processor 11 .
  • FIG. 3 is a block diagram showing a configuration example of the server 20 according to the embodiment.
  • the server 20 includes a processor 21, a ROM 22, a RAM 23, an NVM 24, a communication section 25, an operation section 26, a display section 27, and the like.
  • the processor 21, ROM 22, RAM 23, NVM 24, communication section 25, operation section 26 and display section 27 are connected to each other via a data bus or the like.
  • the server 20 may have a configuration other than the configuration shown in FIG. 3 as necessary, or may exclude a specific configuration from the server 20 .
  • the processor 21 (second processor) has a function of controlling the operation of the server 20 as a whole.
  • Processor 21 may include an internal cache, various interfaces, and the like.
  • the processor 21 implements various processes by executing programs pre-stored in the internal memory, ROM 22 or NVM 24 .
  • processor 21 controls the functions performed by the hardware circuits.
  • the ROM 22 is a non-volatile memory in which control programs, control data, etc. are stored in advance.
  • the control programs and control data stored in the ROM 22 are installed in advance according to the specifications of the server 20 .
  • the RAM 23 is a volatile memory.
  • the RAM 23 temporarily stores data being processed by the processor 21 .
  • RAM 23 stores various application programs based on instructions from processor 21 .
  • the RAM 23 may store data necessary for executing the application program, execution results of the application program, and the like.
  • the NVM 24 is a non-volatile memory in which data can be written and rewritten.
  • the NVM 24 is composed of, for example, an HDD, SSD, flash memory, or the like.
  • the NVM 24 stores control programs, applications, various data, etc. according to the operational use of the server 20 .
  • the communication unit 25 (communication interface, second communication interface) is an interface for connecting to the network 6. That is, the communication unit 25 is an interface for transmitting/receiving data to/from the OCR device 10 or the like via the network 6 .
  • the communication unit 25 is an interface that supports wired or wireless LAN connection.
  • the operation unit 26 receives inputs for various operations from the operator.
  • the operation unit 26 transmits a signal indicating the input operation to the processor 21 .
  • the operation unit 26 may be composed of a touch panel.
  • the display unit 27 displays image data from the processor 21 .
  • the display unit 27 is composed of a liquid crystal monitor.
  • the display section 27 may be formed integrally with the operating section 26 .
  • the functions realized by the OCR device 10 are realized by the processor 11 executing a program stored in the internal memory, the ROM 12, the NVM 14, or the like.
  • FIG. 4 is a diagram for explaining the functions realized by the OCR device 10.
  • the processor 11 has a function of acquiring a captured image (first step).
  • the processor 11 causes the camera 3 to photograph the package through the camera interface 19.
  • the processor 11 acquires the captured image 103 of the package from the camera 3 through the camera interface 19.
  • the processor 11 acquires parameters of the acquired captured image 103.
  • for example, the processor 11 acquires the size of the captured image 103.
  • the processor 11 has a function (second step) of extracting a package image, in which the package is shown, from the captured image.
  • the processor 11 extracts the package image 104 from the captured image 103 using predetermined image processing. For example, the processor 11 extracts the package image 104 by edge detection. The processor 11 may also extract the package image 104 using artificial intelligence such as a neural network.
  • the method by which the processor 11 extracts the package image 104 from the captured image 103 is not limited to a specific method.
  • the processor 11 acquires parameters of the extracted package image 104.
  • the processor 11 acquires the coordinates, size, angle (inclination), and color of the package image 104.
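The extraction step above can be sketched in miniature. The toy function below stands in for the edge detection or neural network the embodiment describes and simply bounds the non-background pixels of a small image grid; the background-value rule and all names are illustrative assumptions, not the patent's method.

```python
def extract_package_region(image, background=0):
    """Return the bounding box (x, y, width, height) of all
    non-background pixels, or None if none are present."""
    coords = [(x, y)
              for y, row in enumerate(image)
              for x, value in enumerate(row)
              if value != background]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys),
            max(xs) - min(xs) + 1,
            max(ys) - min(ys) + 1)

# A 6x4 toy frame: the "package" occupies columns 2-4 of rows 1-2.
frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
box = extract_package_region(frame)  # (2, 1, 3, 2)
```

The returned box corresponds to the "package coordinates" and "package size" parameters mentioned above; angle and color would need further processing not sketched here.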
  • the processor 11 also has a function of extracting a character string image in which the character string appears from the package image 104.
  • the parcel image 104 includes the destination character string, barcode, and label.
  • the processor 11 extracts a character string image showing the character string of the destination from the package image 104 using predetermined image processing. For example, the processor 11 detects and extracts character string images by pattern recognition. Also, the processor 11 may extract the character string image using artificial intelligence such as a neural network. The method by which processor 11 extracts the character string image from parcel image 104 is not limited to a specific method.
  • the processor 11 acquires parameters of the extracted character string image.
  • the processor 11 acquires the coordinates and size of the character string image.
  • the processor 11 may acquire a flag indicating whether the character string in the character string image is handwritten or printed.
  • the processor 11 uses predetermined image processing to determine whether the character string in the character string image is handwritten or printed.
  • the processor 11 may also read the barcode from the parcel image 104. For example, processor 11 obtains the coordinates and size of the barcode. The processor 11 also decodes the barcode to obtain information indicated by the barcode.
  • the processor 11 may also read the label from the package image 104. For example, processor 11 obtains the coordinates and size of the label. In addition, the processor 11 performs OCR processing on the label to acquire information (precautionary notes, etc.) written on the label.
  • the processor 11 also has a function (third step) of extracting character candidates (images), which are candidates for areas containing one character, from the character string image.
  • the processor 11 extracts character candidates 105, which may overlap one another, from the character string image.
  • the character candidate 105 is a pattern surrounded by a rectangle.
  • the processor 11 extracts the connection pattern of the character candidates 105 based on the coordinates of the character candidates 105 and the like.
  • a connection pattern indicates one way of forming a continuous character string from the character candidates 105.
  • the processor 11 extracts multiple connection patterns.
  • lines 106 between character candidates 105 indicate connections between the character candidates 105. That is, a connection pattern is the sequence of character candidates 105 obtained by following the lines 106 from the start point 107 to the end point 108.
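One hedged reading of these connection patterns, as a Python sketch: each candidate is an (x, width) box, a candidate is connected to the next when the horizontal gap between them is small, and every chain from a candidate with no predecessor (a start point) to one with no successor (an end point) is one pattern. The gap rule, the box representation, and all names are assumptions for illustration only.

```python
def connection_patterns(candidates, max_gap=5):
    """Enumerate connection patterns over character candidates.

    candidates: list of (x, width) boxes along the string.
    Candidate j follows candidate i when the horizontal gap
    between them is between 0 and max_gap pixels; every chain
    from a start point to an end point is one pattern."""
    n = len(candidates)
    succ = [[j for j in range(n)
             if 0 <= candidates[j][0] - (candidates[i][0] + candidates[i][1]) <= max_gap]
            for i in range(n)]
    has_pred = [any(i in s for s in succ) for i in range(n)]
    patterns = []

    def walk(i, path):
        if not succ[i]:          # end point reached
            patterns.append(path)
        for j in succ[i]:
            walk(j, path + [j])

    for i in range(n):
        if not has_pred[i]:      # start point
            walk(i, [i])
    return patterns

# Candidate 3 spans the same area as candidates 1 and 2 together
# (one glyph readable as either one or two characters), so two
# patterns come out: [0, 1, 2] and [0, 3].
boxes = [(0, 4), (5, 4), (10, 4), (5, 9)]
patterns = connection_patterns(boxes)
```

Keeping several patterns alive, rather than committing to one segmentation early, is what lets the server later choose the segmentation whose candidate string best matches an address.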
  • the processor 11 also has a function (fifth step) of calculating, by OCR processing, a score (likelihood) indicating the possibility that the character candidate 105 is a predetermined character.
  • the processor 11 matches the character candidate 105 with the dictionary information by OCR processing.
  • Processor 11 calculates the score of character candidate 105 by matching.
  • the score indicates the possibility that the image of character candidate 105 is a predetermined character.
  • processor 11 calculates a score for a plurality of predetermined characters. That is, the processor 11 calculates a score indicating the possibility that the character candidate 105 is the predetermined character for each of the plurality of predetermined characters. Processor 11 similarly calculates a score for each character candidate 105 .
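The matching against dictionary information might be pictured as template comparison. The 3x3 bitmaps and pixel-agreement score below are purely illustrative stand-ins for whatever matching the OCR engine actually uses; only the shape of the output, one score per dictionary character per candidate, follows the description above.

```python
# Tiny 3x3 bitmaps standing in for the dictionary information;
# real dictionary entries would of course be far richer.
TEMPLATES = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
}

def char_scores(candidate):
    """Score one character-candidate bitmap against every
    dictionary character as the fraction of agreeing pixels."""
    scores = {}
    for ch, template in TEMPLATES.items():
        total = len(template) * len(template[0])
        agree = sum(c == t
                    for crow, trow in zip(candidate, template)
                    for c, t in zip(crow, trow))
        scores[ch] = agree / total
    return scores

scores = char_scores(((0, 1, 0), (0, 1, 0), (0, 1, 0)))  # looks like "I"
```

These per-character scores are exactly what the intermediate information carries to the server in place of the image itself.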
  • the processor 11 also has a function of generating intermediate information based on the information obtained in the first to fifth steps.
  • the intermediate information consists of information generated during the OCR process. That is, the intermediate information is information for recognizing character strings.
  • the intermediate information does not include the captured image 103, package image 104, and recognition result. Also, the intermediate information is binary data.
  • FIG. 5 shows a configuration example of intermediate information.
  • the intermediate information includes "image size", "package coordinates", "package size", "package angle", "package color", "character string coordinates", "character string cutout size", "barcode", "label", "handwritten/printed determination", "character candidate coordinates", "character candidate cutout size", "connection of character candidates", "character candidate score", and the like.
  • the intermediate information may have a configuration as necessary in addition to the configuration shown in FIG. 5, or a specific configuration may be excluded from the intermediate information.
  • “Image size” is obtained in the first step. “Image size” indicates the size of the captured image 103 .
  • “Package coordinates” indicates the coordinates of the package image 104 .
  • “Package size” indicates the size of the package image 104 .
  • “Angle of Package” indicates the inclination of the package image 104 .
  • “Package color” indicates the color of the package image 104 .
  • Coordinates of character string indicates the coordinates of the character string image.
  • Character string cutout size indicates the size of the character string image.
  • Barcode is information related to the barcode of package image 104 . For example, “barcode” indicates the coordinates of the barcode, the size of the barcode, and the information indicated by the barcode.
  • Label is information related to the label of package image 104 .
  • label indicates the coordinates of the label, the size of the label, and the information written on the label.
  • “Handwritten/printed determination” indicates whether the character string in the character string image is handwritten or printed.
  • “Coordinates of character candidates”, “Cut-out size of character candidates” and “Connection of character candidates” are acquired in the fourth step. “Coordinates of character candidate” indicates the coordinates of the character candidate 105 . “Cut-out size of character candidate” indicates the size of the character candidate 105 . “Connection of character candidates” indicates each connection pattern of the character candidates 105 .
  • the "character candidate score” is obtained by the fifth step. “Character Candidate Score” indicates the score of each character candidate 105 .
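The Fig. 5 record could be modeled as a small data structure. The field names below paraphrase the quoted labels, and the JSON-based `pack()` is only an illustrative serialization of what the description calls, without further detail, binary data. Packing a plausible record also shows the point of the embodiment: the payload is far smaller than an uncompressed captured frame.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class IntermediateInfo:
    """Sketch of the Fig. 5 record; field names are paraphrases."""
    image_size: tuple
    package_coords: tuple
    package_size: tuple
    package_angle: float
    package_color: str
    string_coords: tuple
    string_size: tuple
    barcode: dict
    label: dict
    handwritten: bool
    candidate_coords: list
    candidate_sizes: list
    candidate_links: list
    candidate_scores: list

    def pack(self) -> bytes:
        # JSON-encoded UTF-8 is just an illustrative serialization.
        return json.dumps(asdict(self)).encode("utf-8")

info = IntermediateInfo(
    image_size=(1920, 1080),
    package_coords=(400, 200),
    package_size=(600, 400),
    package_angle=2.5,
    package_color="brown",
    string_coords=(450, 260),
    string_size=(320, 40),
    barcode={"coords": (500, 500), "size": (200, 60), "value": "4912345"},
    label={"coords": (900, 300), "size": (150, 80), "text": "FRAGILE"},
    handwritten=False,
    candidate_coords=[(450, 260), (470, 260)],
    candidate_sizes=[(18, 30), (18, 30)],
    candidate_links=[[0, 1]],
    candidate_scores=[{"A": 0.91}, {"B": 0.88}],
)
payload = info.pack()
```

A few hundred bytes of intermediate information against roughly six megabytes for a raw 1920x1080 RGB frame: that gap is the transfer reduction the embodiment claims.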
  • the processor 11 also has a function of transmitting destination information related to the destination to the sorting device 2 . After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15 .
  • the server 20 transmits destination information to the OCR device 10 for the intermediate information.
  • the processor 11 receives destination information from the server 20 through the communication unit 15 . Upon receiving the destination information, processor 11 transmits the received destination information to sorting device 2 through sorting device interface 18 .
  • the functions realized by the server 20 are realized by the processor 21 executing a program stored in the internal memory, the ROM 22, the NVM 24, or the like.
  • FIG. 6 is a diagram for explaining the functions realized by the server 20.
  • FIG. 6 is a diagram for explaining the functions realized by the server 20.
  • the processor 21 has a function of recognizing a character string written in a character string image based on intermediate information. As described above, the processor 21 of the OCR device 10 transmits intermediate information to the server 20 through the communication section 15 .
  • the processor 21 of the server 20 receives the intermediate information from the OCR device 10 through the communication section 25.
  • upon receiving the intermediate information, the processor 21 acquires one connection pattern from the intermediate information.
  • after acquiring one connection pattern, the processor 21 matches the connection pattern with predetermined candidates (character strings). Here, the processor 21 calculates an evaluation value indicating the possibility that the character string indicated by the connection pattern is a predetermined candidate. The processor 21 calculates an evaluation value for each of the plurality of candidates.
  • the NVM 24 stores an address database showing multiple candidates (here, address candidates).
  • the processor 21 inputs each candidate indicated by the address database and each information indicated by the intermediate information into a predetermined evaluation function to calculate an evaluation value for each candidate.
  • the processor 21 similarly calculates the evaluation value of each candidate for each connection pattern indicated by the intermediate information.
  • after calculating the evaluation value of each candidate for each connection pattern, the processor 21 identifies the largest evaluation value. After identifying the largest evaluation value, the processor 21 acquires the candidate corresponding to that evaluation value. The processor 21 acquires the candidate as the character string (here, the destination) described in the character string image.
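The matching and evaluation described above can be sketched as follows. The patent does not specify the evaluation function, so this sketch simply sums per-position character scores over each address candidate and keeps the maximum; the scores and candidate strings are made up for illustration.

```python
def evaluate_candidates(pattern_scores, candidates):
    """Pick the address candidate best supported by one
    connection pattern's per-position character scores.

    pattern_scores: one dict per character position, mapping
    each dictionary character to its OCR score.  The evaluation
    value here is the sum of per-position scores; candidates of
    the wrong length are ruled out."""
    def value(cand):
        if len(cand) != len(pattern_scores):
            return float("-inf")
        return sum(pos.get(ch, 0.0)
                   for pos, ch in zip(pattern_scores, cand))

    best = max(candidates, key=value)
    return best, value(best)

# Position 0 strongly resembles "T", position 1 is ambiguous
# between the letter "O" and the digit "0".
scores = [{"T": 0.9, "I": 0.2}, {"O": 0.8, "0": 0.7}]
best, val = evaluate_candidates(scores, ["TO", "IO", "T0"])
```

Because the address database constrains the answer, the server can resolve ambiguities (letter "O" versus digit "0") that per-character scores alone cannot.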
  • the processor 21 also has a function of transmitting destination information related to the recognized character string to the OCR device 10 .
  • the processor 21 recognizes the character string written in the character string image.
  • here, it is assumed that the processor 21 has recognized the destination as the character string.
  • upon recognizing the destination, the processor 21 generates destination information associated with the recognized destination.
  • the destination information includes the recognized destination (the destination itself).
  • the destination information may indicate the sorting destination to which the parcels of the recognized destination are sorted.
  • the destination information may indicate a chute, a pocket, a cart, a tray, or the like into which articles are sorted in the sorting device 2 .
  • the configuration of the destination information is not limited to a specific configuration.
  • after generating the destination information, the processor 21 transmits the generated destination information to the OCR device 10 through the communication unit 25.
  • FIG. 7 is a flowchart for explaining an operation example of the OCR device 10.
  • the processor 11 of the OCR device 10 acquires the captured image 103 from the camera 3 (S11). After acquiring the captured image 103, the processor 11 extracts the package image 104 from the captured image 103 (S12).
  • after extracting the package image 104, the processor 11 extracts a character string image from the package image 104 (S13). After extracting the character string image, the processor 11 determines whether the character string described in the character string image is handwritten or printed (S14).
  • after determining whether the character string described in the character string image is handwritten or printed, the processor 11 extracts character candidates 105 from the character string image (S15). After extracting the character candidates 105, the processor 11 calculates the score of each character candidate 105 (S16).
  • after calculating the score of each character candidate 105, the processor 11 generates intermediate information (S17). After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15 (S18).
  • the processor 11 determines whether the destination information has been received through the communication unit 15 (S19). When determining that the destination information has not been received (S19, NO), the processor 11 returns to S19.
  • when determining that the destination information has been received through the communication unit 15 (S19, YES), the processor 11 transmits the received destination information to the sorting device 2 through the sorting device interface 18 (S20). After sending the destination information to the sorting device 2, the processor 11 ends the operation.
  • FIG. 8 is a flowchart for explaining an operation example of the server 20.
  • the processor 21 of the server 20 receives intermediate information from the OCR device 10 through the communication unit 25 (S21). Upon receiving the intermediate information, the processor 21 acquires one connection pattern from the intermediate information (S22).
  • after acquiring one connection pattern, the processor 21 matches the connection pattern with each candidate (S23). The processor 21 matches the connection pattern with each candidate and calculates the evaluation value of each candidate (S24).
  • the processor 21 determines whether there is another connection pattern (S25). When determining that there is another connection pattern (S25, YES), the processor 21 returns to S22.
  • when determining that there is no other connection pattern (S25, NO), the processor 21 recognizes the character string written in the character string image based on each evaluation value (S26). Upon recognizing the character string, the processor 21 transmits destination information related to the recognized character string to the OCR device 10 through the communication unit 25 (S27). After transmitting the destination information to the OCR device 10, the processor 21 ends its operation.
  • in a modified example, the processor 11 of the OCR device 10 generates a score map.
  • FIG. 9 is a diagram for explaining functions realized by the OCR device 10 in the modified example.
  • the processor 11 generates a score map in the fourth step.
  • the score map can be obtained by applying machine learning, pattern recognition, CNN (Convolutional Neural Network), etc. to the character string image.
  • the width W of the score map varies with (for example, is proportional or equal to) the width of the character string image.
  • the height H of the score map is the number of character types to be recognized plus 1.
  • the additional row corresponds to a special character representing "nothing".
  • each row (ordinate) of the score map corresponds to one character to be recognized (for example, the "nothing" symbol followed by "ABCDEF...").
  • Each column (abscissa) of the score map corresponds to each column of the character string image.
  • the score map has the characteristic that the score (value) of the corresponding column in the row corresponding to the "character" written in the character string image is large.
  • the server 20 can recognize the characters written in the character string image by obtaining each character corresponding to the maximum score in each column of the score map.
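The per-column reading of the score map might look like the following sketch. Treating columns whose best row is the "nothing" symbol as contributing no character is an assumption layered on the description, which only says that the per-column maximum identifies each character; all names and scores are illustrative.

```python
def read_score_map(score_map, alphabet):
    """Decode a score map column by column.

    score_map: list of columns, each holding one score per row.
    Row 0 is the special "nothing" symbol; rows 1.. follow
    `alphabet`.  A column whose best-scoring row is "nothing"
    contributes no character."""
    chars = []
    for column in score_map:
        best_row = max(range(len(column)), key=column.__getitem__)
        if best_row > 0:
            chars.append(alphabet[best_row - 1])
    return "".join(chars)

# Rows: ["nothing", "A", "B"]; three columns reading "A",
# then a gap, then "B".
columns = [
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.2, 0.1, 0.7],
]
text = read_score_map(columns, "AB")  # "AB"
```

In this modified example the score map replaces the character candidates and their scores in the intermediate information, while still being much smaller than the character string image itself.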
  • Processor 11 generates intermediate information including a score map. Note that the intermediate information may include information obtained from the first to third steps. After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15 .
  • the processor 11 of the OCR device 10 may generate information indicating the sorting destination based on the destination information. For example, if the destination information includes a destination, the processor 11 may generate information indicating a sorting destination for sorting parcels at the destination. Processor 11 transmits the generated information to sorting device 2 .
  • the sorting device 2 and the camera 3 may be integrally formed. Also, the sorting device 2, the camera 3, and the OCR device 10 may be integrally formed.
  • the recognition system configured as described above generates intermediate information from the captured image in the OCR device.
  • the recognition system sends intermediate information to the server.
  • a recognition system recognizes a character string based on the intermediate information at the server.
  • the recognition system can reduce the amount of data transferred to the server compared to when images are sent to the server. Therefore, the recognition system can reduce the transfer time to the server and can quickly recognize the character string.
  • the recognition system since the recognition system does not send the image to the server, it is possible to prevent the image from being read. As a result, the recognition system can reduce risks such as leakage of personal information.
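As an illustration, the per-column maximum decoding described in the bullets above can be sketched as follows. This is a minimal sketch, not the embodiment's implementation: the score-map layout, the character inventory, and the CTC-style collapsing of repeated columns are all assumptions.

```python
# Hypothetical sketch: decode a score map by taking the row with the
# maximum score in each column. Row 0 is assumed to be the special
# character representing "nothing"; consecutive repeated rows are
# collapsed (a CTC-style convention, assumed here, not stated above).
def decode_score_map(score_map, chars):
    # score_map: one row per character type (H = len(chars) rows),
    # each row holding one score per column of the string image.
    n_cols = len(score_map[0])
    out, prev = [], None
    for c in range(n_cols):
        best = max(range(len(score_map)), key=lambda r: score_map[r][c])
        if best != 0 and best != prev:  # skip "nothing" and repeats
            out.append(chars[best])
        prev = best
    return "".join(out)

chars = ["", "A", "B"]          # "" is the "nothing" character
score_map = [
    [0.9, 0.1, 0.1, 0.8, 0.1],  # "nothing"
    [0.1, 0.8, 0.7, 0.1, 0.2],  # A
    [0.0, 0.1, 0.2, 0.1, 0.7],  # B
]
print(decode_score_map(score_map, chars))  # prints "AB"
```

The columns score highest for "nothing", A, A, "nothing", B, so the decoded string is "AB".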

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

Provided are an information processing apparatus, a program, and a system that can suppress the amount of data transferred to an apparatus that performs character recognition. According to an embodiment, an information processing apparatus includes an image interface, a communication interface, and a processor. The image interface acquires a captured image containing a character string. The communication interface connects to an external apparatus. The processor generates intermediate information composed of information generated in the course of character recognition processing on the captured image, and transmits the intermediate information to the external apparatus via the communication interface.

Description

Information processing apparatus, program, and system
The embodiments of the present invention relate to an information processing apparatus, a program, and a system.
A system is provided in which character recognition of an image acquired by a terminal is performed on a cloud server. In such a system, the terminal transmits the acquired image to the cloud server and obtains the character recognition result from the cloud server.
Conventionally, such a system has the problem of a large transfer volume, because the image must be transmitted from the terminal to the cloud server.
Japanese Patent Application Laid-Open No. 2015-90623
In order to solve the above problem, an information processing apparatus, a program, and a system are provided that can suppress the amount of data transferred to a device that performs character recognition.
According to an embodiment, an information processing apparatus includes an image interface, a communication interface, and a processor. The image interface acquires a captured image containing a character string. The communication interface connects to an external device. The processor generates intermediate information composed of information generated in the course of character recognition processing on the captured image, and transmits the intermediate information to the external device through the communication interface.
FIG. 1 is a block diagram showing a configuration example of a recognition system according to an embodiment.
FIG. 2 is a block diagram showing a configuration example of an OCR device according to the embodiment.
FIG. 3 is a block diagram showing a configuration example of a server according to the embodiment.
FIG. 4 is a diagram showing an operation example of the OCR device according to the embodiment.
FIG. 5 is a diagram showing a configuration example of intermediate information according to the embodiment.
FIG. 6 is a diagram showing an operation example of the server according to the embodiment.
FIG. 7 is a flowchart showing an operation example of the OCR device according to the embodiment.
FIG. 8 is a flowchart showing an operation example of the server according to the embodiment.
FIG. 9 is a diagram showing another operation example of the OCR device according to the embodiment.
Embodiment
Embodiments will be described below with reference to the drawings.
The recognition system according to the embodiment recognizes a character string in an image using character recognition (OCR, Optical Character Recognition) processing. Here, the recognition system recognizes the destination of a parcel from an image of, for example, a slip attached to the parcel. The recognition system sorts the parcel based on the recognized destination.
FIG. 1 shows a configuration example of a recognition system 1 according to the embodiment. As shown in FIG. 1, the recognition system 1 includes a sorting device 2, a camera 3, a network 6, an OCR device 10, a server 20, and the like.
The OCR device 10 connects to the sorting device 2 and the camera 3. The OCR device 10 and the server 20 are connected to the network 6.
In addition to the configuration shown in FIG. 1, the recognition system 1 may further include components as necessary, and specific components may be excluded from it.
The sorting device 2 sorts parcels fed in by an operator, a conveyor belt, a robot, or the like. The sorting device 2 receives destination information (character string information) related to the destination (character string) of a parcel from the OCR device 10, and sorts the parcel based on the destination information. For example, the sorting device 2 sorts parcels into chutes, pockets, carts, or trays as sorting destinations. For example, the sorting device 2 is composed of a sorter, a conveyor belt, a robot, or the like.
The camera 3 photographs the parcels fed into the sorting device 2. The camera 3 photographs the surface on which the destination is displayed, for example, the surface to which the slip is attached. The camera 3 supplies the captured image to the OCR device 10.
For example, the camera 3 is a CCD (Charge Coupled Device) camera. The camera 3 may also include a light source that illuminates the parcel.
The OCR device 10 (information processing device, first information processing device, external device) acquires the captured image from the camera 3. The OCR device 10 generates intermediate information related to OCR processing from the captured image. The OCR device 10 transmits the intermediate information to the server 20 and receives from the server 20 destination information related to the destination of the parcel shown in the captured image. The OCR device 10 inputs the received destination information to the sorting device 2. The OCR device 10 and the intermediate information are described in detail later.
The network 6 relays communication between the OCR device 10 and the server 20. For example, the network 6 is the Internet.
The server 20 (information processing device, second information processing device, external device) receives the intermediate information from the OCR device 10. The server 20 generates destination information based on the received intermediate information and supplies the generated destination information to the OCR device 10. The server 20 is described in detail later.
Next, the OCR device 10 will be described.
FIG. 2 is a block diagram showing a configuration example of the OCR device 10 according to the embodiment. As shown in FIG. 2, the OCR device 10 includes a processor 11, a ROM 12, a RAM 13, an NVM 14, a communication unit 15, an operation unit 16, a display unit 17, a sorting device interface 18, a camera interface 19, and the like.
The processor 11 is connected to the ROM 12, the RAM 13, the NVM 14, the communication unit 15, the operation unit 16, the display unit 17, the sorting device interface 18, and the camera interface 19 via a data bus or the like.
In addition to the configuration shown in FIG. 2, the OCR device 10 may include components as necessary, and specific components may be excluded from it.
The processor 11 (first processor) has a function of controlling the operation of the OCR device 10 as a whole. The processor 11 may include an internal cache, various interfaces, and the like. The processor 11 implements various processes by executing programs stored in advance in its internal memory, the ROM 12, or the NVM 14.
Some of the functions realized by the processor 11 executing programs may instead be realized by hardware circuits. In that case, the processor 11 controls the functions performed by the hardware circuits.
The ROM 12 is a non-volatile memory in which a control program, control data, and the like are stored in advance. The control program and control data stored in the ROM 12 are installed in advance according to the specifications of the OCR device 10.
The RAM 13 is a volatile memory. The RAM 13 temporarily stores data being processed by the processor 11, and stores various application programs based on instructions from the processor 11. The RAM 13 may also store data necessary for executing an application program, the execution results of the application program, and the like.
The NVM 14 is a non-volatile memory in which data can be written and rewritten. The NVM 14 is composed of, for example, an HDD (Hard Disk Drive), an SSD (Solid State Drive), or a flash memory. The NVM 14 stores a control program, applications, various data, and the like according to the operational use of the OCR device 10.
The communication unit 15 (communication interface, first communication interface) is an interface for connecting to the network 6, that is, for exchanging data with the server 20 and the like through the network 6. For example, the communication unit 15 is an interface that supports a wired or wireless LAN (Local Area Network) connection.
The operation unit 16 receives inputs of various operations from the operator and transmits signals indicating the input operations to the processor 11. The operation unit 16 may be composed of a touch panel.
The display unit 17 displays image data from the processor 11. For example, the display unit 17 is composed of a liquid crystal monitor. When the operation unit 16 is composed of a touch panel, the display unit 17 may be formed integrally with the operation unit 16.
The sorting device interface 18 is an interface for connecting to the sorting device 2. The sorting device interface 18 transmits signals from the processor 11 (for example, destination information) to the sorting device 2, and transmits signals from the sorting device 2 to the processor 11.
The camera interface 19 (image interface) is an interface for connecting to the camera 3. The camera interface 19 transmits signals from the processor 11 to the camera 3, and transmits signals from the camera 3 (such as captured images) to the processor 11.
Next, the server 20 will be described.
FIG. 3 is a block diagram showing a configuration example of the server 20 according to the embodiment. As shown in FIG. 3, the server 20 includes a processor 21, a ROM 22, a RAM 23, an NVM 24, a communication unit 25, an operation unit 26, a display unit 27, and the like.
The processor 21 is connected to the ROM 22, the RAM 23, the NVM 24, the communication unit 25, the operation unit 26, and the display unit 27 via a data bus or the like.
In addition to the configuration shown in FIG. 3, the server 20 may include components as necessary, and specific components may be excluded from it.
The processor 21 (second processor) has a function of controlling the operation of the server 20 as a whole. The processor 21 may include an internal cache, various interfaces, and the like. The processor 21 implements various processes by executing programs stored in advance in its internal memory, the ROM 22, or the NVM 24.
Some of the functions realized by the processor 21 executing programs may instead be realized by hardware circuits. In that case, the processor 21 controls the functions performed by the hardware circuits.
The ROM 22 is a non-volatile memory in which a control program, control data, and the like are stored in advance. The control program and control data stored in the ROM 22 are installed in advance according to the specifications of the server 20.
The RAM 23 is a volatile memory. The RAM 23 temporarily stores data being processed by the processor 21, and stores various application programs based on instructions from the processor 21. The RAM 23 may also store data necessary for executing an application program, the execution results of the application program, and the like.
The NVM 24 is a non-volatile memory in which data can be written and rewritten. The NVM 24 is composed of, for example, an HDD, an SSD, or a flash memory. The NVM 24 stores a control program, applications, various data, and the like according to the operational use of the server 20.
The communication unit 25 (communication interface, second communication interface) is an interface for connecting to the network 6, that is, for exchanging data with the OCR device 10 and the like through the network 6. For example, the communication unit 25 is an interface that supports a wired or wireless LAN connection.
The operation unit 26 receives inputs of various operations from the operator and transmits signals indicating the input operations to the processor 21. The operation unit 26 may be composed of a touch panel.
The display unit 27 displays image data from the processor 21. For example, the display unit 27 is composed of a liquid crystal monitor. When the operation unit 26 is composed of a touch panel, the display unit 27 may be formed integrally with the operation unit 26.
Next, the functions realized by the OCR device 10 will be described. These functions are realized by the processor 11 executing programs stored in its internal memory, the ROM 12, the NVM 14, or the like.
FIG. 4 is a diagram for explaining the functions realized by the OCR device 10.
First, the processor 11 has a function of acquiring a captured image (first step).
Here, it is assumed that a parcel to be fed into the sorting device 2 is present at a position that the camera 3 can photograph.
The processor 11 causes the camera 3 to photograph the parcel through the camera interface 19, and acquires a captured image 103 in which the parcel appears from the camera 3 through the camera interface 19.
The processor 11 also acquires parameters of the acquired captured image 103. Here, the processor 11 acquires the size of the captured image 103.
The processor 11 also has a function of extracting, from the captured image, a parcel image in which the parcel appears (second step).
The processor 11 extracts the parcel image 104 from the captured image 103 using predetermined image processing. For example, the processor 11 extracts the parcel image 104 by edge detection. The processor 11 may also extract the parcel image 104 using artificial intelligence such as a neural network. The method by which the processor 11 extracts the parcel image 104 from the captured image 103 is not limited to any specific method.
The processor 11 also acquires parameters of the extracted parcel image 104. Here, the processor 11 acquires the coordinates, size, angle (inclination of the parcel image 104), and color of the parcel image 104.
The processor 11 also has a function of extracting, from the parcel image 104, a character string image in which a character string appears.
Here, it is assumed that the destination character string, a barcode, and a label appear in the parcel image 104.
The processor 11 extracts a character string image showing the destination character string from the parcel image 104 using predetermined image processing. For example, the processor 11 detects and extracts the character string image by pattern recognition. The processor 11 may also extract the character string image using artificial intelligence such as a neural network. The method by which the processor 11 extracts the character string image from the parcel image 104 is not limited to any specific method.
The processor 11 also acquires parameters of the extracted character string image. Here, the processor 11 acquires the coordinates and size of the character string image. The processor 11 may also acquire a flag indicating whether the character string in the character string image is handwritten or printed. For example, the processor 11 determines whether the character string is handwritten or printed using predetermined image processing.
The processor 11 may also read a barcode from the parcel image 104. For example, the processor 11 acquires the coordinates and size of the barcode, decodes the barcode, and acquires the information the barcode encodes.
The processor 11 may also read a label from the parcel image 104. For example, the processor 11 acquires the coordinates and size of the label, and performs OCR processing on the label to acquire the information written on it (precautionary notes, etc.).
The processor 11 also has a function of extracting, from the character string image, character candidates (images), which are candidates for regions each containing one character (fourth step).
The processor 11 extracts character candidates 105 that overlap the drawn lines from the character string image. Here, a character candidate 105 is a pattern enclosed in a rectangle. After extracting the character candidates 105, the processor 11 extracts connection patterns of the character candidates 105 based on their coordinates and the like. A connection pattern indicates a pattern for a series of character strings formed from the character candidates 105.
As shown in FIG. 4, the processor 11 extracts a plurality of connection patterns. In FIG. 4, the lines 106 between the character candidates 105 indicate connections between them. That is, a connection pattern indicates a chain of character candidates 105 formed by connecting them with lines 106 from a start point 107 to an end point 108.
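For illustration, enumerating the connection patterns from candidate-to-candidate links might look like the following sketch. All names here are hypothetical, and the link structure (which candidate may follow which) is assumed to have been derived beforehand from the candidate coordinates.

```python
def connection_patterns(candidates, links):
    # candidates: character-candidate ids; links: maps a candidate to the
    # candidates that may follow it (assumed to be built from the box
    # coordinates, e.g. left-to-right adjacency).
    followers = {n for nxt in links.values() for n in nxt}
    starts = [c for c in candidates if c not in followers]  # start points
    patterns = []

    def walk(node, path):
        nxt = links.get(node, [])
        if not nxt:                  # reached an end point
            patterns.append(path)
            return
        for n in nxt:
            walk(n, path + [n])

    for s in starts:
        walk(s, [s])
    return patterns

# Two chains from start point 1 to end point 4: 1-2-4 and 1-3-4.
print(connection_patterns([1, 2, 3, 4], {1: [2, 3], 2: [4], 3: [4]}))
```

Each returned list is one connection pattern, i.e., one possible reading order of the character candidates.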
The processor 11 also has a function of calculating, by OCR processing, a score (likelihood) indicating the possibility that a character candidate 105 is a predetermined character (fifth step).
The processor 11 matches each character candidate 105 against dictionary information by OCR processing, and calculates the score of the character candidate 105 from the matching. Here, the score indicates the possibility that the image of the character candidate 105 is a predetermined character.
Here, the processor 11 calculates scores for a plurality of predetermined characters. That is, for each of the plurality of predetermined characters, the processor 11 calculates a score indicating the possibility that the character candidate 105 is that character.
The processor 11 calculates scores in the same way for every character candidate 105.
The processor 11 also has a function of generating intermediate information based on the information obtained in the first to fifth steps.
The intermediate information is composed of information generated in the course of the OCR process; that is, it is information for recognizing the character string. Here, the intermediate information does not include the captured image 103, the parcel image 104, or a recognition result. The intermediate information is binary data.
FIG. 5 shows a configuration example of the intermediate information. As shown in FIG. 5, the intermediate information is composed of "image size", "parcel coordinates", "parcel size", "parcel angle", "parcel color", "character string coordinates", "character string cutout size", "barcode", "label", "handwriting/printing determination", "character candidate coordinates", "character candidate cutout size", "character candidate connections", "character candidate scores", and the like.
In addition to the configuration shown in FIG. 5, the intermediate information may include items as necessary, and specific items may be excluded from it.
"Image size" is obtained in the first step. It indicates the size of the captured image 103.
"Parcel coordinates", "parcel size", "parcel angle", and "parcel color" are obtained in the second step.
"Parcel coordinates" indicates the coordinates of the parcel image 104, "parcel size" its size, "parcel angle" its inclination, and "parcel color" its color.
"Character string coordinates", "character string cutout size", "barcode", "label", and "handwriting/printing determination" are obtained in the third step.
"Character string coordinates" indicates the coordinates of the character string image, and "character string cutout size" indicates its size.
"Barcode" is information related to the barcode in the parcel image 104. For example, it indicates the coordinates of the barcode, the size of the barcode, and the information the barcode encodes.
"Label" is information related to the label in the parcel image 104. For example, it indicates the coordinates of the label, the size of the label, and the information written on the label.
"Handwriting/printing determination" indicates whether the character string in the character string image is handwritten or printed.
"Character candidate coordinates", "character candidate cutout size", and "character candidate connections" are obtained in the fourth step.
"Character candidate coordinates" indicates the coordinates of the character candidates 105, "character candidate cutout size" their sizes, and "character candidate connections" the connection patterns of the character candidates 105.
"Character candidate scores" is obtained in the fifth step. It indicates the scores of each character candidate 105.
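Gathered together, the items listed above could be represented by a record such as the following. This is only a sketch: the field names and types are assumptions, since the embodiment enumerates the items of the intermediate information but does not fix a concrete schema (it only states that the intermediate information is binary data).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class IntermediateInfo:
    # Illustrative schema; all field names and types are assumptions.
    image_size: Tuple[int, int]             # step 1: captured-image size
    parcel_coords: Tuple[int, int]          # step 2: parcel image
    parcel_size: Tuple[int, int]
    parcel_angle: float
    parcel_color: str
    string_coords: Tuple[int, int]          # step 3: character string image
    string_cutout_size: Tuple[int, int]
    barcode: Optional[dict]                 # coords, size, decoded value
    label: Optional[dict]                   # coords, size, read text
    handwritten: bool                       # handwriting/printing flag
    candidate_coords: List[Tuple[int, int]]       # step 4: candidates
    candidate_cutout_sizes: List[Tuple[int, int]]
    candidate_links: List[List[int]]        # connection patterns
    candidate_scores: List[dict]            # step 5: per-character scores

info = IntermediateInfo(
    image_size=(1920, 1080), parcel_coords=(100, 200), parcel_size=(400, 300),
    parcel_angle=2.5, parcel_color="brown", string_coords=(120, 240),
    string_cutout_size=(200, 40), barcode=None, label=None, handwritten=False,
    candidate_coords=[(0, 0)], candidate_cutout_sizes=[(20, 40)],
    candidate_links=[[0]], candidate_scores=[{"A": 0.9}],
)
print(info.handwritten)  # prints False
```

In practice such a record would then be serialized to binary (for example with a compact binary encoding) before transmission, which is what keeps the transfer volume below that of sending the image itself.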
The processor 11 also has a function of transmitting destination information related to the destination to the sorting device 2.
After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15.
As described later, the server 20 transmits destination information for the intermediate information to the OCR device 10.
The processor 11 receives the destination information from the server 20 through the communication unit 15. Upon receiving the destination information, the processor 11 transmits it to the sorting device 2 through the sorting device interface 18.
 次に、サーバ20が実現する機能について説明する。サーバ20が実現する機能は、プロセッサ21が内部メモリ、ROM22又はNVM24などに格納されるプログラムを実行することで実現される。 Next, functions realized by the server 20 will be described. The functions realized by the server 20 are realized by the processor 21 executing a program stored in the internal memory, the ROM 22, the NVM 24, or the like.
 図6は、サーバ20が実現する機能について説明するための図である。 FIG. 6 is a diagram for explaining the functions realized by the server 20. FIG.
First, the processor 21 has a function of recognizing a character string written in a character string image based on the intermediate information.
As described above, the processor 11 of the OCR device 10 transmits the intermediate information to the server 20 through the communication unit 15.
The processor 21 of the server 20 receives the intermediate information from the OCR device 10 through the communication unit 25.
Upon receiving the intermediate information, the processor 21 acquires one connection pattern from the intermediate information.
After acquiring one connection pattern, the processor 21 matches the connection pattern against predetermined candidates (character strings). Here, the processor 21 calculates an evaluation value indicating the possibility that the character string indicated by the connection pattern is a given candidate, and does so for each of the plurality of candidates.
For example, the NVM 24 stores an address database indicating a plurality of candidates (here, address candidates). The processor 21 inputs each candidate indicated by the address database and each piece of information indicated by the intermediate information into a predetermined evaluation function to calculate an evaluation value for each candidate.
The processor 21 similarly calculates the evaluation value of each candidate for each connection pattern indicated by the intermediate information.
After calculating the evaluation value of each candidate for each connection pattern, the processor 21 identifies the largest evaluation value and acquires the candidate corresponding to the identified evaluation value. The processor 21 acquires that candidate as the character string (here, the destination) written in the character string image.
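The matching step above can be sketched as follows. The representation of a connection pattern (per-position score dictionaries), the evaluation function (a position-wise average), and the candidate database are illustrative assumptions, not the evaluation function the patent actually uses.

```python
# Hypothetical sketch of matching connection patterns against candidate
# strings and selecting the candidate with the largest evaluation value.

def evaluate(pattern, candidate):
    # pattern: list of dicts, one per character position, mapping a
    # character to its recognition score. Characters the pattern does
    # not support contribute 0 to the evaluation value.
    length = max(len(pattern), len(candidate))
    if length == 0:
        return 0.0
    matched = sum(pattern[i].get(candidate[i], 0.0)
                  for i in range(min(len(pattern), len(candidate))))
    return matched / length

def recognize(patterns, candidates):
    # Evaluate every (connection pattern, candidate) pair and return the
    # candidate with the largest evaluation value.
    best = max(((evaluate(p, c), c) for p in patterns for c in candidates),
               key=lambda vc: vc[0])
    return best[1], best[0]

patterns = [[{"T": 0.9}, {"O": 0.8}, {"K": 0.7}, {"Y": 0.9}, {"O": 0.8}]]
database = ["TOKYO", "KYOTO", "OSAKA"]
candidate, value = recognize(patterns, database)
print(candidate)  # → TOKYO
```

The pattern above rewards "TOKYO" at every position, so it wins over "KYOTO" (which shares the same characters but at the wrong positions) and "OSAKA".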
The processor 21 also has a function of transmitting destination information related to the recognized character string to the OCR device 10.
As described above, the processor 21 recognizes the character string written in the character string image; here, it is assumed that the processor 21 has recognized a destination as the character string. Upon recognizing the destination, the processor 21 generates destination information associated with the recognized destination.
For example, the destination information includes the recognized destination (the destination itself). The destination information may instead indicate the sorting destination to which a parcel addressed to the recognized destination is sorted; for example, it may indicate a chute, pocket, cart, tray, or the like into which articles are sorted in the sorting device 2. The configuration of the destination information is not limited to a specific configuration.
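As a concrete illustration of the two variants just described, the destination information might be shaped as follows; all field names and values here are assumptions for illustration, not the patent's actual data format.

```python
# Two hypothetical shapes of the destination information.
destination_only = {
    "destination": "1-1 Example-cho, Minato-ku, Tokyo",  # the recognized destination itself
}
with_sort_target = {
    "destination": "1-1 Example-cho, Minato-ku, Tokyo",
    "sort_target": {"kind": "chute", "id": 12},  # chute/pocket/cart/tray in the sorting device 2
}
print(with_sort_target["sort_target"]["kind"])  # → chute
```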
After generating the destination information, the processor 21 transmits the generated destination information to the OCR device 10 through the communication unit 25.
Next, an operation example of the recognition system 1 will be described.
First, an operation example of the OCR device 10 will be described. FIG. 7 is a flowchart for explaining the operation example of the OCR device 10.
First, the processor 11 of the OCR device 10 acquires the captured image 103 from the camera 3 (S11). After acquiring the captured image 103, the processor 11 extracts the package image 104 from the captured image 103 (S12).
After extracting the package image 104, the processor 11 extracts a character string image from the package image 104 (S13). After extracting the character string image, the processor 11 determines whether the character string written in the character string image is handwritten or printed (S14).
After determining whether the character string written in the character string image is handwritten or printed, the processor 11 extracts the character candidates 105 from the character string image (S15). After extracting the character candidates 105, the processor 11 calculates the score of each character candidate 105 (S16).
After calculating the score of each character candidate 105, the processor 11 generates the intermediate information (S17). After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15 (S18).
After transmitting the intermediate information to the server 20, the processor 11 determines whether destination information has been received through the communication unit 15 (S19). When determining that the destination information has not been received (S19, NO), the processor 11 returns to S19.
When determining that the destination information has been received through the communication unit 15 (S19, YES), the processor 11 transmits the received destination information to the sorting device 2 through the sorting device interface 18 (S20).
After transmitting the destination information to the sorting device 2, the processor 11 ends the operation.
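The flow of FIG. 7 can be sketched end to end as follows. Every helper function below is a stub standing in for the image processing described in the text; the function names and the layout of the intermediate information are assumptions, not the patent's implementation.

```python
# Minimal sketch of the OCR-device flow (S11-S20) with stub helpers.

def extract_package(frame):               # S12: crop the package image 104
    return frame

def extract_text_region(package):         # S13: crop the character string image
    return package

def is_handwritten(text_image):           # S14: handwritten vs. printed
    return False

def extract_char_candidates(text_image):  # S15: character candidates 105
    return list(text_image)

def score(candidate):                     # S16: score of each character candidate
    return 1.0

def run_ocr_device(frame, send_to_server, send_to_sorter):
    package = extract_package(frame)               # S11-S12: acquire and crop
    text_image = extract_text_region(package)      # S13
    candidates = extract_char_candidates(text_image)
    intermediate = {                               # S17: intermediate information
        "handwritten": is_handwritten(text_image), # S14
        "candidates": candidates,                  # S15
        "scores": [score(c) for c in candidates],  # S16
    }
    destination = send_to_server(intermediate)     # S18-S19: send, wait for reply
    send_to_sorter(destination)                    # S20

received = []
run_ocr_device("TOKYO", lambda info: "chute-12", received.append)
print(received)  # → ['chute-12']
```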
Next, an operation example of the server 20 will be described. FIG. 8 is a flowchart for explaining the operation example of the server 20.
First, the processor 21 of the server 20 receives the intermediate information from the OCR device 10 through the communication unit 25 (S21). Upon receiving the intermediate information, the processor 21 acquires one connection pattern from it (S22).
After acquiring one connection pattern, the processor 21 matches the connection pattern against each candidate (S23) and calculates the evaluation value of each candidate (S24).
After calculating the evaluation value of each candidate, the processor 21 determines whether another connection pattern exists (S25). When determining that another connection pattern exists (S25, YES), the processor 21 returns to S22.
When determining that no other connection pattern exists (S25, NO), the processor 21 recognizes the character string written in the character string image based on the evaluation values (S26). Upon recognizing the character string, the processor 21 transmits destination information related to the recognized character string to the OCR device 10 through the communication unit 25 (S27).
After transmitting the destination information to the OCR device 10, the processor 21 ends the operation.
Next, a modified example of the OCR device 10 will be described.
Here, the processor 11 of the OCR device 10 generates a score map.
FIG. 9 is a diagram for explaining functions realized by the OCR device 10 in the modified example.
Description of the first to third steps is omitted because they are as described above.
In the fourth step, the processor 11 generates a score map.
The score map can be obtained by applying machine learning, pattern recognition, a CNN (Convolutional Neural Network), or the like to the character string image. The width W of the score map varies with the width of the character string image (proportional to, or equal to, it). The height H of the score map is the number of character classes to be recognized plus one.
In the example of FIG. 9, the character set is "ΦABCDEF", so H = 7 (in actual OCR, H can be several thousand because numbers, kanji, and the like are included).
Here, "Φ" is a special character representing "nothing".
Each row (ordinate) of the score map corresponds to one of the characters to be recognized (ΦABCDEF).
Each column (abscissa) of the score map corresponds to a column of the character string image.
The score map has the characteristic that, in the row corresponding to a "character" written in the character string image, the score (value) of the corresponding column is large.
In the example of FIG. 9, the second and third columns of the fourth row, corresponding to the "C" of "CAFFEE", have larger values than the corresponding columns of the other rows (in the lower part of FIG. 9, the largest value in each column is shown in bold).
The same applies to "A" and "E".
The server 20 can recognize the characters written in the character string image by obtaining, for each column of the score map, the character corresponding to the maximum score in that column.
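The per-column decoding just described can be sketched as follows. The collapsing rule (merging repeated rows and then dropping Φ, as in CTC-style greedy decoding) is an assumption; the text only states that the maximum-score character of each column is obtained, and the toy score map below is illustrative.

```python
# Greedy decoding of a score map. Row 0 is the special character Φ
# ("nothing"); repeated rows are merged so that a character spanning
# several columns, like the "C" in the FIG. 9 example, is emitted once.

CHARS = "ΦABCDEF"  # H = 7 rows

def decode(score_map):
    # score_map[r][c] is the score of character CHARS[r] at column c.
    out, prev = [], None
    for c in range(len(score_map[0])):
        r = max(range(len(CHARS)), key=lambda row: score_map[row][c])
        if r != prev and r != 0:  # merge repeated rows, drop Φ
            out.append(CHARS[r])
        prev = r
    return "".join(out)

# Toy map whose per-column maxima are C, C, A, F, Φ, F, E, Φ, E.
seq = [3, 3, 1, 6, 0, 6, 5, 0, 5]
score_map = [[0.9 if row == seq[col] else 0.1 for col in range(len(seq))]
             for row in range(len(CHARS))]
print(decode(score_map))  # → CAFFEE
```

Note that the Φ columns between the two "F"s and the two "E"s are what allow repeated letters to survive the merging step.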
The processor 11 generates intermediate information including the score map. Note that the intermediate information may also include the information obtained in the first to third steps.
After generating the intermediate information, the processor 11 transmits the generated intermediate information to the server 20 through the communication unit 15.
Note that the processor 11 of the OCR device 10 may generate information indicating the sorting destination based on the destination information. For example, if the destination information includes a destination, the processor 11 may generate information indicating a sorting destination for parcels addressed to that destination. The processor 11 transmits the generated information to the sorting device 2.
The sorting device 2 and the camera 3 may be formed integrally. The sorting device 2, the camera 3, and the OCR device 10 may also be formed integrally.
In the recognition system configured as described above, the OCR device generates intermediate information from the captured image and transmits the intermediate information to the server, and the server recognizes the character string based on the intermediate information. As a result, the recognition system can reduce the amount of data transferred to the server compared with transmitting the image itself. The recognition system can therefore reduce the transfer time to the server and quickly recognize the character string.
Also, since the recognition system does not transmit the image to the server, it is possible to prevent the image from being read. As a result, the recognition system can reduce risks such as leakage of personal information.
While several embodiments of the invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, substitutions, and modifications can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and in the invention described in the claims and its equivalents.

Claims (13)

  1.  An information processing apparatus comprising:
     an image interface for acquiring a captured image containing a character string;
     a communication interface that connects to an external device; and
     a processor configured to:
      generate intermediate information composed of information generated in the course of character recognition processing from the captured image, and
      transmit the intermediate information to the external device through the communication interface.
  2.  The information processing apparatus according to claim 1, wherein
     the processor is configured to:
      extract character candidates from the captured image,
      extract a connection pattern of the character candidates, and
      calculate a score indicating the possibility that a character written in each character candidate is a predetermined character string; and
     the intermediate information includes the connection pattern of the character candidates and the scores.
  3.  The information processing apparatus according to claim 2, wherein the intermediate information includes coordinates and sizes of the character candidates.
  4.  The information processing apparatus according to any one of claims 1 to 3, wherein the intermediate information indicates whether the character string is handwritten or printed.
  5.  The information processing apparatus according to any one of claims 1 to 4, wherein the processor receives character string information related to the character string from the external device through the communication interface.
  6.  The information processing apparatus according to claim 5, wherein the character string is a destination.
  7.  The information processing apparatus according to claim 6, further comprising a sorting device interface for connecting to a sorting device for sorting articles, wherein the processor transmits the character string information to the sorting device through the sorting device interface.
  8.  A program executed by a processor, the program causing the processor to realize:
     a function of generating intermediate information composed of information generated in the course of character recognition processing from a captured image containing a character string; and
     a function of transmitting the intermediate information to an external device.
  9.  An information processing apparatus comprising:
     a communication interface for transmitting and receiving data to and from an external device; and
     a processor configured to:
      receive, from the external device through the communication interface, intermediate information composed of information generated in the course of character recognition processing from a captured image containing a character string,
      recognize the character string based on the intermediate information, and
      generate character string information related to the recognized character string.
  10.  The information processing apparatus according to claim 9, wherein
     the intermediate information includes a connection pattern of character candidates extracted from the captured image and a score indicating the possibility that a character written in each character candidate is a predetermined character, and
     the processor is configured to:
      calculate, based on the connection pattern and the score, an evaluation value indicating the possibility that the character string is a predetermined candidate, and
      recognize the character string based on the evaluation value.
  11.  The information processing apparatus according to claim 9 or 10, wherein the processor transmits character string information related to the character string to the external device through the communication interface.
  12.  A program executed by a processor, the program causing the processor to realize:
     a function of receiving, from an external device, intermediate information composed of information generated in the course of character recognition processing from a captured image containing a character string;
     a function of recognizing the character string based on the intermediate information; and
     a function of generating character string information related to the recognized character string.
  13.  A system comprising a first information processing apparatus and a second information processing apparatus, wherein
     the first information processing apparatus comprises:
      an image interface for acquiring a captured image containing a character string;
      a first communication interface connected to the second information processing apparatus; and
      a first processor configured to:
       generate intermediate information composed of information generated in the course of character recognition processing from the captured image, and
       transmit the intermediate information to the second information processing apparatus through the first communication interface; and
     the second information processing apparatus comprises:
      a second communication interface that transmits and receives data to and from the first information processing apparatus; and
      a second processor configured to:
       receive the intermediate information from the first information processing apparatus through the second communication interface,
       recognize the character string based on the intermediate information, and
       generate character string information related to the recognized character string.
PCT/JP2022/007871 2021-03-08 2022-02-25 Image processing apparatus, program, and system WO2022190900A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021036368A JP2022136656A (en) 2021-03-08 2021-03-08 Information processing device, program, and system
JP2021-036368 2021-03-08

Publications (1)

Publication Number Publication Date
WO2022190900A1

Family

ID=83226780

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/007871 WO2022190900A1 (en) 2021-03-08 2022-02-25 Image processing apparatus, program, and system

Country Status (2)

Country Link
JP (1) JP2022136656A (en)
WO (1) WO2022190900A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03230287A (en) * 1990-02-06 1991-10-14 Ricoh Co Ltd Information transmitting/receiving system
JP2006092027A (en) * 2004-09-21 2006-04-06 Fuji Xerox Co Ltd Capital letter recognizing device, capital letter recognizing method and capital letter recognizing program
JP2017216497A (en) * 2016-05-30 2017-12-07 株式会社東芝 Image processing apparatus, image processing system, image processing method, and program
JP2020173669A (en) * 2019-04-11 2020-10-22 ソフトバンク株式会社 Image recognition device, image recognition method, image recognition program, and image recognition system

Also Published As

Publication number Publication date
JP2022136656A (en) 2022-09-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22766859

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22766859

Country of ref document: EP

Kind code of ref document: A1