WO2018037519A1 - Mobile terminal, image processing method, and program - Google Patents

Mobile terminal, image processing method, and program

Info

Publication number
WO2018037519A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
orientation
data
document
blur
Prior art date
Application number
PCT/JP2016/074720
Other languages
French (fr)
Japanese (ja)
Inventor
朋也 穴澤
清人 小坂
Original Assignee
PFU Limited (株式会社PFU)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PFU Limited (株式会社PFU)
Priority to PCT/JP2016/074720 priority Critical patent/WO2018037519A1/en
Priority to JP2018535993A priority patent/JP6613378B2/en
Publication of WO2018037519A1 publication Critical patent/WO2018037519A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/10: Image acquisition

Definitions

  • the present invention relates to a mobile terminal, an image processing method, and a program.
  • In Patent Literature 1, a technique is disclosed that detects blur or camera shake in the subject area of a photographed image and evaluates whether the image is a successful image in which the blur was intentionally produced by the user, or an unsuccessful, failed image.
  • However, the conventional technique of Patent Document 1 has the problem that the blur determination of an image is not used to correct the orientation of the image.
  • The present invention has been made in view of the above problems. An object of the present invention is to provide a mobile terminal, an image processing method, and a program capable of appropriately correcting the inclination of a document image caused by the tilt of the device at the time of shooting, by correcting the orientation of the document image using a non-blurred area in the document image shot by the user with the mobile terminal.
  • In order to achieve such an object, a mobile terminal according to the present invention includes: an image acquisition unit that acquires captured image data of a captured image captured by a capturing unit; a document specifying unit that specifies a document image included in the captured image; a partial area acquisition unit that acquires partial area image data of a partial area in the document image; a blur detection unit that detects blur in the partial area; a target area setting unit that, based on the blur, sets the partial area as a target area for identifying the orientation of the document image; an orientation specifying unit that specifies the orientation of the content in the target area and specifies the orientation of the document image based on the orientation of the content; and an orientation correction unit that acquires post-correction image data of the document image corrected upright based on the orientation of the document image.
  • An image processing method according to the present invention includes: an image acquisition step of acquiring captured image data of a captured image captured by a capturing unit; a document specifying step of specifying a document image included in the captured image; a partial area acquisition step of acquiring partial area image data of a partial area in the document image; a blur detection step of detecting blur in the partial area; a target area setting step of setting, based on the blur, the partial area as a target area for identifying the orientation of the document image; an orientation specifying step of specifying the orientation of the content in the target area and specifying the orientation of the document image based on the orientation of the content; and an orientation correction step of acquiring post-correction image data of the document image corrected upright based on the orientation of the document image.
  • A program according to the present invention causes a computer to execute: an image acquisition step of acquiring captured image data of a captured image captured by a capturing unit; a document specifying step of specifying a document image included in the captured image; a partial area acquisition step of acquiring partial area image data of a partial area in the document image; a blur detection step of detecting blur in the partial area; a target area setting step of setting, based on the blur, the partial area as a target area for identifying the orientation of the document image; an orientation specifying step of specifying the orientation of the content in the target area and specifying the orientation of the document image based on the orientation of the content; and an orientation correction step of acquiring post-correction image data of the document image corrected upright based on the orientation of the document image.
  • According to the present invention, it is possible to appropriately correct the orientation of a document image captured by a user with a mobile camera, regardless of the document type or the tilt of the device at the time of shooting.
  • FIG. 1 is a block diagram illustrating an example of a configuration of a mobile terminal according to the present embodiment.
  • FIG. 2 is a flowchart illustrating an example of processing in the mobile terminal according to the present embodiment.
  • FIG. 3 is a diagram illustrating an example of a captured image in the present embodiment.
  • FIG. 4 is a diagram illustrating an example of the orientation specifying process in the present embodiment.
  • FIG. 5 is a diagram illustrating an example of the orientation correction process in the present embodiment.
  • FIG. 6 is a diagram illustrating an example of partial area acquisition processing in the present embodiment.
  • FIG. 7 is a diagram illustrating an example of partial area acquisition processing in the present embodiment.
  • FIG. 8 is a diagram illustrating an example of partial area acquisition processing in the present embodiment.
  • FIG. 9 is a diagram illustrating an example of a captured image in the present embodiment.
  • FIG. 10 is a diagram illustrating an example of a document image in the present embodiment.
  • FIG. 11 is a diagram illustrating an example of blur determination in the present embodiment.
  • FIG. 12 is a diagram illustrating an example of blur determination in the present embodiment.
  • FIG. 13 is a schematic diagram illustrating an example of the orientation correction process in the present embodiment.
  • FIG. 1 is a block diagram illustrating an example of a configuration of the mobile terminal 100 according to the present embodiment.
  • The embodiment described below exemplifies the mobile terminal 100 for embodying the technical idea of the present invention; it is not intended to limit the present invention to this mobile terminal 100, and the present invention is equally applicable to the mobile terminal 100 of other embodiments included in the scope of the claims.
  • The form of function distribution in the mobile terminal 100 exemplified in the present embodiment is not limited to the following, and the functions may be distributed or integrated, functionally or physically, in arbitrary units within a range in which similar effects and functions can be achieved.
  • The mobile terminal 100 may be a portable information processing device such as a tablet terminal, a mobile phone, a smartphone, a PHS, a PDA, a notebook personal computer, or a wearable computer of a glasses type or a watch type.
  • the mobile terminal 100 is generally configured to include a control unit 102, a storage unit 106, a photographing unit 110, an input / output unit 112, a sensor unit 114, and a communication unit 116.
  • an input / output interface unit (not shown) for connecting the input / output unit 112 and the control unit 102 may be further provided.
  • Each unit of the mobile terminal 100 is connected to be communicable via an arbitrary communication path.
  • The communication unit 116 may be a network interface (such as an NIC (Network Interface Controller)) for transmitting and receiving IP data by wired communication and/or wireless communication (WiFi (registered trademark) or the like), or an interface that performs wireless communication by Bluetooth (registered trademark), infrared communication, or the like.
  • the mobile terminal 100 may be communicably connected to an external device via a network using the communication unit 116.
  • the sensor unit 114 detects a physical quantity and converts it into a signal (digital signal) of another medium.
  • the sensor unit 114 may include a proximity sensor, a direction sensor, a magnetic field sensor, a linear acceleration sensor, a luminance sensor, a gyro sensor, a pressure sensor, a gravity sensor, an acceleration sensor, an atmospheric pressure sensor, and/or a temperature sensor.
  • the input / output unit 112 performs data input / output (I / O).
  • the input / output unit 112 may be, for example, a key input unit, a touch panel, a control pad (for example, a touch pad and a game pad), a mouse, a keyboard, and / or a microphone.
  • the input / output unit 112 may be a display unit that displays a display screen of an application or the like (for example, a display, a monitor, a touch panel, or the like configured by liquid crystal or organic EL).
  • the input / output unit 112 may be an audio output unit (for example, a speaker or the like) that outputs audio information as audio.
  • the input / output unit (touch panel) 112 may include a sensor unit 114 that detects physical contact and converts it into a signal (digital signal).
  • the photographing unit 110 acquires continuous (moving image) image data (frames) by continuously photographing a subject (for example, a document or the like).
  • the imaging unit 110 may acquire video data.
  • the imaging unit 110 may acquire ancillary data.
  • the photographing unit 110 may be a camera or the like provided with an image sensor such as a CCD (Charge Coupled Device) and / or a CMOS (Complementary Metal Oxide Semiconductor).
  • the photographing unit 110 may acquire captured image data of a captured image that is a still image by capturing a still image of the subject.
  • the captured image data may be uncompressed image data.
  • the captured image data may be high-resolution image data.
  • the high resolution may be full high-definition, 4K resolution, Super Hi-Vision (8K resolution), or the like.
  • the photographing unit 110 may shoot moving images at 24 fps, 30 fps, or the like.
  • the storage unit 106 stores various databases, tables, and / or files.
  • the storage unit 106 may store various application programs (for example, user applications).
  • the storage unit 106 is a storage means; for example, a memory such as a RAM or ROM, a fixed disk device such as a hard disk, an SSD (Solid State Drive), a flexible disk, a tangible storage device such as an optical disk, or a memory circuit can be used.
  • the storage unit 106 stores a computer program and the like for giving instructions to the controller and performing various processes.
  • the dictionary data file 106a stores dictionary data.
  • the dictionary data may be data relating to characters, numbers, symbols and the like of each language.
  • the form data file 106b stores characteristic data and layout data of a specific form.
  • the specific form may be a prescribed form having a predetermined layout such as various licenses including a driver's license, various identification cards including a passport, or a health insurance card.
  • the image data file 106c stores image data (such as a frame).
  • the image data file 106c may store captured image data, document image data, partial area image data, target area image data, and / or corrected image data.
  • the image data file 106c may store position data such as a document image, a partial area, and / or a target area.
  • the image data file 106c may store character data corresponding to the image data.
  • the image data file 106c may store video data.
  • the image data file 106c may store ancillary data.
  • the control unit 102 may be composed of a tangible controller or control circuit including a CPU that centrally controls the mobile terminal 100, a many-core CPU, a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), and/or an FPGA (Field-Programmable Gate Array).
  • the control unit 102 has an internal memory for storing a control program, a program defining various processing procedures, and necessary data, and performs information processing for executing various processes based on these programs.
  • in terms of functional concept, the control unit 102 includes an image acquisition unit 102a, a document specifying unit 102b, a partial area acquisition unit 102c, a blur detection unit 102d, a target area setting unit 102e, a form determination unit 102f, an orientation specifying unit 102g, an orientation correction unit 102h, and an image display unit 102i.
  • the image acquisition unit 102a acquires image data.
  • the image acquisition unit 102a may acquire captured image data of a captured image captured by the capturing unit 110.
  • the image acquisition unit 102a may acquire captured image data obtained by re-shooting by the shooting unit 110 when the blur detection unit 102d does not detect a blur below a predetermined reference value.
  • the image acquisition unit 102a may acquire non-compressed and high-resolution image data.
  • the image acquisition unit 102a may acquire image data (frame) corresponding to one frame by controlling continuous image shooting or moving image shooting by the shooting unit 110.
  • the image acquisition unit 102a may acquire image data by controlling still image shooting by the shooting unit 110.
  • the image acquisition unit 102a may acquire document image data, partial area image data, target area image data, and / or corrected image data.
  • the image acquisition unit 102a may acquire ancillary data.
  • the document specifying unit 102b specifies a document image included in the photographed image.
  • the document may be a rectangular document.
  • the document specifying unit 102b may detect the position data of the document image from the captured image data.
  • the document specifying unit 102b may detect the corner coordinates (four points) of the document image from the captured image data.
  • the document specifying unit 102b may detect the layout of the document image from the captured image data.
  • the document specifying unit 102b may detect the position data of the document image from the captured image data using an edge detection method and/or a feature point matching method, and specify the document image based on the position data of the document image.
  • the partial area acquisition unit 102c acquires partial area image data of a partial area in the document image.
  • the partial area acquisition unit 102c may acquire partial area image data of a partial area obtained by dividing the document image.
  • the partial area acquisition unit 102c may acquire partial area image data of a partial area indicating characters in the original image by labeling the original image data of the original image.
  • the blur detection unit 102d detects the blur of the image.
  • the blur detection unit 102d may detect blur in the partial area.
  • the target area setting unit 102e sets a target area for identifying the orientation of the document image.
  • the target area setting unit 102e may set the partial area as a target area for identifying the orientation of the document image based on the blur.
  • when the blur detection unit 102d detects a partial area whose blur is equal to or less than a predetermined reference value, the target area setting unit 102e may, at that point, set that partial area as the target area for identifying the orientation of the document image.
  • the target area setting unit 102e may compare the blur detected by the blur detection unit 102d and set a partial area that is least blurred as a target area for identifying the orientation of the document image.
  • the form determination unit 102f determines whether the document image corresponds to the specific form based on the feature data of the specific form.
  • the orientation identifying unit 102g identifies the orientation of the document image.
  • the orientation specifying unit 102g may specify the orientation of the content in the target area, and may specify the orientation of the document image based on the orientation of the content.
  • the orientation specifying unit 102g may specify the orientation of the document image based on the layout data of the specific form when the form determining unit 102f determines that the original image corresponds to the specific form.
  • the orientation specifying unit 102g may specify a character area indicating a character in the target area by a labeling process on the target area image data of the target area, specify the direction of the character based on a comparison between the character area data of the character area and the dictionary data, and specify the orientation of the document image based on the direction of the character.
  • the orientation specifying unit 102g may also specify the direction of characters in the target area based on a comparison between the target area data of the target area and the dictionary data, and specify the orientation of the document image based on the direction of the characters.
  • the orientation correction unit 102h acquires post-correction image data of the original image that has been corrected upright.
  • the orientation correction unit 102h may acquire post-correction image data of the document image that has been corrected upright based on the orientation of the document image.
  • the image display unit 102i displays image data.
  • the image display unit 102i may display captured image data, document image data, partial area image data, target area image data, and / or corrected image data.
  • the image display unit 102i may display the image data on the input / output unit 112.
  • the image display unit 102i may display character data.
  • FIG. 2 is a flowchart illustrating an example of processing in the mobile terminal 100 of the present embodiment.
  • first, the image acquisition unit 102a controls shooting by the photographing unit (camera) 110 with a rectangular document as the subject, and acquires captured image data of a captured image shot by the photographing unit 110 (step SA-1).
  • the document specifying unit 102b then detects the position data of the document image from the captured image data using an edge detection method and/or a feature point matching method, and specifies the document image included in the captured image based on the position data of the document image (step SA-2).
  • the image display unit 102i may display the document image data of the document image specified by the document specifying unit 102b on the input / output unit 112, thereby allowing the user to confirm the specified document image.
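As a non-limiting sketch of how step SA-2 could be realized with off-the-shelf tools, the following detects a rectangular document by edge detection and crops it with a projective transform. It assumes OpenCV and NumPy; the function names, Canny thresholds, and output size are illustrative choices, not values prescribed by the patent.

```python
import cv2
import numpy as np

def find_document_corners(captured_bgr):
    """Return the 4 corner points of the largest quadrilateral contour, or None."""
    gray = cv2.cvtColor(captured_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)  # edge detection
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(contour, 0.02 * cv2.arcLength(contour, True), True)
        if len(approx) == 4:  # a quadrilateral is a document candidate
            return approx.reshape(4, 2).astype(np.float32)
    return None

def crop_document(captured_bgr, corners, out_w=1000, out_h=640):
    """Projective transform of the detected quadrilateral to a rectangular image.

    Assumes corners are ordered top-left, top-right, bottom-right, bottom-left;
    a real implementation would sort them first.
    """
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    matrix = cv2.getPerspectiveTransform(corners, dst)
    return cv2.warpPerspective(captured_bgr, matrix, (out_w, out_h))
```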
  • the form determination unit 102f determines whether the document image corresponds to the specific form based on the characteristic data of the specific form stored in the form data file 106b (step SA-3).
  • if the form determination unit 102f determines that the document image corresponds to a specific form (step SA-3: Yes), the process proceeds to step SA-4.
  • the orientation specifying unit 102g specifies the orientation of the document image based on the layout data of the specific form stored in the form data file 106b (Step SA-4).
  • the orientation correction unit 102h acquires the corrected image data of the original image corrected upright based on the orientation of the original image (step SA-5), and shifts the processing to step SA-12.
  • FIG. 3 is a diagram illustrating an example of a captured image in the present embodiment.
  • FIG. 4 is a diagram illustrating an example of the orientation specifying process in the present embodiment.
  • FIG. 5 is a diagram illustrating an example of the orientation correction process in the present embodiment.
  • the document image shown in FIG. 4 is specified by extracting a rectangle from the photographed image shown in FIG. 3.
  • the feature A is extracted from the document image of the driver's license shown in FIG. 4.
  • whether the document is a specific form is determined by checking the consistency between the extracted features and the feature data of the forms registered in advance in the database (form data file 106b).
  • when the form is a specific form, the form type information is set, and the orientation of the document image is corrected based on the layout data specific to that form, as shown in FIG. 5.
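The patent does not prescribe a concrete matching algorithm, but one plausible sketch of this feature-based form determination compares local descriptors of the document image against descriptors registered per form type. Here `form_db` and `min_matches` are hypothetical stand-ins for the form data file 106b and its matching criterion.

```python
import cv2

orb = cv2.ORB_create()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def match_form(document_gray, form_db, min_matches=40):
    """Return the best-matching registered form type, or None if no form matches.

    form_db is a hypothetical dict, e.g. {"driver_license": descriptors, ...},
    standing in for the feature data in the form data file 106b.
    """
    _, descriptors = orb.detectAndCompute(document_gray, None)
    if descriptors is None:
        return None
    best_type, best_score = None, 0
    for form_type, registered in form_db.items():
        score = len(matcher.match(descriptors, registered))  # consistent features
        if score > best_score:
            best_type, best_score = form_type, score
    return best_type if best_score >= min_matches else None
```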
  • on the other hand, if the form determination unit 102f determines that the document image does not correspond to a specific form (step SA-3: No), the process proceeds to step SA-6.
  • the partial area acquisition unit 102c acquires partial area image data of the partial area obtained by dividing the document image (step SA-6).
  • the partial area acquisition unit 102c may acquire partial area image data of a partial area indicating characters in the original image by labeling the original image data of the original image.
  • FIGS. 6 to 8 are diagrams illustrating an example of the partial region acquisition process in the present embodiment.
  • partial area image data of partial areas obtained by simply dividing the document image into 2×2 (four quadrants) may be acquired.
  • partial area image data of partial areas obtained by simply dividing the document image into 3×3 (nine areas) may be acquired.
  • alternatively, partial area image data of partial regions C divided into character units may be acquired, as sketched below.
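A minimal sketch of both partial-area strategies, assuming OpenCV and NumPy; `min_area` is an assumed tuning parameter, and the exact division and labeling rules are illustrative.

```python
import cv2

def split_grid(document, rows=2, cols=2):
    """Divide the document image into rows x cols partial areas (grid division)."""
    h, w = document.shape[:2]
    return [document[r * h // rows:(r + 1) * h // rows,
                     c * w // cols:(c + 1) * w // cols]
            for r in range(rows) for c in range(cols)]

def character_regions(document_gray, min_area=30):
    """Character-unit partial areas via connected-component labeling."""
    _, binary = cv2.threshold(document_gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    count, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    regions = []
    for i in range(1, count):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:   # drop specks that are unlikely to be characters
            regions.append(document_gray[y:y + h, x:x + w])
    return regions
```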
  • the blur detection unit 102d detects blur in the partial area (step SA-7).
  • the blur detection unit 102d may detect blur in the partial region using a determination method based on edge strength or the like.
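One common edge-strength-based measure, used here only as a stand-in for the unspecified determination method, is the variance of the Laplacian: sharp regions produce strong edges and a high variance, blurred regions a low one. The threshold is an assumed, tunable reference value.

```python
import cv2

def blur_score(region_gray):
    """Edge-strength measure: variance of the Laplacian (higher = sharper)."""
    return cv2.Laplacian(region_gray, cv2.CV_64F).var()

def within_reference(region_gray, threshold=100.0):
    """True when the region's blur is at or below the reference value,
    i.e. its sharpness score is at or above the assumed threshold."""
    return blur_score(region_gray) >= threshold
```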
  • FIG. 9 is a diagram illustrating an example of a captured image in the present embodiment.
  • FIG. 10 is a diagram illustrating an example of a document image in the present embodiment.
  • when a document is photographed with a mobile camera, it may be photographed from various directions, such as from an oblique direction.
  • the blur detection unit 102d determines whether or not there is a partial region where the detected blur is equal to or less than a predetermined reference value (step SA-8).
  • if the blur detection unit 102d determines that there is no partial region where the detected blur is equal to or less than the predetermined reference value (step SA-8: No), the process returns to step SA-1; if there is such a partial region (step SA-8: Yes), the process proceeds to step SA-9.
  • FIGS. 11 and 12 are diagrams illustrating an example of blur determination in the present embodiment.
  • the state where the blur of the document image is equal to or less than the reference value (not blurred) is a state where the document image is not blurred and the visibility of the characters is good.
  • the state in which the blur of the document image is larger than the reference value (blurred) is a state in which the document image is blurred and character visibility is poor; in this case, a retry (re-shooting) is required.
  • the target area setting unit 102e sets a partial area in which blur equal to or less than the predetermined reference value is detected as the target area for identifying the orientation of the document image (step SA-9).
  • the target area setting unit 102e may detect blur in each partial area in order, and set a partial area as the target area at the point when its blur is confirmed to be equal to or less than the reference value (not blurred).
  • the target area setting unit 102e may compare the blurs of the partial areas and set the least blurred area as the target area.
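Combining the two policies above, a minimal sketch of steps SA-8 and SA-9 might look like the following; it reuses `blur_score` from the previous sketch, and the threshold again stands in for the predetermined reference value.

```python
def select_target_area(partial_regions, threshold=100.0):
    """Pick the target area for orientation identification (steps SA-8/SA-9).

    Returns the sharpest partial region whose blur is at or below the
    reference value, or None, which corresponds to step SA-8: No
    (prompting a re-shoot at step SA-1).
    """
    scored = [(blur_score(region), region) for region in partial_regions]
    qualified = [(s, r) for s, r in scored if s >= threshold]
    if not qualified:
        return None
    return max(qualified, key=lambda pair: pair[0])[1]  # least blurred region
```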
  • the orientation specifying unit 102g specifies the orientation of the content in the target area, and specifies the orientation of the document image based on the orientation of the content (step SA-10).
  • the orientation specifying unit 102g may perform a labeling process on the target area and specify the direction of the content by comparing the character area data of the character areas, which are the content included in the target area, with the dictionary data stored in the dictionary data file 106a.
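A minimal sketch of step SA-10, under the assumption that orientation is resolved in 90-degree steps: rotate the target area four ways and keep the rotation whose characters best match the dictionary data. The scorer here is a hypothetical placeholder, not an actual OCR API.

```python
import numpy as np

def recognition_confidence(image_gray):
    # Hypothetical placeholder: in practice this would run character
    # recognition and return a match score against the dictionary data
    # (dictionary data file 106a).
    raise NotImplementedError

def document_orientation(target_area_gray):
    """Return the angle (0, 90, 180, or 270) that makes the content upright."""
    best_angle, best_conf = 0, float("-inf")
    for k in range(4):
        conf = recognition_confidence(np.rot90(target_area_gray, k))
        if conf > best_conf:
            best_angle, best_conf = 90 * k, conf
    return best_angle
```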
  • the orientation correcting unit 102h acquires post-correction image data of the original image that has been corrected upright based on the orientation of the original image (step SA-11).
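Once the angle is known, the upright correction of step SA-11 reduces to rotating the whole document image by that multiple of 90 degrees; a sketch:

```python
import numpy as np

def correct_upright(document, angle):
    """Rotate the whole document image by the detected angle (a multiple of 90).

    np.rot90 rotates counter-clockwise, matching the angle convention used
    in document_orientation above.
    """
    return np.rot90(document, k=(angle // 90) % 4)

# Example (hypothetical flow): the corrected image would then be saved,
# corresponding to step SA-12 and the image data file 106c.
# corrected = correct_upright(document, document_orientation(target_area))
# cv2.imwrite("corrected.png", corrected)
```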
  • the orientation correction unit 102h saves (stores) the corrected image data in the image data file 106c (step SA-12), and ends the process.
  • the image display unit 102i may cause the user to confirm the orientation-corrected document image by displaying the corrected image data of the document image acquired by the orientation correction unit 102h on the input / output unit 112.
  • FIG. 13 is a schematic diagram illustrating an example of the orientation correction process in the present embodiment.
  • first, a document is photographed (step SB-1), and a rectangle that becomes the document image is extracted from the photographed image (step SB-2).
  • at this point, the document image data of the clipped document image is displayed so that the user can confirm it; document image data of a document image that has undergone projective transformation may be displayed.
  • after the rectangularly extracted document image is displayed to the user, if the photographed document is a general document, the document image is divided into 2×2 parts (step SB-3).
  • blur is detected in each partial region (step SB-4), and when a partial region whose blur is at or below a certain level is detected, orientation correction is performed using that partial region; the orientation-corrected document image is saved (step SB-5), and the process ends.
  • a partial region that does not pass the blur determination need not be used; the blur determination may instead be performed on another partial region.
  • when the blur of each partial area is detected (step SB-4) and the blur of all four partial areas is larger than the reference value (blurred) (step SB-6), re-shooting is prompted and the process returns to step SB-1 (step SB-7).
  • as described above, in the present embodiment, the document image included in the captured image is detected, the document image is divided, the blur of each divided area is detected, an area with less blur is determined as the target area for orientation correction, and the orientation correction of the document image may then be performed using that target area.
  • the form may also be determined based on the feature amount of the document image, and in the case of a specific form (such as a driver's license or health insurance card), orientation correction processing specialized for the form type may be performed and the corrected image data stored.
  • a camera image has the problem that it is difficult to obtain an image of quality equivalent to that of a scanner, because the amount of ambient light, the shooting direction, and conditions such as motion during shooting are not stable.
  • the correct orientation of a document is determined by recognizing characters at the top of the document image or at random positions; however, a document photographed with a mobile camera may be shot from various directions, including oblique directions, resulting in blurred areas in the document image and a reduction in the accuracy of the orientation correction processing.
  • with the present embodiment, image processing performed at scanner image quality can also be applied to mobile camera image quality.
  • the mobile terminal 100 may perform processing in a stand-alone form, or may perform processing in response to a request from a client terminal (housed separately from the mobile terminal 100) and return the processing result to that client terminal.
  • all or part of the processes described as being performed automatically may be performed manually, and all or part of the processes described as being performed manually may be performed automatically by known methods.
  • the processing procedures, control procedures, specific names, information including parameters such as registration data and search conditions for each process, screen examples, and database configurations shown in the description and drawings may be changed arbitrarily unless otherwise noted.
  • each illustrated component is functionally conceptual and does not necessarily need to be physically configured as illustrated.
  • the processing functions performed by each device of the mobile terminal 100, in particular by the control unit 102, may be realized in whole or in arbitrary part by a CPU and a program interpreted and executed by the CPU, or may be realized as hardware by wired logic.
  • the program, which includes programmed instructions for causing a computer to execute the method according to the present invention described later, is recorded on a non-transitory computer-readable recording medium and is mechanically read by the mobile terminal 100 as necessary.
  • the computer program may be stored in an application program server connected to the mobile terminal 100 via an arbitrary network, and all or part of it may be downloaded as necessary.
  • the program according to the present invention may be stored in a computer-readable recording medium, or may be configured as a program product.
  • the “recording medium” includes any “portable physical medium” such as a memory card, USB memory, SD card, flexible disk, magneto-optical disk, ROM, EPROM, EEPROM, CD-ROM, DVD, or Blu-ray (registered trademark) Disc.
  • a “program” is a data processing method described in an arbitrary language or description method, and may take any form such as source code or binary code. The “program” is not necessarily limited to a single configuration, and includes programs that achieve their functions in cooperation with a separate configuration such as a plurality of modules or libraries, or with a separate program represented by the OS. Well-known configurations and procedures can be used for the specific configuration for reading the recording medium in each apparatus shown in the embodiment, for the reading procedure, and for the installation procedure after reading.
  • the various databases and the like stored in the storage unit 106 are storage means, such as a memory device such as a RAM or ROM, a fixed disk device such as a hard disk, a flexible disk, and/or an optical disk, and may store various programs, tables, databases, and/or web page files used for various processes.
  • the mobile terminal 100 may be configured as an information processing apparatus such as a known personal computer, or may be configured by connecting an arbitrary peripheral device to the information processing apparatus.
  • the mobile terminal 100 may be realized by installing software (including programs, data, and the like) that causes the information processing apparatus to implement the method of the present invention.
  • the specific form of distribution and integration of the devices is not limited to that shown in the figures; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various additions or functional loads. The above-described embodiments may also be arbitrarily combined or selectively implemented.
  • as described above, the mobile terminal, the image processing method, and the program according to the present invention can be implemented in many industrial fields, particularly in the image processing field that handles images read by a camera, and are extremely useful.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)
  • Character Input (AREA)

Abstract

Captured image data of an image captured by an image capture unit is acquired, a source document image included in the captured image is identified, image data of partial areas in the source document image is acquired, blur in the partial areas is detected, a partial area is set as an area for identifying the orientation of the source document image, on the basis of the blur, the orientation of content in said area is identified, the orientation of the source document image is identified on the basis of the orientation of the content, and corrected image data of the source document image which has been corrected so as to be upright is acquired on the basis of the orientation of the source document image.

Description

Mobile terminal, image processing method, and program
The present invention relates to a mobile terminal, an image processing method, and a program.

Conventionally, techniques for detecting image blur have been disclosed. For example, a technique is disclosed that detects blur or camera shake in the subject area of a photographed image and evaluates whether the image is a successful image in which the blur was intentionally produced by the user, or an unsuccessful, failed image (see Patent Literature 1).

JP 2013-12906 A

However, the conventional image processing apparatus (Patent Document 1) has the problem that the blur determination of an image is not used to correct the orientation of the image.

The present invention has been made in view of the above problems. An object of the present invention is to provide a mobile terminal, an image processing method, and a program capable of appropriately correcting the inclination of a document image caused by the tilt of the device at the time of shooting, by correcting the orientation of the document image using a non-blurred area in the document image shot by the user with the mobile terminal.

In order to achieve such an object, a mobile terminal according to the present invention includes: an image acquisition unit that acquires captured image data of a captured image captured by a capturing unit; a document specifying unit that specifies a document image included in the captured image; a partial area acquisition unit that acquires partial area image data of a partial area in the document image; a blur detection unit that detects blur in the partial area; a target area setting unit that, based on the blur, sets the partial area as a target area for identifying the orientation of the document image; an orientation specifying unit that specifies the orientation of the content in the target area and specifies the orientation of the document image based on the orientation of the content; and an orientation correction unit that acquires post-correction image data of the document image corrected upright based on the orientation of the document image.

An image processing method according to the present invention includes: an image acquisition step of acquiring captured image data of a captured image captured by a capturing unit; a document specifying step of specifying a document image included in the captured image; a partial area acquisition step of acquiring partial area image data of a partial area in the document image; a blur detection step of detecting blur in the partial area; a target area setting step of setting, based on the blur, the partial area as a target area for identifying the orientation of the document image; an orientation specifying step of specifying the orientation of the content in the target area and specifying the orientation of the document image based on the orientation of the content; and an orientation correction step of acquiring post-correction image data of the document image corrected upright based on the orientation of the document image.

A program according to the present invention causes a computer to execute: an image acquisition step of acquiring captured image data of a captured image captured by a capturing unit; a document specifying step of specifying a document image included in the captured image; a partial area acquisition step of acquiring partial area image data of a partial area in the document image; a blur detection step of detecting blur in the partial area; a target area setting step of setting, based on the blur, the partial area as a target area for identifying the orientation of the document image; an orientation specifying step of specifying the orientation of the content in the target area and specifying the orientation of the document image based on the orientation of the content; and an orientation correction step of acquiring post-correction image data of the document image corrected upright based on the orientation of the document image.

According to the present invention, it is possible to appropriately correct the orientation of a document image captured by a user with a mobile camera, regardless of the document type or the tilt of the device at the time of shooting.
FIG. 1 is a block diagram illustrating an example of the configuration of the mobile terminal according to the present embodiment.
FIG. 2 is a flowchart illustrating an example of processing in the mobile terminal of the present embodiment.
FIG. 3 is a diagram illustrating an example of a captured image in the present embodiment.
FIG. 4 is a diagram illustrating an example of the orientation specifying process in the present embodiment.
FIG. 5 is a diagram illustrating an example of the orientation correction process in the present embodiment.
FIG. 6 is a diagram illustrating an example of the partial area acquisition process in the present embodiment.
FIG. 7 is a diagram illustrating an example of the partial area acquisition process in the present embodiment.
FIG. 8 is a diagram illustrating an example of the partial area acquisition process in the present embodiment.
FIG. 9 is a diagram illustrating an example of a captured image in the present embodiment.
FIG. 10 is a diagram illustrating an example of a document image in the present embodiment.
FIG. 11 is a diagram illustrating an example of blur determination in the present embodiment.
FIG. 12 is a diagram illustrating an example of blur determination in the present embodiment.
FIG. 13 is a schematic diagram illustrating an example of the orientation correction process in the present embodiment.
Hereinafter, embodiments of a mobile terminal, an image processing method, and a program according to the present invention will be described in detail based on the drawings. The present invention is not limited by these embodiments.
[Configuration of this embodiment]

Hereinafter, an example of the configuration of the mobile terminal 100 according to the embodiment of the present invention will be described with reference to FIG. 1, and then the processing and the like of the present embodiment will be described in detail. FIG. 1 is a block diagram illustrating an example of the configuration of the mobile terminal 100 according to the present embodiment.
However, the embodiment described below exemplifies the mobile terminal 100 for embodying the technical idea of the present invention; it is not intended to limit the present invention to this mobile terminal 100, and the present invention is equally applicable to the mobile terminal 100 of other embodiments included in the scope of the claims.

In addition, the form of function distribution in the mobile terminal 100 exemplified in the present embodiment is not limited to the following, and the functions may be distributed or integrated, functionally or physically, in arbitrary units within a range in which similar effects and functions can be achieved.

Here, the mobile terminal 100 may be a portable information processing device such as a tablet terminal, a mobile phone, a smartphone, a PHS, a PDA, a notebook personal computer, or a wearable computer of a glasses type or a watch type.

First, as shown in FIG. 1, the mobile terminal 100 is generally configured to include a control unit 102, a storage unit 106, a photographing unit 110, an input/output unit 112, a sensor unit 114, and a communication unit 116.

Although omitted in FIG. 1, the present embodiment may further include an input/output interface unit (not shown) that connects the input/output unit 112 and the control unit 102. The units of the mobile terminal 100 are communicably connected via arbitrary communication paths.

Here, the communication unit 116 may be a network interface (such as an NIC (Network Interface Controller)) for transmitting and receiving IP data by wired communication and/or wireless communication (WiFi (registered trademark) or the like), or an interface that performs wireless communication by Bluetooth (registered trademark), infrared communication, or the like.

The mobile terminal 100 may be communicably connected to an external device via a network using the communication unit 116.

The sensor unit 114 detects a physical quantity and converts it into a signal (digital signal) of another medium. The sensor unit 114 may include a proximity sensor, a direction sensor, a magnetic field sensor, a linear acceleration sensor, a luminance sensor, a gyro sensor, a pressure sensor, a gravity sensor, an acceleration sensor, an atmospheric pressure sensor, and/or a temperature sensor.
The input/output unit 112 performs data input/output (I/O). The input/output unit 112 may be, for example, a key input unit, a touch panel, a control pad (for example, a touch pad or a game pad), a mouse, a keyboard, and/or a microphone.

The input/output unit 112 may also be a display unit that displays a display screen of an application or the like (for example, a display, a monitor, or a touch panel configured with liquid crystal or organic EL).

The input/output unit 112 may also be an audio output unit (for example, a speaker) that outputs audio information as sound. The input/output unit (touch panel) 112 may include a sensor unit 114 that detects physical contact and converts it into a signal (digital signal).

The photographing unit 110 acquires continuous (moving-image) image data (frames) by continuously photographing a subject (for example, a document). For example, the photographing unit 110 may acquire video data, and may also acquire ancillary data.

Here, the photographing unit 110 may be a camera or the like provided with an image sensor such as a CCD (Charge Coupled Device) and/or a CMOS (Complementary Metal Oxide Semiconductor).

The photographing unit 110 may also acquire captured image data of a captured image that is a still image by photographing the subject as a still image. The captured image data may be uncompressed image data, and may be high-resolution image data.

Here, the high resolution may be full high-definition, 4K resolution, Super Hi-Vision (8K resolution), or the like. The photographing unit 110 may shoot moving images at 24 fps, 30 fps, or the like.
The storage unit 106 stores various databases, tables, and/or files. The storage unit 106 may also store various application programs (for example, user applications).

The storage unit 106 is a storage means; for example, a memory such as a RAM or ROM, a fixed disk device such as a hard disk, an SSD (Solid State Drive), a flexible disk, a tangible storage device such as an optical disk, or a memory circuit can be used.

The storage unit 106 also stores computer programs and the like for giving instructions to the controller and performing various processes.

Among the components of the storage unit 106, the dictionary data file 106a stores dictionary data. The dictionary data may be data relating to the characters, numbers, symbols, and the like of each language.

The form data file 106b stores feature data and layout data of specific forms. Here, a specific form may be a prescribed form having a predetermined layout, such as various licenses including a driver's license, various identification cards including a passport, or a health insurance card.

The image data file 106c stores image data (frames and the like). The image data file 106c may store captured image data, document image data, partial area image data, target area image data, and/or corrected image data.

The image data file 106c may also store position data of the document image, the partial areas, and/or the target area, and may store character data corresponding to the image data.

The image data file 106c may also store video data and ancillary data.
The control unit 102 may be composed of a tangible controller or control circuit including a CPU that centrally controls the mobile terminal 100, a many-core CPU, a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), and/or an FPGA (Field-Programmable Gate Array).

The control unit 102 has an internal memory for storing a control program, programs defining various processing procedures, and necessary data, and performs information processing for executing various processes based on these programs.

In terms of functional concept, the control unit 102 includes an image acquisition unit 102a, a document specifying unit 102b, a partial area acquisition unit 102c, a blur detection unit 102d, a target area setting unit 102e, a form determination unit 102f, an orientation specifying unit 102g, an orientation correction unit 102h, and an image display unit 102i.

The image acquisition unit 102a acquires image data. The image acquisition unit 102a may acquire captured image data of a captured image captured by the photographing unit 110.

The image acquisition unit 102a may also acquire captured image data from re-shooting by the photographing unit 110 when the blur detection unit 102d does not detect any blur at or below a predetermined reference value. The image acquisition unit 102a may acquire uncompressed, high-resolution image data.

The image acquisition unit 102a may also control continuous image shooting or moving-image shooting by the photographing unit 110 to acquire image data (frames) corresponding to one frame, and may control still-image shooting by the photographing unit 110 to acquire image data.

The image acquisition unit 102a may also acquire document image data, partial area image data, target area image data, and/or corrected image data, as well as ancillary data.
The document specifying unit 102b specifies a document image included in the captured image. Here, the document may be a rectangular document. The document specifying unit 102b may detect the position data of the document image from the captured image data.

The document specifying unit 102b may detect the corner coordinates (four points) of the document image from the captured image data, and may detect the layout of the document image from the captured image data.

The document specifying unit 102b may detect the position data of the document image from the captured image data using an edge detection method and/or a feature point matching method, and specify the document image based on the position data of the document image.

The partial area acquisition unit 102c acquires partial area image data of partial areas in the document image. The partial area acquisition unit 102c may acquire partial area image data of partial areas obtained by dividing the document image.

The partial area acquisition unit 102c may also acquire partial area image data of partial areas indicating characters in the document image by a labeling process on the document image data of the document image.

The blur detection unit 102d detects the blur of an image. The blur detection unit 102d may detect the blur of the partial areas.

The target area setting unit 102e sets a target area for identifying the orientation of the document image. The target area setting unit 102e may set a partial area as the target area for identifying the orientation of the document image based on the blur.

When the blur detection unit 102d detects a partial area whose blur is equal to or less than a predetermined reference value, the target area setting unit 102e may, at that point, set that partial area as the target area for identifying the orientation of the document image.

The target area setting unit 102e may also compare the blur detected by the blur detection unit 102d and set the least blurred partial area as the target area for identifying the orientation of the document image.
 The form determination unit 102f determines whether the document image corresponds to a specific form based on the feature data of that form.
 The orientation specifying unit 102g specifies the orientation of the document image. Here, the orientation specifying unit 102g may specify the orientation of the content in the target area and specify the orientation of the document image based on the orientation of that content.
 When the form determination unit 102f determines that the document image corresponds to a specific form, the orientation specifying unit 102g may also specify the orientation of the document image based on the layout data of that form.
 The orientation specifying unit 102g may also specify character areas representing characters in the target area by applying a labeling process to the target area image data, specify the orientation of the characters based on a comparison between the character area data and dictionary data, and specify the orientation of the document image based on the orientation of the characters.
 Alternatively, the orientation specifying unit 102g may specify the orientation of the characters in the target area based on a comparison between the target area data and the dictionary data, and specify the orientation of the document image based on the orientation of the characters.
 The orientation correction unit 102h acquires corrected image data of the upright-corrected document image. Here, the orientation correction unit 102h may acquire that corrected image data based on the specified orientation of the document image.
 The image display unit 102i displays image data. Here, the image display unit 102i may display the captured image data, document image data, partial area image data, target area image data, and/or corrected image data.
 The image display unit 102i may display the image data on the input/output unit 112, and may also display character data.
[Processing of the Present Embodiment]
 An example of the processing executed by the mobile terminal 100 configured as described above will now be described with reference to FIGS. 2 to 13. FIG. 2 is a flowchart illustrating an example of the processing in the mobile terminal 100 of the present embodiment.
 As shown in FIG. 2, the image acquisition unit 102a first controls shooting by the shooting unit (camera) 110 with a rectangular document as the subject, and acquires the captured image data of the resulting captured image (step SA-1).
 The document specifying unit 102b then detects the position data of the document image from the captured image data using an edge detection method and/or a feature point matching method, and specifies the document image included in the captured image based on that position data (step SA-2).
 At this point, the image display unit 102i may display the document image data of the document image specified by the document specifying unit 102b on the input/output unit 112 so that the user can confirm the specified document image.
 The form determination unit 102f then determines whether the document image corresponds to a specific form based on the feature data of specific forms stored in the form data file 106b (step SA-3).
 If the form determination unit 102f determines that the document image corresponds to a specific form (step SA-3: Yes), the process proceeds to step SA-4.
 The orientation specifying unit 102g then specifies the orientation of the document image based on the layout data of the specific form stored in the form data file 106b (step SA-4).
 The orientation correction unit 102h then acquires the corrected image data of the upright-corrected document image based on that orientation (step SA-5), and the process proceeds to step SA-12.
 An example of the orientation correction processing for a specific form in the present embodiment will now be described with reference to FIGS. 3 to 5. FIG. 3 shows an example of a captured image, FIG. 4 an example of the orientation specifying processing, and FIG. 5 an example of the orientation correction processing in the present embodiment.
 In the present embodiment, the document image shown in FIG. 4 is specified by extracting a rectangle from the captured image shown in FIG. 3.
 In the present embodiment, feature A (the portion enclosed by the thick frame) is then extracted from the document image of the driver's license shown in FIG. 4.
 Whether the document is a specific form (a driver's license) is then determined by checking the consistency between the extracted features and the form feature data registered in advance in the database (form data file 106b).
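 As an illustration of this consistency check, the sketch below uses ORB descriptor matching as a stand-in for the unspecified feature data in the form data file 106b; the helper name and the match-count threshold are assumptions introduced here, not part of the present embodiment.

```python
import cv2
import numpy as np

def matches_registered_form(document_gray: np.ndarray,
                            registered_gray: np.ndarray,
                            min_matches: int = 40) -> bool:
    """Compare the captured document against one registered form image."""
    orb = cv2.ORB_create()
    _, desc_doc = orb.detectAndCompute(document_gray, None)
    _, desc_reg = orb.detectAndCompute(registered_gray, None)
    if desc_doc is None or desc_reg is None:
        return False  # too few features to judge consistency
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return len(matcher.match(desc_doc, desc_reg)) >= min_matches
```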
 When the document is a specific form, form type information is set, and the orientation of the document image is corrected based on the layout data specific to that form, as shown in FIG. 5.
 Returning to FIG. 2, if the form determination unit 102f determines that the document image does not correspond to a specific form (step SA-3: No), the process proceeds to step SA-6.
 The partial area acquisition unit 102c then acquires the partial area image data of the partial areas obtained by dividing the document image (step SA-6).
 Alternatively, the partial area acquisition unit 102c may acquire partial area image data of partial areas representing characters in the document image by applying a labeling process to the document image data.
 An example of the partial area acquisition processing in the present embodiment will now be described with reference to FIGS. 6 to 8, each of which shows an example of this processing.
 As shown in FIG. 6, in the present embodiment, partial area image data may be acquired by simply dividing the document image into four (2 × 2) partial areas.
 As shown in FIG. 7, partial area image data may instead be acquired by simply dividing the document image into nine (3 × 3) partial areas.
 Alternatively, as shown in FIG. 8, a labeling process may be applied to the document image data of the binarized document image B to acquire partial area image data of partial areas C divided down to the character (label) level.
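 The following sketch illustrates both acquisition strategies under the assumption of grayscale NumPy/OpenCV images: grid_split mirrors the simple 2 × 2 or 3 × 3 division of FIGS. 6 and 7, and character_regions the label-level division of FIG. 8; both helper names are hypothetical.

```python
import cv2
import numpy as np

def grid_split(document: np.ndarray, rows: int = 2, cols: int = 2):
    """Divide the document image into rows x cols partial areas."""
    h, w = document.shape[:2]
    return [document[r * h // rows:(r + 1) * h // rows,
                     c * w // cols:(c + 1) * w // cols]
            for r in range(rows) for c in range(cols)]

def character_regions(document_gray: np.ndarray):
    """Label connected components of the binarized image (cf. FIG. 8)."""
    _, binary = cv2.threshold(document_gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    # each stats row holds x, y, width, height, area; row 0 is background
    return [tuple(stats[i, :4]) for i in range(1, n_labels)]
```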
 Returning to FIG. 2, the blur detection unit 102d detects the blur of the partial areas (step SA-7). Here, the blur detection unit 102d may detect the blur of a partial area using, for example, a determination method based on edge strength.
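 One common edge-strength measure is the variance of the Laplacian; the sketch below uses it as a stand-in for the determination method left unspecified here, and the inversion to a "higher score = more blur" convention is an assumption of this illustration.

```python
import cv2
import numpy as np

def blur_score(partial_area_gray: np.ndarray) -> float:
    """Return a blur measure; larger values mean weaker edges (more blur)."""
    edge_strength = cv2.Laplacian(partial_area_gray, cv2.CV_64F).var()
    return 1.0 / (edge_strength + 1e-6)
```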
 An example of how blur arises in the present embodiment will now be described with reference to FIGS. 9 and 10. FIG. 9 shows an example of a captured image, and FIG. 10 an example of a document image in the present embodiment.
 When a document is photographed with a mobile camera, it may be shot from many directions, including obliquely, so captured images like the one shown in FIG. 9 are common.
 Consequently, in the document image D included in the captured image of FIG. 9, as shown in FIG. 10, blur is unlikely and character resolution degrades little in the area E close to the shooting position, whereas blur arises easily and character resolution degrades readily in the area F of document image D far from the shooting position.
 Returning to FIG. 2, the blur detection unit 102d determines whether there is a partial area whose detected blur is at or below a predetermined reference value (step SA-8).
 If the blur detection unit 102d determines that there is no partial area whose detected blur is at or below the predetermined reference value (step SA-8: No), the process returns to step SA-1.
 If the blur detection unit 102d determines that there is a partial area whose detected blur is at or below the predetermined reference value (step SA-8: Yes), the process proceeds to step SA-9.
 An example of the blur determination in the present embodiment will now be described with reference to FIGS. 11 and 12, which show examples of the blur determination.
 As shown in FIG. 11, in the present embodiment, the state in which the blur of the document image is at or below the reference value (not blurred) is a state in which the document image is not blurred and the characters are clearly legible.
 Conversely, as shown in FIG. 12, the state in which the blur of the document image exceeds the reference value (blurred) is a state in which the document image is blurred and character legibility is poor, so the shot must be retried.
 Returning to FIG. 2, the target area setting unit 102e sets a partial area in which blur at or below the predetermined reference value was detected as the target area for identifying the orientation of the document image (step SA-9).
 For example, the target area setting unit 102e may examine the blur of each partial area in turn and, at the point when it confirms an area whose blur is at or below the reference value (not blurred), set that partial area as the target area.
 Alternatively, the target area setting unit 102e may compare the blur of all the partial areas and set the least blurred area as the target area.
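 The two selection strategies can be sketched as follows, reusing the hypothetical blur_score() helper introduced above; BLUR_REFERENCE is an illustrative placeholder, not a value from the present embodiment.

```python
BLUR_REFERENCE = 0.05  # assumed reference value

def first_sharp_area(partial_areas):
    """Strategy 1: stop at the first area at or below the reference value."""
    for area in partial_areas:
        if blur_score(area) <= BLUR_REFERENCE:
            return area
    return None  # no usable area -> the shot is retried (step SA-8: No)

def least_blurred_area(partial_areas):
    """Strategy 2: compare all areas and take the least blurred one."""
    return min(partial_areas, key=blur_score)
```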
 The orientation specifying unit 102g then specifies the orientation of the content in the target area, and specifies the orientation of the document image based on that content orientation (step SA-10).
 For example, the orientation specifying unit 102g may specify the content orientation by applying a labeling process to the character area data of the character areas (the content contained in the target area) and comparing the character area data with the dictionary data stored in the dictionary data file 106a.
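 As a hedged illustration, the sketch below substitutes OCR scoring at the four candidate rotations for the per-character dictionary comparison described above; it assumes the pytesseract wrapper and the Tesseract engine are available, which the present embodiment does not prescribe.

```python
import numpy as np
import pytesseract

def content_orientation(target_area_gray: np.ndarray) -> int:
    """Return 0, 90, 180, or 270: the CCW rotation yielding the most text."""
    def ocr_yield(image: np.ndarray) -> int:
        text = pytesseract.image_to_string(image)
        return sum(ch.isalnum() for ch in text)  # crude readability score
    rotations = {angle: np.rot90(target_area_gray, k)
                 for k, angle in enumerate((0, 90, 180, 270))}
    return max(rotations, key=lambda angle: ocr_yield(rotations[angle]))
```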
 The orientation correction unit 102h then acquires the corrected image data of the upright-corrected document image based on the orientation of the document image (step SA-11).
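 The upright correction itself then reduces to rotating the document image back by the detected angle, as in this minimal sketch; the angle convention follows the hypothetical content_orientation() helper above (the counter-clockwise rotation that makes the content readable).

```python
import numpy as np

def correct_upright(document: np.ndarray, orientation: int) -> np.ndarray:
    """Rotate the whole document image so that its content reads upright."""
    return np.rot90(document, k=orientation // 90)
```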
 The orientation correction unit 102h then saves (stores) the corrected image data in the image data file 106c (step SA-12), and the processing ends.
 In addition, the image display unit 102i may display the corrected image data acquired by the orientation correction unit 102h on the input/output unit 112 so that the user can confirm the orientation-corrected document image.
 An overview of the orientation correction processing in the present embodiment will now be described with reference to FIG. 13, a schematic diagram showing an example of this processing.
 As shown in FIG. 13, in the present embodiment, the document is photographed (step SB-1) and a rectangle that becomes the document image is extracted from the captured image (step SB-2).
 At this point, the document image data of the cropped document image is displayed for the user to confirm; the document image data of a projectively transformed document image may be displayed instead.
 After the rectangle-extracted document image has been shown to the user, if the photographed document is a general document, the document image is divided 2 × 2 into partial areas (step SB-3).
 The blur of each partial area is then detected (step SB-4); at the point when a partial area with blur at or below a certain level is found, that partial area is used to perform the orientation correction, the orientation-corrected document image is saved (step SB-5), and the processing ends.
 If a partial area is blank or contains few character candidates, that partial area may be excluded from blur detection, and the blur determination may be performed on the other partial areas.
 Conversely, when the blur of each partial area is detected (step SB-4) and all four partial areas are blurrier than the reference value (step SB-6), the process returns to step SB-1 so that the shot itself is redone (retried) (step SB-7).
 In this way, the present embodiment may detect the document image contained in an image, divide the document image, detect the blur of each divided area, decide on a less blurred area as the target area for orientation correction, and perform the orientation correction of the document image from that target area, as the sketch after this paragraph illustrates.
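 Tying the hypothetical helpers from the preceding sketches together, the flow of FIG. 13 could look roughly as follows; this is a sketch of the described flow under the assumption that find_document_quad, grid_split, blur_score, BLUR_REFERENCE, least_blurred_area, content_orientation, and correct_upright are already in scope, not the implementation of the present embodiment.

```python
import cv2

def process_capture(captured_bgr):
    quad = find_document_quad(captured_bgr)        # SB-2: rectangle extraction
    if quad is None:
        return None                                # no document detected
    x, y, w, h = cv2.boundingRect(quad)
    document = cv2.cvtColor(captured_bgr[y:y + h, x:x + w],
                            cv2.COLOR_BGR2GRAY)
    areas = grid_split(document, 2, 2)             # SB-3: 2x2 division
    target = least_blurred_area(areas)             # SB-4: blur comparison
    if blur_score(target) > BLUR_REFERENCE:
        return None                                # SB-6/SB-7: retry the shot
    angle = content_orientation(target)            # orientation of the content
    return correct_upright(document, angle)        # SB-5: image to be saved
```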
 In the present embodiment, form determination based on the feature quantities of the document image may also be performed; in the case of a specific form (such as a driver's license or health insurance card), an orientation correction process specialized for that form type is applied and the processed image data is saved.
 In recent years, with the spread of mobile terminals such as smartphones and tablets, work for which scanners were traditionally used has been shifting to work performed with camera-equipped mobile terminals.
 This is because scanning with a camera offers a high degree of freedom: it can be done anywhere and imposes no restrictions on the medium.
 Camera images, on the other hand, suffer from unstable conditions such as ambient light, shooting direction, and motion during shooting, so it has been difficult to obtain images of quality equivalent to a scanner's.
 Furthermore, conventional orientation correction processing determined the correct document orientation by recognizing characters at the top of the document image or at random positions.
 However, when a document is photographed with an ordinary camera, it may be shot from many directions, including obliquely, so blurred areas arise in the document image and degrade the accuracy of the orientation correction processing.
 The present embodiment therefore adapts image processing that was designed for scanner image quality so that it also handles mobile-camera image quality.
[Other Embodiments]
 Although embodiments of the present invention have been described above, the present invention may be implemented in various other embodiments within the scope of the technical idea set out in the claims.
 For example, the mobile terminal 100 may perform the processing standalone, or it may perform processing in response to a request from a client terminal (housed separately from the mobile terminal 100) and return the processing result to that client terminal.
 Of the processes described in the embodiments, all or part of those described as automatic may instead be performed manually, and all or part of those described as manual may be performed automatically by known methods.
 In addition, the processing procedures, control procedures, specific names, information including parameters such as registration data and search conditions for each process, screen examples, and database configurations shown in the specification and drawings may be changed arbitrarily unless otherwise noted.
 Regarding the mobile terminal 100, the illustrated components are functional and conceptual, and need not be physically configured as illustrated.
 For example, all or any part of the processing functions of each device of the mobile terminal 100, in particular those performed by the control unit 102, may be realized by a CPU and a program interpreted and executed by that CPU, or may be realized as wired-logic hardware.
 The program, which includes programmed instructions for causing a computer to execute the method according to the present invention, is recorded on a non-transitory computer-readable recording medium and is mechanically read by the mobile terminal 100 as needed. That is, the storage unit 106, such as a ROM or HDD, records computer programs that issue instructions to the CPU in cooperation with the OS (Operating System) to perform various processes. These computer programs are executed by being loaded into RAM and, in cooperation with the CPU, constitute the control unit.
 These computer programs may also be stored on an application program server connected to the mobile terminal 100 via an arbitrary network, and all or part of them may be downloaded as needed.
 The program according to the present invention may also be stored on a computer-readable recording medium, or may be configured as a program product. This "recording medium" includes any "portable physical medium" such as a memory card, USB memory, SD card, flexible disk, magneto-optical disk, ROM, EPROM, EEPROM, CD-ROM, DVD, or Blu-ray (registered trademark) Disc.
 A "program" is a data-processing method written in any language or notation, regardless of form such as source code or binary code. A "program" is not necessarily limited to a single unit; it includes programs distributed as multiple modules or libraries, and programs that achieve their function in cooperation with a separate program, typified by the OS. Well-known configurations and procedures may be used for the specific structure for reading the recording medium in each device of the embodiments, for the reading procedure, and for the installation procedure after reading.
 The various databases and the like stored in the storage unit 106 are storage means such as memory devices (RAM or ROM), fixed disk devices (hard disks), flexible disks, and/or optical disks, and may store the various programs, tables, databases, and/or web page files used for the various processes and for providing websites.
 The mobile terminal 100 may also be configured as an information processing apparatus such as a known personal computer, optionally with arbitrary peripheral devices connected, and may be realized by installing on that apparatus software (including programs and data) that implements the method of the present invention.
 Furthermore, the specific form of distribution and integration of the devices is not limited to that illustrated; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various additions or functional loads. That is, the embodiments described above may be implemented in any combination, or selectively.
 As described above, the mobile terminal, image processing method, and program can be used in many industrial fields, particularly in image processing fields that handle images read by a camera, and are extremely useful.
 100 Mobile terminal
 102 Control unit
 102a Image acquisition unit
 102b Document specifying unit
 102c Partial area acquisition unit
 102d Blur detection unit
 102e Target area setting unit
 102f Form determination unit
 102g Orientation specifying unit
 102h Orientation correction unit
 102i Image display unit
 106 Storage unit
 106a Dictionary data file
 106b Form data file
 106c Image data file
 110 Shooting unit
 112 Input/output unit
 114 Sensor unit
 116 Communication unit

Claims (17)

  1.  A mobile terminal comprising:
     image acquisition means for acquiring captured image data of a captured image captured by a shooting unit;
     document specifying means for specifying a document image included in the captured image;
     partial area acquisition means for acquiring partial area image data of a partial area in the document image;
     blur detection means for detecting blur in the partial area;
     target area setting means for setting, based on the blur, the partial area as a target area in which the orientation of the document image is to be identified;
     orientation specifying means for specifying the orientation of content in the target area and specifying the orientation of the document image based on the orientation of the content; and
     orientation correction means for acquiring corrected image data of the document image upright-corrected based on the orientation of the document image.
  2.  The mobile terminal according to claim 1, further comprising:
     form data storage means for storing feature data and layout data of a specific form; and
     form determination means for determining, based on the feature data, whether the document image corresponds to the specific form,
     wherein the orientation specifying means further specifies the orientation of the document image based on the layout data when the form determination means determines that the document image corresponds to the specific form.
  3.  The mobile terminal according to claim 1 or 2, further comprising dictionary data storage means for storing dictionary data,
     wherein the orientation specifying means specifies a character area representing a character in the target area by a labeling process on the target area image data of the target area, specifies the orientation of the character in the character area based on a comparison between the character area data of the character area and the dictionary data, and specifies the orientation of the document image based on the orientation of the character.
  4.  The mobile terminal according to any one of claims 1 to 3, wherein the partial area acquisition means acquires the partial area image data of the partial area obtained by dividing the document image.
  5.  The mobile terminal according to claim 1 or 2, further comprising dictionary data storage means for storing dictionary data,
     wherein the partial area acquisition means acquires the partial area image data of the partial area representing a character in the document image by a labeling process on the document image data of the document image, and
     the orientation specifying means specifies the orientation of the character in the target area based on a comparison between the target area data of the target area and the dictionary data, and specifies the orientation of the document image based on the orientation of the character.
  6.  The mobile terminal according to any one of claims 1 to 5, wherein the target area setting means sets, at the point when the blur detection means detects blur at or below a predetermined reference value, the partial area in which the blur at or below the predetermined reference value was detected as the target area in which the orientation of the document image is to be identified.
  7.  The mobile terminal according to any one of claims 1 to 5, wherein the target area setting means compares the blur detected by the blur detection means and sets the least blurred partial area as the target area in which the orientation of the document image is to be identified.
  8.  The mobile terminal according to any one of claims 1 to 5, wherein the image acquisition means acquires captured image data from renewed shooting by the shooting unit when the blur detection means does not detect blur at or below a predetermined reference value.
  9.  An image processing method comprising:
     an image acquisition step of acquiring captured image data of a captured image captured by a shooting unit;
     a document specifying step of specifying a document image included in the captured image;
     a partial area acquisition step of acquiring partial area image data of a partial area in the document image;
     a blur detection step of detecting blur in the partial area;
     a target area setting step of setting, based on the blur, the partial area as a target area in which the orientation of the document image is to be identified;
     an orientation specifying step of specifying the orientation of content in the target area and specifying the orientation of the document image based on the orientation of the content; and
     an orientation correction step of acquiring corrected image data of the document image upright-corrected based on the orientation of the document image.
  10.  The image processing method according to claim 9, further comprising a form determination step of determining, based on stored feature data of a specific form, whether the document image corresponds to the specific form,
     wherein, in the orientation specifying step, when it is determined in the form determination step that the document image corresponds to the specific form, the orientation of the document image is further specified based on stored layout data of the specific form.
  11.  The image processing method according to claim 9 or 10, wherein, in the orientation specifying step, a character area representing a character in the target area is specified by a labeling process on the target area image data of the target area, the orientation of the character in the character area is specified based on a comparison between the character area data of the character area and stored dictionary data, and the orientation of the document image is specified based on the orientation of the character.
  12.  The image processing method according to any one of claims 9 to 11, wherein, in the partial area acquisition step, the partial area image data of the partial area obtained by dividing the document image is acquired.
  13.  The image processing method according to claim 9 or 10, wherein, in the partial area acquisition step, the partial area image data of the partial area representing a character in the document image is acquired by a labeling process on the document image data of the document image, and
     in the orientation specifying step, the orientation of the character in the target area is specified based on a comparison between the target area data of the target area and stored dictionary data, and the orientation of the document image is specified based on the orientation of the character.
  14.  The image processing method according to any one of claims 9 to 13, wherein, in the target area setting step, at the point when blur at or below a predetermined reference value is detected in the blur detection step, the partial area in which the blur at or below the predetermined reference value was detected is set as the target area in which the orientation of the document image is to be identified.
  15.  The image processing method according to any one of claims 9 to 13, wherein, in the target area setting step, the blur detected in the blur detection step is compared and the least blurred partial area is set as the target area in which the orientation of the document image is to be identified.
  16.  The image processing method according to any one of claims 9 to 13, wherein, in the image acquisition step, captured image data from renewed shooting by the shooting unit is acquired when blur at or below a predetermined reference value is not detected in the blur detection step.
  17.  A program for causing a computer to execute:
     an image acquisition step of acquiring captured image data of a captured image captured by a shooting unit;
     a document specifying step of specifying a document image included in the captured image;
     a partial area acquisition step of acquiring partial area image data of a partial area in the document image;
     a blur detection step of detecting blur in the partial area;
     a target area setting step of setting, based on the blur, the partial area as a target area in which the orientation of the document image is to be identified;
     an orientation specifying step of specifying the orientation of content in the target area and specifying the orientation of the document image based on the orientation of the content; and
     an orientation correction step of acquiring corrected image data of the document image upright-corrected based on the orientation of the document image.
PCT/JP2016/074720 2016-08-24 2016-08-24 Mobile terminal, image processing method, and program WO2018037519A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2016/074720 WO2018037519A1 (en) 2016-08-24 2016-08-24 Mobile terminal, image processing method, and program
JP2018535993A JP6613378B2 (en) 2016-08-24 2016-08-24 Mobile terminal, image processing method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/074720 WO2018037519A1 (en) 2016-08-24 2016-08-24 Mobile terminal, image processing method, and program

Publications (1)

Publication Number Publication Date
WO2018037519A1 true WO2018037519A1 (en) 2018-03-01

Family

ID=61245655

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/074720 WO2018037519A1 (en) 2016-08-24 2016-08-24 Mobile terminal, image processing method, and program

Country Status (2)

Country Link
JP (1) JP6613378B2 (en)
WO (1) WO2018037519A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11272792A (en) * 1998-03-24 1999-10-08 Fuji Xerox Co Ltd Method and device for discriminating form direction
JP2007280346A (en) * 2006-03-15 2007-10-25 Ricoh Co Ltd Image processor, image direction determining method, and image direction determining program
JP2013250975A (en) * 2012-05-31 2013-12-12 Fujitsu Ltd Document processor, document processing method and scanner


Also Published As

Publication number Publication date
JP6613378B2 (en) 2019-11-27
JPWO2018037519A1 (en) 2018-12-06

Similar Documents

Publication Publication Date Title
JP7059054B2 (en) Image processing equipment, image processing methods and programs
US10810743B2 (en) Image processing device, image processing method, and computer program product
US20180285677A1 (en) Information processing apparatus, control method thereof, and storage medium
US10885375B2 (en) Mobile terminal, image processing method, and computer-readable recording medium
JP7187265B2 (en) Image processing device and its control method and program
WO2018167971A1 (en) Image processing device, control method, and control program
JP6613378B2 (en) Mobile terminal, image processing method, and program
US10514591B2 (en) Camera apparatus, image processing device, and image processing method
JP6600090B2 (en) Image processing apparatus, image processing method, and program
JP6777507B2 (en) Image processing device and image processing method
US10116809B2 (en) Image processing apparatus, control method, and computer-readable storage medium, which obtains calibration image information with which to correct image data
WO2017126056A1 (en) Mobile terminal, image processing method, and program
JP6697829B2 (en) Mobile terminal, image processing method, and program
JP6785930B2 (en) Mobile devices, image processing methods, and programs
JP6596512B2 (en) Mobile terminal, image processing method, and program
WO2018003090A1 (en) Image processing device, image processing method, and program
JP2020149184A (en) Information processor and control method thereof and program
JP4315025B2 (en) Imaging apparatus, image acquisition method, and program
WO2017158814A1 (en) Mobile terminal, image processing method, and program
JP2014143630A (en) Image processing system
JP2005173946A (en) Portable information terminal and character recognition method therefor

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2018535993

Country of ref document: JP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16914193

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16914193

Country of ref document: EP

Kind code of ref document: A1