WO2018003090A1 - Image processing device, image processing method, and program - Google Patents

Image processing device, image processing method, and program

Info

Publication number
WO2018003090A1
Authority
WO
WIPO (PCT)
Prior art keywords
shine
area
image data
necessary information
information area
Prior art date
Application number
PCT/JP2016/069528
Other languages
French (fr)
Japanese (ja)
Inventor
満 西川
清人 小坂
Original Assignee
株式会社Pfu
Priority date
Filing date
Publication date
Application filed by 株式会社Pfu
Priority to PCT/JP2016/069528
Publication of WO2018003090A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/67Focus control based on electronic image sensor signals

Definitions

  • the present invention relates to an image processing apparatus, an image processing method, and a program.
  • Patent Document 1 discloses a technique for correcting luminance unevenness in an image, using the in-window average value of color information such as luminance as the background (see Patent Document 1).
  • Also disclosed is a technique in which a whiteout area in a photographed image is detected and treated as a correction target area, and an area in another image corresponding to the correction target area is extracted and combined to generate an image without whiteout (see Patent Document 2).
  • However, in conventional image processing apparatuses (Patent Document 1, etc.), even when a plurality of images are combined, it is not possible to guarantee that an area containing no characters is selected as the joint for combining images, that is, the boundary between images. A joint that crosses a character portion may therefore be selected, and an image whose quality is unsuitable for OCR (Optical Character Recognition) may be acquired due to blurring of the characters.
  • The present invention has been made in view of the above problems, and has as its object to provide an image processing apparatus, an image processing method, and a program that can acquire image data suitable for OCR and the like even in situations where the subject shines under illumination such as fluorescent lamps.
  • To achieve this object, an image processing apparatus according to the present invention includes: frame acquisition means for acquiring a captured frame; subject area acquisition means for acquiring subject area image data of a subject area from the frame; necessary information area acquisition means for detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; shine detection means for detecting shine in the necessary information area image data; and non-shine area acquisition means for acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected by the shine detection means.
  • The image processing method according to the present invention includes: a frame acquisition step of acquiring a captured frame; a subject area acquisition step of acquiring subject area image data of a subject area from the frame; a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; a shine detection step of detecting shine in the necessary information area image data; and a non-shine area acquisition step of acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected in the shine detection step.
  • The program according to the present invention causes a computer to execute: a frame acquisition step of acquiring a captured frame; a subject area acquisition step of acquiring subject area image data of a subject area from the frame; a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; a shine detection step of detecting shine in the necessary information area image data; and a non-shine area acquisition step of acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected in the shine detection step.
  • According to the present invention, a boundary that does not cross a necessary area such as a character portion can be selected in advance as the joint for image composition, so there is no factor that degrades the image quality of character portions through composition, and high OCR accuracy can be realized.
  • OCR can thus be carried out even in environments where it could not be used conventionally, expanding the range of applications.
  • FIG. 1 is a block diagram illustrating an example of the configuration of the image processing apparatus according to the present embodiment.
  • FIG. 2 is a flowchart illustrating an example of processing in the image processing apparatus according to the present embodiment.
  • FIG. 3 is a diagram illustrating an example of a necessary information area in the present embodiment.
  • FIG. 4 is a diagram showing an example of necessary information area division in the present embodiment.
  • FIG. 5 is a diagram showing an example of necessary information area division in the present embodiment.
  • FIG. 6 is a diagram illustrating an example of necessary information area division in the present embodiment.
  • FIG. 7 is a diagram showing an example of necessary information area division in the present embodiment.
  • FIG. 8 is a diagram showing an example of necessary information area division in the present embodiment.
  • FIG. 9 is a diagram illustrating an example of image composition in the present embodiment.
  • FIG. 10 is a diagram illustrating an example of an OCR processing result in the present embodiment.
  • FIG. 1 is a block diagram illustrating an example of the configuration of the image processing apparatus 100 according to the present embodiment.
  • The embodiment described below exemplifies the image processing apparatus 100 for embodying the technical idea of the present invention, and is not intended to limit the present invention to this image processing apparatus 100.
  • The present invention is equally applicable to image processing apparatuses 100 of other embodiments included in the scope of the claims.
  • The form of function distribution in the image processing apparatus 100 exemplified in this embodiment is not limited to the following; the functions may be distributed or integrated functionally or physically in arbitrary units within a range in which similar effects and functions can be achieved.
  • The image processing apparatus 100 may be, for example, a portable information processing apparatus (mobile terminal) such as a tablet terminal, a mobile phone, a smartphone, a PHS, a PDA, a notebook personal computer, or a wearable computer of a glasses or watch type.
  • The image processing apparatus 100 generally comprises a control unit 102, a storage unit 106, a photographing unit 110, an input / output unit 112, a sensor unit 114, and a communication unit 116.
  • The image processing apparatus 100 is illustrated as a mobile terminal including the photographing unit 110 in its housing; however, it may instead be configured without a built-in photographing unit and receive captured image data from an external photographing device (for example, a desktop personal computer).
  • an input / output interface unit (not shown) for connecting the input / output unit 112 and the control unit 102 may be further provided.
  • Each unit of the image processing apparatus 100 is connected to be communicable via an arbitrary communication path.
  • The communication unit 116 may be a network interface (such as an NIC (Network Interface Controller)) for transmitting and receiving IP data by wired and/or wireless communication (WiFi, etc.), or an interface for performing wireless communication by Bluetooth (registered trademark), infrared communication, or the like.
  • the image processing apparatus 100 may be communicably connected to an external apparatus via a network using the communication unit 116.
  • The sensor unit 114 detects a physical quantity and converts it into a signal (digital signal) in another medium.
  • The sensor unit 114 may include a proximity sensor, a direction sensor, a magnetic field sensor, a linear acceleration sensor, a luminance sensor, a gyro sensor, a pressure sensor, a gravity sensor, an acceleration sensor, an atmospheric pressure sensor, and/or a temperature sensor.
  • the input / output unit 112 performs data input / output (I / O).
  • the input / output unit 112 may be, for example, a key input unit, a touch panel, a control pad (for example, a touch pad and a game pad), a mouse, a keyboard, and / or a microphone.
  • The input / output unit 112 may be a display unit that displays application screens and the like (for example, a display, monitor, or touch panel composed of liquid crystal or organic EL elements).
  • the input / output unit 112 may be an audio output unit (for example, a speaker or the like) that outputs audio information as audio.
  • the input / output unit (touch panel) 112 may include a sensor unit 114 that detects physical contact and converts it into a signal (digital signal).
  • the image capturing unit 110 acquires still image data by capturing a still image of a subject (for example, a form or the like).
  • the imaging unit 110 may acquire captured image data.
  • the photographing unit 110 may acquire continuous (moving image) image data (frames) by continuously capturing images (moving image capturing) of the subject.
  • the imaging unit 110 may acquire video data.
  • the imaging unit 110 may acquire ancillary data.
  • the frame may be non-compressed image data.
  • the frame may be high-resolution image data.
  • the high resolution may be full high vision, 4K resolution, super high vision (8K resolution), or the like.
  • the photographing unit 110 may shoot moving images at 24 fps or 30 fps.
  • the image capturing unit 110 may be a camera including an image sensor such as a CCD (Charge Coupled Device) and / or a CMOS (Complementary Metal Oxide Semiconductor).
  • The storage unit 106 is storage means; for example, a memory such as RAM/ROM, a fixed disk device such as a hard disk, an SSD (Solid State Drive), and/or a tangible storage device such as an optical disk, or a storage circuit, can be used.
  • the storage unit 106 stores various databases, tables, buffers, and / or files (necessary information area file 106a, image data file 106b, etc.).
  • The storage unit 106 may store computer programs and the like for giving instructions to a CPU (Central Processing Unit) to perform various processes.
  • the necessary information area file 106a stores boundary data of the necessary information area in the subject area.
  • The necessary information area may be an image corresponding to an entire whiteboard.
  • the necessary information area may be an area where necessary information can be visually recognized in the subject area.
  • the area where necessary information can be visually recognized may be an area including letters, numbers, symbols, figures, photographs, and / or seals.
  • the subject area may be a document image of a document (form) included in a read image based on a frame.
  • the form may be a prescribed form such as various licenses including a driver's license, various identification cards, or a health insurance card.
  • In this way, the necessary information area file 106a stores in advance, for a known document serving as the subject, boundary data of the boundaries that serve as the joints of the necessary information areas (a sketch of what such data could look like follows below).
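For concreteness, here is a minimal sketch of the kind of pre-stored boundary data the necessary information area file (106a) could hold for one known document type. Every field name, coordinate, and the normalization scheme below are illustrative assumptions, not values taken from the patent.

```python
from typing import Dict, List, Tuple

# (left, top, right, bottom) in coordinates normalized to the subject area,
# so the same layout applies regardless of the captured resolution.
Rect = Tuple[float, float, float, float]

# Hypothetical boundary data for one known document type.
NECESSARY_INFO_AREAS: Dict[str, List[Tuple[str, Rect]]] = {
    "drivers_license": [
        ("name",           (0.05, 0.05, 0.70, 0.15)),
        ("date_of_birth",  (0.70, 0.05, 0.95, 0.15)),
        ("address",        (0.05, 0.18, 0.95, 0.28)),
        ("issue_date",     (0.05, 0.31, 0.55, 0.40)),
        ("expiration",     (0.05, 0.43, 0.55, 0.52)),
        ("license_number", (0.05, 0.55, 0.55, 0.64)),
        ("face_photo",     (0.70, 0.30, 0.95, 0.75)),
    ],
}

def to_pixels(rect: Rect, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a normalized rectangle into pixel coordinates of a subject image."""
    left, top, right, bottom = rect
    return (int(left * width), int(top * height),
            int(right * width), int(bottom * height))
```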
  • the image data file 106b stores image data (such as a frame).
  • The image data file 106b may store subject area image data, necessary information area image data, non-shine area image data, composite image data, divided area image data, non-shine divided area image data, captured image data, and/or document image data.
  • The control unit 102, which controls the image processing apparatus 100 in an integrated manner, may be composed of a tangible controller or control circuit including a CPU, a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), and/or an FPGA (Field-Programmable Gate Array).
  • The control unit 102 has an internal memory for storing a control program, programs defining various processing procedures, and required data, and performs information processing for executing various processes based on these programs.
  • Functionally and conceptually, the control unit 102 includes a frame acquisition unit 102a, a subject area acquisition unit 102b, a necessary information area acquisition unit 102c, a shine detection unit 102d, a non-shine area acquisition unit 102e, a divided area acquisition unit 102f, an image composition unit 102g, a division synthesis unit 102h, and an OCR unit 102i.
  • the frame acquisition unit 102a acquires a frame (captured image data of a captured image).
  • the frame acquisition unit 102a may acquire a frame imaged by the imaging unit 110 or an external imaging device.
  • the frame acquisition unit 102a may acquire captured image data of a captured image including a subject area.
  • the frame acquisition unit 102a may acquire still image data by controlling still image shooting by the shooting unit 110.
  • the frame acquisition unit 102a may acquire a frame corresponding to one frame by controlling continuous image shooting or moving image shooting by the shooting unit 110.
  • the subject area acquisition unit 102b acquires subject area image data of the subject area from the frame.
  • the subject area acquisition unit 102b may acquire document image data of a document image from a frame.
  • the necessary information area acquisition unit 102c detects a necessary information area in the subject area and acquires necessary information area image data of the necessary information area.
  • the necessary information area acquisition unit 102c may detect the necessary information area based on the boundary data of the necessary information area stored in the necessary information area file 106a and acquire the necessary information area image data.
  • the shine detection unit 102d detects shine on the necessary information area image data.
  • the shine detection unit 102d may detect shine on each divided region image data.
  • the shine detection unit 102d may detect the shine based on a comparison between the luminance of the necessary information area image data and a predetermined threshold value.
  • The non-shine area acquisition unit 102e acquires non-shine area image data, which is necessary information area image data in which no shine has been detected by the shine detection unit 102d.
  • The non-shine area acquisition unit 102e may also acquire non-shine divided area image data, which is divided area image data in which no shine has been detected by the shine detection unit 102d.
  • The divided area acquisition unit 102f detects non-character areas by character detection processing on necessary information area image data in which shine has been detected by the shine detection unit 102d, and acquires divided area image data of the divided areas obtained by dividing the necessary information area at the non-character areas.
  • The divided area acquisition unit 102f may detect the non-character areas by an edge detection method on the necessary information area image data in which shine has been detected by the shine detection unit 102d, and acquire the divided area image data of the divided areas.
  • the image composition unit 102g acquires composite image data obtained by combining a plurality of non-shine area image data acquired by the non-shine area acquisition unit 102e.
  • the image composition unit 102g may acquire composite image data obtained by combining the non-shiny region image data and the region external image data of the subject region excluding the necessary information region.
  • When the non-shine area image data of all necessary information areas included in the subject area has been acquired, the image composition unit 102g may acquire composite image data obtained by combining the plurality of non-shine area image data.
  • the division synthesis unit 102h acquires non-shine area image data by synthesizing the non-shine division area image data acquired by the non-shine area acquisition unit 102e.
  • the OCR unit 102i performs OCR processing on the image data and acquires character data.
  • the OCR unit 102i may perform the OCR process on the non-shine area image data to acquire character data.
  • The OCR unit 102i may also acquire character data by performing OCR processing on the composite image data.
  • FIG. 2 is a flowchart illustrating an example of processing in the image processing apparatus 100 according to the present embodiment.
  • The frame acquisition unit 102a initializes the settings of the photographing unit 110 according to the subject, controls the start of moving image shooting by the photographing unit 110, and acquires a frame (step SA-1).
  • That is, moving image shooting of the subject by the camera device is started.
  • the subject area acquisition unit 102b detects a subject area in the captured image based on the frame, acquires subject area image data of the subject area, and stores (records) it in the image data file 106b (step SA-2).
  • the subject area in the captured image is detected during moving image shooting.
  • the detection of the subject region in the captured image may be performed using processing such as edge detection and / or feature point detection.
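As a rough illustration of the edge-detection route mentioned above, the following sketch finds the largest quadrilateral contour in a frame and treats it as the subject area. The use of OpenCV (4.x API), all parameter values, and the 4-corner heuristic are assumptions; feature-point detection would be an alternative approach.

```python
from typing import Optional

import cv2
import numpy as np

def find_subject_quad(frame_bgr: np.ndarray) -> Optional[np.ndarray]:
    """Return the 4 corner points of the largest document-like quadrilateral
    in the frame, or None if no plausible subject is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    best, best_area = None, 0.0
    for contour in contours:
        perimeter = cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, 0.02 * perimeter, True)
        area = cv2.contourArea(approx)
        # Keep the largest 4-sided contour covering a meaningful share of the frame.
        if len(approx) == 4 and area > best_area and area > 0.2 * gray.size:
            best, best_area = approx.reshape(4, 2), area
    return best
```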
  • the necessary information area acquisition unit 102c detects the necessary information area in the subject area and acquires the necessary information area image data of the necessary information area (step SA-3).
  • FIG. 3 is a diagram illustrating an example of a necessary information area in the present embodiment.
  • In the driver's license shown in FIG. 3, area 1 (name), area 2 (date of birth), area 3 (address), area 4 (issue date), area 5 (expiration date), area 6 (license number), and area 7 (face photograph), each enclosed by dotted lines, are the necessary information areas, and the dotted lines may be the joints (boundaries).
  • A region of the subject image not covered by any necessary information area may be treated as the area exterior.
  • The area exterior may be taken from the subject image acquired last before image composition.
  • The shine detection unit 102d detects shine in each necessary information area image data based on a comparison between the luminance of the necessary information area image data and a predetermined threshold (step SA-4).
  • That is, by detecting shine in the subject area, the necessary information areas without shine in the subject are identified.
  • In the detection of shine in the subject area, it may be determined that shine has been detected when, as a threshold judgment on the estimated background value, the in-window average value exceeds 200 (see the sketch below).
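A minimal sketch of this threshold test follows, assuming OpenCV: the background estimate is the in-window average of the luminance, and the region counts as shining if that average exceeds 200 anywhere. The window size is an assumption.

```python
import cv2
import numpy as np

def has_shine(region_bgr: np.ndarray, threshold: float = 200.0,
              window: int = 15) -> bool:
    """True if the windowed average of luminance exceeds the threshold anywhere,
    i.e. the estimated background is bright enough to count as shine."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    # Normalized box filter = the average value inside a (window x window) area.
    background = cv2.boxFilter(gray, ddepth=-1, ksize=(window, window))
    return bool((background > threshold).any())
```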
  • The non-shine area acquisition unit 102e acquires non-shine area image data, which is necessary information area image data in which no shine has been detected by the shine detection unit 102d, and records it in the image data file 106b (step SA-5).
  • image data of a necessary information area without shine is recorded.
  • FIG. 4 and FIG. 5 are diagrams showing an example of necessary information area division in the present embodiment.
  • When shine is detected in only a part of a necessary information area, the necessary information area may be divided in order to acquire the non-shine area image data.
  • For example, the necessary information area A, which is the date field of the driver's license shown in FIG. 4, is divided into a divided area B and a divided area C as shown in FIG. 5.
  • The boundary between the divided area B and the divided area C becomes a new joint; character detection is performed so that the joint does not include characters, and an area determined not to contain characters (a non-character area) may be used as the joint for division.
  • An edge detection method or the like may be used as the character detection method.
  • Non-shine divided area image data may then be acquired for each of the divided area B and the divided area C, and the non-shine area image data of the necessary information area A may be obtained by recombining them, as sketched below.
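The seam choice and recombination might look like this sketch: rows of the region where edge detection fires nowhere are treated as non-character rows, the blank row nearest the middle is used as the joint, and the shine-free halves are later stacked back together. OpenCV usage and the Canny thresholds are assumptions.

```python
from typing import Optional

import cv2
import numpy as np

def find_blank_row_seam(region_bgr: np.ndarray) -> Optional[int]:
    """Return the y index of an edge-free (non-character) row usable as a joint,
    or None if every row contains edges."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    edge_energy = edges.sum(axis=1)               # edge response per row
    blank_rows = np.flatnonzero(edge_energy == 0)
    if blank_rows.size == 0:
        return None
    # Prefer the blank row closest to the middle so both parts stay usable.
    middle = gray.shape[0] // 2
    return int(blank_rows[np.argmin(np.abs(blank_rows - middle))])

def recombine(top_part: np.ndarray, bottom_part: np.ndarray) -> np.ndarray:
    """Stack the shine-free halves back into one necessary information area."""
    return np.vstack([top_part, bottom_part])
```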
  • FIGS. 6 to 8 are diagrams showing an example of necessary information area division in the present embodiment.
  • An image captured using a whiteboard as the subject may be detected as the necessary information area D.
  • The same division method is applied to the necessary information area D; for example, the boundary between the divided area E and the divided area F may be set as a joint under the character string "> Agenda", where no edge is detected.
  • The glossy surface of a whiteboard readily reflects office lighting, so whiteboards suffer greatly from shine; by using this embodiment, non-shine area image data of a whiteboard can be acquired while avoiding the influence of shine.
  • Since a whiteboard on which the contents of proceedings are recorded can be photographed with a mobile camera and converted into text data by OCR, office work such as preparing meeting minutes becomes more efficient.
  • the non-shine area acquisition unit 102e determines whether non-shine area image data of all necessary information areas included in the subject area has been acquired (step SA-6).
  • That is, steps SA-2 to SA-5 are repeated, and it is determined whether image data of the necessary information areas without shine has been acquired for the entire subject.
  • If the non-shine area acquisition unit 102e determines that non-shine area image data has not been acquired for all necessary information areas included in the subject area (step SA-6: No), the process returns to step SA-2.
  • If it determines that non-shine area image data has been acquired for all necessary information areas included in the subject area (step SA-6: Yes), the process proceeds to step SA-7; a sketch of this loop follows.
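Putting steps SA-2 through SA-6 together, the acquisition loop can be sketched as below. The helper callables (area extraction and the shine test) stand in for the units described above and are assumptions.

```python
from typing import Callable, Dict, Iterable, List, Tuple

import numpy as np

def collect_non_shine_areas(
    frames: Iterable[np.ndarray],
    extract_areas: Callable[[np.ndarray], List[Tuple[str, np.ndarray]]],
    has_shine: Callable[[np.ndarray], bool],
    expected: List[str],
) -> Dict[str, np.ndarray]:
    """Consume frames until a shine-free copy of every expected area is held."""
    collected: Dict[str, np.ndarray] = {}
    for frame in frames:                                  # SA-1: next frame
        for name, image in extract_areas(frame):          # SA-2 / SA-3
            if name not in collected and not has_shine(image):   # SA-4
                collected[name] = image                   # SA-5: record it
        if all(name in collected for name in expected):   # SA-6: all done?
            break
    return collected
```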
  • The image composition unit 102g acquires composite image data obtained by combining the non-shine area image data of all necessary information areas included in the subject area with the area-exterior image data of the subject area excluding the necessary information areas (step SA-7).
  • the images are combined to create a subject image without shine.
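Step SA-7 can be sketched as pasting each collected non-shine area into a copy of the most recent subject image, which supplies the area-exterior pixels. The rectangle bookkeeping follows the hypothetical normalized-layout sketch earlier and is an assumption.

```python
from typing import Dict, Tuple

import cv2
import numpy as np

def compose(subject: np.ndarray,
            areas: Dict[str, np.ndarray],
            layout: Dict[str, Tuple[int, int, int, int]]) -> np.ndarray:
    """Paste each shine-free area into a copy of the subject image, whose
    remaining pixels supply the area-exterior image data."""
    composite = subject.copy()
    for name, image in areas.items():
        left, top, right, bottom = layout[name]
        # Resize defensively in case crop geometry differed between frames.
        patch = cv2.resize(image, (right - left, bottom - top))
        composite[top:bottom, left:right] = patch
    return composite
```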
  • FIG. 9 is a diagram illustrating an example of image composition in the present embodiment.
  • Here, necessary information area image data in which shine is detected in neither of the two image data may be recorded as the non-shine area image data.
  • The OCR unit 102i performs OCR processing on the composite image data to acquire character data, records the character data in the image data file 106b in association with the subject area image data (step SA-8), and ends the processing.
  • In this way, a high OCR success rate can be achieved by performing OCR processing on the acquired composite image data.
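As a final illustration, step SA-8 could invoke an off-the-shelf OCR engine on the composite. The patent does not name an engine, so Tesseract via the pytesseract wrapper is an assumption here.

```python
import cv2
import pytesseract

def ocr_composite(composite_bgr) -> str:
    """Run OCR on the shine-free composite image and return the recognized text."""
    gray = cv2.cvtColor(composite_bgr, cv2.COLOR_BGR2GRAY)
    # A Japanese form would need lang="jpn"; the default English model is shown.
    return pytesseract.image_to_string(gray)
```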
  • FIG. 10 is a diagram illustrating an example of an OCR processing result in the present embodiment.
  • As shown in FIG. 10, by performing the OCR process on the composite image data synthesized in FIG. 9 rather than on the shining image data itself, high OCR accuracy is realized.
  • For example, by photographing a driver's license with a mobile camera and performing OCR processing, the personal information described on the driver's license (name, date of birth, address, issue date, license number, etc.) can be acquired.
  • processing such as personal authentication by comparing character data acquired by OCR processing with a personal information database can be easily performed.
  • In this embodiment, for a document type known in advance (such as various licenses or invoices), the boundaries of the necessary information areas (for example, text fields or a face photograph) are used as the joints for composition.
  • If a joint crosses a text portion, noise components may be mixed into the text and degrade OCR accuracy; by not using text portions as composition joints, high OCR accuracy is achieved.
  • Since OCR processing can be performed on shine-free composite image data by using shine detection, OCR processing that avoids the adverse effects of shine is realized, and high character recognition accuracy can be achieved.
  • image data without shine can be quickly synthesized by combining shine detection and moving image processing technology.
  • Since shine detection is executed for each necessary information area in the subject area of the photographed image, it is not necessary to repeat photographing until the entire subject area is free of shine at once.
  • The number of retries during moving image shooting can therefore be reduced, and quick OCR processing can be performed.
  • Since document media such as various licenses or health insurance cards can easily be photographed and subjected to OCR, they can be used as a means of personal authentication, which simplifies contract procedures at commercial counters or stores.
  • That is, in this embodiment, a technique for dealing with shine is disclosed that uses a characteristic of mobile cameras: the ability to shoot a moving image while moving the shooting position.
  • Partial images without shine that contain the necessary content are extracted and combined, so that image data of a subject image without shine is generated.
  • OCR processing is performed on the composite image data to achieve high accuracy of OCR.
  • Conventionally, retry processing was performed in practice by detecting shine before OCR processing: OCR execution was rejected for an image with shine, and re-photographing was requested until an image without shine was obtained.
  • In this embodiment, when performing OCR quickly, even if shine occurs in a part of the image, OCR is performed on the portions that are not shining, so the number of photographing retries can be reduced as much as possible.
  • The image processing apparatus 100 may perform processing in a stand-alone form, or may perform processing in response to a request from a client terminal (a housing separate from the image processing apparatus 100) and return the processing result to that client terminal.
  • All or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by known methods.
  • The processing procedures, control procedures, specific names, information including parameters such as registration data and search conditions for each process, screen examples, and database configurations shown in the description and drawings can be changed arbitrarily unless otherwise noted.
  • the illustrated components are functionally conceptual, and need not be physically configured as illustrated.
  • The processing functions of each unit of the image processing apparatus 100, in particular those performed by the control unit 102, may be realized in whole or in part by a CPU and a program interpreted and executed by the CPU.
  • The program, which includes programmed instructions for causing a computer to execute the method according to the present invention described later, is recorded on a non-transitory computer-readable recording medium and is mechanically read by the image processing apparatus 100 as necessary. That is, the storage unit 106, such as a ROM or an HDD, records computer programs that give instructions to the CPU in cooperation with the OS (Operating System) to perform various processes; these computer programs are executed by being loaded into RAM and, together with the CPU, constitute the control unit.
  • The computer programs may also be stored in an application program server connected to the image processing apparatus 100 via an arbitrary network, and all or part of them may be downloaded as necessary.
  • the program according to the present invention may be stored in a computer-readable recording medium, or may be configured as a program product.
  • The “recording medium” includes any “portable physical medium” such as a memory card, USB memory, SD card, flexible disk, magneto-optical disk, ROM, EPROM, EEPROM, CD-ROM, MO, DVD, or Blu-ray (registered trademark) Disc.
  • A “program” is a data processing method described in an arbitrary language or description method, and may take any form such as source code or binary code. A “program” is not necessarily limited to a single configuration; it includes programs that achieve their functions in cooperation with separate configurations such as multiple modules or libraries, or with a separate program typified by the OS. Well-known configurations and procedures can be used for the specific configuration and procedure for reading the recording medium in each apparatus shown in the embodiment, and for the installation procedure after reading.
  • The various databases and the like stored in the storage unit 106 are storage means such as memory devices (RAM, ROM), fixed disk devices (hard disks), flexible disks, and/or optical disks, and may store the various programs, tables, databases, and/or web page files used for various processes.
  • the image processing apparatus 100 may be configured as an information processing apparatus such as a known personal computer, or may be configured by connecting an arbitrary peripheral device to the information processing apparatus.
  • the image processing apparatus 100 may be realized by installing software (including programs, data, and the like) that causes the information processing apparatus to realize the method of the present invention.
  • The specific form of distribution and integration of the devices is not limited to the illustrated one; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, or the like. The embodiments described above may also be combined arbitrarily or implemented selectively.
  • As described above, the image processing apparatus, the image processing method, and the program can be used in many industrial fields, particularly in the image processing field that handles images read by a camera, and are extremely useful.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

In the present invention a photographed frame is acquired, photographic subject region image data for a photographic subject region is acquired from the frame, a required information region in the photographic subject region is detected, required information region image data for the required information region is acquired, glare with respect to the required information region image data is detected, and non-glare region image data, which is required information region image data for which glare has not been detected, is acquired.

Description

Image processing apparatus, image processing method, and program
 The present invention relates to an image processing apparatus, an image processing method, and a program.
 Conventionally, techniques for correcting the shine of image data have been disclosed.
 Here, a technique for correcting luminance unevenness in an image, using the in-window average value of color information such as luminance as the background, is disclosed (see Patent Document 1).
 Also disclosed is a technique in which a whiteout area in a photographed image is detected and treated as a correction target area, and an area in another image corresponding to the correction target area is extracted and combined to generate an image without whiteout (see Patent Document 2).
JP-T-2015-503813; JP 2013-229698 A
 However, in conventional image processing apparatuses (Patent Document 1, etc.), even when a plurality of images are combined, it is not possible to guarantee that an area containing no characters is selected as the joint for combining images, that is, the boundary between images; a joint that crosses a character portion may therefore be selected, and an image whose quality is unsuitable for OCR (Optical Character Recognition) may be acquired due to blurring of the characters.
 The present invention has been made in view of the above problems, and has as its object to provide an image processing apparatus, an image processing method, and a program that can acquire image data suitable for OCR and the like even in situations where the subject shines under illumination such as fluorescent lamps.
 In order to achieve this object, an image processing apparatus according to the present invention includes: frame acquisition means for acquiring a captured frame; subject area acquisition means for acquiring subject area image data of a subject area from the frame; necessary information area acquisition means for detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; shine detection means for detecting shine in the necessary information area image data; and non-shine area acquisition means for acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected by the shine detection means.
 The image processing method according to the present invention includes: a frame acquisition step of acquiring a captured frame; a subject area acquisition step of acquiring subject area image data of a subject area from the frame; a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; a shine detection step of detecting shine in the necessary information area image data; and a non-shine area acquisition step of acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected in the shine detection step.
 The program according to the present invention causes a computer to execute: a frame acquisition step of acquiring a captured frame; a subject area acquisition step of acquiring subject area image data of a subject area from the frame; a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area; a shine detection step of detecting shine in the necessary information area image data; and a non-shine area acquisition step of acquiring non-shine area image data, which is necessary information area image data in which no shine has been detected in the shine detection step.
 According to the present invention, a boundary that does not cross a necessary area such as a character portion can be selected in advance as the joint for image composition, so there is no factor that degrades the image quality of character portions through composition, and high OCR accuracy can be realized.
 Further, according to the present invention, image data suitable for performing OCR can be acquired even in situations where the subject shines under illumination such as fluorescent lamps in an office.
 For this reason, according to the present invention, OCR can be carried out even in environments where it could not be used conventionally, expanding the range of applications.
 In addition, according to the present invention, in situations where the subject shines, it is no longer necessary to repeat moving image shooting until image data without shine can be acquired, so the time until processing is completed can be shortened.
 Further, according to the present invention, the information described on the subject that the user requires can be acquired without necessarily performing composition processing on the acquired partial images.
FIG. 1 is a block diagram illustrating an example of the configuration of the image processing apparatus according to the present embodiment. FIG. 2 is a flowchart illustrating an example of processing in the image processing apparatus according to the present embodiment. FIG. 3 is a diagram illustrating an example of a necessary information area in the present embodiment. FIGS. 4 to 8 are diagrams illustrating examples of necessary information area division in the present embodiment. FIG. 9 is a diagram illustrating an example of image composition in the present embodiment. FIG. 10 is a diagram illustrating an example of an OCR processing result in the present embodiment.
 Hereinafter, embodiments of an image processing apparatus, an image processing method, and a program according to the present invention will be described in detail with reference to the drawings. The present invention is not limited by these embodiments.
[Configuration of this embodiment]
 Hereinafter, an example of the configuration of the image processing apparatus 100 according to the embodiment of the present invention will be described with reference to FIG. 1, and then the processing and the like of this embodiment will be described in detail. FIG. 1 is a block diagram illustrating an example of the configuration of the image processing apparatus 100 according to the present embodiment.
 However, the embodiment described below exemplifies the image processing apparatus 100 for embodying the technical idea of the present invention; it is not intended to limit the present invention to this image processing apparatus 100, and the present invention is equally applicable to image processing apparatuses 100 of other embodiments included in the scope of the claims.
 The form of function distribution in the image processing apparatus 100 exemplified in this embodiment is not limited to the following; the functions may be distributed or integrated functionally or physically in arbitrary units within a range in which similar effects and functions can be achieved.
 Here, the image processing apparatus 100 may be a portable information processing apparatus (mobile terminal) such as a tablet terminal, a mobile phone, a smartphone, a PHS, a PDA, a notebook personal computer, or a wearable computer of a glasses or watch type.
 First, as shown in FIG. 1, the image processing apparatus 100 generally comprises a control unit 102, a storage unit 106, a photographing unit 110, an input/output unit 112, a sensor unit 114, and a communication unit 116.
 In FIG. 1, the image processing apparatus 100 is illustrated as a mobile terminal including the photographing unit 110 in its housing; however, it may instead be configured without a built-in photographing unit and receive captured image data from an external photographing device (for example, a desktop personal computer).
 Although omitted in FIG. 1, this embodiment may further include an input/output interface unit (not shown) that connects the input/output unit 112 and the control unit 102. The units of the image processing apparatus 100 are communicably connected via arbitrary communication paths.
 Here, the communication unit 116 may be a network interface (such as an NIC (Network Interface Controller)) for transmitting and receiving IP data by wired and/or wireless communication (WiFi, etc.), or an interface for performing wireless communication by Bluetooth (registered trademark), infrared communication, or the like.
 Here, the image processing apparatus 100 may be communicably connected to an external apparatus via a network using the communication unit 116.
 The sensor unit 114 detects a physical quantity and converts it into a signal (digital signal) in another medium. The sensor unit 114 may include a proximity sensor, a direction sensor, a magnetic field sensor, a linear acceleration sensor, a luminance sensor, a gyro sensor, a pressure sensor, a gravity sensor, an acceleration sensor, an atmospheric pressure sensor, and/or a temperature sensor.
 The input/output unit 112 performs data input/output (I/O). The input/output unit 112 may be, for example, a key input unit, a touch panel, a control pad (for example, a touch pad or a game pad), a mouse, a keyboard, and/or a microphone.
 The input/output unit 112 may also be a display unit that displays application screens and the like (for example, a display, monitor, or touch panel composed of liquid crystal or organic EL elements).
 The input/output unit 112 may also be an audio output unit (for example, a speaker) that outputs audio information as sound. The input/output unit (touch panel) 112 may include a sensor unit 114 that detects physical contact and converts it into a signal (digital signal).
 The photographing unit 110 acquires still image data by capturing a still image of a subject (for example, a form). For example, the photographing unit 110 may acquire captured image data.
 The photographing unit 110 may also acquire continuous (moving image) image data (frames) by continuously capturing images of the subject (moving image shooting). For example, the photographing unit 110 may acquire video data, and may also acquire ancillary data.
 Here, a frame may be non-compressed image data, and may be high-resolution image data. High resolution here may be full high-definition, 4K resolution, super high-definition (8K resolution), or the like.
 The photographing unit 110 may shoot moving images at 24 fps, 30 fps, or the like. The photographing unit 110 may be a camera or the like equipped with an image sensor such as a CCD (Charge Coupled Device) and/or a CMOS (Complementary Metal Oxide Semiconductor).
 The storage unit 106 is storage means; for example, a memory such as RAM/ROM, a fixed disk device such as a hard disk, an SSD (Solid State Drive), and/or a tangible storage device such as an optical disk, or a storage circuit, can be used.
 The storage unit 106 stores various databases, tables, buffers, and/or files (a necessary information area file 106a, an image data file 106b, etc.). The storage unit 106 may store computer programs and the like for giving instructions to a CPU (Central Processing Unit) to perform various processes.
 Among the components of the storage unit 106, the necessary information area file 106a stores boundary data of the necessary information areas in the subject area. Here, a necessary information area may be an image corresponding to an entire whiteboard.
 A necessary information area may also be an area of the subject area in which necessary information is visible. Here, an area in which necessary information is visible may be an area containing letters, numbers, symbols, figures, photographs, and/or seal impressions.
 The subject area may be a document image of a document (form) included in a read image based on a frame. The form may be a prescribed form such as various licenses including a driver's license, various identification cards, or a health insurance card.
 In this way, the necessary information area file 106a stores in advance, for a known document serving as the subject, boundary data of the boundaries that serve as the joints of the necessary information areas.
 The image data file 106b stores image data (frames, etc.). The image data file 106b may store subject area image data, necessary information area image data, non-shine area image data, composite image data, divided area image data, non-shine divided area image data, captured image data, and/or document image data.
 The control unit 102, which controls the image processing apparatus 100 in an integrated manner, may be composed of a tangible controller or control circuit including a CPU, a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), and/or an FPGA (Field-Programmable Gate Array).
 The control unit 102 has an internal memory for storing a control program, programs defining various processing procedures, and required data, and performs information processing for executing various processes based on these programs.
 Functionally and conceptually, the control unit 102 includes a frame acquisition unit 102a, a subject area acquisition unit 102b, a necessary information area acquisition unit 102c, a shine detection unit 102d, a non-shine area acquisition unit 102e, a divided area acquisition unit 102f, an image composition unit 102g, a division synthesis unit 102h, and an OCR unit 102i.
 The frame acquisition unit 102a acquires a frame (captured image data of a captured image). The frame acquisition unit 102a may acquire a frame captured by the photographing unit 110 or by an external photographing device.
 The frame acquisition unit 102a may also acquire captured image data of a captured image including the subject area, and may acquire captured image data by controlling still image shooting by the photographing unit 110.
 The frame acquisition unit 102a may also acquire a frame corresponding to one shot by controlling continuous image shooting or moving image shooting by the photographing unit 110.
 The subject area acquisition unit 102b acquires subject area image data of the subject area from a frame. For example, the subject area acquisition unit 102b may acquire document image data of a document image from a frame.
 The necessary information area acquisition unit 102c detects the necessary information areas in the subject area and acquires necessary information area image data of the necessary information areas.
 Here, the necessary information area acquisition unit 102c may detect the necessary information areas based on the boundary data of the necessary information areas stored in the necessary information area file 106a, and acquire the necessary information area image data.
 The shine detection unit 102d detects shine in the necessary information area image data. The shine detection unit 102d may detect shine in each divided area image data.
 The shine detection unit 102d may detect shine based on a comparison between the luminance of the necessary information area image data and a predetermined threshold.
 The non-shine area acquisition unit 102e acquires non-shine area image data, which is necessary information area image data in which no shine has been detected by the shine detection unit 102d.
 The non-shine area acquisition unit 102e may also acquire non-shine divided area image data, which is divided area image data in which no shine has been detected by the shine detection unit 102d.
 The divided area acquisition unit 102f detects non-character areas by character detection processing on necessary information area image data in which shine has been detected by the shine detection unit 102d, and acquires divided area image data of the divided areas obtained by dividing the necessary information area at the non-character areas.
 Here, the divided area acquisition unit 102f may detect the non-character areas by an edge detection method on the necessary information area image data in which shine has been detected by the shine detection unit 102d, and acquire the divided area image data of the divided areas.
The image composition unit 102g acquires composite image data obtained by combining a plurality of pieces of non-shine area image data acquired by the non-shine area acquisition unit 102e.
Here, the image composition unit 102g may acquire composite image data obtained by combining the non-shine area image data with area-exterior image data of the subject area excluding the necessary information areas.
In addition, when non-shine area image data has been acquired for every necessary information area included in the subject area, the image composition unit 102g may acquire composite image data combining the plurality of pieces of non-shine area image data.
The division composition unit 102h acquires non-shine area image data by combining the non-shine divided area image data acquired by the non-shine area acquisition unit 102e.
The OCR unit 102i performs OCR processing on image data and acquires character data. Here, the OCR unit 102i may perform OCR processing on the non-shine area image data to acquire character data.
The OCR unit 102i may also perform OCR processing on the composite image data to acquire character data.
[Processing of the Present Embodiment]
An example of the processing executed by the image processing apparatus 100 (mobile terminal) configured as described above, namely the OCR processing of the present embodiment, will be described with reference to FIGS. 2 to 10. FIG. 2 is a flowchart illustrating an example of processing in the image processing apparatus 100 of the present embodiment.
As shown in FIG. 2, the frame acquisition unit 102a first initializes the settings of the photographing unit 110 according to the subject, controls the photographing unit 110 to start moving-image shooting, and acquires a frame (step SA-1).
That is, in the present embodiment, moving-image shooting of the subject by the camera device is started.
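As a rough illustration of this step, the following Python sketch (the language and the use of OpenCV are our assumptions; the embodiment does not prescribe any particular library) opens a camera device and yields frames one at a time, which is all that step SA-1 requires of the frame acquisition unit 102a.

```python
import cv2

def capture_frames(camera_index: int = 0):
    """Minimal sketch of step SA-1: open a camera and yield video frames."""
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("camera could not be opened")
    try:
        while True:
            ok, frame = cap.read()  # one frame of the moving image
            if not ok:
                break
            yield frame  # BGR image as a NumPy array
    finally:
        cap.release()
```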
The subject area acquisition unit 102b then detects the subject area in the captured image based on the frame, acquires subject area image data of the subject area, and stores (records) it in the image data file 106b (step SA-2).
That is, in the present embodiment, the subject area in the captured image is detected during moving-image shooting. The detection of the subject area in the captured image may be performed using processing such as edge detection and/or feature point detection.
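One way to realize the edge-detection variant of step SA-2 is sketched below, under the assumption that the subject is a roughly rectangular document such as a license; the Canny thresholds are illustrative values, not part of the embodiment.

```python
import cv2
import numpy as np

def detect_subject_region(frame: np.ndarray):
    """Sketch of step SA-2: find the largest quadrilateral contour."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # illustrative thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    best, best_area = None, 0.0
    for c in contours:
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        area = cv2.contourArea(approx)
        if len(approx) == 4 and area > best_area:
            best, best_area = approx, area
    return best  # four corner points of the subject area, or None
```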
The necessary information area acquisition unit 102c then detects the necessary information areas in the subject area and acquires necessary information area image data of each area (step SA-3).
Here, an example of the necessary information areas and joints of a known document type will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of necessary information areas in the present embodiment.
In the driver's license shown in FIG. 3, area 1 (name), area 2 (date of birth), area 3 (address), area 4 (date of issue), area 5 (expiration date), area 6 (license number), and area 7 (face photograph), each enclosed by a dotted line, are the necessary information areas, and the dotted lines may serve as the joints (boundaries).
Here, in the present embodiment, the part of the subject image not covered by any necessary information area may be treated as the area exterior. The area exterior may be identified from the subject image acquired last before image composition.
When the necessary information areas cover the entire subject image, no area exterior exists.
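As a concrete illustration of step SA-3, boundary data such as might be kept in the necessary information area file 106a can be modeled as named bounding boxes used to crop out the necessary information area image data; the field names and coordinates below are invented for the example.

```python
import numpy as np

# Hypothetical boundary data for a known document type; the field names
# and normalized (x, y, w, h) coordinates are invented for illustration.
LICENSE_REGIONS = {
    "name":           (0.10, 0.05, 0.80, 0.10),
    "date_of_birth":  (0.10, 0.17, 0.55, 0.10),
    "address":        (0.10, 0.29, 0.80, 0.10),
    "license_number": (0.10, 0.70, 0.60, 0.10),
}

def crop_regions(subject: np.ndarray, regions: dict) -> dict:
    """Sketch of step SA-3: cut each necessary information area out."""
    h, w = subject.shape[:2]
    crops = {}
    for name, (rx, ry, rw, rh) in regions.items():
        x0, y0 = int(rx * w), int(ry * h)
        x1, y1 = int((rx + rw) * w), int((ry + rh) * h)
        crops[name] = subject[y0:y1, x0:x1]
    return crops
```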
Returning to FIG. 2, the shine detection unit 102d detects shine for each piece of necessary information area image data based on a comparison between the luminance of the necessary information area image data and a predetermined threshold (step SA-4).
That is, in the present embodiment, shine detection is performed within the subject area in order to identify the necessary information areas in the subject that are free of shine.
Here, in the present embodiment, shine in the subject area may be detected by a luminance threshold test: for the RGB values of the image data, shine may be judged to have been detected when (0.299*R + 0.587*G + 0.114*B) > 250 holds.
Alternatively, shine may be detected by a threshold test on a background estimate: shine may be judged to have been detected when the average value within a window exceeds 200.
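A minimal NumPy sketch of both threshold tests follows; the luminance coefficients and the thresholds 250 and 200 come from the description above, while the window size is an assumption.

```python
import numpy as np

def has_shine(region_bgr: np.ndarray,
              pixel_thresh: float = 250.0,
              window: int = 15,
              window_thresh: float = 200.0) -> bool:
    """Sketch of step SA-4: luminance and window-average shine tests."""
    b, g, r = (region_bgr[..., i].astype(np.float64) for i in range(3))
    luma = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luminance

    # Test 1: any pixel whose luminance exceeds the threshold of 250.
    if np.any(luma > pixel_thresh):
        return True

    # Test 2: any window whose mean luminance (a crude background
    # estimate) exceeds 200. The window size of 15 is an assumption.
    h, w = luma.shape
    for y in range(0, h - window + 1, window):
        for x in range(0, w - window + 1, window):
            if luma[y:y + window, x:x + window].mean() > window_thresh:
                return True
    return False
```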
Then, the non-shine area acquisition unit 102e acquires non-shine area image data, that is, necessary information area image data in which the shine detection unit 102d detected no shine, and records it in the image data file 106b (step SA-5).
That is, in the present embodiment, the image data of shine-free necessary information areas is recorded.
Here, an example of acquiring non-shine area image data by dividing a necessary information area in the present embodiment will be described with reference to FIGS. 4 and 5. FIGS. 4 and 5 are diagrams illustrating an example of necessary information area division in the present embodiment.
In the present embodiment, when shine is detected and only part of the interior of the necessary information area is shining, the non-shine area image data may be acquired by dividing the necessary information area.
That is, as shown in FIG. 4, the necessary information area A, which is the date-of-issue field of the driver's license, may be divided into a divided area B and a divided area C as shown in FIG. 5.
Here, the boundary between divided area B and divided area C becomes a new joint. So that this joint does not cross any characters, character detection is performed, and an area judged not to be a character (a non-character area) may be set as the joint for the division.
An edge detection method or the like may be used as the character detection technique.
Then, non-shine divided area image data may be acquired for each of divided area B and divided area C and recombined to obtain the non-shine area image data of necessary information area A.
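A sketch of one way to choose such a joint, under the assumption that rows containing no edge pixels cross no characters, is shown below; the split is then a simple slice on either side of the chosen row.

```python
import cv2
import numpy as np

def find_text_free_joint(region_bgr: np.ndarray):
    """Sketch: return a row index that crosses no characters, or None."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # Rows with zero edge pixels are assumed to contain no characters.
    rows_without_edges = np.flatnonzero(edges.sum(axis=1) == 0)
    if rows_without_edges.size == 0:
        return None
    # Prefer a joint near the middle so both divided areas stay usable.
    mid = gray.shape[0] // 2
    return int(rows_without_edges[np.argmin(np.abs(rows_without_edges - mid))])

def split_at_joint(region_bgr: np.ndarray, joint: int):
    """Divide the necessary information area into two divided areas."""
    return region_bgr[:joint], region_bgr[joint:]
```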
An example of necessary information area division for a whiteboard in the present embodiment will now be described with reference to FIGS. 6 to 8. FIGS. 6 to 8 are diagrams illustrating an example of necessary information area division in the present embodiment.
As shown in FIG. 6, in the present embodiment, an image captured with a whiteboard as the subject may be detected as necessary information area D.
As shown in FIG. 7, when shine G occurs in part of the image corresponding to the whiteboard after moving-image shooting has started, the division technique for necessary information area D may be used to set the boundary between divided area E and divided area F as a joint below the character string "> Agenda", where no edges are detected.
Then, as shown in FIG. 8, changing the camera shooting position moves the spot where shine from the room lighting occurs, producing shine J.
For divided area F, which in FIG. 7 was entirely covered by shine G, the boundary between divided area H and divided area I may additionally be set as a joint below the character string "2. Step", where no edges are detected.
The gloss of a whiteboard reflects office lighting strongly and is thus heavily affected by shine; by applying the present embodiment, non-shine area image data can be acquired by whiteboard scanning while avoiding the influence of shine.
Furthermore, in the present embodiment, divided area image data free of shine can be collected, so image data of the entire whiteboard without shine can ultimately be composed.
Thus, in the present embodiment, when meeting notes are written on a whiteboard in an office meeting, the whiteboard can be photographed with a mobile camera and subjected to OCR to obtain text data, which promotes efficiency in office work such as producing meeting minutes.
Returning to FIG. 2, the non-shine area acquisition unit 102e determines whether non-shine area image data has been acquired for every necessary information area included in the subject area (step SA-6).
That is, in the present embodiment, the processing from step SA-2 to step SA-5 is repeated, and it is determined whether image data of shine-free necessary information areas has been acquired for the entire subject.
If the non-shine area acquisition unit 102e determines that non-shine area image data has not been acquired for every necessary information area included in the subject area (step SA-6: No), the processing returns to step SA-2.
If, on the other hand, the non-shine area acquisition unit 102e determines that non-shine area image data has been acquired for every necessary information area (step SA-6: Yes), the processing proceeds to step SA-7.
The image composition unit 102g then acquires composite image data obtained by combining the non-shine area image data of all necessary information areas included in the subject area with the area-exterior image data of the subject area excluding the necessary information areas (step SA-7).
That is, in the present embodiment, once all areas have been acquired, the images are combined to create a shine-free subject image.
An example of image composition in the present embodiment will now be described with reference to FIG. 9. FIG. 9 is a diagram illustrating an example of image composition in the present embodiment.
As shown in FIG. 9, in the present embodiment, the boundary (the dotted line in FIG. 9) may be set between the address field and the date-of-issue field of the driver's license, with the upper part and the lower part each set as a necessary information area.
Then, as shown in FIG. 9, during moving-image shooting, image data may be acquired for an image in which shine M occurred in the upper date-of-birth field K and address field L of the subject (driver's license) image, and for an image in which shine P occurred in the lower license number field N.
For each of the two images, the necessary information area image data in which no shine was detected may be recorded as non-shine area image data.
Then, as shown in FIG. 9, the pieces of non-shine area image data may be combined with each other to acquire composite image data.
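Tying the loop of steps SA-2 through SA-7 together, the following sketch records the first shine-free crop seen for each necessary information area and then pastes the crops over the area exterior of the last subject image. It assumes the hypothetical helpers detect_subject_region, crop_regions, and has_shine from the earlier sketches, assumes all frames have the same size, and skips the perspective correction a real implementation would need.

```python
import numpy as np

def compose_shine_free(frames, region_boxes):
    """Sketch of steps SA-2 to SA-7: collect shine-free crops, then merge."""
    acquired = {}
    canvas = None
    for frame in frames:
        if detect_subject_region(frame) is None:
            continue  # no subject found in this frame
        # Simplification: treat the whole frame as the subject area
        # instead of warping the detected quadrilateral.
        subject = frame
        canvas = subject.copy()  # area exterior comes from the last image
        for name, crop in crop_regions(subject, region_boxes).items():
            if name not in acquired and not has_shine(crop):
                acquired[name] = crop  # step SA-5: record non-shine data
        if len(acquired) == len(region_boxes):
            break  # step SA-6: all necessary information areas acquired
    if canvas is None or len(acquired) < len(region_boxes):
        return None  # caller should keep shooting (retry)
    h, w = canvas.shape[:2]
    for name, (rx, ry, rw, rh) in region_boxes.items():
        y0, x0 = int(ry * h), int(rx * w)
        crop = acquired[name]
        canvas[y0:y0 + crop.shape[0], x0:x0 + crop.shape[1]] = crop
    return canvas  # step SA-7: composite image data
```

A caller could, for example, feed capture_frames() into this function together with the hypothetical LICENSE_REGIONS table and pass the result to the OCR step.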
Returning to FIG. 2, the OCR unit 102i performs OCR processing on the composite image data to acquire character data, records the character data in the image data file 106b in association with the subject area image data (step SA-8), and ends the processing.
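For the OCR processing of step SA-8, a binding such as pytesseract is one possibility (our assumption; the embodiment does not name an OCR engine):

```python
import cv2
import pytesseract
from PIL import Image

def ocr_composite(composite_bgr) -> str:
    """Sketch of step SA-8: run OCR on the shine-free composite image."""
    rgb = cv2.cvtColor(composite_bgr, cv2.COLOR_BGR2RGB)
    return pytesseract.image_to_string(Image.fromarray(rgb), lang="jpn")
```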
Thus, in the present embodiment, a high OCR success rate can be achieved by performing OCR processing on the acquired composite image data.
An example of an OCR processing result in the present embodiment will now be described with reference to FIG. 10. FIG. 10 is a diagram illustrating an example of an OCR processing result in the present embodiment.
As shown in FIG. 10, when OCR processing is attempted on the date-of-birth field K and address field L affected by shine M in FIG. 9, the characters may not be recognized properly.
Likewise, as shown in FIG. 10, when OCR processing is attempted on the license number field N affected by shine P in FIG. 9, the characters may not be recognized properly.
Therefore, as shown in FIG. 10, the present embodiment performs OCR processing on the composite image data composed in FIG. 9, achieving higher OCR accuracy than performing OCR processing on the shine-affected image data itself.
In actual usage scenes such as store counters, the lighting is not uniform, and a photographed image of a driver's license may shine, hindering accurate character recognition.
Even in such cases, in the present embodiment, the personal information on the driver's license (name, date of birth, address, date of issue, license number, and so on) can be acquired accurately by scanning the license with a mobile camera and performing OCR processing.
This makes it easy to carry out processing such as personal authentication by comparing the character data acquired by OCR processing against a personal information database.
Thus, in the present embodiment, rapid and highly accurate OCR can be performed even when shine occurs on the document medium.
In particular, in the present embodiment, for known document types (various licenses, invoices, and the like), the information areas that need to be acquired (for example, text fields or a face photograph) are stored in advance, so that the locations serving as joints for image composition can be identified.
That is, in the present embodiment, to perform highly accurate OCR, image composition is not performed where text exists; high OCR accuracy is achieved by targeting text-free joints for composition.
Thus, in the present embodiment, for document types known in advance, the boundaries of the necessary information areas are used as joints.
If image composition were performed at a joint crossing a text portion, noise components would be mixed into the text and degrade OCR accuracy; in the present embodiment, high OCR accuracy is achieved by not using text portions as composition joints.
Accordingly, in the present embodiment, shine detection can be used to execute OCR processing on shine-free composite image data, realizing OCR processing that avoids the adverse effects of shine and improving the accuracy of character recognition.
In the present embodiment, shine-free image data can also be composed quickly by combining shine detection with moving-image processing techniques.
Furthermore, since shine detection is executed for each necessary information area in the subject area of the captured image, there is no need to repeat shooting until the entire subject area is free of shine.
For this reason, in the present embodiment, the number of retries during moving-image shooting can be reduced, and rapid OCR processing can be performed.
In recent years, improvements in the resolution of camera devices built into smartphones and the like have made it possible to perform OCR on images captured by mobile cameras.
In particular, document media such as various licenses and health insurance cards can easily be photographed and put through OCR for use as a means of personal authentication, which is simplifying contract procedures at commercial counters and stores.
However, many of these document media reflect light under illumination, producing shine spots in the captured image and degrading OCR recognition performance.
Moreover, if shine occurs on an image portion containing content required for personal authentication, such as a face photograph, the document can no longer serve as a certificate.
Therefore, to enable accurate OCR even in scenes where shine occurs, and to acquire the required partial images, a technique for removing or avoiding the influence of shine was needed.
Accordingly, the present embodiment discloses a technique for dealing with shine that exploits a characteristic of mobile cameras: they can shoot video while the shooting position is moved.
Specifically, in the present embodiment, shine position detection is performed during real-time shooting, and shine-free partial images containing the required content are extracted and combined, realizing the generation of image data of a shine-free subject image.
In the present embodiment, OCR processing is then performed on this composite image data, achieving high OCR accuracy.
Conventionally, there have been techniques that remove the influence of shine by estimating the shine luminance distribution, for example by averaging the luminance distribution in the captured image, and subtracting the shine component from the captured image.
However, because the shine distribution is difficult to estimate accurately, noise due to estimation error arises in the image from which the shine component has been subtracted, causing misrecognition during OCR execution.
There have also been conventional techniques for detecting shine positions from color information in the image, mainly a global technique based on a luminance value histogram of the entire image and a local technique based on local luminance values in the image.
However, both techniques were used only to determine whether the captured image was shining.
In particular, a retry process was in practical use in which shine detection was performed before OCR processing; if the image was shining, OCR execution was rejected and shooting was requested again until a shine-free image was obtained.
That is, with either technique, if shine occurred in even a small part of the image, shooting was requested again, so many shots had to be repeated before OCR could be performed.
Therefore, in the present embodiment, it is not necessary to repeat attempts before OCR execution until it is determined that the entire image is free of shine.
To perform OCR quickly, even if shine occurs in part of the image, OCR is performed on the non-shining portions, so the number of shooting retries can be kept as small as possible.
Furthermore, in the present embodiment, there is no need to apply conventional shine-removal image processing to the subject area, so OCR misrecognition can be reduced and high-accuracy OCR realized.
[Other Embodiments]
While embodiments of the present invention have been described above, the present invention may be implemented in various different embodiments other than those described, within the scope of the technical idea set forth in the claims.
For example, the image processing apparatus 100 may perform processing in a stand-alone form, or may perform processing in response to a request from a client terminal (housed separately from the image processing apparatus 100) and return the processing result to that client terminal.
Of the processes described in the embodiments, all or part of the processes described as automatic may be performed manually, and all or part of the processes described as manual may be performed automatically by known methods.
In addition, the processing procedures, control procedures, specific names, information including parameters such as registration data and search conditions for each process, screen examples, and database configurations shown in the specification and drawings may be changed arbitrarily unless otherwise noted.
Regarding the image processing apparatus 100, the illustrated components are functional and conceptual, and need not be physically configured as illustrated.
For example, all or any part of the processing functions of each device of the image processing apparatus 100, in particular the processing functions performed by the control unit 102, may be realized by a CPU and a program interpreted and executed by that CPU, or may be realized as wired-logic hardware.
The program, which includes programmed instructions for causing a computer to execute the method according to the present invention, is recorded on a non-transitory computer-readable recording medium and is mechanically read by the image processing apparatus 100 as necessary. That is, a computer program for giving instructions to the CPU in cooperation with the OS (Operating System) and performing various processes is recorded in the storage unit 106, such as a ROM or HDD. This computer program is executed by being loaded into RAM and, in cooperation with the CPU, constitutes the control unit.
This computer program may also be stored in an application program server connected to the image processing apparatus 100 via an arbitrary network, and all or part of it may be downloaded as necessary.
The program according to the present invention may also be stored on a computer-readable recording medium, or configured as a program product. Here, the "recording medium" includes any "portable physical medium" such as a memory card, USB memory, SD card, flexible disk, magneto-optical disk, ROM, EPROM, EEPROM, CD-ROM, MO, DVD, or Blu-ray (registered trademark) Disc.
A "program" is a data processing method described in an arbitrary language or description method, regardless of its form, such as source code or binary code. A "program" is not necessarily limited to a single construct; it includes programs distributed as multiple modules or libraries and programs that achieve their functions in cooperation with a separate program, typified by the OS. Well-known configurations and procedures can be used for the specific configuration for reading the recording medium in each device shown in the embodiments, for the reading procedure, and for the installation procedure after reading.
The various databases and the like stored in the storage unit 106 are storage means such as memory devices like RAM or ROM, fixed disk devices like hard disks, flexible disks, and/or optical disks, and may store the various programs, tables, databases, and/or web page files used for various processes and for website provision.
The image processing apparatus 100 may be configured as an information processing apparatus such as a known personal computer, or configured by connecting arbitrary peripheral devices to such an information processing apparatus. The image processing apparatus 100 may also be realized by installing software (including programs, data, and the like) that causes the information processing apparatus to implement the method of the present invention.
Furthermore, the specific form of distribution and integration of the devices is not limited to that illustrated; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various additions or functional loads. That is, the embodiments described above may be implemented in any combination, or selectively.
As described above, the image processing apparatus, image processing method, and program can be implemented in many industrial fields, particularly in the image processing field handling images read by cameras, and are extremely useful.
DESCRIPTION OF SYMBOLS
100 Image processing apparatus
102 Control unit
102a Frame acquisition unit
102b Subject area acquisition unit
102c Necessary information area acquisition unit
102d Shine detection unit
102e Non-shine area acquisition unit
102f Divided area acquisition unit
102g Image composition unit
102h Division composition unit
102i OCR unit
106 Storage unit
106a Necessary information area file
106b Image data file
110 Photographing unit
112 Input/output unit
114 Sensor unit
116 Communication unit

Claims (19)

1. An image processing apparatus comprising:
frame acquisition means for acquiring a photographed frame;
subject area acquisition means for acquiring subject area image data of a subject area from the frame;
necessary information area acquisition means for detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area;
shine detection means for detecting shine in the necessary information area image data; and
non-shine area acquisition means for acquiring non-shine area image data, being the necessary information area image data in which the shine was not detected by the shine detection means.
2. The image processing apparatus according to claim 1, further comprising image composition means for acquiring composite image data obtained by combining a plurality of pieces of the non-shine area image data acquired by the non-shine area acquisition means.
3. The image processing apparatus according to claim 1 or 2, further comprising:
divided area acquisition means for detecting a non-character area by character detection processing on the necessary information area image data in which the shine was detected by the shine detection means, and acquiring divided area image data of divided areas obtained by dividing the necessary information area at the non-character area,
wherein the shine detection means detects shine in each piece of the divided area image data,
the non-shine area acquisition means acquires non-shine divided area image data, being the divided area image data in which the shine was not detected by the shine detection means, and
the apparatus further comprises division composition means for acquiring the non-shine area image data by combining the non-shine divided area image data acquired by the non-shine area acquisition means.
4. The image processing apparatus according to any one of claims 1 to 3, further comprising necessary information area storage means for storing boundary data of the necessary information area,
wherein the necessary information area acquisition means detects the necessary information area based on the boundary data and acquires the necessary information area image data.
5. The image processing apparatus according to any one of claims 1 to 4, wherein the shine detection means detects the shine based on a comparison between the luminance of the necessary information area image data and a predetermined threshold.
6. The image processing apparatus according to claim 3, wherein the divided area acquisition means detects the non-character area by an edge detection method on the necessary information area image data in which the shine was detected by the shine detection means, and acquires the divided area image data of the divided areas.
7. The image processing apparatus according to any one of claims 1 to 6, further comprising OCR means for performing OCR processing on the non-shine area image data to acquire character data.
8. The image processing apparatus according to any one of claims 1 to 7, wherein the necessary information area is an image corresponding to a whiteboard.
9. The image processing apparatus according to any one of claims 1 to 8, wherein the necessary information area is an area of the subject area in which necessary information is visible.
10. An image processing method comprising:
a frame acquisition step of acquiring a photographed frame;
a subject area acquisition step of acquiring subject area image data of a subject area from the frame;
a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area;
a shine detection step of detecting shine in the necessary information area image data; and
a non-shine area acquisition step of acquiring non-shine area image data, being the necessary information area image data in which the shine was not detected in the shine detection step.
11. The image processing method according to claim 10, further comprising an image composition step of acquiring composite image data obtained by combining a plurality of pieces of the non-shine area image data acquired in the non-shine area acquisition step.
12. The image processing method according to claim 10 or 11, further comprising:
a divided area acquisition step of detecting a non-character area by character detection processing on the necessary information area image data in which the shine was detected in the shine detection step, and acquiring divided area image data of divided areas obtained by dividing the necessary information area at the non-character area,
wherein in the shine detection step, shine is detected in each piece of the divided area image data,
in the non-shine area acquisition step, non-shine divided area image data, being the divided area image data in which the shine was not detected in the shine detection step, is acquired, and
the method further comprises a division composition step of acquiring the non-shine area image data by combining the non-shine divided area image data acquired in the non-shine area acquisition step.
13. The image processing method according to any one of claims 10 to 12, wherein in the necessary information area acquisition step, the necessary information area is detected based on stored boundary data of the necessary information area, and the necessary information area image data is acquired.
14. The image processing method according to any one of claims 10 to 13, wherein in the shine detection step, the shine is detected based on a comparison between the luminance of the necessary information area image data and a predetermined threshold.
15. The image processing method according to claim 12, wherein in the divided area acquisition step, the non-character area is detected by an edge detection method on the necessary information area image data in which the shine was detected in the shine detection step, and the divided area image data of the divided areas is acquired.
16. The image processing method according to any one of claims 10 to 15, further comprising an OCR step of performing OCR processing on the non-shine area image data to acquire character data.
17. The image processing method according to any one of claims 10 to 16, wherein the necessary information area is an image corresponding to a whiteboard.
18. The image processing method according to any one of claims 10 to 17, wherein the necessary information area is an area of the subject area in which necessary information is visible.
19. A program for causing a computer to execute:
a frame acquisition step of acquiring a photographed frame;
a subject area acquisition step of acquiring subject area image data of a subject area from the frame;
a necessary information area acquisition step of detecting a necessary information area in the subject area and acquiring necessary information area image data of the necessary information area;
a shine detection step of detecting shine in the necessary information area image data; and
a non-shine area acquisition step of acquiring non-shine area image data, being the necessary information area image data in which the shine was not detected in the shine detection step.

Cited By (1)
CN113194253A (维沃移动通信有限公司, published 2021-07-30): Shooting method and device for removing image reflection and electronic equipment

Patent Citations (4)
JPH0757045A (Matsushita Electric Ind Co Ltd, published 1995-03-03): Driver's license recognition device
JPH07210628A (Matsushita Electric Ind Co Ltd, published 1995-08-11): License reader
JP2005130326A (Konica Minolta Photo Imaging Inc, published 2005-05-19): Digital camera
JP2008071076A (Oki Electric Ind Co Ltd, published 2008-03-27): Image extraction device
