WO2011096571A1

WO2011096571A1 - Input device

Info

Publication number: WO2011096571A1
Application number: PCT/JP2011/052591
Authority: WO
Inventors: 堪亮坂本; 慶木下
Original assignee: 株式会社ネクステッジテクノロジー
Priority date: 2010-02-08
Filing date: 2011-02-08
Publication date: 2011-08-11
Also published as: JP2013080266A

Abstract

Disclosed is an input device by way of a video image for simplifying an algorithm when remotely operating an information terminal device or personal computer, lowering the amount of data to be processed to reduce the amount of computation and memory usage, greatly reducing the price of the device overall, and enabling remote operation of the personal computer in real time. For a low-resolution color image acquired by photographing a user by Web cameras (4 and 6), graying processing, image partitioning/binarization processing, inter-frame difference processing, histogram processing, active rectangular region extraction processing, active rectangular region selection processing, virtual cursor control processing/screen control processing are performed; movement of hands of an operator is detected without being affected by movement of a person in the background; and size control, position control, color control, and click control of a virtual cursor (25), as well as zoom-up control, zoom-down control, rotation control, vertical scroll control, and horizontal scroll control of a screen to be operated are performed.

Description

Input device

The present invention is used by being connected to an information device such as an information terminal device or a personal computer, takes an operation image of an operator (user) by a camera, and controls cursor operation of the information device, selection and execution of an application program, and the like. An input device, in particular, video input that simplifies the algorithm and reduces the amount of processing data as much as possible to reduce the amount of computation and memory usage, as well as controlling the cursor of the personal computer in real time. Relates to the device.

In recent years, various input devices have been proposed in which a user is photographed with a camera, image analysis is performed, and an apparatus such as an audio device or an air conditioner is operated using the image analysis result.

FIG. 32 is a block diagram of an operation input device for explaining such a conventional image input device (see Patent Document 1).

The operation input device 101 shown in this figure uses the visible light camera 106 (see FIG. 33) and the like to image the user and output a color image, and analyzes the color image output from the imaging unit 102. , Hand region detecting means 103 for detecting the shape of the user's hand, and hand operation determining means for comparing the hand shape registered in advance with the hand shape output from the hand region detecting means 103 to determine the operation instruction content 104, and a selection menu expression unit 105 that informs the user of the selection menu by voice or a project image based on the determination content of the manual operation determination unit 104.

The operation input device 101 extracts a hand region from a color image photographed by the visible light camera 106 and determines what shape the hand is in (for example, tilting a hand or bending a finger). The judgment is made, and a manual operation instruction corresponding to the judgment content is notified to the user by voice or a project image.

JP 2009-104297 A

By the way, in such a conventional operation input device 101, since it is necessary to accurately detect the hand shape of the user, a high-resolution camera capable of photographing the hand portion in detail must be used as the visible light camera 106. In other words, there is a problem that the entire operation input device 101 becomes expensive.

In the conventional operation input device 101, both the background image processed by the hand region detection means 103 and the current image have high resolution, so the number of pixels and the amount of data are enormous. Therefore, the capacity of the hard disk, the memory capacity, etc. have to be increased by that amount, and there is a problem that the entire operation input device 101 becomes expensive.

Furthermore, in the conventional operation input device 101, as shown in FIG. 33 as the hand region detection unit 103, the difference region extraction unit 110, the skin color region extraction unit 111, the binarization correction unit 112, the distance calculation unit 113, the center weight correction Means 114, hand region candidate detection means 115, contour length / area calculation means 116, hand region determination means 117, etc., which not only have to perform complicated calculations, but also use a fairly fast CPU or dedicated circuit. Otherwise, there was a problem that the shape of the hand could not be detected in real time.

For this reason, it is difficult to use the above-described conventional operation input device 101 as an input device for easily remotely operating a cursor of a personal computer, etc., and an inexpensive camera can be applied, with a small amount of computation and a small amount of memory. The development of an input device capable of remote operation input has been strongly desired.

Further, the conventional operation input device 101 has a problem that when a person other than the user moves within the shooting range of the visible light camera 106 and moves his / her hand, this is detected and malfunctions.

In view of the above circumstances, the present invention enables the use of a low-resolution and inexpensive camera, and greatly reduces the cost of the entire apparatus by greatly reducing the amount of calculation and memory, and the movement of the user. It is an object of the present invention to provide an input device that can remotely detect a cursor of a personal computer and can operate a personal computer cursor and the like, and does not malfunction even when a person other than the user moves within the shooting range of the camera.

In addition, it is possible to use an inexpensive camera with low resolution separated from a personal computer, etc., and by significantly reducing the cost of the entire device by significantly reducing the circuit scale, the user's movement is detected and the personal computer is It is an object of the present invention to provide an input device that can remotely control the cursor, etc., and that does not malfunction even when a person other than the user is in the shooting range of the camera and moves his / her hand.

It also enables the use of low-resolution and inexpensive cameras, and detects one-handed movement of the user while significantly reducing the overall cost of the device by significantly reducing the amount of computation, memory, or circuit scale. The purpose is to provide an input device that can remotely control the cursor of the computer, the scroll of the operation target screen, etc., and that does not malfunction even if a person other than the user is in the shooting range of the camera and moves his / her hand. To do.

In addition, it enables the use of low-resolution and inexpensive cameras, and detects the user's two-handed movement while significantly reducing the overall cost of the device by greatly reducing the amount of computation, memory, or circuit scale. Provide an input device that can remotely control operations such as enlargement / reduction of the operation target screen and rotation of the operation target screen, and that does not malfunction even if a person other than the user moves within the shooting range of the camera. For the purpose.

In addition to enabling the use of low-resolution and inexpensive cameras, and reducing the amount of computation, memory, or circuit scale significantly, the cost of the entire apparatus is greatly reduced, and the user's hand moves. Provides an input device that accurately detects only the part, performs stable virtual cursor control, click control, control of the operation target screen, and does not malfunction even if a person other than the user moves within the shooting range of the camera The purpose is to do.

In addition, it is possible to use an inexpensive camera with a low resolution, and due to the shadow of the user while greatly reducing the cost of the entire apparatus by significantly reducing the amount of computation, the amount of memory, or the circuit scale. To provide an input device that prevents malfunctions, performs stable virtual cursor control, click control, and operation target screen control, and does not malfunction even when a person other than the user moves within the shooting range of the camera. Objective.

In addition, it is possible to use an inexpensive camera with a low resolution, and to significantly reduce the cost of the entire apparatus by significantly reducing the amount of calculation, memory, or circuit scale. It is possible to enable only the image included in a slightly wider range than the user's hand image from inside, invalidate other images and remove noise existing in parts other than the change area, and within the shooting range of the camera An object of the present invention is to provide an input device that does not malfunction even when a person other than the user is present and moves his / her hand.

In addition to enabling the use of low-resolution and inexpensive cameras, and reducing the amount of computation, memory, or circuit scale significantly, the cost of the entire apparatus is greatly reduced, and the user's hand moves. It is an object of the present invention to provide an input device that accurately detects only a portion, performs stable virtual cursor control, click control, and operation target screen control, and prevents malfunction caused by hand movements that are not intended by the user.

The binocular parallax method can be used while significantly reducing the cost of the entire device by significantly reducing the amount of computation, memory, or circuit scale while allowing the use of low-resolution and inexpensive cameras. An object of the present invention is to provide an input device with improved accuracy by correcting the measurement distance from the camera to the subject.

The binocular parallax method can be used while significantly reducing the cost of the entire device by significantly reducing the amount of computation, memory, or circuit scale while allowing the use of low-resolution and inexpensive cameras. An object of the present invention is to provide an input device that speeds up processing when correcting a measurement distance from a camera to a subject.

In order to achieve the above object, an input device of the present invention is an input device that processes an operator's image obtained by a video camera and generates an operation instruction according to the operation content of the operator. A color camera for the right eye that shoots, and a color camera for the left eye that is arranged alongside the color camera for the right eye at a position away from the color camera for the right eye by a predetermined distance; The color image output from the right-eye color camera is subjected to graying processing, image division / binarization processing, interframe difference processing, histogram processing, and active rectangular area extraction processing, and the right eye side of the operator A right eye side image processing program for extracting an active rectangular area and a color image output from the left eye color camera, graying processing, image division / binarization processing, inter-frame difference processing, hysteresis Gram processing, activity rectangle area extraction processing, the left eye side image processing program for extracting the left eye side activity rectangle area of the operator, the right eye side activity rectangle area obtained by the right eye side image processing program, The left-eye side active rectangular area obtained by the left-eye side image processing program is subjected to an active rectangular area selection process using a binocular parallax method, a virtual cursor control process / screen control process, and the operator's hand Or an image processing program that detects the movement of the fingertip and generates an operation instruction according to the detection result.

Further, the input device of the present invention is an input device that processes an image of an operator obtained by a video camera, generates an operation instruction according to the operation content of the operator, and controls the operation of the remote operation target device. An input device case formed in a box shape, a color camera body for a right eye that is attached to the front left side of the input device case and takes an image of an operator, and attached to the front right side of the input device case A color camera body for the left eye that captures an image of the operator, and a graying processing circuit, an image division / binarization processing circuit, an inter-frame difference processing circuit, and a histogram processing circuit, which are disposed in the input device casing. A right-eye side image processing board for processing a color image output from the right-eye color camera main body by the activity rectangular area extraction processing circuit and extracting the right-eye side activity rectangular area of the operator; Arranged in the apparatus casing and output from the color camera body for the left eye by a graying processing circuit, an image division / binarization processing circuit, an inter-frame difference processing circuit, a histogram processing circuit, and an active rectangular area extraction processing circuit A left-eye side image processing board for processing a color image and extracting the left-eye side active rectangular area of the operator, and an active rectangular area selection processing circuit, a virtual cursor control process / screen disposed in the input device casing Activity using the binocular parallax method on the right-eye side active rectangular area obtained on the right-eye side image processing board and the left-eye side active rectangular area obtained on the left-eye side image processing board by the control processing circuit Performs rectangular area selection processing, virtual cursor control processing / screen control processing, detects the movement of the operator's hand or fingertip, generates pointing data according to the detection result, and performs remote operation It is characterized in that it comprises a common substrate for controlling the operation of the elephant equipment.

In the input device of the present invention, when the virtual cursor control process / screen control process or the virtual cursor control process / screen control processing circuit has one active rectangular area group on the virtual cursor active area image, Based on the shape and presence / absence of movement, a Carlsol control instruction or a screen scroll instruction is generated.

In the input device of the present invention, when the virtual cursor control process / screen control process or the virtual cursor control process / screen control processing circuit has two active rectangular area groups on the virtual cursor active area image, Based on the moving direction, one of a screen rotation instruction, a screen enlargement instruction, and a screen reduction instruction is generated.

In the input device of the present invention, the activity rectangular area extraction process, or the activity rectangular area extraction processing circuit uses a histogram statistical processing result to generate a virtual cursor activity area image, a virtual button click activity area from the histogram. It is characterized by creating images.

In the input device of the present invention, the activity rectangle area extraction process, or the activity rectangle area extraction processing circuit is a multi-stage rectangle object extraction process for the virtual cursor activity area image or the virtual button click activity area image. And removing noise components.

Further, the input device of the present invention adds an enlargement / reduction rectangle mask creation process or an enlargement / reduction rectangle mask creation processing circuit, and performs the enlargement / reduction rectangle mask creation process or the enlargement / reduction rectangle mask creation processing circuit. The image corresponding to the change area rectangle on the virtual cursor activity area image or the change area rectangle of the virtual button click activity area image is extracted from the color images obtained by the color camera and the color camera body. The other image is cut to remove noise components.

In the input device of the present invention, the active rectangular area extraction process or the active rectangular area extraction processing circuit is the latest difference image of the multi-stage difference image data generated by the histogram processing or the histogram processing circuit. An invalid activity rectangle area is determined based on a comparison between the activity rectangle area extracted from the data and the virtual button click activity area image.

In the activity rectangular area selection process, the input apparatus according to the present invention may be configured such that the viewing angle of the color camera is a constant and the right eye activity rectangular area is input between the center point of the left eye activity rectangular area. The distance from the color camera to the subject is corrected according to the center coordinate distance in the horizontal direction.

The input device according to the present invention is characterized in that the distance to the subject corrected in accordance with the center coordinate distance is stored in advance as table data.

According to the present invention, it is possible to use an inexpensive camera with low resolution, and to detect a user's movement while greatly reducing the cost of the entire apparatus by greatly reducing the amount of calculation and memory. An input device that can remotely control a cursor of a personal computer or the like and that does not malfunction even when a person other than the user moves within the shooting range of the camera and moves his / her hand can be realized.

1 is a block diagram showing a first embodiment of an input device according to the present invention. 3 is a flowchart illustrating a detailed operation example of the input device illustrated in FIG. 1. 3 is a flowchart showing a detailed operation example of grayed / binarized image processing shown in FIG. 2. 3 is a flowchart showing a detailed operation example of inter-frame difference / histogram creation processing shown in FIG. 2. It is a flowchart which shows the detailed operation example of the activity rectangular area extraction process shown in FIG. It is a flowchart which shows the detailed operation example of the active rectangle area | region selection process shown in FIG. 3 is a flowchart showing detailed operation examples of a virtual cursor control process and a screen control process shown in FIG. 2. It is a schematic diagram which shows the example of an original histogram used with the input device shown in FIG. 1, and the example of a histogram after change area extraction. It is a schematic diagram which shows the histogram example after the change area extraction used with the input device shown in FIG. 1, and a virtual cursor active area image example. It is a schematic diagram which shows the histogram example after the change area extraction used with the input device shown in FIG. 1, and a virtual cursor active area image example. FIG. 3 is a schematic diagram showing an example of an active rectangular area group selected by an active rectangular area selection process in the input device shown in FIG. 1. FIG. 3 is a schematic diagram showing an example of an active rectangular area group selected by an active rectangular area selection process in the input device shown in FIG. 1. It is a schematic diagram which shows an example of the multistage object extraction process used with the input device shown in FIG. It is a schematic diagram which shows an example of the multistage object extraction process used with the input device shown in FIG. It is a schematic diagram which shows the outline | summary of the binocular parallax method used with the input device shown in FIG. In the input device shown in FIG. 1, it is a schematic diagram which shows the relationship between the right active rectangular area and the left active rectangular area before performing correction by the binocular parallax method. FIG. 2 is a schematic diagram showing a relationship between a right active rectangular area and a left active rectangular area after correction by the binocular parallax method in the input device shown in FIG. 1. In the input device shown in FIG. 1, it is a schematic diagram which shows the actual example of the right activity rectangular area after correcting by the binocular parallax method, and each left activity rectangular area. It is a schematic diagram which shows the example of a relationship between the motion of the user's hand image | photographed with the input device shown in FIG. 1, and a virtual cursor. It is a schematic diagram which shows the example of a relationship between the motion of the user's hand image | photographed with the input device shown in FIG. 1, and a virtual cursor. It is a schematic diagram which shows the fine adjustment operation example of the real cursor controlled by the input device shown in FIG. It is a schematic diagram which shows the color control example of the virtual cursor controlled by the input device shown in FIG. It is a schematic diagram which shows the histogram example after the change area extraction used with the input device shown in FIG. 1, and a virtual cursor active area image example. It is a schematic diagram which shows the example of a relationship between the motion of the user's hand image | photographed with the input device shown in FIG. 1, and click operation | movement. It is a schematic diagram which shows an example when there exist two each active rectangular area groups on the virtual cursor active area image obtained by the input device shown in FIG. It is a schematic diagram which shows the positional relationship example of each active rectangular area group on the virtual cursor active area image obtained by the input device shown in FIG. It is a schematic diagram which shows an example when there exist two each active rectangular area groups on the virtual cursor active area image obtained by the input device shown in FIG. It is a schematic diagram which shows the positional relationship example of each active rectangular area group on the virtual cursor active area image obtained by the input device shown in FIG. It is a block diagram which shows 2nd Embodiment of the input device by this invention. It is a flowchart which shows other embodiment of the input device by this invention. FIG. 31 is a schematic diagram illustrating an operation example of the flowchart illustrated in FIG. 30. It is a block diagram which shows an example of the operation input apparatus known conventionally. It is a block diagram which shows the detailed circuit structural example of the hand area | region detection means shown in FIG. FIG. 6 shows a step 41 ′ to be processed instead of the step 41 in the flowchart shown in FIG. 5. Step 43 ′ to be processed is shown instead of step 43 in the flowchart shown in FIG. 5. FIG. 7 shows a step 57 ′ to be processed instead of the step 57 in the flowchart shown in FIG. 6. FIG. 2 is a schematic diagram for explaining an operation for removing noise generated by an operation unintended by a user in an extraction activity region in the input device shown in FIG. 1. It is a schematic diagram explaining the relationship between the camera viewing angle, the subject width, and the distance to the subject in one camera. It is a schematic diagram explaining the measurement of the exact distance from a camera to a to-be-photographed object by the binocular parallax method by two cameras of the input device shown in FIG.

1. Description of First Embodiment FIG. 1 is a block diagram showing a first embodiment of an input device according to the present invention.
The input device 1a shown in this figure includes a built-in web camera (a left-eye color camera of claim 1) 4 provided in the display unit 3 of the personal computer 2, a video capture 5 provided in the personal computer 2, and a personal computer 2 An external web camera (a right-eye color camera according to claim 1) 6 provided in the display unit 3, a USB interface 7 provided in the personal computer 2, a hard disk 8 provided in the personal computer 2, and the personal computer 2. The CPU 9 is provided inside and the memory 10 is provided in the personal computer 2. The input device 1a analyzes the color images obtained by the

web cameras

4 and 6, and has a predetermined distance range from the installation position of the

web cameras

4 and 6, for example, “0.3 m” to “0.8 m”. A virtual cursor 25 [FIG. 19] displayed on the display unit 3 of the personal computer 2 is detected by detecting only movements of the user's hand, fingertips, etc. while distinguishing between a user within the range and another person at a distance other than that. (See (b))], the operation target screen (OS screen, application screen) and the like are controlled, and the currently running application is controlled.

The web camera 4 is a color camera having a resolution of about 320 pixels × 240 pixels, and supplies the video capture 5 with a color video signal obtained by photographing the user when a shooting instruction is issued from the video capture 5. To do.

When a shooting instruction is issued from the CPU 9 via the system bus 12, the video capture 5 controls the web camera 4 to shoot a user, captures a color video signal obtained by the shooting operation, and receives an RGB signal. It is converted into a color image of the format and supplied to the CPU 9.

The web camera 6 is a color camera having a resolution of about 320 pixels × 240 pixels attached to the upper edge of the display unit 3 at a predetermined distance in the horizontal direction from the web camera 4. When the photographing instruction is issued, the YUV signal obtained by photographing the user is supplied to the USB interface 7.

When a shooting instruction is issued from the CPU 9 via the system bus 12, the USB interface 7 controls the web camera 6 to take a user's image and captures the YUV signal obtained by the shooting operation. Is converted into a color image in the RGB signal format.

The hard disk 8 includes an OS storage area 13 in which an OS (Operating System) and constant data are stored, an application storage area 14 in which application programs such as an Internet Explorer program and a browser program are stored, and image processing used in the present invention. An image processing program storage area 15 for storing a program (a right eye side image processing program, a left eye side image processing program, an image processing program) according to claim 1 and an HSV (hue / saturation / lightness) method are set in advance. A color mask, a binarized image, a histogram, a virtual cursor active area image 27 (see FIG. 9), a virtual button click active area image, and the like necessary for extracting a specific color image (for example, skin color). And an image storage area 16 to be stored. When a read instruction is output from the CPU 9, the CPU 9 captures this via the system bus 12 and stores the OS, constant data, application program, image processing program, binary image, histogram, The virtual cursor activity area image 27, the virtual button click activity area image, and the like are read out and supplied to the CPU 9 via the system bus 12. Further, when a write instruction and data are output from the CPU 9, these are taken in via the system bus 12 and stored in an area designated by the write instruction, for example, the image storage area 16.

The CPU 9 generates display data specified by the OS, constant data, application program and the like stored in the hard disk 8 and supplies the display data to the display interface 11 connected to the system bus 12, and displays the operation target screen on the display unit 3. Display. In addition, the image processing described in the right eye side image processing program, the left eye side image processing program, the image processing program, etc. is performed, the control of the size and position of the virtual cursor displayed on the operation target screen, click control, Scroll control, screen rotation control, screen enlargement control, screen reduction control, etc.

The memory 10 has a capacity of about several hundred megabytes to several gigabytes, and temporary data when the CPU 9 performs processing specified by an application program, a right eye side image processing program, a left eye side image processing program, an image processing program, or the like. Used as storage area.

Next, the image processing operation, cursor control operation, screen control operation, etc. of the input device 1a will be described with reference to the flowcharts shown in FIGS. 2 to 7, the schematic diagrams shown in FIGS. 8 to 28, and the like.

<< Binary image generation and storage >>
First, when the personal computer 2 is turned on and an application program, a right eye side image processing program, a left eye side image processing program, and an image processing program are started, the CPU 9 performs video capture as shown in the flowchart of FIG. 5 is controlled, the color video signal obtained by the photographing operation of the web camera 4 is taken in, and is converted into a color image in the RGB signal format and temporarily stored in the memory 10 or the like (step S1).

In parallel with this operation, the USB interface 7 is controlled by the CPU 9 so that the YUV signal obtained by the photographing operation of the web camera 6 is taken in and converted into a color image in the RGB signal format and temporarily stored in the memory 10 or the like. Stored (step S2).

In parallel with these operations, among the color images (color images obtained by the shooting operations of the web cameras 4 and 6) temporarily stored in the memory 10 or the like by the CPU 9, a user is selected from the personal computer 2 side. When viewed, a color image obtained by a web camera corresponding to the right eye, for example, an external web camera 6 is read (step S3).

Thereafter, the CPU 9 starts graying / binarized image processing (step S4). That is, as shown in the flow chart of FIG. 3, the color image obtained by the external web camera 6 is masked by the color mask stored in the image storage rear 16 of the hard disk 8, and preset from the color images. A color image (skin color image) of a specific color (for example, skin color) is extracted (step S21), and the color image obtained by the external web camera 6 is gray-processed and set in advance. The image is converted into a monochrome image of a certain gradation, and the image capacity for one frame is reduced (step S22).

Then, the CPU 9 checks whether or not a screen division instruction is set. If there is a screen division instruction, the monochrome image is divided into a plurality of areas (each area is composed of several to several tens of pixels). If there is no screen division instruction, the division process is skipped, and then the monochrome image is binarized by the maximum likelihood threshold method to create a binarized image (step S23).

Next, the logical sum of the binarized image and the skin color image is taken by the CPU 9 and the skin color portion in the binarized image is extracted (step S24), and this is the binarized image for one frame (on the right eye side). (Binary image) is stored in the image storage area 16 of the hard disk 8 (step S25).

After that, among the color images (color images obtained by the shooting operations of the web cameras 4 and 6) temporarily stored in the memory 10 or the like by the CPU 9 as shown in the flowchart of FIG. When the user is viewed, a color image obtained by the web camera corresponding to the left eye, for example, the built-in web camera 4 is read (step S5).

Next, the CPU 9 starts graying / binarized image processing (step S6). That is, as shown in the flowchart of FIG. 3, the color image obtained by the built-in web camera 4 is masked by the color mask stored in the image storage area 16 of the hard disk 8 and preset from the color image. A color image (skin color image) of a specific color (for example, skin color) is extracted (step S21), and the color image obtained by the built-in web camera 4 is subjected to gray processing, and is set in advance. The image is converted into a tone monochrome image, and the image capacity for one frame is reduced (step S22).

Next, the logical sum of the binarized image and the skin color image is obtained by the CPU 9 and the skin color portion in the binarized image is extracted (step S24), and this is converted into a binarized image for one frame (on the left eye side). (Binary image) is stored in the image storage area 16 of the hard disk 8 (step S25).

Thereafter, the above-described image processing is repeated, and the right eye side binarized image and the left eye side binarized image are in the image storage area 16 of the hard disk 8 in FIFO (First / In / First / Out) format. The frame is stored for tens of frames.

《Difference between frames, creation of histogram》
In parallel with this operation, as shown in the flowchart of FIG. 2, the right eye side is selected from the binarized images of several frames to several tens of frames stored in the image storage area 16 of the hard disk 13 by the CPU 9. Several consecutive frames of binarized images including the latest binarized image corresponding to are sequentially read (step S7).

Then, the number of frames of the binarized image that can be read out by the CPU 9 is checked. If the number of frames is equal to or larger than the predetermined number (step S8), the inter-frame difference / histogram creation process is started (step S9). That is, as shown in the flowchart of FIG. 4, the inter-frame difference processing is performed on the binarized images for two consecutive frames among the respective binarized images (steps S31 and S32), and the inter-frame difference is performed. Each difference image obtained by the process is cumulatively added for each divided area to create a right-eye histogram and stored in the image storage area 16 of the hard disk 8 (steps S33 and S34).

Next, as shown in the flowchart of FIG. 2, the latest 2 corresponding to the left eye side among the binarized images of several frames to several tens of frames stored in the image storage area 16 of the hard disk 13 by the CPU 9. Several consecutive frames of binarized images including the binarized image are sequentially read out (step S10).

Then, the number of frames of the binarized image that can be read out by the CPU 9 is checked. If the number of frames is equal to or larger than the predetermined number (step S11), the interframe difference / histogram creation process is started (step S12). That is, as shown in the flowchart of FIG. 4, the inter-frame difference processing is performed on the binarized images for two consecutive frames among the respective binarized images (steps S31 and S32), and the inter-frame difference is performed. Each difference image obtained by the process is cumulatively added for each divided area to create a left eye side histogram and stored in the image storage area 16 of the hard disk 8 (steps S33 and S34).

《Statistical processing, virtual cursor activity area determination, change area extraction》
Thereafter, as shown in the flowchart of FIG. 2, the right-eye histogram stored in the image storage area 16 of the hard disk 8 is read by the CPU 9 (step S13), and the active rectangular area extraction process is started. (Step S14). That is, as shown in the flowchart of FIG. 5, statistical processing is performed on the density value of each divided area of the histogram, and an average value, density variance value, maximum value, deviation (± 1σ, ± 2σ), and the like are calculated ( Step S41).

Next, the CPU 9 extracts each divided area having a density value larger than the threshold value for extracting the change area rectangle (for example, average value −1σ) from each divided area of the histogram. A rectangular change area rectangle 65 (see FIG. 31) is determined so as to include (activity division area) and stored in the image storage area 16 of the hard disk 8.

In parallel with this operation, the CPU 9 extracts the virtual cursor rectangle from the divided areas 20 (see FIGS. 9 and 10) constituting the histogram as shown in the three-dimensional density distribution diagram of FIG. A divided area (activity divided area 21) having a density value larger than the threshold value (for example, maximum value −1σ) is extracted. The histogram shows the frequency distribution of 20 changes in each divided area. In this process, the active divided area 21 in which the change in the action is significant is extracted by capturing the user's operation.

As a result, when the user turns the fingertip greatly, a rectangular activity rectangular area 26 is determined so as to include each activity division area 21 as shown in FIG. 9, and based on the determination result, as shown in FIG. A right-eye virtual cursor activity area screen 27 is created and stored in the image storage area 16 of the hard disk 8.

Further, when the user moves both hands, a rectangular activity rectangular area 26 is determined so as to include each activity division area 21 as shown in FIG. 10, and the virtual cursor activity on the right eye side as shown in FIG. An area image 27 is created and stored in the image storage area 16 of the hard disk 8 (step S42).

《Determine activity area for virtual button click》
Thereafter, the CPU 9 reads the right-eye histogram stored in the image storage area 16 of the hard disk 8 and extracts a threshold value for extracting a virtual button click rectangle in each divided area 20 (for example, the maximum value − 2)) divided areas (activity divided areas) having a density value greater than 2σ) are extracted, and a rectangular activity rectangular area is determined so as to include each of these activity divided areas, and a virtual button on the right eye side is determined. A click activity area image (not shown) is created and stored in the image storage area 16 of the hard disk 8 (step S43).

《Multi-stage rectangular object extraction processing, shadow effect removal》
Next, the virtual cursor click area extraction threshold value 27 for the right eye side virtual cursor active region image 27 obtained by the CPU 9 using the threshold value for virtual cursor rectangle extraction (for example, the maximum value −1σ) is obtained. For example, with respect to the virtual button click activity region image on the right eye side obtained using the maximum value −2σ), it is checked whether or not each of the activity rectangular regions 26 can be divided into left and right. As shown in FIG. 13, the horizontal center point “A” of the active rectangular area 26 is obtained, and an inactive area on the left side of the horizontal center point “A”, an active area (for example, an active division area 21), Boundary point “B”, a non-active area on the right side of the horizontal center point “A”, and a boundary point “C” between the active area (for example, the active division area 21) are detected. B ”,“ C ” Are determined as each active rectangular area 26, and other active areas are determined as unnecessary active areas due to the shadows of the user and are invalidated (two-point extraction processing).

Thereafter, the CPU 9 checks whether or not each of the activity rectangular areas 26 for which the two-point extraction processing has been completed can be divided vertically, and if it can be divided vertically, as shown in FIG. The vertical center point “A” of the area 26 is obtained, and the boundary point “B” between the inactive area above the vertical center point “A” and the active area (for example, the active division area 21) is detected. Then, the area including these boundary points “B” is determined as the active rectangular area 26, and the lower active area is determined as an unnecessary active area due to the shadow of the user, and is invalidated (minimization process) (step S44). ).

Subsequently, the CPU 9 performs a virtual cursor activity region image 27 on the right eye side including the activity rectangular region 26 obtained by the multi-stage rectangular object extraction process constituted by the two-point extraction process and the minimization process, and the right-eye virtual image. The button click activity area image is stored in the image storage area 16 of the hard disk 8 (step S45).

《Statistical processing, virtual cursor activity area determination, change area extraction》
Thereafter, as shown in the flowchart of FIG. 2, the CPU 9 reads the left eye side histogram stored in the image storage area 16 of the hard disk 8 (step S15), and starts the active rectangular area extraction process. (Step S16). That is, as shown in the flowchart of FIG. 5, statistical processing is performed on the density value of each divided area of the histogram, and an average value, density variance value, maximum value, deviation (± 1σ, ± 2σ), and the like are calculated ( Step S41).

Next, the CPU 9 extracts each divided area 20 having a density value larger than the threshold for extracting the change area rectangle (for example, the average value −1σ), and includes these divided areas (activity divided areas). Thus, a rectangular change area rectangle 65 (see FIG. 31) is determined and stored in the image storage area 16 of the hard disk 8.

In parallel with this operation, the CPU 9 extracts the virtual cursor rectangle from the divided areas 20 (see FIGS. 9 and 10) constituting the histogram as shown in the three-dimensional density distribution diagram of FIG. A divided area (activity divided area 21) having a density value larger than the threshold value (for example, maximum value −1σ) is extracted.

As a result, when the user turns the fingertip greatly, a rectangular activity rectangular area 26 is determined so as to include each activity division area 21 as shown in FIG. 9, and based on the determination result, as shown in FIG. A virtual cursor activity area screen 27 on the left eye side is created and stored in the image storage area 16 of the hard disk 8.

Further, when the user moves both hands, a rectangular activity rectangular area 26 is determined so as to include each activity division area 21 as shown in FIG. 10, and the virtual cursor activity on the left eye side as shown in FIG. An area image 27 is created and stored in the image storage area 16 of the hard disk 8 (step S42).

《Determine activity area for virtual button click》
Next, the left-eye histogram stored in the image storage area 16 of the hard disk 8 is read by the CPU 9, and a threshold value for extracting a virtual button click rectangle in each divided area 20 (for example, the maximum value −2σ) ) A divided area (activity divided area) having a larger density value is extracted, and a rectangular activity rectangular area is determined so as to include each of these activity divided areas, and a virtual button click on the left eye side is clicked An active area image (not shown) is created and stored in the image storage area 16 of the hard disk 8 (step S43).

《Multi-stage rectangular object extraction processing, shadow effect removal》
Next, the virtual cursor active area image 27 on the left eye side obtained by the CPU 9 using the threshold for virtual cursor rectangle extraction (for example, the maximum value −1σ), the threshold for virtual button click rectangle extraction ( For example, with respect to the virtual button click activity region image on the left eye side obtained using the maximum value −2σ), it is checked whether each of the activity rectangular regions 26 can be divided into left and right. As shown in FIG. 13, the horizontal center point “A” of the active rectangular area 26 is obtained, and an inactive area on the left side of the horizontal center point “A”, an active area (for example, an active division area 21), Boundary point “B”, a non-active area on the right side of the horizontal center point “A”, and a boundary point “C” between the active area (for example, the active division area 21) are detected. B ”,“ C ” Are determined as each active rectangular area 26, and other active areas are determined as unnecessary active areas due to the shadows of the user and are invalidated (two-point extraction processing).

Thereafter, the CPU 9 checks whether or not each of the activity rectangular areas 26 for which the two-point extraction process has been completed can be divided vertically, and if it can be divided vertically, as shown in FIG. 26, the center point “A” in the vertical direction is obtained, and the boundary point “B” between the inactive region and the active region (for example, the active division area 21) above the vertical center point “A” is detected. The area including the boundary point “B” is determined as the active rectangular area 26, and the lower active area is determined as an unnecessary active area due to the shadow of the user or the like, and is invalidated (minimization process) (step S44).

Next, the left eye-side virtual cursor activity region image 27 including the activity rectangle region 26 obtained by the multi-step rectangular object extraction process constituted by the two-point extraction process and the minimization process by the CPU 9, and the left-eye virtual button The click activity area image is stored in the image storage area 16 of the hard disk 8 (step S45).

As described above, the movement of the user's hand can be captured by the virtual area activity area image 27 and the virtual button click activity area image of the right eye and the left eye. At this time, the input device 101 detects the movement of the hand not intended by the user. It would be even better if it could be detected. The activity rectangular area 26 as a result of extracting the activity area in the form of a histogram in multiple stages is extracted on the histogram data even when the user's intention is not pointing or tapping, for example, when the hand is waving left and right. Is done. Therefore, if it can be determined that this is not due to the pointing operation intended by the user, the reliability of the input device 101 is increased.

∙ To determine whether the extracted activity rectangle area is valid, the following processing is performed. That is, in the process of step S41 of FIG. 5, the process of step S41 ′ shown in FIG. 34 is performed, and the CPU 9 extracts the active rectangular area 26 from the latest difference data among the multi-stage difference image data that is formed into a histogram. And hold for a certain time while tracking those points.

Next, in the process of step S43 in FIG. 5, the process of step S43 ′ shown in FIG. 35 is performed, and the CPU 9 stores the virtual button click activity area (maximum value −2σ) in step S41 ′. When the corresponding tracking is obtained and the tracking result is active over an area of a specific size, it is determined that the activity rectangular area 26 is invalid.

FIG. 37 shows an active area extracted when the user moves his / her hand from the lower right to the upper left, further to the lower left, and then to the upper right. However, the active area generated from the tracking data of the activity rectangular area 26 has a large movement. Ignore it as noise by judging what was generated.

In the above processing by the CPU 9, it is possible to remove the extraction activity area caused by the operation not intended by the user as noise, and the operability is improved.

《Activity rectangular area selection》
Thereafter, as shown in the flowchart of FIG. 2, the active rectangle area selection process is started by the CPU 9 (step S17). That is, as shown in the flowchart of FIG. 6, the right eye side virtual cursor activity region image 27 and the right eye side virtual button click activity region image stored in the image storage area 16 of the hard disk 8 are read out, and these right Position correction by the binocular parallax method is performed on each active rectangular area 26 included in the virtual cursor activity area image 27 on the eye side and the virtual button click activity area image on the right eye side as shown in the schematic diagram of FIG. Center coordinates of each active rectangular area 26 so as to correspond to the attachment position (horizontal distance “B”, vertical distance, etc.) of each

webcam

4, 6, the focal length “f” of each

webcam

4, 6, etc. After “P _R (X _R , Y _R )” is corrected, numbers are added in order of magnitude (steps S51 and S52).

Next, the CPU 9 reads the left eye side virtual cursor activity region image 27 and the left eye side virtual button click activity region image stored in the image storage area 16 of the hard disk 8, and these left eye side virtual cursor activity regions are read out. Position correction by the binocular parallax method is performed on the area image 27 and each activity rectangular area 26 of the left eye side virtual button click activity area image as shown in the schematic diagram of FIG. The coordinates “P _L (X _L , Y _L )” of each active rectangular area 26 so as to correspond to the mounting position (horizontal distance “B”, vertical distance, etc.), the focal length “f” of each

web camera

4, 6, etc. Are corrected, numbers are added in order of size (steps S53 and S54).

As a result, the user's hand is at a focus position corresponding to the focal length of each of the

webcams

4 and 6, for example, “0.3 m” to “0.8 m” away from each of the

webcams

4 and 6. Before performing the position correction by the parallax method, as shown in the schematic diagram of FIG. 16, even if the positions of the activity rectangle areas 26 on the right eye side and the activity rectangle areas 26 on the left eye side are shifted, the binocular parallax method is used. As shown in the schematic diagram of FIG. 17, the right-side active rectangular area 26 and the left-eye active rectangular area 26 can be completely matched (or substantially matched). .

Thereafter, as shown in the schematic diagram of FIG. 15 by the CPU 9, the center coordinates “X _R , Y _R ” of the activity rectangular area 26 corresponding to the right eye side to which the number “1” is assigned correspond to the left eye side. The distance (center coordinate distance) with the center coordinates “X _L , Y _L ” of the active rectangular area 26 is calculated and stored in the memory 10 together with the number “1”.

Hereinafter, the central coordinates “X _R , Y _R ” of the activity rectangular area 26 corresponding to the right eye side to which the next number “2” to the last number “N” are assigned by the CPU 9 and the activity corresponding to the left eye side will be described. The distance (center coordinate distance) with respect to the center coordinates “X _L , Y _L ” of the rectangular area 26 is sequentially calculated and stored in the memory 10 together with the next number “2” to the last number “N” ( Steps S55 and S56).

When these processes are completed, the CPU 9 sequentially reads out the respective central coordinate distances stored in the memory 10 and compares them with a predetermined value. When there is an activity rectangle area 26 on the eye side and an activity rectangle area 26 on the left eye side, that is, when the user's hand is at a position away from each

web camera

4, 6 by “0.3 m” to “0.8 m”, It is determined that the right-eye activity rectangle area 26 and the left-eye activity rectangle area 26 corresponding to the user's hand are the valid right-eye activity rectangle area 26 and the valid left-eye activity rectangle area 26. It is determined that the other activity rectangle area 26 on the right eye side and the activity rectangle area 26 on the left eye side are the invalid activity rectangle area 26 on the right eye side and the activity rectangle area 26 on the invalid left eye side. .

Thus, using the binocular parallax method, while correcting the position of the active rectangular area, the coordinate distance in the horizontal direction (center coordinate distance) between the center points of the active rectangular areas 26 of the right eye and the left eye to the subject. Therefore, the effectiveness of the active area 26 can be determined with high accuracy. However, an error occurs in the center portion and the end portion of the subject in the field of view of the

web cameras

4 and 6. Therefore, the present invention further provides a method of correcting this and measuring a more accurate distance.

The distance is measured by calculating a correction value from the center coordinate distance in the active rectangular area 26 of each of the visual fields of the

webcams

4 and 6 with the camera viewing angle measured in advance as a constant. Will be described.

FIG. 38 explains the conversion of distance from the viewing angle of one web camera 4A. From the definition of the trigonometric function, the relationship of Equation 1 is established.

w: Width of the subject on the image [pixel]
img_w: Number of horizontal pixels of the camera image [pixel]
d: Distance between camera and subject [m]
δ: Subject width / 2 [m]
θw: Camera viewing angle / 2 [rad] (measured in advance)
However, w and (img_w) are “angles” as physical quantities because of their sizes in the image.
When this is solved for d, Equation 2 is obtained.

In Equation 2, since θw and img_w are constants, the distance d, which is a value to be obtained, depends on δ and w. Here, since δ varies depending on the subject 60 to be reflected, it is necessary to obtain the distance d without depending on δ.

Therefore, in the present invention using two

web cameras

4 and 6, the distance d can be obtained without depending on δ. FIG. 39 explains the conversion of the distance from the viewing angle of the two main and sub webcams. The distance d is expressed by Equation 3.

w ′: absolute value of the difference between the abscissa [pixel] of the center of the subject on the sub-camera image and the center abscissa [pixel] of the subject on the image of the main camera (| X _L −X _R |)
δ ′: Distance [m] between two points at which the center line of sight of the sub camera and the center line of sight of the main camera intersect at the position of the subject (when both eyes are parallel, δ ′ is always constant)
In Expression 3, the distance d does not depend on δ ′ because it is constant when the central line of sight of both eyes is parallel. Note that d is sufficiently larger than δ ′ (d >> δ ′), and even if the subject 60 exists at the edge of the image, it is approximately established.

From the above, if two cameras are used, the distance d depends only on w ′, as shown in Equation 4. Therefore, since d is calculated from the pixel value and is a discrete value, d (w ′) of each w ′ is calculated in advance and stored in a table, thereby correcting the distance (d) to the subject 60 at high speed. Can be processed.

If the above principle is incorporated in the determination of the effectiveness of the activity area 26 by the CPU 9, the accuracy of the determination can be further improved. In this case, the CPU 9 performs the process of step 57 ′ shown in FIG. 36 in the process of step S57, calculates the center coordinate distance w ′ (| X _L −X _R |), and stores the table data in the memory 10. The effective distance to the subject 60 is measured by comparing the distance to the subject 60 and it is determined whether the distance is within the range of “0.3 m” to “0.8 m”. This is a judgment.

As a result, if each activity rectangular area 26 on the right eye side and each activity rectangle area 26 on the left eye side have the relationship shown in the schematic diagram of FIG. 18, the number “the center coordinate distance is equal to or greater than a predetermined value”. The number “0” is determined to be invalid for the right-eye side active rectangle area (O _R1 ) 26 and the left-eye side active rectangle area (O _L1 ) 26 corresponding to “1”. It is determined that the right-eye active rectangular area (O _R2 ) 26 and the left-eye active rectangular area (O _L2 ) 26 corresponding to 2 ″ are valid.

Next, the right-side active rectangular area (O _R2 ) 26 determined to be valid by the CPU 9 and the left-side active rectangular area (O _L2 ) 26 determined to be valid are previously designated, For example, a virtual cursor activity region image 27 and a virtual button click activity region image in which the activity rectangle region ( _OL2 ) on the left eye side determined to be valid is left and the other activity rectangle regions 26 are deleted are created, This is stored in the image storage area 16 of the hard disk 8 as a virtual cursor activity area image 27 and a virtual button click activity area image from which the movement of the person in front of the user and the movement of the person behind are removed by the binocular parallax method. (Step S57).

《Position, size, and color control of virtual cursor by one hand gesture》
Thereafter, as shown in the flowchart of FIG. 2, the virtual cursor control process / screen control process is started by the CPU 9 (step S18). That is, as shown in the flowchart of FIG. 7, each virtual cursor activity region image stored in the image storage area 16 of the hard disk 8 (the movement of the person in front of the user and the person behind the user by the binocular parallax method). Among the virtual cursor activity region images 27), the virtual cursor activity region images 27 for several frames including the latest activity rectangle region 26 are read (step S61), and one or more adjacent ones are displayed. It is checked whether an active rectangle area group constituted by the active rectangle areas 26 exists in the virtual cursor active area image 27.

If there are active rectangular area groups in the latest virtual cursor active area image 27 and the number thereof is “1” and almost rectangular (steps S62 and S63), the CPU 9 determines the size of the active rectangular area group. Now, the moving direction is determined, and virtual cursor control is performed so as to correspond to the determination result (step S64).

For example, as shown in FIG. 19 (a), the position of the large active rectangular area group obtained by the previous processing corresponding to the user turning the fingertip at the same height, left and right positions as in the previous time. When a large group of active rectangular areas is obtained at the same position, the CPU 9 determines that it is a virtual cursor display instruction, and the display unit 3 has a large size and white virtual display as shown in FIG. A cursor 25 is displayed.

Further, when the user moves the fingertip up and down or left and right, and a large group of active rectangular areas moving from the position obtained in the previous process is obtained correspondingly, the CPU 9 performs virtual processing. The cursor movement instruction is determined, and the large size, white virtual cursor 25 displayed on the display unit 3 is moved so as to correspond to the moving direction of the fingertip.

Also, as shown in FIG. 20 (a), the position of the large active rectangular area group obtained by the previous processing corresponding to the user turning the fingertip small at the same height and left and right positions as in the previous time. When the small active rectangle area group is obtained at the same position as the virtual cursor, the CPU 9 determines that the movement of the virtual cursor is stopped, and the virtual cursor displayed on the display unit 3 as shown in FIG. The movement of 25 is stopped and the size is reduced.

If a certain time has passed in this state, the CPU 9 changes the color of the virtual cursor 25 to red and prohibits a large movement, and a cursor movement instruction is issued to the OS side, and the real cursor 28 is placed in the virtual cursor 25. Move.

Thereafter, when the user moves the fingertip slightly, this is detected by the CPU 9 and the position of the virtual cursor 25 displayed on the display unit 3 is finely adjusted, and a cursor position adjustment instruction is issued to the OS side. Thus, the position of the actual cursor 28 is finely adjusted as shown in FIG.

Next, if the user stops moving the fingertip, the CPU 9 detects this, and after a certain time, the position of the virtual cursor 25 displayed on the display unit 3 is fixed and the virtual cursor 25 is displayed as shown in FIG. The color of is changed from red to gray to inform the user that it can be clicked.

Even in this state, if the user turns the fingertip again largely, this is detected by the CPU 9, the color of the virtual cursor 25 displayed on the display unit 3 is returned to white, and the virtual cursor 25 is returned to a movable state. .

《Scroll control with one hand gesture》
When it is checked whether or not there is an active rectangular area group in the virtual cursor active area image 27 described above, as shown in FIG. 23, if the number of active rectangular area groups is “1” and is long in the horizontal direction (step In S62, S63), the CPU 9 determines which direction is longer than the previous active rectangular area group, and generates a right scroll instruction (or left scroll instruction) corresponding to the longer direction to the application side. The application screen (operation target screen) displayed on the display unit 3 is scrolled rightward (or leftward) (step S64).

If it is checked whether or not there is an active rectangular area group in the virtual cursor active area image 27 described above, if the number of active rectangular area groups is “1” and is long in the vertical direction (steps S62 and S63), the CPU 9 Is used to determine which direction is longer than the previous active rectangular area group, and an upward scroll instruction (or downward scroll instruction) corresponding to the longer direction is generated and passed to the application side, and the display unit 3 The application screen (operation target screen) displayed on the screen is scrolled upward (or downward) (step S64).

《Real cursor click control by one hand gesture》
Thereafter, the CPU 9 checks whether the color of the virtual cursor 25 is gray. If the color of the virtual cursor 25 is gray, the latest virtual button click activity area image stored in the image storage area 16 of the hard disk 8 is checked. The virtual button click activity region images for several frames including the activity rectangle region are read out (step S66).

Next, the CPU 9 checks whether there is an active rectangular area group composed of one or more active rectangular areas 26 in the virtual button click active area image, and whether the shape has changed. Is a change in which the active rectangular area group is set in advance, for example, as shown in FIG. 24 (a), the user extends his hand only once from the state of pointing, and the active rectangular area group is small once. If it has changed from “large” to “large” (step S67), it is determined that it is a single click, a single click instruction is issued to the OS side, and it is in the virtual cursor 25 as shown in FIG. An icon or the like is single-clicked by the real cursor 28 (step S68).

Also, if the user extends or shrinks his / her hand twice or more from the pointing state, and the activity rectangular area group has changed from “large” to “small” and “small” to “large” several times (step S67). The CPU 9 determines that it is a double click, and a double click instruction is issued to the OS side, and the icon or the like at the position of the real cursor 28 is double clicked (step S68).

《Screen enlargement and reduction control using two-hand gestures》
In addition, the user puts his right hand and left hand at the focus position of each of the

webcams

4 and 6 and moves the right hand and left hand to check whether or not there is an active rectangular area group in the virtual cursor active area image 27 described above. If the number of active rectangular area groups is “2” and each is rectangular (step S63), this is detected by the CPU 9 and displayed on the display unit 3 in accordance with the movement of each active rectangular area group. The operation target screen that has been enlarged is reduced, reduced, rotated, or the like (step S65).

For example, the user puts his right hand and left hand at the focus positions of the

web cameras

4 and 6 and moves the right hand and left hand away from each other, and there are two groups of active rectangular areas shown in FIG. 25 corresponding thereto. Then, as shown in FIG. 26 (a), when the distance between each of the activity rectangle areas becomes longer than the previous time by moving in the direction wider than the previous time, it is determined that the screen enlargement instruction is input by the CPU 9, and the activity rectangle A screen enlargement instruction with an enlargement ratio corresponding to the distance change ratio of the region group is generated and passed to the application side, and the application screen (operation target screen) displayed on the display unit 3 is enlarged.

In addition, the user puts out his right hand and left hand at the focus positions of the

web cameras

4 and 6 and moves the right hand and left hand in a direction approaching each other. Correspondingly, two active rectangular area groups are shown in FIG. When the distance between each of the active rectangular area groups is shorter than the previous time, the CPU 9 determines that a screen reduction instruction has been input, and the distance change ratio of the active rectangular area groups is A screen reduction instruction with a corresponding reduction ratio is generated and passed to the application side, and the application screen (operation target screen) displayed on the display unit 3 is reduced.

《Screen rotation control using two-hand gestures》
Also, the user puts out his right hand and left hand at the focus position of each of the

webcams

4 and 6 and moves one of the right hand and left hand up and the other down, corresponding to the two activity rectangles shown in FIG. When at least one of the area groups moves upward (or downward), the CPU 9 determines that a screen rotation instruction has been input, and depends on the angle of the upper active rectangular area group with respect to the lower active rectangular area group A screen rotation instruction with the rotation angle is generated and passed to the application side, and the application screen (operation target screen) displayed on the display unit 3 is rotated.

At this time, as shown in FIG. 28 (a), in a state where the left and right distances of the respective active rectangular area groups are narrow, one of the active rectangular area groups moves upward and the angle of the upper active rectangular area group with respect to the lower active rectangular area group is large. At this time, a screen rotation instruction with a large rotation angle is generated by the CPU 9 and passed to the application side, and the application screen (operation target screen) displayed on the display unit 3 is largely rotated.

Also, as shown in FIG. 28B, when the left and right distances of the respective active rectangular area groups are wide, when one of the active rectangular area groups is small and moves upward, the angle of the upper active rectangular area group with respect to the lower active rectangular area group is small. The CPU 9 generates a screen rotation instruction with a small rotation angle and passes it to the application side, and the application screen (operation target screen) displayed on the display unit 3 is rotated small.

As described above, in the first embodiment of the present invention, a graying process, an image division / binarization process, and the like for a low-resolution color image obtained by photographing a user with each of the

web cameras

4 and 6, Color cursor processing, frame buffer processing, inter-frame difference processing, histogram processing, active rectangular region extraction processing, active rectangular region selection processing, virtual cursor control processing / screen control processing are performed to detect the movement of the user's hand, and virtual cursor 25 Size control, position control, color control, click control, operation target screen enlargement control, reduction control, rotation control, up / down scroll control, left / right scroll control, etc., so that the following effects can be obtained. it can.

First, since the

inexpensive web cameras

4 and 6 that do not have high resolution can be used, the cost of the input device 1a can be kept low (effect of claim 1).

In addition, a binarized image obtained by performing graying processing, image division / binarization processing, and color filtering processing on a low-resolution color image obtained by photographing the user with the

web cameras

4 and 6. Is stored in the image storage area 16, the input device 1 a can be configured even when the capacity of the hard disk 8 is small, and the cost of the entire device can be kept low (effect of claim 1).

Further, low-resolution color images obtained by photographing the user with the

web cameras

4 and 6 are subjected to a small number of image processing such as graying processing, image division / binarization processing, and color filtering processing. Since a binary image for one frame is obtained, it is possible to prevent the CPU 9 from being subjected to a large burden, and thus to respond to a user's movement almost in real time even when an inexpensive CPU 9 whose processing speed is not fast is used. As described above, the size control, position control, color control, click control, enlargement control, reduction control, rotation control, up / down scroll control, left / right scroll control, and the like of the operation target screen can be performed. Can be kept low (effect of claim 1).

In addition, the center coordinate position is corrected by the binocular parallax method for each of the right-eye activity rectangle area 26 and the left-eye activity rectangle area 26 obtained by photographing the user with the

web cameras

4 and 6. After that, numbers are assigned in order of size, and the center coordinate positions are compared. Based on the comparison result, the right-side active rectangular area 26 and the left-eye active rectangular area 26 corresponding to the focus position are selected. Therefore, even if there is something other than the user's hand at the focus position of each

webcam

4, 6 such as a person behind the user and moving, only the movement of the user's hand is not affected by this. Can be extracted to perform size control, position control, color control, click control, enlargement control, reduction control, rotation control, up / down scroll control, left / right scroll control, and the like of the operation target screen. 1 Effect).

In the first embodiment of the present invention, when the user moves only one hand, it is determined that it is a virtual cursor control instruction or scroll control of the operation target screen, and the size control, position control, and color of the virtual cursor 25 are determined. Since control, click control, scroll control of the operation target screen, and the like are performed, the size, position, color, click, scroll of the operation target screen, etc. of the virtual cursor 25 displayed on the display unit 3 with only one hand can be controlled. Remote control is possible (effect of claim 3).

Further, in the first embodiment of the present invention, when the user moves both hands, the right hand movement and the left hand movement are detected, and the operation target screen enlargement / reduction control instruction or the operation target screen rotation control is detected. Since the instruction is determined to be an instruction, the application screen (operation target screen) displayed on the display unit 3 can be enlarged, reduced, and rotated only by the user moving the right hand and the left hand. Effect).

In the first embodiment of the present invention, a virtual cursor activity region image 27 and a virtual button click activity region image are created from the histogram using the results obtained by statistically processing the histogram in the activity rectangular region extraction process. Therefore, a moving part such as a user's hand can be accurately detected, and stable virtual cursor control, click control, and operation target screen control can be performed (effect of claim 5).

In the first embodiment of the present invention, the multi-step rectangular object extraction process is performed on the virtual cursor activity area image 27 and the virtual button click activity area image in the activity rectangular area extraction process. It is possible to prevent malfunction caused by a shadow or the like and perform stable virtual cursor control, click control, and operation target screen control (effect of claim 6).

2. Description of Second Embodiment FIG. 29 is a block diagram showing a second embodiment of the input device according to the present invention.
An input device 1b shown in this figure is composed of a plastic member or the like formed in a box shape, and is an input device housing (illustrated) disposed in the vicinity of a remote operation target device such as a personal computer, a television, an air conditioner, or a large screen screen device. Is attached to the left side of the front surface of the input device housing, and the right-eye video camera main body (color camera main body of claim 2) 30 that captures a user and outputs a color image signal, and the input device housing Right-eye image processing that generates a right-eye side virtual cursor activity region image and a right-eye side virtual button click activity region image by processing a color image captured by the right-eye video camera body 30 A left-eye video camera body that is attached to the substrate 31 and the front right side of the input device housing and that captures a user and outputs a color image signal (color camera body of claim 2) 2 and a color image placed in the input device casing and photographed by the video camera main body 32 for the left eye to process a virtual cursor activity area image on the left eye side and a virtual button click activity area image on the left eye side. The left eye side image processing board 33 to be generated, the right eye side virtual cursor activity area image and the right eye side virtual button click activity area image which are arranged in the input device housing and output from the right eye side image processing board 31 The left eye side virtual cursor activity region image and the left eye side virtual button click activity region image output from the left eye side image processing board 32 are image-processed to generate pointing data corresponding to the movement of the user's hand. Common processing bases that are supplied to devices such as personal computers, televisions, air conditioners, and large screen devices via cables such as USB cables and signal connection cables. And a 34.

And, by analyzing the color image obtained by shooting the user, it responded to the movement of the user's hand while removing the effects of shadows, the effects of the people on the front side of the user and the people behind them Pointing data is generated, and the pointing data is supplied to the remote operation target device through the path of the input device 1b → cable → remote operation target device, and the operation of the remote operation target device is controlled.

The right-eye video camera main body 30 is configured by a color camera having a resolution of about 320 pixels × 240 pixels, and captures a user when a power supply voltage, a clock signal, and the like are output from the right-eye image processing board 31. Then, the color video signal obtained thereby is supplied to the right eye side image processing board 31.

The right eye side image processing board 31 converts the color video signal output from the video camera body for the right eye into a color image in RGB format, and then is specified in advance by the HSV (Hue / Saturation / Brightness) method. Using a color mask necessary for extracting a color image of a color (for example, skin color), the skin color image extracting circuit 35 for extracting the skin color image in the color image and the video camera body 30 for the right eye are output. After the color video signal is converted into a color image in RGB format, a graying processing circuit 36 for converting the color video signal into a monochrome image having a preset gradation, and a monochrome image output from the graying processing circuit 36 are set in advance. The screen is divided by the number of screen divisions (however, when screen division is not set, this screen division processing is skipped) and binarized by the maximum likelihood threshold method. The logical division of the image division / binarization processing circuit 37 to be an image, the binarized image output from the image division / binarization processing circuit 37 and the skin color image output from the skin color image extraction circuit 35 is performed, And a color filtering processing circuit 38 for extracting a skin color portion in the binarized image.

Further, the right eye side image processing board 31 is stored in the frame buffer circuit 39 and the frame buffer circuit 39 for temporarily storing the binary image output from the color filtering processing circuit 38 for several frames to several tens of frames. The inter-frame difference processing circuit 40 that performs the inter-frame difference processing while sequentially reading the binarized images, and generates the difference image, and the difference images output in units of frames from the inter-frame difference processing circuit 40 are divided into the divided areas. The histogram processing circuit 41 that accumulates each time to generate a histogram, and performs statistical processing on the histogram output from the histogram processing circuit 41, and also uses the statistical processing result to determine virtual cursor activity area determination processing, virtual button click activity Performs area determination processing, multi-step rectangular object extraction processing, etc., and influences such as shadows Removing the right eye of the virtual cursor activity area image, and a work rectangular area extraction processing circuit 42 for generating a virtual button click activity area image of the right eye side.

Furthermore, the activity rectangular area extraction processing circuit 42 compares the activity rectangular area extracted from the latest difference image data of the multi-stage difference image data generated by the histogram processing circuit 41 with the virtual button click activity area image. When the extracted activity rectangle area exceeds the range of the virtual button click activity area, the extraction action rectangle area is judged to be invalid so that the movement that the user does not intend for the input operation is ignored as noise. can do.

For the color video signal output from the right-eye video camera body 30, graying processing, screen division / binarization processing, color filtering processing, frame buffer processing, inter-frame difference processing, histogram processing, active rectangular area Extraction processing is sequentially performed to generate a virtual cursor activity region image on the right eye side and a virtual button click activity region image on the right eye side, which are supplied to the common processing board 34.

The left-eye video camera main body 32 is composed of a color camera having a resolution of about 320 pixels × 240 pixels. When the power supply voltage, clock signal, etc. are output from the left-eye image processing board 33, the user And the color video signal obtained thereby is supplied to the left eye side image processing board 33.

The left-eye-side image processing board 33 converts the color video signal output from the left-eye video camera body 32 into a color image in RGB format, and is preset in the HSV (Hue / Saturation / Brightness) method. A skin color image extraction circuit 43 that extracts a skin color image in a color image using a color mask necessary for extracting a color image of a specific color (for example, skin color) and an output from the left-eye video camera main body 32 After converting the color video signal to be converted into a color image in RGB format, a graying processing circuit 44 for converting to a monochrome image having a preset gradation, and a monochrome image output from the graying processing circuit 44 are set in advance. (However, when screen division is not set, this screen division process is skipped) and binarization is performed using the maximum likelihood threshold method. Logic of the image division / binarization processing circuit 45 to be a binarized image, the binarized image output from the image division / binarization processing circuit 45, and the skin color image output from the skin color image extraction circuit 43 And a color filtering processing circuit 46 that extracts the skin color portion in the binarized image.

Further, the left eye side image processing board 33 is stored in the frame buffer circuit 47 and the frame buffer circuit 47 for temporarily storing the binarized image output from the color filtering processing circuit 46 for several frames to several tens of frames. The inter-frame difference processing circuit 48 that performs the inter-frame difference processing while sequentially reading the binarized images, and generates the difference image, and the difference images output in units of frames from the inter-frame difference processing circuit 48 are divided into the divided areas. A histogram processing circuit 49 that accumulates each time and generates a histogram, and performs statistical processing on the histogram output from the histogram processing circuit 49, and also uses the statistical processing result to determine virtual cursor activity area determination processing and virtual button click activity Performs area determination processing, multi-step rectangular object extraction processing, etc., and influences such as shadows It comprises removing the left eye side of the virtual cursor activity area image, and activities rectangular area extraction processing circuit 50 for generating a virtual button click activity area image of the left eye side.

Further, the activity rectangular area extraction processing circuit 50 compares the activity rectangular area extracted from the latest difference image data of the multi-stage difference image data generated by the histogram processing circuit 49 with the virtual button click activity area image. As shown in FIG. 37 described above, when the extracted activity rectangular area exceeds the range of the virtual button click activity area, the user can perform an input operation by performing a process of determining that these extracted activity rectangle areas are invalid. Unintentional movement can be ignored as noise

For the color video signal output from the left-eye video camera main body 32, graying processing, screen division / binarization processing, color filtering processing, frame buffer processing, inter-frame difference processing, histogram processing, active rectangular area Extraction processing is sequentially performed to generate a virtual cursor activity region image on the left eye side and a virtual button click activity region image on the left eye side, and supply them to the common processing board 34.

The common processing board 34 includes right-eye web camera 30 and left-eye web camera 32 attachment position data (horizontal distance “B”, vertical distance, etc.) necessary for position correction by the binocular parallax method, right-eye web camera. 30, a shooting condition setting circuit 51 in which shooting condition information such as the focal length “f” of the left-eye web camera 32 is set, and a right-eye virtual cursor activity region image output from the right-eye image processing board 31 The right eye side virtual button click activity region image, the left eye side virtual cursor activity region image output from the left eye side image processing board 33, and the left eye side virtual button click activity region image include an activity rectangular region. Process for correcting the position of each active rectangular area by the binocular parallax method using the shooting condition information set in the shooting condition setting circuit 51, and adding a number to each active rectangular area in order of size Processing to calculate the distance between the center coordinates (center coordinate distance) of each activity rectangle area to which the same number is added, processing to select the activity rectangle area corresponding to each center coordinate distance that is less than or equal to the predetermined value, and selected Perform a process such as creating a virtual cursor activity area image on the left eye side that contains only the activity rectangle area and not an unselected activity rectangle area, and a virtual button click activity area image on the left eye side. Virtual cursor activity area image, right eye side virtual button click activity area image, left eye side virtual cursor activity area image, left eye side virtual button click activity area image, and a person in front of the user, and behind After removing the influence of the movement of the person, the activity rectangle that holds several to several tens of virtual cursor activity area images on the left eye side and virtual button click activity area images on the left eye side And a selection processing circuit 52.

Furthermore, the activity rectangular area selection processing circuit 52 uses the viewing angles of the

web cameras

30 and 32 as constants to change the color according to the center coordinate distance in the horizontal direction between the center points of the right eye activity rectangular area and the left eye activity rectangle area. By correcting the distance from the camera to the subject, it is possible to more accurately remove the influence of the movement of the person in front of the user and the person behind. The principle of correction and the calculation formula for correction are as shown in FIG. 39 and Formula 4 described above. The active rectangular area selection processing circuit 52 can realize high-speed processing by holding the distance to the subject calculated in advance according to the center coordinate distance as table data.

Further, the common processing board 34, when the active rectangle area group exists in the latest virtual cursor activity area image among the virtual cursor activity area images on the left eye side held in the activity rectangle area selection processing circuit 52, Virtual cursor position instruction, virtual cursor shape instruction, virtual cursor color instruction, operation target screen scroll instruction, operation target screen enlargement instruction, operation target screen reduction instruction based on the number of active rectangle areas, shape, movement presence / absence, movement direction, etc. In addition to generating pointing data such as an operation target screen rotation instruction, and when the virtual cursor is in a clickable state, among the virtual button click activity area images held in the activity rectangular area selection processing circuit 52, Check if the activity rectangle area group exists in the latest virtual button click activity area image, and the activity rectangle When the frequency group is present, a work such as a single click instruction based on the rectangular region group shape, the virtual cursor control processing / display control processing circuit 53 for generating a pointing data such as a double-click instruction.

Then, the right eye side virtual cursor activity region image output from the right eye side image processing substrate 31, the right eye side virtual button click activity region image, and the left eye side virtual output from the left eye side image processing substrate 32. When an active rectangle area is included in the cursor activity area image and the left eye side virtual button click activity area image, the user removes noise caused by the movement of the person in front of the user, the movement of the person behind the user, etc. In accordance with the determination result, a virtual cursor position instruction, virtual cursor shape instruction, virtual cursor color instruction, operation target screen scroll instruction, operation target screen enlargement instruction, operation target screen Pointing data such as a reduction instruction, operation target screen rotation instruction, etc. is generated, and the personal computer, TV, air conditioner, large screen that is the target of remote operation Supplies such as clean equipment.

As described above, in the second embodiment, color filtering processing and graying are performed on a low-resolution color image obtained by photographing the user with the right-eye video camera body 30 and the left-eye video camera body 32. User's hand movements including processing, image segmentation / binarization processing, frame buffer processing, inter-frame difference processing, histogram processing, active rectangular area extraction processing, active rectangular area selection processing, virtual cursor control processing / screen control processing, etc. Generating pointing data such as a virtual cursor position instruction, virtual cursor shape instruction, virtual cursor color instruction, operation target screen scroll instruction, operation target screen enlargement instruction, operation target screen reduction instruction, operation target screen rotation instruction, Since it is supplied to the remote operation target device, the virtual cursor size and virtual Cursor position, the virtual cursor color, click, scroll up and down of the operation target screen, left and right scrolling, enlargement, reduction, rotation, etc., can be remotely operated (Effect of Claim 2).

In the second embodiment, since the inexpensive right-eye video camera main body 30 and left-eye video camera main body 32 that do not have high resolution can be used, the cost of the input device 1b can be kept low. Effect of 2).

In the second embodiment, the low-resolution color image obtained by photographing the user with the right-eye video camera main body 30 and the left-eye video camera main body 32 is subjected to graying processing, image division / 2. Since the binarized image obtained by performing the binarization process and the color filtering process is stored in the

frame buffer circuits

39 and 47, the input device 1b can be used even when the storage capacity of the

frame buffer circuits

39 and 47 is small. The cost of the entire apparatus can be kept low (effect of claim 2).

Furthermore, in the second embodiment, graying processing, image division / 2 are performed on a low-resolution color image obtained by photographing the user with the right-eye video camera body 30 and the left-eye video camera body 32. Image processing with a small number of stages such as value processing, color filtering processing, frame buffer processing, inter-frame difference processing, histogram processing, active rectangular area extraction processing, active rectangular area selection processing, virtual cursor control processing / screen control processing, etc. Since pointing data is generated, skin color

image extraction circuits

35 and 43, graying

processing circuits

36 and 44, image division /

binarization processing circuits

37 and 45, color

filtering processing circuits

38 and 46, a frame

buffer Processing circuits

36, 47, inter-frame

difference processing circuits

40, 48,

histogram processing circuits

41, 49, activity rectangle As the area

extraction processing circuits

42 and 50, the active rectangular area selection circuit 52, and the virtual cursor control processing / screen control processing circuit 53, it is possible to use an element whose processing speed is not so fast and keep the cost of the entire apparatus low. It is possible to detect the movement of the user in real time and control the remote operation target device (effect of claim 2).

In the second embodiment, the right-eye activity rectangular area obtained by photographing the user with the right-eye video camera main body 30 and the left-eye video camera main body 32 and the left-eye activity After correcting the center coordinate position with the binocular parallax method for the rectangular area, the center coordinate position is compared in the order of size, and each active rectangle on the right eye side corresponding to the focus position based on the comparison result Since the area and each activity rectangle area on the left-eye side are selected, other than the user's hand at the focus position of the right-eye video camera main body 30 and the left-eye video camera main body 32, for example, the user Even if there is a person behind and moving, only the movement of the user's hand is extracted without being affected by this, size control of the virtual cursor, position control, color control, click control, enlargement control of the operation target screen , Reduction control, rotation Please, vertical scroll control can be performed such as the left and right scroll control (Effect of Claim 2).

Also in the second embodiment, as in the first embodiment described above, when the user is moving only one hand, it is any one of the virtual cursor control instruction, the click control instruction, and the scroll control instruction. Since pointing data is generated to indicate and indicate virtual cursor size instructions, virtual cursor position instructions, virtual cursor color instructions, scroll control instructions, click instructions, etc., display on the display on the remote operation target device side with only one hand It is possible to remotely control the size, position, color, click operation, scroll of the operation target screen, and the like of the virtual cursor being operated (effect of claim 3).

Also in the second embodiment, as in the first embodiment described above, when the user is moving both hands, the right hand movement and the left hand movement are detected, and control instructions on the operation target screen are displayed. Since it is determined that there is a pointing data that indicates an operation target screen enlargement instruction, an operation target screen reduction instruction, an operation target screen rotation instruction, etc., the user only needs to move the right hand and left hand to the remote operation target device side The operation target screen displayed on the display can be enlarged, reduced, and rotated (effect of claim 4).

Further, in the second embodiment of the present invention, the active rectangular area

extraction processing circuits

42 and 50 statistically process the histogram, and using the statistical processing result, the virtual cursor active area image on the right eye side from the histogram, the right eye The virtual button click activity area image on the side, the virtual cursor activity area image on the left side, and the virtual button click activity area image on the left side are created so that moving parts such as the user's hand can be accurately detected. Thus, stable virtual cursor control, click control, and operation target screen control can be performed (effect of claim 5).

In the second embodiment of the present invention, in the active rectangle area

extraction processing circuits

42 and 50, the right eye side virtual cursor activity area image, the right eye side virtual button click activity area image, and the left virtual cursor activity area Multi-step rectangular object extraction processing is performed on the image and the virtual button click activity area image on the left side to prevent malfunctions caused by the shadow of the user, stable virtual cursor control, click control, operation target Screen control can be performed (effect of claim 6).

3. Description of Other Embodiments In each of the above-described embodiments, the entire area of the color image obtained by each

web camera

4, 6, right-eye video camera body 30, and left-eye video camera body 32 is grayed out and binarized. As shown in the flowchart of FIG. 30 and the schematic diagram of FIG. 31, the histogram is statistically processed, and the change area rectangle (rectangle including the activity rectangle area) 65 obtained by the activity rectangle area extraction process is An enlarged / reduced rectangular mask 66 enlarged / reduced at a specified enlargement / reduction ratio (for example, enlargement ratio “10%”) is created (step S71), and the monochrome image obtained by graying out the entire color image area of the next frame. Only the portion corresponding to the enlarged / reduced rectangular mask 66 (the image 67 of the active region portion included in the monochrome image) may be extracted from the image and binarized. Step S72).

In this way, only the image included in the area slightly wider than the active area from the monochrome image etc. is enabled, the image in the other area is disabled, and the noise existing in the part other than the change area is removed. (Effect of claim 7).

Further, in the first embodiment and the second embodiment described above, the skin color image is extracted by the color filtering processing by the CPU 9 or the color

filtering processing circuits

38 and 46, but the user specifies When controlling the position, click, scrolling of the operation target screen, enlargement of the operation target screen, reduction of the operation target screen, rotation of the operation target screen, etc. using a color operation device such as a red pen Alternatively, a color mask for red extraction may be used, and a red color image may be extracted by the color filtering processing by the CPU 9 or the color

filtering processing circuits

38 and 46.

Thereby, even if there are a plurality of people in the shooting range of each of the

web cameras

4 and 6, the right-eye video camera main body 30, and the left-eye video camera main body 32, a color image corresponding to the color of the operating device possessed by the user is extracted. Thus, size control, position control, click control, scroll control of the operation target screen, enlargement control, reduction control, rotation control, and the like can be performed.

In addition, such color filtering processing moves such as the user's hand included in the color video signals output from the

web cameras

4 and 6, the right-eye video camera body 30, and the left-eye video camera body 32. Since the process is performed to extract the image, the color filtering process is omitted when the lighting condition of the place where the user is located is good and the contrast between the image where the user's hand is moving and the background image is large. You may make it do.

The present invention is used by being connected to an information device such as an information terminal device or a personal computer, takes an operation image of an operator (user) by a camera, and controls cursor operation of the information device, selection and execution of an application program, and the like. An input device, in particular, video input that simplifies the algorithm and reduces the amount of processing data as much as possible to reduce the amount of computation and memory usage, as well as controlling the cursor of the personal computer in real time. It relates to a device and has industrial applicability.

1a, 1b: input device 2: personal computer 3: display unit 4: web camera (color camera for left eye)
5: Video capture 6: Web camera (color camera for right eye)
7: USB interface 8: Hard disk 9: CPU
10: Memory 11: Display interface 12: System bus 13: OS storage area 14: Application storage area 15: Image processing program storage area 16: Image storage area 20: Division area 21: Activity division area 25: Virtual cursor 26: Activity rectangle Area 27: Virtual cursor activity area image 28: Real cursor 30: Video camera body for right eye (color camera body for right eye)
31: Right-eye image processing board 32: Left-eye video camera body (left-eye color camera body)
33: Left eye side image processing board 34: Common processing board 35: Skin color image extraction circuit 36: Graying processing circuit 37: Image division / binarization processing circuit 38: Color filtering processing circuit 39: Frame buffer circuit 40: Between frames Difference processing circuit 41: Histogram processing circuit 42: Activity rectangular area extraction processing circuit 43: Skin color image extraction circuit 44: Graying processing circuit 45: Image division / binarization processing circuit 46: Color filtering processing circuit 47: Frame buffer circuit 48 : Frame difference processing circuit 49: Histogram processing circuit 50: Activity rectangular area extraction processing circuit 51: Shooting condition setting circuit 52: Activity rectangular area selection processing circuit 53: Virtual cursor control processing / screen control processing circuit 65: Change area rectangle 66 : Enlarged / reduced rectangular mask 67: Image after masking

Claims

In an input device that processes an operator image obtained by a video camera and generates an operation instruction according to the operation content of the operator,
A color camera for the right eye to photograph the operator,
A color camera for the left eye that is arranged side by side with the color camera for the right eye at a position away from the color camera for the right eye by a predetermined distance, and photographs the operator,
The color image output from the right-eye color camera is subjected to graying processing, image division / binarization processing, interframe difference processing, histogram processing, and active rectangular area extraction processing, and the right eye side of the operator A right eye side image processing program for extracting an active rectangular area;
The color image output from the left-eye color camera is subjected to graying processing, image division / binarization processing, inter-frame difference processing, histogram processing, and active rectangular area extraction processing, and the left eye side of the operator A left eye side image processing program for extracting an active rectangular area;
Activity rectangle region selection processing using binocular parallax for the right eye side activity rectangular region obtained by the right eye side image processing program and the left eye side activity rectangle region obtained by the left eye side image processing program An image processing program that performs virtual cursor control processing / screen control processing to detect the movement of the operator's hand or fingertip, and generates an operation instruction according to the detection result;
An input device comprising:
In the input device that processes the image of the operator obtained by the video camera, generates an operation instruction according to the operation content of the operator, and controls the operation of the remote operation target device.
An input device housing formed in a box shape;
A color camera body for the right eye that is attached to the left side of the front surface of the input device housing and captures an image of the operator,
A color camera body for the left eye that is attached to the front right side of the input device housing and captures an image of the operator;
Output from the right-eye color camera body by the graying processing circuit, the image segmentation / binarization processing circuit, the inter-frame difference processing circuit, the histogram processing circuit, and the active rectangular area extraction processing circuit which are arranged in the input device casing. A right eye side image processing board that processes the color image to be extracted and extracts the right eye side activity rectangular region of the operator;
Output from the left-eye color camera body by the graying processing circuit, image segmentation / binarization processing circuit, inter-frame difference processing circuit, histogram processing circuit, and active rectangular area extraction processing circuit, which is arranged in the input device casing. A left eye side image processing board that processes the color image to be extracted and extracts the left eye side activity rectangular region of the operator;
The right-eye side active rectangular area, the left-eye-side image, which is arranged in the input device casing and obtained on the right-eye side image processing board by the active rectangular area selection processing circuit and the virtual cursor control processing / screen control processing circuit. An activity rectangular area selection process using a binocular parallax method, a virtual cursor control process / a screen control process are performed on the left-eye activity rectangle area obtained on the processing board, and the movement of the operator's hand or fingertip is performed. A common processing board for detecting, generating pointing data according to the detection result, and controlling the operation of the remote operation target device;
An input device comprising:
The virtual cursor control process / screen control process, or the virtual cursor control process / screen control processing circuit, when there is one active rectangular area group on the virtual cursor active area image, based on the shape and the presence / absence of movement, 3. The input device according to claim 1, wherein an instruction or a screen scroll instruction is generated.
The virtual cursor control processing / screen control processing, or the virtual cursor control processing / screen control processing circuit, when there are two active rectangular area groups on the virtual cursor active area image, based on the moving direction, a screen rotation instruction, 4. The input device according to claim 1, wherein either an instruction for enlarging a screen or an instruction for reducing a screen is generated.
The activity rectangle area extraction process or the activity rectangle area extraction processing circuit creates a virtual cursor activity area image and a virtual button click activity area image from the histogram using a histogram statistical processing result. The input device according to claim 1.
The active rectangular area extraction process or the active rectangular area extraction processing circuit performs a multi-stage rectangular object extraction process on the virtual cursor active area image or the virtual button click active area image to remove a noise component. The input device according to any one of claims 1 to 5, wherein:
An enlargement / reduction rectangle mask creation process or an enlargement / reduction rectangle mask creation processing circuit is added, and the color camera, the color camera main body is obtained by the enlargement / reduction rectangle mask creation process or the enlargement / reduction rectangle mask creation process circuit. Extract the image corresponding to the change area rectangle on the virtual cursor activity area image or the change area rectangle of the virtual button click activity area image from the color image obtained in the above, and cut the other images The input device according to claim 1, wherein noise components are removed.
The activity rectangle area extraction process, or the activity rectangle area extraction processing circuit, the activity rectangle area extracted from the latest difference image data of the multi-stage difference image data generated by the histogram process or the histogram processing circuit, The input device according to claim 5, wherein an invalid extraction activity rectangular region is determined based on a comparison with the virtual button click activity region image.
In the activity rectangular area selection process, from the color camera according to a center coordinate distance in a horizontal direction between the center points of the right eye activity rectangular area and the left eye activity rectangular area with a viewing angle of the color camera as a constant. The input device according to claim 1, wherein the distance to the subject is corrected.
10. The input device according to claim 9, wherein the distance to the subject corrected according to the center coordinate distance is stored in advance as table data.