WO2012019322A1 - Input method, input system and input device of vision directing type mouse using monocular camera calibration technique - Google Patents

Input method, input system and input device of vision directing type mouse using monocular camera calibration technique

Info

Publication number
WO2012019322A1
WO2012019322A1 · PCT/CN2010/001229 · CN2010001229W
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
image sensor
computer
pointing
Prior art date
Application number
PCT/CN2010/001229
Other languages
French (fr)
Chinese (zh)
Inventor
Xu Hong (许洪)
Xu Tao (许涛)
Original Assignee
Xu Hong
Xu Tao
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xu Hong, Xu Tao filed Critical Xu Hong
Priority to PCT/CN2010/001229 priority Critical patent/WO2012019322A1/en
Priority to CN201080068268.9A priority patent/CN103124949B/en
Publication of WO2012019322A1 publication Critical patent/WO2012019322A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304Detection arrangements using opto-electronic means
    • G06F3/0317Detection arrangements using opto-electronic means in co-operation with a patterned surface, e.g. absolute position or relative movement detection for an optical mouse or pen positioned with respect to a coded surface
    • G06F3/0321Detection arrangements using opto-electronic means in co-operation with a patterned surface, e.g. absolute position or relative movement detection for an optical mouse or pen positioned with respect to a coded surface by optically sensing the absolute position with respect to a regularly patterned surface forming a passive digitiser, e.g. pen optically detecting position indicative tags printed on a paper sheet

Definitions

  • The invention relates to computer peripheral input technology and devices: a method and apparatus that uses monocular camera calibration technology to precisely locate the pointing of a camera's field of view and to drive a mouse cursor or other display target. In particular, it realizes a vision-pointing mouse device in which a monocular image sensor serves as the pointing input device.
  • Background Art
  • Early mouse devices used to drive cursor movement in graphical user interfaces were mechanical mice, typically built around a trackball or direction keys: the cursor is moved either by the motion of the trackball relative to the mouse pad or by pressing the direction keys.
  • The optical mouse gradually replaced the mechanical mouse.
  • As it moves over the working plane, an optical mouse continuously acquires images reflected from that plane, processes the image sequence, and extracts the movement direction and displacement to drive the cursor.
  • Both mechanical and optical mice generally need to move on some working plane, which limits their convenience.
  • A touch screen can locate the input position and drive the cursor by touch, providing accurate, direct pointing input for a graphical user interface.
  • However, a touch screen requires the pointer to make direct contact with the screen, which constrains its applications; it is also costly to manufacture, especially for large-area screens.
  • In another approach, an image capturing device such as a digital camera is placed and calibrated in advance, with a field of view covering both the computer display screen and an indicator
  • such as a finger or pointing stick, so that the display coordinates of the mouse cursor can be determined from the relative position of the pointing point on the display screen.
  • This method requires a certain amount of space and often depends on auxiliary devices such as laser pointers, selectively reflective films, and filter polarizers, which make the whole system quite complicated.
  • Another vision-based pointing input technique binds a digital camera to the tip of a finger or pointing stick.
  • The camera captures the local display content under the pointing tip and feeds it to the host computer, which either scans the screen area by area to find where the pointed region lies on the display, or controls cursor movement from the relative displacement of the captured content as the pointing tip moves.
  • Against this background, this patent proposes a simple and practical vision-pointing mouse input method that uses the monocular camera calibration technique from machine vision to accurately extract the coordinates of the cursor pointing point, enabling the manufacture of accurate non-contact vision-pointing mouse devices.
  • The patent proposes a vision-pointing mouse input method using monocular camera calibration, for making a graphical target such as the mouse cursor accurately follow the pointing motion of the image sensor's virtual pointing axis. The method comprises the following steps:
  • point the monocular image sensor at a planar target that has a defined target coordinate system and contains a number of feature target points; activate the image sensor, which is connected to the computer by wire or wirelessly;
  • the image sensor captures an image of the feature content in the pointed target area; the image coordinates of the feature targets are extracted from the image; from these image coordinates and the targets' known coordinates in the target coordinate system, the monocular camera imaging parameters are computed according to monocular camera calibration theory;
  • for a fixed image point on the sensor's imaging surface, the computed imaging parameters are used to calculate the coordinates, in the target coordinate system, of the object point corresponding to that image point; this is the pointing point in the target coordinate system.
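Since the target is planar, the mapping between image coordinates and target-plane coordinates is a 3x3 homography, which is why at least 4 feature targets suffice when the sensor is uncalibrated. The steps above can be sketched as follows (a minimal NumPy sketch under that assumption; the function names are illustrative, not the patent's implementation):

```python
import numpy as np

def estimate_homography(img_pts, tgt_pts):
    """Estimate the 3x3 homography H mapping image points to target-plane
    points via the Direct Linear Transform (needs >= 4 correspondences)."""
    A = []
    for (u, v), (x, y) in zip(img_pts, tgt_pts):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    # The solution is the right singular vector with smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    return Vt[-1].reshape(3, 3)

def pointing_point(H, image_point):
    """Project a fixed image point (e.g. the principal point) onto the
    target plane: the coordinates of the virtual-axis pointing point."""
    u, v = image_point
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w
```

In use, the four (or more) feature targets extracted from one frame give `img_pts`/`tgt_pts`, and `pointing_point` is then evaluated once per frame at the fixed image point to drive the cursor.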
  • The patent also proposes a vision-pointing mouse input method using monocular camera calibration to realize visual input of spatial motion gestures. The method comprises: pointing the monocular image sensor at a planar target that has a defined target coordinate system and contains a number of feature target points; activating the image sensor, which is connected to the computer by wire or wirelessly; and capturing an image of the feature content in the pointed target area, from which the sensor's spatial orientation is solved.
  • The spatial orientation coordinates comprise the three-axis rotation angles (ψ, φ, θ) and the origin coordinates (X, Y, Z).
  • ψ, φ, θ: the three-axis rotation angles
  • X, Y, Z: the origin coordinates
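With pre-calibrated intrinsics K, the spatial orientation (rotation and origin) can be recovered from the target-plane-to-image homography by a standard Zhang-style decomposition. A hedged sketch (names are illustrative; only the rotation about the optical axis is extracted as an example angle):

```python
import numpy as np

def pose_from_homography(H, K):
    """Decompose a target-plane-to-image homography H ~ K*[r1 r2 t]
    into rotation R and origin t, given camera intrinsics K."""
    M = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(M[:, 0])     # columns of R are unit vectors
    r1, r2, t = lam * M[:, 0], lam * M[:, 1], lam * M[:, 2]
    if t[2] < 0:                            # target must lie in front of the camera
        r1, r2, t = -r1, -r2, -t
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    # one of the three-axis angles, e.g. rotation about the optical axis:
    theta = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return R, t, theta
```

The other two rotation angles can be read off R in the same way once an Euler convention is fixed.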
  • The method is also applicable when the target can be set on a dynamic display device, for example a display screen.
  • In that case the computer first determines the approximate area of the display screen at which the image sensor is pointing, and then dynamically generates a target within the sensor's pointed display area.
  • The patent proposes two methods for determining the position on the display screen at which the image sensor is pointing.
  • The first method comprises the following steps. The image sensor is aimed at a target area of the display screen; the device is connected to the host computer by wired or wireless communication, and the display screen is connected to the host computer. Once the image sensor is activated, the host computer is notified to output on the display, for a very short time, a coding picture composed of an arrangement of feature blocks with different colors or graphic contents.
  • Each color or graphic content is encoded as a different number, and the code formed by all feature blocks within a certain range around each feature block is unique across the entire coded picture;
  • the area codes of all the feature blocks constitute a positioning lookup table.
  • For example, a coding pattern composed of rectangular feature blocks of different colors or graphic contents may be output on the display screen, such that the code of the n x n block neighborhood around each rectangular feature block
  • is unique within the coded picture. The image sensor captures the coded image of the pointed area; the code of the local pattern is extracted from it and compared with the spatial position lookup table of the coded picture, determining the approximate position on the display at which the image sensor points.
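The unique-neighborhood property described above can be modeled with a small lookup table. A hedged sketch (integer codes stand in for colors or graphics; the test grid is hand-built so every 2x2 window is unique, which a real system would guarantee by construction, e.g. with De Bruijn-style arrays):

```python
def build_lookup(grid, n=2):
    """Map every n x n window of colour codes to its top-left grid position;
    positioning only works if every window's code is unique."""
    rows, cols = len(grid), len(grid[0])
    table = {}
    for r in range(rows - n + 1):
        for c in range(cols - n + 1):
            code = tuple(grid[r + i][c + j] for i in range(n) for j in range(n))
            assert code not in table, "window codes must be unique across the picture"
            table[code] = (r, c)
    return table

def locate(window, table):
    """Return the display position of the locally observed window."""
    return table[tuple(v for row in window for v in row)]
```

Here `locate` plays the role of comparing the sensor's local pattern code against the positioning lookup table.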
  • The second method proposed by the patent for determining the pointed position comprises the following steps. The image sensor is aimed at a target area of the display screen; the device is connected to the host computer by wired or wireless communication, and the display screen is connected to the host computer. After the image sensor is activated, the host computer first outputs on the display screen a coarse-resolution coding pattern composed of feature blocks with different colors or graphic contents, each color or graphic content being encoded as a different number; the image sensor captures an image of the pointed area to determine which large feature block it points at. The host computer then outputs the same coding pattern again, scaled down to the size of the determined feature block, and the image sensor captures the pointed area again to determine the smaller feature block it points at. Repeating this coarse-to-fine loop quickly converges on the position of the display screen at which the image sensor points.
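The coarse-to-fine loop can be sketched as repeated subdivision; here `hit` stands in for the capture-and-decode step that reports which coded cell the sensor currently sees (an assumption for illustration):

```python
def coarse_to_fine(width, height, hit, blocks=4):
    """Narrow the pointed position by repeatedly redisplaying the same
    blocks x blocks coding pattern inside the cell found last round."""
    x0, y0, w, h = 0, 0, width, height
    while w > 1 or h > 1:
        cw, ch = max(w // blocks, 1), max(h // blocks, 1)
        col, row = hit(x0, y0, cw, ch)      # which cell is pointed at
        x0, y0 = x0 + col * cw, y0 + row * ch
        w, h = cw, ch
    return x0, y0
```

With a 4x4 pattern, a 1024x1024 screen is resolved to a single pixel in five rounds, which is why the patent describes the loop as fast.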
  • In the vision-pointing mouse input method proposed by the patent, the planar target is characterized as follows: the target area has a determined size and contains a number of feature target points; each target point has a specific color, shape, or similar feature that makes it easy to extract from the image; and the coordinates of the target points in the target area's coordinate system are known.
  • The planar target may be a fixed planar target:
  • the frame of the display screen may be selected as the target area,
  • with the feature target points set on the frame and the distances between them known;
  • a plane of determined size around the display screen may be selected as the target area,
  • with the feature target points set in that area
  • and the distances between the points known;
  • or a fixed area on the display screen may be selected by the computer as the target,
  • the size of the area being given by its computer display coordinates, with a number of feature points within the display target area chosen by the computer as feature targets.
  • The planar target may also be a dynamic planar target:
  • the target may be dynamically generated by the computer on the display screen,
  • with the target generation position always following the pointing direction of the image sensor;
  • the range of the target area can be adjusted according to the imaging distance of the image sensor, its size is given by its computer display coordinates, and the computer determines a number of feature points within the display target area to serve as feature targets.
  • In the proposed method, "the computer determines the feature target points in the target area of the display screen" means: the computer processes the display content within a certain range covering the target area, uses features such as color, edges, corners, orientation, and surrounding-context information to select several feature targets from the existing display content, defines the target area range from these feature targets, and records their feature information.
  • Alternatively, determining the feature target points in the display-screen target area may comprise the following steps: the computer gathers statistics on the colors of the display content within a certain range of the target area and selects, as the color of the generated feature targets, a color that does not appear in the content and differs strongly from the existing colors; the computer then draws additional display content in the selected color within the target area.
  • The generated content includes features such as intersections, corners, and center points; from these features several feature targets can be selected, and the target area is defined by those feature targets.
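Choosing a target color "with a large difference from the existing colors" can be sketched as a farthest-point query in RGB space (a minimal NumPy sketch; the candidate set and the Euclidean metric are assumptions, not the patent's specification):

```python
import numpy as np

def pick_target_color(display_pixels):
    """Pick, from the 8 corners of RGB space, the colour whose minimum
    Euclidean distance to every on-screen colour is largest."""
    pixels = np.asarray(display_pixels, float).reshape(-1, 3)
    candidates = np.array([[r, g, b] for r in (0, 255)
                           for g in (0, 255) for b in (0, 255)], float)
    # distance of every candidate to every displayed pixel colour
    d = np.linalg.norm(candidates[:, None, :] - pixels[None, :, :], axis=2)
    best = d.min(axis=1).argmax()           # maximise the worst-case contrast
    return tuple(int(v) for v in candidates[best])
```

On a mostly dark scene this picks white, which is the intuitively high-contrast choice for drawing cross targets.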
  • One variant of the proposed method uses an image sensor whose internal physical parameters, such as focal length and pixel pitch, have not been calibrated in advance; it requires at least 4 feature targets to be determined on the display screen.
  • Another variant uses an image sensor whose internal physical parameters, such as focal length and pixel pitch, have been calibrated in advance; it requires at least 3 feature targets to be determined on the display screen.
  • "Solving the monocular camera imaging parameters" means: using the acquired image coordinates of the targets and the targets' coordinates in the target coordinate system to compute, according to monocular camera calibration theory, either
  • the external monocular camera imaging parameters of the image sensor alone (when the internal parameters have been calibrated in advance),
  • or both the internal and external monocular camera imaging parameters of the image sensor.
  • "The fixed image point on the imaging surface of the image sensor" means any image point on the imaging surface: the line connecting the image point and the optical center of the imaging lens
  • constitutes the virtual pointing axis, and the object point corresponding to the image point is the pointing point of that axis.
  • In particular, the fixed image point can be the central image point of the imaging surface; the line through it and the optical center of the imaging lens,
  • that is, the optical axis of the imaging system, constitutes the virtual pointing axis, and the object point corresponding to the central image point is its pointing point.
  • Computing the display coordinates of the mouse cursor from the coordinates of the pointing point in the target coordinate system means the following. When the unit length of the target coordinate system equals the pixel pitch of the computer display screen,
  • the already-computed coordinates of the pointing point in the target coordinate system are themselves the display coordinates.
  • When the unit length of the target coordinate system differs from the pixel pitch of the display screen, the computed coordinates of the pointing point
  • in the target coordinate system must be multiplied by a scaling factor to obtain the display coordinates; the factor is the unit length of the target coordinate system divided by the pixel pitch of the display screen.
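As a worked example of the scaling (hypothetical numbers): a target coordinate system with a 1 mm unit on a display with 0.25 mm pixel pitch gives a factor of 1/0.25 = 4, so target coordinates (10, 5) map to display pixels (40, 20):

```python
def target_to_display(x_t, y_t, target_unit_mm, pixel_pitch_mm):
    """Scale pointing-point coordinates from target units to display pixels.
    Factor = target-unit length / display pixel pitch (same physical unit)."""
    s = target_unit_mm / pixel_pitch_mm
    return x_t * s, y_t * s
```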
  • The patent proposes a vision-pointing mouse input system using monocular camera calibration, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; and a monocular image sensor connected to the host computer through a processing circuit, serving as a pointing device aimed at the target area of the display screen
  • to capture images, with the line connecting a fixed point on the sensor's imaging surface and the lens center serving as the virtual pointing axis. A control function component carries control buttons that generate system-trigger, left-button, right-button, page-turning, movement, and similar signals
  • and is connected to the host computer through the processing circuit; the system processing circuit is connected to the image sensor and the control function component and to the host computer by wire or wirelessly; a vision-pointing mouse information receiving and processing device is installed in the computer host and communicates with the computer operating system and the processing circuit.
  • The functions of the system's processing circuit include: processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point coordinates; generating control signals such as system trigger, left button, right button, page turning, and movement; and communicating with the host computer by wire or wirelessly, transferring images, feature information, computation results, control signals, and other information.
  • The functions of the system's information receiving and processing device include: receiving the images, computation results, and other information sent by the processing circuit; processing the captured images to carry out pointing localization of the image sensor, feature point extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; receiving the control signals generated by the processing circuit, such as system trigger, left button, right button, page turning, and movement; sending the targets' feature information and coordinate information to the processing circuit; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a second vision-pointing mouse input system using monocular camera calibration, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; and a monocular image sensor connected to the host computer through an information receiving and processing device, serving as a pointing device aimed at the target
  • area of the display screen to capture images, with the line connecting a fixed point on the sensor's imaging surface and the lens center serving as the virtual pointing axis.
  • A control function component carries control buttons that generate system-trigger, left-button, right-button, page-turning,
  • movement, and similar signals, and is connected to the host computer through the information receiving and processing device.
  • The vision-pointing mouse information receiving and processing device is installed in the computer host and communicates with the computer operating system, the image sensor, and the control function component. Its functions include: receiving the images sent by the monocular image sensor; receiving the control signals generated by the control function component, such as system trigger, left button, right button, page turning, and movement; processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a glove-type vision-pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; a pointing finger sleeve carrying a monocular image sensor, used to capture images of the target area of the display screen and connected to the host computer through a processing circuit, the line connecting a fixed point on the sensor's imaging surface and the lens center serving as the virtual pointing axis; a control-key finger sleeve comprising a number of buttons, touch keys, or pressure switches operated by the thumb, which generate system-trigger, left-button, right-button, page-turning, and movement control signals and are connected to the host computer through the processing circuit; and an auxiliary-key finger sleeve, which may include function keys such as page turning, triggered by bending the finger itself and included according to the use case.
  • The functions of the device's processing circuit include: processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point coordinates;
  • generating control signals such as system trigger, left button, right button, page turning, and movement; and communicating with the host computer by wire or wirelessly, transferring images, feature information, computation results, control signals, and other information.
  • The functions of the device's information receiving and processing device include: receiving the images, computation results, and other information sent by the processing circuit; processing the captured images to carry out pointing localization of the image sensor, feature point extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; receiving the control signals generated by the processing circuit, such as system trigger, left button, right button, page turning, and movement; sending the targets' feature information and coordinate information to the processing circuit; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a second glove-type vision-pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; a pointing finger sleeve carrying a monocular image sensor, used to capture images of the target area of the display and connected to the host computer, the line connecting a fixed point on the sensor's imaging surface and the lens center serving as the virtual pointing axis; and a control-key finger sleeve comprising a number of buttons, touch keys, or pressure switches operated by the thumb, which generate control signals such as system trigger, left button, right button, page turning, and movement.
  • An auxiliary-key finger sleeve may include function keys such as page turning, triggered by bending the finger itself, and is connected to the computer host; it is included according to the use case.
  • The vision-pointing mouse information receiving and processing device is installed in the host computer
  • and communicates with the computer operating system, the image sensor, and the control function component. Its functions include: receiving the image information sent by the image sensor; receiving the control signals generated by the control function component, such as system trigger, left button, right button, page turning, and movement; processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a finger-type vision-pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; and a vision-pointing mouse input finger sleeve that integrates the monocular image sensor, the processing circuit, and the control function keys on one finger sleeve worn on the index finger or another finger.
  • Function control is performed with the thumb; the image sensor at the front end of the finger sleeve is aimed at the target area of the display screen to capture images and is connected to the computer host through the processing circuit by wire or wirelessly, the line connecting a fixed
  • point on the sensor's imaging surface and the lens center serving as the virtual pointing axis.
  • The control function keys of the finger sleeve comprise a number of buttons, touch keys, or pressure switches operated by the thumb, used to generate system-trigger, left-button, right-button, page-turning, movement, and similar
  • control signals, and connected to the host computer through the processing circuit; an information receiving and processing device installed in the host computer communicates with the computer operating system and with the processing circuit of the mouse input device.
  • The processing circuit in the mouse input finger sleeve is connected to the image sensor and the control function keys, and to the host computer by wire or wirelessly. Its functions include: processing the captured images to carry out pointing localization of the image sensor, feature target point extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; generating control signals such as system trigger, left button, right button, page turning, and movement; and communicating with the host computer by wire or wirelessly, transferring images, feature information, computation results, control signals, and other information.
  • The functions of the device's information receiving and processing device include: receiving the images, computation results, and other information sent by the device's processing circuit; receiving the control signals generated by the processing circuit, such as system trigger, left button, right button, page turning, and movement; processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point display coordinates; sending the targets' feature information and coordinate information to the processing circuit; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a second finger-type vision-pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a number of feature target points, the distances between the points being known and each point having features that make it easy to extract from the image; and a vision-pointing mouse input finger sleeve integrating the monocular image sensor and the control function keys on one finger sleeve worn on the index finger or another finger, with function control performed by the thumb. In use, the image sensor at the front end of the finger sleeve is aimed at the target area of the display screen to capture images and is connected to the host computer by wire or wirelessly, the line connecting a fixed point on the sensor's imaging surface and the lens center serving as
  • the virtual pointing axis. The control function keys of the finger sleeve comprise a number of buttons, touch keys, or pressure switches operated by the thumb, used to generate control signals such as system trigger, left button, right button, page turning, and movement,
  • and are connected to the host computer. An information receiving and processing device installed in the host computer communicates with the computer operating system, the image sensor, and the control function keys; its functions include: receiving the image information sent by the image sensor; receiving the control signals generated by the control function keys, such as system trigger, left button, right button, page turning, and movement; processing the captured images to carry out pointing localization of the image sensor, feature target extraction, computation of the monocular camera imaging model parameters, and computation of the cursor pointing-point coordinates; and outputting the computed cursor coordinate information to the computer operating system.
  • The patent proposes a vision-pointing mouse application program that resides in the computer host and communicates with the computer operating system and the vision-pointing mouse input system. It includes the following functions: a control function program, which receives the control signals generated by the control function component of the input system, such as system trigger, left button, right button, page turning, and movement; an image receiving and processing program, which receives the image information sent by the input system's image sensor; an image sensor positioning program, which determines the position on the display screen at which the image sensor points: after receiving the work-trigger signal of the input system, it notifies the computer to output, for a very short time, a positioning code pattern on the display screen, processes the image of the pointed area captured by the image sensor, extracts the local pattern code, compares it with the spatial position lookup table of the positioning code pattern, and determines the approximate position at which the image sensor points; and a feature target generation program, which determines the feature targets in the pointed display area.
  • The image coordinates of the fixed point, that is, of the intersection of the virtual pointing axis with the imaging surface, are used together with the computed monocular camera imaging parameters to calculate the display coordinates of the corresponding pointing point on the computer display screen.
  • A cursor display program makes the computer display the mouse cursor or another image target at the pointing point of the virtual pointing axis on the display screen, re-determines the display pointing area centered on the displayed cursor, and re-determines the feature targets in that pointing area.
  • FIG. 1 is a schematic view of a first embodiment of a directional pointing mouse using a monocular camera calibration technique
  • FIG. 2A is a schematic view of a pinhole model of a monocular camera imaging
  • FIG. 2B is a schematic diagram of calibrating the plane of the display screen by using a monocular camera calibration technique
  • Figure 3A is a schematic diagram of coded positioning composed of different colors
  • FIG. 3B is a schematic diagram showing pointing positioning according to local area coding of the display screen
  • Figure 3C is a schematic diagram of coarse resolution coding positioning composed of different colors
  • Figure 3D is a schematic view showing the pointing position of the image sensor for the first time
  • Figure 3E is a schematic view showing the pointing position of the second positioning image sensor
• FIG. 4 is a schematic diagram of selecting feature targets in the display content using features such as color, edges, and corners
  • FIG. 5A is a schematic diagram showing a color distribution of a display point and a color selection region of a target point in a target point setting area in a color space
  • Figure 5B is a schematic diagram showing the generation of four cross targets in a square arrangement
  • Figure 6 is a workflow diagram of a first embodiment of a directional pointing mouse utilizing monocular camera calibration technology
  • FIG. 7 is a basic system block diagram of a visual pointing type mouse input system using monocular camera calibration technology
• FIG. 8A to FIG. 8C are schematic diagrams showing how a glove-type visual pointing input device is worn and used
  • Figure 8E is a schematic diagram of the main control function
  • Figure 8F is a schematic diagram of the auxiliary control function
  • Figure 9A is a schematic view of a finger-type directional pointing input device that integrates all components into one body
  • Figure 9B is a schematic view of a pen-type visual pointing input device
  • Figure 10 is a flow chart showing a second embodiment of a directional pointing mouse utilizing monocular camera calibration technology
  • FIGS. 11A to 11C are diagrams showing a third embodiment of a directional pointing type mouse using a monocular camera calibration technique
  • Figure 12 is a workflow diagram of a third embodiment of a directional pointing mouse utilizing monocular camera calibration techniques.
  • Figure 13 is a schematic illustration of a fourth embodiment of a directional pointing mouse utilizing monocular camera calibration techniques.
• Figure 14 is a workflow diagram showing a fourth embodiment of a directional pointing mouse utilizing monocular camera calibration techniques.

Detailed Description
• the first embodiment relates to a visual pointing input method for controlling a graphical target, such as a mouse cursor, so that it is accurately displayed at the intersection of the image sensor's virtual pointing axis with the computer display screen.
• a dynamic display device such as a display screen is used as the setting area of the targets. The computer first locates the position on the display screen at which the image sensor points, then sets feature targets on the display screen to define the target area; the target area can move along with the pointing of the image sensor, always keeping the target area within the imaged area of the image sensor.
• FIG. 1 is a schematic view of the first embodiment, in which the visual pointing input device 10 is worn on the hand in the form of a glove; its core working component is a small monocular image sensor 100 worn on the index finger, which collects an image of the pointed-to target area.
• the target area is set by the computer on the display screen, and the image sensor can automatically focus according to the imaging distance to obtain a clear image (when the imaging distance is much larger than the focal length of the image sensor, a fixed-focus image sensor can always obtain a sufficiently clear image);
• the input device also includes a control component 102 worn on the middle finger or another finger, on which control keys operated by the thumb are arranged, used for system triggering, the left button, the right button, page scrolling and the like; and a processing circuit 104, shown placed on the back of the hand in FIG. 1, which is coupled to the image sensor 100 and the control component 102 and is connected to the host computer by wire or wirelessly.
• the functions of the processing circuit 104 mainly include: processing the acquired images, completing the pointing positioning of the image sensor, extracting the feature targets, calculating the parameters of the imaging model, and calculating the coordinates of the cursor pointing point; generating control signals such as system trigger, left button, right button and page scroll; and communicating with the host computer by wired or wireless means, transmitting information such as images, feature information, computation results and control signals.
  • the specific structure of the visual pointing input device 10 will be further explained later.
• when the image sensor 100 worn on the index finger images a certain area of the display screen 18, the computer first uses a positioning technique (explained later) to determine the approximate position on the display screen corresponding to the image sensor. The computer then either selects several feature targets in that region according to the color, edge and corner features of the displayed content, or generates several feature targets 12 in the region; these targets constitute the target area, and the number of targets required depends on the specific calibration technique used. At the same time, the computer transmits the feature information of the targets to the processing circuit 104 by wired or wireless communication, so that the input device can use these features to extract the targets from the acquired images.
• the imaging parameters of the imaging model are then solved by the monocular camera calibration technique, and these parameters are used to calculate the coordinates of the intersection 16 of the virtual indication axis 14 with the display screen, relative to the targets.
• from these, the actual display coordinates of the intersection 16 on the display screen are calculated, so that the mouse cursor or another graphic target can be displayed there accurately.
• by repeating this process at short intervals, the cursor accurately follows the direction of the image sensor as it moves. For example, for an image sensor with a frame rate of 15 frames per second, the time interval between two images is about 67 ms.
• FIG. 1 shows the computer working with a single pointing input device; the scheme equally supports several similar pointing input devices used simultaneously.
• FIG. 2A is the pinhole model of monocular camera imaging, showing schematically how an arbitrary object point in space is imaged.
  • Object point 20a is projected through image center 22 on image plane 24 to become image point 20b.
• the imaging device may be a CCD device, a CMOS device, or another digital imaging device.
• the world coordinate system Ow-XwYwZw in which the object point lies, the camera coordinate system Oc-XcYcZc, and the image coordinate system U-O-V in which the image plane lies are denoted 200, 202 and 204 respectively.
• the entire imaging model relationship — that is, the relationship between the coordinates (Xw, Yw, Zw) of an object point in the world coordinate system and the image coordinates (u, v) of its image point on the image plane — is determined by six external physical parameters relating the world and camera coordinate systems, together with internal physical parameters such as the camera focal length f and the horizontal pixel interval dx and vertical pixel interval dy of the imaging device; these physical parameters determine the parameter matrices of the model, which in the standard pinhole form can be written s·[u, v, 1]^T = M2·M1·[Xw, Yw, Zw, 1]^T, where:
• M1 is the external parameter matrix, and M2 is the internal parameter matrix determined by the camera's internal physical parameters: the focal length f, the lateral pixel spacing dx, the longitudinal pixel spacing dy, and (u0, v0), the coordinates in the image coordinate system of the intersection of the camera optical axis with the imaging plane.
• if the world coordinates and the image coordinates of six spatial points are known, all parameters of the matrices M1 and M2 can be solved; thereafter the image coordinates (u, v) of any object point can be calculated from its world coordinates (Xw, Yw, Zw), and conversely the world coordinates can be computed back from arbitrary image coordinates (u, v).
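The forward direction of the model above can be sketched numerically. The following Python fragment is an illustrative sketch, not part of the patent: it projects a world point into pixel coordinates using the external parameters (here a single rotation about Z plus a translation, for brevity) and the internal parameters f, dx, dy, (u0, v0); all numeric values are assumed.

```python
import math

def rotation_z(theta):
    """Rotation matrix about the Z axis (one of the three extrinsic angles)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(point_w, R, t, f, dx, dy, u0, v0):
    """Pinhole projection: world point -> pixel coordinates (u, v).

    R, t carry the external parameters (rotation + translation);
    f, dx, dy, (u0, v0) are the internal ones, as in the patent's model.
    """
    # world -> camera coordinates: Xc = R * Xw + t
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    # perspective division and conversion to pixel units
    u = f * xc[0] / (dx * xc[2]) + u0
    v = f * xc[1] / (dy * xc[2]) + v0
    return u, v

# identity pose: a point on the optical axis maps to the principal point (u0, v0)
u, v = project([0.0, 0.0, 1.0], rotation_z(0.0), [0.0, 0.0, 0.0],
               f=0.004, dx=2e-6, dy=2e-6, u0=320.0, v0=240.0)
```

With a general rotation (a product of three axis rotations) and six point correspondences, the same relations give the equation system the text describes.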
• this patent applies the model to visual pointing input on a computer display screen, where all target points lie on the plane of the screen. The display plane is therefore taken as the XwOwYw plane of the world coordinate system, with the pixel interval of the display screen as the unit length of that plane, so that the XwOwYw coordinates agree with the computer display coordinate units; the direction perpendicular to the plane is the Zw axis, as shown in FIG. 2B, in which 18 is the display screen lying in the XwOwYw plane.
• according to camera calibration theory, even when the internal physical parameters of the image sensor are unknown, the set of imaging equations determined by the object-space display coordinates and image-space image coordinates of only four targets on the display plane is sufficient to solve for an external parameter matrix that satisfies the requirements.
• from this, the display coordinates of the intersection with the computer display screen of the virtual pointing axis — the line through a fixed point on the imaging surface and the lens center — can be obtained, thereby accurately positioning the mouse cursor or other display targets.
• for convenience, the optical axis of the image sensor can be used as the indication axis.
• 12a is a feature target set on the display screen, and 12b is its image point on the imaging plane.
• once the required imaging parameters have been calibrated, the coordinates of the corresponding intersection 16a on the display screen, relative to the target points, can be calculated from the image coordinates of the intersection 16b of the optical axis with the imaging surface, and finally accurate computer display coordinates are obtained.
• the method described above can be used directly when the internal physical parameters of the image sensor are unknown. Furthermore, if internal physical parameters such as the focal length and pixel spacing of the image sensor have been calibrated in advance, then according to camera calibration theory the imaging equations determined by the object-space display coordinates and image-space image coordinates of only three targets suffice to solve for the external parameter matrix.
• from these external parameters, the external physical quantities — the three-axis rotation angles and the origin coordinates of the world coordinate system in the camera coordinate system — can be further obtained, thereby also determining the spatial attitude of the image sensor relative to the display screen; this variant is used in the fourth embodiment described later.
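Because all the targets lie on the display plane, the map between the image plane and the display plane is a plane-to-plane homography; the sketch below uses that equivalent formulation rather than solving the patent's full M1/M2 parameter matrices. It estimates the homography from four target correspondences by the minimal direct linear method and then maps a fixed image point (here an assumed principal point) to display coordinates; all point values are hypothetical.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            k = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= k * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def homography(src, dst):
    """Plane-to-plane homography from exactly 4 point pairs (minimal DLT)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_h(H, p):
    """Apply a homography to a 2D point (homogeneous division included)."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# four extracted target image points (hypothetical) and their known display coords
img = [(100, 100), (500, 120), (480, 400), (120, 380)]
scr = [(200, 150), (1000, 150), (1000, 650), (200, 650)]
H = homography(img, scr)         # image plane -> display plane
cursor = apply_h(H, (320, 240))  # fixed image point -> cursor display position
```

This illustrates why four coplanar targets suffice when the internal parameters are unknown: a homography has eight degrees of freedom, and each correspondence contributes two equations.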
• an image sensor positioning technique was developed to determine the approximate location on the display screen at which the image sensor points.
• the purpose of this positioning technique is that, when the visual pointing input device works with the camera close to the display screen, only a local area can be imaged; the local area of the display screen corresponding to the camera must be determined first, and only then can the computer be notified to generate feature targets in that area.
• this patent proposes two methods for coarsely determining the location of the image sensor, described separately below.
• FIG. 3A and FIG. 3B illustrate the first positioning method. When the input device 10 is activated, it sends a synchronization signal to the computer, and the computer outputs a specially designed color coding pattern to the display screen 18.
• the color squares are arranged in a square grid, and the color coding of the 3x3 square area around each square is unique.
• the coded figure 30 shown in FIG. 3A is composed of the four colors 300, 302, 304 and 306 arranged in a grid; each color is assigned a different code, and since the color coding of the 3x3 area around each square in 30 is unique, the entire coding pattern corresponds to a code lookup table.
• the approximate position on the display screen at which the camera points can therefore be determined from the color code of the area extracted from the captured image. As shown in FIG. 3B, the image sensor 100 extracts the code 022302010 from the acquired local-area image 32, the approximate position corresponding to that code is found in the lookup table, and the feature targets can then be generated in that area.
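The lookup-table idea can be sketched as follows: each cell of the color grid carries one of four codes, every interior 3x3 neighborhood is unique, and a captured code indexes directly into a table of grid positions. The grid below is a small illustrative example, not the patent's actual pattern.

```python
# Codes 0-3 stand for the four colors 300/302/304/306; this 5x5 grid is
# illustrative only, chosen so that every interior 3x3 code is unique.
grid = [
    [0, 2, 1, 3, 0],
    [2, 3, 0, 1, 2],
    [0, 1, 2, 3, 0],
    [3, 0, 1, 2, 3],
    [1, 2, 3, 0, 1],
]

def neighborhood_code(g, r, c):
    """Concatenate the 3x3 block centred at (r, c) into a code string."""
    return ''.join(str(g[r + dr][c + dc]) for dr in (-1, 0, 1) for dc in (-1, 0, 1))

def build_lookup(g):
    """Map every interior 3x3 code to its grid position; codes must be unique."""
    table = {}
    for r in range(1, len(g) - 1):
        for c in range(1, len(g[0]) - 1):
            code = neighborhood_code(g, r, c)
            assert code not in table, "codes must be unique over the pattern"
            table[code] = (r, c)
    return table

table = build_lookup(grid)
# a code extracted from the captured local image yields the approximate position
pos = table[neighborhood_code(grid, 2, 2)]
```

A full-screen pattern would use a larger grid constructed with the same uniqueness property, and the extracted code (such as 022302010 in the text) would be looked up the same way.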
• the coding pattern can take many forms: in addition to color coding, the squares can be encoded by their geometric content; other geometric coded pictures, such as ring-shaped coded pictures, can of course also be designed.
• the color coding pattern can be blended translucently over the original display content, or it can be displayed fully for a very short time and, after being captured by the image sensor, replaced again by the original content.
• another method for determining the position of the image sensor is shown in FIG. 3C, FIG. 3D and FIG. 3E. Unlike the previous method, which outputs the coding pattern once, this method outputs a sequence of coding patterns in a very short time, so that the image sensor position is determined coarsely and then progressively refined.
• 34 shown in FIG. 3C is a coarse-resolution coding pattern composed of the four colors 300, 302, 304 and 306.
• first, the computer outputs the coarse-resolution coding pattern 34 over the full screen to roughly determine the area of the image sensor, as shown in FIG. 3D; then the same coding pattern 34 is output within the determined local area to further refine the position of the image sensor, as shown in FIG. 3E; by repeating this cycle several times, the location of the image sensor can be determined accurately.
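The coarse-to-fine sequence can be sketched as a loop in which each displayed pattern shrinks the candidate region by the resolution of the pattern. This illustrative sketch uses a 2x2 quadrant pattern per round (the patent's pattern 34 uses four colors at coarse resolution); the observed cell indices are assumed.

```python
def refine(region, observed_cells, rounds):
    """Coarse-to-fine positioning: each round, the same coarse pattern is shown
    inside the previously determined region, and the observed quadrant index
    (0-3: left/right x top/bottom) shrinks the region by a factor of 4."""
    x, y, w, h = region
    for cell in observed_cells[:rounds]:
        w, h = w / 2, h / 2
        x += (cell % 2) * w   # 1, 3 -> right half
        y += (cell // 2) * h  # 2, 3 -> bottom half
    return (x, y, w, h)

# three rounds on a 1920x1080 screen (quadrant observations are illustrative)
area = refine((0, 0, 1920, 1080), [3, 0, 2], 3)
```

After n rounds the region area shrinks by 4^n, so a handful of brief pattern displays suffices to localize the sensor precisely.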
• after the above image sensor positioning method has determined the region being imaged, the computer needs to set a number of targets in that region for image sensor calibration. Two methods of setting the feature targets are proposed.
• the first method is shown in FIG. 4: the computer processes the display content in the determined image acquisition area, selects several feature targets from it using features such as color, edges, corners, orientation and surrounding-context information, and records their feature information.
• when the display content in the area lacks suitable features, the second feature target setting method can be used: the computer dynamically generates several feature targets in the region, as illustrated in FIG. 5A and FIG. 5B.
• first, the color of the dynamically generated targets is chosen as a color not present in the image area: the display points of the area are plotted by their color coordinates in the RGB color space, where region 50 in the figure holds the existing image colors; a color from a blank region far from the existing color points, such as region 52, is used as the target color, so that the feature targets can easily be extracted from the images collected by a color image sensor.
• secondly, the shape of the targets can be an upright cross, an oblique cross, or any other easily identified form; FIG. 5B shows four cross targets 12 generated in a square arrangement, and target patterns of any shape can of course be generated as needed.
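Choosing the target color from a blank region of the color space, as FIG. 5A describes, can be sketched as a coarse grid search that maximizes the minimum distance to the colors already present in the area. This is an illustrative sketch; the sample colors and step size are assumed.

```python
import itertools

def farthest_color(region_colors, step=64):
    """Pick an RGB color far from every color present in the pointed-to region,
    so that generated targets are easy to segment (coarse grid search)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best, best_d = None, -1
    for cand in itertools.product(range(0, 256, step), repeat=3):
        d = min(dist2(cand, c) for c in region_colors)  # distance to nearest
        if d > best_d:
            best, best_d = cand, d
    return best

# mostly dark/red content -> a bright green-blue target color is expected
target_rgb = farthest_color([(10, 10, 10), (200, 30, 30), (50, 0, 0)])
```

In practice a perceptual color space or a finer search could be used; the point is simply that the target color lands in the empty region 52 of FIG. 5A.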
• the first embodiment of the visual pointing mouse using the monocular camera calibration technique comprises the following steps: point the monocular image sensor at any position on the display screen and start the image sensor, the image sensor being connected to the computer by wire or wirelessly;
• the computer determines the position on the display screen at which the image sensor points:
• the computer displays, for a very short time, the positioning code pattern used to determine the pointing position of the image sensor; the image sensor collects an image of the pointed-to area, the embedded pattern code is extracted from this local-area image and compared with the spatial position lookup table of the positioning pattern, and the approximate position at which the image sensor points is determined. Feature targets are then determined in the located display pointing area: the computer either selects, according to the display content, several feature targets distributed in the image acquisition area of the image sensor, or generates several feature targets distributed in that area;
• the image sensor collects an image of the pointing area containing the feature targets, and the image coordinates of the targets are extracted from the image according to the target features; using the acquired image coordinates of the targets and their computer display coordinates, the imaging parameters of the monocular camera are solved according to the monocular camera calibration technique;
• the line through a fixed point on the imaging surface of the image sensor and the center of the lens serves as the virtual indication axis; from the image coordinates of that fixed point — that is, the image coordinates of the intersection of the virtual indication axis with the imaging surface — the corresponding display coordinates of the intersection on the computer display screen are calculated using the solved monocular camera imaging parameters; the computer displays the mouse cursor or another image target at the pointing point of the virtual indication axis on the display screen, re-determines the display pointing area centered on the displayed cursor, and re-determines the feature targets in that area; at very short time intervals, as the pointing of the image sensor moves, the above steps are repeated so that the mouse cursor or other image target follows the pointing point of the virtual indication axis as it moves across the display.
• the workflow of the first embodiment is designed as shown in FIG. 6 and includes the following steps: after the pointing input device starts, in step 600 the computer central processor is notified to send a positioning code pattern to the screen;
• in step 602 the image sensor of the input device collects an image and the image processing circuit extracts the image code from it; in step 604 it is judged whether the image code matches the code-map lookup table — if the approximate position corresponding to the image sensor is found, processing proceeds to the next step, otherwise steps 600 and 602 are repeated;
• in step 606a the computer is notified to find or generate feature targets in the area; in step 608 an image is acquired by the image sensor and the coordinates of each target are extracted from it according to the target features; in step 610 it is judged whether the coordinates of each target were correctly extracted, otherwise step 608 is repeated; in step 612 the external parameters of the imaging model are solved using the display coordinates of the targets and the extracted image coordinates; in step 614a, using the solved external parameters, the display coordinates of the intersection of the optical axis with the display screen are calculated from the coordinates of the central image point.
• FIG. 7 is the basic system block diagram of the visual pointing mouse input system using monocular camera calibration technology proposed in this patent; 10, 30 and 18 are the three main devices of the system — the visual pointing input device, the host computer and the display screen — shown as functional blocks without detailed internals.
• the complete visual pointing input device comprises the external input device 10 operated by the operator and the information receiving and processing device 702 installed in the host computer.
• the external visual pointing input device 10 includes three main functional components — the image sensor 100, the control component 102 and the processing circuit 104 — and the processing circuit 104 in turn contains a communication module 704, an image processing module 706 and a control module 708;
• the information receiving and processing device 702 is responsible for the communication between the input device 10 and the computer operating system 700 and is embedded in the host computer in software or hardware form. As shown in FIG. 7, after the input device 10 starts working, its communication module 704 activates the information receiving and processing device 702, which interacts with the host operating system program 700, informing it where to display image content and where to generate feature targets, while passing the targets' feature information and coordinate information back to the input device.
• the image processing module 706 of the input device extracts the targets from the images acquired by the image sensor 100, performs the calculations, and transmits the resolved coordinates of the pointing point through the communication module 704 to the information receiving and processing device 702, which notifies the computer operating system 700 to display an image target such as the mouse cursor at the pointing point.
• the control signals generated by the control component 102 and its drive circuit 708 are likewise passed by the communication module 704 to the information receiving and processing device 702, which instructs the computer operating system to carry out the corresponding control operations.
• computational functions such as image information processing may be arranged flexibly: they can reside in the processing circuit 104, or in the information receiving and processing device 702 installed in the computer.
• the glove-type visual pointing input device mainly includes the image sensor pointing component 100, the main control component 102, the processing circuit 104 and the auxiliary control component 106. These parts are fixed on a glove made of a flexible material such as nylon or plastic; the finger joints of the glove can be left open, and the parts are placed at positions on the fingers where they do not hinder finger bending, so that typing and other hand operations remain convenient.
• the image sensor pointing component 100 is used to point at a target such as the display screen.
• the main control component 102 has function control keys on its side, convenient for mouse activation, the left key, the right key, page turning and so on; its working principle is shown in FIG. 8B.
• the function control keys can take various forms: the push button 82a, the touch button 82b shown in FIG. 8E, or a combination of the two — for example, touching the function key with the thumb activates the input device, and pressing it down issues the left-button function.
• the auxiliary control component 106 has a squeeze switch or touch switch 84; as shown in FIG. 8F, it is triggered by bending the little finger.
• its working principle is shown in FIG. 8C; the right-key and page-turning functions can be placed either on the main control component or on the auxiliary control component.
• the processing circuit 104 can be arranged flexibly on the back of the hand or elsewhere; it contains an information processing module that performs image processing and data decoding with a digital processing chip such as a DSP or an FPGA, a wired or wireless communication module for communicating with the computer, data memory, and so on. Alternatively, computational functions such as image information processing may be placed in the information receiving and processing device installed in the computer.
• the appearance of the above components can be designed flexibly, components can be added or removed as needed, and they can be operated in various possible ways.
• the above describes a glove-type visual pointing input device.
• the processing circuit can also be integrated into the other components.
• with suitable design, all the components can be integrated into a single finger cot worn on the index finger, as shown in FIG. 9A; depending on the platform used, the device can also be designed in various shapes such as a pen shape or a gun shape.
  • Figure 9B shows a pen-type visual input device in which all components are assembled in a pen-type housing.
  • Figure 10 depicts a second embodiment of a directional pointing mouse that utilizes monocular camera calibration techniques.
• the second embodiment is similar to the first and likewise relates to a visual pointing input method for controlling a graphical target such as a mouse cursor so that it is accurately displayed at the intersection of the image sensor's virtual indication axis with the computer display screen.
• a dynamic display device such as a display screen is still used as the target setting area, but the second method does not first locate the pointing position of the image sensor; instead the computer directly sets a feature target pattern in a certain area of the display screen to define the target area.
• the operator actively points an image sensor such as a camera at that area, acquires the target image, and the mouse cursor is positioned accordingly.
• the target area still moves following the pointing of the image sensor, keeping the target area within the imaged area of the image sensor.
• compared with the first embodiment, the second embodiment merely omits the image sensor positioning step. It therefore includes the following steps: start the image sensor, connecting it to the computer by wire or wirelessly; determine the feature targets on the display screen — the computer selects, according to the display content, several feature targets distributed in the image acquisition area of the image sensor, or generates several such feature targets;
• point the monocular image sensor at the feature target area on the display screen, collect the image of the pointing area containing the feature targets, and extract the image coordinates of the targets according to the target features; using the acquired image coordinates of the targets and their computer display coordinates, solve the imaging parameters of the monocular camera according to the monocular camera calibration technique;
• the line through a fixed point on the imaging surface and the center of the lens is used as the virtual indication axis;
• from the image coordinates of that fixed point — that is, the image coordinates of the intersection of the virtual indication axis with the imaging surface — the corresponding display coordinates are calculated using the solved monocular camera imaging parameters.
• the workflow of the second embodiment is shown in FIG. 10 and includes the following steps: after the pointing input device starts, processing enters step 606b directly, and the computer is notified to find or generate feature targets in the area; in step 608 an image is acquired by the image sensor and the coordinates of each target are extracted from it; in step 610 it is judged whether the coordinates of each target were correctly extracted, otherwise step 608 is repeated; in step 612 the imaging parameters of the imaging model are solved using the display coordinates of the targets and the extracted image coordinates;
• in step 614a, using the solved imaging parameters, the coordinates of the intersection of the optical axis with the display screen are calculated from the coordinates of the central image point; in step 616 the computer displays the mouse cursor or another target at the optical-axis pointing point; in step 618 the feature targets are reset and displayed in the area centered on the optical axis; in step 620 it is judged whether the input device is still working — if so, continuous images continue to be captured and the cycle repeats, otherwise the system terminates.
• a third embodiment relates to a visual pointing input method for controlling the movement of a graphical target, such as a mouse cursor, so that it accurately follows the motion of the image sensor's virtual indication axis within a target region.
• the size of the target region is known, a planar target coordinate system is defined in it, and the region contains a number of feature targets; the targets have specific colors, shapes and similar properties that make them easy to extract from images, and the coordinates of the targets in the target coordinate system are known.
• the image sensor is pointed at the region to obtain the target image.
• the monocular camera calibration technique is used to locate, in the target coordinate system, the intersection of the image sensor's virtual indication axis with the target region; the corresponding mouse cursor display coordinates are then found from the unit scale relationship between the target coordinate system and the display coordinate system.
• the main feature of the third embodiment is that a number of feature targets are set in advance in a fixed target area whose internal coordinates are known; as shown in FIG. 11A to FIG. 11C, the area may be a fixed region on the computer display screen (FIG. 11A), the border of the display screen (FIG. 11B), or a target surface around the display screen (FIG. 11C).
• there are various options for the targets themselves: they can be display points of specific color and shape generated by the computer on the screen, or physical point devices such as LED light sources, or colored reflective patches attached to a physical object such as the display frame.
  • The third embodiment comprises the following steps: a fixed target area is selected as the pointing area for image acquisition; the size of the area is determined, a number of feature target points are set within it, and their coordinates in the area are known.
  • The monocular image sensor is pointed at the target area and connected to the host computer by wire or wirelessly, and the sensor is activated; an image of the pointing area containing the feature targets is acquired, and the image coordinates of the targets are extracted according to their features.
  • The monocular camera imaging parameters are then calculated. The line connecting a fixed point on the imaging surface of the image sensor and the lens center serves as the virtual indication axis. From the image coordinates of this fixed point, i.e., the intersection of the virtual indication axis with the imaging surface, the calculated monocular camera imaging parameters give the coordinates of the corresponding intersection point in the target area. This coordinate in the target coordinate system is multiplied by a proportional coefficient, obtained by dividing the actual size of the display screen by the actual size of the corresponding target area, to determine the display coordinates of the cursor on the display screen. The mouse cursor or other image target is then displayed at the calculated display coordinates. At very short time intervals, as the pointing of the image sensor moves, the above steps are repeated so that the mouse cursor or other image target follows the motion of the virtual indication axis in the target area.
  • FIG. 12 is a flow chart of the third embodiment, comprising the following steps: after the pointing input device is started, in step 606c, the image sensor is aimed at a fixed target area set in advance; in step 608, an image is acquired by the image sensor and the coordinates of each target are extracted from it; in step 610, it is determined whether the coordinates of each target were correctly extracted; if not, step 608 is repeated; in step 612, the imaging parameters of the imaging model are calculated from the display coordinates of the targets and the extracted image coordinates; in step 614b, using the calculated imaging parameters, the coordinates of the intersection of the indication axis and the target region are computed from the coordinates of the central image point; in step 622, the calculated intersection coordinates are multiplied by a scale factor to obtain the display coordinates of the cursor on the display screen; in step 616, the mouse cursor or other target is displayed by the computer at the pointing position; in step 620, it is determined whether the input device is still in the working state; if so, the above operations are repeated on continuously captured images, otherwise the system terminates.
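The per-frame computation of steps 614b and 622 can be sketched as follows. This is an illustrative example, not part of the patent disclosure: the 3×3 planar imaging model `H` (flattened row-major), the screen and target sizes, and all function names are hypothetical, and `H` is assumed to have been solved already in step 612.

```python
def map_point(H, u, v):
    """Apply a 3x3 planar homography (flattened row-major list of 9 values)
    to an image point (u, v), returning target-plane coordinates after the
    projective division."""
    x = H[0] * u + H[1] * v + H[2]
    y = H[3] * u + H[4] * v + H[5]
    w = H[6] * u + H[7] * v + H[8]
    return x / w, y / w

def cursor_display_coords(H, center_uv, screen_px, target_size):
    """Sketch of steps 614b and 622: map the central image point into the
    target plane, then scale by (display size / target area size) to get
    cursor display coordinates in pixels."""
    tx, ty = map_point(H, *center_uv)
    sx = screen_px[0] / target_size[0]   # proportional coefficient, x
    sy = screen_px[1] / target_size[1]   # proportional coefficient, y
    return tx * sx, ty * sy
```

A point seen at the image center is first located in target-area units, then rescaled so that the whole target area spans the whole screen.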
  • Figure 13 depicts a fourth embodiment of a directional pointing mouse utilizing monocular camera calibration techniques.
  • The fourth embodiment is a visual input method for spatial motion attitude realized with the monocular camera calibration technique. It uses an image sensor whose internal physical parameters, such as focal length and horizontal and vertical pixel pitch, have been calibrated in advance to continuously acquire images of fixed feature targets with known mutual distances on the display screen or in another target region. The image coordinates of the targets are extracted, and the external imaging parameters are solved according to monocular camera calibration theory. These external physical parameters, namely the three-axis rotation angles (α, β, γ) and the origin coordinates (X₀, Y₀, Z₀) of the world coordinate system of the display screen expressed in the camera coordinate system of the image sensor, determine the spatial orientation of the image sensor relative to the display. By processing the image sequence acquired by the sensor in this way, the spatial attitude and motion trajectory of the image sensor relative to the display screen can be obtained, and specific spatial motions can be used to trigger specific operations.
  • Multiple visual input devices can be used simultaneously for multiple moving objects, and the spatial orientation of each imaging device can be calculated at the same time, so that an overall motion composed of multi-part motions can be formed. For example, a small camera can be placed on each finger and aimed at the same set of targets on the screen while the hand moves; according to the calculated spatial attitude and motion trajectory of each camera, hand activities such as grasping, rotating, and panning can be identified and the corresponding control operations performed.
  • The input device is initially located at position 90. An image of the feature targets 12 in a fixed area of the display screen is acquired, and the external imaging parameters are calculated from the image coordinates of the targets, their display coordinates, and the known internal parameters; this gives the orientation of the input device relative to the targets, i.e., the parameters (α, β, γ; X₀, Y₀, Z₀). When the input device moves to position 92, the same operation is performed to calculate its orientation parameters relative to the targets. From the difference between the orientations at the two positions, the spatial attitude and motion trajectory of the input device are obtained.
  • The fourth embodiment comprises the following steps: a fixed target area is selected as the pointing area for image acquisition, a number of feature target points are set within it, and their coordinates in the area are known. A monocular image sensor whose internal physical parameters, such as focal length and pixel pitch, have been calibrated is pointed at the target area and connected to the computer by wire or wirelessly; the sensor is activated, images of the pointing area containing the feature targets are acquired, and the image coordinates of the targets are extracted according to their features. Using the acquired image coordinates of the targets and their coordinates in the fixed area, the imaging parameters of the monocular camera are solved according to the monocular camera calibration technique; the calculated imaging parameters then yield the spatial orientation coordinates of the image sensor in the target coordinate system. Connecting the spatial orientation coordinates of a series of such measurements gives the spatial motion attitude of the image sensor relative to the display screen.
  • Figure 14 is a flow chart of the fourth embodiment, comprising the following steps: after the pointing input device is started, in step 606c, the image sensor is aimed at a fixed target area set in advance; in step 608, an image is acquired by the image sensor and the coordinates of each target point are extracted from it; in step 610, it is determined whether the coordinates of each target point were correctly extracted; if not, step 608 is repeated; in step 612, the imaging parameters of the imaging model are calculated from the display coordinates of the targets and the extracted image coordinates; in step 624, using the calculated imaging parameters, the spatial orientation coordinates of the image sensor are obtained; in step 620, it is determined whether the input device is still in the working state; if so, continuous images are captured and processed in the same way, otherwise the system terminates.
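Step 624's recovery of the three-axis rotation angles can be illustrated as follows. This is a sketch only: it assumes the rotation part of the external parameters is available as a 3×3 matrix and adopts a ZYX (yaw-pitch-roll) rotation convention, which the patent does not specify; the function names are hypothetical.

```python
import math

def rot_zyx(alpha, beta, gamma):
    """Build a rotation matrix R = Rz(alpha) * Ry(beta) * Rx(gamma)
    as nested row-major lists."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb,     cb * sg,                cb * cg],
    ]

def euler_zyx_from_R(R):
    """Recover (alpha, beta, gamma) from a rotation matrix under the ZYX
    convention -- one way to express the patent's three-axis rotation
    angles. Valid away from the gimbal-lock case |R[2][0]| = 1."""
    beta = math.asin(-R[2][0])
    alpha = math.atan2(R[1][0], R[0][0])
    gamma = math.atan2(R[2][1], R[2][2])
    return alpha, beta, gamma
```

Tracking these angles (together with the translation (X₀, Y₀, Z₀)) over a sequence of frames yields the sensor's spatial trajectory described in the text.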

Abstract

An input method and an input system for a visual pointing mouse are provided, applied to precisely indicating display contents on a display screen without contact. The mouse input system includes: a host computer, a display screen (18) coupled to the host computer, a planar target comprising characteristic target points, a monocular image sensor (100), a control function unit (102), a processing circuit (104), and an information receiving and processing apparatus. The method includes: the monocular image sensor (100) points at the target area to collect images; the control function unit (102) generates various control function signals; the processing circuit (104) performs image information processing and calculates the imaging parameters of the monocular camera using the monocular camera calibration technique, thereby determining the display coordinates of a cursor, and transmits the coordinate information and the control function signals to the information receiving and processing apparatus; the information receiving and processing apparatus receives the display coordinates of the cursor and the control function signals transmitted by the processing circuit (104), and informs the computer operating system to display the cursor or other image objects at the display coordinates on the display screen (18).

Description

Visual pointing mouse input method, input system and input device using monocular camera calibration technology

Field of the Invention
The invention relates to computer peripheral input technology and devices. It is a method and apparatus that uses monocular camera calibration technology to achieve precise visual positioning with a camera and to drive a mouse cursor or other display target; in particular, it can realize a visual pointing mouse device that uses a monocular imaging device as the pointing input means.

Background Art
With the development of computer technology, input technologies and devices for graphical user interfaces have undergone continuous development. Initially, the mouse device used to drive cursor movement in a graphical user interface was the mechanical mouse, generally having a trackball or direction-key component; the movement of the cursor is controlled by the motion of the trackball relative to the mouse pad or by pressing the direction keys. Subsequently, the optical mouse gradually replaced the mechanical mouse: as it moves on a working plane, it continuously captures images reflected from the plane, processes the image sequence with image processing techniques, and extracts the direction and amount of motion to drive the cursor. Both mechanical and optical mice generally need to move on a working surface, which limits their convenience.
With the development of electronic technology, touch screens have become more and more widespread. A touch screen locates the input position and drives the cursor by touch, thereby realizing accurate pointing input for a graphical user interface. However, a touch screen requires direct contact between the pointer and the screen, which constrains its applications; it is also expensive to manufacture, and large-area screens are difficult to produce.
At present, techniques that use machine vision to realize human-computer interaction input are drawing increasing attention. There are already many techniques and devices that generate computer input commands from the spatial motion of gestures, eyes, the head, other body parts, or other control objects. Such techniques generally use pre-positioned digital cameras to capture continuous video images of the target, extract parameters such as position, direction, and displacement with image processing, and then use these parameters to drive a display target such as the mouse cursor or to execute specific control actions. Pointing-type input that drives a mouse cursor can be realized this way: as described in patents CN2609054Y, CN2602419Y, and CN101236468A, several digital cameras or other image acquisition devices are generally placed and calibrated in advance so that their field of view covers both the computer display and a pointing device such as a pointer stick or a finger; the display coordinates of the mouse cursor can then be determined from the relative position of the pointing tip on the display. This approach occupies a certain amount of space and often requires auxiliary devices such as laser pointers, selectively reflective films, and polarizing filters, making the whole system very complicated.
Another noteworthy visual pointing input technique binds a digital camera to the tip of a finger or pointer stick. When the camera points at the display, it captures the screen content in a local area and sends it to the host computer, which scans the currently displayed screen content region by region to find the location of the pointed area on the display; alternatively, the relative displacement of the captured image content caused by movement of the pointing tip is used to control the motion of the cursor. Although this method is convenient and feasible, it has the following shortcomings. First, if the displayed content is a single uniform color, such as a blank screen, image matching and displacement extraction become impossible. Second, rotation, offset, and distance changes of the camera cause affine and projective distortion of the captured image, making matching difficult. Third, camera rotation makes the direction of the camera's image coordinate system inconsistent with that of the display coordinate system, so the direction of the relative image displacement disagrees with the motion of the pointing platform, producing an incorrect cursor input direction; an orientation sensor must be added to correct it. Finally, as the imaging distance between the camera and the display changes, the image size scales relative to the actual screen image, so the same spatial displacement produces different image displacements at different distances and must also be corrected. Therefore, this technique alone cannot drive the mouse cursor accurately, and usually requires additional orientation and distance sensors, resulting in a complicated structure.
In summary, the above techniques fall short of providing an accurate pointing mouse input device that is simple in structure, easy to use, and inexpensive to manufacture. This patent therefore proposes a simple and practical visual pointing mouse input method that uses the monocular camera calibration technique from machine vision to accurately extract the coordinates of the cursor pointing point, and that can be used to build an accurate non-contact visual pointing mouse device.

Summary of the Invention
This patent proposes a visual pointing mouse input method using the monocular camera calibration technique, for controlling a graphical target such as the mouse cursor to accurately follow the pointing motion of the virtual indication axis of an image sensor. The method comprises the following steps: the monocular image sensor is pointed at a planar target that carries a defined target coordinate system in which a number of feature target points are set; the image sensor is started and connected to the computer by wire or wirelessly; the image sensor captures an image of the feature content in the pointed region of the target, and the image coordinates of the feature targets are extracted from the image; using the acquired target image coordinates and the coordinates of the targets in the target coordinate system, the monocular camera imaging parameters are solved according to the monocular camera calibration technique; from the image coordinates of a fixed image point on the imaging surface of the image sensor, the solved imaging parameters are used to calculate the coordinates, in the target coordinate system, of the object point corresponding to that image point, i.e., the coordinates of the pointing point; the fixed image point, the optical center of the lens, and the pointed object point together form a virtual indication axis; the display coordinates of the mouse cursor or other image target on the screen are calculated from the coordinates of the pointing point in the target coordinate system, and the computer displays the cursor on the screen. At very short time intervals the above process is repeated, so that the mouse cursor or other image target follows the pointing of the image sensor.
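The parameter-solving step above can be sketched in miniature. This example assumes the planar-target imaging model reduces to a 3×3 homography with its last entry fixed to 1 (a common formulation; the patent itself does not commit to this parameterization), so that four non-collinear target correspondences suffice; all function names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting
    (pure Python, nested lists)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def apply_h(H, u, v):
    """Map an image point through the 3x3 model H (flattened row-major)."""
    w = H[6] * u + H[7] * v + H[8]
    return ((H[0] * u + H[1] * v + H[2]) / w,
            (H[3] * u + H[4] * v + H[5]) / w)

def fit_homography(img_pts, tgt_pts):
    """Solve the planar imaging model from four image/target point pairs:
    each pair contributes two linear equations in h1..h8 (h9 fixed to 1)."""
    A, b = [], []
    for (u, v), (x, y) in zip(img_pts, tgt_pts):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    return solve_linear(A, b) + [1.0]
```

With the model solved, `apply_h` applied to the fixed image point yields the pointing-point coordinates in the target coordinate system, as described above.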
This patent also proposes a visual pointing mouse input method using the monocular camera calibration technique, for realizing visual input of spatial motion attitude. The method comprises the following steps: the monocular image sensor is pointed at a planar target that carries a defined target coordinate system in which a number of feature target points are set; the image sensor is connected to the computer by wire or wirelessly and started; the image sensor captures an image of the feature content in the pointed region of the target, and the image coordinates of the feature targets are extracted; using the acquired target image coordinates and the coordinates of the targets in the target coordinate system, the monocular camera imaging parameters are solved according to the monocular camera calibration technique; the calculated camera imaging parameters then yield the spatial orientation coordinates of the image sensor in the target coordinate system, comprising the three-axis rotation angles (α, β, γ) and the origin coordinates (X₀, Y₀, Z₀). At very short time intervals, as the image sensor moves, the above process is repeated to find the spatial orientation coordinates of the image sensor at each position; connecting this series of spatial orientation coordinates gives the spatial motion attitude of the image sensor relative to the display screen.
In the proposed visual pointing mouse input method, pointing the image sensor at a planar target can be done in different ways. The image sensor can be pointed directly at a target that has been set up in advance, i.e., the target already exists before the sensor is aimed at it. Alternatively, the position the image sensor points at can be determined first, and a target then set up in that pointed region; this approach suits the case where the target can be placed on a dynamic display device. For example, when the image sensor points at a computer display, the computer first determines the approximate screen region the sensor is pointing at, and then dynamically creates a target in that region.
When the target needs to be placed on a dynamic display device such as a display screen, this patent proposes two methods for determining the position on the screen that the image sensor points at. The first method comprises the following steps: the image sensor is pointed at the target region of the display to capture images; the device is connected to the host computer by wired or wireless communication, and the display is connected to the host. The image sensor is started, and the host is instructed to output on the display, for a very short time, a coded pattern composed of an arrangement of feature blocks of different colors or graphical content, each color or graphical content being assigned a different number; the code formed by all feature blocks within a certain range around each feature block is unique within the whole coded pattern, and the region codes of all feature blocks form a position lookup table. In particular, the display may output a coded pattern composed of rectangular feature blocks of different colors or graphical content, where the code formed by the feature blocks in the n×n neighborhood of each rectangular block is unique within the pattern. The image sensor captures the coded image of the pointed region; the code of the local pattern is extracted from it and compared against the spatial position lookup table, determining the approximate position on the display that the sensor points at.
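A toy version of the unique-window coded pattern and its position lookup table might look like this. It is illustrative only: the 2×2 window size, the color count, and the pattern formula are hypothetical choices that merely guarantee window uniqueness for the demo grid.

```python
def build_pattern(rows, cols, n_colors=101):
    """Demo coded pattern: a grid of color codes chosen so that every
    2x2 window of codes is unique across the grid (hypothetical formula)."""
    return [[(31 * r + 17 * c) % n_colors for c in range(cols)]
            for r in range(rows)]

def build_lookup(grid):
    """Map every 2x2 window code to its top-left cell position -- the
    position lookup table. Raises if the uniqueness requirement fails."""
    table = {}
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            key = (grid[r][c], grid[r][c + 1],
                   grid[r + 1][c], grid[r + 1][c + 1])
            if key in table:
                raise ValueError("window code not unique in pattern")
            table[key] = (r, c)
    return table
```

A sensor that can read one local 2×2 window of colors can then recover its approximate pointing position with a single dictionary lookup.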
The second method proposed by this patent for determining the position on the display that the image sensor points at comprises the following steps: the image sensor is pointed at the target region of the display to capture images; the device is connected to the host computer by wired or wireless communication, and the display is connected to the host. The image sensor is started, and the display first outputs a coarse-resolution coded pattern composed of an arrangement of feature blocks of different colors or graphical content, each color or graphical content being assigned a different number; the image sensor captures the image of the pointed region and determines which feature block it faces. The host then outputs the same coded pattern again within the identified feature block, scaled to the size of that block, and the image sensor captures the pointed region again to determine which smaller feature block it faces. This operation is repeated rapidly from coarse to fine until the position on the display that the image sensor points at is finally determined.
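The coarse-to-fine loop can be simulated as follows. This is a sketch under assumed conditions: an idealized sensor that always reports the correct grid cell, and hypothetical grid-size and tolerance values.

```python
def locate(point, width, height, grid=4, tol=1.0):
    """Coarse-to-fine localization: repeatedly subdivide the current
    region into a grid x grid coded pattern and keep the cell that
    contains the pointing direction, until the cell is within `tol`."""
    x0, y0, w, h = 0.0, 0.0, float(width), float(height)
    while w > tol or h > tol:
        cw, ch = w / grid, h / grid
        # an ideal sensor reports which cell it is pointing at
        cx = min(int((point[0] - x0) / cw), grid - 1)
        cy = min(int((point[1] - y0) / ch), grid - 1)
        x0, y0, w, h = x0 + cx * cw, y0 + cy * ch, cw, ch
    return x0, y0
```

Each round shrinks the candidate region by the grid factor, so only a handful of pattern displays are needed to pin down the pointed position on a full-resolution screen.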
In the proposed visual pointing mouse input method, the planar target has the following characteristics: the size of the target region is determined; the region contains a number of feature target points with specific colors, shapes, and other features that make them easy to extract from an image; and the coordinates of the targets in the target coordinate system of the target region are known.
The planar target may be a fixed planar target. The border of the display screen may be chosen as the target region, with feature targets set on the border and the distances between them known; a plane of known size around the display may be chosen as the target region, with feature targets set within it and the distances between them known; or the computer may choose a fixed local region on the display as the target region, its size determined by the computer display coordinates of the region, with the computer designating a number of feature points within it as feature targets.
The planar target may also be a dynamic planar target, generated dynamically by the computer on the display screen. The position where the target is generated always follows the pointing of the image sensor, and the extent of the target region can be adjusted according to the imaging distance of the sensor; its size is determined by the computer display coordinates of the region, and the computer designates a number of feature points within the region as feature targets.
Determining the feature targets in the display target region by computer means: the computer processes the displayed content within a certain range containing the target region and, using features such as color, edges, corners, orientation, and surrounding context, selects a number of feature targets from the displayed content; these feature targets delimit the target region, and their feature information is recorded.
Determining the feature targets in the display target region by computer may also comprise the following steps: the computer analyzes the color statistics of the displayed content within a certain range containing the target region and selects, as the color of the generated targets, a color that is absent from the content and clearly different from the colors present; the computer then additionally renders content of the selected color in the target region on the display, containing features such as intersections, corners, and center points, from which a number of feature targets are selected; these feature targets delimit the target region.
Regarding the number of feature targets: for a visual pointing mouse input method using an image sensor whose internal physical parameters, such as focal length and pixel pitch, have not been calibrated in advance, at least 4 feature targets must be determined on the display.

For a visual pointing mouse input method using an image sensor whose internal physical parameters, such as focal length and pixel pitch, have been calibrated in advance, at least 3 feature targets must be determined on the display.
Solving the monocular camera imaging parameters may mean: using the acquired image coordinates of the targets and their target-coordinate-system coordinates, the external imaging parameters of the image sensor's monocular camera are solved according to the monocular camera calibration technique.

Alternatively, it may mean: using the acquired image coordinates of the targets and their target-coordinate-system coordinates, both the internal and the external imaging parameters of the image sensor's monocular camera are solved according to the monocular camera calibration technique.
本专利提出的视觉指向型鼠标输入方法, 所述的图像传感器成像面上的一固定像点, 是 指- 该像点可以是成像面上的任意一像点, 该像点与成像镜头光心的连线构成一个虚拟指示 轴, 该像点对应的物点即为虚拟指示轴的指向点。  The visual pointing type mouse input method proposed by the patent, the fixed image point on the imaging surface of the image sensor means that the image point can be any image point on the imaging surface, and the image point and the imaging lens optical center The connection constitutes a virtual indication axis, and the object point corresponding to the image point is the pointing point of the virtual indication axis.
In the visual pointing mouse input method proposed in this patent, a fixed image point on the imaging plane of the image sensor may also mean: the image point is the central point of the imaging plane; the line connecting this image point and the optical center of the imaging lens, i.e., the optical axis of the imaging system, constitutes a virtual indication axis, and the object point corresponding to this central image point is the pointing point of the virtual indication axis.
In the visual pointing mouse input method proposed in this patent, calculating the display coordinates of the mouse cursor from the coordinates of the pointing point in the target coordinate system means: when the unit length of the target coordinate system equals the pixel pitch of the computer display, the already-computed coordinates of the pointing point in the target coordinate system are themselves the display coordinates; when the unit length of the target coordinate system differs from the pixel pitch of the computer display, the computed coordinates of the pointing point in the target coordinate system must be multiplied by a scale factor to obtain the display coordinates, the scale factor being obtained by dividing the pixel pitch of the display by the unit length of the target coordinate system.
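The unit conversion just described is a one-line computation. The sketch below uses the scale factor exactly as the method defines it (display pixel pitch divided by the target coordinate system's unit length); the function and parameter names are illustrative, not from the patent.

```python
def target_to_display(target_coord, unit_length, pixel_pitch):
    """Convert (x, y) in the target coordinate system to display coordinates.

    Scale factor per the method above: pixel_pitch / unit_length.
    With equal units the factor is 1 and the target coordinates are
    already the display coordinates.
    """
    scale = pixel_pitch / unit_length
    return (target_coord[0] * scale, target_coord[1] * scale)
```

For example, with a 0.25 mm pixel pitch and a 0.25 mm target unit the mapping is the identity, while a 0.5 mm target unit halves each coordinate under this definition.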
This patent proposes a visual pointing mouse input system using monocular camera calibration techniques, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a monocular image sensor, connected to the computer host through a processing circuit and used as a pointing device that is aimed at a target region of the display screen to capture images, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; a control function component, on which various control function keys are arranged to generate control function signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host through the processing circuit; a system processing circuit, connected to the image sensor and the control function component, and connected to the computer host by wire or wirelessly; and a visual pointing mouse information receiving and processing device, installed in the computer host, which communicates with the computer operating system and the processing circuit. The functions of the system processing circuit include: processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left click, right click, page turning, and movement; and communicating with the computer host by wire or wirelessly to transfer images, feature information, computation results, control signals, and other information. The functions of the information receiving and processing device include: receiving images, computation results, and other information sent by the processing circuit; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the processing circuit; sending feature information and coordinate information of the target points to the processing circuit of the input system; and outputting the computed cursor coordinate information to the computer operating system.
This patent further proposes a visual pointing mouse input system using monocular camera calibration techniques, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a monocular image sensor, connected to the computer host through an information receiving and processing device and used as a pointing device that is aimed at a target region of the display screen to capture images, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; a control function component, on which various control function keys are arranged to generate control function signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host through the information receiving and processing device; and a visual pointing mouse information receiving and processing device, installed in the computer host, which communicates with the computer operating system, the image sensor, and the control function component, and whose functions include: receiving the image information sent by the monocular image sensor; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the control function component; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; and outputting the computed cursor coordinate information to the computer operating system.
This patent proposes a glove-type visual pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a pointing finger sleeve fitted with a monocular image sensor, used to aim at a target region of the display screen to capture images and connected to the computer host through a processing circuit, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; a control function key finger sleeve, comprising several buttons, touch keys, or pressure switches operated by the thumb to generate control signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host through the processing circuit; an auxiliary function key finger sleeve, which may contain function control keys such as page turning that are triggered by bending the finger itself, connected to the computer host through the processing circuit and used as needed; a processing circuit, connected to the image sensor and the control function components, and connected to the computer host by wire or wirelessly; and a visual pointing mouse information receiving and processing device, installed in the computer host, which communicates with the computer operating system and the processing circuit. The functions of the processing circuit of the visual pointing mouse input device include: processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left click, right click, page turning, and movement; and communicating with the computer host by wire or wirelessly to transfer images, feature information, computation results, control signals, and other information. The functions of the information receiving and processing device include: receiving images, computation results, and other information sent by the processing circuit; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the processing circuit; sending feature information and coordinate information of the target points to the processing circuit of the input system; and outputting the computed cursor coordinate information to the computer operating system.
This patent further proposes a glove-type visual pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a pointing finger sleeve fitted with a monocular image sensor, used to aim at a target region of the display screen to capture images and connected to the computer host, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; a control function key finger sleeve, comprising several buttons, touch keys, or pressure switches operated by the thumb to generate control signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host; an auxiliary function key finger sleeve, which may contain function control keys such as page turning that are triggered by bending the finger itself, connected to the computer host and used as needed; and a visual pointing mouse information receiving and processing device, installed in the computer host, which communicates with the computer operating system, the image sensor, and the control function components, and whose functions include: receiving the image information sent by the image sensor; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the control function components; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; and outputting the computed cursor coordinate information to the computer operating system.
This patent proposes a finger-sleeve visual pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a visual pointing mouse input finger sleeve, which integrates a monocular image sensor, a processing circuit, and control function keys on a single sleeve worn on the index finger or another finger, with function control performed by the thumb; in use, the image sensor at the front end of the sleeve is aimed at a target region of the display screen to capture images, the image sensor being connected to the computer host through the processing circuit by wire or wirelessly, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; the control function keys of the sleeve comprise several buttons, touch keys, or pressure switches operated by the thumb to generate control function signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host through the processing circuit; and an information receiving and processing device, installed in the computer host, which communicates with the computer operating system and the processing circuit of the mouse input device. The processing circuit in the mouse input finger sleeve is connected to the image sensor and the control function components, and to the computer host by wire or wirelessly; its functions include: processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left click, right click, page turning, and movement; and communicating with the computer host by wire or wirelessly to transfer images, feature information, computation results, control signals, and other information. The functions of the information receiving and processing device of the mouse input device include: receiving images, computation results, and other information sent by the processing circuit of the mouse input device; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the processing circuit; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; sending feature information and coordinate information of the target points to the processing circuit; and outputting the computed cursor coordinate information to the computer operating system.
This patent further proposes a finger-sleeve visual pointing mouse input device, comprising: a computer host and a display screen connected to it; a planar target containing a plurality of feature target points, the distances between the target points being known and each target point having definite features that facilitate its extraction from an image; a visual pointing mouse input finger sleeve, which integrates a monocular image sensor and control function keys on a single sleeve worn on the index finger or another finger, with function control performed by the thumb; in use, the image sensor at the front end of the sleeve is aimed at a target region of the display screen to capture images, the image sensor being connected to the computer host by wire or wirelessly, with the line connecting a fixed point on the imaging plane of the image sensor and the lens center serving as the virtual indication axis; the control function keys of the sleeve comprise several buttons, touch keys, or pressure switches operated by the thumb to generate control function signals such as system trigger, left click, right click, page turning, and movement, connected to the computer host by wire or wirelessly; and an information receiving and processing device, installed in the computer host, which communicates with the computer operating system, the image sensor, and the control function components, and whose functions include: receiving the image information sent by the image sensor; receiving the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the control function components; processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; and outputting the computed cursor coordinate information to the computer operating system.
This patent proposes a visual pointing mouse application program that resides in the computer host and communicates with the computer operating system and the visual pointing mouse input system, comprising the following functions: a control function program, which receives the control function signals, such as system trigger, left click, right click, page turning, and movement, generated by the control function component of the input system; an image receiving and processing program, which receives the image information sent by the image sensor of the input system; an image sensor localization program, which determines the position on the display screen at which the image sensor is aimed: upon receiving the working trigger signal of the input system, it instructs the computer to output, for a very short time, a positioning code pattern on the display screen for determining the pointing position of the image sensor, processes the image of the pointed region captured by the image sensor, extracts the embedded pattern code from the local-region image, compares it against a lookup table of the spatial positions of the positioning code patterns, and thereby determines the approximate position on the display screen at which the image sensor is aimed; a feature target generation program, which determines feature target points on the display screen: within the target display region, it either selects, according to the displayed content, several feature target points distributed in the image acquisition region of the image sensor, or generates several feature target points distributed in that region; a target extraction program, which processes the image of the pointed region containing the feature target points captured by the image sensor and extracts the image coordinates of the target points according to their features; a camera imaging parameter computation program, which uses the image coordinates of the captured target points and their computer display coordinates to solve the monocular camera imaging parameters according to monocular camera calibration techniques; a display coordinate computation program, which takes the line connecting a fixed point on the imaging plane of the image sensor and the lens center as the virtual indication axis and, from the image coordinates of that fixed point, i.e., the image coordinates of the intersection of the virtual indication axis with the imaging plane, computes, using the solved monocular camera imaging parameters, the display coordinates of the corresponding intersection point on the computer display screen; and a cursor display program, which instructs the computer to display the mouse cursor or another graphical object at the pointing point of the virtual indication axis on the display screen, re-determines the pointed display region centered on the displayed cursor, and re-determines the feature target points within that region.

DESCRIPTION OF THE DRAWINGS
FIG. 1 is an overview of the first embodiment of the visual pointing mouse using monocular camera calibration techniques; FIG. 2A is a schematic diagram of the pinhole model of monocular camera imaging;
FIG. 2B is a schematic diagram of calibrating the display screen plane using monocular camera calibration techniques;
FIG. 3A is a schematic diagram of coded positioning composed of different colors;
FIG. 3B is a schematic diagram of pointing localization based on the local-region coding of the display screen;
FIG. 3C is a schematic diagram of coarse-resolution coded positioning composed of different colors;
FIG. 3D is a schematic diagram of the first localization of the pointing position of the image sensor;
FIG. 3E is a schematic diagram of the second localization of the pointing position of the image sensor;
FIG. 4 is a schematic diagram of selecting feature target points using features of the displayed content such as color, edges, and corners; FIG. 5A is a schematic diagram of the color distribution of the display points in the target-setting region within the color space and of the target color selection region; FIG. 5B is a schematic diagram of generating four cross-shaped target points arranged in a square;
FIG. 6 is a workflow diagram of the first embodiment of the visual pointing mouse using monocular camera calibration techniques;
FIG. 7 is a basic system block diagram of the visual pointing mouse input system using monocular camera calibration techniques; FIGS. 8A to 8C are schematic diagrams of how the glove-type visual pointing input device is worn and used; FIG. 8D is a schematic diagram of the monocular image sensor pointing component;
FIG. 8E is a schematic diagram of the main control function component;
FIG. 8F is a schematic diagram of the auxiliary control function component;
FIG. 9A is a schematic diagram of a finger-sleeve visual pointing input device with all components integrated into one body;
FIG. 9B is a schematic diagram of a pen-type visual pointing input device; FIG. 10 is a workflow diagram of the second embodiment of the visual pointing mouse using monocular camera calibration techniques;
FIGS. 11A to 11C are schematic diagrams of the third embodiment of the visual pointing mouse using monocular camera calibration techniques;
FIG. 12 is a workflow diagram of the third embodiment of the visual pointing mouse using monocular camera calibration techniques;
FIG. 13 is a schematic diagram of the fourth embodiment of the visual pointing mouse using monocular camera calibration techniques;
FIG. 14 is a workflow diagram of the fourth embodiment of the visual pointing mouse using monocular camera calibration techniques.

DETAILED DESCRIPTION
With reference to the accompanying drawings, embodiments of the visual pointing mouse using monocular camera calibration techniques are described in detail below; identical components in the drawings are denoted by the same reference numerals.
[Embodiment 1]
FIGS. 1 to 9B describe the first embodiment of the visual pointing mouse using monocular camera calibration techniques. The first embodiment concerns a visual pointing input method that controls a graphical object, such as a mouse cursor, so that it is displayed accurately at the intersection where the virtual indication axis of an image sensor points to the computer display screen. In this method, a dynamic display device such as the display screen serves as the region in which the target is set: the computer first uses a particular method to localize the position on the display screen at which the image sensor is aimed, and then sets the feature target points on the display screen, thereby defining the target region; the target region can move with the pointing direction of the image sensor, so that it always remains within the pointed imaging region of the image sensor.
FIG. 1 is an overview of the first embodiment. The visual pointing input device 10 in the figure is worn on the hand in the form of a glove. Its core working component is a small monocular image sensor 100 worn on the index finger, used to aim at the target region and capture images; in the first embodiment, the target region is set on the display screen by the computer. The image sensor can automatically adjust its focus according to the imaging distance to obtain a sharp image (when the imaging distance is much larger than the focal length of the image sensor, a fixed-focus image sensor will always yield a reasonably sharp image). The input device further comprises a control function component 102 worn on the middle finger or another finger, on which control function keys are arranged and operated by the thumb to implement functions such as system trigger, left click, right click, and page scrolling; and a processing circuit 104, which is placed on the back of the hand in FIG. 1 but may also be integrated into the image sensor 100 or the control function component 102. The processing circuit is connected to the image sensor 100 and the control function component 102, and to the computer host by wire or wirelessly. The functions of the processing circuit 104 mainly include: processing the captured images to accomplish pointing localization of the image sensor, feature target extraction, parameter computation of the imaging model, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left click, right click, and page scrolling; and communicating with the computer host by wire or wirelessly to transfer images, feature information, computation results, control signals, and other information. The specific structure of the visual pointing input device 10 is described further below.
As shown in FIG. 1, when the image sensor 100 worn on the index finger is aimed at a region of the display screen 18, the computer first uses a localization technique to determine the approximate position of the display screen region to which the image sensor corresponds; this localization technique is described later. The computer then selects several feature target points in that region according to features of the displayed content such as color, edges, and corners, or generates several feature target points 12 in that region; these feature target points constitute the target region, and the number of target points required depends on the specific calibration technique used. At the same time, the computer transmits the feature information of these target points to the processing circuit 104 by wired or wireless communication, so that the input device can use these features to extract the feature target points from the captured images. The relevant imaging parameters of the imaging model can then be computed by monocular camera calibration techniques from the computer display coordinates of these feature target points and their image coordinates, after which these imaging parameters are used to compute the coordinates of the intersection 16 of the virtual indication axis 14 with the display screen relative to the target points. Finally, from the display coordinates of the feature target points on the display screen, the actual display coordinates of the intersection 16 on the screen are computed, so that the mouse cursor or another graphical object can be displayed precisely. By continually repeating this process within very short intervals, the cursor can be made to move precisely following the pointing direction of the image sensor; for example, for an image sensor with a frame rate of 15 fps the time interval between two images is about 67 ms, and for a frame rate of 30 fps about 33 ms: the higher the frame rate, the shorter the interval. FIG. 1 shows the computer processing a single pointing input device; the system can also support several similar pointing input devices used simultaneously.
To better understand how the monocular pointing visual mouse input technique proposed in this patent accurately obtains the display coordinates of the point at which the virtual pointing axis meets the display screen, the monocular camera calibration technique it relies on is illustrated below.
FIG. 2A shows the pinhole model of monocular camera imaging for an arbitrary object point in space: object point 20a is projected through the center of the imaging lens 22 onto the image plane 24 as image point 20b. The imaging device may be a CCD device, a CMOS device, or another digital imaging device. The description of the imaging process involves three coordinate systems: the world coordinate system Ow-XwYwZw in which the object point lies, the camera coordinate system Oc-XcYcZc, and the image coordinate system U-O-V of the image plane, labeled 200, 202, and 204 in the figure. The complete imaging model, i.e. the relation between the coordinates (Xw, Yw, Zw) of an object point in the world coordinate system and the image coordinates (u, v) of its image point on the image plane, is described by parameter matrices determined by nine physical quantities: six external parameters, namely the three-axis rotation angles (α, β, γ) of the world coordinate system Ow-XwYwZw in the camera coordinate system Oc-XcYcZc and the coordinates (X0, Y0, Z0) of its origin in the camera coordinate system, and three internal parameters, namely the camera focal length f and the lateral and longitudinal pixel spacings dx and dy of the imaging device:
$$
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= M_2\, M_1 \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},
\qquad
M_1 = \begin{bmatrix} R & T \\ 0^{T} & 1 \end{bmatrix},
\qquad
M_2 = \begin{bmatrix} f/d_x & 0 & u_0 & 0 \\ 0 & f/d_y & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
$$

in which $R$ is the $3\times3$ rotation matrix determined by the angles $(\alpha, \beta, \gamma)$ and $T = (X_0, Y_0, Z_0)^{T}$ is the translation vector.
Here M1 is the external parameter matrix determined by the three-axis rotation angles (α, β, γ) of the world coordinate system in the camera coordinate system and the origin coordinates (X0, Y0, Z0), and M2 is the internal parameter matrix determined by the camera's internal physical quantities: the focal length f, the lateral pixel spacing dx, and the longitudinal pixel spacing dy; (u0, v0) are the coordinates, in the image coordinate system, of the intersection of the camera optical axis with the imaging plane. Eliminating Zc yields two equations containing only the world coordinates (Xw, Yw, Zw) of a point and its image plane coordinates (u, v). According to camera calibration theory, in the general case, if the world coordinates and image coordinates of six spatial points are known, all parameters of the matrices M1 and M2 can be solved, so that the image coordinates (u, v) of any object point can be computed from its world coordinates (Xw, Yw, Zw), and conversely the world coordinates (Xw, Yw, Zw) can be computed back from the coordinates (u, v) of any image point.
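As a concrete illustration of the nine-parameter model above, the following minimal numpy sketch builds M1 and M2, projects a world point through them, and eliminates Zc. The Z-Y-X Euler convention is an assumption, since the patent only names the three angles (α, β, γ) without fixing a convention:

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    # R = Rz(gamma) @ Ry(beta) @ Rx(alpha); the Euler convention is assumed here
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project(Xw, f, dx, dy, u0, v0, angles, origin):
    """Image coordinates (u, v) of world point Xw under the pinhole model."""
    R = rotation_matrix(*angles)
    T = np.reshape(np.asarray(origin, float), (3, 1))
    M1 = np.block([[R, T], [np.zeros((1, 3)), np.ones((1, 1))]])  # external 4x4
    M2 = np.array([[f / dx, 0, u0, 0],
                   [0, f / dy, v0, 0],
                   [0, 0, 1, 0]], dtype=float)                    # internal 3x4
    p = M2 @ M1 @ np.append(np.asarray(Xw, float), 1.0)
    return p[:2] / p[2]  # divide out Zc
```

With identity rotation and zero translation, a point on the optical axis maps to the principal point (u0, v0), as the model requires.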
This patent applies to visual pointing input on a computer display screen, where all target points lie in one display plane. The display plane can therefore be taken as the XwOwYw plane of the world coordinate system, with the pixel pitch of the display as the coordinate unit length, so that XwOwYw plane coordinates coincide with computer display coordinates, and the direction perpendicular to the plane is taken as the Zw axis, as shown in FIG. 2B. In the figure, 18 is the display screen lying in the XwOwYw plane, and 16a and 16b are the intersection of the camera optical axis with the display screen and the corresponding central image point on the imaging plane. Since the display plane serves as the XwOwYw plane, Zw = 0 for all points on the screen, which simplifies the computation. Moreover, since the display coordinates of the feature targets are known, it suffices to determine the coordinates of the pointing-axis indication point relative to the targets; its display coordinates on the screen then follow from the display coordinates of the targets.

Under these specific application conditions the required calibration is correspondingly simplified. According to camera calibration theory, even when the internal physical parameters of the image sensor are unknown, it suffices to solve the system of imaging equations determined by the object-space display coordinates and the image-space coordinates of four targets in the display plane to obtain an external parameter matrix satisfying the requirements. From the imaging parameters obtained by this calibration, the display coordinates of the intersection of the computer display screen with the virtual pointing axis, defined by the line through a fixed point on the imaging plane and the lens center, can be computed, so that the mouse cursor or another display object can be positioned precisely; in practice the optical axis of the image sensor is conveniently taken as the pointing axis. In FIG. 2B, 12a is a feature target set on the display screen and 12b is its image point on the imaging plane. From their display coordinates and image coordinates the required imaging parameters can be calibrated, and then from the image coordinates of the intersection 16b of the optical axis with the imaging plane, the coordinates of the corresponding on-screen intersection 16a relative to the targets can be computed, finally yielding its exact computer display coordinates.
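Because all targets lie in the Zw = 0 plane, the mapping between image coordinates and display coordinates reduces to a plane-to-plane projective transform, so the four-target calibration can be illustrated by estimating a homography with the direct linear transform. This is an equivalent formulation used here for illustration, not necessarily the patent's exact solver:

```python
import numpy as np

def homography(display_pts, image_pts):
    """Direct linear transform: image plane -> display plane, from 4+ point pairs."""
    A = []
    for (X, Y), (u, v) in zip(display_pts, image_pts):
        A.append([u, v, 1, 0, 0, 0, -X * u, -X * v, -X])
        A.append([0, 0, 0, u, v, 1, -Y * u, -Y * v, -Y])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    return Vt[-1].reshape(3, 3)  # maps (u, v, 1) -> (X, Y, 1) up to scale

def pointing_point(H, center_uv):
    """Display coordinates of the pointing-axis point, e.g. the central image point 16b."""
    p = H @ np.array([center_uv[0], center_uv[1], 1.0])
    return p[:2] / p[2]
```

Given four target correspondences, `pointing_point(H, center)` yields the display coordinates of the on-screen intersection 16a corresponding to the central image point 16b.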
The application mode described above can be used directly even when the internal physical parameters of the image sensor are unknown. Furthermore, if internal physical parameters such as the focal length and pixel pitch of the image sensor have been calibrated in advance, then according to camera calibration theory it suffices to solve the system of imaging equations determined by the object-space display coordinates and image-space coordinates of only three targets to obtain an external parameter matrix satisfying the requirements. In addition, once the external parameter matrix has been obtained, the external physical quantities that determine those parameters, namely the three-axis rotation angles (α, β, γ) of the world coordinate system in the camera coordinate system and the origin coordinates (X0, Y0, Z0), can be recovered, so that the spatial attitude of the image sensor relative to the display screen can also be determined; this is used in the fourth embodiment described later.
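Recovering the external physical quantities from the external parameter matrix amounts to decomposing its rotation part back into the three-axis angles. The sketch below assumes a Z-Y-X Euler convention, which the patent does not fix; the decomposition is valid away from the gimbal-lock case |β| = π/2:

```python
import numpy as np

def rot_zyx(alpha, beta, gamma):
    """R = Rz(gamma) @ Ry(beta) @ Rx(alpha); the convention is an assumption."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def euler_from_rotation(R):
    """Recover (alpha, beta, gamma) from R, away from gimbal lock."""
    beta = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    alpha = np.arctan2(R[2, 1], R[2, 2])
    gamma = np.arctan2(R[1, 0], R[0, 0])
    return alpha, beta, gamma
```

The recovered angles, together with the translation column of M1, give the spatial attitude of the image sensor relative to the display screen.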
As mentioned above, in the first embodiment of this patent, an image sensor positioning technique is used to determine the approximate position on the display screen at which the image sensor is pointed. This positioning is needed because, when the visual pointing input device starts working with the camera relatively close to the display screen, only a local region can be imaged; the position of that local region of the screen must be determined before the computer processor can be asked to generate feature targets there. This patent proposes two positioning methods for coarsely determining the image sensor position, described below.

FIG. 3A and FIG. 3B illustrate the first positioning method. When the input device 10 starts working, it sends a synchronization signal to the computer, and the computer outputs a specially designed color-coded pattern to the display screen 18: a grid of colored squares in which the color code of the n×n block of squares surrounding each square is unique. For example, the coded pattern 30 shown in FIG. 3A is a grid of the four colors 300, 302, 304, and 306, each color being assigned a different code; the color code of the 3×3 block around each square in 30 is unique, so the whole coded pattern corresponds to a code lookup table, and the approximate position of the display region facing the camera can be determined from the region code extracted from the captured image. As shown in FIG. 3B, the image sensor 100 extracts the code 023302010 from the captured local-region image 32; the approximate position corresponding to this code is found in the lookup table, and feature targets can then be generated in that region. The coded pattern can take many forms: besides color coding, the squares can be coded by their geometric content, and other geometric coding patterns, such as ring-coded patterns, can also be designed. The color-coded pattern can be blended semi-transparently with the original display content, or displayed fully for a very short time, captured by the image sensor, and then replaced by the original content.
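The lookup-table step can be sketched as follows: a grid of color codes is scanned once to map every 3×3 window code to its position, and a captured window is then decoded by a single lookup. The grid contents below are illustrative, not the patent's actual pattern:

```python
import numpy as np

def build_lookup(grid):
    """Map each 3x3 window code -> its top-left grid position; codes must be unique."""
    lut = {}
    rows, cols = grid.shape
    for r in range(rows - 2):
        for c in range(cols - 2):
            code = ''.join(str(x) for x in grid[r:r + 3, c:c + 3].flat)
            if code in lut:
                raise ValueError(f"code {code} is not unique")
            lut[code] = (r, c)
    return lut

def locate(lut, window):
    """Coarse sensor position from one captured 3x3 window of color codes."""
    return lut[''.join(str(x) for x in window.flat)]
```

Designing a full-screen grid in which every 3×3 window really is unique is the pattern-design problem; `build_lookup` raises an error if the chosen grid violates that property.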
Another method of determining the image sensor position is shown in FIG. 3C, FIG. 3D, and FIG. 3E. Unlike the previous method, which outputs a coded pattern once, this method outputs a sequence of coded patterns within a very short time and determines the image sensor position progressively, from coarse to fine. For example, 34 in FIG. 3C is a coarse-resolution coded pattern of the four colors 300, 302, 304, and 306. First, the computer outputs the coarse-resolution coded pattern 34 over the full screen to roughly determine the region of the image sensor, as shown in FIG. 3D; then the same coded pattern 34 is output again within the determined local region to refine the position of the image sensor, as shown in FIG. 3E. Cycling through this a few times in rapid succession locates the image sensor quite precisely.
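The coarse-to-fine cycle can be sketched as successive subdivision. The sketch below simplifies the four-color pattern to a 2×2 quadrant code per round, and `observe` is an assumed callable standing in for the capture-and-decode step:

```python
def refine(region, observe, depth=3):
    """Coarse-to-fine localization: each round redraws the pattern inside the
    current region, splits it into a 2x2 grid, and keeps the quadrant whose
    code the sensor observed. observe(x, y, w, h) is assumed to return the
    quadrant index 0..3 that the camera sees."""
    x, y, w, h = region
    for _ in range(depth):
        q = observe(x, y, w, h)
        w, h = w / 2, h / 2
        x += (q % 2) * w
        y += (q // 2) * h
    return x, y, w, h
```

Each round quarters the candidate area, so a few rapid rounds narrow a full screen down to a small region, matching the FIG. 3C to FIG. 3E sequence.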
Once the region has been determined by the above image sensor positioning methods, the computer must set a number of targets in that region for the image sensor to capture for calibration, and this patent proposes two methods of setting the feature targets. In the first method, shown in FIG. 4, the computer processes the displayed content in the determined image acquisition region and selects several feature targets from the displayed content using features such as color, edges, corner points, orientation, and surrounding-context information, recording their feature information. When suitable feature targets are difficult or impossible to select in the region, as for a blank or uniformly colored region such as that shown in FIG. 1, the second method is used: the computer dynamically generates several feature targets in the region. FIG. 5A and FIG. 5B illustrate this method. To make the targets easy to identify and extract, they can be given distinctive colors and shapes, and this patent proposes the following techniques for generating easily extractable targets. First, the color of a dynamically generated target can be chosen as a color absent from the image region: as shown in FIG. 5A, all display points of the region are plotted by their color coordinates in the RGB color space, where region 50 is the area in which the image's color points are concentrated; when choosing the target color, a blank area of the color space far from the existing color points, such as area 52, is selected as the target color, so that the feature targets can easily be extracted from the images captured by a color image sensor. Second, the shape of the target can be chosen as an upright or diagonal cross, or any other easily recognized form; FIG. 5B shows four cross-shaped targets 12 generated in a square arrangement, and target patterns of any shape that meets the requirements can of course be generated. In addition, the generated target color can be varied continuously over a short time to further improve the speed and accuracy of target extraction. In some cases it may also be necessary to give the feature targets different colors or shapes so that the direction order of the targets can be determined.
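The color-selection rule of FIG. 5A, choosing a target color far from the colors already present in the region, can be sketched as a farthest-point search over a coarse lattice of the RGB cube. The lattice and the Euclidean metric are illustrative choices, not fixed by the patent:

```python
import numpy as np

def pick_target_color(region_pixels, candidates=None):
    """Choose a target color far from every color present in the capture region.

    region_pixels: (N, 3) array of RGB values found in the region. Returns the
    candidate color maximizing the minimum distance to the region's colors.
    """
    if candidates is None:
        axis = np.linspace(0, 255, 8)
        candidates = np.stack(np.meshgrid(axis, axis, axis), -1).reshape(-1, 3)
    pix = np.asarray(region_pixels, float)
    # distance from every candidate color to every color in the region
    d = np.linalg.norm(candidates[:, None, :] - pix[None, :, :], axis=2)
    return candidates[d.min(axis=1).argmax()]
```

For a region containing only dark pixels, the rule selects a bright color at the opposite corner of the RGB cube, which is exactly the kind of empty area 52 the figure illustrates.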
In summary, the first embodiment of the visual pointing mouse using monocular camera calibration comprises the following steps. The monocular image sensor, connected to the computer by wire or wirelessly, is pointed at an arbitrary position on the display screen and started. The position on the display screen at which the image sensor points is determined: the computer outputs, for a very short time, a positioning coded pattern on the display screen; the image sensor captures an image of the region it faces; the pattern code contained in the local-region image is extracted and compared with the spatial-position lookup table of the positioning coded pattern; and the approximate position at which the image sensor points is thereby determined. Feature targets are determined in the located pointing region: the computer either selects, according to the displayed content, several feature targets distributed within the image acquisition region of the image sensor, or generates several such feature targets. The image sensor captures an image of the pointing region containing the feature targets, and the image coordinates of the targets are extracted according to their features. The monocular camera imaging parameters are solved from the image coordinates of the captured targets and their computer display coordinates, according to the monocular camera calibration technique. The line through a fixed point on the imaging plane of the image sensor and the lens center is taken as the virtual pointing axis, and from the image coordinates of that fixed point, i.e. the intersection of the virtual pointing axis with the imaging plane, the solved imaging parameters are used to compute the display coordinates of the corresponding intersection on the computer display screen. The computer displays the mouse cursor or another image object at the pointing point of the virtual pointing axis on the display screen, re-determines the pointing region centered on the displayed cursor, and re-determines the feature targets within it. These steps are repeated at very short intervals as the image sensor moves, so that the mouse cursor or other image object follows the pointing point of the virtual pointing axis on the display screen.
Based on the above steps, the workflow of the first embodiment is designed as shown in FIG. 6 and comprises the following stages. After the pointing input device is started, in step 600 the computer central processor is notified to send the positioning coded pattern to the screen. In step 602, the image sensor of the input device captures an image, from which the image processing circuit extracts the image code. In step 604, it is determined whether the image code matches the code lookup table; if the approximate position corresponding to the image sensor is found, the flow proceeds to the next step, otherwise steps 600 and 602 are repeated. In step 606a, the computer is notified to find or generate feature targets in this region. In step 608, the image sensor captures an image, and the coordinates of each target are extracted from it according to the target features. In step 610, it is checked whether the target coordinates were correctly extracted; if not, step 608 is repeated. In step 612, the external parameters of the imaging model are computed from the display coordinates of the targets and the extracted image coordinates. In step 614a, the computed external parameters are used to compute, from the coordinates of the central image point, the display coordinates of the corresponding intersection of the optical axis with the display screen. In step 616, the computer displays the mouse cursor or another object at the coordinates of the optical-axis pointing point. In step 618, the feature targets are reset and displayed in the region centered on the optical-axis pointing point. In step 620, it is determined whether the input device is still working; if so, the same processing continues on the successive captured images, otherwise the system stops. This workflow ensures that the cursor accurately follows the pointing point of the image sensor's optical axis.
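The FIG. 6 workflow can be condensed into the following loop sketch, in which every argument is an assumed callable standing in for the corresponding hardware or software stage; none of these interfaces are defined by the patent:

```python
def tracking_loop(locate, set_targets, capture, extract, calibrate, intersect, show, frames):
    """Sketch of the FIG. 6 workflow (steps 600-620), with assumed stubs:
    locate() -> coarse region (steps 600-604);
    set_targets(region) -> target display coords (steps 606a / 618);
    capture() -> image; extract(image) -> target image coords or None (608-610);
    calibrate(display_coords, image_coords) -> external params (612);
    intersect(params) -> display point of the optical axis (614a);
    show(point) displays the cursor there (616)."""
    region = locate()
    for _ in range(frames):                    # step 620 loop over captured frames
        display_coords = set_targets(region)
        image_coords = extract(capture())
        if image_coords is None:               # extraction failed: retry next frame
            continue
        params = calibrate(display_coords, image_coords)
        point = intersect(params)
        show(point)
        region = point                         # re-center target region on cursor
    return region
```

Wiring the earlier sketches (code lookup, calibration, intersection computation) into these slots gives one possible end-to-end realization.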
FIG. 7 is a block diagram of the basic system of the visual pointing mouse input system using monocular camera calibration proposed in this patent; 10, 30, and 18 are functional block diagrams of the three main devices of the system: the visual pointing input device, the computer host, and the display screen, whose detailed internals are not shown. The complete visual pointing input device comprises the external input device 10 operated by the user for pointing and an information receiving and processing unit 702 installed in the computer host. The external input device 10 for visual pointing in turn comprises three main functional components: the image sensor 100, the control component 102, and the processing circuit 104; the processing circuit 104 contains a communication module 704, an image processing module 706, a control module 708, and other functional modules. The information receiving and processing unit 702 is responsible for communication with the input device 10 and the computer operating system 700, and is installed in the computer host in software or hardware form. As shown in FIG. 7, when the input device 10 starts working, the communication module 704 activates the information receiving and processing unit 702, which interacts with the host operating system program 700, telling it where in the displayed image content, and in what manner, to generate the feature targets, while the feature information and coordinate information of the targets are passed back to the input device. The image processing module 706 of the input device extracts the targets from the images captured by the image sensor 100, performs the computations, and passes the solved coordinates of the pointing point through the communication module 704 to the information receiving and processing unit 702, which notifies the computer operating system 700 to display an image object such as the mouse cursor at the pointing point. Control signals generated by the control component 102 driving the control module circuit 708 are likewise passed by the communication module 704 to the information receiving and processing unit 702, which notifies the computer operating system to perform the corresponding control operations. In practice, computational functions such as image information processing can be placed flexibly in the processing circuit 104 or in the information receiving and processing unit 702 installed in the computer.
As shown in FIG. 1, a glove-type visual pointing input device was designed according to the visual pointing input technique using monocular camera calibration proposed in this patent. FIG. 8A to FIG. 8C show how its parts are worn and used, and FIG. 8D to FIG. 8F show the basic structure of each part, described in further detail below with reference to the drawings. As shown in FIG. 8A, the glove-type visual pointing input device mainly comprises the image sensor pointing component 100, the main control component 102, the processing circuit 104, and the auxiliary control component 106. These components are fixed on a glove made of a flexible material such as nylon or plastic; openings can be made at the joint-bending parts of the glove, and the components are arranged at suitable positions on the fingers so as not to hinder finger bending, preserving the convenience of hand operations such as typing. The image sensor pointing component 100 is used to point at a target such as the display screen; for the visual pointing input technique using monocular camera calibration, only one image sensor is used, whose basic structure is shown in FIG. 8D: the image sensor 80 is packaged in a mechanical sheath and can be clamped or strapped onto a finger. The main control component 102 has function control keys on its side, so that mouse activation, left button, right button, page turning, and similar functions can conveniently be operated with the thumb; its operation is illustrated in FIG. 8B. The function control keys can take several forms: push keys 82a, touch keys 82b as shown in FIG. 8E, or a combination of the two, for example activating the input device when the thumb touches the function key and issuing the left-button function when the thumb presses down. The auxiliary control component 106 carries a squeeze switch or touch switch 84, as shown in FIG. 8F, triggered by bending the little finger; its operation is illustrated in FIG. 8C. The right-button and page-turning functions can be placed either on the main control component or on the auxiliary control component. The processing circuit 104 can be arranged flexibly on the back of the hand or elsewhere; it contains an information processing module implemented with digital processing chips such as DSPs or FPGAs for image processing and data computation, a wired or wireless communication module for communicating with the computer, and data memory. Alternatively, computational functions such as image information processing can be placed in the information receiving and processing unit installed in the computer. The appearance of the above components can be designed flexibly, components can be added or removed as needed, and they can be operated in various possible ways.
A glove-type visual pointing input device has been described above; in fact, with the visual pointing input technique provided by this patent, a wide variety of application devices can be designed according to usage needs. For example, in the glove-type device described above, the processing circuit can be integrated into other components. Of course, with good design all components can also be integrated into a single finger sleeve worn on the index finger, as shown in FIG. 9A. Depending on the platform, devices of various forms such as pen-shaped or gun-shaped structures can also be designed; FIG. 9B shows a pen-type visual input device with all components assembled in a pen-shaped housing. In addition, such a visual pointing input device can be fixed on the head, so that cursor motion is driven by head movement. In short, provided the main functional components are retained, visual pointing input devices of all kinds of structures, used in all kinds of ways, can be designed as needed.
[Embodiment 2]
FIG. 10 illustrates a second embodiment of the visual pointing mouse using monocular camera calibration. The second embodiment is similar to the first and likewise concerns a visual pointing input method that makes a graphic object such as the mouse cursor display accurately at the intersection of the virtual pointing axis of an image sensor with the computer display screen. A dynamic display device such as the screen still serves as the area where the targets are set, but the second method does not locate the pointing position of the image sensor; instead, the computer directly sets a feature target pattern in some region of the display screen, fixing the target area, and the operator actively points the image sensor, such as a camera, at that area, so that the target image is captured and the mouse cursor positioned. In the second method, the target area still moves to follow the pointing of the image sensor, so that it always remains within the sensor's imaged region.
Compared with the first embodiment, the second embodiment merely omits the image-sensor localization step. It therefore comprises the following steps: start the image sensor, which is connected to the computer by wire or wirelessly; determine characteristic target points on the display: in some region of the screen, the computer either selects several characteristic target points from the displayed content, or generates several such points, distributed within the sensor's image-capture area; point the monocular image sensor at the target-point region on the screen, capture an image of the pointed-at region containing the target points, and extract the target points' image coordinates from the image using their features; using the captured image coordinates of the target points together with their computer display coordinates, solve for the monocular camera's imaging parameters by monocular camera calibration; take the line through a fixed point on the sensor's imaging plane and the lens center as the virtual indicator axis, and from that fixed point's image coordinates (i.e., the image coordinates of the intersection of the virtual indicator axis with the imaging plane), compute with the solved imaging parameters the display coordinates of the corresponding intersection on the computer display; have the computer display the mouse cursor or other image target at the point on the screen where the virtual indicator axis points, re-center the pointed-at screen region on the displayed cursor, and re-determine the characteristic target points within it; at very short intervals, as the sensor's pointing moves, repeat these steps so that the mouse cursor or other image target follows the pointing position of the virtual indicator axis on the screen.
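The calibration and intersection steps above can be sketched numerically. Because every target point lies in the display plane, the monocular imaging model for one frame reduces to a 3×3 homography between display coordinates and image coordinates; with at least four point correspondences it can be solved by direct linear transformation (DLT), and mapping the fixed image point (e.g., the principal point) through the inverse homography gives the pointing position on the screen. The sketch below is illustrative only: the function names are our own, NumPy is assumed, and lens distortion is ignored.

```python
import numpy as np

def solve_homography(display_pts, image_pts):
    """Estimate H (display plane -> image plane) by DLT; needs >= 4 correspondences."""
    rows = []
    for (X, Y), (u, v) in zip(display_pts, image_pts):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    # The homography is the null vector of the stacked system, i.e. the
    # right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def axis_intersection(h, fixed_pixel):
    """Map a fixed image point (e.g. the principal point) through the inverse
    homography to obtain the pointing position in display coordinates."""
    uvw = np.linalg.solve(h, np.array([fixed_pixel[0], fixed_pixel[1], 1.0]))
    return uvw[:2] / uvw[2]
```

In use, `display_pts` are the target points' known display coordinates, `image_pts` the coordinates extracted from the captured frame, and the returned intersection is where the cursor is drawn.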
The workflow of the second embodiment designed from these steps is shown in Figure 10 and comprises the following stages: after the pointing input device starts, proceed directly to step 606b and instruct the computer to find or generate characteristic target points in this region; in step 608, capture an image with the image sensor and extract each target point's coordinates from it; in step 610, check whether all target-point coordinates were correctly extracted, and if not repeat step 608; in step 612, compute the imaging parameters of the imaging model from the target points' display coordinates and extracted image coordinates; in step 614a, use the computed imaging parameters to calculate, from the coordinates of the center image point, the display coordinates of the intersection of the optical axis with the screen; in step 616, have the computer display the mouse cursor or other target at the optical-axis pointing position; in step 618, reset the characteristic target points and display them in a region centered on the optical-axis pointing position; in step 620, determine whether the input device is still operating; if so, apply the same processing to the subsequent captured images, otherwise terminate. This workflow ensures that the cursor accurately follows the pointing position of the image sensor's optical axis.
[Embodiment 3]
Figures 11A to 11C depict a third embodiment of the visual pointing mouse based on monocular camera calibration. The third embodiment concerns a visual pointing input method in which a graphical object such as the mouse cursor accurately follows the movement of an image sensor's virtual indicator axis over a target region. This method uses a fixed planar target of determined size, in which a planar target coordinate system is defined. The target region contains several characteristic target points with distinctive color, shape, or similar features that make them easy to extract from an image, and the coordinates of the target points in the target coordinate system are known. In operation, the image sensor is pointed at the region and a target image is captured; monocular camera calibration is first used to locate, in the target coordinate system, the point where the sensor's virtual indicator axis meets the target region, and the corresponding mouse cursor display coordinates are then obtained from the unit-scale relationship between the target coordinate system and the display coordinate system.
The main feature of the third embodiment is that several characteristic target points are set in advance in a fixed target region, with known coordinates within that region. As shown in Figures 11A to 11C, the region may be a fixed local area of the computer display (Figure 11A), the display's bezel (Figure 11B), or a target surface around the display (Figure 11C). The target points themselves can take various forms: display points of a specific color and shape generated by the computer on the screen, or physical point devices mounted on an object such as the display bezel, e.g., LED light sources or colored reflective patches.
The third embodiment comprises the following steps: select a fixed target region of determined size as the pointed-at region for image capture, and set several characteristic target points within it, with known coordinates in the region; point the monocular image sensor at the target region, connect the sensor to the host computer by wire or wirelessly, and start the sensor; capture an image of the pointed-at region containing the characteristic target points, and extract the target points' image coordinates from the image using their features; using the captured image coordinates of the target points and their coordinates in the fixed region, solve for the monocular camera's imaging parameters by monocular camera calibration; take the line through a fixed point on the sensor's imaging plane and the lens center as the virtual indicator axis, and from that fixed point's image coordinates (i.e., the image coordinates of the intersection of the virtual indicator axis with the imaging plane), compute with the solved imaging parameters the coordinates of the corresponding intersection within the target region; multiply the computed coordinates of the indicator-axis intersection in the target coordinate system by a scale factor, obtained by dividing the actual size of the display by the actual size of the corresponding target region, to obtain the cursor's display coordinates; have the computer display the mouse cursor or other image target at the computed display coordinates; at very short intervals, as the sensor's pointing moves, repeat these steps so that the mouse cursor or other image target follows the pointing of the virtual indicator axis over the target region.
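The unit-conversion step admits a one-line computation: a point expressed in target coordinates is scaled axis by axis by screen extent over target extent. A minimal sketch (hypothetical function name; it assumes the target region and the display share an origin and orientation):

```python
def target_to_display(point_xy, target_size, screen_size):
    """Convert an indicator-axis intersection from target coordinates to
    screen display coordinates.  The per-axis scale factor is the screen
    extent divided by the corresponding target extent, as described above."""
    sx = screen_size[0] / target_size[0]
    sy = screen_size[1] / target_size[1]
    return (point_xy[0] * sx, point_xy[1] * sy)
```

For example, on a 400 mm × 225 mm bezel target driving a 1920 × 1080 pixel screen, the target-plane point (200, 112.5) maps to the screen center.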
Figure 12 is the workflow of the third embodiment, comprising the following stages: after the pointing input device starts, in step 606c, aim the image sensor at the preset fixed target region; in step 608, capture an image with the sensor and extract each target point's coordinates from it; in step 610, check whether all target-point coordinates were correctly extracted, and if not repeat step 608; in step 612, compute the imaging parameters of the imaging model from the target points' display coordinates and extracted image coordinates; in step 614b, use the computed imaging parameters to calculate, from the coordinates of the center image point, the coordinates of the intersection of the corresponding indicator axis with the target region; in step 622, multiply the computed intersection coordinates by the scale factor to obtain the cursor's display coordinates; in step 616, have the computer display the mouse cursor or other target at the optical-axis pointing position; in step 620, determine whether the input device is still operating; if so, apply the same processing to the subsequent captured images, otherwise terminate. This workflow ensures that the cursor accurately follows the pointing position of the image sensor's optical axis.
[Embodiment 4]
Figure 13 depicts a fourth embodiment of the visual pointing mouse based on monocular camera calibration. The fourth embodiment is a visual input method for spatial motion attitude realized with monocular camera calibration. Using an image sensor whose internal physical parameters, such as focal length and horizontal and vertical pixel pitch, have been calibrated in advance, it continuously captures images of fixed characteristic target points, with known mutual distances, on the display or in another preset target region, and extracts the target points' image coordinates. After the external imaging parameters are solved according to monocular camera calibration theory, the external physical quantities that determine them are further obtained: the three-axis rotation angles (α, β, γ) and the origin coordinates (X₀, Y₀, Z₀) of the world coordinate system of the display in the sensor's camera coordinate system, from which the sensor's spatial orientation relative to the display can be determined. Processing the sequence of images captured by the sensor thus yields the sensor's spatial attitude and motion trajectory relative to the display, and this extracted spatial motion can be used to carry out particular operations.
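How the three-axis rotation angles (α, β, γ) fall out of the solved external parameters can be illustrated for one common convention. If the external rotation is composed as R = Rz(α)·Ry(β)·Rx(γ), the angles are recovered from individual matrix entries as below. This is a sketch, not the patent's prescribed decomposition: the axis order and sign convention are our assumptions, and the formulas degenerate near β = ±90° (gimbal lock).

```python
import numpy as np

def rot_x(g):
    c, s = np.cos(g), np.sin(g)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def euler_zyx(R):
    """Recover (alpha, beta, gamma) from R = rot_z(alpha) @ rot_y(beta) @ rot_x(gamma).

    Uses R[2,0] = -sin(beta), R[1,0]/R[0,0] = tan(alpha), R[2,1]/R[2,2] = tan(gamma)."""
    beta = np.arcsin(-R[2, 0])
    alpha = np.arctan2(R[1, 0], R[0, 0])
    gamma = np.arctan2(R[2, 1], R[2, 2])
    return alpha, beta, gamma
```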
In this mode of application, multiple visual input devices can also be used simultaneously on multiple moving objects, solving for the spatial orientation of each imaging device at the same time, so that an overall motion composed of several moving parts can be assembled. For example, a small camera can be placed on each finger and aimed at the same group of target points on the screen while the hand moves; from the solved spatial attitude and motion trajectory of each imaging device, hand activity such as grasping, rotation, or translation can be recognized and the corresponding control operations performed.
As shown in Figure 13, the input device initially sits at position 90, where it captures an image of the characteristic target points 12 in a fixed region of the display; from the target points' image coordinates, display coordinates, and the known internal parameters, the values of the external imaging parameters are computed, and from them the device's orientation relative to the target points at that position, i.e., the quantities (α, β, γ; X₀, Y₀, Z₀). When the input device moves to position 92, the same operation is performed to compute its orientation parameters there, and the device's spatial attitude and motion trajectory are obtained from the difference between the orientation relationships at the two positions.
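The "difference between the orientation relationships" at positions 90 and 92 can be made concrete with rotation matrices. Writing pose i as x_cam = R_i·X_world + t_i (a common convention; the patent text does not fix one), the rigid motion taking camera frame 1 to camera frame 2 is R_rel = R₂R₁ᵀ with t_rel = t₂ − R_rel·t₁. A minimal sketch under that assumed convention, with our own function name:

```python
import numpy as np

def relative_motion(R1, t1, R2, t2):
    """Rigid motion mapping camera-frame-1 coordinates to camera-frame-2
    coordinates, given two poses x_i = R_i @ X + t_i of the same world
    (screen) coordinate system."""
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    return R_rel, t_rel
```

Accumulating these relative motions over the captured image sequence gives the trajectory described in the text.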
The fourth embodiment comprises the following steps: select a fixed target region as the pointed-at region for image capture, and set several characteristic target points within it, with known coordinates in the region; point a monocular image sensor whose internal physical parameters, such as focal length and pixel pitch, have already been calibrated at the target-point region, connect the sensor to the computer by wire or wirelessly, start the sensor, capture an image of the pointed-at region containing the characteristic target points, and extract the target points' image coordinates from the image using their features; using the captured image coordinates of the target points and their coordinates in the fixed region, solve for the monocular camera's imaging parameters by monocular camera calibration; then, using the computed imaging parameters of the imaging model, determine the sensor's spatial orientation coordinates in the coordinate system of the characteristic target points; at very short intervals, as the sensor's pointing moves, repeat these steps to determine the sensor's spatial orientation coordinates, in the target points' coordinate system, at its different positions; connecting the series of spatial orientation coordinates of the imaging device yields the sensor's spatial motion attitude relative to the display.
Figure 14 is the workflow of the fourth embodiment, comprising the following stages: after the pointing input device starts, in step 606c, aim the image sensor at the preset fixed target-point region; in step 608, capture an image with the sensor and extract each target point's coordinates from it; in step 610, check whether all target-point coordinates were correctly extracted, and if not repeat step 608; in step 612, compute the imaging parameters of the imaging model from the target points' display coordinates and extracted image coordinates; in step 624, use the computed imaging parameters to further determine the sensor's spatial orientation coordinates; in step 620, determine whether the input device is still operating; if so, apply the same processing to the subsequent captured images, otherwise terminate. This workflow yields the image sensor's spatial attitude and motion trajectory relative to the display.
The specific embodiments above serve only to illustrate this patent, not to limit it; all other changes and variations made according to the technical solution and technical conception of the present invention fall within the protection scope of the claims of this patent.

Claims

1. A visual pointing mouse input method using monocular camera calibration, the method comprising the following steps:
(i) pointing a monocular image sensor at a target and starting the image sensor;
(ii) capturing, with the image sensor, an image of the characteristic content contained in the pointed-at target region, and extracting the image coordinates of the characteristic target points from the image;
(iii) solving for the monocular camera imaging parameters, according to monocular camera calibration, from the captured image coordinates of the target points and the target points' coordinates in the target coordinate system;
(iv) computing, from the image coordinates of a fixed image point on the sensor's imaging plane and using the solved monocular camera imaging parameters, the coordinates in the target coordinate system of the object point corresponding to that image point, i.e., the coordinates of the pointing position in the target coordinate system;
(v) computing, from the pointing position's coordinates in the target coordinate system, the display coordinates of the mouse cursor or other image target on the display, and having the computer display the cursor on the display;
and repeating steps (ii) to (v) at very short intervals, so that the mouse cursor or other image target moves to follow the sensor's pointing.
2. A visual pointing mouse input method using monocular camera calibration, the method comprising the following steps:
(i) pointing a monocular image sensor at a target and starting the image sensor;
(ii) capturing, with the image sensor, an image of the characteristic content contained in the pointed-at target region, and extracting the image coordinates of the characteristic target points from the image;
(iii) solving for the monocular camera imaging parameters, according to monocular camera calibration, from the captured image coordinates of the target points and the target points' coordinates in the target coordinate system;
(iv) further determining, from the computed camera imaging parameters, the sensor's spatial orientation coordinates in the target coordinate system;
and repeating steps (ii) to (iv) at very short intervals as the sensor's pointing moves, determining the sensor's spatial orientation coordinates at its different positions; connecting the series of spatial orientation coordinates of the imaging device yields the sensor's spatial motion attitude relative to the display.
3. The method according to claim 1 or 2, wherein pointing the image sensor at a target means pointing the image sensor at a target that has been set in advance.
4. The method according to claim 1 or 2, wherein pointing the image sensor at a target means that, when the image sensor is pointed at the computer display, the computer first determines the approximate region of the display at which the sensor is pointing, and a target is then dynamically determined in the region of the display at which the sensor points.
5. The method according to claim 4, wherein determining the approximate region of the display at which the image sensor is pointing comprises the following steps:
(1) pointing the image sensor at some region of the display to capture an image, the sensor being connected to the computer by wired or wireless communication, and the display being connected to the computer;
(2) starting the image sensor and instructing the computer to output on the display, within a very short time, a coding pattern composed of an arrangement of characteristic blocks of several different colors or graphic contents, each color or graphic content being assigned a different number, such that the code formed by the characteristic blocks within a certain range around each block is unique over the whole coded pattern, and the region codes of all characteristic blocks of the whole pattern form a position lookup table;
(3) capturing, with the image sensor, the coded image of the pointed-at region, extracting from it the code of the local pattern, comparing it against the spatial-position lookup table of the coded pattern, and determining the approximate position of the display at which the sensor is pointing.
6. The method according to claim 5, wherein the characteristic-block coding pattern means: outputting on the display a coding pattern composed of an arrangement of rectangular characteristic blocks of several different colors or graphic contents, each color or graphic content being assigned a different number, such that the code formed by all characteristic blocks within an n×n range around each rectangular block is unique over the whole coded pattern, and the region codes of all characteristic blocks of the whole pattern form a position lookup table.
7. The method according to claim 4, wherein determining the approximate region of the display at which the image sensor is pointing comprises the following steps:
(1) pointing the image sensor at some region of the display to capture an image, the sensor being connected to the computer by wired or wireless communication, and the display being connected to the computer;
(2) starting the image sensor; first outputting on the display a coarse-resolution coding pattern composed of an arrangement of characteristic blocks of several different colors or graphic contents, each color or graphic content being assigned a different number, the sensor capturing the image of the pointed-at region and determining which characteristic block it is pointing at;
(3) the computer then outputting the coding pattern again within the determined large characteristic block, over a range equal to that block's size, the sensor capturing the image of the pointed-at region and further determining the position of the smaller characteristic block it faces;
(4) rapidly iterating this operation from large to small, finally determining the position of the display at which the image sensor is pointing.
8. The method according to claim 1 or 2, wherein the target is a fixed planar target whose target region has a determined size, the region containing several characteristic target points with distinctive features such as color and shape that make them easy to extract from an image, the coordinates of the target points in the region's target coordinate system being known.
9. The method according to claim 8, wherein the fixed planar target means: selecting the display's bezel as the target region, the characteristic target points being set on the bezel, with the distances between the points known.
10. The method according to claim 8, wherein the fixed planar target means: selecting a plane of determined size around the display as the target region, the characteristic target points being set within that region, with the distances between the points known.
11. The method according to claim 8, wherein the fixed planar target means: the computer selecting some fixed local region on the display as the target region, its size being determined by the region's computer display coordinates, and the computer determining several feature points within the display target region as characteristic target points.
12. The method according to claim 1 or 2, wherein the target is a dynamic planar target, dynamically generated by the computer on the display, the position at which the target is generated always following the sensor's pointing, the extent of the target region being adjustable according to the sensor's imaging distance, its size being determined by the region's computer display coordinates, and the computer determining several feature points within the display target region as characteristic target points.
13. The method according to claim 11 or 12, wherein the computer determining characteristic target points in the display target region means: the computer processing the displayed content within a certain range containing the target region and, using features such as color, edges, corners, orientation, and surrounding-context information, selecting several characteristic target points from the computer's displayed content, these target points delimiting the extent of the target region, and their feature information being recorded.
14. The method according to claim 11 or 12, wherein the computer determining characteristic target points in the display target region comprises the following steps:
(1) the computer collecting statistics on the colors of the displayed content within a certain range containing the target region, and selecting, as the color of the generated characteristic target points, a color that is absent from the displayed content and differs strongly from the colors present;
(2) the computer additionally generating certain display content in the selected color within the target region on the display, the generated content containing features such as intersections, corners, and center points, from which several characteristic target points can be selected, these target points delimiting the extent of the target region.
15. The method according to claim 8, wherein the several characteristic target points means: for methods using an image sensor whose internal physical parameters, such as focal length and pixel pitch, have not been calibrated in advance, at least 4 characteristic target points need to be determined on the display.
16. The method according to claim 8, wherein the several characteristic target points means: for methods using an image sensor whose internal physical parameters, such as focal length and pixel pitch, have already been calibrated in advance, at least 3 characteristic target points need to be determined on the display.
17. The method according to claim 1 or 2, wherein solving for the monocular camera imaging parameters means: using the captured image coordinates of the target points and the target points' target-coordinate-system coordinates to solve, according to monocular camera calibration, for the external imaging parameters of the sensor's monocular camera.
18. The method according to claim 1 or 2, wherein solving for the monocular camera imaging parameters means: using the captured image coordinates of the target points and the target points' target-coordinate-system coordinates to solve, according to monocular camera calibration, for the internal imaging parameters and external imaging parameters of the sensor's monocular camera.
19. The method according to claim 1, wherein a fixed image point on the imaging plane of the image sensor means: the image point may be any image point on the imaging plane; the line connecting this image point and the optical center of the imaging lens forms a virtual pointing axis, and the object point corresponding to this image point is the pointing point of the virtual pointing axis.
20. The method according to claim 1, wherein a fixed image point on the imaging plane of the image sensor means: the image point may be the central image point of the imaging plane; the line connecting this image point and the optical center of the imaging lens, i.e. the optical axis of the imaging system, forms a virtual pointing axis, and the object point corresponding to this central image point is the pointing point of the virtual pointing axis.
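Once the pose is solved, the pointing point of claims 19 and 20 is the intersection of the ray from the optical center through the fixed image point with the target plane. A sketch, assuming the pose maps target coordinates to camera coordinates as x_cam = R·X + t and the screen lies in the plane Z = 0 of the target frame (the names and conventions here are mine, not the patent's):

```python
def pointing_point(R, t, fx, fy, cx, cy, u, v):
    """Intersect the ray through image point (u, v) with the target
    plane Z = 0.  For the central image point of claim 20, use
    u = cx, v = cy, which makes the ray the optical axis."""
    # ray direction through (u, v) in the camera frame
    d_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    # rotate into the target frame: d = R^T d_cam
    d = [sum(R[r][i] * d_cam[r] for r in range(3)) for i in range(3)]
    # optical center in the target frame: C = -R^T t
    C = [-sum(R[r][i] * t[r] for r in range(3)) for i in range(3)]
    s = -C[2] / d[2]            # ray parameter where Z reaches 0
    return (C[0] + s * d[0], C[1] + s * d[1])
```

The returned pair is the pointing point in target coordinates; claim 21 then converts it to display coordinates.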
21. The method according to claim 1, wherein calculating the display coordinates of the mouse cursor from the coordinates of the pointing point in the target coordinate system means: when the unit length of the target coordinate system is the same as the pixel pitch of the computer display screen, the already-computed coordinates of the target point in the target coordinate system are themselves the display coordinates; when the unit length of the target coordinate system differs from the pixel pitch of the computer display screen, the computed coordinates of the target point in the target coordinate system must be multiplied by a scale factor to obtain the display coordinates, the scale factor being obtained by dividing the pixel pitch of the display screen by the unit length of the target coordinate system.
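The unit conversion in claim 21 is a single multiplication. One caveat: converting a coordinate measured in target-coordinate units into display pixels dimensionally requires the ratio (target unit length ÷ pixel pitch), whereas the claim text states the inverse ratio; the factor below therefore reflects my dimensional reading rather than the claim's literal wording, and the function name is mine:

```python
def display_coords(target_xy, target_unit_mm, pixel_pitch_mm):
    """Scale a pointing point from target-coordinate units to display
    pixels.  When the two unit sizes coincide, the coordinates pass
    through unchanged, exactly as the claim states."""
    if target_unit_mm == pixel_pitch_mm:
        return target_xy
    k = target_unit_mm / pixel_pitch_mm  # assumed units-to-pixels factor
    return (target_xy[0] * k, target_xy[1] * k)
```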
22. The method according to claim 2, wherein the spatial orientation coordinates of the image sensor in the target coordinate system means: the spatial orientation coordinates comprise the three-axis rotation angles (α, β, γ) and the origin coordinates (X₀, Y₀, Z₀).
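The orientation coordinates of claim 22 can be read off a solved pose. A sketch assuming the Z-Y-X (yaw-pitch-roll) Euler convention — the patent does not fix a convention, so this choice, like the names, is mine:

```python
import math

def orientation_from_pose(R, t):
    """Recover the three-axis rotation angles (alpha, beta, gamma) under
    R = Rz(alpha) Ry(beta) Rx(gamma), plus the sensor origin in the
    target frame, from a pose x_cam = R @ X + t."""
    beta = math.asin(-R[2][0])               # rotation about Y
    alpha = math.atan2(R[1][0], R[0][0])     # rotation about Z
    gamma = math.atan2(R[2][1], R[2][2])     # rotation about X
    # origin of the sensor in the target frame: C = -R^T t
    origin = [-sum(R[r][i] * t[r] for r in range(3)) for i in range(3)]
    return (alpha, beta, gamma), origin
```

The `asin`/`atan2` formulas above are only valid away from gimbal lock (β = ±90°), which a practical implementation would handle separately.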
23. A vision-pointing mouse input system using monocular camera calibration techniques, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a monocular image sensor, used to capture images of the feature target area, connected to the computer host through a processing circuit;
(iv) control function components, which generate the required control function signals and are connected to the computer host through the processing circuit;
(v) a processing circuit, which implements image-information processing and communication functions, is connected to the image sensor and the control function components, and is connected to the computer host by wired or wireless means;
(vi) an information receiving and processing device, installed in the computer host, connected to the processing circuit by wired or wireless means, and communicating and interacting with the computer operating system.
24. The system according to claim 23, wherein the processing circuit comprises an image acquisition module, a data processing module, a control signal module and a data interface module, and has the following functions: receiving the image information transmitted by the image sensor and the control function signals generated by the control function components; processing the captured images to accomplish functions such as pointing localization of the image sensor, feature target point extraction, computation of the monocular camera imaging parameters, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left button, right button, page turning and scrolling; and communicating and interacting with the information receiving and processing device by wired or wireless means, transferring information such as images, feature information, computation results and control function signals.
25. The system according to claim 23, wherein the information receiving and processing device comprises a data interface module, a data processing module and a data communication module, and has the following functions: receiving information such as images and computation results sent by the processing circuit; receiving control function signals generated by the processing circuit, such as system trigger, left button, right button, page turning and scrolling; sending the feature information and coordinate information of the target points to the processing circuit; and outputting the computed cursor coordinate information to the computer operating system.
26. A vision-pointing mouse input system using monocular camera calibration techniques, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a monocular image sensor, used to capture images of the feature target area, connected to the computer host through the information receiving and processing device;
(iv) control function components, which generate the required control function signals and are connected to the computer host through the information receiving and processing device;
(v) an information receiving and processing device, installed in the computer host, which implements image-information processing and communication functions and communicates and interacts with the computer operating system.
27. 如权利要求 23、 26所述的系统, 其特征在于, 所述控制功能部件: 布置若干控制功能按 键, 用于产生系统触发、 左键、 右键、 翻页、 滚动等控制功能信号。 27. The system according to claim 23, 26, wherein the control function component: arranges a plurality of control function buttons for generating control function signals such as system trigger, left button, right button, page turning, scrolling, and the like.
28. The system according to claim 26, wherein the information receiving and processing device comprises an image acquisition module, a data processing module, a control signal module, a data interface module and a data communication module, and has the following functions: receiving the image information transmitted by the image sensor and the control function signals generated by the control function components; processing the captured images to accomplish functions such as pointing localization of the image sensor, feature target point extraction, parameter computation of the monocular camera imaging model, and computation of the display coordinates of the cursor pointing point; generating control function signals such as system trigger, left button, right button, page turning and scrolling; and notifying the computer operating system to display the mouse cursor or another image object at the display coordinates on the display screen.
29. A glove-type vision-pointing mouse input device, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a pointing finger cot fitted with a monocular image sensor, used to point at the target area and capture images, connected to the computer host through a processing circuit;
(iv) a control-function-key finger cot, comprising several buttons, touch keys or pressure switches, used to generate the required control function signals, connected to the computer host through the processing circuit;
(v) an auxiliary-function-key finger cot, whose switch control is triggered by the finger itself when the finger is bent, connected to the computer host through the processing circuit; this auxiliary-function-key finger cot is selected according to usage;
(vi) a processing circuit, which implements image-information processing and communication functions, is connected to the image sensor and the control function components, and is connected to the computer host by wired or wireless means;
(vii) an information receiving and processing device, installed in the computer host, connected to the processing circuit by wired or wireless means, and communicating and interacting with the computer operating system.
30. A glove-type vision-pointing mouse input device, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a pointing finger cot fitted with a monocular image sensor, used to point at the target area and capture images, connected to the computer host through a processing circuit;
(iv) a control-function-key finger cot, comprising several buttons, touch keys or pressure switches, used to generate the required control function signals, connected to the computer host through the processing circuit;
(v) an auxiliary-function-key finger cot, whose switch control is triggered by the finger itself when the finger is bent, connected to the computer host; this auxiliary-function-key finger cot is selected according to usage;
(vi) an information receiving and processing device, installed in the computer host, which implements image-information processing and communication functions and communicates and interacts with the computer operating system.
31. A finger-cot-type vision-pointing mouse input device, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a vision-pointing mouse input finger cot, which integrates the monocular image sensor, the processing circuit and the control function keys on one finger cot, can be worn on a finger, and is connected to the computer host by wired or wireless means;
(iv) an information receiving and processing device, installed in the computer host, which implements image-information processing and communication functions and communicates and interacts with the computer operating system.
32. A finger-cot-type vision-pointing mouse input device, comprising:
(i) a computer host, and a display screen connected to it;
(ii) a target, used to calibrate the camera imaging parameters of the image sensor;
(iii) a vision-pointing mouse input finger cot, which integrates the monocular image sensor and the control function keys on one finger cot, can be worn on a finger, and is connected to the computer host by wired or wireless means;
(iv) an information receiving and processing device, installed in the computer host, which implements image-information processing and communication functions and communicates and interacts with the computer operating system.
33. A vision-pointing mouse application program, residing in the computer host and communicating and interacting with the computer operating system and the vision-pointing mouse input system, comprising:
(i) an image receiving and processing routine, which receives the image information sent by the image sensor of the vision-pointing mouse input system;
(ii) an image sensor localization routine, which determines the position on the display screen at which the image sensor points;
(iii) a feature target point generation routine, which determines the required feature target points on the display screen;
(iv) a target point extraction routine, which extracts the image coordinates of the target points from the image according to the target point features;
(v) an imaging parameter computation routine, which solves for the monocular camera imaging parameters according to monocular camera calibration techniques;
(vi) a display coordinate computation routine, which computes the cursor display coordinates corresponding to the pointing point of the pointing axis;
(vii) a cursor display routine, which notifies the computer to display the mouse cursor or another image object on the display screen, re-determines the pointed-at region of the display screen centered on the displayed cursor, and re-determines the feature target points within that region;
(viii) control function routines, which generate control function signals such as system trigger, left button, right button, page turning and movement.
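Taken together, routines (iv) through (vi) amount to: observe the feature targets, fit the image-to-screen mapping, and push the fixed image point through it. A compact end-to-end sketch of the uncalibrated (4-target, homography) variant, with the hardware I/O replaced by fixed data and all names my own:

```python
def solve_linear(A, b):
    # Gaussian elimination with partial pivoting
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def cursor_position(image_pts, screen_pts, fixed_pt):
    """Routines (iv)-(vi) in one step: fit the homography taking the
    observed image coordinates of the 4 feature targets to their known
    display coordinates, then map the fixed image point (e.g. the image
    center) through it to obtain the cursor's display coordinates."""
    A, b = [], []
    for (x, y), (u, v) in zip(image_pts, screen_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b)
    x, y = fixed_pt
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)
```

A real implementation would wrap this in the capture/extract/display loop of routines (i), (iv) and (vii); this sketch only shows the geometric core.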
PCT/CN2010/001229 2010-08-13 2010-08-13 Input method, input system and input device of vision directing type mouse using monocular camera calibration technique WO2012019322A1 (en)

Publications (1)

Publication Number Publication Date
WO2012019322A1 2012-02-16
