CN114600067A - Supervisory setup of a control device with an imager

Info

Publication number
CN114600067A
Authority
CN
China
Prior art keywords
reference image
image
pointing
imager
computer
Prior art date
Legal status
Pending
Application number
CN202080070383.3A
Other languages
Chinese (zh)
Inventor
Julien Colafrancesco
Simon Tchedikian
Simon Guillot
Louis Xiao Wang
Current Assignee
7hugs Labs
Original Assignee
7hugs Labs
Priority date
Filing date
Publication date
Application filed by 7hugs Labs
Publication of CN114600067A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0014 Image feed-back for automatic industrial control, e.g. robot with camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C23/00 Non-electrical signal transmission systems, e.g. optical systems
    • G08C23/04 Non-electrical signal transmission systems, e.g. optical systems using light waves, e.g. infrared
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/63 Control of cameras or camera modules by using electronic viewfinders
    • H04N23/633 Control of cameras or camera modules by using electronic viewfinders for displaying additional information relating to control or operation of the camera
    • H04N23/635 Region indicators; Field of view indicators
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C2201/00 Transmission systems of control signals via wireless link
    • G08C2201/20 Binding and programming of remote control devices
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C2201/00 Transmission systems of control signals via wireless link
    • G08C2201/30 User interface
    • G08C2201/32 Remote control based on movements, attitude of remote control device
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C2201/00 Transmission systems of control signals via wireless link
    • G08C2201/70 Device selection

Abstract

Systems and methods related to determining that a pointing device is pointing at a target are disclosed. One embodiment of the invention is a method performed by a control device having a pointing direction and an onboard imager whose field of view includes the pointing direction. The method comprises the steps of: capturing a reference image, determining a feature quantity from the reference image, generating a feedback message from the feature quantity, and predicting termination of a setup phase of the control device from the feature quantity.

Description

Supervisory setup of a control device with an imager
Cross reference to related patent applications
This application claims the benefit of U.S. Provisional Patent Application No. 62/884,278, filed in August 2019.
Background
Consumers intuitively align a remote control with the device they want to interact with by pointing the remote control at the device. This behavior pattern stems from conventional remote controls sending a radio signal directly along their pointing direction. Aligning the pointing direction of the remote control with the device therefore gives the radio signal the best chance of reaching the intended target.
More recently, remote controls using technology such as that described in U.S. Patent No. 10,068,463 determine the target of a command through one system while the command itself is transmitted by other means. For example, a set of beacons may determine the pointing direction of the remote control, while a separate radio system indirectly transmits commands from the remote control to the controllable device. These approaches preserve the traditional user experience of pointing at a controllable device while extending the range of devices that can receive commands from the remote control. However, they require the system to recognize the location at which the pointing device is pointing so that commands can be sent to the appropriate device.
Different methods are available for identifying the position and orientation of an object. For example, the location and orientation of an object may be determined by geolocation, such as GPS. However, GPS is not accurate enough to determine a pointing direction precisely, especially in indoor environments, and cannot distinguish whether the pointing direction corresponds to one of two adjacent devices. Indoor positioning systems based on beacons, such as those mentioned above, are an alternative to GPS and use locally generated wireless signals to estimate the location of an object. However, the layout constraints of these beacons can make installation costly and complicated.
Disclosure of Invention
Systems and methods for determining that a pointing device is pointing at a target using an onboard imager located on the device are disclosed. The shape of the pointing device may define its natural pointing direction. For example, the device may be a rectangle with a distinct short side, where the contrast between the long and short sides naturally indicates the pointing direction of the object. Alternatively, a pointing device may otherwise be provided with a pointing direction. For example, the device may be a puck that points in the direction indicated by an arrow icon on its surface. More generally, any reference mark that enables a person to determine the direction in which a device is pointing may be considered to provide the device with a pointing direction, as that term is used in this disclosure. The onboard imager may be aligned with the pointing direction of the device. The onboard imager may be attached to the exterior of the pointing device or be an integral component of the pointing device. The onboard imager has a field of view, and the orientation of the onboard imager may be such that the field of view includes the pointing direction.
The systems of the present invention may comprise the pointing device itself, but may also include a support device (e.g., a base or charger for the pointing device) and a remote device, such as a server or cloud infrastructure in operative communication with the support device or the pointing device. In this disclosure, reference is made to non-transitory computer-readable media storing instructions that cause the disclosed systems to perform certain operations. In these embodiments, the computer-readable media may be located entirely within the pointing device, may be distributed across the support device, the remote device, and the pointing device, or may be located entirely on the support device and/or the remote device.
In particular embodiments of the invention, the pointing device may be a control device. The pointing device may be a remote control that selects a pointing target in the form of a controllable object or a communication object. Pointing the pointing device at a particular pointing target may form an association between the controllable object or communication object and the sending system. The association may then be used to send commands to, or receive communications from, the currently associated object. For example, if the object is a controllable object such as a television, a command obtained from the user on the pointing device may be sent to the controllable object while the association is maintained. As another example, if the object is a communication object such as a weather service on a remote server, communications obtained from the remote server are transmitted to the pointing device while the association is maintained. In this way, a user can receive communications from, and send commands to, various objects depending on where the pointing device is pointed at any given time.
The association formed by pointing the pointing device at a given target may also be accompanied by the display of controls for the currently associated object on a user interface. The user interface may be provided on the pointing device. For example, the pointing device may include a touch display, and when an association is formed, the controls for the currently associated controllable object may be displayed on the touch display. The touch display may show a channel and volume control interface when the user points the pointing device at a television, and a brightness control interface when the user points the device at a light fixture.
The above-mentioned associations may be predefined by a setup program. The system may be described as operating in a setup phase while the setup program executes. The setup procedure may involve associating a target area with a pointing target and defining a signature for the target area. The target area may be a particular volume of space within a physical location, such as a portion of a room near a television, or a particular surface such as a wall, ceiling, floor, or an interface thereof. The pointing target may be the center of the target area. For example, the pointing target may be a small device (e.g., a DVR or compact streaming media box), and the target area may be the cabinet and surrounding area where the small device is located. The setup program may associate the target area with the pointing target and further associate the pointing target with a controllable object or a communication object. The signature may then be used to identify the target area, invoke the pointing target associated with the target area, invoke the controllable object or communication object associated with that pointing target, and form an association between that object and the sending system that includes the pointing device. This phase of operation of the system may be referred to as an operational phase, and the various nodes of the system may be said to be in a deployed state during the operational phase.
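The chain of associations built by the setup program (target area to pointing target to controllable or communication object) can be pictured with a short sketch. This is a minimal illustration only; the class, function names, and string labels are assumptions made for the example and are not structures defined by this disclosure.

```python
from dataclasses import dataclass

@dataclass
class TargetArea:
    name: str                  # e.g. "corner with the A/V cabinet"
    signature: bytes           # opaque signature derived from the reference images
    pointing_target: str       # e.g. "cabinet"
    associated_object: str     # e.g. "receiver" or "weather-service"

registry: dict[str, TargetArea] = {}

def run_setup(name, signature, pointing_target, obj):
    """Setup phase: link a target area to a pointing target and an object."""
    registry[name] = TargetArea(name, signature, pointing_target, obj)

def resolve(area_name):
    """Operational phase: identified target area -> object to command."""
    return registry[area_name].associated_object

run_setup("corner-1", b"\x00", "cabinet", "receiver")
assert resolve("corner-1") == "receiver"
```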
In certain embodiments of the present invention, the system may determine the pointing target of the pointing device using images captured by the imager. The system uses these images to identify the signature of the target area. The pointing target may be the center of the target area. Fig. 1 shows an example of a user 100 pointing a pointing device 101 at an audio/video (A/V) apparatus 102, as an example of a user attempting to form an association between a controllable object and the pointing device. In this example, the imager is aligned with pointing device 101 and may capture an image of A/V apparatus 102. Fig. 1 shows an example of an image 103 in which a receiver 105, serving as A/V apparatus 102, is located within a cabinet 104. The system may then operate on image 103 to identify the signature of target area 106. In this case, the target area may include the entire corner of the room occupied by cabinet 104. Assuming that target area 106 has previously been associated with cabinet 104 (the pointing target) and cabinet 104 has previously been associated with receiver 105 (the controllable object), the control system may form an association with receiver 105 and send commands to receiver 105 in accordance with this determination. As mentioned above, the pointing target and the controllable device need not be identical. Rather, the user may associate a particular area in a given location with a particular controllable device (e.g., a window of a room may be associated with a weather service on a remote server). In the illustrated example, the imager may obtain an image 107 in which only cabinet 104 is visible, but the system can still determine that commands should be sent to receiver 105 while the device points at area 106.
Detecting the pointing target with which the pointing device should be associated at any given time can be difficult for a number of reasons. For example, it is difficult to detect a small pointing target, such as a compact streaming media box, in an image taken of a general area. Furthermore, a venue may include many identical copies of the same controllable device, such as a home automation assistant or multiple units of a home stereo system. It is nearly impossible for an image recognition system to distinguish between two such devices based only on identifying the devices themselves, because they are identical copies of the same design. In addition, homes and similar locations typically include pieces of furniture and other repeating design elements on walls, floors, and ceilings that appear identical to an image recognition system. Furthermore, due to variations in brightness, variations in the perspective of the pointing device, changes in the spatial arrangement of objects within an area, and the introduction or removal of objects, areas of a given site may appear very different at different times.
In certain embodiments of the present invention, placing emphasis on the target area rather than on the pointing target helps to mitigate many of the problems identified in the preceding paragraph. The target area provides additional information and signal from which to derive the signature. For example, this additional, stronger signal may prevent the system from confusing cabinet 108 in image 109 with the identical cabinet 104. This is because the target area includes additional information (in the form of which corner of the room the cabinet is located in) that would be hidden if only the features of the cabinet were used to generate the signature. Aspects of these embodiments that enhance the ability of the system to gather information about the target area, rather than information specific to the pointing target, are described in detail below.
In certain embodiments of the present invention, the problems identified in the preceding paragraph are mitigated by constructing a robust signature creation and detection system to identify the pointing target of the pointing device. In these embodiments, the signature creation process involves a collection of different images of the same target area. The different images acquired during the setup phase are referred to in this disclosure as reference images, since they provide a reference for later identification of the signature of the target area. The signature creation process, which is designed to ensure that the different images are highly varied in order to increase the strength of the signature, may be performed during a setup phase of the system. The images may be selected to vary in the viewpoint from which they were captured, the brightness conditions at the time of capture, the configuration of the target area itself during capture, and other factors. The variability of the images used to create the signature in the setup phase yields a signature that remains recognizable despite variability in the images at the time the system is deployed. For example, a system with a robust signature can still identify target area 106 when provided with image 110, even though cabinet 104 is partially covered by items 111, 112, and 113 that modify the appearance of the area. A robust signature has a sufficiently high signal-to-noise ratio that the noise introduced by the presence of these items does not cause the signature detection system to fail.
In particular embodiments of the invention, the user may be relied upon to capture the reference images used to generate the signatures of the target areas identified during operation of the pointing device. However, in practice, there is no guarantee that the reference images taken by the user are sufficient for the system to work properly when deployed. For example, a user may not take enough images, may take multiple identical images that add no useful information, may take disturbed images (e.g., occluded images in which an object briefly appears in the field of view), may take blurred images (e.g., the imager moved too fast while capturing a reference image), or may take images under extreme brightness conditions that do not suit the sensor (e.g., at night). Thus, in particular embodiments of the invention, supervision is provided to ensure that the set of reference images obtained during setup is sufficient to generate a strong, robust signature for the target area. The supervision may take the form of feedback to the user on the sufficiency of the reference images acquired so far, along with encouragement to obtain additional images.
In particular embodiments of the invention, the system can calculate a feature quantity for a set of reference images and predict the end of the setup phase when the feature quantity exceeds a sufficiency threshold. The feature quantity may be a quantity representing the amount of signal contained in the set of reference images from which the signature of the target area is generated. The feature quantity may be a quantity representing the signal-to-noise ratio of the set of reference images, where the signal represents the uniqueness of the target area signature that can be generated from the set. For example, the feature quantity may be a number calculated from the degree of statistical variation between images in the set of reference images. Many other examples are described in detail below. Alternatively or in combination, feedback may also be provided to the user during the setup phase to ensure the sufficiency of the set of reference images. For example, the feedback may encourage additional steps to complete the setup procedure, such as taking additional images. In particular embodiments of the present invention, the feedback may be generated based on the feature quantity. For example, an indication may be provided on the display of the pointing device showing where the user should move the imager in order to capture additional images with greater variation, further improving the feature quantity. Many other examples of such feedback are described in detail below.
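As a concrete illustration of this supervisory loop, the following sketch recomputes a simple feature quantity after each captured reference image and compares it with a sufficiency threshold. The metric, the threshold value, and the names capture_image and feature_quantity are assumptions made for the example, not quantities prescribed by this disclosure.

```python
import numpy as np

SUFFICIENCY_THRESHOLD = 0.35   # assumed value, chosen only for illustration

def feature_quantity(images):
    """Example metric: mean pairwise dissimilarity of same-sized grayscale images."""
    if len(images) < 2:
        return 0.0
    flat = [im.astype(np.float32).ravel() for im in images]
    diffs = [np.mean(np.abs(a - b)) / 255.0
             for i, a in enumerate(flat) for b in flat[i + 1:]]
    return float(np.mean(diffs))

def setup_phase(capture_image, max_images=20):
    references = []
    while len(references) < max_images:
        references.append(capture_image())       # reference image from the imager
        q = feature_quantity(references)
        if q >= SUFFICIENCY_THRESHOLD:
            return references                     # predicted end of the setup phase
        print(f"Feedback: capture another image from a new viewpoint (q={q:.2f})")
    return references
```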
In a particular embodiment of the present invention, a system is provided. The system includes a control device. The shape of the control device may define the pointing direction of the control device. The system also includes an onboard imager located on the control device. The field of view of the onboard imager includes the pointing direction. The system also includes one or more computer-readable media storing instructions executable to: receive a reference image; determine a feature quantity from the reference image; generate a feedback message from the feature quantity; and predict termination of a setup phase of the control device from the feature quantity.
In a particular embodiment of the present invention, a computer-implemented method is provided. The method is performed by a control device for which: (i) the shape of the control device defines its pointing direction; and (ii) the field of view of an onboard imager includes the pointing direction. The method includes capturing a reference image, determining a feature quantity from the reference image, generating a feedback message from the feature quantity, and predicting termination of a setup phase of the control device from the feature quantity.
In a particular embodiment of the present invention, a system is provided. The system includes a control device. The control device has a pointing direction. The system also includes an onboard imager located on the control device. The field of view of the onboard imager includes the pointing direction. The system also includes one or more computer-readable media storing instructions executable to: receive a set of reference images; determine a feature quantity from the set of reference images; and, according to the feature quantity: (i) generate a feedback message; or (ii) terminate a setup phase of the control device.
Drawings
FIG. 1 shows a user using a pointing device and a set of images captured by an imager on board the pointing device, in accordance with certain embodiments of the present disclosure.
FIG. 2 shows a system block diagram and accompanying flow chart of a computer-implemented method for performing this system setup phase, in accordance with certain embodiments of the present disclosure.
FIG. 3 illustrates a method for performing the operational phase of a system set up using the process described with reference to FIG. 2, in accordance with certain embodiments of the present disclosure.
Fig. 4 shows a user interface displayed on a pointing device according to certain embodiments of the present disclosure.
Detailed Description
Systems and methods for determining that a pointing device is pointing at a target using an imager on the pointing device are disclosed. In particular embodiments of the invention, the imager is aligned with the pointing direction of the pointing device. Images from the imager are then used to identify the target area at which the pointing device is currently pointing. Images acquired during the operational phase of the system are referred to as sample images. The system may determine the pointing target within the target area and then create an association with the pointing target according to the methods in the summary above. Specific methods for the setup phase of the system, including identifying a target area in accordance with the summary above, are disclosed below. A set of reference images acquired during the system setup phase may facilitate identification of the target area. The examples provided in this section are non-limiting examples of the invention and should not be taken as limiting its scope.
Embodiments of the present disclosure are described with reference to a pointing device in the form of a remote control with an integrated touch display, a radio frequency transmitter, and a visible-light onboard imager. However, the pointing device may take various alternative forms. The pointing device may include various components, including various user interface elements, additional sensors, and specialized hardware for processing reference images and identifying target areas. As described in the summary, the pointing device may also be used in combination with a set of support devices and remote devices to perform the methods of the invention.
In particular embodiments of the invention, the system responsible for the setup phase and for determining what the pointing device is pointing at during the operational phase may be the pointing device operating independently. However, the system responsible for these operations may also include a support device (e.g., a charger for the pointing device) and a remote device (e.g., a server or cloud infrastructure in operative communication with the pointing device). In particular embodiments of the present invention, the system responsible for receiving user commands, presenting a user interface to the user, and/or providing information from a pointing target to the user may be a stand-alone pointing device. However, these tasks may equally be performed by separate support devices and remote devices. For example, the support device may receive user commands directly (e.g., via a built-in microphone) or relay commands entered via the pointing device to the currently associated controllable object. These separable components (pointing device, support device, and remote device) may be referred to as nodes of the system.
The operations of the system may be divided among the nodes in a number of ways, depending on the hardware of the pointing device and other design constraints. For example, if the battery of the pointing device is limited, the memory and logic for performing resource-intensive operations (e.g., storing images or identifying signatures of target areas) may be located on the support device (e.g., the charging base of the pointing device). As another example, more resource-intensive operations, such as initially generating a signature for a target area, computing feature quantities for a set of reference images used to generate the signature, and generating feedback for the user from the feature quantities, may be performed on a server or cloud infrastructure rather than on the pointing device or any computing device in the same physical environment as the pointing device. In particular embodiments of the invention, the system may be a multiprocessor system in which a main processor cooperates with an AI accelerator. The main processor and the AI accelerator may both be located on the pointing device. However, the multiprocessor system may also be a distributed system in which the main processor cooperates with a remote coprocessor (e.g., an AI accelerator). The remote coprocessor may be located on a remote server.
In certain embodiments of the invention, one or both of determining the signature and determining from the signature whether an image shows the target area may be performed on the support device. The support device may perform these operations using its internal processor and memory. In certain embodiments of the invention, the support device will have a wall outlet power connection and is therefore not as power-constrained as the pointing device when performing heavy computations. In particular embodiments of the invention, the support device will include an AI accelerator in communication with a main processor on the support device or on the pointing device. Alternatively, the support device may receive the reference images and store them for later transmission to the cloud infrastructure, which analyzes the reference images and/or generates the signature of the target area. The support device may also send commands to the appropriate devices once the pointing device is set up and deployed.
In particular embodiments of the present invention, a cloud infrastructure communicatively coupled with the pointing device may perform various operations. For example, the cloud infrastructure may train an initial state for a trainable directed graph that will be deployed with the device. The cloud infrastructure may provide the initial state to the pointing device or another support device as a download when the device is first initialized. The cloud infrastructure may also provide instructions for the reference images that need to be acquired for the setup phase to be effective. The cloud infrastructure may also inspect the images, track the progress of the reference image feature quantities, and generate feedback for the user when needed. The images are acquired by an imager on the pointing device or an imager on a companion device.
The pointing device and any associated support devices may be enhanced with dedicated hardware for identifying a target area signature. The dedicated hardware may perform image processing and recognition algorithms more efficiently than a general purpose processor. The dedicated hardware may be the AI accelerator mentioned above. The dedicated hardware may train and utilize a trainable machine intelligence system more efficiently. A trainable machine intelligence system may be trained using the set of reference images acquired during the system setup phase. The signature of the particular target area is then embodied in the weights of the trained system. When the system is deployed, input images from the imager on the pointing device are fed to the machine intelligence system to determine whether the pointing device is pointing at the target area.
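To make the idea of a signature embodied in the weights of a trained system concrete, the following minimal sketch trains a tiny logistic-regression classifier to answer whether an image shows the target area; after training, the weight vector plays the role of the signature. The data here are random stand-ins; a real system would use the captured reference images as positives and images of other areas as negatives, and would likely use a far richer model.

```python
import numpy as np

rng = np.random.default_rng(0)
positives = rng.random((8, 32 * 32))   # stand-ins for flattened reference images
negatives = rng.random((8, 32 * 32))   # stand-ins for images of other areas
X = np.vstack([positives, negatives])
y = np.concatenate([np.ones(8), np.zeros(8)])

w = np.zeros(X.shape[1])               # after training, w and b act as the "signature"
b = 0.0
for _ in range(500):                   # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * float(np.mean(p - y))

def is_target_area(sample, threshold=0.5):
    """Operational phase: feed a flattened sample image through the trained model."""
    return 1.0 / (1.0 + np.exp(-(sample @ w + b))) > threshold
```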
The dedicated hardware may take many forms. For example, the pointing device or an associated support device may include a dedicated digital signal processor for instantiating, training, or drawing inferences from the trainable image recognition system. The digital signal processor may be embodied as a GPU, FPGA, chipset, or ASIC dedicated to such computations. The ASIC may be designed to draw inferences regarding the signatures from sample images, with each inference consuming power on the order of microjoules. The GPU or ASIC may be mounted on the pointing device. The digital signal processor may be optimized to accelerate computations such as the linear matrix operations of an artificial neural network (ANN) performing image classification inferences. The digital signal processor may perform multiple computational operations in parallel. As another example, the pointing device or a support device such as a charger may be enhanced with a dedicated digital signal processor so that training or inference operations for the trainable machine intelligence system can be performed without excessive battery consumption or processing time.
The pointing device and any associated support devices may include one or more interfaces for receiving user commands. The pointing device may include a display for presenting information to the user. The pointing device may also receive user commands through an interface such as a keyboard, touch display, microphone, or any other known user interface technology. Alternatively, the pointing device may have no interface for receiving commands, while the support device includes a touch display, microphone, or gesture recognition interface to receive user commands. For example, a smartphone, tablet, or hub device may receive user commands while the pointing device is used to identify the controllable device. The pointing device may also include a speaker or haptic feedback system for providing information to the user. When the system selects or locks onto a pointing target, the display, speaker, or haptic feedback system may prompt the user, alerting them that the system is forming an association with that pointing target.
The pointing device may include additional sensors besides the imager, such as a motion tracker. The motion tracker may be an inertial measurement unit (IMU). The pointing device may include an integrated motion tracker (e.g., a magnetometer, accelerometer, gyroscope, or any 9-axis sensor). The motion tracker may be used to activate the imager. For example, the motion tracker may determine when the pointing device has been moving and then remains stationary, indicating that the user is pointing at a pointing target. Upon detection of such a motion pattern, the imager may be triggered to acquire a reference image or a sample image. Additional sensors such as motion trackers may also be used for data fusion applications, as described below.
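A simple version of the move-then-dwell trigger described above can be sketched as follows. The thresholds, the sampling assumptions, and the class name are illustrative only; a shipping device would tune these against real IMU data.

```python
from collections import deque
import math

class PointAndHoldTrigger:
    """Fires once when the device was recently moving and has since held still."""

    def __init__(self, move_thresh=1.5, still_thresh=0.2, dwell_samples=30):
        self.move_thresh = move_thresh        # gyro magnitude treated as "moving"
        self.still_thresh = still_thresh      # below this the device counts as "still"
        self.window = deque(maxlen=dwell_samples)
        self.was_moving = False

    def update(self, gx, gy, gz):
        """Feed one gyro sample; returns True when an image should be captured."""
        magnitude = math.sqrt(gx * gx + gy * gy + gz * gz)
        self.window.append(magnitude)
        if magnitude > self.move_thresh:
            self.was_moving = True
            return False
        dwelled = (len(self.window) == self.window.maxlen
                   and max(self.window) < self.still_thresh)
        if self.was_moving and dwelled:
            self.was_moving = False           # re-arm for the next point-and-hold
            return True
        return False
```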
As described above, reference images of the target area may be acquired during the setup phase to generate a signature of the target area, and this signature may be used to analyze sample images later, when the device is deployed. In certain embodiments of the present invention, both steps may be performed with the onboard imager: the reference images may be captured by the same onboard imager used to identify the target area and determine what the pointing device is pointing at once it is deployed. However, the reference images may also be captured by a separate imager, while the onboard imager captures the images analyzed when the device is deployed. The separate imager may be mounted on a companion device. The companion device may be one of the support devices described above. The companion device may be a smartphone or a device dedicated to the setup phase. The companion device may include functions not available on the pointing device that facilitate the setup phase. For example, the companion device may include a higher-resolution display than the pointing device, allowing the user to more easily check the quality of the reference images or to receive feedback from the system about a set of reference images. The companion device may also include means for providing reference-image-related feedback to the user. The feedback may be provided via a display of the companion device, a speaker of the companion device, or another user interface of the companion device. In certain embodiments of the invention, the companion device will be paired with the pointing device via a wireless connection (e.g., a Bluetooth connection), the reference images will be acquired by the pointing device's onboard imager, and the reference images will be displayed for review on the companion device display along with any additional feedback for the user, such as instructions for additional images.
The imager, whether onboard the pointing device or on a companion device, may take a variety of forms and may be enhanced to capture additional information from the physical environment in which it operates. The imager may acquire one-, two-, or three-dimensional images. The images may be generated using any form of electromagnetic energy (e.g., visible, infrared, or ultraviolet light) or any combination of multiple frequency bands of electromagnetic energy. To assist the imager, it may work with a projector that generates and projects electromagnetic energy visible to the imager. The projector may be used to illuminate the environment or to generate a structured light pattern in the environment, enabling the imager to gather more information from the environment. For example, the imager may include a night-vision infrared camera with an infrared LED, or an ultraviolet camera with an ultraviolet structured light projector. In certain embodiments, the infrared LEDs or other projectors may be mounted on support devices around the physical environment so that they do not drain the pointing device battery. The support device may be a charger for the pointing device. The imager may include multiple sub-imagers, such as a visible light camera for detecting visible light in the environment and an ultraviolet camera for detecting the pattern projected by an ultraviolet projector.
In certain embodiments of the invention, the imager has a large capture area to maximize the information available to the system, both for generating robust signatures during the setup phase and for identifying signatures during the operational phase. For example, the imager may be a wide-angle visible light camera that captures a larger area in each image to obtain more information about the target area. The additional information may include the relative positions of adjacent room corners and/or the distance from the pointing target to the ceiling and floor. The imager may be a fisheye imager. In other embodiments, the imager may capture a panoramic image, such as a full spherical panorama.
In certain embodiments of the present invention, the field of view of the imager will include the pointing direction but will not be centered on the pointing direction. For example, the imager may be tilted with respect to the pointing direction. The degree of tilt may be selected for a particular application. For example, for a pointing device with a user interface that is typically read by the user while operating the device, the imager may be tilted vertically with respect to the pointing direction to offset the common deviation associated with this mode of use. As another example, because furniture and other items occupy the floor, room corners are generally more visible at the ceiling-wall interface than at the wall-floor interface, so the imager may be tilted slightly upward toward the ceiling to capture layout information for the room.
The imager may include a combination of sensors together with a projector that projects a pattern in the pointing direction of the device, such as a visible light camera combined with an ultraviolet camera enhanced by an ultraviolet LED or laser. The combined image may include, in addition to the visible light captured by the camera, depth information captured by a depth sensor, such as an ultraviolet or infrared camera that detects the projected pattern. Any form of structured light may be projected to capture depth information, including visible, ultraviolet, or infrared light. The depth sensor may be any form of depth sensor, including LIDAR, a stereo imaging device, or any other sensor that can capture depth information or information from which depth information is derived. In these embodiments, the image may be an RGB-D matrix or a depth cloud. In other approaches, the imager and image may simply ignore texture data in favor of depth information. For example, the image may capture the locations, in a two-dimensional image, of infrared points illuminated by the structured light projection. The particular sensor used to acquire the image may vary depending on the environmental conditions in which the imager operates. For example, the pointing device or a support device (e.g., a charging station) may include an ambient light sensor (ALS), and the visible light sensor may be turned off if it is determined that there is insufficient ambient light in the environment for the visible light sensor to provide useful information.
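For orientation only, the sketch below shows one common way an RGB-D matrix of the kind mentioned above can be assembled: a depth map is normalized and stacked as a fourth channel behind the visible-light frame. The resolutions, the 10 m depth range, and the zero-filled placeholder arrays are assumptions for the example.

```python
import numpy as np

rgb = np.zeros((480, 640, 3), dtype=np.uint8)     # visible-light frame (placeholder)
depth = np.zeros((480, 640), dtype=np.float32)    # depth in metres (placeholder)

# Scale depth into the 0-255 range and append it as a fourth channel.
depth_channel = np.clip(depth / 10.0, 0.0, 1.0) * 255.0
rgbd = np.dstack([rgb.astype(np.float32), depth_channel])   # shape (480, 640, 4)
```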
FIG. 2 shows a system block diagram 200 and an accompanying flow chart 210 of a computer-implemented method performed using this system, in accordance with certain embodiments of the present invention. Block diagram 200 includes a control device 201. The shape of control device 201 may define the pointing direction 202 of the control device. Diagram 200 also includes an onboard imager 203 located on control device 201. As shown, the field of view 204 of onboard imager 203 includes pointing direction 202. Block diagram 200 also includes a computer-readable medium 205 storing instructions executable to perform each step in flow chart 210. As described in the summary above, computer-readable medium 205 may be located entirely within control device 201 (e.g., in a memory internal to control device 201) or may be distributed across a system that includes control device 201.
Flow chart 210 begins with step 211 (capturing a reference image). The reference image may be captured by onboard imager 203 or by an imager on a companion device. Flow chart 210 continues with step 212 (receiving the reference image). This step refers either to a processor of the device receiving the image from the onboard imager for storage in the memory of the pointing device, or to one node of the system receiving the image from another node on which it was captured (e.g., the image is captured on a companion device in the form of a smartphone and then transmitted to the pointing device's charger or to a remote server, where it is received).
Flow chart 210 continues with step 213 (determining a feature quantity from the reference image). Step 213 may also be performed on a set of reference images if the system has captured additional reference images by this point. The feature quantity may be a value calculated for the most recent reference image or for the entire set of reference images acquired so far. The feature quantity may be a measure of the image quality of the most recent reference image, the number of reference images in the set, or a more complex value reflecting the set of reference images, computed through graphical and geometric analysis (evaluating variation in image viewpoint, image brightness, and other aspects). The feature quantity may be a measure of the completeness of a three-dimensional reconstruction of at least part of the target area or pointing target in the reference images (e.g., the ratio of reconstructed surface to the total surface to be reconstructed). The feature quantity may be a diversity measure calculated from the spectral content of the different reference images. The feature quantity may be a diversity measure of the viewpoints estimated from the reference images.
Flow chart 210 continues with step 214 (predicting the end of the setup phase of the pointing device from the feature quantity). In this step, the feature quantity generated in step 213 is evaluated and used to determine whether the setup phase is complete. Thus, the termination of the setup phase can be predicted from the feature quantity. The evaluation of the feature quantity may involve comparing it with a threshold value (e.g., the feature quantity is greater than, equal to, or less than a given threshold). The direction of the comparison depends on the type of feature quantity calculated in step 213. In a particular embodiment, the feature quantity will be the number of reference images in the set, and the threshold is a predetermined number of reference images that the user needs to capture. In another particular embodiment, the feature quantity will be a measure of difference or cross-correlation between different images in the set of reference images. The differences or correlations may be measured in terms of overall texture maps, imager pose, brightness, or other aspects of the images. Regardless of which particular quantity is calculated, step 214 is based on the feature quantity indicating that the reference image or set of reference images evaluated in step 213 contains sufficient information to generate a robust, unique signature for the target area of the reference images. For example, the feature quantity may be the degree of correlation between reference images in the set; in this case, the evaluation would require the feature quantity to be less than a threshold value, indicating that the images are sufficiently distinct. As another example, the feature quantity may be the degree of difference between reference images in the set; in this case, the evaluation would require the feature quantity to be greater than a threshold value, indicating that the images are sufficiently distinct.
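Because the comparison direction depends on the kind of feature quantity, the evaluation in step 214 can be summarized with a small predicate like the one below. The structure and the example numbers are assumptions for illustration, not values taken from this disclosure.

```python
from dataclasses import dataclass

@dataclass
class FeatureQuantity:
    value: float
    threshold: float
    higher_is_better: bool   # True for difference/diversity metrics, False for correlation

def predict_setup_complete(q: FeatureQuantity) -> bool:
    """Step 214: predict the end of the setup phase from the feature quantity."""
    if q.higher_is_better:
        return q.value >= q.threshold       # e.g. image-to-image difference
    return q.value <= q.threshold           # e.g. cross-correlation between images

# Example: a correlation of 0.4 against a ceiling of 0.5 means the images differ enough.
print(predict_setup_complete(FeatureQuantity(0.4, 0.5, higher_is_better=False)))
```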
Flow chart 210 continues with step 215 (generating a feedback message based on the feature quantity). In this branch of the flow chart, the setup phase is not terminated because the set of reference images was found in step 214 to be incomplete or insufficient. The feedback message may provide guidance or encouragement for the user to continue the setup phase in order to improve the set of reference images. The feedback message may be a textual message or a symbolic message. For example, the feedback message may be a textual indication on the device display for capturing additional images (e.g., move left, move right, move back, move forward, shoot again while holding the imager steady, turn on more lights, turn off the lights). Textual messages may also be provided as audible signals using a speaker. As another example, the feedback message may be a score, a progress bar, or another symbolic representation of the ratio of the number of reference images taken to the number of reference images required. The feedback message may be provided visually through a display or audibly through a speaker. The feedback message may be generated from the feature quantity or directly from the reference image. The feedback message may include the reference image itself so that the user can review it and help diagnose why the system finds the reference image or set of reference images inadequate. Various combinations of the types of feedback messages described in this disclosure may also be provided to the user. For example, the feedback message may include a reference image with a request that the user confirm its quality, and may further include a progress indicator that shows the number of reference images received compared to a target threshold.
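The following sketch turns a feature quantity into the kinds of feedback messages described above: a progress indicator plus a textual hint. The message strings, the diversity cut-off, and the required image count are assumptions chosen for the example.

```python
def feedback_message(num_images, required_images, diversity):
    """Build a simple textual/symbolic feedback message for the setup phase."""
    progress = min(num_images / required_images, 1.0)
    bar = "#" * int(progress * 10) + "-" * (10 - int(progress * 10))
    if num_images > 1 and diversity < 0.1:
        hint = "Move left or right and capture the area from a new angle."
    elif num_images < required_images:
        hint = "Capture another reference image."
    else:
        hint = "Setup looks complete."
    return f"[{bar}] {num_images}/{required_images} images. {hint}"

print(feedback_message(num_images=3, required_images=8, diversity=0.05))
```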
In certain embodiments of the present invention, the feedback message will be displayed on a companion device (e.g., a smartphone). These embodiments provide particular advantages when the companion device includes a display superior to that of the pointing device and visual inspection of the feedback message is important (e.g., the feedback message includes the reference image itself). In these embodiments, the pointing device may capture an image and transmit it wirelessly to the companion device for display. Alternatively, the companion device may be used both to capture and to display the reference image.
In particular embodiments of the invention, the feedback message may comprise instructions for how to capture at least one additional reference image. The instructions can help ensure that the next reference image obtained has the greatest value to the system in terms of its effect on the feature quantity. The instructions may specify at least one position that the pointing device or companion device (whichever device is acquiring the reference images) should be in when capturing the at least one additional reference image. For example, the user may see an arrow on the display indicating the direction in which they need to move to capture the next reference image. As another example, the user may receive a textual indication, through a visual display or an audible prompt, to "take an additional reference image of this area from the left side of the image just obtained." Feedback can be provided in real time to guide the user toward the correct imager pose. For example, the feedback may be an AR overlay. The feedback may highlight which portions of the area have not yet been adequately captured (e.g., visually highlighted portions displayed in real time on an image of the area). As another example, the feedback may show an arrow, continuously updated, indicating how the device should move until it reaches the position that adds the most value to the set of reference images.
In particular embodiments of the present invention, the feedback message may be directed at the most recently captured reference image (e.g., determining the feature quantity may include determining that the most recently obtained reference image is of insufficient quality due to poor illumination, blurring, or insufficient variation from a previous reference image). The feedback message may simply include instructions to recapture the same reference image, or to modify the conditions before capturing the reference image again from the same vantage point (e.g., "hold the imager steady" or "turn on more lights").
In particular embodiments of the present invention, the feedback message may be directed at improving the entire set of reference images (e.g., determining the feature quantity may reveal that the set of reference images lacks some form of variation). The feedback message may indicate that at least one additional reference image needs to be captured to improve the feature quantity of the set. For example, the evaluation of the feature quantity may indicate that the user needs to capture additional images from different angles or at different brightness levels, remove an object from the target area, and so on. The feedback message may then provide this information to the user (e.g., an instruction to move to a particular angle to complete a set of desired unique viewpoints, or an instruction to turn on additional lights).
In certain embodiments, step 211 is performed while the environment is varied to increase the diversity of the reference images. The variation of the environment may follow a set procedure or a procedure that adapts to the feature quantity during acquisition of the set of reference images. The variation in the environment may be created by an environment differentiation system. Because the pointing device may be a control device serving as a controller for a large number of devices, and the environment differentiation system may be a component of the overall system that includes the pointing device, the environment differentiation system may have access to a large number of devices that can be used to change the environment. Indeed, in certain embodiments, the pointing device may be part of the environment differentiation system and act as the initiator of the commands that change the environment.
Various environmental variations may be introduced by various methods, depending on the devices controlled by the environment differentiation system. For example, the brightness of a room may be changed by turning lights on or off, changing the color of any color-changing lights, opening or closing curtains, or turning a television or other display device on or off. As another example, the visual appearance of a room may be changed by displaying different color patterns or images on a television or other display device. These environmental changes can be created from the feature quantity as part of the procedure described above. In an embodiment in which the feature quantity is a measure of the difference or correlation between the brightness or overall texture maps of different images, a feedback message specifying the desired variation for the next reference image may be generated by the computer-readable instructions, and the environment differentiation system may create the desired variation, setting the environment to the conditions required to respond to the feedback message. For example, the feature quantity may indicate that the reference images lack sufficient variation in brightness, the feedback message may be an instruction to vary the brightness, and the environment differentiation system may select a way to do so in response. If the environment differentiation system has access to only one lamp, the brightness of that lamp may be adjusted based on the feedback message. If the environment differentiation system has access to a lamp and a window shade, it may select among the available methods and study their impact on the diversity of the reference images (e.g., try the shade first, then dim the lamp if the next feedback message still indicates that greater brightness variation is needed).
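A simple selection step of this kind is sketched below, with stub callables standing in for real device commands; the capability names, the preference order, and the want_brighter flag are assumptions made only for the example.

```python
def vary_brightness(devices, want_brighter):
    """Pick whichever controllable device is available to change room brightness.

    `devices` maps a capability name to a callable that performs the change.
    """
    preference_order = ["shade", "dimmable_light", "tv"]    # illustrative ordering
    for name in preference_order:
        if name in devices:
            devices[name](want_brighter)
            return f"adjusted {name}"
    return "no controllable device available; fall back to asking the user"

# Example wiring with a stub standing in for a real lamp command.
actions = {"dimmable_light": lambda up: print("light", "up" if up else "down")}
print(vary_brightness(actions, want_brighter=True))
```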
As shown, if the setup phase is not terminated, the flow returns to steps 212 and 213, where an additional reference image is captured, received, and added to the set of reference images, and the feature quantity is re-determined using the additional image. This process includes receiving at least one additional reference image and re-determining the feature quantity from the reference image captured in the first iteration together with the at least one additional reference image. After the feature quantity is re-determined, the flow returns to the point at which the system can trigger the end of the setup phase according to the feature quantity. This may involve a further iteration of step 214 that does not result in another pass through the loop, thereby triggering the end of the setup phase. The steps described in flow chart 210 may be performed by a companion device (e.g., a smartphone) instead of control device 201. However, in certain embodiments, the control device, and in particular an imager on the control device, may be used to capture the reference image and the additional reference images in step 211, while the feedback in step 215 is displayed on the display of a companion device (e.g., a smartphone).
As shown, flow chart 210 may continue from step 214 to step 216, triggering the termination of the setup phase of the pointing device. Step 216 may be performed at any time during the setup phase, once the system determines that the set of reference images is sufficient. As shown in the figure, the termination of the setup phase is triggered in accordance with the prediction, based on the feature quantity, that the setup phase should end. The setup phase shown in FIG. 2 is illustrated with reference to a single target area 206. However, the setup phase may include acquiring reference images for multiple target areas. Thus, the steps of flow chart 210 may be repeated multiple times (including multiple iterations of the loop through step 215) in order to acquire reference images for these additional target areas. The setup phase may also be revisited when the user wants to add an additional target area to an existing system. A user command to add an additional area may trigger re-entry into the setup phase.
Termination of the setup phase may trigger additional steps required to bring the system into operational-phase condition. These steps include generating a signature for the target area and associating a pointing target or object with the target area. Flow chart 210 includes step 217 (generating a signature of the target area using the reference images captured in step 212). The signature may be generated from the set of reference images captured over multiple iterations of step 212. Flow chart 210 also includes step 218, associating an object with the target area. The object may be a communication object or a controllable object. In the operational phase, the association between the object and the target area may be used, when the pointing device is pointed at the target area and the target area is identified using the generated signature, to create the appropriate association for sending communications or commands through the system.
The signature generated at step 217 may represent the target region, and may be derived from two-dimensional or three-dimensional data of the target region. The signature may be embodied as the weights used by a trained ANN to process an image and determine whether it is an image of the target area. The signature may be a trained directed graph, a series of points/coordinates, or another form of compressed information that can be used to identify the region. The signature may be a feature vector against which the output of a classification system (provided with an input image of the target region) is matched. In particular embodiments of the present invention, a region may have multiple signatures (e.g., multiple two-dimensional or three-dimensional features taken from different perspectives) or one common signature (e.g., a three-dimensional model of the region created by reconstruction from multiple two-dimensional reference images or from features of those reference images). Because lighting conditions affect the reference images, the signature may be a combination of day and night signatures. In particular embodiments of the invention, different signature libraries are accessed depending on characteristics of the sample image taken while the pointing device is in the operational phase. For example, the system may maintain a library of low light signatures and a library of high light signatures for a set of target areas. Using an ambient light sensor (ALS) on the pointing device, determining the target area during the operational phase may include accessing only the low light signatures if the ALS detects a low light condition, and vice versa in a high light condition.
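As one concrete illustration of the descriptor-style signatures mentioned above, the sketch below pools ORB descriptors from the reference images of a target area. It is only one of the listed representations (others being ANN weights, feature vectors, or three-dimensional models), and the parameter values are assumptions.

```python
import cv2
import numpy as np

def build_signature(reference_images):
    """Pool ORB keypoint descriptors from all reference images of one target area."""
    orb = cv2.ORB_create(nfeatures=500)
    descriptor_sets = []
    for image in reference_images:
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        _, descriptors = orb.detectAndCompute(gray, None)
        if descriptors is not None:
            descriptor_sets.append(descriptors)
    return np.vstack(descriptor_sets) if descriptor_sets else None
```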
In a particular embodiment of the invention, the feature quantity is a measure of the diversity of the estimated viewpoints of the reference images (e.g., an estimate of the viewpoint of each reference image is measured or computed and then analyzed to determine whether the variation between viewpoints is sufficient). The system may determine and analyze viewpoint differences between reference images. This can be done by computing points of interest (e.g., ORB descriptors) in two reference images, matching these points across the images, computing an essential matrix based on the matches, and then deriving the change in orientation and the direction of the change in position between the viewpoints. In these embodiments, the threshold against which the feature quantity is evaluated to predict the end of the setup phase may be an absolute angular distance between two image orientations beyond which they are determined to be sufficiently different. The number of reference images with sufficiently different orientations under this metric may also be calculated and displayed to the user as a feedback message. The threshold used to predict the end of the setup phase may also be a given number of reference images with sufficiently different orientations as measured by the metric described above.
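The viewpoint-difference computation described here can be sketched with standard structure-from-motion primitives. The sketch below assumes a known camera intrinsic matrix K and uses ORB matching, essential-matrix estimation, and pose recovery to obtain the absolute angular distance between two reference-image orientations.

```python
import cv2
import numpy as np

def viewpoint_angle_deg(img_a, img_b, K):
    """Estimate the rotation angle (degrees) between the viewpoints of two reference images."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY), None)
    kp_b, des_b = orb.detectAndCompute(cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY), None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])
    E, _ = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC)  # essential matrix
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K)                 # rotation + translation direction
    # Absolute angular distance between the two orientations.
    return float(np.degrees(np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))))
```

Counting the reference-image pairs for which this angle exceeds a chosen threshold is one possible realization of the metric described above.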
In a particular embodiment of the invention, the feature quantity may be a measure of the completeness of a three-dimensional reconstruction of at least a part of the target region or of the pointing target. The reference images may be used to reconstruct a three-dimensional representation of the target region, and the setup phase is terminated (or the user is advised to terminate it) if the three-dimensional reconstruction is sufficiently complete (i.e., no dead zones remain in the reconstruction). Such a three-dimensional reconstruction may employ a procedure similar to the one considered in the previous paragraph, wherein the depth of a point of interest is retrieved from the two images in which it appears, and the relative orientation and translation direction of the two images are calculated from the essential matrix.
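Continuing the sketch above, once R and t relating two reference images are recovered, matched points of interest can be triangulated to grow the three-dimensional reconstruction whose completeness serves as the feature quantity here. K, R, t, pts_a, and pts_b follow the previous example and are assumptions of the sketch.

```python
import cv2
import numpy as np

def triangulate(K, R, t, pts_a, pts_b):
    """Recover 3-D points from matched image points in two reference images."""
    P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # projection matrix of the first viewpoint
    P1 = K @ np.hstack([R, t.reshape(3, 1)])           # projection matrix of the second viewpoint
    points_h = cv2.triangulatePoints(P0, P1, pts_a.T, pts_b.T)
    return (points_h[:3] / points_h[3]).T              # Nx3 points in space
```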
In certain embodiments of the present invention, the process of acquiring images from the user during the setup phase is performed using a real-time image stream captured by the onboard imager or by the imager of a companion device. The pointing device may also analyze the reference images captured in this real-time stream so that the feature quantity of the set of reference images is continually updated as images are captured by the imager. Likewise, feedback may be calculated in real time and provided to the user. The feedback in these embodiments may be the AR signal described above. For example, an arrow may be overlaid on the displayed real-time image stream indicating the direction in which the imager should be moved, or different surfaces in the real-time image stream may be highlighted to indicate that they have been sufficiently captured and added to the three-dimensional reconstruction of the physical space.
In certain embodiments of the present invention, data fusion is employed to improve the setup phase and/or the operational phase. Data fusion can be used to provide more accurate feedback or to better estimate the feature quantity. Data fusion may involve adding additional data, such as motion tracking and/or position data, to the imager data. For example, a pose estimate of the imager may be associated with the image taken from that pose. In particular embodiments of the invention, the system includes a motion tracker on the control device and a computer-readable medium storing instructions executable to: determine the feature quantity from the reference image and the data of the motion tracker, generate a feedback message from the feature quantity and the data of the motion tracker, or use the motion tracker data to identify a signature of the target area in the operational phase.
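A minimal sketch of this kind of fusion is to store each reference image together with the motion-tracker pose under which it was captured, so that both the feature quantity and the feedback message can take the pose into account. The pose representation below is an illustrative assumption.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PosedReference:
    image: np.ndarray          # reference image from the imager
    position: np.ndarray       # 3-vector from the motion tracker
    orientation: np.ndarray    # 3x3 rotation matrix from the motion tracker

def pose_spread(references):
    """Example feature quantity using only motion-tracker data: spread of capture positions."""
    positions = np.stack([r.position for r in references])
    return float(np.linalg.norm(positions.std(axis=0)))
```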
Using a visible light camera as the sole sensing element makes the system susceptible to motion blur, visual occlusion, and visual ambiguity between multiple locations (e.g., two blank walls). To address these issues, one possibility is to supplement the visible light imager with another source of information (such as an IMU) to sense the pose of the imager (i.e., its combined position and orientation). The pose estimate can then serve as a backup when the imager's sensor is occluded or the ambient brightness is too low for it to operate normally. The pose estimate can also be used as additional information to distinguish two regions with the same visual appearance during the operational phase. Further, pose information may be used to assist in computing the feature quantity for a set of reference images, particularly feature quantities based on the degree of difference required between the viewpoints at which the reference images are captured.
In certain embodiments of the present invention, pose estimation of the imager can be aided by knowing the position of each target region in advance. This can be done by: (1) capturing the pose of the pointing device each time a reference image is taken; and (2) triangulating the position of the target region from at least two of the pose estimates captured in (1). A similar method can be used to correct for drift in the sensors used to determine the position of the pointing device. The pose estimate of an IMU is subject to drift, which means that the estimate slowly deviates from the true pose. To address this problem, the output of the imager may be used to correct for drift each time a known target region is identified.
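The two-step idea can be sketched as a closest-point computation between two pointing rays, one per recorded pose. The ray representation (origin plus unit direction as NumPy vectors) is an illustrative assumption.

```python
import numpy as np

def triangulate_target(origin_a, dir_a, origin_b, dir_b):
    """Midpoint of the shortest segment between two pointing rays (poses (1) and (2))."""
    w0 = origin_a - origin_b
    a, b, c = dir_a @ dir_a, dir_a @ dir_b, dir_b @ dir_b
    d, e = dir_a @ w0, dir_b @ w0
    denom = a * c - b * b
    s = (b * e - c * d) / denom if denom > 1e-9 else 0.0   # parameter along ray A
    t = (a * e - b * d) / denom if denom > 1e-9 else 0.0   # parameter along ray B
    return (origin_a + s * dir_a + origin_b + t * dir_b) / 2.0
```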
Fig. 3 shows a flow chart of the operational phase of the system according to particular embodiments of the present invention. The flowchart 300 begins with step 301 of receiving a sample image. The sample image may be an image of the pointing target captured by an imager onboard the device. As shown, the sample image 310 is a visible light image that includes a television 311. The television represents the target area. The flowchart 300 continues with step 302 of identifying a signature using the sample image. The signature will be the signature of the target area represented in the image 310. This step can involve providing the information of image 310 to a classifier, a trained machine learning system, a trained support vector machine, or any other system capable of identifying a target region signature from an image.
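Under the descriptor-based signature form used in the earlier sketches, step 302 could be realized as a nearest-signature match. The matching strategy and the min_matches threshold are illustrative assumptions, and any of the classifier-based approaches mentioned above could be substituted.

```python
import cv2
import numpy as np

def identify_signature(sample_image, signatures, min_matches=25):
    """Return the name of the target area whose stored signature best matches the sample image."""
    orb = cv2.ORB_create(nfeatures=1000)
    _, sample_des = orb.detectAndCompute(cv2.cvtColor(sample_image, cv2.COLOR_BGR2GRAY), None)
    if sample_des is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    best_area, best_count = None, 0
    for area_name, signature_des in signatures.items():
        count = len(matcher.match(sample_des, signature_des))
        if count > best_count:
            best_area, best_count = area_name, count
    return best_area if best_count >= min_matches else None
```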
In certain embodiments of the invention, once the signature is identified, the system may form an association with the controllable device or other pointing target with which the system is in communication. Accordingly, the flowchart 300 includes step 304 of forming an association with the object through the system. In certain embodiments of the invention, the system may also provide a user interface to the user upon identifying the signature. As shown, the flowchart 300 continues with step 303 of displaying a controllable object interface on the display in response to identifying the signature. Fig. 3 shows the remote control in a first state 312 with a display 313. Execution of step 303 corresponds to the same remote control transitioning to state 314, in which a controllable object interface 315 of television 311 is displayed on display 313. Such a transition may be performed after the target area signature is detected if the target area was previously associated with a controllable object, for example through a setup phase procedure according to the present invention.
Once the pointing target of the pointing device is selected, this information can be used in various ways. The pointing target may remain selected until the user clears the selection, or may change synchronously with the instantaneous pointing direction of the pointing device. While a particular pointing target remains the selected target, the pointing device itself, or the system to which the pointing device belongs, may interact with the particular device or system with which this pointing target is associated. In particular embodiments of the invention, the pointing device may be a remote control, and the pointing target may be selected to identify the controllable device with which the remote control should interact at any given time. In certain embodiments of the invention, the pointing device may have a display, and the pointing target may be selected to identify the control interface that the display should show at any given time. The display may be a touch display, or another combined display and input interface, for displaying information to a user and receiving control inputs from the user. As an example combining the two sets of embodiments described above, when the user points the pointing device at a television, a channel selector and a volume selector may be displayed on the display, along with the word "television" identifying the currently selected device. Subsequently, when the user points the pointing device at a light bulb, the display may show a switch, a dimmer, or a color selector according to the features associated with the light bulb.
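The behaviour described in this paragraph amounts to a lookup from the currently selected pointing target to the interface shown on the display. The device names and control lists below are illustrative assumptions.

```python
# Illustrative mapping from a selected pointing target to the controls to display.
CONTROL_INTERFACES = {
    "television": ["channel selector", "volume selector"],
    "light bulb": ["switch", "dimmer", "color selector"],
}

def interface_for_target(selected_target):
    """Return the controls to render for the currently selected pointing target."""
    return CONTROL_INTERFACES.get(selected_target, [])
```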
The relationship between the pointing target and the association produced by the system when a particular pointing target is selected can take a variety of forms. For example, the pointing target itself may be a controllable object that should be the subject of the association made by the control system. However, the pointing target may instead be a physical area defined by certain features that are detected by the imager but not physically associated with the controllable object. In that case, the system may have previously associated the pointing target with a particular controllable object. Thus, the controllable object may be a real or virtual object. For example, the controllable object may be a physical building automation component or smart home device that receives instructions or provides information, such as a lighting device, a television, an electronic curtain, a thermostat, an actuator for commercial HVAC equipment, a smoke alarm, a chemical sensor, a security device, and the like. However, the controllable object may also be a virtual object that receives commands or provides information, such as a network-accessible API or other virtual object. Furthermore, the flexible association between the pointing target and the controllable object means that the user does not need to be able to physically see the location of the controllable object in order to select it.
The chain of operations of obtaining an image, selecting a pointing target, associating the pointing target with a controllable object, and utilizing this association according to the above examples may be implemented in various ways. In particular, the transmission of commands or information may be performed by various nodes in the system, depending on the currently selected pointing target. For example, a command may be received through an interface of the pointing device itself (e.g., a touch display or a microphone), or through an entirely separate system (e.g., a microphone located in the same room as the pointing device). Further, the pointing device may merely pass images from the imager to a separate node in the system, or it may perform all of the steps necessary to generate a command specifically addressed for transmission to a given controllable device. The pointing device may store and process the images it captures in place, or transmit them to an alternative device for storage and processing. For example, the pointing device may send an image to its charger, which then determines which object the control system should associate as the subject of control and sends a command to that object. Likewise, the pointing device may associate a command directly with the controllable device, or may associate the command only with the pointing target, with a support system then associating the command with the controlled device through an association between the pointing target and the controlled device that is stored and maintained by the support system.
In particular embodiments of the present invention, the manner in which commands are actually sent by the system may vary from implementation to implementation. Fig. 3 shows an example of a pointing device in the form of a remote control device 316 that has been associated with a controllable device in the form of a television 317, so that a transmission system 318 can transmit commands from the remote control device 316 to the television 317. The transmission system 318 may be fully incorporated into the remote control device 316 and include an IR transmitter tuned to transmit signals to various devices, including an IR receiver of the television 317. However, the transmission system 318 may also include additional devices, such as a charging station for the remote control device 316. The transmission system 318 may also include any number of local or wide area networks, the Internet, and remote devices such as servers and an accompanying cloud architecture. The charging station may include an IR transmitter that communicates with the television 317 and a different wireless communication system that communicates with the remote control 316. Furthermore, as described elsewhere herein, the remote control device or other pointing device may be used only to form the association, with the commands sent by the transmission system 318 originating from a separate device. These commands may be received, for example, through a microphone on the charger of remote control 316 or through any microphone that is communicatively interfaced with transmission system 318.
FIG. 4 illustrates a potential user interface 400 presented on a display of a remote control 410 for guiding a user in acquiring a set of reference images in accordance with certain embodiments of the present invention. The user interface is displayed on the touch display of the remote control 410. In the state shown, the system has just recognized a multicolored light as a potential pointing target during setup. Pointing targets are referred to herein as potential pointing targets when they have not yet been added to the set of pointing targets known to the system. An area 401 of the display identifies the pointing target for the current part of the setup process. The user interface 400 also displays a counter 402 indicating the number of pointing target images the user has acquired. Additional instructions may be provided to help guide the user in acquiring appropriate images. A controller 404 is also provided for the user to instruct the imager to acquire an image. Additional instructional counters and areas can be used to provide other types of information to the user regarding perspective changes, lighting changes, and other system feedback, guiding the user to successfully complete the image acquisition process. The user interface of fig. 4 may provide a similar user experience flow for acquiring images of a target area. In fact, in certain embodiments, the user may not be aware that the system is capturing information regarding a target area, and may instead simply be given instructions for capturing reference images related to a pointing target within that area.
Various forms of preprocessing may be applied to the image received in step 212 before it is used to determine the feature quantity or to identify a signature. For example, if the image is a visible light image, it may be processed with auto white balance, auto focus, image stabilization, and any other image normalization procedures, so that the image is more useful later as a reference for the target area. The image may be corrected using a rolling shutter camera model. In a related approach, a global shutter camera may be used to acquire the images, which aids image normalization at acquisition time. The preprocessing may further include image processing to compensate for brightness variations during the setup phase. Such processing may also be applied to images acquired while the pointing device is operating in the deployed state. In embodiments where the pointing device acquires various kinds of data, the preprocessing may also include fusing the data of the various sensors together, for example, by combining visible light and depth data to form an RGB-D point cloud.
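A sketch of this kind of normalization preprocessing is shown below. The specific operations (a gray-world white balance and a fixed resize) are illustrative assumptions rather than the exact procedure of any embodiment.

```python
import cv2
import numpy as np

def preprocess_reference(image, size=(320, 240)):
    """Normalize a visible-light image before feature-quantity or signature computation."""
    balanced = image.astype(np.float32)
    balanced *= balanced.mean() / (balanced.mean(axis=(0, 1)) + 1e-6)  # gray-world white balance
    balanced = np.clip(balanced, 0, 255).astype(np.uint8)
    return cv2.resize(balanced, size, interpolation=cv2.INTER_AREA)
```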
The same types of preprocessing described above may be applied to images acquired while the pointing device is operating in the deployed state, e.g., the image received in step 301. In fact, the various preprocessing methods used to normalize images help the system match images of the same area, since spurious differences are eliminated.
In certain embodiments of the present invention, a method (e.g., the method exemplified by flowchart 300) may accommodate a system-detected fault condition or a significant change in data availability. For example, in the event of a fault condition or a lack of data, the pointing device may default to a manual mode in which the device to be controlled is identified by the user through other means, e.g., a voice command or selection of the device in a menu on the pointing device's touch display. The lack of data or the fault condition may be detected in a variety of ways. For example, an ambient light sensor, or a control system that recognizes the lighting state and the time of day in a room, may control which type of imager is used to capture an image and which modifications may be needed to the program that generates inferences. According to this example, a pointing device with both a visible light sensor and an IR sensor may turn off the visible light sensor when the device is operated in the dark. As another example, the system may determine that significant modifications have been made to the physical location of a given area, such that the system can no longer reliably identify that area. More specifically, the system can determine that the user has rearranged the physical space since the setup procedure was executed, and in response return to the setup phase to recalibrate and refine the signature associated with this region.
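The fallback behaviour can be sketched as a simple mode selection driven by an ambient light reading and the availability of usable frames. The threshold value and mode names are illustrative assumptions.

```python
def select_mode(ambient_lux, visible_frame_ok, ir_frame_ok):
    """Choose an imager mode, or fall back to manual mode on a fault or lack of data."""
    if ambient_lux < 5.0:            # too dark for the visible light sensor (assumed threshold)
        return "ir_imager" if ir_frame_ok else "manual"
    return "visible_imager" if visible_frame_ok else "manual"
```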
In any of the above examples, the user may be provided with an option to switch to the default mode, rather than the system automatically exiting the imager mode or automatically changing how the imager mode performs. The user may also be continuously presented, through configurable settings, with the option of switching to the default mode, regardless of whether a fault condition or a lack-of-data condition exists.
In certain embodiments of the invention, components of the system may be coupled to a mobile robot to automatically adjust the physical location of those components during the setup or operational phase of use. For example, the pointing device, the support device, and the control device described above may all, or individually, be mounted on a mobile robot. The mobile robot may be a treaded, wheeled, or legged robot. The mobile robot may be an aerial robot, for example, a micro quadcopter. The mobile robot may be a rotating imager mount base attached to a fixed tripod or other support, where the robot can change and set the pose of the attached imager. Any of the mobile robots described above may also be enhanced by such a rotating imager mount base. The mobile robot may provide six degrees of freedom (DOF) of mobility for the device, allowing the device to take on different x, y, and z positions in the environment, as well as different pitch, yaw, and roll imager poses. The robot may also automatically change the region captured by the imager.
In particular embodiments of the present invention, the mobile robot can change the position of the device in the setup phase to increase the diversity of the reference images. The position may be changed according to a set procedure, or according to a procedure that varies with the feature quantity during acquisition of the set of reference images. The position change can be created based on the feature quantity as part of the procedure described above. In embodiments where the feature quantity is a measure of the difference or correlation between the camera poses or the environmental physical appearance for different images, a feedback message describing the difference required for the next reference image may be generated by computer-readable instructions, and the mobile robot may create the required difference, setting the environment to the conditions required to respond to the feedback message. For example, the feature quantity may indicate that the reference images lack sufficient difference in camera pose, and the mobile robot may move the imager to a different position to obtain another reference image. As another example, the feature quantity may indicate a lack of sufficient physical distinctiveness within an area, in which case the mobile robot may move itself or other items within the area to increase such diversity.
While the specification has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily conceive of alterations to, variations of, and equivalents to these embodiments. For example, the present invention uses a remote control as an example of a pointing device whose pointing direction is determined in order to send commands to a controllable device. However, the approach may be applied more generally to any pointing device, such as a presentation pointer, an inventory management device, or a wireless tag toy. Furthermore, the same approach can be used to determine the heading of any device in a particular physical environment, e.g., a robot or drone navigating automatically in a given space. Furthermore, many of the methods described herein are applicable to devices having built-in imagers even if those devices were not originally intended for use as pointing devices. For example, a camera mounted on the back of a smartphone may be used as an imager aligned with the pointing direction of the smartphone, while its display may be used as a control interface. These and other modifications and variations to the present invention may be practiced by those of ordinary skill in the art without departing from the scope of the present invention, which is more particularly set forth in the appended claims.

Claims (30)

1. A system, comprising:
a control device, wherein the shape of the control device defines a pointing direction of the control device;
an onboard imager located on said control device, wherein a field of view of said onboard imager includes a pointing direction; and
one or more computer-readable media storing instructions that are executable to:
receiving a reference image;
determining a feature quantity according to the reference image;
generating a feedback message according to the feature quantity; and
predicting a termination of a setup phase of the control device according to the feature quantity.
2. The system of claim 1, wherein the one or more computer-readable media further store instructions executable to:
receiving at least one additional reference image;
re-determining the feature quantity from the reference image and the at least one additional reference image; and
after the feature quantity is re-determined, triggering the termination of the setup phase of the system according to the feature quantity.
3. The system of claim 1, wherein the one or more computer-readable media further store instructions executable to:
changing the environment of the control device according to the feedback message;
wherein changing the environment comprises changing one of: brightness of the environment, a position of a mobile robot in the environment, an image displayed on a screen in the environment;
wherein the feedback message indicates that at least one additional reference image needs to be captured to improve the feature quantity.
4. The system of claim 1, wherein:
the feedback message indicates that at least one additional reference image needs to be captured to improve the feature quantity; and
the feedback message includes instructions on how to capture the at least one additional reference image.
5. The system of claim 1, further comprising:
a companion device having a companion device imager and a display;
wherein the reference image is captured by the companion device imager; and
wherein the one or more computer-readable media further store instructions to display the reference image on the display.
6. The system of claim 5, wherein:
at least one additional reference image is captured by the companion device; and
the instructions specify at least one position at which the companion device is to be located when capturing the at least one additional reference image.
7. The system of claim 1, wherein:
determining the feature quantity includes determining that the reference image is of insufficient quality; and
the feedback message includes an instruction to recapture the reference image.
8. The system of claim 1, the one or more computer-readable media further storing instructions executable to:
disabling a capture interface on the control device according to the feature quantity; and
wherein the determining of the feature quantity from the reference image is performed in real time.
9. The system of claim 1, the one or more computer-readable media further storing instructions executable to:
generating a signature for a target region using the reference image;
associating a communication object with the target area;
receiving a pointing target image from the onboard imager;
identifying the signature using the pointing target image; and
activating a communication interface of the communication object in response to identifying the signature.
10. The system of claim 1, further comprising:
a display located on said control device; and
wherein the one or more computer-readable media further store instructions that are executable to:
generating a signature for a target region using the reference image;
associating a controllable object with the target area;
receiving a pointing target image from the onboard imager;
identifying the signature using the pointing target image; and
displaying a controllable object user interface on the display in response to identifying the signature.
11. The system of claim 1, further comprising:
a motion tracker located on said control device; and
wherein the one or more computer-readable media further store instructions that are executable to:
determining the feature quantity according to the reference image and data provided by the motion tracker; or
generating the feedback message according to the feature quantity and the data provided by the motion tracker.
12. A computer-implemented method performed using a control device, wherein: (i) a shape of the control device defines a pointing direction of the control device; and (ii) a field of view of an onboard imager on the control device includes the pointing direction, the method comprising the steps of:
capturing a reference image;
determining a feature quantity according to the reference image;
generating a feedback message according to the feature quantity; and
predicting a termination of a setup phase of the control device according to the feature quantity.
13. The computer-implemented method of claim 12, further comprising the steps of:
receiving at least one additional reference image;
re-determining the feature quantity from the reference image and the at least one additional reference image; and
after the feature quantity is re-determined, triggering the termination of the setup phase according to the feature quantity.
14. The computer-implemented method of claim 12, further comprising the steps of:
capturing the reference image using a companion device, wherein the companion device has a companion device imager and a display; and
displaying the reference image on the display.
15. The computer-implemented method of claim 14, wherein:
the feedback message indicates that at least one additional reference image needs to be captured to improve the feature quantity.
16. The computer-implemented method of claim 15, wherein:
the feedback message includes instructions on how to capture the at least one additional reference image.
17. The computer-implemented method of claim 16, further comprising:
capturing the at least one additional reference image using the companion device imager; and
wherein the instructions specify at least one position that the companion device is to be in when capturing the at least one additional reference image.
18. The computer-implemented method of claim 14, further comprising:
continuously capturing a real-time image stream using the onboard imager; and
wherein the feedback message comprises an augmented reality image stream generated using a real-time image stream.
19. The computer-implemented method of claim 12, wherein:
the step of determining the feature quantity includes determining that the reference image is of insufficient quality; and
the feedback message includes an instruction to recapture the reference image.
20. The computer-implemented method of claim 12, further comprising the steps of:
disabling a capture interface on the control device according to the feature quantity; and
wherein the step of determining the feature quantity from the reference image is performed in real time.
21. The computer-implemented method of claim 12, further comprising the steps of:
generating a signature for a target region using the reference image;
associating one of a communication object and a controllable object with the target area;
receiving a pointing target image from the onboard imager;
identifying the signature using the pointing target image; and
activating a communication interface of the communication object in response to identifying the signature.
22. The computer-implemented method of claim 12, further comprising the steps of:
determining the feature quantity according to the reference image and data provided by a motion tracker on the control device; or
generating the feedback message according to the feature quantity and the data provided by the motion tracker.
23. A system, comprising:
a control device, wherein the control device has a pointing direction;
an onboard imager located on said control device, wherein a field of view of said onboard imager includes a pointing direction; and
one or more computer-readable media storing instructions that are executable to:
receiving a set of reference images;
determining a feature quantity from the set of reference images; and
according to the feature quantity: (i) generating a feedback message; or (ii) terminating a setup phase of the control device.
24. The system of claim 23, further comprising:
a companion device having a companion device imager and a display;
wherein the set of reference images is captured by the companion device imager; and
wherein the one or more computer-readable media further store instructions to display the set of reference images on the display.
25. The system of claim 24, wherein:
the feedback message indicates that at least one additional reference image needs to be captured to improve the feature quantity.
26. The system of claim 25, wherein:
the feedback message includes instructions on how to capture the at least one additional reference image.
27. The system of claim 26, wherein:
the at least one additional reference image is captured by the companion device; and
the instructions specify at least one position at which the companion device is to be located when capturing the at least one additional reference image.
28. The system of claim 23, the one or more computer-readable media further storing instructions executable to:
generating a signature for a target region using the set of reference images;
associating a communication object with the target area;
receiving a pointing target image from the onboard imager;
identifying the signature using the pointing target image; and
activating a communication interface of the communication object in response to identifying the signature.
29. The system of claim 23, wherein:
the one or more computer-readable media further store instructions that are executable to:
(i) generating the feedback message; and (ii) continuously capturing a real-time image stream using the onboard imager; and
the feedback message includes an augmented reality image stream generated using a real-time image stream.
30. The system of claim 23, the system further comprising:
a motion tracker located on said control device; and
wherein the one or more computer-readable media further store instructions that are executable to:
determining the feature quantity from the set of reference images and data provided by the motion tracker; or
generating the feedback message according to the feature quantity and the data provided by the motion tracker.
CN202080070383.3A 2019-08-08 2020-08-10 Supervisory setup of a control device with an imager Pending CN114600067A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962884278P 2019-08-08 2019-08-08
US62/884,278 2019-08-08
PCT/IB2020/057519 WO2021024238A1 (en) 2019-08-08 2020-08-10 Supervised setup for control device with imager

Publications (1)

Publication Number Publication Date
CN114600067A true CN114600067A (en) 2022-06-07

Family

ID=72139626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080070383.3A Pending CN114600067A (en) 2019-08-08 2020-08-10 Supervisory setup of a control device with an imager

Country Status (4)

Country Link
US (1) US11445107B2 (en)
CN (1) CN114600067A (en)
DE (1) DE112020003765T5 (en)
WO (1) WO2021024238A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210044606A (en) * 2019-10-15 2021-04-23 삼성전자주식회사 Method of generating wakeup model and electronic device therefor
EP4290266A1 (en) * 2021-08-23 2023-12-13 Samsung Electronics Co., Ltd. Electronic device for controlling external electronic device and operation method thereof

Family Cites Families (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7224903B2 (en) * 2001-12-28 2007-05-29 Koninklijke Philips Electronics N. V. Universal remote control unit with automatic appliance identification and programming
US6990639B2 (en) * 2002-02-07 2006-01-24 Microsoft Corporation System and process for controlling electronic components in a ubiquitous computing environment using multimodal integration
US7653212B2 (en) * 2006-05-19 2010-01-26 Universal Electronics Inc. System and method for using image data in connection with configuring a universal controlling device
JP2004166193A (en) * 2002-09-27 2004-06-10 Matsushita Electric Ind Co Ltd Remote control device
US7940986B2 (en) 2002-11-20 2011-05-10 Koninklijke Philips Electronics N.V. User interface system based on pointing device
KR100531141B1 (en) * 2003-04-01 2005-11-28 최동욱 System and method for home automation using ir and rf combined remocon module
KR100587539B1 (en) * 2003-08-09 2006-06-08 삼성전자주식회사 Method And System for Configuring Intelligent Connection Between A/V Device And Peripheral Devices
WO2005043484A1 (en) * 2003-11-04 2005-05-12 Koninklijke Philips Electronics N.V. Universal remote control device with touch screen
WO2006079939A2 (en) 2005-01-28 2006-08-03 Philips Intellectual Property & Standards Gmbh Method for control of a device
US7801328B2 (en) * 2005-03-31 2010-09-21 Honeywell International Inc. Methods for defining, detecting, analyzing, indexing and retrieving events using video image processing
CN101213506B (en) * 2005-06-30 2011-06-22 皇家飞利浦电子股份有限公司 Control method, control device and entertainment system and lighting system including control device
EP1744290B1 (en) * 2005-07-15 2018-05-30 Samsung Electronics Co., Ltd. Integrated remote controller and method of selecting device controlled thereby
US7558950B2 (en) * 2005-10-27 2009-07-07 Sony Ericsson Mobile Communications Ab Methods of configuring an electronic device to be operable with an electronic apparatus based on automatic identification thereof and related devices
US9870123B1 (en) * 2008-04-18 2018-01-16 Universal Electronics Inc. Selecting a picture of a device to identify an associated codeset
US20090285443A1 (en) * 2008-05-15 2009-11-19 Sony Ericsson Mobile Communications Ab Remote Control Based on Image Recognition
US9355557B2 (en) * 2008-07-16 2016-05-31 Samsung Electronics Co., Ltd. Universal remote controller and remote control method thereof
KR101499133B1 (en) * 2008-10-28 2015-03-11 삼성전자주식회사 Method and device for performing menu in wireless terminal
US9129516B2 (en) * 2009-06-01 2015-09-08 At&T Intellectual Property I, L.P. Programming a universal remote control using an identifying device mark
US8624713B2 (en) * 2009-08-11 2014-01-07 At&T Intellectual Property I, L.P. Programming a universal remote control via physical connection
JP5617246B2 (en) 2010-01-12 2014-11-05 ソニー株式会社 Image processing apparatus, object selection method, and program
JP5728159B2 (en) 2010-02-02 2015-06-03 ソニー株式会社 Image processing apparatus, image processing method, and program
US8655344B2 (en) * 2010-05-03 2014-02-18 Interdigital Patent Holdings, Inc. Addressing wireless nodes
US9449500B2 (en) * 2012-08-08 2016-09-20 Universal Electronics Inc. System and method for optimized appliance control
US20140013100A1 (en) * 2012-07-05 2014-01-09 Martin M. Menzel Establish bidirectional wireless communication between electronic devices using visual codes
US8594632B1 (en) * 2012-12-11 2013-11-26 Intel Corporation Device to-device (D2D) discovery without authenticating through cloud
US9367144B2 (en) * 2013-03-13 2016-06-14 Google Inc. Methods, systems, and media for providing a remote control interface for a media playback device
US9843831B2 (en) * 2013-05-01 2017-12-12 Texas Instruments Incorporated Universal remote control with object recognition
EP3146729A4 (en) * 2014-05-21 2018-04-11 Millennium Three Technologies Inc. Fiducial marker patterns, their automatic detection in images, and applications thereof
CN105335136B (en) * 2014-07-16 2019-08-09 阿里巴巴集团控股有限公司 The control method and device of smart machine
FR3024267B1 (en) 2014-07-25 2017-06-02 Redlime METHODS FOR DETERMINING AND CONTROLLING A CONTROL EQUIPMENT, DEVICE, USE AND SYSTEM IMPLEMENTING SAID METHODS
WO2016017945A1 (en) * 2014-07-29 2016-02-04 Samsung Electronics Co., Ltd. Mobile device and method of pairing the same with electronic device
US9824578B2 (en) 2014-09-03 2017-11-21 Echostar Technologies International Corporation Home automation control using context sensitive menus
WO2016191875A1 (en) * 2015-06-04 2016-12-08 Griffin Innovation Device and method for controlling a plurality of targeted devices
KR102398488B1 (en) * 2015-06-26 2022-05-13 엘지전자 주식회사 Mobile terminal capable of remotely controlling a plurality of device
US10222932B2 (en) * 2015-07-15 2019-03-05 Fyusion, Inc. Virtual reality environment based manipulation of multilayered multi-view interactive digital media representations
US20170048901A1 (en) * 2015-08-10 2017-02-16 General Electric Company Method for associating a mobile device with an appliance
CN105160854B (en) * 2015-09-16 2019-01-11 小米科技有限责任公司 Apparatus control method, device and terminal device
CN105204742B (en) 2015-09-28 2019-07-09 小米科技有限责任公司 Control method, device and the terminal of electronic equipment
US11018888B2 (en) * 2015-12-03 2021-05-25 Whirlpool Corporation Methods of remote control of appliances
US9881191B2 (en) * 2015-12-14 2018-01-30 Leadot Innovation, Inc. Method of controlling operation of cataloged smart devices
US10929642B2 (en) * 2015-12-26 2021-02-23 Intel Corporation Identification of objects for three-dimensional depth imaging
EP3214536B1 (en) 2016-03-03 2020-01-08 Wipro Limited System and method for remotely controlling a device
US20170311368A1 (en) * 2016-04-25 2017-10-26 Samsung Electronics Co., Ltd. Methods and systems for managing inter device connectivity
EP3545683A1 (en) * 2016-11-22 2019-10-02 Caavo, Inc. Automatic screen navigation for media device configuration and control
TWI628631B (en) * 2017-05-08 2018-07-01 和碩聯合科技股份有限公司 Remote control system, remote control method and gateway
US10620721B2 (en) * 2018-01-29 2020-04-14 Google Llc Position-based location indication and device control
US10559194B2 (en) * 2018-02-23 2020-02-11 Samsung Electronics Co., Ltd. System and method for providing customized connected device functionality and for operating a connected device via an alternate object
US10725629B2 (en) * 2018-06-25 2020-07-28 Google Llc Identifying and controlling smart devices

Also Published As

Publication number Publication date
WO2021024238A1 (en) 2021-02-11
US11445107B2 (en) 2022-09-13
US20210044741A1 (en) 2021-02-11
DE112020003765T5 (en) 2022-04-21

Similar Documents

Publication Publication Date Title
US11887312B2 (en) Fiducial marker patterns, their automatic detection in images, and applications thereof
US10671846B1 (en) Object recognition techniques
US8537231B2 (en) User interface system based on pointing device
JP2019114280A (en) Object tracking device, object tracking system, object tracking method, display controller, object detection device, program, and recording medium
CN105393079A (en) Context-based depth sensor control
WO2014162554A1 (en) Image processing system and image processing program
JP5956248B2 (en) Image monitoring device
EP3314581B1 (en) Augmented reality device for visualizing luminaire fixtures
CN114600067A (en) Supervisory setup of a control device with an imager
US11341716B1 (en) Augmented-reality system and method
Piérard et al. I-see-3d! an interactive and immersive system that dynamically adapts 2d projections to the location of a user's eyes
KR102372669B1 (en) Electronic device, external server and controlling method thereof
EP3607353B1 (en) Illuminating an environment for localisation
CN112789621A (en) Method and apparatus for detecting vertical plane surface
US11941794B2 (en) Commissioning of lighting system aided by augmented reality
JP2013130911A (en) Location determination device
JP2018036847A (en) Information processing system
JP6693069B2 (en) Image display device, method, and program
KR20240035812A (en) Mapping of networked devices
CN115176286A (en) Image processing apparatus, image processing method, and program
Axelsson et al. Detection and Tracking by Wireless UWB and Camera Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination