EP3014874A1 - System and method for user monitoring and intent determination - Google Patents

System and method for user monitoring and intent determination

Info

Publication number
EP3014874A1
Authority
EP
European Patent Office
Prior art keywords
state
viewing area
recited
identity
media
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP14817985.6A
Other languages
German (de)
French (fr)
Other versions
EP3014874A4 (en)
Inventor
Arsham Hatambeiki
Paul D. Arling
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Universal Electronics Inc
Original Assignee
Universal Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/925,966 (patent US9137570B2)
Application filed by Universal Electronics Inc
Publication of EP3014874A1
Publication of EP3014874A4
Current legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223 Cameras
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/44218 Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection
    • H04N21/4826 End-user interface for program selection using recommendation lists, e.g. of programs or channels sorted out according to their score

Definitions

  • Home entertainment systems comprised of plural appliances and/or the controlling devices used to issue commands to such appliances may be provisioned with devices for detecting user presence and/or user interaction via methods such as gesture, spoken voice, facial recognition, spatial analysis, etc., as known in the art.
  • personal communication devices such as smart phones, tablet computers, etc., may provide additional means for identification of user presence via detection of such personal communication devices on a local wireless network such as a WiFi network, a Bluetooth network, etc.
  • a central routing appliance such as an AV receiver, set top box, smart TV, etc.
  • This invention relates generally to home entertainment systems and control methods therefor and, in particular, to enhanced functionalities for such home
  • sensing interfaces such as an image sensing interface (e.g., an interface associated with a camera), a sound sensing interface (e.g., an interface associated with a microphone), and/or an interface for sensing the presence of an RF device such as a smart phone may be used to fully or partially automate a system response to common events which may occur during a TV viewing session, such as a user or users leaving or entering the viewing area, a user answering a telephone call, the detection of a doorbell ringing or a baby monitor alarm, etc.
  • data derived from such sensing interfaces may be utilized to enhance the responsiveness of one or more system components, for example by sensing when a user is reaching for a physical remote control unit or preparing a component to issue a voice or gesture command.
  • user presence data derived from such sensing interfaces may be used by a central routing appliance in conjunction with media stream information in order to capture and report user viewing habits and/or preferences.
  • Figure 1 illustrates an exemplary system in which the teachings of the subject invention may be utilized;
  • Figure 2 illustrates a further exemplary system in which the teachings of the subject invention may be utilized;
  • Figure 3 illustrates, in block diagram form, an exemplary hardware architecture for an appliance which may be a component part of the systems illustrated in Figures 1 and 2;
  • Figure 4 illustrates, in block diagram form, an exemplary software architecture for the illustrative appliance of Figure 3;
  • Figure 5 illustrates, in flowchart form, the operation of an exemplary event processing module of the software architecture illustrated in Figure 4.
  • an exemplary home entertainment system in which the methods of the subject invention may be applied may comprise an AV receiver 100 which serves as a hub for directing a selected video and/or audio media stream from a source appliance such as, for example, a satellite or cable system set top box and/or DVR device ("STB") 104, a DVD player 106, a CD player 108, or a game console 110 to a destination appliance, such as a TV set 102, where the selected video and/or audio media stream is to be rendered.
  • connections 130, 132 between appliances 102 through 110 and AV receiver 100 may generally comprise connections for carrying HDMI-compliant digital signals, although it will be appreciated that other interface standards such as component video, PCM audio, etc., may be substituted where necessitated by the limitations of a particular appliance.
  • additional AV content streams may also be available from a streaming service 118 such as, for example, Netflix, Vudu, YouTube, NBC online, etc., via a wide area network such as the Internet 116, to which end AV receiver 100 may be provisioned with a connection 112 to an Internet gateway device such as, for example, router 114.
  • connection between AV receiver 100 and Internet gateway device 114 may be wired as illustrated, or may be wireless, e.g., a WiFi local area network, as appropriate.
  • an exemplary home entertainment system may also be provisioned with one or more sensing interfaces, such as interfaces associated with microphone 120 and camera 122, suitable for the capture of audible and/or visible events within the home entertainment system environment.
  • Users 124, 126 of the illustrative entertainment system may select the media stream currently being viewed by means of any convenient method, such as through use of a remote control as known in the art, a voice command, a gesture, etc.
  • data regarding these selections including without limitation media source, channel, track, title, etc., together with viewing duration, user presence, etc. may be accumulated and reported to a database server 128 for the aggregation and analysis of user viewing habits and preferences as discussed in greater detail below.
  • a "smart" TV device 200 may incorporate both content rendering and source stream selection functionality.
  • local appliances 104 through 110 may be connected directly to multiple input ports of TV 200 via, for example, HDMI connections 130.
  • TV 200 may also support a connection 112 to a wide area network such as the Internet 116 over which streaming AV content and other data may be received.
  • the means for user command input to TV 200 and appliances 104 through 110 may take the form of a controlling device 204, for example a conventional remote control or a smart phone app in communication with the appliances via any convenient infrared (IR), radio frequency (RF), hardwired, point-to-point, or networked protocol, as necessary to cause the respective target appliances to perform the desired operational functions.
  • user input may also comprise spoken and/or gestured commands in place of or supplemental to controlling device signals, which sounds and gestures may be received by microphone 120 and camera 122, processed and decoded by one of the appliances, for example TV 200, and where necessary relayed to other target appliances via, for example, HDMI CEC commands, IR or RF signals, etc., as described for example in co-pending U.S. Patent Application 13/657,176 "System and Method for Optimized Appliance Control," of common ownership and incorporated by reference herein in its entirety.
  • an exemplary central routing appliance such as smart TV appliance 200 may include, as needed for a particular application, rendering capabilities, e.g., TV engine and media processor 300 (it being appreciated that this may comprise one or more than one physical processor depending on the particular embodiment); memory 302 which may comprise any type of readable or read/write media such as RAM, ROM, FLASH, EEPROM, hard disk, optical disk, etc., or a combination thereof; a USB interface 304; digital AV input ports and interface 306, for example DVI or HDMI; analog AV input ports and interface 308, for example composite or component video with associated analog audio; an Ethernet and/or WiFi interface 310; a Bluetooth interface 328; a digital camera interface 312 with associated camera 122 which may be externally connected or built into the cabinet of TV 200; a microphone interface 314 with associated microphone 120 which may be externally connected or built into the cabinet of TV 200; a remote control interface 316 for receiving user-initiated operational commands via IR or RF signals 326; a display output 318 connected to TV screen 322; and an audio output 320 connected to internal or
  • Executable instructions (hereafter the "TV programming") may be stored within the memory 302 for execution by TV engine and media processor(s) 300.
  • An exemplary architecture for such TV programming is presented in Figure 4.
  • the exemplary TV programming may include, as required for a particular application, an underlying operating system 402, such as for example LINUX, which may support a set of software modules implementing the various functionalities of the smart TV device.
  • Such software modules may include a hardware abstraction layer 404 to provide a device independent interface between the various application software modules and the hardware dependent software modules such as video output driver 406, audio output driver 408, HDMI interface 410, analog input/output ADC/DAC 412, Ethernet and/or WiFi interface 414, Bluetooth interface 416, USB interface 418, remote control interface 420, and camera and microphone drivers 422 and 423.
  • Exemplary application modules which reside above abstraction layer 404 may include as required for a particular embodiment transport and session layer protocol and interface management 428; AV output management 424; input/output processing and routing 440; a miscellaneous services module 426 to support closed captioning, display configuration, OSD, etc.; remote control command decoder 430; and device resource management 442.
  • the exemplary TV programming may include visual and audio event detector modules 432, 434, for example image and/or voice recognition engines or the like; user event processing 436; and a user statistics gathering and reporting module 438.
  • smart TV appliance 200 may for example receive an incoming AV media stream from one of the input ports 306, 308 to be processed, buffered, separated into audio and video components, and routed to outputs 318, 320 for rendering on TV display screen 322 and loudspeaker(s) 324; may receive commands from remote control interface 316 which are decoded and acted upon, for example to select an input media stream, adjust audio volume, etc.; may manage a connection to the Internet through Ethernet or WiFi interface 310 to enable browsing for content, download of software updates, video telephony utilizing inputs from camera 122 and microphone 120; etc.
  • the exemplary TV programming may receive and process input signals from controlling device 204, camera 122 and/or microphone 120 in order to detect user presence, identify individual users, and/or receive user command input, as will be described hereafter.
  • the source of audio input signals may comprise a microphone 120 and associated interface 314 provisioned as part of a smart TV appliance 200
  • audio input signals may be captured by any other appliance in the system and forwarded to appliance 200 for processing, or may originate from a microphone provisioned in a controlling device such as remote control or smartphone 204, the output of which microphone may, by way of example, be digitized and/or processed by controlling device 204 and wirelessly forwarded to smart TV appliance 200 via remote control interface 326, WiFi interface 310, Bluetooth interface 328, or any other means as appropriate for a particular implementation.
  • the user event processing module 436 of the TV programming of TV appliance 200 may act as illustrated in the flowchart of Figure 5 upon occurrence of a user-related event.
  • At step 502, it may be determined whether the event constitutes receipt of a remote control command, as may be reported by remote control command decoder 430.
  • remote control commands may be received via any or all of RC interface 420 (e.g., infrared or RF4CE signals or the like), Ethernet/WiFi interface 414, or Bluetooth interface 416, depending on the particular embodiment and the particular controlling device currently in use.
  • the requested functional operation may be executed.
  • such operations may include adjustment of output audio volume to be performed by AV management module 424, selection of a new media input stream to be performed by I/O and routing module 440, etc.
  • received remote control commands may also comprise requests to direct the functional operation of other connected appliances, for example control of DVD player 106 or STB 104 via CEC commands to be issued by interface management module 428 over HDMI connections 130; or
  • At step 542, it may next be determined whether the command function just performed comprised a change in the media stream being rendered by TV 200, for example selection of a new input port 306, 308; a change to the selected broadcast channel or DVR playback of STB 104; a change to an internet media source, etc. If so, at step 544 data regarding this event may be conveyed to user statistics module 438 for logging and ultimate reporting to database server 128.
  • the data logged regarding the new content stream may comprise some or all of the command parameters themselves, e.g., an STB channel number and timestamp; metadata items obtainable from the content source device or channel such as a DVD title, streaming video URL, etc.; a sample of the audio or video content itself for analysis as is known in the art and described, for example, in U.S. Patents 7,986,913, 7,627,477 or 7,346,512; or any other data which may be suitable for identification purposes.
  • At step 546, it may next be determined whether the executed function comprised a reportable event, such as for example powering TV 200 (or any other connected device) on or off, issuing a fast forward command during DVR playback, etc. If so, this event may also be reported to user statistics module 438 for logging, after which processing of the remote control command is complete.
  • reportable appliance events such as powering an attached device on or off may also be separately initiated via for example direct communication to an appliance from its own remote control. Accordingly, though not illustrated, where such events are detectable, for example via HDMI status, absence or presence of video or audio signals, etc., these events may also be reported and logged.
  • the event processing may next determine if the reported event constitutes an image change event reported by visual event detection module 432 in response to analysis of image data received from camera 122 via camera driver 422.
  • image processing may utilize for example the techniques described in U.S. Patents 5,534,917, 6,829,384, 8,274,535, WIPO (PCT) patent application publication WO2010/057683A1, or the like, and may for example periodically monitor an image comprising the field of view of camera 122 in order to initiate image analysis in response to detection of any variation in image data which exceeds a certain threshold value.
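The threshold test described above can be sketched as simple frame differencing. This is only an illustration and is not the method of the cited patents; the function names, the list-of-rows grayscale frame representation, and the default threshold value are all assumptions made for this example.

```python
from typing import Sequence

# Hypothetical frame format: rows of 0-255 grayscale pixel values.
Frame = Sequence[Sequence[int]]


def mean_abs_difference(previous: Frame, current: Frame) -> float:
    """Average per-pixel absolute difference between two equal-sized frames."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same dimensions")
    total = 0
    count = 0
    for prev_row, cur_row in zip(previous, current):
        if len(prev_row) != len(cur_row):
            raise ValueError("frames must have the same dimensions")
        for p, c in zip(prev_row, cur_row):
            total += abs(p - c)
            count += 1
    return total / count if count else 0.0


def image_changed(previous: Frame, current: Frame, threshold: float = 10.0) -> bool:
    """Return True when the variation exceeds the (assumed) threshold,
    i.e. when fuller image analysis would be initiated."""
    return mean_abs_difference(previous, current) > threshold
```

Only frames for which `image_changed` returns True would be handed to the more expensive recognition stage, which keeps the periodic monitoring cheap.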
  • the event processing 436 may be adapted to issue a "pause" command to the source of the current media stream.
  • Other actions may include, without limitation, activating the recording function of a DVR, logging off a Web site, etc., as appropriate for a particular embodiment and configuration.
  • the event processing may cause display of a request for confirmation on TV screen 322, e.g. "Would you like to pause this show? (Y/N)." If confirmed by the user at step 522, which confirmation may take the form of a gesture, spoken command, remote control input, etc., or, in those embodiments where the default is to take action, a timeout, at step 528 the indicated action may be executed.
  • the performance accuracy of audio and/or visual event detection modules 432, 434 may be improved by indicating a range of possible responses (e.g., "yes" or "no" in this instance) to these modules in advance, thereby limiting the number of sound or gesture templates which need to be matched.
  • the sound level of TV audio output 320 may be temporarily lowered to reduce background noise.
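The benefit of announcing the expected responses in advance can be illustrated with a small sketch. It assumes a recognizer that has already produced a similarity score per template label; the function name `best_match`, the score dictionary, and the `min_score` cutoff are hypothetical and not taken from the source.

```python
from typing import Dict, Iterable, Optional


def best_match(scores: Dict[str, float],
               expected: Optional[Iterable[str]] = None,
               min_score: float = 0.5) -> Optional[str]:
    """Pick the highest-scoring template label.

    When `expected` is given, only those labels are considered, which
    mirrors limiting the templates to the anticipated responses.
    """
    if expected is not None:
        allowed = set(expected)
        scores = {label: s for label, s in scores.items() if label in allowed}
    if not scores:
        return None
    label = max(scores, key=lambda name: scores[name])
    return label if scores[label] >= min_score else None
```

With scores of 0.62 for "yes" and 0.70 for an unrelated template, an unrestricted match would pick the unrelated template, while restricting the candidates to "yes" and "no" yields "yes".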
  • At step 530, data regarding the change in user, including user identity where this is determinable, for example via use of techniques such as described in U.S. Patents 7,551,756, 7,702,599, or the like, may be conveyed to statistic gathering module 438 for logging, after which processing is complete.
  • the event processor may next determine if the reported event comprises the arrival of a new or additional user in the TV viewing environment and if so take appropriate action.
  • the event processor may be adapted to allow a viewer to invoke a "private viewing" status which may cause the current content to be automatically muted, paused, switched, etc. in the event an additional user enters the viewing environment.
  • the entry of a user into a viewing environment may trigger an offer to resume playback of previously paused content; or in a multi-room, multi-device household in which appliances are networked together and equipped with viewer recognition, the entry of a user into one viewing environment may cause the event processor in that environment to query other event processors and/or statistic modules elsewhere in the household to determine if that user has recently departed another environment, and if so, offer to resume playback of a content stream which was previously paused in that other environment.
  • the action(s) to be taken upon a user entering the viewing environment may be set to be specific to the identity of the arriving person, e.g., to be performed only when a specifically recognized individual or recognized type/category of individual, such as a child, enters the viewing environment.
  • the action(s) to be taken upon the arrival of a user to the viewing environment may be set to be specific to the identity and/or type of the currently viewing user or users.
  • an action to be executed may be one to inhibit the performance of any new actions by the currently viewing user or users, e.g., to inhibit a child from changing a channel that is currently being viewed to thereby allow the parent an opportunity to see what the child is currently watching.
  • the inhibiting of any action(s) may be lifted after a given period of time expires, upon the new user again exiting the viewing area, upon the new user overriding the action (for example via a gesture, voice, further action, or the like - to the extent that user is authorized to perform such action), etc.
  • the actions to be taken when a user enters the viewing area may be prioritized based upon the identities of the newly arriving user and the currently viewing user or users.
  • the actions to be taken by the system upon a detected "user entry event" can be used to provide one or more appliances within the system with desired states, e.g., to establish volume level settings, closed-captioning settings, commercial skipping settings, SAP settings, lighting level settings, and/or similar user preference settings without limitation.
  • the actions to be taken by the system upon the detected arrival of a new user to the viewing area may cause the system to combine the favorite channel listings (such as shown in an electronic program guide) that have been established for the multiple viewers into a single listing or to provide a single listing that will include only those programs and/or channels that are commonly found within the favorite channel listings that have been established for each of the multiple viewers.
  • access to videos, games, programs, channels, or the like can be limited to only those videos, games, programs, channels, or the like that are commonly accessible to each of the multiple viewers.
  • the system may take appropriate action(s) to establish within the system one or more of these preferences upon the detection of a "user entry event."
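The two ways of combining favorite channel listings described above (a single merged listing, or only the entries common to every viewer) can be sketched as follows. The function name, the per-user dictionary of favorites, and the `common_only` flag are assumptions made for illustration.

```python
from typing import Dict, List


def combined_favorites(favorites: Dict[str, List[str]],
                       viewers: List[str],
                       common_only: bool = False) -> List[str]:
    """Combine the favorites of the viewers currently present.

    common_only=False: one merged listing without duplicates.
    common_only=True:  only entries found in every viewer's listing.
    Order follows first appearance so the guide stays stable.
    """
    lists = [favorites.get(viewer, []) for viewer in viewers]
    if not lists:
        return []
    if common_only:
        shared = set(lists[0]).intersection(*lists[1:])
        return [ch for ch in dict.fromkeys(lists[0]) if ch in shared]
    merged: Dict[str, None] = {}
    for channels in lists:
        for ch in channels:
            merged.setdefault(ch, None)
    return list(merged)
```

The same intersection logic would apply to limiting access to content that is commonly accessible to each of the multiple viewers.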
  • At step 526, it may be determined whether any such "user entry event" action preferences have been set and, if so, at step 528 the appropriate action(s) may be executed, after which data regarding the user arrival, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete.
  • a "user exit event" indicative of one or more users exiting or preparing to exit the viewing area can be used by the system to take actions that would remove consideration of the exiting user's preference, to place the system into a state that is appropriate for the remaining users, etc.
  • the actions to be taken by the system upon detection of a "user exit event" can be set so as to be specific to the identity of the exiting user or users, the identity of remaining user or users, and/or may be prioritized based upon the identities of the exiting user or users and the remaining user or users without limitation.
  • At step 518, it may be determined whether any such "user exit event" action preferences have been set and, if so, at step 520 the appropriate action(s) may be executed, after which data regarding the user exiting, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete.
  • If the detected image change event is not a user arrival or departure, at steps 532, 534 and 536 it may next be determined whether the reported event comprises a detectable gesture, a detectable gesture in this context comprising a user action or motion which either by pre-programming or a learning process has been identified to visual event detection module 432 as having significance as user input. If the reported event is determined to be associated with an operational command function, for example "pause", "mute", "channel up", etc., processing may continue at step 540 to execute the command as described previously. If the reported event is determined to be a preparatory gesture, appropriate anticipatory action may be taken at step 538.
  • a preparatory gesture may comprise without limitation any preliminary motion or gesture by a user which may be interpreted as a possible indication of that user's intent to perform an action, for example standing up, reaching for or setting down a remote control device, beckoning an out of sight person to enter the room, picking up a phone, etc.
  • Anticipatory actions may include for example pre-conditioning visual and/or audio detection modules 432, 434 to favor certain subsets of templates for matching purposes; modifying an onscreen menu from a format optimized for gesture control to one optimized for navigation via a keyboarded device such as a remote control, or vice versa; reducing or muting audio volume; signaling a remote control device to exit a quiescent state, to turn on or shut off its backlight, or the like; etc.
  • the event processing may next determine if the reported event constitutes a sound recognition event reported by audio event detection module 434.
  • Speech or sound recognition by module 434 may utilize for example the techniques described in U.S. Patent 7,603,276, WIPO (PCT) patent application publication WO2002/054382A1, or the like. If the event is a sound recognition event, at step 512 it may be determined whether the decoded sound constitutes a voice command issued by a user. If so, processing may continue at step 540 to execute the desired command as described previously. If it is not a voice command, at step 514 it may be determined whether the reported event constitutes a trigger sound.
  • a trigger sound may be an instance of a non-vocal audio signal received via microphone 120 which either by pre-programming or via a learning process has been assigned a command or an anticipatory action.
  • trigger sounds may include a phone or doorbell ringing, a baby monitor, smoke alarm, microwave chime, etc. If the reported sound event is a trigger sound, the appropriate action, such as muting the television, etc., may be taken at step 516.
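The sound-event branch described above (voice command check at step 512, trigger sound check at step 514, trigger action at step 516) can be sketched as a small lookup. The sound labels, the assigned actions, and the function name are hypothetical; in practice the assignments would come from pre-programming or a learning process.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical labels as they might be produced by a sound recognizer.
VOICE_COMMANDS: Set[str] = {"pause", "mute", "channel up"}
TRIGGER_ACTIONS: Dict[str, str] = {
    "doorbell": "pause",
    "phone_ring": "mute",
    "smoke_alarm": "mute",
}


def handle_sound_event(label: str,
                       voice_commands: Set[str] = VOICE_COMMANDS,
                       trigger_actions: Dict[str, str] = TRIGGER_ACTIONS
                       ) -> Optional[Tuple[str, str]]:
    """Classify a recognized sound and return (handling, action), or None."""
    if label in voice_commands:            # step 512: voice command?
        return ("execute_command", label)  # continue at step 540
    action = trigger_actions.get(label)    # step 514: trigger sound?
    if action is not None:
        return ("trigger_action", action)  # step 516
    return None                            # unrecognized: no action here
```

Keeping the trigger assignments in a plain mapping makes it straightforward to add newly learned sounds without touching the decision logic.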
  • the system can also be programmed to recognize various sounds and/or spoken words/phrases as being indicative of a preparatory event whereupon one or more of the system components will be readied via an anticipatory action as described above.
  • a spoken phrase such as "let's see what else is on" may be recognized as a preparatory event whereupon an anticipatory action may be executed to place a remote control device into a state wherein the remote control device is readied to receive input in anticipation of its use. Similarly, a spoken phrase such as "come here," the sound of a doorbell, or the like may be recognized as a preparatory event whereupon an anticipatory action may be executed to ready the system to look for an anticipated event, e.g., a specific gesture, such as a user standing up, leaving the viewing area, etc., whereupon the appropriate response action to the sensed event that was anticipated, e.g., pausing the media, may be performed.
  • the system may execute, as needed, further actions, such as restorative action, to place the system into a state as desired, e.g., to return one or more components of the home entertainment system to a state where the component(s) is no longer looking for the occurrence of the anticipation event.
  • the event processing may next determine if the reported event comprises a wireless device such as for example a smart phone, tablet computer, game controller, etc., joining into or dropping from a LAN or PAN associated with smart TV 200 or other appliance in the system. If such devices have been previously registered with the TV programming, such activity may be used to infer the presence or absence of particular users. Such information may then be processed in a similar manner to user image detection (e.g., processed as a user being added or departing) continuing at step 518 as previously described.
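Inferring user presence from registered wireless devices can be sketched as a lookup keyed on a device identifier. The use of a MAC address as the identifier, the registration dictionary, and the event names are assumptions for this example; the source only states that previously registered devices joining or dropping from the network may be used to infer presence or absence.

```python
from typing import Dict, Optional, Tuple


def presence_event(device_id: str,
                   joined: bool,
                   registered: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Map a network join/drop of a registered device to a user event.

    `registered` maps a lower-case device identifier (assumed here to be
    a MAC address) to the user who registered it. Unregistered devices
    produce no user event.
    """
    user = registered.get(device_id.lower())
    if user is None:
        return None
    return ("user_entry" if joined else "user_exit", user)
```

The resulting event can then be handled in the same way as an arrival or departure detected from camera images.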
  • the event processor may process any other event report activity as consistent with a particular embodiment.
  • users may be provisioned with personal remote control devices which embed user identification data in their command transmissions, such as described in co-pending U.S. Patent Application 13/225,635 "Controlling Devices Used to Provide an Adaptive User Interface," of common ownership and incorporated by reference herein in its entirety, in which case user presence reporting events may be generated by remote control interface 420.
  • technologies such as infrared body heat sensing, as proposed in the art for use in automatic unattended power-off applications, may be further adapted for the purposes described herein.
  • Additional sources of activity events may also include data received from other household equipment such as security, lighting, or HVAC control systems equipped with occupancy sensors; entryway cameras; driveway sensors, etc., where appropriate.
  • Statistic gathering module 438 may be adapted to report the data conveyed to it during the event processing steps described above to a centralized service, e.g., hosted on Internet connected server device 128, for aggregation and analysis of user viewing habits and preferences. Depending on the particular embodiment such reporting may be performed on an event-by-event basis, or alternatively the data may be accumulated and reported at predetermined time intervals or only upon receipt of a request from the server device.
  • data reported to statistic gathering module 438 may be formatted into several different event record classes and types for uploading to server device 128.
  • Exemplary record classes may comprise user events, e.g., as may be reported at step 530 of Figure 5; appliance events, e.g., as may be reported at step 548 of Figure 5; and content events, e.g., as may be reported at step 544 of Figure 5.
  • different or additional event classes may be appropriate in alternate embodiments and accordingly the above classifications are presented by way of example only and without limitation.
  • user event record types may include addition (i.e., arrival) of a user to the viewing area and deletion (i.e., departure) of a user from the viewing area.
  • each of these record types may include timestamp data and a user ID.
  • the timestamp illustrated is suitable for use in applications where the server device is already aware of the geographical location of the reporting system, e.g., as a result of an initial setup procedure, by URL decoding, etc.
  • the timestamp field may include additional data such as a time zone, zip code, etc., where required.
  • User identity may be any item of data which serves to uniquely identify individual users to facilitate viewing habit and preference analysis at server 128.
  • user ID data may comprise identities explicitly assigned during a setup/configuration process; random numbers assigned by the system as each distinct user is initially detected; a hash value generated by a facial or voice recognition algorithm; a MAC address or serial number assigned to a smart phone or tablet computer; etc. as appropriate for a particular embodiment.
  • appliance event records may comprise record types indicative of events reported to, functional commands issued to, and/or operations performed by various controlled appliances, e.g., as reported at step 548 of Figure 5.
  • Such events may include without limitation appliance power on/off commands, playback control commands, etc., as necessary for a complete understanding of user viewing habits and preferences.
  • fast forward commands may also be logged in order to monitor commercial skipping activity.
  • each appliance event record type may also include a timestamp field as described above and an appliance type/ID field comprising an appliance type indicator (e.g., STB/DVR, DVD, Internet stream, etc.) together with a unique ID or subtype value which may be assigned by the event monitoring and/or statistics gathering module in order to distinguish between multiple appliances of the same type, e.g., a household with multiple DVD players, or with both cable and satellite STBs.
  • content event record types may include without limitation a channel or track change information record type; a title information record type containing, for example, a show title retrieved from program guide data, DVD or video-on-demand title information, etc.; a metadata record type containing metadata values obtained from a DVD or CD, streaming video service, or the like; a content sample record type containing a sample clip of audio and/or video content for comparison against a content identification database; or in alternate embodiments any other data which may be utilized in determining the identity of a particular content stream.
  • Each record type may comprise timestamp and source appliance fields as described above, together with a field containing identity data, which may comprise numeric, text, or binary data as necessary.
  • additional record and/or field types may be utilized in other embodiments, as necessary to enable reliable identification of media content streams.
  • Although Tables 1 through 3 are presented herein using a tabular format for ease of reference, in practice these may be implemented in various forms using any convenient data representation, for example a structured database, XML file, cloud-based service, etc., as appropriate for a particular embodiment.
  • Although the statistics gathering and recording functionality of the illustrative embodiment is implemented as part of the programming of an exemplary appliance 200, i.e., in software module 438, in other embodiments this functionality may be provisioned at a different location, for example in one of the other appliances forming part of an entertainment system, in a local PC, at a remote server or cable system headend, etc., or at any other convenient location to which the particular appliance programming may be capable of reporting user events.
  • image analysis and/or speech recognition may be performed by a smart TV device, a locally connected personal computer, a home security system, etc., or even "in the cloud", i.e., by an Internet based service, with the results reported to a user statistic gathering module and/or an event processing module resident in a connected AV receiver or STB.
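By way of illustration only and without limitation, the user, appliance, and content record classes described above, together with accumulated (as opposed to event-by-event) reporting, might be sketched in Python as follows. The class names, field names, record type strings, and the JSON encoding are all assumptions made for this sketch; the disclosure leaves the data representation open.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum


class RecordClass(Enum):
    """The three exemplary record classes named in the text."""
    USER = "user"
    APPLIANCE = "appliance"
    CONTENT = "content"


@dataclass
class EventRecord:
    record_class: RecordClass
    record_type: str         # hypothetical labels, e.g. "addition", "fast_forward"
    timestamp: str           # free-form text; a time zone may be appended where required
    user_id: str = ""        # populated for user events
    appliance: str = ""      # appliance type/ID, e.g. "STB/DVR:1"
    identity_data: str = ""  # populated for content events


class StatisticsBuffer:
    """Accumulates records so they can be reported as a batch."""

    def __init__(self):
        self._pending = []

    def log(self, record):
        self._pending.append(record)

    def flush(self):
        """Return all pending records as a JSON array and empty the buffer."""
        payload = json.dumps([
            {**asdict(r), "record_class": r.record_class.value}
            for r in self._pending
        ])
        self._pending.clear()
        return payload
```

In such a sketch, an embodiment reporting on an event-by-event basis would call `flush()` after every `log()`, while one reporting at predetermined intervals or on server request would call it from a timer or request handler.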

Abstract

Sensing interfaces associated with a home entertainment system are used to automate a system response to events which occur in a viewing area associated with the home entertainment system. Data derived from such sensing interfaces may also be used to enhance the response readiness of one or more system components. Still further, user presence data derived from such sensing interfaces may be used to capture and report user viewing habits and/or preferences.

Description

SYSTEM AND METHOD FOR USER MONITORING AND INTENT DETERMINATION
RELATED APPLICATION INFORMATION
This application claims the benefit of, and is a continuation-in-part of, U.S. Application No. 13/758,307, filed on February 4, 2013, the disclosure of which is incorporated herein by reference in its entirety.
BACKGROUND
Home entertainment systems comprised of plural appliances and/or the controlling devices used to issue commands to such appliances may be provisioned with devices for detecting user presence and/or user interaction via methods such as gesture, spoken voice, facial recognition, spatial analysis, etc., as known in the art. Furthermore, growing use of personal communication devices such as smart phones, tablet computers, etc., may provide additional means for identification of user presence via detection of such personal communication devices on a local wireless network such as a WiFi network, a Bluetooth network, etc. While multiple media sources and multiple media rendering devices may be coupled in many of these home entertainment systems through a central routing appliance such as an AV receiver, set top box, smart TV, etc., no systems or methods currently exist for using user presence and/or user interaction detection alone or in conjunction with a central routing appliance to provide enhanced home entertainment system functionalities.
SUMMARY OF THE INVENTION
This invention relates generally to home entertainment systems and control methods therefor and, in particular, to enhanced functionalities for such home entertainment systems which are enabled by the availability of additional user-related input methods for such systems. For example, in one aspect of the invention sensing interfaces such as an image sensing interface (e.g., an interface associated with a camera), a sound sensing interface (e.g., an interface associated with a microphone), and/or an interface for sensing the presence of an RF device such as a smart phone may be used to fully or partially automate a system response to common events which may occur during a TV viewing session, such as a user or users leaving or entering the viewing area, a user answering a telephone call, the detection of a doorbell ringing or a baby monitor alarm, etc. In another aspect of the invention, data derived from such sensing interfaces may be utilized to enhance the responsiveness of one or more system components, for example by sensing when a user is reaching for a physical remote control unit or preparing a component to issue a voice or gesture command. In a yet further aspect of the invention, user presence data derived from such sensing interfaces may be used by a central routing appliance in conjunction with media stream information in order to capture and report user viewing habits and/or preferences.
A better understanding of the objects, advantages, features, properties and relationships of the invention will be obtained from the following detailed description and accompanying drawings which set forth illustrative embodiments and which are indicative of the various ways in which the principles of the invention may be employed.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of the various aspects of the invention, reference may be had to preferred embodiments shown in the attached drawings in which:
Figure 1 illustrates an exemplary system in which the teachings of the subject invention may be utilized;
Figure 2 illustrates a further exemplary system in which the teachings of the subject invention may be utilized;
Figure 3 illustrates, in block diagram form, an exemplary hardware architecture for an appliance which may be a component part of the systems illustrated in Figures 1 and 2;
Figure 4 illustrates, in block diagram form, an exemplary software architecture for the illustrative appliance of Figure 3; and
Figure 5 illustrates, in flowchart form, the operation of an exemplary event processing module of the software architecture illustrated in Figure 4.
DETAILED DESCRIPTION
With reference to Figure 1, an exemplary home entertainment system in which the methods of the subject invention may be applied may comprise an AV receiver 100 which serves as a hub for directing a selected video and/or audio media stream from a source appliance such as, for example, a satellite or cable system set top box and/or DVR device ("STB") 104, a DVD player 106, a CD player 108, or a game console 110 to a destination appliance, such as a TV set 102, where the selected video and/or audio media stream is to be rendered. In one preferred embodiment, the connections 130, 132 between appliances 102 through 110 and AV receiver 100 may generally comprise connections for carrying HDMI-compliant digital signals, although it will be appreciated that other interface standards such as component video, PCM audio, etc., may be substituted where necessitated by the limitations of a particular appliance. In some embodiments, additional AV content streams may also be available from a streaming service 118 such as for example Netflix, Vudu, YouTube, NBC online, etc., via a wide area network such as the Internet 116, to which end AV receiver 100 may be provisioned with a connection 112 to an Internet gateway device such as for example router 114. As will be appreciated, the connection between AV receiver 100 and Internet gateway device 114 may be wired as illustrated, or may be wireless, e.g., a WiFi local area network, as appropriate. To support audio and/or video phone calling, conferencing, etc. and in accordance with certain teachings of the subject invention, an exemplary home entertainment system may also be provisioned with one or more sensing interfaces, such as interfaces associated with microphone 120 and camera 122, suitable for the capture of audible and/or visible events within the home entertainment system environment.
Users 124, 126 of the illustrative entertainment system may select the media stream currently being viewed by means of any convenient method such as through use of a remote control as known in the art, a voice command, a gesture, etc. In some embodiments, data regarding these selections including without limitation media source, channel, track, title, etc., together with viewing duration, user presence, etc. may be accumulated and reported to a database server 128 for the aggregation and analysis of user viewing habits and preferences as discussed in greater detail below.
Turning now to Figure 2, in a second illustrative embodiment, a "smart" TV device 200 may incorporate both content rendering and source stream selection functionality. In this configuration, local appliances 104 through 110 may be connected directly to multiple input ports of TV 200 via, for example, HDMI connections 130. As is characteristic of the smart TV genre, TV 200 may also support a connection 112 to a wide area network such as the Internet 116 over which streaming AV content and other data may be received. The means for user command input to TV 200 and appliances 104 through 110 may take the form of a controlling device 204, for example a conventional remote control or a smart phone app in communication with the appliances via any convenient infrared (IR), radio frequency (RF), hardwired, point-to-point, or networked protocol, as necessary to cause the respective target appliances to perform the desired operational functions. Additionally, in certain embodiments user input may also comprise spoken and/or gestured commands in place of or supplemental to controlling device signals, which sounds and gestures may be received by microphone 120 and camera 122, processed and decoded by one of the appliances, for example TV 200, and where necessary relayed to other target appliances via, for example, HDMI CEC commands, IR or RF signals, etc., as described for example in co-pending U.S. Patent Application 13/657,176 "System and Method for Optimized Appliance Control," of common ownership and incorporated by reference herein in its entirety.
For brevity, the discussions which follow will generally be with reference to the exemplary equipment configuration of Figure 2, it being understood however that in other embodiments, for example such as that illustrated in Figure 1, the steps of the methods presented herein may be performed, mutatis mutandis, by various appliances or combinations of appliances as appropriate for a particular equipment configuration.
As illustrated in Figure 3, an exemplary central routing appliance, such as smart TV appliance 200, may include, as needed for a particular application, rendering capabilities, e.g., TV engine and media processor 300 (it being appreciated that this may comprise one or more than one physical processor depending on the particular embodiment); memory 302 which may comprise any type of readable or read/write media such as RAM, ROM, FLASH, EEPROM, hard disk, optical disk, etc., or a combination thereof; a USB interface 304; digital AV input ports and interface 306, for example DVI or HDMI; analog AV input ports and interface 308, for example composite or component video with associated analog audio; an Ethernet and/or WiFi interface 310; a Bluetooth interface 328; a digital camera interface 312 with associated camera 122 which may be externally connected or built into the cabinet of TV 200; a microphone interface 314 with associated microphone 120 which may be externally connected or built into the cabinet of TV 200; a remote control interface 316 for receiving user-initiated operational commands via IR or RF signals 326; a display output 318 connected to TV screen 322; and an audio output 320 connected to internal or external loudspeakers 324.
To cause the smart TV appliance 200 to perform an action, appropriate programming instructions may be stored within the memory 302 (hereafter the "TV programming") for execution by TV engine and media processor(s) 300. An exemplary architecture for such TV programming is presented in Figure 4. As illustrated, the exemplary TV programming may include, as required for a particular application, an underlying operating system 402, such as for example LINUX, which may support a set of software modules implementing the various functionalities of the smart TV device. Such software modules may include a hardware abstraction layer 404 to provide a device independent interface between the various application software modules and the hardware dependent software modules such as video output driver 406, audio output driver 408, HDMI interface 410, analog input/output ADC/DAC 412, Ethernet and/or WiFi interface 414, Bluetooth interface 416, USB interface 418, remote control interface 420, and camera and microphone drivers 422 and 423. Exemplary application modules which reside above abstraction layer 404 may include as required for a particular embodiment transport and session layer protocol and interface management 428; AV output management 424; input/output processing and routing 440; a miscellaneous services module 426 to support closed captioning, display configuration, OSD, etc.; remote control command decoder 430; and device resource management 442. Additionally, in keeping with the teachings of this invention, the exemplary TV programming may include audio and visual event detector modules 432, 434, for example voice and/or image recognition engines or the like; user event processing 436; and a user statistics gathering and reporting module 438.
Under the control of such TV programming, smart TV appliance 200 may for example receive an incoming AV media stream from one of the input ports 306, 308 to be processed, buffered, separated into audio and video components, and routed to outputs 318, 320 for rendering on TV display screen 322 and loudspeaker(s) 324; may receive commands from remote control interface 316 which are decoded and acted upon, for example to select an input media stream, adjust audio volume, etc.; may manage a connection to the Internet through Ethernet or WiFi interface 310 to enable browsing for content, download of software updates, video telephony utilizing inputs from camera 122 and microphone 120; etc. Additionally, in accordance with the teachings herein, the exemplary TV programming may receive and process input signals from controlling device 204, camera 122 and/or microphone 120 in order to detect user presence, identify individual users, and/or receive user command input, as will be described hereafter. As will be appreciated, while in the illustrative embodiment the source of audio input signals may comprise a microphone 120 and associated interface 314 provisioned as part of a smart TV appliance 200, in alternative embodiments audio input signals may be captured by any other appliance in the system and forwarded to appliance 200 for processing, or may originate from a microphone provisioned in a controlling device such as a remote control or smartphone 204, the output of which microphone may, by way of example, be digitized and/or processed by controlling device 204 and wirelessly forwarded to smart TV appliance 200 via remote control interface 326, WiFi interface 310, Bluetooth interface 328, or any other means as appropriate for a particular implementation.
In an exemplary embodiment the user event processing module 436 of the TV programming of TV appliance 200 (hereafter "event processing") may act as illustrated in the flowchart of Figure 5 upon occurrence of a user-related event. First, at step 502 it may be determined if the event constitutes receipt of a remote control command, as may be reported by remote control command decoder 430. As will be appreciated, remote control commands may be received via any or all of RC interface 420 (e.g., infrared or RF4CE signals or the like), Ethernet/WiFi interface 414, or Bluetooth interface 416, depending on the particular embodiment and the particular controlling device currently in use. If it is determined that the event constitutes receipt of a remote control command, at step 540 the requested functional operation may be executed. By way of example without limitation, such operations may include adjustment of output audio volume to be performed by AV management module 424, selection of a new media input stream to be performed by I/O and routing module 440, etc. In some embodiments, received remote control commands may also comprise requests to direct the functional operation of other connected appliances, for example control of DVD player 106 or STB 104 via CEC commands to be issued by interface management module 428 over HDMI connections 130; or alternatively via a connected IR blaster or a LAN as described for example in co-pending U.S. Patent Application 13/657,176 "System and Method for Optimized Appliance Control," of common ownership and incorporated by reference herein in its entirety. Upon completion of the requested command function, at step 542 it may next be determined if the command function just performed comprised a change in the media stream being rendered by TV 200, for example selection of a new input port 306, 308; a change to the selected broadcast channel or DVR playback of STB 104; a change to an internet media source, etc. If so, at step 544 data regarding this event may be conveyed to user statistics module 438 for logging and ultimate reporting to database server 128. As will be appreciated, the data logged regarding the new content stream may comprise some or all of the command parameters themselves, e.g., an STB channel number and timestamp; metadata items obtainable from the content source device or channel such as a DVD title, streaming video URL, etc.; a sample of the audio or video content itself for analysis as is known in the art and described, for example, in U.S. Patents 7,986,913, 7,627,477 or 7,346,512; or any other data which may be suitable for identification purposes. If it is determined that the executed function did not comprise any change to the current media stream, at step 546 it may next be determined if the executed function comprised a reportable event, such as for example powering TV 200 (or any other connected device) on or off, issuing a fast forward command during DVR playback, etc. If so, this event may also be reported to user statistics module 438 for logging, after which processing of the remote control command is complete. In this regard, it will be appreciated that reportable appliance events such as powering an attached device on or off may also be separately initiated via for example direct communication to an appliance from its own remote control.
Accordingly, though not illustrated, where such events are detectable, for example via HDMI status, absence or presence of video or audio signals, etc., these events may also be reported and logged.
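By way of illustration only, the remote control command branch described above (execute the command, then log either a content event or an appliance event) might be sketched in Python as follows. The command names, the two command sets, and the `execute` and `log` callables are hypothetical stand-ins introduced for this sketch; which commands count as stream-changing or reportable would depend on the particular embodiment.

```python
# Hypothetical command names; an actual embodiment would define its own.
STREAM_CHANGING = {"select_input", "channel_change", "select_stream"}
REPORTABLE = {"power_on", "power_off", "fast_forward"}


def process_remote_command(command, params, execute, log):
    """Handle one decoded remote control command.

    execute(command, params) performs the requested operation.
    log(record_class, command, params) forwards data for statistics logging.
    """
    execute(command, params)                 # execute the requested function
    if command in STREAM_CHANGING:           # did the media stream change?
        log("content", command, params)      # log as a content event
    elif command in REPORTABLE:              # otherwise, is it reportable?
        log("appliance", command, params)    # log as an appliance event
    # Any other command (e.g. a volume change) is executed but not logged.
```

A command such as a volume adjustment would thus be executed without generating any statistics record, consistent with the flow described for steps 540 through 548.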
If the reported event is not a remote control command, at step 504 the event processing may next determine if the reported event constitutes an image change event reported by visual event detection module 432 in response to analysis of image data received from camera 122 via camera driver 422. Such image processing may utilize for example the techniques described in U.S. Patents 5,534,917, 6,829,384, 8,274,535, WIPO (PCT) patent application publication WO2010/057683A1, or the like, and may for example periodically monitor an image comprising the field of view of camera 122 in order to initiate image analysis in response to detection of any variation in image data which exceeds a certain threshold value. If it is determined that the event is a report of a detected image change, at step 518 it is next determined if the event comprises the departure or imminent departure of a user from the viewing environment of TV 200. If so, various actions may be taken by event processing 436 as appropriate. By way of example, if the departing user is the sole viewer (or, in some embodiments, the primary user, e.g., the user who initiated the current viewing session, the user that is provided within the system with a higher priority relative to remaining users, or the like) the event processing may be adapted to issue a "pause" command to the source of the current media stream. Other actions may include, without limitation, activating the recording function of a DVR, logging off a Web site, etc., as appropriate for a particular embodiment and configuration. If such an action is to be taken, at step 520 in the illustrative embodiment the event processing may cause display of a request for confirmation on TV screen 322, e.g., "Would you like to pause this show? (Y/N)."
If confirmed by the user at step 522, which confirmation may take the form of a gesture, spoken command, remote control input, etc., or, in those embodiments where the default is to take action, a timeout, at step 528 the indicated action may be executed. In this context it will be appreciated that in embodiments where voice or gesture responses are expected, the performance accuracy of audio and/or visual event detection modules 432,434 may be improved by indicating a range of possible responses (e.g., "yes" or "no" in this instance) to these modules in advance, thereby limiting the number of sound or gesture templates which need to be matched. Also, in some embodiments where voice input may be utilized, the sound level of TV audio output 320 may be temporarily lowered to reduce background noise.
Thereafter, at step 530 data regarding the change in user, including user identity where this is determinable, for example via use of techniques such as described in U.S. Patents 7,551,756, 7,702,599, or the like, may be conveyed to statistic gathering module 438 for logging, after which processing is complete.
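By way of illustration only, the departure handling described above might be sketched in Python as follows. The function and parameter names are hypothetical; the `ask` callable stands in for whatever prompt-and-response mechanism an embodiment uses, is given the restricted set of expected responses in advance, and is assumed to return `None` on timeout.

```python
def handle_departure(departing, viewers, primary, ask, act, log,
                     default_on_timeout=False):
    """Process a user departure and return the list of remaining viewers.

    ask(prompt, allowed) returns one of `allowed`, or None on timeout.
    act(action) issues the action, e.g. a pause command.
    log(record_type, user_id) forwards the change in users for logging.
    """
    remaining = [v for v in viewers if v != departing]
    # Offer to pause only if the sole viewer or the primary user is leaving.
    if not remaining or departing == primary:
        # Limiting the expected responses narrows template matching.
        answer = ask("Would you like to pause this show? (Y/N)", ("yes", "no"))
        if answer == "yes" or (answer is None and default_on_timeout):
            act("pause")
    log("deletion", departing)  # record the departure in either case
    return remaining
```

Setting `default_on_timeout=True` corresponds to those embodiments in which the default is to take action when no response is received.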
If the detected image change event is not a user departure, at step 524 the event processor may next determine if the reported event comprises the arrival of a new or additional user in the TV viewing environment and if so take appropriate action. By way of example, some embodiments may be adapted to allow a viewer to invoke a "private viewing" status which may cause the current content to be automatically muted, paused, switched, etc. in the event an additional user enters the viewing environment. In a still further embodiment, the entry of a user into a viewing environment may trigger an offer to resume playback of previously paused content; or in a multi-room, multi-device household in which appliances are networked together and equipped with viewer recognition, the entry of a user into one viewing environment may cause the event processor in that environment to query other event processors and/or statistic modules elsewhere in the household to determine if that user has recently departed another environment, and if so, offer to resume playback of a content stream which was previously paused in that other environment.
It is contemplated that the action(s) to be taken upon a user entering the viewing environment may be set to be specific to the identity of the arriving person, e.g., to be performed only when a specifically recognized individual or recognized type/category of individual, such as a child, enters the viewing environment. Similarly, the action(s) to be taken upon the arrival of a user to the viewing environment may be set to be specific to the identity and/or type of the currently viewing user or users. In such a case, an action to be executed may be one to inhibit the performance of any new actions by the currently viewing user or users, e.g., to inhibit a child from changing a channel that is currently being viewed to thereby allow the parent an opportunity to see what the child is currently watching. In the latter example, the inhibiting of any action(s) may be lifted after a given period of time expires, upon the new user again exiting the viewing area, upon the new user overriding the action (for example via a gesture, voice, further action, or the like - to the extent that user is authorized to perform such action), etc. In still further cases, the actions to be taken when a user enters the viewing area may be prioritized based upon the identities of the newly arriving user and the currently viewing user or users. Yet further, the actions to be taken by the system upon a detected "user entry event" can be used to provide one or more appliances within the system with desired states, e.g., to establish volume level settings, closed-captioning settings, commercial skipping settings, SAP setting, lighting level settings, and/or the like type of user preference settings without limitation. 
Thus, from the foregoing examples it will be appreciated that various combinations of actions may be specified for the system to take upon the detection of a "user entry event," which actions may or may not consider user identities and/or user identity priorities, and, as such, the examples provided herein are not intended to be limiting in any form.
By way of still further example, the actions to be taken by the system upon the detected arrival of a new user to the viewing area may cause the system to combine the favorite channel listings (such as shown in an electronic program guide) that have been established for the multiple viewers into a single listing or to provide a single listing that will include only those programs and/or channels that are commonly found within the favorite channel listings that have been established for each of the multiple viewers. Similarly, access to videos, games, programs, channels, or the like (such as set via use of a parental control chip) can be limited to only those videos, games, programs, channels, or the like that are commonly accessible to each of the multiple viewers. Yet further, to the extent preferences that have been established for the recognized users do not conflict, the system may take appropriate action(s) to establish within the system one or more of these preferences upon the detection of a "user entry event."
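By way of illustration only, the two ways of combining favorite channel listings described above (a single merged listing, or a listing of only those channels common to every viewer) might be sketched in Python as follows. The function name and the `common_only` flag are assumptions made for this sketch.

```python
def combine_favorites(listings, common_only=False):
    """Combine per-viewer favorite channel listings into one sorted listing.

    listings: one iterable of channel identifiers per viewer present.
    common_only: if True, keep only channels found in every viewer's listing;
                 otherwise merge all listings into a single listing.
    """
    sets = [set(listing) for listing in listings]
    if not sets:
        return []
    combined = set.intersection(*sets) if common_only else set.union(*sets)
    return sorted(combined)
```

The same intersection could serve for the access-limiting case described above, where content is restricted to what is commonly accessible to each of the multiple viewers.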
It is for the above-noted purposes that at step 526 it may be determined if any such "user entry event" action preferences have been set and, if so, at step 528 the appropriate action(s) may be executed, after which data regarding the user arrival, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete. It will also be appreciated that, while the foregoing describes various actions that may be taken upon a detection of a "user entry event," it is to be understood that like actions may be taken when the system is initially started in the presence of multiple viewers.
In the event that it is detected that a user responsible for a "user entry event" exits or is in the process of exiting from the viewing area, i.e., a "user exit event" is detected, the above described actions taken in connection with the "user entry event" can be automatically reversed (or the remaining user(s) can be prompted as to whether a reversing action is to take place). For example, video that was paused can be resumed, electronic program guides/favorite channel listings can be restored, etc. In a like manner, a "user exit event" indicative of one or more users exiting or preparing to exit the viewing area can be used by the system to take actions that would remove consideration of the exiting user's preference, to place the system into a state that is appropriate for the remaining users, etc. As before, the actions to be taken by the system upon detection of a "user exit event" can be set so as to be specific to the identity of the exiting user or users, the identity of the remaining user or users, and/or may be prioritized based upon the identities of the exiting user or users and the remaining user or users without limitation. It is again to these purposes that at step 518 it may be determined if any such "user exit event" action preferences have been set and, if so, at step 520 the appropriate action(s) may be executed, after which data regarding the user exiting, including user identity where this is determinable, may be conveyed to statistic gathering module 438 for logging, and event processing is complete.
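One way to picture the automatic reversal of entry actions is to snapshot whatever settings an entry action changes and to restore that snapshot when the same user exits. The following is only a sketch under that assumption; the class `EntryExitController`, its method names, and the setting keys are hypothetical.

```python
from typing import Any, Dict


class EntryExitController:
    """Applies entry actions and reverses them on the matching exit."""

    def __init__(self, state: Dict[str, Any]) -> None:
        self.state = dict(state)
        # user_id -> values the changed settings held before the entry action
        self._saved: Dict[str, Dict[str, Any]] = {}

    def on_user_entry(self, user_id: str, changes: Dict[str, Any]) -> None:
        # Remember prior values (None if the setting did not exist before).
        self._saved[user_id] = {key: self.state.get(key) for key in changes}
        self.state.update(changes)

    def on_user_exit(self, user_id: str) -> bool:
        """Restore the settings changed for this user; False if none were."""
        snapshot = self._saved.pop(user_id, None)
        if snapshot is None:
            return False
        self.state.update(snapshot)
        return True
```

A fuller version might prompt the remaining viewers before restoring, or handle overlapping entries by several users, neither of which is modeled here.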
If the detected image change event is not a user arrival or departure, at steps 532, 534 and 536 it may next be determined if the reported event comprises a detectable gesture, a detectable gesture in this context comprising a user action or motion which either by pre-programming or a learning process has been identified to visual event detection module 432 as having significance as user input. If the reported event is determined to be associated with an operational command function, for example "pause", "mute", "channel up", etc., processing may continue at step 540 to execute the command as described previously. If the reported event is determined to be a preparatory gesture, appropriate anticipatory action may be taken at step 538. In this context, a preparatory gesture may comprise without limitation any preliminary motion or gesture by a user which may be interpreted as a possible indication of that user's intent to perform an action, for example standing up, reaching for or setting down a remote control device, beckoning an out of sight person to enter the room, picking up a phone, etc. Anticipatory actions may include for example pre-conditioning visual and/or audio detection modules 432, 434 to favor certain subsets of templates for matching purposes; modifying an onscreen menu from a format optimized for gesture control to one optimized for navigation via a keyboarded device such as a remote control, or vice versa; reducing or muting audio volume; signaling a remote control device to exit a quiescent state, to turn on or shut off its backlight, or the like; etc.
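The distinction between command gestures and preparatory gestures can be illustrated with two lookup tables. This is a sketch only: the gesture labels and action names below are invented for illustration, and a real detector would define its own vocabulary by pre-programming or learning.

```python
from typing import Optional, Tuple

# Hypothetical gesture labels mapped to operational commands (step 540).
COMMAND_GESTURES = {
    "raised_palm": "pause",
    "finger_to_lips": "mute",
    "swipe_up": "channel up",
}

# Hypothetical preparatory gestures mapped to anticipatory actions (step 538).
PREPARATORY_GESTURES = {
    "standing_up": "reduce_volume",
    "reaching_for_remote": "wake_remote_control",
    "picking_up_phone": "mute_audio",
}


def classify_gesture(gesture: str) -> Tuple[str, Optional[str]]:
    """Return (category, action) for a detected gesture label."""
    if gesture in COMMAND_GESTURES:
        return ("command", COMMAND_GESTURES[gesture])
    if gesture in PREPARATORY_GESTURES:
        return ("anticipatory", PREPARATORY_GESTURES[gesture])
    return ("ignored", None)
```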
If the reported event is not an image change event, at step 506 the event processing may next determine if the reported event constitutes a sound recognition event reported by audio event detection module 434. Speech or sound recognition by module 434 may utilize for example the techniques described in U.S. Patent 7,603,276, WIPO (PCT) patent application publication WO2002/054382A1, or the like. If the event is a sound recognition event, at step 512 it may be determined if the decoded sound constitutes a voice command issued by a user. If so, processing may continue at step 540 to execute the desired command as described previously. If not a voice command, at step 514 it may be determined if the reported event constitutes a trigger sound. In this context a trigger sound may be an instance of a non-vocal audio signal received via microphone 120 which either by pre-programming or via a learning process has been assigned a command or an anticipatory action. By way of example without limitation, trigger sounds may include a phone or doorbell ringing, a baby monitor, smoke alarm, microwave chime, etc. If the reported sound event is a trigger sound, the appropriate action, such as muting the television, etc., may be taken at step 516. The system can also be programmed to recognize various sounds and/or spoken words/phrases as being indicative of a preparatory event whereupon one or more of the system components will be readied via an anticipatory action as described above.
By way of example, a spoken phrase such as "let's see what else is on" may be recognized as a preparatory event, whereupon an anticipatory action may be executed to place a remote control device into a state in which it is readied to receive input in anticipation of its use. Similarly, a spoken phrase such as "come here," the sound of a doorbell, or the like may be recognized as a preparatory event, whereupon an anticipatory action may be executed to ready the system to look for an anticipated event, e.g., a specific gesture such as a user standing up, leaving the viewing area, etc., whereupon the appropriate response action to the sensed event that was anticipated, e.g., pausing the media, may be performed. In the event that an anticipated event does not occur within a predetermined period of time, the system may execute, as needed, further actions, such as restorative action, to place the system into a desired state, e.g., to return one or more components of the home entertainment system to a state where the component(s) is no longer looking for the occurrence of the anticipated event.
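The arm, respond, or time-out behavior described above might be modeled as a small table of anticipated events with deadlines. The sketch below is an assumption-laden illustration: the class `AnticipationWindow`, the event and response names, and the use of caller-supplied clock values (`now`, in seconds) are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class AnticipationWindow:
    """Tracks anticipated events and drops them after a timeout."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        # anticipated event -> (deadline, response action)
        self._armed: Dict[str, Tuple[float, str]] = {}

    def arm(self, expected_event: str, response: str, now: float) -> None:
        """Start looking for expected_event after a preparatory event."""
        self._armed[expected_event] = (now + self.timeout_s, response)

    def expire(self, now: float) -> List[str]:
        """Stop looking for events whose deadline passed; return their names."""
        expired = [e for e, (deadline, _) in self._armed.items() if now >= deadline]
        for event in expired:
            del self._armed[event]
        return expired

    def on_event(self, event: str, now: float) -> Optional[str]:
        """Return the response action if event was anticipated and in time."""
        self.expire(now)
        entry = self._armed.pop(event, None)
        return entry[1] if entry else None
```

The list returned by `expire` is where a caller could hook in restorative actions for components that were readied but never used.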
If the reported event is not a sound recognition event, at step 508 the event processing may next determine if the reported event comprises a wireless device such as for example a smart phone, tablet computer, game controller, etc., joining into or dropping from a LAN or PAN associated with smart TV 200 or other appliance in the system. If such devices have been previously registered with the TV programming, such activity may be used to infer the presence or absence of particular users. Such information may then be processed in a similar manner to user image detection (e.g., processed as a user being added or departing) continuing at step 518 as previously described.
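Inferring presence from registered wireless devices amounts to a lookup from device address to user. The following sketch assumes devices are keyed by MAC address; the function name `presence_from_device`, the registry layout, and the event labels are illustrative only.

```python
from typing import Dict, Optional, Tuple


def presence_from_device(
    mac: str, joined: bool, registry: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Map a LAN/PAN join or drop to an inferred user entry or exit.

    registry maps lower-case MAC addresses to user identities. Devices
    that were never registered yield None and are ignored.
    """
    user = registry.get(mac.lower())
    if user is None:
        return None
    return ("user_entry" if joined else "user_exit", user)
```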
Finally, at step 510 the event processor may process any other event report activity as consistent with a particular embodiment. For example, in some embodiments users may be provisioned with personal remote control devices which embed user identification data in their command transmissions, such as described in co-pending U.S. Patent Application 13/225,635 "Controlling Devices Used to Provide an Adaptive User Interface," of common ownership and incorporated by reference herein in its entirety, in which case user presence reporting events may be generated by remote control interface 420. In other embodiments technologies such as infrared body heat sensing such as proposed in the art for use in automatic unattended power-off applications may be further adapted for the purposes described herein. Additional sources of activity events may also include data received from other household equipment such as security, lighting, or HVAC control systems equipped with occupancy sensors; entryway cameras; driveway sensors, etc., where appropriate.
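The order in which the event processor tests a reported event (image change first, then sound recognition, then wireless device activity, then everything else) can be summarized as a simple routing function. This is a sketch; the `kind` field and the handler labels are assumptions made for illustration.

```python
from typing import Any, Dict


def route_event(event: Dict[str, Any]) -> str:
    """Return the name of the handler that should process an event report."""
    kind = event.get("kind")
    if kind == "image_change":
        return "visual_handler"    # arrivals, departures, gestures
    if kind == "sound":
        return "audio_handler"     # checked at step 506
    if kind == "network":
        return "presence_handler"  # checked at step 508
    return "other_handler"         # step 510: remotes, occupancy sensors, etc.
```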
Statistic gathering module 438 may be adapted to report the data conveyed to it during the event processing steps described above to a centralized service, e.g., hosted on Internet connected server device 128, for aggregation and analysis of user viewing habits and preferences. Depending on the particular embodiment such reporting may be performed on an event-by-event basis, or alternatively the data may be accumulated and reported at predetermined time intervals or only upon receipt of a request from the server device. By way of example, in an illustrative embodiment data reported to statistic gathering module 438 may be formatted into several different event record classes and types for uploading to server device 128. Exemplary record classes may comprise user events, e.g., as may be reported at step 530 of Figure 5; appliance events, e.g., as may be reported at step 548 of Figure 5; and content events, e.g., as may be reported at step 544 of Figure 5. As will be appreciated, different or additional event classes may be appropriate in alternate embodiments and accordingly the above classifications are presented by way of example only and without limitation.
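The three reporting options mentioned above (event-by-event, at predetermined intervals, or only on server request) could be sketched as a buffer with a selectable flush policy. The class `StatisticsBuffer`, the mode names, and the `upload` callable are hypothetical; time is passed in explicitly as seconds so the example stays deterministic.

```python
from typing import Callable, List


class StatisticsBuffer:
    """Accumulates event records and uploads them per the chosen policy."""

    MODES = ("per_event", "interval", "on_request")

    def __init__(
        self,
        upload: Callable[[List[dict]], None],
        mode: str = "per_event",
        interval_s: float = 300.0,
    ) -> None:
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        self.upload = upload
        self.mode = mode
        self.interval_s = interval_s
        self._pending: List[dict] = []
        self._last_flush = 0.0

    def record(self, event: dict, now: float) -> None:
        self._pending.append(event)
        if self.mode == "per_event":
            self._flush(now)
        elif self.mode == "interval" and now - self._last_flush >= self.interval_s:
            self._flush(now)

    def on_server_request(self, now: float) -> None:
        """Upload whatever has accumulated when the server asks for it."""
        self._flush(now)

    def _flush(self, now: float) -> None:
        if self._pending:
            self.upload(self._pending)
            self._pending = []
        self._last_flush = now
```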
With reference to Table 1 below, user event record types may include addition (i.e., arrival) of a user to the viewing area and deletion (i.e., departure) of a user from the viewing area. As illustrated, each of these record types may include timestamp data and a user ID. The timestamp illustrated is suitable for use in applications where the server device is already aware of the geographical location of the reporting system, e.g., as a result of an initial setup procedure, by URL decoding, etc. In embodiments where this is not the case, the timestamp field may include additional data such as a time zone, zip code, etc., where required. User identity may be any item of data which serves to uniquely identify individual users to facilitate viewing habit and preference analysis at server 128. By way of example, user ID data may comprise identities explicitly assigned during a setup/configuration process; random numbers assigned by the system as each distinct user is initially detected; a hash value generated by a facial or voice recognition algorithm; a MAC address or serial number assigned to a smart phone or tablet computer; etc. as appropriate for a particular embodiment.
TABLE 1: User event record
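A user event record carrying a record type, a timestamp, and a user ID might be represented as follows. This sketch assumes the yyyy:mm:dd:hh:mm:ss timestamp layout used in the content event records; the record type labels "User:add" and "User:delete", the use of SHA-256, and the 16-character truncation are assumptions for illustration and are not taken from the tables.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UserEvent:
    event_type: str  # assumed labels: "User:add" or "User:delete"
    timestamp: str   # local time, yyyy:mm:dd:hh:mm:ss
    user_id: str     # any value that uniquely identifies a user


def format_timestamp(moment: datetime) -> str:
    """Render a datetime in the colon-separated layout."""
    return moment.strftime("%Y:%m:%d:%H:%M:%S")


def hashed_user_id(raw: bytes) -> str:
    """Derive a stable ID from recognition output or a device address."""
    return hashlib.sha256(raw).hexdigest()[:16]


def make_user_event(arrived: bool, moment: datetime, user_id: str) -> UserEvent:
    event_type = "User:add" if arrived else "User:delete"
    return UserEvent(event_type, format_timestamp(moment), user_id)
```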
Referring now to Table 2, appliance event records may comprise record types indicative of events reported to, functional commands issued to, and/or operations performed by various controlled appliances, e.g., as reported at step 548 of Figure 5. Such events may include without limitation appliance power on/off commands, playback control commands, etc., as necessary for a complete understanding of user viewing habits and preferences. By way of example, in addition to capturing power status for a DVR appliance, fast forward commands may also be logged in order to monitor commercial skipping activity. As illustrated, each appliance event record type may also include a timestamp field as described above and an appliance type/ID field comprising an appliance type indicator (e.g. STB/DVR, DVD, Internet stream, etc.) together with a unique ID or subtype value which may be assigned by the event monitoring and/or statistics gathering module in order to distinguish between multiple appliances of the same type, e.g., a household with multiple DVD players, or with both cable and satellite STBs.
TABLE 2: Appliance event record
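Assigning a unique ID per appliance so that several appliances of the same type can be told apart might look like the sketch below. It reads the "tt:xxxx" pattern seen in the content event records as a two-character type code plus a four-digit sequence number, which is an interpretation rather than something the text states; the class name and type codes are hypothetical.

```python
from typing import Dict, Tuple


class ApplianceRegistry:
    """Hands out stable type/ID strings such as 'DV:0002'."""

    def __init__(self) -> None:
        self._next: Dict[str, int] = {}            # type code -> last number used
        self._ids: Dict[Tuple[str, str], str] = {}  # (type, serial) -> assigned ID

    def register(self, type_code: str, serial: str) -> str:
        key = (type_code, serial)
        if key not in self._ids:
            number = self._next.get(type_code, 0) + 1
            self._next[type_code] = number
            self._ids[key] = f"{type_code}:{number:04d}"
        return self._ids[key]
```

Registering the same appliance twice returns the ID it was first given, so repeated power-on or playback events from one DVD player are logged against one identifier.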
Referring now to Table 3, content event record types may include without limitation a channel or track change information record type; a title information record type containing, for example, a show title retrieved from program guide data, DVD or video-on-demand title information, etc.; a metadata record type containing metadata values obtained from a DVD or CD, streaming video service, or the like; a content sample record type containing a sample clip of audio and/or video content for comparison against a content identification database; or in alternate embodiments any other data which may be utilized in determining the identity of a particular content stream. Each record type may comprise timestamp and source appliance fields as described above, together with a field containing identity data, which may comprise numeric, text, or binary data as necessary. As will be appreciated, additional record and/or field types may be utilized in other embodiments, as necessary to enable reliable identification of media content streams.
Event type          Time stamp           Source appliance  Identity data
Content:chan/track  yyyy:mm:dd:hh:mm:ss  tt:xxxx           xxxxx
Content:title       yyyy:mm:dd:hh:mm:ss  tt:xxxx           {text}
Content:metadata    yyyy:mm:dd:hh:mm:ss  tt:xxxx           {text}
Content:sample      yyyy:mm:dd:hh:mm:ss  tt:xxxx           {binary data}
Etc...
TABLE 3: Content event record
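A text-valued content event record with the four fields of Table 3 could be parsed as shown in this sketch. It assumes a single-space-delimited line serialization, which the tables do not specify, and it does not handle the binary payload of a Content:sample record; the names `ContentEvent` and `parse_content_record` are hypothetical.

```python
from typing import NamedTuple


class ContentEvent(NamedTuple):
    record_type: str  # e.g. "Content:title"
    timestamp: str    # yyyy:mm:dd:hh:mm:ss
    source: str       # source appliance type/ID, e.g. "01:0001"
    identity: str     # channel number, title text, or metadata text


def parse_content_record(line: str) -> ContentEvent:
    """Split a record line into its four fields.

    Identity data is the remainder of the line, so titles may contain spaces.
    """
    parts = line.split(" ", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    record_type, timestamp, source, identity = parts
    if not record_type.startswith("Content:"):
        raise ValueError(f"not a content event record: {record_type!r}")
    return ContentEvent(record_type, timestamp, source, identity)
```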
It will be appreciated that while the exemplary data structures of Tables 1 through 3 are presented herein using a tabular format for ease of reference, in practice these may be implemented in various forms using any convenient data representation, for example a structured database, XML file, cloud-based service, etc., as appropriate for a particular embodiment. Furthermore, it will also be appreciated that while the statistics gathering and recording functionality of the illustrative embodiment is implemented as part of the programming of an exemplary appliance 200, i.e., in software module 438, in other embodiments this functionality may be provisioned at a different location, for example in one of the other appliances forming part of an entertainment system, in a local PC, at a remote server or cable system headend, etc., or at any other convenient location to which the particular appliance programming may be capable of reporting user events.
While various concepts have been described in detail, it will be appreciated by those skilled in the art that various modifications and alternatives to those concepts could be developed in light of the overall teachings of the disclosure. For example, in alternate embodiments the steps of the methods described above may be advantageously performed in various other appliances as appropriate, e.g. an AV receiver or a cable/satellite STB. Further, in an interconnected system such as illustrated in Figures 1 or 2, especially where such interconnection is digital, it will be appreciated that the various steps of the methods may be performed by different appliances. For example without limitation, image analysis and/or speech recognition may be performed by a smart TV device, a locally connected personal computer, a home security system, etc., or even "in the cloud", i.e., by an Internet based service, with the results reported to a user statistic gathering module and/or an event processing module resident in a connected AV receiver or STB. Further, while described in the context of functional modules and illustrated using block diagram format, it is to be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or a software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an enabling understanding of the invention. Rather, the actual implementation of such modules would be well within the routine skill of an engineer, given the disclosure herein of the attributes, functionality, and inter-relationship of the various functional modules in the system. Therefore, a person skilled in the art, applying ordinary skill, will be able to practice the invention set forth in the claims without undue experimentation. It will be additionally appreciated that the particular concepts disclosed are meant to be illustrative only and not limiting as to the scope of the invention which is to be given the full breadth of the appended claims and any equivalents thereof.
All patents cited within this document are hereby incorporated by reference in their entirety.

Claims

What is claimed is:
1. A method for controlling at least one component device in a home entertainment system comprised of a plurality of component devices, comprising:
receiving event data via an image sensing interface;
determining from the received event data at least a number of and identity of viewers in a viewing area associated with the home entertainment system; and
when it is determined from the received event data that at least the number of and identity of viewers in the viewing area associated with the home entertainment system has changed causing a command action to be executed whereupon at least one of the plurality of component devices is caused to perform an operational function to thereby change a state of the at least one of the plurality of component devices.
2. The method as recited in claim 1, wherein the state of the at least one of the plurality of component devices comprises a state in which media is made available via use of at least one of a favorites list and an electronic program guide.
3. The method as recited in claim 2, comprising changing the state in which media is made available via use of at least one of a favorites list and an electronic program guide in response to the number of viewers in the viewing area increasing by using within the at least one of a favorites list and an electronic program guide only that media which is commonly found within a favorites list or an electronic program guide linked to an identity of each viewer in the viewing area.
4. The method as recited in claim 2, comprising changing the state in which media is made available via use of at least one of a favorites list and an electronic program guide in response to the number of viewers in the viewing area increasing by using within the at least one of a favorites list and an electronic program guide all media which is found within a favorites list or an electronic program guide linked to an identity of each viewer in the viewing area.
5. The method as recited in claim 2, comprising changing the state in which media is made available via use of at least one of a favorites list and an electronic program guide in response to the number of viewers in the viewing area decreasing by using within the at least one of a favorites list and an electronic program guide only that media which is found within a favorites list or an electronic program guide linked to an identity of at least one viewer remaining in the viewing area.
6. The method as recited in claim 1, wherein the state of the at least one of the plurality of component devices comprises a state associated with a preference linked to an identity of a new viewer entering the viewing area.
7. The method as recited in claim 6, wherein the preference comprises a closed-captioning preference.
8. The method as recited in claim 6, wherein the preference comprises a SAP preference.
9. The method as recited in claim 1, wherein the state of the at least one of the plurality of component devices comprises a state associated with a preference that is linked to an identity of a one of a plurality of viewers having a highest assigned viewer priority.
10. The method as recited in claim 1, wherein determining from the received event data at least a number of and identity of viewers in a viewing area associated with the home entertainment system comprises determining an identity category for at least one of the viewers in the viewing area.
11. The method as recited in claim 1, wherein the state of the at least one of the plurality of component devices comprises a state in which the one of the plurality of component devices is rendered non-responsive to commands provided thereto for at least a period of time.
12. The method as recited in claim 1, wherein the state of the at least one of the plurality of component devices comprises a state in which the one of the plurality of component devices is caused to render media associated with an identity of a new viewer entering the viewing area.
13. The method as recited in claim 1, wherein the at least one of the plurality of component devices comprises a media playback device and the executed command action comprises causing the media playback device to perform a pause operational function to inhibit a playing of media in the viewing area.
14. A method for controlling at least one component device in a home entertainment system comprised of a plurality of component devices, comprising:
receiving event data via an image sensing interface;
determining from the received event data at least a number of and identity of viewers in a viewing area associated with the home entertainment system; and
when it is determined from the received event data that at least the number of and identity of viewers in the viewing area associated with the home entertainment system has increased causing a command action to be executed whereupon at least one of the plurality of component devices is caused to perform an operational function to thereby change a state of the at least one of the plurality of component devices and wherein the state of the at least one of the plurality of component devices to be changed is determined considering an identity of a new viewer to the viewing area.
15. The method as recited in claim 14, wherein the state of the at least one of the plurality of component devices comprises a state in which media is made available via use of at least one of a favorites list and an electronic program guide.
16. The method as recited in claim 15, comprising changing the state in which media is made available via use of at least one of a favorites list and an electronic program guide by using within the at least one of a favorites list and an electronic program guide only that media which is commonly found within a favorites list or an electronic program guide linked to an identity of each viewer in the viewing area.
17. The method as recited in claim 15, comprising changing the state in which media is made available via use of at least one of a favorites list and an electronic program guide by using within the at least one of a favorites list and an electronic program guide all media which is found within a favorites list or an electronic program guide linked to an identity of each viewer in the viewing area.
18. The method as recited in claim 14, wherein the state of the at least one of the plurality of component devices comprises a closed-captioning state.
19. The method as recited in claim 14, wherein the state of the at least one of the plurality of component devices comprises a SAP state.
20. The method as recited in claim 14, wherein the state of the at least one of the plurality of component devices comprises a volume level state.
21. The method as recited in claim 14, comprising determining from the received event data that at least the number of and identity of viewers in the viewing area associated with the home entertainment system has subsequently decreased and causing a command action to be executed whereupon at least one of the plurality of component devices is caused to perform an operational function to thereby change a state of the at least one of the plurality of component devices and wherein the state of the at least one of the plurality of component devices to be changed is determined considering an identity of a viewer exiting the viewing area.
EP14817985.6A 2013-06-25 2014-06-17 System and method for user monitoring and intent determination Withdrawn EP3014874A4 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/925,966 US9137570B2 (en) 2013-02-04 2013-06-25 System and method for user monitoring and intent determination
PCT/US2014/042687 WO2014209674A1 (en) 2013-06-25 2014-06-17 System and method for user monitoring and intent determination

Publications (2)

Publication Number Publication Date
EP3014874A1 true EP3014874A1 (en) 2016-05-04
EP3014874A4 EP3014874A4 (en) 2016-06-08

Family

ID=52142560

Family Applications (1)

Application Number Title Priority Date Filing Date
EP14817985.6A Withdrawn EP3014874A4 (en) 2013-06-25 2014-06-17 System and method for user monitoring and intent determination

Country Status (3)

Country Link
EP (1) EP3014874A4 (en)
CN (1) CN105474652B (en)
WO (1) WO2014209674A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10057640B2 (en) * 2015-08-17 2018-08-21 Google Llc Media content migration based on user location
CN108681398A (en) * 2018-05-10 2018-10-19 北京光年无限科技有限公司 Visual interactive method and system based on visual human

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7185355B1 (en) * 1998-03-04 2007-02-27 United Video Properties, Inc. Program guide system with preference profiles
DK1327209T3 (en) * 2000-10-11 2008-12-08 United Video Properties Inc Systems and methods for providing data storage on servers in an on-demand media delivery system
CN103369391B (en) * 2007-11-21 2016-12-28 高通股份有限公司 The method and system of electronic equipment is controlled based on media preferences
US8949871B2 (en) * 2010-09-08 2015-02-03 Opentv, Inc. Smart media selection based on viewer user presence
US20130061258A1 (en) * 2011-09-02 2013-03-07 Sony Corporation Personalized television viewing mode adjustments responsive to facial recognition
US20130219417A1 (en) * 2012-02-16 2013-08-22 Comcast Cable Communications, Llc Automated Personalization
CN102970606B (en) * 2012-12-04 2017-11-17 深圳Tcl新技术有限公司 The TV programme suggesting method and device of identity-based identification

Also Published As

Publication number Publication date
EP3014874A4 (en) 2016-06-08
BR112015032568A2 (en) 2017-07-25
WO2014209674A1 (en) 2014-12-31
CN105474652B (en) 2019-07-05
CN105474652A (en) 2016-04-06

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20151224

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

A4 Supplementary search report drawn up and despatched

Effective date: 20160511

RIC1 Information provided on ipc code assigned before grant

Ipc: H04N 21/422 20110101ALI20160504BHEP

Ipc: H04N 21/482 20110101ALI20160504BHEP

Ipc: H04N 21/442 20110101ALI20160504BHEP

Ipc: H04N 7/16 20060101AFI20160504BHEP

Ipc: H04N 21/4223 20110101ALI20160504BHEP

DAX Request for extension of the european patent (deleted)
18D Application deemed to be withdrawn

Effective date: 20161210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN