CN117136352A - Techniques for communication between a hub device and multiple endpoints - Google Patents


Info

Publication number
CN117136352A
Authority
CN
China
Prior art keywords: accessory, hub, user, user device, devices
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280028296.0A
Other languages
Chinese (zh)
Inventor
J·S·格拉布
R·M·斯图尔特
G·桑切斯
A·简恩
Z·U·R·阿施拉弗
D·J·钱德勒
A·拜伦
A·比斯瓦斯
M·李
M·莎恩巴格
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Apple Inc
Priority claimed from US17/718,984 external-priority patent/US11914537B2/en
Priority claimed from US17/719,086 external-priority patent/US20220337691A1/en
Priority claimed from US17/718,977 external-priority patent/US20220335938A1/en
Application filed by Apple Inc filed Critical Apple Inc
Priority claimed from PCT/US2022/024531 external-priority patent/WO2022221360A1/en
Publication of CN117136352A publication Critical patent/CN117136352A/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Techniques for coordinating interactions between a user device and a plurality of accessory devices are disclosed. In one example, a user device receives information identifying one or more accessory devices in communication with the user device. The user device may implement an accessory interaction instance for each of the identified accessories. The first accessory interaction instance can be associated with a first accessory of the identified accessories and receive a first audio input from the first accessory corresponding to a user request. The first accessory interaction instance can process a portion of the received audio input and receive a first response from the server computer. The user device may then transmit the first response to the first accessory device.

Description

Techniques for communication between a hub device and multiple endpoints
Cross Reference to Related Applications
The present application claims the benefit of U.S. Non-Provisional Application No. 17/718,977, entitled "TECHNIQUES FOR COMMUNICATION BETWEEN HUB DEVICE AND MULTIPLE ENDPOINTS," filed on April 12, 2022, which claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/175,473, entitled "TECHNIQUES FOR COMMUNICATION BETWEEN HUB DEVICE AND MULTIPLE ENDPOINTS," filed on April 15, 2021. The contents of these applications are incorporated herein by reference.
The present application also claims the benefit of U.S. Non-Provisional Application No. 17/718,984, entitled "TECHNIQUES FOR LOAD BALANCING WITH A HUB DEVICE AND MULTIPLE ENDPOINTS," filed in April 2022, which claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/175,478, entitled "TECHNIQUES FOR LOAD BALANCING WITH A HUB DEVICE AND MULTIPLE ENDPOINTS," filed on April 15, 2021.
The present application also claims the benefit of U.S. Non-Provisional Application No. 17/719,086, entitled "TECHNIQUES FOR ESTABLISHING COMMUNICATIONS WITH THIRD-PARTY ACCESSORIES," filed in April 2022, which claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/175,480, entitled "TECHNIQUES FOR ESTABLISHING COMMUNICATIONS WITH THIRD-PARTY ACCESSORIES," filed in April 2021.
Background
Techniques exist that allow multiple user devices in a residential environment to communicate with one another. For example, a user may interact with a device that provides a digital assistant program. Via the digital assistant, the device may communicate with other devices, including controlling intelligent accessory devices such as light switches, speakers, and thermostats, to carry out requests from the user. However, controlling smart device functionality remains a challenge. The user may not have direct access to a user device with a digital assistant to provide the desired interaction. Accessory devices can have many different features and capabilities and may be produced by different manufacturers. In a residential environment, a user may want to interact with an accessory through voice commands in the same manner as with a device assistant on a user device.
Disclosure of Invention
Embodiments of the present disclosure may provide methods, systems, and computer readable media for providing management of interactions between accessory devices receiving audio user requests and user devices processing those requests. In some examples, a user device may be associated with one or more accessories and separate instances of a device assistant application may be implemented to manage user requests at the accessories.
According to one embodiment, a method may be performed by a computer system within a residential environment. The computer system may be a user device such as a smart phone, tablet, smart Television (TV) media streaming device, smart hub speaker, or the like. The user device may receive information identifying one or more accessories present within the residential environment. The user device may use this information to form an association with the identified accessory. In forming the association, the user device may implement an instance of one or more processes or other applications corresponding to the associated accessory. The instance may be a device assistant application or other process for analyzing human speech or other audio signals.
In some examples, the user device may receive audio input from one or more of the associated accessory devices. The audio input may be audio data transmitted from the accessory to the user device in a streaming manner. At least a portion of the audio input may correspond to an audio trigger or wake word. A first accessory interaction instance corresponding to an accessory transmitting audio input may receive audio and process it. In some implementations, processing may include transmitting some or all of the received audio input to a server computer or cloud service for robust linguistic analysis. The server computer may parse and analyze the audio to determine if the audio corresponds to an identified user within the residence, if the audio is a user request or command, in what language any spoken audio should be presented, what the appropriate response should be, and if the user or transmitting accessory device is authorized to make the identified request or receive the determined response. The first interaction instance may then receive the response and transmit it to the accessory device.
In other embodiments, the accessory-interaction instance can also perform another process or operation as part of the received response. This may include setting a timer, instructing another device to take some action (such as turning off a light), or invoking a music streaming service to transmit audio to the accessory device. The accessory instance can also delegate execution of the response to another device, including another accessory that may be better suited for the response.
In some implementations, the user device can receive a second audio input from a second accessory device. Because each associated accessory has its own instance of interaction at the user device, the second audio input can be processed simultaneously with the first audio input. At least a portion of the second audio input may correspond to a trigger or wake word. The second accessory interaction instance can process the portion of the second audio input to determine whether a wake word is present. The instance can also determine whether the second accessory is authorized to interact with the instance. If the wake word is present and the second accessory is authorized, the second interaction instance may process the remaining portion of the second input audio in a manner similar to the processing of the first audio.
To enable interaction between the accessory device and the user device, in some embodiments, the accessory devices may each include a software development kit ("SDK") within their memory. The software can be provided by an entity (e.g., manufacturer) associated with the user device such that the software can communicate with a particular user device regardless of the manufacturer of the accessory device. The accessory interaction instance can be configured to communicate with the SDK, which can include transmitting accessory settings from the accessory to the user device for management. The SDK may also provide additional features including wake word detection for audio input received at the accessory device.
Embodiments of the present disclosure may provide methods, systems, and computer readable media for providing load balancing management between an accessory device and a hub device. In some examples, the user device may act as a leader device to assign the accessory device to the hub device and provide information corresponding to the assignment.
According to one embodiment, a method may be performed by a computer system within a residential environment. The computer system may be a user device such as a smart phone, tablet, smart Television (TV) media streaming device, smart hub speaker, or the like. The user device may receive an allocation request from an accessory device within the residential environment. The user device may then select a hub device within the residential environment to connect to the accessory device. The selection may be based on a determination of which hub device is the best hub device to connect to the accessory.
In some embodiments, the user device may receive information from the accessory device identifying accessory characteristics corresponding to features or functions of the accessory. The user device may also receive information from one or more hub devices within the residential environment. The hub information may correspond to attributes of each hub, including the characteristics and capabilities of each hub. The hub information may also include a number of accessory connection slots available at the hub device. The user device may then score each hub device by comparing the accessory characteristics to the hub attributes to obtain a score corresponding to the suitability of the hub to connect with the accessory device. This score may then be multiplied by the number of available accessory slots at each hub to obtain a final connection score. The hub with the highest connection score may be assigned to the accessory.
In another embodiment, the hub device may update its available connection slots due to changes at the hub device, including an increased processing load at the hub device. The updated slot count may cause a currently assigned accessory to be discarded from the hub device. The discarded accessory may then request a new assignment from the leader device.
In another embodiment, a user device acting as a leader device may receive information that it is no longer suitable to act as the leader device. This information may be a determination made by the user device based on its own current attributes, including that the user device is experiencing a higher processing load and is no longer able to effectively manage the hub devices. The user device may transmit a request to the server device to select another user device to act as the leader device. The server device may then select a second user device as the leader device. The server device may then instruct the second user device to assume administrative control of the hub devices and the accessory devices. The first user device may then transmit the current hub information and accessory assignment information to the second user device.
In some embodiments, a user device acting as a leader device may obtain information about the current processing capabilities of a first hub device. The information may indicate that the first hub device is unable to respond to the accessories assigned to it within a threshold amount of time. The user device may then obtain hub information from the other hub devices and compare the hub information with the accessory characteristics of the accessories currently associated with the first hub device to obtain connection scores. The user device may then instruct the first hub device to discard an accessory device and instruct the other hub device with the highest connection score to connect to that accessory device.
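The latency-driven reassignment above can be sketched as follows; the threshold, field names, and scoring function are illustrative assumptions rather than the disclosed implementation:

```python
# Illustrative sketch of the reassignment flow above: when a hub's
# response time exceeds a threshold, the leader re-scores the other
# hubs for each accessory on the slow hub and migrates them all.
# Field names and the threshold value are assumptions for the example.

RESPONSE_THRESHOLD_MS = 200

def connection_score(accessory: dict, hub: dict) -> int:
    """Suitability (shared capability count) times free slots."""
    return len(accessory["features"] & hub["capabilities"]) * hub["free_slots"]

def rebalance(slow_hub: dict, other_hubs: list[dict]) -> dict:
    """Return a mapping of accessory id -> new hub name, emptying the slow hub."""
    if slow_hub["avg_response_ms"] <= RESPONSE_THRESHOLD_MS:
        return {}  # hub is responsive: nothing to do
    moves = {}
    for accessory in slow_hub["accessories"]:
        best = max(other_hubs, key=lambda h: connection_score(accessory, h))
        moves[accessory["id"]] = best["name"]
    slow_hub["accessories"] = []  # the slow hub discards its accessories
    return moves

slow = {"name": "hub-a", "avg_response_ms": 450,
        "accessories": [{"id": "lamp-1", "features": {"audio"}}]}
others = [
    {"name": "hub-b", "capabilities": {"audio"}, "free_slots": 3},
    {"name": "hub-c", "capabilities": {"video"}, "free_slots": 5},
]
moves = rebalance(slow, others)
```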
Embodiments of the present disclosure may provide methods, systems, and computer readable media for providing communication between an accessory device and a cellular-enabled device. In some examples, the controller device may receive a call request from the accessory device and select an appropriate cellular-enabled device to place the call. The cellular-enabled device may then establish an audio connection with the accessory device to relay call audio to and from the accessory device.
According to one embodiment, a method may be performed by a computer system within a residential environment. The computer system may be a user device such as a smart phone, tablet, smart Television (TV) media streaming device, smart hub speaker, or the like. The user device may receive a call request from an accessory device within the residential environment. The user device may then select a cellular-enabled device within the residential environment to initiate the call and connect to the accessory device. The selection may be based on a determination of which cellular-enabled devices are associated with the user making the call request.
In some embodiments, one or more accessory devices can be associated with a user device. The user device can implement one or more instances of a process or other application corresponding to the associated accessory. The instance may be a device assistant application or other process for analyzing human speech or other audio signals. The set of processes in each instance may correspond to a software ecosystem on the user device.
In another embodiment, the user device may enter a call listening state when the accessory device is engaged in a call with a cellular-enabled device. When in the call listening state, the instance corresponding to the accessory device may have voice processing capabilities limited to detecting only end words. In other embodiments, when the first accessory device is engaged in a call, interaction instances corresponding to other accessories associated with the user device may not be so limited. Those instances may receive and process user requests from the other accessories while the call is in progress, in accordance with the normal operation of the accessories and the user device.
Drawings
Fig. 1 is a simplified block diagram of an exemplary method according to some embodiments.
Fig. 2 is a schematic diagram of a residential environment including a user device and an accessory device, in accordance with some embodiments.
Fig. 3 is another simplified block diagram illustrating at least some methods of coordinating communication between a user device and an accessory device, in accordance with some embodiments.
Fig. 4 is a block diagram illustrating at least some techniques for communication between an accessory device and a user device.
FIG. 5 is a flowchart illustrating an exemplary process for an accessory device and a user device to detect a user request and take action in accordance with the user request, according to an embodiment.
FIG. 6 is a simplified block diagram illustrating an exemplary architecture of a system for detecting user requests and taking actions in accordance with the user requests, in accordance with some embodiments.
Fig. 7 is another simplified block diagram illustrating an example of an accessory device receiving and processing multiple communications from a user device, in accordance with some embodiments.
Fig. 8 is a flow chart illustrating a process by which an accessory device of a plurality of accessory devices determines which accessory device will respond to a user request, in accordance with some embodiments.
Fig. 9 is a flow chart illustrating a process by which a user device coordinates interactions with multiple accessory devices, in accordance with some embodiments.
Fig. 10 is a simplified block diagram of an exemplary method according to some embodiments.
Fig. 11 is a schematic diagram of a residential environment including a hub device and an accessory device, in accordance with some embodiments.
Fig. 12 is another simplified block diagram illustrating at least some methods of managing associations between accessory devices and hub devices, in accordance with some embodiments.
FIG. 13 is a simplified block diagram illustrating an exemplary architecture of a system for detecting user requests and taking actions in accordance with the user requests, in accordance with some embodiments.
Fig. 14 is a flowchart illustrating an exemplary process for assigning accessory devices to selected hub devices according to an embodiment.
Fig. 15 is another flow diagram illustrating an exemplary process for reassigning an accessory device from one hub device to another hub device in accordance with an embodiment.
Fig. 16 is another flow diagram illustrating an exemplary process for reassigning an accessory device from one hub device to another hub device in accordance with an embodiment.
Fig. 17 is another flow chart illustrating an exemplary process for transferring hub management from one user device to another user device according to an embodiment.
Fig. 18 is a flow chart illustrating a process by which a user device assigns an accessory device to a selected hub device, in accordance with some embodiments.
Fig. 19 is a simplified block diagram of an exemplary process according to some embodiments.
Fig. 20 is a schematic diagram of a residential environment including a user device and an accessory device, in accordance with some embodiments.
Fig. 21 is another simplified block diagram illustrating at least some methods of establishing communication between an accessory device and a cellular-enabled device, in accordance with some embodiments.
Fig. 22 is a simplified block diagram illustrating at least some techniques for communication between an accessory device and a cellular-enabled device.
Fig. 23 is another simplified block diagram illustrating an exemplary architecture of a system for establishing communication between an accessory device and a cellular-enabled device, according to an embodiment.
Fig. 24 is another flow diagram illustrating an exemplary process for requesting a phone call at an accessory device and placing the phone call at a cellular-enabled device, according to an embodiment.
Fig. 25 is another flow diagram illustrating an exemplary process for requesting termination of a telephone call at an accessory device and ending the call at a cellular-enabled device, according to an embodiment.
Fig. 26 is a flow chart illustrating a process by which a user device establishes a connection between an accessory device and a cellular-enabled device, in accordance with some embodiments.
Detailed Description
In the following description, various examples will be described. For purposes of explanation, numerous specific configurations and details are set forth in order to provide a thorough understanding of the examples. It will be apparent, however, to one skilled in the art that some examples may be practiced without these specific details. Furthermore, well-known features may be omitted or simplified in order not to obscure the examples described herein.
Embodiments of the present disclosure may provide techniques for coordinating interactions between a user device and a plurality of accessory devices. As a first example, consider a residential environment corresponding to a residence. A person in the home may want to know the current time. The person may query an accessory device (e.g., a nearby smart speaker) within the residential environment with a verbal request (e.g., "What time is it now?"). The accessory device can determine that the request is for the device and then transmit the received audio information to the user device (e.g., a hub speaker). The user device may process the audio information to determine the nature of the request and prepare a corresponding response (e.g., "It is now 10:30 a.m."). Alternatively, or in part in combination with the above, the user device may transmit some or all of the verbal request to a server computer (e.g., implementing a service provider), where the service provider may determine the nature of the request and/or prepare a corresponding response. The user device may then transmit the response back to the accessory device for playback to the user. In another example, the user may have a request (e.g., "turn off the light") that does not require a response. The user device may process the audio request to identify the other device or devices corresponding to the request and transmit instructions to the other device to perform the request (e.g., instruct the device that controls the light to turn it off). Similar to the alternatives mentioned above, the user device may transmit the no-response request to the service provider, which may transmit instructions to another device or return instructions for the other device to the user device. In the latter case, the user device then sends the server-generated instructions to the other device.
As an illustration of the above examples, a residential environment may include numerous "smart" devices, e.g., electronic devices having features that allow them to operate interactively and autonomously to some extent. The smart devices may have various functions, including cameras, speakers, thermostats, headphones and headsets, telephones, or media players. The smart devices may also have various network communication capabilities, including WiFi, Ethernet, Bluetooth, Zigbee, cellular, and the like. These devices may be produced by different manufacturers. In some cases, the smart devices may be classified as user devices and accessory devices. The user device may be a resident device of the residence (e.g., a smart speaker, a smart digital media player configured to control a television (TV), a mobile phone, etc.). While not always so, in some examples, resident devices are contemplated to reside within the home and to move infrequently (e.g., within the home or outside the home). The user device may have capabilities that equal or exceed those of the accessory device. For example, the user device may be a mobile phone that may include wireless (e.g., WiFi) and cellular communication capabilities, multimedia capabilities, and a device assistant. In this same example, the accessory device may be a smart speaker that may include audio media and wireless communication capabilities but lacks a device assistant. The device assistant may be a virtual assistant program configured to interact with the user. In these examples, a smart speaker may be a user device or an accessory device, depending on its capabilities. In some examples, if the accessory is manufactured by an entity other than the entity that manufactured the user device, the accessory may not be initially configured with the capability to communicate with the user device.
In some cases, a user device manufacturer may provide an accessory development kit ("ADK") for installation on an accessory that enables such communication after the accessory is manufactured, sold, supplied, or used.
In some embodiments, the user device may obtain information about accessory devices present in the residential environment. This information may be obtained by a user device in direct communication with an accessory device sharing the same network within the residential environment. In other embodiments, information about the accessory device may be sent to the user device by a second user device, a user device configured as a leader device, or a remote server device (e.g., a service provider). For example, a user in a home may add a new accessory device to the home environment. As part of this process, the user may interact with a second user device (e.g., a mobile phone) to configure the new accessory device and send the new accessory device information to the first user device. As another example, a leader device in a residential environment may have information about a plurality of accessory devices in the residential environment and report information about some or all of the accessory devices to a user device. The user device may then use this information to form an association with the corresponding accessory device. The accessory information may be stored by the user device.
The user device may be associated with a plurality of accessory devices by creating an accessory interaction instance for each accessory device. An interaction instance may be a software module or process configured to perform tasks at a user device. In some implementations, the interaction examples can each implement and/or communicate with a device assistant. For example, the user device may receive information regarding an accessory smart speaker and a smart thermostat located in a residential environment. The user device may create two interaction instances corresponding to the device assistant, one for each of the intelligent speaker and the intelligent thermostat. In some embodiments, the interaction instance may be a copy of the device assistant, while in other embodiments, the instance may be a collection of modules including the device assistant and other processes for performing tasks on the user device. The interaction instance may include different modules or processes depending on the associated accessory and its capabilities. It should be appreciated that any suitable combination of processes running on the user device may be included in the interaction instance corresponding to the accessory device.
Continuing with the first example above, the user may speak a request to the accessory. For example, a user may speak "Computer, what time is it now?" into a microphone of a nearby smart speaker (or thermostat, bulb, etc.). In this example, the request ("What time is it now?") may correspond to a first portion of the user's audio input. The start phrase ("computer") may correspond to a second portion of the user's audio input and may be a trigger or wake word. In some implementations, the smart speaker can perform voice recognition processing on the wake word. Based on this process, the smart speaker may determine whether the user's voice is intended to be a request or command that the speaker should respond to. Wake word processing at the accessory device may occur at a first level sufficient to identify that a command or request may be included within the user's audio input. If so identified, the smart speaker may then transmit the user audio to the user device running the accessory interaction instance corresponding to the smart speaker. In some embodiments, the accessory device can temporarily store a copy of the audio input for transmission to the user device after processing the wake word portion. In other implementations, in processing the wake word portion of the audio input, the accessory device can establish a streaming audio connection with the user device to relay the portion of the audio input subsequent to the user's wake word. In these embodiments, the accessory device can transmit a copy of the stored wake word portion of the audio input to the user device for additional processing.
Upon receiving the audio input from the smart speaker, the user device may perform additional processing on both the wake word portion of the audio input and the portion corresponding to the request or command. For example, the user device may perform natural language processing ("NLP") on the wake word. The wake word processing at the user device may occur at a second level sufficient to confirm the presence of the wake word with a higher degree of probability than the wake word processing performed at the accessory (e.g., at the first level). Based on the wake word processing, the user device may then process the portion of the audio corresponding to the request. If the user device determines that the wake word portion is not actually an exact wake word, it may ignore the remainder of the audio or terminate the audio stream from the accessory. In some embodiments, the voice processing module on the user device can be part of the accessory interaction instance. The interaction instance may also transmit all or a portion of the audio input to another device for analysis (e.g., to a service provider device). The service provider device may be a remote server computer or cloud device that can perform voice processing and parse the request to provide an appropriate response. In some cases, the user device performs the wake word processing while the remainder of the audio is processed remotely. Parsing the request includes determining the content and context of the user's spoken audio and providing a response on which the user device can take action. In the present example, the response will be an indication of the time, which may be determined and prepared by the user device using an appropriate process, by a remote server device, or by a combination of both devices.
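The two-level wake-word processing described above (a permissive first pass at the accessory, a stricter confirmation pass at the user device) can be sketched as follows. The string-similarity scorer and the threshold values merely stand in for real acoustic models and are assumptions for this example:

```python
# Hedged sketch of the two-level wake-word check described above: the
# accessory applies a cheap, permissive first pass before streaming,
# and the user device confirms with a stricter second pass before
# acting on the audio. A string-similarity ratio stands in for the
# accessory's and user device's real speech models.

from difflib import SequenceMatcher

WAKE_WORD = "computer"
FIRST_LEVEL_THRESHOLD = 0.6    # accessory: low bar, may over-trigger
SECOND_LEVEL_THRESHOLD = 0.95  # user device: high bar, confirms or rejects

def wake_word_score(snippet: str) -> float:
    """Toy stand-in for an acoustic wake-word model."""
    return SequenceMatcher(None, snippet.strip().lower(), WAKE_WORD).ratio()

def accessory_should_stream(snippet: str) -> bool:
    """First-level check at the accessory before streaming audio."""
    return wake_word_score(snippet) >= FIRST_LEVEL_THRESHOLD

def device_confirms(snippet: str) -> bool:
    """Second-level check at the user device; on failure the stream is dropped."""
    return wake_word_score(snippet) >= SECOND_LEVEL_THRESHOLD
```

A near miss such as "computor" passes the accessory's first-level check (so audio is streamed) but fails the user device's second-level check, which then terminates the stream.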
Once the response has been determined, the user equipment may execute the response. This may include preparing an audio response for transmission back to the accessory device for playback to the user. The preparation and execution of the response may occur in an interaction instance corresponding to the accessory. A response requiring a particular action may be delegated from the interaction instance to another process on the user device or to another device with which the user device may communicate, as appropriate. Any response audio may be generated by the text-to-speech process within the interaction instance and transmitted to the accessory. The accessory may then play the response. Some embodiments may provide various pre-generated audio responses corresponding to frequently encountered requests that do not require a unique response. These responses may be stored at the user device or the accessory device. For example, if the accessory device fails to connect to the user device and cannot transmit user audio, the accessory device may provide an audio response stored at the accessory indicating that the request cannot be processed.
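The stored-response fallback described above might look like the following sketch; the message text, keys, and function shape are assumptions for illustration:

```python
# Hypothetical sketch of the pre-generated response fallback described
# above: the accessory keeps a few canned responses and plays one
# locally when the user device is unreachable. Message text and the
# dictionary keys are assumptions, not the disclosed implementation.

CANNED_RESPONSES = {
    "no_connection": "Sorry, I can't process that right now.",
}

def respond(request_audio: str, user_device_reachable: bool) -> str:
    """Relay a user-device response, or fall back to a canned one."""
    if not user_device_reachable:
        return CANNED_RESPONSES["no_connection"]
    # Normally the accessory would stream request_audio to the user
    # device and play back whatever response it returns.
    return f"(response from user device for: {request_audio})"

offline_reply = respond("what time is it", user_device_reachable=False)
```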
Extending the example just described, consider another scenario in which a second user in the home also wants to make a request, this time to a second accessory, an intelligent thermostat. The request may be, for example, "Computer, raise the temperature by 3°F." As previously described, the thermostat may process a portion of the user's audio input to determine the presence of a wake word and, if detected, transmit the wake word and the request to the user device associated with the intelligent thermostat. Once received by the user device, instances of the voice processing module and the accessory interaction module, distinct from the interaction instance associated with the smart speaker of the previous example, may process the audio. In this way, the user device can process and execute requests from multiple accessories simultaneously. The accessory interaction instance corresponding to the smart thermostat may include a thermostat management module configured to manage the ambient environment (e.g., heating or air conditioning) of the residential environment. Upon processing the user request, the thermostat management module may execute the request and instruct the intelligent thermostat to increase its temperature setting by 3°F. In other embodiments, the thermostat management module may exist as a single instance on the user device. The accessory interaction instance corresponding to the intelligent thermostat may then delegate execution of its request to that single management module, which may be desirable in a system that contains multiple intelligent thermostat accessories but requires unified management of the ambient environment on the user device. In several embodiments, the architecture of the accessory interaction instances and other software modules on the user device may be configured in any suitable manner to increase the efficiency of request processing and execution by the interaction instances on the user device.
This may include various combinations of modules and processes related to the features and capabilities of the accessory device and the user device.
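The single-instance delegation pattern described above, in which several per-accessory interaction instances share one management module, can be sketched as follows. Class names, the default setpoint, and the method signatures are assumptions for illustration.

```python
# Sketch: multiple thermostat interaction instances delegate execution to
# one shared management module, giving unified control of the ambient
# environment. All names and the 70°F default are illustrative.

class ThermostatManagementModule:
    """Single shared instance managing heating/cooling for the home."""

    def __init__(self):
        self.setpoints = {}

    def execute(self, thermostat_id: str, delta_f: int) -> int:
        current = self.setpoints.get(thermostat_id, 70)  # assumed default
        self.setpoints[thermostat_id] = current + delta_f
        return self.setpoints[thermostat_id]


class AccessoryInteractionInstance:
    """Per-accessory instance that delegates execution to the shared module."""

    def __init__(self, accessory_id: str, manager: ThermostatManagementModule):
        self.accessory_id = accessory_id
        self.manager = manager  # the one shared management module

    def handle_request(self, delta_f: int) -> int:
        # Delegate rather than executing the setpoint change locally.
        return self.manager.execute(self.accessory_id, delta_f)
```

Because every interaction instance holds a reference to the same manager, a request such as "raise the temperature by 3°F" is applied through one coordinating process regardless of which thermostat accessory received it.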
Fig. 1 is a simplified block diagram 101 of an exemplary embodiment. Process 100 is an exemplary high-level process flow for a system including a user device 110 that may be associated with various accessory devices 111 to receive user requests from the accessories. Diagram 101 shows the state of the system corresponding to the blocks of process 100. Process 100 may be performed within a residential environment that includes a plurality of user devices 110 and accessories. As described herein, the user device 110 may be a hub speaker, and the accessory device 111 may be a smart thermostat 112, a camera 114, or a smart speaker 116. Although particular devices are described, the accessory devices 111 may be any of several types of smart devices in various combinations and numbers. Similarly, although a hub speaker is depicted as the user device 110 performing process 100, other suitable devices may perform one or more of the operations in process 100. For example, a smart phone, a media device (e.g., a smart TV), or a tablet (connected to a cellular network, to the residential network's local area network via WiFi, or to a wide area network ("WAN")) may perform one or more of the operations of process 100.
Turning in more detail to process 100, at block 102, user device 110 can create one or more accessory interaction instances corresponding to one or more associated accessory devices 111. Each accessory interaction instance represents one or more software modules or processes running on the user device 110 to enable the accessory device 111 to interact with the user device 110. As shown in fig. 1, accessory interaction instance 122 may correspond to intelligent thermostat 112, accessory interaction instance 124 may correspond to camera 114, and accessory interaction instance 126 may correspond to intelligent speaker 116.
At block 104, an accessory device, shown as smart speaker 116, receives audio input 120. In some implementations, the audio input 120 can include an audio portion corresponding to a user request or command (e.g., "what time is it?") and a portion corresponding to a wake word (e.g., "Computer"). The wake word need not be a single word, and may be any word or phrase that signals to the system that the user has spoken or will speak a request, command, or other audible interaction to the system. The audio input may also be other sounds not made by the user, including the sound of breaking glass or a baby crying. In these cases, the wake word may be a trigger sound corresponding to a portion of the audio input 120, as described herein. In other embodiments, the portions of the audio input 120 corresponding to the wake word and the user request may be received by the accessory 116 separated by a period of time. The period of time may be sufficient to allow the user to speak the wake word and receive a confirmation response from accessory 116 before speaking the user request. Upon receiving an input containing a wake word, accessory 116 can process that portion of audio input 120 at a first level to determine the presence of the wake word. The first level of processing may be performed in a time- and resource-efficient manner that determines when the wake word is likely to be present. For example, the accessory can perform voice pattern matching using a stored voice pattern corresponding to the user speaking the wake word. The stored patterns may be associated with users in the residential environment that contains the system, or may be generic patterns applicable to a large number of possible users. In this way, the accessory device 116 is neither burdened with complex speech detection procedures nor responsive to every extraneous audio input from the user or other sources in its vicinity.
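A minimal sketch of the first-level voice pattern matching follows. The use of fixed-length feature vectors, cosine similarity, and a 0.8 threshold are all assumptions; the disclosure only specifies matching against stored per-user or generic patterns.

```python
# Hypothetical first-level matcher: compares incoming audio features
# against stored wake word patterns (per-user plus generic fallbacks).
# Feature representation, metric, and threshold are illustrative.

def similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def first_level_match(features, stored_patterns, threshold=0.8):
    """Cheap check: the wake word 'may be present' if any stored
    pattern is sufficiently close to the received features."""
    return any(similarity(features, p) >= threshold for p in stored_patterns)
```

A check of this kind can run on modest accessory hardware, which is the point of pushing only the permissive first-level decision down to the accessory.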
Moving down to block 106, upon detecting the wake word, the accessory device 116 can transmit the received audio input 120 to the user device 110 where it is to be processed. As shown, the smart speaker 116 has a corresponding accessory interaction instance 126 on the user device 110 such that the accessory interaction instance 126 manages the processing of the audio input 120 received from the smart speaker 116. As described in more detail below with reference to fig. 4, the accessory interaction instance 126 may include a module configured to process the audio input 120. For example, the accessory interaction instance 126 can include a voice detection module that can analyze a portion of the audio input 120 corresponding to the wake word. The analysis may be performed at a second level where the presence of wake words may be confirmed with a higher degree of probability than wake word detection at the smart speaker 116. Further, in some embodiments, the voice detection module may determine a language of the user and perform wake word detection based on the determined language. If the wake word is not detected by the voice detection module of the accessory interaction instance 126, the user device 110 may ignore the audio input.
The accessory interaction instance 126 can also include a module configured to communicate with a remote service 130. The remote service 130 may be provided by a remote server associated with the residential environment of the user device 110 over a WAN or other network, and may be a cloud server in some embodiments. The remote services may include NLP or other voice analysis services. If the accessory interaction instance 126 does detect a wake word, the accessory interaction instance can process the portion of the audio input 120 corresponding to the user request by transmitting the portion to the remote service 130. The remote service 130 may analyze the request to determine the type of request, the appropriate response, and one or more devices to execute the response. In some embodiments, the remote service 130 may also determine the identity of the requesting user. The identity may be determined from user profile information accessed by the remote service 130. As part of the processing of the audio input 120 by the accessory interaction instance 126, user profile information may be stored at the user device and transmitted to the remote service 130. In some cases, the user profile information is stored on a remote device accessible by the remote service 130 or on a remote server providing the remote service 130. Once the requested portion has been analyzed by the remote service 130, the response may be transmitted back to the user device 110 for execution. Upon receiving the response, the accessory interaction instance 126 can then execute the response. Execution of the response may include delegating one or more elements of the response to other processes on the user device or to another device, including other user devices, accessory devices, or remote devices in the residential environment. Following the example shown in diagram 101, execution may include determining the current time and preparing an audio response to be played for the user.
The accessory interaction instance 126 can delegate the request to a process on the user device 110 to provide the accessory interaction instance 126 with the current time. The accessory interaction instance 126 can include a text-to-speech module that can convert the received current time information into an audio response.
Moving to block 108, the user device 110 can transmit the response to the accessory device 116. For responses that require an audio response 140 to the user, the accessory interaction instance 126 on the user device 110 can communicate with the smart speaker 116 and transmit the audio. As shown in diagram 101, the audio response 140 is the reply "10:30 PM," which corresponds to the current time requested by the user. Other responses may include an indication that the user request was performed by another device or that the request cannot be performed. In some implementations, the accessory device 116 may not have the ability to play an audio response (e.g., it has no speaker output) but may include a visual user interface (e.g., a screen) or other means (e.g., lights) to indicate the response to the user.
Fig. 2 illustrates a residential environment 200 containing user devices and accessory devices, according to some embodiments. The user devices may include a hub speaker 202, a media player 204, and a smart phone 206. These user devices may correspond to user device 110 of the embodiment described above with respect to fig. 1. The accessory devices may include smart speakers 212, 214, a smart watch 216, and a thermostat 224. Similarly, these accessory devices may correspond to accessory device 111 described with respect to fig. 1. All or some of these accessory devices may be third-party devices (e.g., not manufactured, programmed, or supplied by the manufacturer, programmer, or supplier of the user devices). Thus, they may not be automatically and/or initially compatible with the user devices. Each user device in the residential environment 200 may be associated with zero, one, or more accessory devices. As shown by the long dashed lines, hub speaker 202 is associated with smart speakers 212, 214 and smart watch 216, while media player 204 is associated with thermostat 224. The smart phone 206 is not associated with an accessory device. Devices within the residential environment 200 may be configured to communicate over one or more networks associated with the residential environment 200 using one or more network protocols. For example, the residential environment 200 may be associated with a local area network ("LAN"), WAN, cellular network, or other network, and the devices may communicate using a WiFi connection, Bluetooth connection, Thread connection, Zigbee connection, or other communication method.
The arrangement of the associations of accessory devices with user devices may include various combinations and may be modified by a user device. For example, the smart phone 206 can receive information regarding an accessory device to be associated with the smart phone. The accessory devices may include one or more of the accessory devices currently associated with other user devices in the home, and may include new accessory devices added to the home. The smart phone 206 would then create an accessory interaction instance for each accessory association. In some embodiments, a user device in the residential environment 200 can communicate with another user device to transfer one or more accessory devices associated with the first user device to the second user device. The transfer may occur automatically based on information received by the user device about the residential environment 200, including but not limited to information that another user device may be better suited to be associated with one or more accessories, or that accessories have been added to or removed from the residential environment 200. The suitability of any particular user device for association with an accessory may be based at least in part on the capabilities of the user device, the capabilities of the accessory device, the current processing load experienced by the user device, the location of the device within the residential environment, and the status of communications between devices on the network. Many other criteria for rearranging device associations in a residential environment are contemplated.
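The suitability criteria listed above could be combined into a score along the following lines. The weights, the particular inputs, and the tie-breaking behavior are assumptions for the sketch; the disclosure does not specify a scoring formula.

```python
# Illustrative suitability score for associating an accessory with a user
# device, combining capability, load, proximity, and reachability. All
# weights are assumptions.

def suitability(capabilities: int, load: float,
                same_room: bool, reachable: bool) -> float:
    """Higher is better; an unreachable device is never suitable."""
    if not reachable:
        return 0.0
    score = capabilities * 10.0
    score -= load * 5.0       # heavily loaded devices score lower
    if same_room:
        score += 20.0         # proximity to the accessory helps
    return score


def best_device(candidates):
    """candidates: list of (device_id, (capabilities, load, same_room,
    reachable)). Returns the id with the highest suitability score."""
    return max(candidates, key=lambda c: suitability(*c[1]))[0]
```

Under this sketch, a lightly loaded phone in the same room can outscore a more capable but busy hub, matching the media-player example later in this section.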
In some embodiments, accessory devices and non-resident user devices may also leave the residential environment or lose network connectivity with the residential environment. An accessory device leaving the residential environment may be disassociated by its previously associated user device, such that the user device removes the corresponding accessory interaction instance from its memory. Accessory devices associated with a user device that loses network connectivity to the residential environment may be reassigned to another user device that maintains network connectivity. Some embodiments may have a user device designated as a leader device to manage the allocation of accessory devices among the user devices within the residential environment. In other embodiments, if a user device and an accessory device are associated and leave the residential environment, losing network connectivity, the user device may maintain its association with the accessory device and perform the methods described herein.
Returning to fig. 2, as an example of the foregoing description of some embodiments, the hub speaker may communicate with smart phone 206 to transfer the association with smart watch 216. The user 230 wearing the smart watch 216 may pick up the smart phone 206 and move into a different room in the residential environment, making the smart phone 206 a more suitable user device to associate with the smart watch 216. User 230 may then leave residential environment 200 with smart phone 206 and smart watch 216. The smart phone 206 may maintain its association with the smart watch 216 as its accessory device even if the smart watch 216 loses network connectivity with other devices in the residential environment. Further, the user 230 may take an additional accessory device outside of the residential environment, and that accessory may also be associated with the smart phone 206 such that the smart phone 206 is associated with both accessory devices when outside of the residential environment 200. As another example, the media player 204 may begin processing and playing a media file and become unsuitable for association with the thermostat 224 due to the load. The media player 204 may transfer the association of its accessory thermostat 224 to the smart phone 206 or hub speaker 202.
Continuing with FIG. 2, the residential environment 200 can have a plurality of users 230, 234 that issue a plurality of audio requests 232, 236 for accessories. The requests 232, 236 correspond to the audio input 120 described above with reference to fig. 1. The requests 232, 236 may occur separately or simultaneously and may be received by multiple accessory devices, as indicated by the short dashed lines. For example, request 232 may be received by smart speaker 214 or smart watch 216, while request 236 may be received by smart speaker 212 and thermostat 224. As previously mentioned, the accessory device and its associated arrangement may take various forms and may change over time. Thus, the user request may be received by a plurality of accessory devices associated with different user devices. For example, the user request 236 is received by both the thermostat 224 associated with the media player 204 and the smart speaker 212 associated with the hub speaker 202.
In some embodiments, the accessory device may coordinate with other accessory devices within the residential environment 200 to determine which accessory device should respond to a user request received by more than one accessory device. As described in more detail below with reference to fig. 8, selecting an accessory device to respond to audio input may occur through an election process between accessory devices. The election may use a score based on criteria including, but not limited to, the strength (e.g., loudness) of the audio input received at the accessory device, the quality of the received audio input, the capabilities of the accessory device, and the capabilities of the user device associated with the accessory device. As an example, both the smart speaker 212 and the thermostat 224 receive the user request 236. Upon receiving the request, both accessories may process the portion of the user request corresponding to the wake word and transmit audio to their respective user devices, hub speaker 202 and media player 204. An accessory interaction instance on the user device associated with the respective accessory can process the received wake word and determine a score for the accessory. The user device may transmit the score back to the accessory before further processing the user request 236. The accessories may then open a communication channel between them and exchange their scores. Each accessory compares its score to the other scores to determine a winner. The winning accessory may report to its user device that it has won. Similarly, a losing accessory may report to its user device that it has lost and will not respond to the user request. In this example, because the user 234 is in the same room as the thermostat 224, the thermostat may hear the user request 236 more loudly and clearly than the smart speaker 212 in another room. Thus, the accessory election may result in the thermostat 224 being the winning accessory.
Conversely, the smart speaker 212 or the hub speaker 202 may have a capability more suitable for responding to user requests, such as in the case where the request requires an audio response and the thermostat 224 does not have audio output capability. In such a scenario, the score determined for the smart speaker 212 may be higher to reflect the greater capabilities and result in the smart speaker 212 winning the election. Those skilled in the art will recognize a number of potential scoring criteria for the accessory device.
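The election just described, including the capability-based tiebreak in favor of accessories that can actually play a response, can be sketched as follows. The scoring weights and argument names are assumptions; the disclosure only lists loudness, quality, and device capabilities as example criteria.

```python
# Sketch of the accessory election: each accessory that heard the request
# computes a score, the scores are exchanged, and the highest score wins.
# Weights are illustrative assumptions.

def election_score(loudness: float, quality: float, can_respond: bool) -> float:
    score = loudness + quality
    if can_respond:
        score += 10.0  # e.g., the accessory has audio output for the reply
    return score


def elect(accessories):
    """accessories: mapping of id -> (loudness, quality, can_respond).
    Returns (winning_id, all_scores)."""
    scores = {aid: election_score(*args) for aid, args in accessories.items()}
    winner = max(scores, key=scores.get)
    return winner, scores
```

With these weights, a thermostat that hears the request loudest still loses to a quieter smart speaker when the request needs an audio response the thermostat cannot produce.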
Fig. 3 illustrates an exemplary coordination process 300 for associating one or more user devices 302 with one or more accessories 304. In some embodiments, user device 302 may correspond to the user devices described herein (e.g., user device 110 of fig. 1, user devices 202, 204, and 206 of fig. 2, etc.). The configuration device 306, depicted here as a smart phone, may be a user device, hub device, leader device, or other device for configuring the device associations among the user devices 302. Configuration device 306 can be configured to communicate with user device 302 and accessory 304 through one or more networks (including LANs or WANs) as described herein. In some embodiments, the configuration device 306 is a remote server device configured to communicate with the user device 302 and the accessory 304 over a WAN (e.g., the internet).
The various elements of the coordination process 300 are now presented in more detail. Configuration device 306 can include an accessory configuration module 310. Accessory configuration module 310 can be a software process running on configuration device 306. The accessory configuration module 310 can be configured to store, update, receive, and transmit information related to the association of the accessories 304 with the user devices 302. The information may include accessory management settings 312 and accessory settings 314. The accessory management settings 312 may include information identifying which accessories 304 are assigned to any particular one of the user devices 302. The accessory configuration module 310 can also include accessory settings 314 that can provide, for each accessory, associated information about its assigned user device.
Each of the user devices 302 may include an accessory management module 320, which may be a software process running on the user device 302. In some implementations, the accessory management module 320 can receive, process, store, update, and transmit accessory management settings 322. The accessory management settings 322 correspond to the information in the accessory management settings 312 associated with that particular user device 302. For a particular user device, its accessory management settings 322 may include a list of all accessories assigned to that user device, as well as other information regarding the capabilities of those assigned accessories. The accessory management module 320 can also include accessory interaction instances 324, which can correspond to the accessory interaction instances 122, 124, and 126 of fig. 1. The accessory interaction instances 324 can be created by the user device 302 based on the accessory management settings 322 it receives from the configuration device 306. In this way, the configuration device 306 can update the association of the accessories 304 with the user devices 302 by sending updated accessory management settings 322 to each user device based on the updated accessory management settings 312. The accessory management module 320 on each user device can then create or remove accessory interaction instances 324 as needed so that it has accessory interaction instances 324 corresponding to its currently associated accessories.
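The create-or-remove reconciliation at the end of the paragraph above reduces to a set difference, as the following sketch shows. Representing instances and assignments as sets of accessory identifiers is an assumption for illustration.

```python
# Hypothetical reconciliation: when updated accessory management settings
# arrive from the configuration device, the management module creates or
# removes interaction instances so they match the assigned accessory list.

def reconcile(current_instances: set, assigned_accessories: set):
    """Return (updated_instances, to_create, to_remove)."""
    to_create = assigned_accessories - current_instances  # newly assigned
    to_remove = current_instances - assigned_accessories  # no longer assigned
    updated = (current_instances - to_remove) | to_create
    return updated, to_create, to_remove
```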
Each accessory may also store accessory settings 334. Accessory settings 334 can include information identifying the user device currently associated with the accessory. The accessory settings 334 can also include information related to the processing of trigger or wake words at the accessory. For example, a user may change the wake word they want to use to activate (e.g., wake) a device, which may differ from a generic wake word and may be identified with that particular user. The configuration device 306 can transmit corresponding information to the accessory device 304 regarding the customized wake word (e.g., the audio pattern of the wake word for comparison with received audio input). The accessory settings 334 can be configured to include information regarding a plurality of wake word or trigger configurations, corresponding to different wake words or triggers that can be detected at the accessory device 304.
Completing the detailed elements of fig. 3, the progress indicators 330, 340 represent the data transfer between the configuration device 306 and the user device 302 and between the user device 302 and the accessory 304, respectively. The progress indicators 330, 340 may indicate communication between various devices as described herein over one or more networks, including but not limited to a WiFi LAN or an internet WAN. The progress indicator 330 indicates the transmission of data within the accessory management settings 312 and the accessory settings 314 to the user device 302. Correspondingly, the user device 302 can transmit data within its accessory management settings 322 or other data to the configuration device 306. The data may identify that one or more accessories are no longer in communication with their assigned user device. Similarly, the progress indicator 340 indicates the transfer of data including the accessory settings 334 between the user device 302 and the accessory 304. In some embodiments, the configuration device 306 can also communicate directly with the accessory to transmit the accessory settings 314 and receive information in return.
As a specific example of the foregoing description of several embodiments, consider the scenario of introducing a new accessory device into a residential environment. The residential environment may correspond to the residential environment 200 of fig. 2. Initially, none of the devices in the residential environment have any information related to the new accessory and none of the user devices are associated with the new accessory. Configuration device 306 may be a user's smart phone and may obtain information about the new accessory. This information may be obtained through user input, for example, through an application running on a smart phone, to configure and provision devices within the residential environment. The information may also include an identification of the new accessory from the list of identified accessories that may be added to the residential environment or obtained by the smart phone communicating with the new accessory through one of the networks in the residential environment. The smart phone may then communicate with the user device 302 within the residential environment to receive information about the user device 302 and their associated current accessories 304. Once the smartphone has the appropriate information about the new accessory and the existing devices, it can update the accessory management settings 312 and accessory settings 314 in its accessory configuration module 310 and then assign the new accessory to one of the user devices 302. The allocation may include transmitting data corresponding to the accessory management settings 322 to the selected user device and transmitting data corresponding to the accessory settings 334 to the new accessory.
In a corresponding specific example, the accessory can be removed from the residential environment. This may occur if the accessory is a non-resident device and is able to leave the residential environment (e.g., a smart watch), or if the accessory has lost network communication with its assigned user device. In this case, the user device may transmit updated accessory management settings 322 to the configuration device 306, which in turn may update its accessory management settings 312 and accessory settings 314 and transmit the updated information back to the user device. Updating the accessory management settings 312 may include reconfiguring the arrangement of accessories 304 that are still present in the residential environment with the user device 302. Similar updates may occur if the user device is removed (e.g., if the user device is a non-resident device like a smart phone or network connectivity has been lost). Accessories previously associated with the removed user device may communicate with the configuration device 306 and report updated accessory settings 334, including inability to communicate with the assigned user device. Configuration device 306 can then select a new user device for association with the accessory and transmit updated accessory management settings 322 and accessory settings 334 to the respective devices.
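The reassignment flow in the two examples above might look like the following sketch. The data layout (a device-to-accessory mapping plus per-accessory settings) and the naive pick-the-first-remaining-device selection are assumptions; a real implementation would apply the suitability criteria discussed earlier.

```python
# Illustrative sketch: the configuration device reassigning accessories
# whose user device was removed or lost connectivity. Data shapes and
# the selection rule are assumptions.

def reassign(accessory_settings, management_settings, lost_device):
    """Move every accessory assigned to lost_device to another device,
    updating both the management settings and per-accessory settings."""
    remaining = [d for d in management_settings if d != lost_device]
    orphaned = management_settings.pop(lost_device, [])
    for accessory in orphaned:
        new_device = remaining[0]  # naive choice; see suitability criteria
        management_settings[new_device].append(accessory)
        accessory_settings[accessory]["assigned_user_device"] = new_device
    return management_settings
```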
It should be appreciated that the various scenarios described with reference to process 300 are representative. One or more aspects of each scenario may change and still perform embodiments as described herein, including but not limited to, the number and type of accessories, the number and type of user devices, the type of network through which the configuration device communicates with other devices, and the manner in which the configuration device obtains information about the new accessory device.
Fig. 4 is a block diagram 400 illustrating at least some techniques for communicating between an accessory device 401 and a user device 402 to process audio input to create a response. Diagram 400 includes some detailed architecture of a representative device and process flow arrows that provide a general indication of transfer of data or information. Process flow arrows are not intended to imply any particular architectural connection between the elements detailed herein. Each of the elements depicted in fig. 4 may be similar to one or more elements depicted in other figures described herein. For example, accessory device 401 can correspond to one or more of the accessories and accessory devices described herein, and so forth. In some embodiments, at least some of the elements in diagram 400 may operate within the context of a residential environment like residential environment 200 of fig. 2.
Turning to each element in more detail, accessory device 401 may have audio input and output functions, including accessory microphone input 404 and accessory speaker output 406. Accessory microphone input 404 may include both hardware and software/firmware necessary to provide audio input functionality. Similarly, accessory speaker output 406 can include both hardware and software to provide its functionality. The accessory device 401 also has an accessory audio module 412 that can interface with one or both of the accessory microphone input 404 and the accessory speaker output 406, as well as receive audio from other devices. The accessory device 401 also includes an accessory development kit ("ADK") 408. The ADK may be an SDK stored on the accessory device 401 and configured to be executed or processed on the accessory device. As used herein, an SDK may include an application programming interface and associated software libraries sufficient to enable operation of other software within or associated with the SDK. In some implementations, the ADK may be provided by an entity (e.g., manufacturer) associated with the user device 402. The ADK 408 may include a wake word detection module 410 that performs a first process on a portion of the audio input corresponding to a trigger or wake word. The wake word detection module 410 itself may contain information about the wake word and trigger, including, for example, trigger criteria and audio patterns corresponding to the particular wake word. As indicated by the process flow arrow, accessory device 401 may receive audio input at its microphone 404 and then process a portion of the audio in wake word detection module 410. The process may be performed at a first level to determine the presence of wake words. The first level of processing may be performed in a time and resource efficient manner that determines when wake words may exist. 
If a wake-up word is detected, the received audio input may be transmitted to the user device 402. In some embodiments, the transmission involves establishing a streaming audio connection from accessory device 401 to user device 402.
The user device 402 can include a management module 414, which can correspond to the accessory management module 320 of fig. 3. The management module 414 can provide one or more audio input repeaters 416, each associated with an accessory device. In this way, the user device 402 can manage multiple inbound audio inputs received from multiple accessory devices, including multiple simultaneous streaming audio connections. The audio input repeater 416 may send the received audio input to the speech processing module 420, where it is received by an accessory input plug-in 422. In some embodiments, the voice processing module 420 does not run as a separate instance for each accessory device associated with the user device. In these cases, the voice processing module may have a separate per-accessory instance of the accessory input plug-in and of the wake word detection module 424.
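The per-accessory repeater arrangement above can be sketched as follows. Class names, the chunk-based interface, and routing everything into one shared sink are assumptions made for the sketch.

```python
# Sketch of the audio relay: one repeater per accessory routes inbound
# audio chunks toward the speech processing module, so the user device
# can service several simultaneous streams. All names are illustrative.

class AudioInputRepeater:
    def __init__(self, accessory_id: str, sink: list):
        self.accessory_id = accessory_id
        self.sink = sink  # stand-in for the accessory input plug-in

    def relay(self, chunk: bytes):
        # Tag each chunk with its source so downstream per-accessory
        # processing can tell the streams apart.
        self.sink.append((self.accessory_id, chunk))


class ManagementModule:
    def __init__(self):
        self.received = []
        self.repeaters = {}

    def repeater_for(self, accessory_id: str) -> AudioInputRepeater:
        """Create a repeater per accessory on first use, reuse thereafter."""
        if accessory_id not in self.repeaters:
            self.repeaters[accessory_id] = AudioInputRepeater(
                accessory_id, self.received)
        return self.repeaters[accessory_id]
```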
As described above with respect to fig. 1, the user device may process audio received from the accessory device. This may include processing a portion of the received audio to detect the presence of a wake word. The speech processing module 420 may include multiple instances of a wake word detection module 424 and, in some embodiments, a wake word detection module 426 configured to process audio input received directly by the user device. The wake word detection module 424 can process wake word audio at a second level, where the presence of the wake word can be confirmed with a higher degree of probability than by the wake word detection module 410 at the accessory device 401. If the wake word is not detected by the speech processing module 420, the user device 402 may ignore the audio input. If the wake word is detected, the audio input may be further processed at accessory interaction instance 430.
The accessory interaction instance 430 can include a digital virtual assistant and other processes such as a server interaction module 432, a delegation process 434, and a text-to-speech module 436. To process audio from a user request or other audio input passed through the wake word detection module 424, the accessory interaction instance 430 can connect to the remote service 450 and transmit a portion of the audio input to the remote service 450. Each accessory interaction instance associated with an accessory may be separately connected to the remote service 450 through the server interaction module 432 corresponding to that accessory interaction instance. Remote service 450 may be hosted on a remote server or other remote device and may be reached through one of the networks to which the user device may be connected (e.g., the Internet WAN). Remote service 450 may include NLP and other services for processing audio input. In addition, the remote service may also include information related to user profiles, user languages, and user authorizations regarding voice interactions between users and the accessories and user devices within the residential environment. In some embodiments, the user information or some component of the user profile can be stored locally on the user device and can be accessed by the accessory interaction instance 430 via one of the local services 452. In this way, the remote service may identify the user making the user request, process the audio input in a language corresponding to the identified user, and determine whether the user is authorized to have the request performed by one or more devices. Because each accessory interaction instance 430 can interact with remote service 450 separately and individually, multiple user requests can be processed and executed by a single user device 402 simultaneously or nearly simultaneously.
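The one-instance-per-accessory arrangement can be sketched as follows. `RemoteService` is a stand-in for the NLP back end, and every name here is illustrative rather than drawn from a real API.

```python
# Each accessory gets its own interaction instance with its own server
# interaction module, so requests from different accessories reach the
# remote service independently of one another.

class RemoteService:
    def process(self, audio):
        return "response:" + audio  # stand-in for NLP processing

class ServerInteractionModule:
    def __init__(self, service):
        self.service = service

    def send(self, audio):
        return self.service.process(audio)

class AccessoryInteractionInstance:
    def __init__(self, accessory_id, service):
        self.accessory_id = accessory_id
        self.server = ServerInteractionModule(service)  # one per instance

    def handle(self, audio):
        return (self.accessory_id, self.server.send(audio))

service = RemoteService()
instances = {aid: AccessoryInteractionInstance(aid, service)
             for aid in ("lamp", "clock")}
results = [instances["lamp"].handle("set a timer"),
           instances["clock"].handle("play music")]
```

Because each instance holds its own server interaction module, two accessories' requests never contend for a shared connection object.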
Once the request has been processed, accessory interaction instance 430 can receive a response from remote service 450 for execution. The response may contain data corresponding to an audio message to be played at accessory device 401 for the requesting user. The audio response may be generated by text-to-speech module 436. In some implementations, the received response can require the accessory interaction instance 430 to communicate with the local services 452 to receive additional information to complete the request. For example, if a request is made to query the current time, accessory interaction instance 430 can obtain the current time from a clock process or other service located elsewhere on user device 402. In other embodiments, the response may require execution by another device, or an action that does not require immediate further output by the user device 402 or the accessory device 401. In those cases, the accessory interaction instance 430 can delegate the response to the local services 452 via the delegation process 434. The local services 452 may include a communication process for transmitting response instructions to other user devices for execution on those user devices or the accessories associated with them. The local services 452 may also include a music service. A user request may result in a response that plays music or other audio content at the accessory. The music service may then communicate with the media module 470 to transmit music or other audio to the accessory device 401, as described below. In still other embodiments, the response may include a delay such that delegation to the local services 452 is temporary. For example, the request may be to set a timer, which may be delegated to a local timer process.
Having delegated the response, the accessory interaction instance is then free to process additional requests from its associated accessory until the delegation process 434 receives an indication that the timer process has completed, at which point the accessory interaction instance completes execution of the timer response by sending an appropriate indicator to the accessory.
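A minimal sketch of this timer delegation, under the assumption that the local timer process exposes a completion callback: the response is handed off, freeing the interaction instance immediately, and the callback later triggers the indicator sent to the accessory. `fire_all` stands in for elapsed time, and all names are illustrative.

```python
# Delegation of a delayed response (a timer) to a local service, with a
# completion callback that finishes execution of the response.

class LocalTimerService:
    def __init__(self):
        self._pending = []

    def set_timer(self, seconds, on_done):
        self._pending.append((seconds, on_done))

    def fire_all(self):  # test hook standing in for the timer expiring
        for _, on_done in self._pending:
            on_done()
        self._pending.clear()

class InteractionInstance:
    def __init__(self, timer_service, send_to_accessory):
        self.timers = timer_service
        self.send = send_to_accessory

    def handle_timer_request(self, seconds):
        # Delegate and return immediately; the instance is free for new requests.
        self.timers.set_timer(seconds, lambda: self.send("timer-chime"))

sent = []
timers = LocalTimerService()
instance = InteractionInstance(timers, sent.append)
instance.handle_timer_request(600)
before_fire = list(sent)   # nothing sent while the timer is still pending
timers.fire_all()          # timer completes; indicator goes to the accessory
```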
Continuing with the detailed elements of the user device 402, an audio response can be sent to the accessory device 401 via the media module 470, which includes the accessory audio repeater 472. The accessory audio repeater can negotiate an audio connection with accessory device 401. The audio connection may be through one of the networks to which the user device 402 and the accessory device 401 are connected, and may use any number of methods or protocols including, but not limited to, AirPlay, real-time transport protocol ("RTP"), real-time streaming protocol ("RTSP"), etc. In some implementations, the response is a phrase or sentence that is converted to audible speech to be played at the accessory device 401. In other embodiments, the response may be an audio stream of music or similar media to be played at accessory device 401. In still other embodiments, the response may be an indication for accessory device 401 to play a piece of audio stored locally at accessory device 401, such as a notification chime specific to the accessory.
Depending on its capabilities, accessory device 401 may not be configured to store the received audio response, which may be due to space limitations, the streaming nature of the audio response, or other reasons. As described in more detail below with respect to fig. 7, as part of the response to the user request, accessory interaction instance 430 can provide instructions to accessory device 401 for processing received audio when accessory device 401 receives multiple audio responses or other audio inputs simultaneously or nearly simultaneously. The instructions may include rules for ducking or muting portions of the output audio in favor of other portions, or other rules for mixing or balancing the levels of output audio generated from the received audio responses. For example, the user may set a timer at the accessory device 401 and then later request that the accessory play a music stream. When the timer goes off, the accessory interaction instance on the user device 402 can send a response corresponding to the timer alert along with an instruction for the accessory device 401 to attenuate the streaming audio output such that the timer alert is audibly louder than the music.
Returning to the depiction in fig. 4, in some embodiments, the audio input repeater 416, accessory input plug-in 422, wake word detection module 424, and accessory audio repeater 472 represent multiple instances of the same or similar processes or modules running on the user device 402. Instances of these elements correspond to associated accessory devices in the same manner that accessory interaction instance 430 corresponds to its associated accessory device. Depending on the software architecture, an accessory interaction instance as described herein may also encompass one or more of these instanced elements, such that accessory interaction instance 430 includes one or more of the audio input repeater 416, accessory input plug-in 422, wake word detection module 424, and accessory audio repeater 472. Various ways of arranging the device architecture to provide per-accessory instances of the various processes are contemplated.
Continuing with fig. 4, user device 402 may have its own microphone and speaker input and output functions represented by microphone input 474 and speaker output 476 within media module 470. In several embodiments, the user device 402 is capable of receiving audio input and processing the audio directly through the wake word detection module 426 and the device interaction instance 440, which may include a digital virtual assistant and server interaction module 442, a delegation process 444, and a text-to-speech module 446. Other embodiments feature a user device that cannot directly receive audio input but can still process audio data transmitted from an accessory device to an accessory interaction instance. For example, a media player user device (e.g., a television media player) may not have microphone input and therefore cannot "hear" any user audio input. The media player may still be associated with one or more accessory devices and include one or more corresponding accessory interaction instances. Processing of the user request directly through the user device 402 occurs in a manner similar to processing of audio received by the accessory device. Because the device interaction instance 440 and the accessory interaction instance 430 are separate, the user device 402 can process both the second user request received directly by the user device 402 and the user request received from the accessory.
FIG. 5 is a flow chart illustrating a particular exemplary process 500 for an accessory device and a user device to detect a user request and take action based on the user request. Each of the elements and operations depicted in fig. 5 may be similar to one or more elements depicted in other figures described herein. For example, user device 502 may be similar to other user devices, and so on. In some implementations, the process 500 may be performed within a residential environment (e.g., the residential environment 200 of fig. 2). Process 500, as well as processes 800 and 900 of figs. 8 and 9 (described below), are illustrated as logic flow diagrams, each of which represents a series of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the operations may be combined in any order and/or in parallel to implement the processes.
Additionally, some, any, or all of these processes may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) that is executed jointly on one or more processors, by hardware, or a combination thereof. As described above, the code may be stored on a computer readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. The computer readable storage medium is non-transitory.
At block 504, accessory device 501 can receive user utterance 503. The user utterance may correspond to the audio input 120 of fig. 1. The user utterance 503 may include a portion (e.g., "computer") corresponding to a trigger or wake word and a portion corresponding to a user request.
At block 506, the accessory device 501 may process the portion of the user utterance 503 corresponding to the wake word in a first pass to determine the presence of the wake word. The first pass may be performed in a time and resource efficient manner to determine when a wake word may be present. At decision 508, based on the first pass processing, accessory device 501 determines whether a wake word is present. If not, the process may terminate at endpoint 510 by ignoring the user utterance. If, according to the first pass, there is a wake word, then the process continues to block 512.
At block 512, accessory device 501 can establish a streaming audio connection with user device 502. The connection may occur through one of the networks to which the accessory device 501 and the user device 502 are connected (e.g., through a WiFi LAN). Streaming audio may use any number of methods or protocols including, but not limited to, AirPlay, real-time transport protocol ("RTP"), real-time streaming protocol ("RTSP"), and the like. The user utterance 503 is then transmitted to the user device 502 as streaming audio. In some implementations, the portions of the user utterance 503 corresponding to the wake word and the user request can be received by the accessory device 501 at an interval. Because the wake word portion is buffered during the first-pass processing, that portion may be sent separately to the user device 502 over a buffered audio connection via the WiFi LAN using the transmission control protocol ("TCP") or another suitable method of sending recorded audio data.
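The hand-off at block 512 can be sketched as follows: the wake word portion, buffered during the accessory's first-pass processing, is sent as recorded data (e.g., over TCP), while the remainder of the utterance follows over the streaming audio connection. Function and parameter names are illustrative.

```python
# Two-path transmission of a single utterance: buffered wake word portion
# first, then the live remainder of the user request as a stream.

def transmit_utterance(wake_buffer, live_frames, send_buffered, send_stream):
    send_buffered(wake_buffer)      # buffered wake word portion (e.g., TCP)
    for frame in live_frames:       # remainder of the request (e.g., RTP/RTSP)
        send_stream(frame)

buffered, streamed = [], []
transmit_utterance(b"computer", [b"frame-1", b"frame-2"],
                   buffered.append, streamed.append)
```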
At block 514, the user device 502 receives the user utterance 503 and any other streaming audio transmitted from the accessory device, which may include a portion of a longer user request transmitted after the accessory device processed the wake word and opened the streaming audio connection. At block 516, the user device 502 may process the wake word in a second pass. The processing may occur at a second level, where the presence of the wake word may be confirmed with a higher degree of probability than by the first-pass processing at the accessory device 501 at block 506. At decision 518, if the user device 502 does not confirm the presence of the wake word, the process moves to block 520 and terminates the streaming audio connection. The process then moves to endpoint 522 and ignores the user utterance. If the user device 502 does confirm the presence of the wake word, the process moves to block 524 and begins processing the other portions of the user utterance received from the streaming audio connection or transmitted with the wake word portion.
Block 524 may include additional blocks 526-534. These blocks represent portions of the process related to the processing of speech by the user device 502 or a remote server in communication with the user device 502. Blocks 526-534 are not necessarily arranged in a particular implied order and processing audio may require execution of one, some, or all of these blocks. At block 526, the user device 502 may connect to a remote service located at a remote server. The speech processing may include NLP or other speech analysis services provided as part of a remote service. In block 528, the user device, or in some embodiments, one or more of the remote services, may determine an identity of the user speaking the user utterance 503. The identification may be based on user information or user profile data stored at the user device 502 or stored at a remote device to which the remote service has access. Similarly, at block 530, the user device or remote service may determine a language of the user utterance. The determination may be based on user information associated with block 528 or according to NLP analysis or the like. At block 532, the user's request is determined as a result of the NLP or other speech processing analysis. This step includes parsing the request to determine the appropriate response and which device or devices should execute the response. In some implementations, at block 534, the process may determine whether the identified user is authorized to make the request contained in the user utterance 503. For example, the user may not have authorization to access a streaming music service that the user device 502 can access and transmit to the accessory device. In this case, the response to the request may be to play a message to the user indicating that they lack proper authorization. Other authorizations may encompass device-level authorizations. 
For example, the process at block 534 may determine that the accessory device is not authorized to interact with the associated accessory interaction instance, to request a particular response, or to receive a particular response delegated for execution. Still other authorizations may encompass higher-level functions, such as the user not having authorization to make a specific type of voice request to one or more devices within the residential environment, or to make any voice request at all.
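A hypothetical sketch of the layered authorization check at block 534: an environment-wide voice-request right, then per-request-kind rights, then device-level rights. The profile keys are invented for illustration.

```python
# Layered authorization: any failing layer denies the request with a reason
# that can be turned into a spoken message for the user.

def authorize(user_profile, accessory_id, request_kind):
    if not user_profile.get("voice_requests_allowed", True):
        return (False, "user may not make voice requests")
    if request_kind in user_profile.get("denied_request_kinds", set()):
        return (False, "user not authorized for request kind: " + request_kind)
    if accessory_id in user_profile.get("blocked_accessories", set()):
        return (False, "accessory not authorized for this user")
    return (True, "authorized")

profile = {"denied_request_kinds": {"streaming_music"}}
```

A denial at any layer can map to the example above in which the response is a message telling the user they lack proper authorization.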
In the event that the audio is processed and a response is determined, process 500 moves to decision 536. Depending on the nature of the request, some or all of the response may require an audio response to be played to the user at the accessory device 501. If the request requires an audio response, the process moves to block 538 and the user device may synthesize an audio message for the user. The synthesis may be performed by a text-to-speech module on the user device 502. The synthesis may also include selecting from a plurality of previously prepared responses corresponding to commonly issued requests or conditions to which user device 502 may respond. At block 540, the prepared response is transmitted to accessory device 501, which plays the response at endpoint 542.
At endpoint 544, the user device performs the request according to the response determined from the audio processing. Execution of the response may include delegating one or more elements of the response to other processes on the user device or another device, including other user devices or accessory devices or remote devices in the residential environment. Some examples include delegation to a music streaming process or voice communication service, which may then connect the user device 502 to the accessory device and transmit the streaming music or voice communication to the accessory device 501.
FIG. 6 is a simplified block diagram 600 illustrating an exemplary architecture of a system for detecting user requests and taking actions based on the user requests, in accordance with some embodiments. The illustration includes a representative user device 602, one or more accessory devices 604, a representative accessory device 606, one or more networks 608, and a server device 610. Each of these elements depicted in fig. 6 may be similar to one or more elements depicted in the other figures described herein. In some embodiments, at least some elements of diagram 600 may operate within the context of a residential environment (e.g., residential environment 200 of fig. 2).
The accessory devices 604 and representative accessory device 606 may be any suitable computing device (e.g., smart speaker, smart watch, smart thermostat, camera, etc.). In some embodiments, the accessory devices can perform any one or more of the operations of the accessory devices described herein. Depending on the type of accessory device and/or the location of the accessory device (e.g., within or outside of the residential environment), the accessory device may be enabled to communicate over network 608 (e.g., including a LAN or WAN) using one or more network protocols (e.g., a Bluetooth connection, a Thread connection, a Zigbee connection, a WiFi connection, etc.) and network paths, as further described herein.
In some embodiments, server device 610 may be a computer system including at least one memory, one or more processing units (or processors), a storage unit, a communication device, and an I/O device. In some embodiments, server device 610 may perform any one or more of the operations of the server devices described herein. In some embodiments, these elements may be implemented in a similar manner (or in a different manner) as described with reference to similar elements of user device 602.
In some embodiments, the representative user device 602 may correspond to any one or more of the user devices described herein. For example, the user device 602 may correspond to one or more of the user devices of the residential environment 200 of fig. 2. The representative user device may be any suitable computing device (e.g., a mobile phone, a tablet, a smart hub speaker device, a smart media player communicatively connected to a TV, etc.).
In some embodiments, one or more networks 608 may include an internet WAN and LAN. As described herein, a residential environment may be associated with a LAN, wherein devices within the residential environment may communicate with each other through the LAN. As described herein, the WAN may be external to the residential environment. For example, a router associated with a LAN (and thus with a residential environment) may enable traffic from the LAN to be transmitted to a WAN, and vice versa. In some embodiments, the server device 610 may be external to the residential environment and thus communicate with other devices over a WAN.
As described herein, user device 602 may represent one or more user devices connected to one or more of networks 608. The user device 602 has at least one memory 612, a communication interface 614, one or more processing units (or processors) 616, a storage unit 618, and one or more input/output (I/O) devices 620.
Turning in further detail to each element of the user device 602, the processor 616 may be implemented in hardware, computer-executable instructions, firmware, or a combination thereof, as appropriate. Computer-executable instructions or firmware implementations of processor 616 may include computer-executable instructions or machine-executable instructions written in any suitable programming language to perform the various functions described.
Memory 612 may store program instructions that may be loaded and executed on processor 616, as well as data generated during execution of such programs. Depending on the configuration and type of user device 602, memory 612 may be volatile (such as random access memory ("RAM")) or non-volatile (such as read-only memory ("ROM"), flash memory, etc.). In some implementations, the memory 612 may include a variety of different types of memory, such as static random access memory ("SRAM"), dynamic random access memory ("DRAM"), or ROM. The user device 602 may also include additional storage 618, such as removable storage or non-removable storage, including, but not limited to, magnetic storage, optical disks, and/or tape storage. The disk drives and their associated computer-readable media may provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computing device. In some embodiments, storage 618 may be used to store data content received from one or more other devices (e.g., server device 610, other user devices, accessory device 604, or representative accessory device 606). For example, storage 618 may store accessory management settings, accessory settings, and user data associated with users affiliated with the residential environment.
The user device 602 may also contain a communication interface 614 that allows the user device 602 to communicate with a storage database, another computing device or server, a user terminal, or other device via the network 608. The user device 602 may also include an I/O device 620, such as for enabling connection to a keyboard, mouse, pen, voice input device, touch input device, display, speakers, printer, etc. In some implementations, the I/O device 620 may be used to output an audio response, or other indication as part of performing a response to a user request.
Memory 612 may include an operating system 622 and one or more application programs or services for implementing the features disclosed herein, including a communication module 624, a user interface module 626, a speech processing module 630, an accessory interaction instance 632, and a management module 634. The speech processing module 630 also includes a wake word module 636, and the accessory interaction instance 632 also includes a digital assistant 638.
The communication module 624 may include code that causes the processor 616 to generate instructions and messages, transmit messages, or otherwise communicate with other entities. For example, the communication module 624, in conjunction with the management module 634, can transmit and receive data associated with accessory settings, accessory management settings, accessory scores from the accessory devices 604, 606, other user devices, or server device 610. As described herein, the communication module 624 may transmit messages via one or more network paths of the network 608 (e.g., via a LAN or internet WAN associated with a residential environment).
The user interface module 626 may include code that causes the processor 616 to present information corresponding to accessory devices and user devices present within the residential environment. For example, the user interface module 626 can present graphical representations of user devices and accessory devices currently associated with each accessory device. In some embodiments, the user interface module 626 may allow a user to provide configuration information regarding a new accessory device to be added to the residential environment, or allow a user to select a user device or an accessory device to remove from the residential environment.
The speech processing module 630 may include code that causes the processor 616 to receive and process audio input corresponding to speech or other sounds that may be analyzed by the techniques described herein. In some implementations, one or more of the operations of the speech processing module 630 may be similar to those described with reference to block 524 of fig. 5. Wake word module 636 may include code that causes processor 616 to receive and process a portion of the audio input corresponding to the trigger or wake word. In some implementations, one or more of the operations of wake word module 636 may be similar to those described with reference to block 516 of fig. 5 or wake word detection module 424 of fig. 4. For example, wake word module 636 may analyze a portion of the audio input to determine the presence of wake words. In some embodiments, the speech processing module can also determine a language corresponding to the audio input and use the language to inform analysis of the wake word portions.
Accessory interaction instance 632 can include code that causes processor 616 to receive and process a portion of an audio input corresponding to a user request. In some implementations, one or more of the operations of the accessory interaction example 632 can be similar to those described with reference to the accessory interaction example 430 of fig. 4. For example, the accessory interaction instance 632 can include a plurality of processes or services that can cause the processor 616 to send and receive data to a remote service, delegate execution of a response to another process or service, or synthesize an audio response based on speech analysis of the portion of the audio input. The accessory interaction instance 632 can include a digital assistant 638 that can perform one or more of these exemplary operations, as well as additional operations related to interactions between the accessory devices 604, 606 and the user device 602 as described herein. The accessory interaction instance 632 can also include an election scoring module 644. In some implementations, multiple accessory devices 604, 606 can receive the same audio input. In those cases, the election scoring module 644 may include code that causes the processor to calculate a score for the wake word received and processed at the user device (e.g., by the wake word module 636) and then transmit the score to the associated accessory device.
The management module 634 can include code that causes the processor 616 to send information to and receive information from one or more accessory devices 604, 606 or other user devices. In some embodiments, one or more of the operations of the management module may be similar to those described with reference to the accessory management module 320 of fig. 3 and the management module 414 of fig. 4. For example, the management module can operate in conjunction with the communication module 624 to transmit and receive information corresponding to the accessory devices 604, 606 associated with the user device 602. The management module 634 can include accessory management settings 640 and a new accessory setup module 642. The accessory management settings 640 can include information corresponding to one or more accessory devices 604, 606 and their associations with one or more user devices within the residential environment. The accessory management settings 640 can also include information corresponding to features and capabilities of the accessory devices 604, 606, the user device 602, or other user devices. In some embodiments, the user device 602 may be a configuration device that may send and receive accessory information to add a new accessory device to the residential environment. The new accessory setup module 642 may perform one or more of the processes involved in configuring the association between the new accessory device and the selected user device.
Turning now to the details of the representative accessory device 606, in some embodiments the accessory device 606 can have at least one memory 650, a communication interface 652, a processor 654, a storage unit 656, and an I/O device 658. As described herein with respect to user device 602, these elements of the accessory device may have the same appropriate hardware implementations as their counterparts on user device 602.
The memory 650 of the accessory device 606 can include an operating system 664, and one or more applications or services for implementing the features disclosed herein, including a communication module 660, an audio module 662, and an ADK 670. As described herein with respect to user device 602, communication module 660 may have similar appropriate functionality as its corresponding communication module 624.
The audio module 662 may include code that causes the processor 654 to receive, process, and transmit audio signals in conjunction with the I/O device 658. In some implementations, one or more of the operations of the audio module may be similar to those described with reference to the accessory audio module 412 of fig. 4. For example, audio module 662 may receive user utterances or other audio inputs at a microphone using I/O device 658 and transmit the audio data to user device 602 via a streaming audio channel or other suitable connection. The audio module 662 may also receive response audio from the user device 602 and play the audio at speakers within the I/O device 658.
The ADK 670 may include code that causes the processor 654 to receive and process a portion of the audio input corresponding to a trigger or wake word. In some implementations, one or more of the operations of ADK 670 may be similar to those described with reference to ADK 408 of fig. 4. ADK 670 may include a voice detection module 672 and a wake word module 674. Wake word module 674 may include code that causes processor 654 to receive and process wake words. In some embodiments, one or more of the operations of wake word module 674 may be similar to those described with reference to block 506 of fig. 5 or wake word detection module 410 of fig. 4. For example, the wake word module 674 may analyze a portion of the audio input to determine the presence of wake words.
In some implementations, the ADK 670 may also include an accessory election module 676. The accessory election module 676 can include code that causes the processor 654 to send scores to, and receive scores from, the accessory devices 604 and the user device 602, and causes the processor 654 to compare the scores received from the other accessories to determine a winning score. For example, when the wake word module receives a wake word for processing, the accessory election module 676 can communicate with the other accessory devices 604 to hold an election to determine which accessory device should respond to the wake word. This process is described in detail below with reference to fig. 8.
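The election among accessories that heard the same wake word can be illustrated as follows: each accessory exchanges the score its user device computed, and every device independently picks the same winner. The scoring inputs, the highest-score rule, and the tie-break on accessory id are assumptions made so the example is deterministic.

```python
# Deterministic wake word election: highest score wins, ties broken by
# accessory id so every participant reaches the same verdict independently.

def elect_winner(scores):
    """scores: mapping of accessory id -> wake word score from the hub."""
    return max(scores, key=lambda aid: (scores[aid], aid))

def should_respond(my_id, my_score, peer_scores):
    all_scores = dict(peer_scores)
    all_scores[my_id] = my_score
    return elect_winner(all_scores) == my_id
```

Because every accessory evaluates the same mapping with the same rule, at most one accessory responds to a given utterance.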
Fig. 7 is another simplified block diagram 700 illustrating an exemplary architecture of an accessory device 702 that receives and processes multiple communications from a user device, in accordance with some embodiments. Each of the accessory device 702 and the depicted elements therein may be similar to other accessory devices and similar elements depicted in other figures described herein, including accessory device 606 of fig. 6. In some implementations, the accessory device 702 can perform any one or more of the operations of the accessory device described herein. Accessory device 702 can be any suitable computing device (e.g., a smart speaker, a smart watch, etc.) and can include memory 710, processor 724, storage unit 726, I/O device 728, and communication interface 730. Each of these elements may have suitable implementations in hardware, firmware, computer-executable instructions, or combinations thereof, and have functions similar to the computing devices described in detail above with respect to fig. 6 for user device 602 and accessory device 606.
Turning in more detail to elements of the memory 710, the memory 710 may include a communication module 712, an audio module 714, an operating system 716, and an ADK 720. Each of these elements may be similar to those described with reference to accessory device 606 of fig. 6. In some implementations, the ADK 720 may include audio mixing logic 722. The audio mixing logic may include code that, in combination with the communication module 712 and the audio module 714, causes the processor 724 to receive, process, combine, mix, and output one or more audio sources. The audio sources may include one or more streaming audio inputs 754 or request responses 750, 752. In addition, the audio mixing logic 722 may also receive mixing rules 756. The mixing rules 756 may instruct the accessory device 702 to perform one or more audio mixing processes on the received audio input via its audio mixing logic 722 or other elements of the ADK 720. The mixing rules 756 can be transmitted by the associated user device to the accessory device 702 and can be part of the request responses 750, 752, the streaming audio input 754, or a separate communication. The mixing rules 756 may provide instructions corresponding to a desired volume of the incoming audio response, a desired volume of any other audio response to be output concurrently with the incoming audio response, whether the currently playing audio stream should be paused or muted during output of the incoming audio response or other audio responses, and so forth.
As an example of the foregoing embodiment, consider a scenario in which accessory device 702 is currently playing a music stream on its speaker in response to a previous user request. The user then makes another request at the accessory device (e.g., "Computer, what time is it?"). The request may be processed at the associated user device, and the accessory device 702 may receive an audio response (e.g., "The time is 10:30 PM."). The audio response may be accompanied by mixing rules generated by the user device that indicate that the audio response is to be played at the speaker at a specific volume greater than that of the music stream and that the music stream should be faded to a lower volume. The accessory device can apply the mixing rules and play the audio response at a volume greater than the music stream. As another example, consider adding a second request corresponding to an alarm clock (e.g., "Computer, set an alarm for 10:30 PM."). In this case, the response to the time request and the response to the alarm request may arrive at the accessory device 702 at the same time or nearly the same time. The user device may generate a mixing rule indicating that both the music stream and the alarm indication should be muted until the time response is announced, then the music stream restored to its faded volume and the alarm indication played at a volume greater than the music stream. Many other rules, parameters, or combinations thereof are contemplated. In some implementations, the audio mixing logic 722 may store the mixing rules 756 such that the rules persist and apply to future request responses and other received audio. Subsequent mixing rules may modify or update the stored rule set. In other embodiments, the mixing rules 756 are transient and apply only to one or more request responses or audio inputs currently received or played at the accessory device 702.
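The mixing-rule behavior in the examples above can be sketched as follows, assuming (for illustration only) that a rule carries a target volume for the incoming response and a "duck" level for streams already playing; the rule fields and function names are hypothetical:

```python
# Minimal sketch of applying a mixing rule received from the user device:
# existing streams are faded ("ducked") to a lower volume while the
# incoming audio response plays on top at its requested volume.
from dataclasses import dataclass

@dataclass
class MixingRule:
    response_volume: float   # volume for the incoming audio response
    duck_volume: float       # volume to fade already-playing streams to

def apply_rule(active_streams: dict, rule: MixingRule) -> dict:
    """Return a new volume map: every active stream is ducked, and the
    incoming response is added at its requested volume."""
    mixed = {name: rule.duck_volume for name in active_streams}
    mixed["response"] = rule.response_volume
    return mixed
```

For the time-request example, a music stream playing at volume 0.8 would be ducked to 0.2 while the announcement plays at 1.0.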
Fig. 8 is a flow chart illustrating a process 800 of an accessory device 801 determining which accessory device of a plurality of accessory devices is to respond to a user request, in accordance with some embodiments. The accessory device 801 may correspond to any one or more of the accessory devices described herein. In some embodiments, some or all of these processes may be performed by one or more user devices or another device (e.g., a server device), which may correspond to any of the user devices or server devices/server computers described herein, respectively.
At block 802, the accessory device 801 may receive a wake word. In some implementations, the wake word can correspond to a portion of the audio input received at the accessory device 801 that includes a user utterance. Similar to the wake word detection module 410 of fig. 4, block 802 may also encompass processing the wake word at a first level.
At block 804, the accessory device 801 may establish an audio channel with a user device. The accessory device 801 may transmit a wake word to the user device over an audio channel. At block 806, the accessory device can receive an accessory response score from the user device. The accessory response score may be based on an analysis of wake words sent to the user device. In some embodiments, the score is based on criteria including, but not limited to, the strength (e.g., loudness) of the audio input received at the accessory device, the quality of the received audio input, the capabilities of the accessory device, and the capabilities of the user device associated with the accessory device.
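One way to combine the listed criteria into a single accessory response score is a weighted sum; the 0-to-1 normalization and the weights below are assumptions for illustration, since the text specifies only the criteria themselves (input strength, input quality, accessory capabilities, user device capabilities):

```python
# Hypothetical sketch of computing the accessory response score from the
# criteria listed above. All inputs are assumed normalized to 0.0 - 1.0,
# and the weights are illustrative, not part of the described system.

def response_score(loudness: float, quality: float,
                   accessory_cap: float, user_device_cap: float) -> float:
    """Combine the scoring criteria into one comparable number."""
    weights = (0.4, 0.3, 0.2, 0.1)   # assumed relative importance
    factors = (loudness, quality, accessory_cap, user_device_cap)
    return sum(w * f for w, f in zip(weights, factors))
```

An accessory that heard the utterance more loudly would thus outscore an otherwise identical accessory that heard it faintly.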
At block 808, the accessory device 801 may open an election communication channel with one or more accessory devices that may have received audio input from the same user utterance or other audio source. The election communication channel may also be configured to communicate with one or more user devices such that participants in the election include both accessory devices and user devices. In some embodiments, the communication channel may exist on one or more networks with which the accessory device may communicate. In other embodiments, an ad hoc network or other small area network or LAN may be established for purposes of the election. For example, the accessory device can use Bluetooth to establish an anonymous election connection to send and receive election scores.
At block 810, the accessory device 801 may transmit its accessory response score to other accessories connected to the election communication channel. At block 812, the accessory device receives competing scores from one or more other accessories. In some scenarios, no other accessory device or user device transmits a score to accessory device 801, in which case the accessory device may proceed as if it won the election. Once scores are received from all participant devices, the process may move to block 814, where the accessory device 801 compares its score with all other received scores. Meanwhile, in some embodiments, the other participant devices perform the same or similar comparison between their own scores and the score transmitted by accessory device 801. The comparison may comprise a simple comparison between the numerical scores. In some embodiments, the scores may be generated in a manner that ensures that there is a unique winner (i.e., no tie). As depicted herein in fig. 8, a "better" score is indicated as being greater than a less desirable score, such that the election winner will have the highest score. Other scoring systems that change the comparison hierarchy but do not change the outcome of the election process as described herein are possible.
At decision 816, if the accessory device receives a score from another accessory that exceeds its own score, the process may proceed to endpoint 818 and the accessory device may ignore the wake word. Ignoring the wake word may include terminating the audio channel with the associated user device. If the response score of the accessory device 801 is greater than all other scores received from participant devices, the accessory device 801 has won the election. The process moves to endpoint 820, where accessory device 801 reports its win to its associated user device. The election may determine which device of a plurality of devices receiving the audio input is the preferred device for responding to the input. The winning device may proceed with other operations related to receiving or performing a response to the audio input.
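The comparison in blocks 810 through 820 can be sketched as follows. Representing each participant as a (score, device_id) pair is one hypothetical way to guarantee the unique winner the text calls for, since tuple comparison falls back to the device identifier when scores tie; the text does not specify this mechanism:

```python
# Sketch of the election decision: an accessory wins if its (score, id)
# pair exceeds every received competitor pair, or if no competitors
# responded at all (blocks 812-820).

def wins_election(own: tuple, received: list) -> bool:
    """own and each received entry are (score, device_id) pairs.
    Returns True if this accessory should respond to the wake word."""
    if not received:
        return True     # no competitors: proceed as the election winner
    return all(own > other for other in received)
```

Because device identifiers are unique, every election produces exactly one winner even when numeric scores are equal.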
Fig. 9 is another simplified flowchart illustrating an exemplary process 900 for a user device to coordinate interactions with multiple accessory devices. In some embodiments, one or more of the operations of process 900 may be similar to those described with reference to fig. 1 and 4.
At block 902, a user device may receive information identifying one or more accessories capable of communicating with the user device. In some implementations, one or more of the operations of block 902 may be similar to one or more of the operations described with respect to process indicator 330 with reference to fig. 3.
At block 904, the user device can implement an accessory interaction instance corresponding to each of the accessories identified in block 902. In this way, the user device may have an accessory interaction instance associated with each identified accessory such that operations performed by the user device to interact with the accessory device may be managed by one accessory interaction instance without affecting interactions of the user device with other accessory devices. In some implementations, one or more of the operations of block 904 may be similar to one or more operations of block 102 of fig. 1.
At block 906, the user device may receive a first audio input from a first accessory and a second audio input from a second accessory, wherein the first accessory and the second accessory are accessories of those previously identified in block 902 and associated with the accessory interaction instance in block 904. The first audio input and the second audio input may correspond to user utterances or other audio sources received at the first accessory and the second accessory. In some implementations, the first audio input and the second audio input may have the same audio source.
At block 908, the first accessory interaction instance of the user device may process at least a portion of the first audio input received in block 906. The first accessory interaction instance may be the accessory interaction instance corresponding to the first accessory. The portion of the first audio input can correspond to a user request, such that the accessory interaction instance's processing can parse or otherwise analyze the request and determine a response. Processing the portion of the first audio input may include transmitting the portion to a server computer for analysis. The first audio input need not contain a request, and the process may determine an appropriate response based on the analyzed portion. In some implementations, one or more of the operations of block 908 may be similar to one or more of the operations of block 524 of fig. 5.
At block 910, the first accessory interaction instance can receive a first response from the server computer corresponding to the processed portion of the first audio input. In some embodiments, the analysis of the portion of the audio input is performed by the server computer, such that the server computer analyzes any requests or determines a first response corresponding to the processed portion. The analysis may include techniques like NLP or other speech processing. In some implementations, one or more of the operations of block 910 may be similar to one or more of the operations of blocks 524 and 532 of fig. 5.
At block 912, the user device may transmit the first response to the first accessory. In some implementations, one or more of the operations of block 912 may be similar to one or more of the operations of block 540 of fig. 5.
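The per-accessory isolation that process 900 relies on can be sketched as one interaction-instance object per identified accessory, so that handling one accessory's audio never touches another accessory's state. The class and method names below are illustrative only, and the server round-trip of blocks 908-910 is reduced to a stub:

```python
# Sketch of blocks 902-912: the user device builds one interaction
# instance per identified accessory and routes each accessory's audio
# through its own instance.

class AccessoryInteractionInstance:
    def __init__(self, accessory_id: str):
        self.accessory_id = accessory_id
        self.history = []            # per-accessory state, isolated

    def process(self, audio_portion: str) -> str:
        # Stand-in for forwarding the portion to a server computer for
        # analysis (block 908) and receiving the response (block 910).
        self.history.append(audio_portion)
        return f"response to '{audio_portion}' for {self.accessory_id}"

def build_instances(accessory_ids):
    """Block 904: one interaction instance per accessory from block 902."""
    return {aid: AccessoryInteractionInstance(aid) for aid in accessory_ids}
```

Processing input from one accessory leaves every other instance's state untouched, which is the property the text attributes to the per-accessory design.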
Exemplary techniques for coordinating interactions between a user device and a plurality of accessory devices are described above. Some or all of these techniques may be implemented, at least in part, by an architecture such as those illustrated at least in fig. 1-9 above, but need not be implemented by such an architecture. While many embodiments are described above with reference to a server device, an accessory device, and a user device, it should be appreciated that other types of computing devices may be suitable for performing the techniques disclosed herein. Further, various non-limiting examples are described in the foregoing description. For purposes of explanation, numerous specific configurations and details are set forth in order to provide a thorough understanding of the examples. It will be apparent, however, to one skilled in the art that some examples may be practiced without these specific details. Furthermore, well-known features are sometimes omitted or simplified in order not to obscure the examples described herein.
Techniques for load balancing hub devices and multiple endpoints
Additional embodiments of the present disclosure may provide techniques for managing the association of accessory devices and hub devices. As a first example, consider a residential environment corresponding to a residence. A person in the home may want to know the current time. The person may query an accessory device (e.g., a nearby smart speaker) within the residential environment with a verbal request (e.g., "What time is it now?"). The accessory device can determine that the request is directed to the device and then transmit the received audio information to a hub device (e.g., a hub speaker). The hub device may process the audio information to determine the nature of the request and prepare a corresponding response (e.g., "It is 10:30 PM."). Alternatively, or in part in combination with the above, the user device may transmit some or all of the verbal request to a server computer (e.g., implementing a service provider), where the service provider may determine the nature of the request and/or prepare a corresponding response. The hub device may then transmit the response back to the accessory device for playback to the user. To provide coordination between the accessory device and the hub device, the user device may be configured to assign the accessory device to that particular hub device. The user device can monitor the hub device and other hub devices in the residential environment and reassign the accessory device to another hub device if that hub device would be better suited to handle user requests or other interactions with the accessory. In addition, the user device may be configured to assign a newly added accessory device to one of the hub devices based at least in part on various factors (e.g., accessory capability, hub capability, etc.).
As an illustration of the above examples, a residential environment may include numerous "smart" devices, e.g., electronic devices having features that allow them to operate interactively and autonomously to some extent. A smart device (which may have various functions) may be a camera, speaker, thermostat, pair of earphones or headphones, telephone, or media player. A smart device may also have various network communication capabilities, including WiFi, Ethernet, Bluetooth, Zigbee, cellular, and the like. These devices may be produced by different manufacturers. In some cases, smart devices may be categorized as hub devices and accessory devices. A hub device may be a resident device of the residence (e.g., a smart speaker, a smart digital media player configured to control a television (TV), a mobile phone, etc.). While not always the case, in some examples resident devices are expected to remain within the home and move infrequently (e.g., within or outside the home). A hub device may have capabilities that equal or exceed those of an accessory device. For example, the hub device may be a mobile phone that includes wireless (e.g., WiFi) and cellular communication capabilities, multimedia capabilities, and a device assistant. In this same example, the accessory device may be a smart speaker that includes audio media and wireless communication capabilities but lacks a device assistant. The device assistant may be a virtual assistant program configured to interact with the user. In these examples, a smart speaker may be a hub device or an accessory device, depending on its capabilities. In some examples, if the accessory is manufactured by an entity other than the entity that manufactured the user device, the accessory may not initially be configured with the capability to communicate with the user device.
In some cases, a user device manufacturer may provide an accessory development kit ("ADK") for installation on an accessory that enables such communication after the accessory is manufactured, sold, supplied, or used. The user device may be a hub device as described herein and may include user interface features. In some embodiments, the user device is a leader device selected from hub devices in a residential environment. As used herein, the terms hub device, user device, and leader device may indicate one or more similar devices that are distinct from accessory devices.
In some embodiments, the user device may obtain information about accessory devices and hub devices present in the residential environment. This information may be obtained by a user device in direct communication with an accessory device and a hub device sharing the same network within the residential environment. In other embodiments, information about the accessory device and the hub device may be transmitted to the user device by a second user device, the leader device, or a remote server device. For example, a user in a home may add a new accessory device to the home environment. As part of this process, the user may interact with a second user device (e.g., a mobile phone) to configure the new accessory device and send new accessory device information to the first user device. As another example, a leader device in a residential environment may have information about multiple accessory devices and hub devices in the residential environment and report information about some or all of these devices to a user device. The user device may then use this information to assign the accessory device to the appropriate hub device. The accessory information and hub device information may be stored by the user device.
The information received by the user device from the hub device and the accessory device may correspond to characteristics, attributes, or capabilities of the hub device and the accessory device. For example, the hub device may be a hub speaker having a microphone input, a speaker output, and a device assistant supporting multiple spoken languages. The attributes of the hub speaker would then identify that the hub speaker can receive audio input (through the microphone), generate audio output (through the speaker), and process user requests (through the device assistant). Further, as an attribute, the hub speaker may support multiple-language processing such that the hub speaker may be able to process user requests in various spoken languages. Continuing with this example, the accessory device can be a smart thermostat with a touch screen, a microphone input, no speaker output, and a connection to a furnace in the residential environment. The attributes of the smart thermostat would then identify that the thermostat can interact with the user at the touch screen, present visual information or other indications to the user at the touch screen, receive audio input, cannot produce audio output, and can operate the furnace to control ambient environmental conditions in the home. However, one of ordinary skill will appreciate that this is only one example, and that most, if not all, accessories will have speakers for output. In some embodiments, the attribute information may also include information corresponding to the computing capabilities of the device, including the current processing load of the device. Furthermore, each hub device may have a limited number of connection slots corresponding to a maximum number of associated accessories that the hub device can support. Each accessory associated with the hub occupies one slot, leaving the unoccupied slots available for association with other accessories.
The attribute information may also include the network communication capabilities of the device. Any particular capability or function of a device may be an identifiable attribute, and many combinations of capability and device functions are contemplated and described in the several examples herein.
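An illustrative shape for the exchanged attribute information might look like the following; the field names are assumptions, since the text only enumerates the kinds of attributes involved (I/O capabilities, language support, processing load, and connection slots):

```python
# Hypothetical record for a hub device's reported attributes, including
# the connection-slot bookkeeping described above: each associated
# accessory occupies one slot out of a fixed maximum.
from dataclasses import dataclass

@dataclass
class HubAttributes:
    device_id: str
    capabilities: set     # e.g. {"microphone", "speaker", "assistant"}
    languages: set        # spoken languages the device assistant supports
    total_slots: int      # maximum number of associated accessories
    occupied_slots: int = 0

    @property
    def available_slots(self) -> int:
        """Slots still free for association with other accessories."""
        return self.total_slots - self.occupied_slots
```

An accessory's attribute record would be analogous, with required attributes flagged so the user device can rule out hubs that lack them.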
Once the user device has received attribute information from the hub device and the accessory device, the user device may score the hub devices to determine which hub device is the "best" hub device to associate with the accessory device. The score may be based on whether the accessory device requires a particular attribute of the hub device. For example, a camera accessory may require that any hub device with which it will be associated possess a hardware video decoder, such that any hub that does not have the required decoder will receive a low score or a score of zero from the user device. As another example, an accessory smart speaker may require specific language support from any hub device associated with it. One hub device may implement that language support only partially, while a second hub device may implement it fully and with high quality. The first hub may receive a low score and the second hub a higher score, making the second hub the best hub to associate with the smart speaker.
In some embodiments, the hub device may receive a base score corresponding to its general computing power. For example, a tablet or smart phone hub device may have a higher base score than a hub speaker due to its greater computing power. However, in other examples, the smart phone hub device may have a lower base score due to its transient nature (e.g., the smart phone may move around and/or leave the home more frequently than other devices). Transient devices may not be the best hubs because, as they move, they may drop their associated accessory devices and force those accessory devices to re-associate with other hubs (which may create delays within the system). The base score may be modified by comparing the accessory and hub attributes. The final connection score may then be calculated by multiplying the modified score by the number of available connection slots at each hub. The hub device with the highest connection score is considered the best hub device for the accessory, and the user device may transmit information to the best hub device to associate the accessory with it.
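The scoring procedure described above (a base score reflecting general computing power, modified by the attribute comparison, then multiplied by the hub's available connection slots) can be sketched as follows; the numeric values in the usage are invented for illustration:

```python
# Sketch of the connection-score computation: a hub missing a required
# accessory attribute scores zero; otherwise its attribute-modified base
# score is scaled by its free connection slots.

def connection_score(base: float, meets_requirements: bool,
                     attribute_bonus: float, available_slots: int) -> float:
    """Return 0 if a required accessory attribute is unmet; otherwise the
    modified score multiplied by the hub's available slots."""
    if not meets_requirements:
        return 0.0
    modified = base + attribute_bonus
    return modified * available_slots

def best_hub(hubs: dict) -> str:
    """hubs maps hub_id -> (base, meets_requirements, bonus, slots)."""
    return max(hubs, key=lambda h: connection_score(*hubs[h]))
```

With invented numbers mirroring the fig. 10 scenario, a tablet with a higher modified score but zero free slots loses to a hub speaker with two open slots.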
Fig. 10 is a simplified block diagram 1001 of an exemplary embodiment. Process 1000 is an exemplary high-level process flow for a system including an accessory device 1010 requesting an assignment to a hub device via a user device 1012. Diagram 1001 shows the state of the system corresponding to the blocks of process 1000. Process 1000 may be performed within a residential environment that includes a plurality of accessory devices, user devices, and hub devices. As depicted herein, accessory device 1010 may be a smart speaker and user device 1012 may be a hub speaker. Although described as particular devices, it should be apparent that the accessory device 1010, user device 1012, or hub devices 1014, 1016 may be any of several types of smart devices in various combinations and numbers. For example, a smart phone, a media device (e.g., a smart TV), or a tablet (connected to a cellular network, to a local area network via WiFi of a residential network, or to a wide area network ("WAN")) may perform one or more of the operations of process 1000.
Turning in more detail to process 1000, at block 1002, user device 1012 may receive an allocation request from accessory device 1010. The allocation request may occur when accessory device 1010 recognizes that it is not currently associated with a hub device. For example, accessory device 1010 can be a new accessory device that is introduced into a residential environment. As another example, a hub device previously associated with accessory device 1010 may have lost network connection with a device in the residential environment such that accessory device 1010 needs to be associated with a new hub device to properly relay user requests or other functions. In other cases, accessory device 1010 may be discarded by its currently associated hub device. This may occur when a currently associated hub device reduces the total number of its connection slots (e.g., instructed to do so by the user device or another device due to an increase in computational load).
At block 1004, the user device 1012 may receive information from the accessory device 1010, the first hub device 1014, and the second hub device 1016. The information may include attributes 1020 from the accessory device 1010 and attribute information 1024, 1026 from the hub devices 1014, 1016. As described in more detail below with respect to fig. 12, the accessory attributes 1020 can include any feature of the accessory device 1010 related to its association with a hub device. Some accessory attributes may be classified as requirement attributes. Similarly, the hub information 1024, 1026 may include any attribute or characteristic of the hub devices 1014, 1016 relevant to selecting the best hub device. In some embodiments, the user device 1012 may receive the hub attribute information 1024, 1026 upon request. The user device 1012 may store the received hub and accessory attribute information, either locally at the user device or at a remote device such as a server computer. In other embodiments, the hub devices 1014, 1016 may periodically update the stored attribute information 1024, 1026 without a request from the user device 1012. For example, the hub devices 1014, 1016 may update their attribute information stored at the server device whenever these attributes change (e.g., if the number of accessory slots decreases) or at regular intervals (e.g., every minute). The user device 1012 may then access the stored data to receive the hub attribute information 1024, 1026 without directly querying each hub within the residential environment.
Moving down to block 1006, the user device 1012 may compare the accessory attribute 1020 with the first hub information 1024 and the second hub information 1026. The comparison may produce a score or other metric to determine which of the first hub device 1014 and the second hub device 1016 should be assigned to the accessory device 1010. In some implementations, each of the hub devices 1014, 1016 may be given a base score corresponding to the general computing capabilities of each device. The user device 1012 may then modify the base score by combining the base score with the comparison result. The connection score for each hub device is calculated by multiplying the modified score by the number of slots available at hub device 1014 and hub device 1016. The user device 1012 may then assign the accessory device 1010 to the hub device with the highest connection score, as shown in block 1008.
As an example of the foregoing embodiment, consider the scenario depicted in fig. 10, wherein accessory device 1010 is a smart speaker, hub device 1014 is a hub speaker, and hub device 1016 is a tablet computer. Because the tablet may have greater general computing power than the hub speaker, the hub device 1016 may be assigned a higher base score than the hub device 1014. The smart speaker may only require basic language support for one language in the residential environment, and both the hub speaker and the tablet may provide that support. Thus, each hub receives the same nonzero score when the required attributes of the accessory device 1010 are compared against the hub information 1024, 1026. Regarding other accessory features, a tablet may provide both a microphone and a speaker, as well as a touch screen, while a hub speaker may provide only a microphone and speaker. The smart speaker's accessory attributes may indicate that a hub device that can visually present information (e.g., via voice-to-text) is beneficial. With all other attributes equal, the tablet may receive a higher comparison score than the hub speaker. The user device 1012 may then calculate a modified score for the second hub device 1016 that is significantly higher than the modified score for the first hub device 1014. However, the tablet may have only one accessory connection slot, which is already occupied by another accessory, while the hub speaker may have two open slots. The user device 1012 may then calculate a connection score for each hub device: the modified score of the hub speaker is multiplied by two, and the modified score of the tablet by zero. The result is that the hub speaker is the winning device, and the user device 1012 can assign the accessory device 1010 to the hub device 1014, as depicted. The above description should be considered only one example of the results of the disclosed process.
Those skilled in the art will recognize that a residential environment with many devices may produce a variety of different results for attachment/hub assignments, depending on the dynamic nature of the hub attributes, available connection slots, and migration of hub devices and attachment devices into or out of the residential network.
Fig. 11 is a schematic diagram of a residential environment 1100 containing hub devices and accessory devices, in accordance with some embodiments. Hub devices may include a hub speaker 1102, a media player 1104, and a smart phone 1106. These hub devices may correspond to the hub devices 1014, 1016 from the embodiments described above with respect to fig. 10. The accessory devices may include smart speakers 1112, 1114, a smart watch 1116, and a thermostat 1124. Similarly, these accessory devices may correspond to accessory device 1010 described with respect to fig. 10. All or some of these accessory devices may be third-party devices (e.g., not manufactured, programmed, or supplied by the manufacturer, programmer, or supplier of the hub device or user device). Thus, they may not be automatically and/or initially compatible with the user device. Each hub device in the residential environment 1100 may be associated with zero, one, or more accessory devices. As shown by the long dashed lines, hub speaker 1102 is associated with smart speakers 1112, 1114 and smart watch 1116, while media player 1104 is associated with thermostat 1124. Smart phone 1106 is not associated with an accessory device. Devices within the home environment 1100 may be configured to communicate over one or more networks associated with the home environment 1100 using one or more network protocols. For example, residential environment 1100 may be associated with a local area network ("LAN"), WAN, cellular network, personal area network, or other network, and the devices may communicate using WiFi connections, Bluetooth connections, Thread connections, Zigbee connections, or other communication methods.
The arrangement of associations between accessory devices and hub devices may include various different combinations and may be modified by the user device. For example, the user device may receive an allocation request from an accessory device within the residential environment. The accessory device may be one of the accessory devices previously associated with a hub device in the home or may be a new accessory device added to the home. The user device may then obtain attribute information from the accessory device and from hub speaker 1102, media player 1104, and smart phone 1106, and score each hub device. Because smart phone 1106 is not currently associated with an accessory device, it may have the highest score and receive the assignment of the new accessory. In some embodiments, a user device in the residential environment 1100 can communicate with a hub device to transfer one or more accessory devices associated with a first hub device to a second hub device. The transfer may occur automatically based on information received by the user device about the residential environment 1100, including but not limited to information that another hub device may be more suitable for association with one or more accessories or that accessories have been added to or removed from the residential environment 1100. The suitability of any particular hub device for association with an accessory may be based at least in part on the capabilities of the hub device, the capabilities of the accessory device, the current processing load experienced by the hub device, the locations of the devices within the residential environment, and the status of communication between devices on the network. Many other criteria for rearranging device associations in a residential environment are contemplated.
In some embodiments, accessory devices and non-resident hub devices may also leave the residential environment or lose network connectivity with the residential environment. An accessory device leaving the residential environment may be disassociated by its previously associated hub device, such that the hub device updates its accessory slots to account for the newly available slot. Accessory devices associated with hub devices that lose network connectivity to the residential environment may be reassigned by a user device that maintains network connectivity. In this case, the user device may receive information that the hub device is no longer able to communicate with the accessory device and reassign the accessory device. Some embodiments may have a user device designated as a leader device to manage allocation of accessory devices between hub devices within the residential environment. In other embodiments, if the user device and an accessory device are associated and leave the residential environment such that network connectivity is lost, the user device may maintain its association with the accessory device and perform the methods described herein.
In another embodiment, a user device acting as a leader device may monitor hub devices within a residential environment to determine whether each hub device can respond to its accessories within an acceptable timeframe. In some cases, a hub device may experience an increased processing load such that it cannot respond to one or more of its assigned accessories within a threshold amount of time. The user device may receive information about the hub device indicating that it has delayed a response to an accessory request and instruct the hub device to discard one or more of its accessory devices. The user device may then determine to which other hubs within the residential environment the discarded accessory devices should be assigned.
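As a rough illustration (not from the patent), the leader's latency check can be sketched in Python; the threshold value, class name, and reporting mechanism are all assumptions:

```python
class LeaderMonitor:
    """Hypothetical leader-side bookkeeping: hubs whose last observed
    response latency exceeds a threshold are flagged so that their
    accessories can be discarded and reassigned."""

    def __init__(self, threshold_seconds=2.0):  # assumed acceptable timeframe
        self.threshold = threshold_seconds
        self.latencies = {}  # hub name -> last observed response latency (s)

    def report_latency(self, hub, seconds):
        self.latencies[hub] = seconds

    def hubs_to_offload(self):
        # Hubs that delayed responses beyond the threshold amount of time.
        return [h for h, lat in self.latencies.items() if lat > self.threshold]
```

In this sketch, a hub reporting a 3.5-second response against a 2-second threshold would be flagged, and the leader would then rescore the remaining hubs for the discarded accessories.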
Returning to fig. 11, as an example of the foregoing description of some embodiments, hub speaker 1102 may discard its association with smart watch 1116. This may occur due to increased processing load at the hub speaker 1102, such as where the hub speaker is currently streaming separate music channels to the smart speakers 1112, 1114. Because smart watch 1116 no longer has an associated hub, the smart watch may request an allocation from a user device, which may be smart phone 1106. The smart phone 1106 may then receive attribute information from the hub devices in the residential environment 1100. The attribute information may include attribute information about the smart phone 1106 itself, as in some embodiments the user device may act similarly to a hub device. The smart phone 1106 may then score each hub device based on a comparison of the attribute information and the accessory characteristics of smart watch 1116. Because the hub speaker 1102 has recently discarded the accessory smart watch 1116 due to increased demand, the hub speaker may reduce the total number of its accessory connection slots such that no slots are available. Smart phone 1106 may then determine the winning hub device between itself and media player 1104 and assign smart watch 1116 accordingly.
Continuing with FIG. 11, the residential environment 1100 can have a plurality of users 1130, 1134 issuing audio requests 1132, 1136 for accessories. The requests 1132, 1136 may occur separately or simultaneously and may be received by multiple accessory devices, as indicated by the short dashed lines. For example, request 1132 may be received by smart speaker 1114 or smart watch 1116, while request 1136 may be received by smart speaker 1112 and thermostat 1124. As previously mentioned, the accessory device and its associated arrangement may take various forms and may change over time. Thus, the user request may be received by a plurality of accessory devices associated with different hub devices. For example, the user request 1136 is received by both the thermostat 1124 associated with the media player 1104 and the smart speaker 1112 associated with the hub speaker 1102. Because user interaction with devices in the residential environment 1100 can occur in several arrangements, the ability of the user device or leader device to assign an attachment to the optimal hub device can prevent missing user requests. If the accessory devices are discarded by their hub devices, as described in the previous examples, or if the hub devices lose network connectivity, the accessory devices may not be able to process the user request unless the accessory devices are quickly reassigned to the best available hub devices.
Hub device management by the user device or leader device may also allow the hub device to receive and process audio requests 1132, 1136 that require responses at one or more accessory devices other than the accessory device that originally received the request. In these embodiments, the accessory device may not have information about other accessories within the residential environment and no mechanism to establish trust or permissions with other accessories on one or more of the residential networks. For example, smart thermostat 1124 and smart watch 1116 may be third party accessories manufactured by two different entities and thus may not have the inherent ability or permission to directly connect to each other to transfer or exchange data and information. The user device may manage the allocation of accessories to hub devices so that each hub may establish a connection with each of its accessories and transmit information received from the other hubs to those accessories without the accessories knowing any other devices in the residential environment except the hub to which it is allocated.
As a specific example of the foregoing embodiment, consider the case where, when the user 1130 is about to leave the home, the user 1134 wants to issue an announcement reminding the user 1130 to purchase milk at the grocery store. The smart thermostat 1124 assigned to the media player 1104 may hear the user request 1136 most clearly because the user 1134 is in the same room as the thermostat 1124. Upon processing the user request, the media player 1104 may transmit a response to the hub speaker 1102 and the smart phone 1106 (e.g., the other known hub devices in the residential environment). The announcement may be played at each capable device within the home, including smart phone 1106, hub speaker 1102, smart speakers 1112, 1114, and smart watch 1116, to ensure that it is heard by user 1130. The announcement may also be selectively transmitted by the hub speaker 1102 to the smart watch 1116 so that it is presented by the most appropriate device. In this way, a user request may be received at one third party device (the thermostat) and executed at a second third party device (the smart watch) without requiring either third party device to communicate directly with the other. Consider another example in which user 1130 takes smart phone 1106 and leaves the home environment before user 1134 issues the announcement request. Since the smart watch 1116 has left the residential environment, the smart watch may lose network connectivity with the hub speaker 1102. Upon losing connectivity, smart watch 1116 may request an allocation to another hub device. Smart phone 1106 or another user device may assign smart watch 1116 to another suitable hub device, which may be smart phone 1106 itself, as the smart phone remains in close proximity to smart watch 1116. The smart phone 1106 may also maintain network connectivity to the residential environment via its cellular network connection to the internet WAN.
When the user 1134 issues the announcement to buy milk at the grocery store, the request may again be processed by the media player 1104 and transmitted to the hub speaker 1102 and smart phone 1106 for playback. Since the hub speaker 1102 is no longer assigned to the smart watch 1116, the hub speaker will take no further action to fulfill the announcement request. Smart phone 1106 may receive the announcement and transmit it to its accessory smart watch 1116 for audio playback to user 1130.
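The relay behavior in these examples can be sketched as follows. This is an illustrative Python fragment, not the patent's data model: the assignment table and function names are assumptions. Each hub forwards an announcement only to the accessories currently assigned to it, so no accessory needs to know about any device other than its hub:

```python
def accessories_for_hub(assignments, hub):
    """Return the accessories a given hub should forward an announcement to
    (assignments maps accessory name -> assigned hub name)."""
    return [acc for acc, assigned_hub in assignments.items() if assigned_hub == hub]

def relay_announcement(announcement, assignments, hubs):
    """Every hub relays the announcement only to its own accessories;
    returns a per-hub delivery plan."""
    return {hub: accessories_for_hub(assignments, hub) for hub in hubs}
```

For instance, after the smart watch is reassigned from the hub speaker to the smart phone, only the smart phone's entry in the plan contains the smart watch.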
Fig. 12 is a schematic diagram illustrating a process 1200 in which a user device 1201 assigns an accessory device 1206 to one of hub devices 1202. In some embodiments, the user device 1201 may correspond to a user device described herein (e.g., user device 1012 of fig. 10). Similarly, accessory 1204 and accessory device 1206 can correspond to other accessory devices, while hub device 1202 can correspond to other similar hub devices described herein. The user device 1201, depicted here as a hub speaker, may be a hub device, a leader device, a configuration device, or other device for determining device assignments and associations in the hub device 1202. User device 1201 can be configured to communicate with hub device 1202, accessory 1204, and accessory device 1206 over one or more networks (including LANs or WANs) described herein. In some embodiments, user device 1201 is a remote server device configured to communicate with hub device 1202, accessory 1204, and accessory device 1206 over a WAN.
The various elements of the allocation process 1200 are presented in more detail. The hub devices 1202 may each include hub attribute information, including hub attributes 1210 and slots 1216. The information may be stored in a memory or storage unit at the hub device. In some embodiments, the hub information may be stored at a remote device such as a server computer or cloud device. In other embodiments, the hub information is stored at the user device 1201 and updated periodically. Hub attributes 1210 may include any attribute or feature of a hub device 1202 relevant to selecting the best hub device. This may include language processing 1212, hardware decoder 1214, supported network connections (e.g., WiFi, cellular, or Thread), the ability to act as an edge router for a personal area network (e.g., a Thread border router), and the types of other accessories currently associated with the hub device. The slots 1216 may include a number of available accessory connection slots 1218 and a number of currently assigned slots 1220. The total number of slots 1216 is a dynamic quantity and may be changed or updated by the hub device according to changes in the residential environment. If the hub device no longer has enough slots 1216 available to support its accessories, a change in the total number of slots 1216 may result in an assigned accessory being removed from the hub device. In some embodiments, if the number of slots 1216 becomes less than the number of assigned accessories, accessories assigned to the hub based on matching required characteristics 1226 are disassociated from the hub only if no other accessory devices can be disassociated first. As depicted herein, an exemplary hub device may have a total of four slots, with two slots available and two slots assigned to "accessory 1" (thermostat) and "accessory 2" (camera).
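The slot bookkeeping described above can be sketched in Python. This is an illustrative model, not the patent's implementation; the preference for keeping requirement-matched accessories follows the disassociation rule stated above:

```python
class HubSlots:
    """Illustrative model of a hub's dynamic accessory connection slots."""

    def __init__(self, total_slots):
        self.total = total_slots  # dynamic; may change with the environment
        self.assigned = []        # list of (accessory_name, requirement_matched)

    @property
    def available(self):
        return self.total - len(self.assigned)

    def assign(self, accessory, requirement_matched=False):
        if self.available <= 0:
            raise RuntimeError("no accessory connection slots available")
        self.assigned.append((accessory, requirement_matched))

    def resize(self, new_total):
        """Update the dynamic slot count; if it drops below the number of
        assigned accessories, discard accessories, disassociating
        requirement-matched ones only when nothing else can be dropped."""
        self.total = new_total
        dropped = []
        while len(self.assigned) > self.total:
            # Prefer dropping accessories not assigned on a requirement match.
            victim = next((a for a in self.assigned if not a[1]), self.assigned[0])
            self.assigned.remove(victim)
            dropped.append(victim[0])
        return dropped
```

In the four-slot example above, shrinking the total to one slot would drop the camera before the thermostat if the thermostat's assignment was requirement-matched.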
Accessory device 1206 can be a representative accessory among accessories 1204 and can include information corresponding to accessory status 1222 and accessory characteristics 1224. As used herein, the terms accessory characteristic and accessory attribute are used interchangeably. The accessory status 1222 is information corresponding to whether the accessory 1206 is currently assigned to a hub. If the accessory is assigned, the accessory status 1222 will identify the assigned hub. If the accessory is not assigned (e.g., the accessory is new to the residential environment), the accessory status 1222 may indicate that the accessory 1206 requires assignment, thereby indicating that the accessory requests assignment from the user device 1201. In some embodiments, an accessory device may not be assigned to a hub and may not request a hub assignment. In these cases, an unassigned accessory may transmit its status information to the user device 1201 but not request assignment to a hub device. Accessory characteristics 1224 may include any feature of accessory device 1206 relevant to its association with a hub device, including, but not limited to, language support requirements, hardware encoding/decoding requirements, input/output ("I/O") devices present (e.g., a speaker and microphone or a screen), types of supported network connections (e.g., WiFi and Bluetooth), and external device controls (e.g., control of light switches or a home oven). Some accessory attributes may be classified as required attributes.
Completing the detailed elements of fig. 12, the progress indicators 1230, 1240 represent data transmissions between the user device 1201 and the accessory device 1206 and between the user device 1201 and the hub devices 1202, respectively. The progress indicators 1230, 1240 may indicate communication between the various devices, as described herein, over one or more networks, including but not limited to a WiFi LAN or internet WAN. The progress indicator 1230 indicates the transmission of data corresponding to the allocation request and the accessory characteristics 1224 to the user device 1201. The data may include an indication that the accessory device 1206 is no longer in communication with its assigned hub device. Similarly, the progress indicator 1240 indicates the transmission of data, including the hub attributes 1210 and the slots 1216, between the hub devices 1202 and the user device 1201.
Process 1200 provides a more detailed picture of the scoring process described above with respect to block 1004 of fig. 10. Once the user device 1201 has received the accessory characteristics 1224 from accessory device 1206 and the hub attributes 1210 and slots 1216 from the hub devices 1202, the user device may score the hub devices 1202 to determine the best hub. The scoring begins by determining a base score for each hub device. The user device 1201 may then compare the accessory characteristics 1224 with the hub attributes 1210. The comparison begins with the required characteristics 1226 of the accessory device 1206. Because accessory device 1206 has indicated that these attributes are required, if the comparison does not match the required characteristics 1226 with the corresponding attributes 1210 of a hub device, the resulting score for that hub device will be zero. This zero result will occur regardless of the hub device's score on the other attributes of the accessory device 1206. In some cases, a hub device may only partially match the required characteristics 1226 (e.g., by providing a required characteristic at low quality). The user device 1201 may give the hub device a nonzero score as a result of partially satisfying the requirement. Once the required characteristics 1226 are compared, the user device 1201 compares the other accessory characteristics 1228 with each of the features in the hub attributes 1210. The results are then combined with the base score to obtain a modified score. The connection score for each hub device may be calculated by multiplying the modified score by the number of slots available at each of the hub devices 1202.
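A minimal Python sketch of this scoring flow, under the simplifying assumption that any missing required characteristic zeroes the score (the partial-match case described above is omitted); the attribute names, base score, and per-match weight are illustrative:

```python
from dataclasses import dataclass

@dataclass
class HubInfo:
    name: str
    attributes: set       # features the hub offers, e.g. {"language:en", "hw_decoder"}
    slots_available: int  # open accessory connection slots

BASE_SCORE = 1.0  # assumed starting score for every hub

def score_hub(hub, required, other):
    """Score one hub against an accessory's characteristics."""
    # Missing any required characteristic zeroes the score outright,
    # regardless of how well the other attributes match.
    if required - hub.attributes:
        return 0.0
    # Each matched optional characteristic raises the modified score.
    modified = BASE_SCORE + sum(1.0 for c in other if c in hub.attributes)
    # Connection score: modified score weighted by available slots.
    return modified * hub.slots_available

def pick_best_hub(hubs, required, other):
    scored = [(score_hub(h, required, other), h) for h in hubs]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > 0 else None
```

For example, a hub offering both language support and a hardware decoder with three open slots outscores a hub with more slots but fewer matched characteristics.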
Fig. 13 is a simplified block diagram 1300 illustrating an exemplary architecture of a system for assigning accessory devices to hub devices, in accordance with some embodiments. The illustration includes a user device 1302 (e.g., a leader device), one or more accessory devices 1304, a representative accessory device 1306, one or more networks 1310, and a server device 1312. In some examples, the user device 1302 may be one of a plurality of hub devices that has been elected as a leader of the hub device. Each of these elements depicted in fig. 13 may be similar to one or more elements depicted in the other figures described herein. In some embodiments, at least some elements of diagram 1300 may operate within the context of a residential environment (e.g., residential environment 1100 of fig. 11).
The accessory devices 1304 and representative accessory device 1306 can be any suitable computing device (e.g., smart speaker, smart watch, smart thermostat, camera, etc.). In some embodiments, the accessory devices can perform any one or more of the operations of the accessory devices described herein. Depending on the type of accessory device and/or the location of the accessory device (e.g., within or outside of the residential environment), the accessory device may be enabled to communicate over network 1310 (e.g., including a LAN or WAN) using one or more network protocols (e.g., a Bluetooth connection, Thread connection, ZigBee connection, WiFi connection, etc.) and network paths, as further described herein.
In some embodiments, server device 1312 may be a computer system including at least one memory, one or more processing units (or processors), a storage unit, a communication device, and an I/O device. In some embodiments, server device 1312 may perform any one or more of the operations of the server devices described herein. In some embodiments, these elements may be implemented in a similar manner (or in a different manner) as described with reference to similar elements of user device 1302.
In some embodiments, user device 1302 may correspond to any one or more of the user devices described herein. For example, user device 1302 may correspond to one or more of the user devices of residential environment 1100 of fig. 11. The representative user device may be any suitable computing device (e.g., a mobile phone, a tablet, a smart hub speaker device, a smart media player communicatively connected to a TV, etc.). Similarly, the hub device 1308 may be any device capable of performing the function of a user device.
In some embodiments, the one or more networks 1310 may include an internet WAN and LAN. As described herein, a residential environment may be associated with a LAN, wherein devices within the residential environment may communicate with each other through the LAN. As described herein, the WAN may be external to the residential environment. For example, a router associated with a LAN (and thus with a residential environment) may enable traffic from the LAN to be transmitted to a WAN, and vice versa. In some embodiments, the server device 1312 may be external to the residential environment and thus communicate with other devices over the WAN.
As described herein, user device 1302 may represent one or more user devices connected to one or more of networks 1310. The user device 1302 has at least one memory 1314, a communication interface 1316, one or more processing units (or processors) 1318, a storage unit 1320, and one or more input/output (I/O) devices 1322.
Turning in further detail to each element of the user device 1302, the processor 1318 may be implemented in hardware, computer-executable instructions, firmware, or a combination thereof, as appropriate. Computer-executable instructions or firmware implementations of processor 1318 include computer-executable instructions or machine-executable instructions written in any suitable programming language to perform the various functions described.
The memory 1314 may store program instructions that can be loaded and executed on the processor 1318 as well as data generated during execution of such programs. Depending on the configuration and type of user device 1302, memory 1314 may be volatile (such as random access memory ("RAM")) or non-volatile (such as read-only memory ("ROM"), flash memory, etc.). In some implementations, the memory 1314 may include a variety of different types of memory, such as static random access memory ("SRAM"), dynamic random access memory ("DRAM"), or ROM. The user device 1302 may also include additional storage 1320, such as removable storage or non-removable storage, including, but not limited to, magnetic storage, optical disks, and/or tape storage. The disk drives and their associated computer-readable media may provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computing device. In some embodiments, the storage 1320 may be used to store data content received from one or more other devices (e.g., the server device 1312, other user devices, the hub device 1308, the accessory device 1304, or the representative accessory device 1306). For example, the storage 1320 may store attachment management settings, attachment attributes, hub attribute information, and user data associated with users affiliated with the residential environment.
The user device 1302 may also include a communication interface 1316 that allows the user device 1302 to communicate with a storage database, another computing device or server, a user terminal, or other device via the network 1310. User device 1302 can also include I/O devices 1322 such as for enabling connection with a keyboard, mouse, pen, voice input device, touch input device, display, speakers, printer, etc.
Memory 1314 may include an operating system 1324 and one or more applications or services for implementing the features disclosed herein, including a communication module 1326, a user interface module 1328, a digital assistant 1330, and a management module 1332. Management module 1332 also includes scoring module 1334 and device attributes 1336.
The communication module 1326 may include code that causes the processor 1318 to generate instructions and messages, transmit messages, or otherwise communicate with other entities. For example, the communication module 1326 can transmit and receive data associated with accessory assignment requests, accessory characteristics, and hub device attributes from the accessory devices 1304, 1306, hub device 1308, other user devices, or server device 1312 in conjunction with the management module 1332. As described herein, the communication module 1326 may transmit the message via one or more network paths of the network 1310 (e.g., via a LAN or internet WAN associated with a residential environment).
The user interface module 1328 can include code that causes the processor 1318 to present information corresponding to the accessory device 1304 and the hub device 1308 that are present within the residential environment. For example, the user interface module 1328 may present a graphical representation of the hub device 1308 and the accessory devices 1304 currently associated with each hub device. In some embodiments, the user interface module 1328 may allow a user to provide configuration information regarding a new accessory device to be added to the residential environment, or allow a user to select the hub device 1308 or the accessory device 1304 to remove from the residential environment.
Digital assistant 1330 may include code that causes processor 1318 to receive and process user requests. A user request may be transmitted from an accessory device 1304 to the user device. The digital assistant 1330 may include voice processing capabilities and language support. The presence of the digital assistant 1330 and its features may constitute one or more of the attributes considered when comparing the accessory characteristics 1358 with the hub attributes to score the hub devices 1308.
The management module 1332 can include code that causes the processor 1318 to send information to and receive information from one or more accessory devices 1304, 1306, and to send information to and receive information from one or more hub devices 1308 or other user devices. For example, the management module 1332 can receive the allocation request and accessory characteristics from the accessory device 1306, or the hub attributes from the hub device 1308, in conjunction with the communication module 1326. The management module 1332 may also transmit information indicating an assignment to the optimal hub to one of the accessory device 1306 and the hub device 1308. In some embodiments, the management module 1332 may store information corresponding to accessory status of each accessory managed by the user device 1302 within the residential environment. The accessory status can identify which hub of the plurality of hub devices 1308 each accessory device of the plurality of accessory devices 1304 is assigned to.
Management module 1332 may include scoring module 1334 and device attributes 1336. The device attributes 1336 may include accessory characteristics 1358 received from the accessory device 1306 and hub attributes received from the hub device 1308. The device attributes 1336 may be stored in the memory 1314 or in the storage 1320 at the user device. In some embodiments, the device attributes may be stored at another device including the server device 1312 and received into the memory 1314 as the user device 1302 processes the allocation request. Scoring module 1334 may include code that causes processor 1318 to compare the accessory characteristics to hub attributes including device attributes 1336 to calculate a score for each of hub devices 1308.
Turning now to the details of the representative accessory device 1306, in some embodiments the accessory device 1306 may have at least one memory 1342, a communication interface 1344, a processor 1346, a storage unit 1348, and an I/O device 1350. As described herein with respect to user device 1302, these elements of the accessory device may have the same appropriate hardware implementations as their counterparts on user device 1302.
The memory 1342 of the accessory device 1306 can include an operating system 1354 and one or more applications or services for implementing the features disclosed herein, including a communication module 1352 and an accessory development kit ("ADK") 1356. The ADK may be a software development kit ("SDK") stored on the accessory device 1306 and configured to be executed or processed on the accessory device. As used herein, an SDK may include an application programming interface and associated software libraries sufficient to enable operation of other software within or associated with the SDK. In some embodiments, the ADK may be provided by an entity (e.g., their manufacturer) associated with the hub device 1308 or the user device 1302. As described herein with respect to user device 1302, communication module 1352 may have similar appropriate functionality as its corresponding communication module 1326.
ADK 1356 may include code that causes processor 1346 to determine the allocation status of accessory device 1306 and transmit an allocation request to user device 1302 if accessory device 1306 is not currently allocated to a hub device. ADK 1356 may include accessory characteristics, including required characteristics of accessory device 1306 and other characteristics. In some embodiments, the accessory characteristics may be similar to those described with reference to accessory characteristics 1224 of fig. 12. For example, the accessory device 1306 may require specific language support for any of its assigned hub devices 1308.
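The ADK's allocation-status check can be sketched as follows; this is an illustrative fragment, and the message shape, field names, and callback are assumptions rather than the ADK's actual interface:

```python
def maybe_request_assignment(accessory_status, characteristics, send):
    """If the accessory has no assigned hub, transmit an allocation
    request carrying its characteristics (including required ones)."""
    if accessory_status.get("assigned_hub") is None:
        send({"type": "allocation_request", "characteristics": characteristics})
        return True
    return False  # already assigned; nothing to do
```

An accessory requiring specific language support would include that requirement in the transmitted characteristics so the leader can zero out incompatible hubs during scoring.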
Fig. 14 is a flow chart illustrating a particular exemplary process 1400 for assigning an accessory 1402 to a selected hub device. Each of the elements and operations depicted in fig. 14 may be similar to one or more elements depicted in other figures described herein. For example, the user device 1401 may be similar to other user devices (e.g., leader devices), and so forth. In some embodiments, process 1400 may be performed within a residential environment (e.g., residential environment 1100 of fig. 11). Process 1400 and processes 1500, 1600, 1700, and 1800 of fig. 15, 16, 17, and 18 (described below) are illustrated as logic flow diagrams, each of which represents a series of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, etc. that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the operations may be combined in any order and/or in parallel to implement the process.
Additionally, some, any, or all of these processes may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) that is executed jointly on one or more processors, by hardware, or a combination thereof. As described above, the code may be stored on a computer readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. The computer readable storage medium is non-transitory.
At block 1408, the accessory 1402 can request an allocation to a hub device. The request may be initiated based on the accessory device identifying that it is not currently assigned to the hub device. This may be due to the accessory device being new to the residential environment or the previously assigned hub device discarding the accessory device. At block 1410, the user device 1401 may receive an allocation request. In some embodiments, the user device 1401 may be a leader hub device of hub devices in a residential environment. The leader hub device may be selected as a leader by another process, including a process that occurs at a server device or other device that determines leader capabilities between hub devices.
At block 1412, the user device 1401 can obtain accessory characteristics reported by accessory 1402 at block 1414. The accessory characteristics may be similar to accessory characteristics 1224 described with reference to fig. 12. In some embodiments, the user device 1401 can query the accessory 1402 for those features. In other embodiments, the accessory device transmits the accessory characteristics with the allocation request. In yet other embodiments, the accessory 1402 can periodically report its accessory characteristics to the leader device (or one or more other devices (including other leader devices)) or server device such that the leader device, other devices, or server device maintains up-to-date storage of the accessory characteristics. Similarly, at block 1416, the user device 1401 may obtain hub information from one or more hub devices. The hub information may be similar to the hub attributes 1210 and slots 1216 described in detail above with respect to fig. 12.
In some examples, a controller device may enable or disable the hub capability such that only devices that have the hub capability enabled may act as a hub for the accessory 1402. Similarly or alternatively, a device may enable or disable a digital and/or virtual assistant, which may be used to determine whether the device may act as a hub. The user device 1401 may maintain a list of all devices that have the hub capability (or assistant capability) enabled. Thus, when the user device 1401 obtains the hub information at block 1416, the user device may query only those devices (e.g., the first hub device 1404 and the second hub device 1406) because they have been listed as hub-enabled or assistant-enabled. Alternatively, when the user device 1401 is updated with information identifying hub-enabled or assistant-enabled devices, the hub information may already have been obtained from those hub devices (e.g., the user device 1401 may obtain each device's hub information at the time it learns which devices are hub-enabled and/or assistant-enabled). Additionally, in some examples, block 1416 may be performed when the user device 1401 becomes the leader and is thus configured to poll all other devices in the mesh for their capabilities. As described above, the user device 1401 may obtain hub information at that time, rather than in response to the accessory 1402 requesting an assignment at block 1408. In this case, at block 1416, the user device 1401 may obtain the hub information by retrieving it from its own memory, since the user device presumably received the hub information from the first hub device 1404 and/or the second hub device 1406 when the mesh was created or when the user device 1401 became the leader. In some examples, a hub device may periodically update its respective hub information (e.g., as features are added or removed). When this occurs, the user device 1401 may need to obtain updated hub information from each of the first hub device 1404 and the second hub device 1406.
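The bookkeeping described above — maintaining a list of hub-enabled or assistant-enabled devices and querying only those for hub information — can be sketched as follows. This is a minimal illustration; the class and field names are assumptions, not taken from the specification:

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    hub_enabled: bool = False        # hub capability toggled by a controller device
    assistant_enabled: bool = False  # digital/virtual assistant toggled on the device


def hub_candidates(devices):
    """Return only the devices that may act as a hub for an accessory:
    those with the hub capability or an assistant enabled."""
    return [d for d in devices if d.hub_enabled or d.assistant_enabled]


devices = [
    Device("first_hub", hub_enabled=True),
    Device("second_hub", assistant_enabled=True),
    Device("thermostat"),  # neither capability enabled: never queried
]
candidates = hub_candidates(devices)
```

Under this sketch, a leader device polling for hub information would iterate only over `candidates`, skipping devices that cannot act as hubs.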
As depicted in fig. 14, the hub devices may include a first hub device 1404 and a second hub device 1406. Process 1400 is not limited to only two hub devices and may be performed for any number of hub devices in a residential environment. As with the receipt of the accessory characteristics, in some embodiments, the user device 1401 may query the hub devices 1404, 1406, which report the hub information at blocks 1418 and 1420. The hub devices 1404, 1406 may also periodically report the hub information to the user device 1401 or to a server device to maintain up-to-date storage of the hub information. In these embodiments, the user device 1401 may receive the hub information by accessing a hub information repository stored at itself or at the server device.
At block 1422, the user device 1401 may calculate a base score for each of the first hub device 1404 and the second hub device 1406. The base score may be based on the general computing capability of the hub devices 1404, 1406. For example, the first hub device 1404 may be a tablet computer with a powerful processor and large on-board memory, while the second hub device 1406 may be a hub speaker with a less powerful processor and less memory. The base score of the tablet may be higher than the base score of the hub speaker. The base score may also depend on the current processing load experienced by the hub device. In the previous example, while the tablet may have greater computing capability than the hub speaker, the tablet may be continually used to perform other computing tasks (e.g., running an application or playing media) for users in the residential environment. Thus, the calculated base score may differ for the same hub device depending on the processing load on the hub device when the score is determined.
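One way to realize such a base score — raw compute capability discounted by the current processing load — is sketched below. The weighting and the particular inputs are assumptions chosen for illustration, not taken from the specification:

```python
def base_score(cpu_rating: float, memory_gb: float, current_load: float) -> float:
    """Hypothetical base score: general compute capability scaled down by
    current processing load (0.0 = idle, 1.0 = fully busy)."""
    capability = cpu_rating + 0.5 * memory_gb
    return capability * (1.0 - current_load)


# A tablet with a strong processor but a heavy streaming load can score
# below an idle hub speaker with modest hardware.
tablet = base_score(cpu_rating=10.0, memory_gb=8.0, current_load=0.9)
speaker = base_score(cpu_rating=4.0, memory_gb=2.0, current_load=0.1)
```

This captures the point made above: the same hub device may receive different base scores at different times, depending on its load when the score is computed.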
At block 1424, the user device 1401 may compare the received accessory characteristics with the hub information received from the first hub device 1404 and the second hub device 1406. In some implementations, one or more of the operations of block 1424 may be similar to one or more of the operations described with respect to block 1006 of fig. 10 or process 1200 of fig. 12. For example, the accessory 1402 can have required characteristics and other characteristics. The hub information can identify features and functions of the hub devices 1404, 1406 that can satisfy the accessory's required characteristics. The user device 1401 may compare the required characteristics with the attributes of the first hub device 1404 and the second hub device 1406 and calculate a modified score for each hub device. The modified score is a modification of the base score using the comparison result. A hub device that cannot satisfy the required characteristics of the accessory device is given a modified score of 0.
Many ways of quantifying the comparison of accessory characteristics to hub attributes are contemplated, and the scoring algorithm used by the user device 1401 may be updated or modified over time to provide different quantifications of the best hub device score. Consider, for example, a residential environment in which hub devices often encounter significant loads due to activities unrelated to interactions between the hub devices and accessory devices. Such loading may occur when many hub devices receive user interactions for performing other tasks, such as a tablet or media player streaming video content from the internet. In these cases, hub devices may frequently indicate that they cannot support the accessories to which they are assigned, which may result in frequent reassignment of the accessories. The stability of the connection between an assigned accessory device and its hub device may be improved by changing the scoring algorithm such that a hub device whose attributes indicate that it may be subject to frequent loads receives a lower score.
Returning to process 1400, at block 1426, the user device 1401 may calculate a final score for each of the first hub device 1404 and the second hub device 1406. The final score may be a first connection score for the first hub device 1404 and a second connection score for the second hub device 1406. The final score is calculated by multiplying the modified score obtained at block 1424 by the number of available connection slots of each hub device. If a hub device has no slots available, its final score may be 0. Similarly, if a hub device receives a modified score of 0 because it does not satisfy the required accessory characteristics, the hub device's final score may also be 0 regardless of the number of available slots the hub device has or its general computing capability.
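Putting the scoring rules of blocks 1422-1428 together — a modified score of 0 for a hub that cannot satisfy the required characteristics, and a final score of modified score times available slots — might look like the following sketch. The feature names and numeric values are illustrative assumptions:

```python
def modified_score(base: float, hub_features: set, required: set) -> float:
    """Zero if the hub cannot satisfy every required accessory
    characteristic; otherwise the base score carries through."""
    return base if required <= hub_features else 0.0


def final_score(modified: float, available_slots: int) -> float:
    """Final connection score: modified score times available slots, so a
    fully occupied hub scores 0 regardless of capability."""
    return modified * available_slots


required = {"audio"}  # the accessory's required characteristic(s)
hubs = {
    "first_hub":  final_score(modified_score(8.0, {"audio", "thread"}, required), 2),
    "second_hub": final_score(modified_score(9.0, {"audio"}, required), 0),  # no free slots
}
best = max(hubs, key=hubs.get)  # decision 1428: highest score wins
```

Note how the slot multiplier dominates: the second hub has the higher modified score, but with zero available slots its final score is 0 and the first hub is selected.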
At decision 1428, the final scores of the hub devices 1404, 1406 are compared, and the highest score wins, indicating the best hub device for the accessory assignment. If the first hub device 1404 has the higher connection score, the process moves to block 1430 and the user device 1401 assigns the accessory 1402 to the first hub device 1404. The assignment may include the first hub device 1404 receiving an assignment instruction identifying the accessory 1402 and allowing the first hub device to associate with the accessory 1402. The association may include, for example, creating an interaction instance comprising a software module for communicating with the accessory 1402 and processing user requests transmitted from the accessory 1402 to the first hub device 1404. The association may also include the first hub device 1404 establishing a connection with the accessory 1402. At endpoint 1434, the first hub device 1404 may update its slots to reflect one less available connection slot due to the newly associated accessory 1402. Conversely, if the second hub device 1406 has the higher connection score, the process may move to block 1436 and the user device 1401 may assign the accessory 1402 to the second hub device 1406. Block 1438 and endpoint 1440 are similar to block 1432 and endpoint 1434, but the operations are performed by the second hub device 1406. For example, upon receiving the assignment instruction at block 1438, the second hub device can associate itself with the accessory 1402 (e.g., establish a connection with the accessory 1402). In some examples, when the accessory 1402 first connects to a network (e.g., the same network to which each of the user device 1401, the first hub device 1404, and the second hub device 1406 is connected), the accessory 1402 can provide its information (e.g., an identifier, etc.) so that each device on the network knows about the accessory and how to connect to it. In this way, once the first hub device 1404 or the second hub device 1406 is assigned as the hub for the accessory 1402, it is already configured to connect.
Once the optimal hub device has been determined and assigned, the process may move to block 1442 to assign the accessory 1402 to the optimal hub device. The assignment of the accessory 1402 can include transmitting information to the accessory 1402 identifying which of the first hub device 1404 and the second hub device 1406 received the higher connection score. At endpoint 1444, the accessory 1402 can update its accessory status to reflect the current hub assignment. However, in some cases, block 1442 is skipped and the user device 1401 does not report any information back to the accessory 1402 regarding the hub assignment. In this case, at endpoint 1444, the accessory 1402 can instead update its accessory status when it is contacted (and/or connected) by either the first hub device 1404 or the second hub device 1406, as needed (e.g., as described above).
Fig. 15 is another flow diagram illustrating an exemplary process 1500 for reassigning an accessory device from one hub device to another hub device in accordance with an embodiment. Each of the elements and operations depicted in fig. 15 may be similar to one or more elements depicted in other figures described herein. For example, the user device 1501 may be similar to other user devices (e.g., leader devices), while the first hub device 1504, the second hub device 1506, and the third hub device 1508 may be similar to other hub devices, and so on. As with the previously described process 1400, in some embodiments, the process 1500 may be performed within a residential environment (e.g., the residential environment 1100 of fig. 11) or within any type of network environment (e.g., an office network, a school network, a general purpose network, etc.).
At block 1510, the first hub device 1504 may discard the currently assigned accessory 1502 (e.g., by closing a socket, etc., to end the session with the accessory 1502), for example, because the first hub device 1504 began to experience increased processing load due to other activities at the first hub device 1504. At block 1512, the first hub device 1504 may indicate to the accessory 1502 that the accessory has been discarded and disassociated. The first hub device may then update its slot information to reflect that the accessory 1502 is no longer associated. In some implementations, the first hub device 1504 may update its slot information in response to the increased processing load before reporting to the accessory 1502 that it is being discarded. In these implementations, the operations of blocks 1510, 1512, and 1514 may occur in a different order than depicted in the illustration of process 1500. The slot information may be similar to the slots 1216 described in detail with respect to fig. 12.
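The discard sequence of blocks 1510-1514 — ending the session, notifying the accessory, and freeing the connection slot — might be sketched as follows. The class and method names are illustrative, not from the specification:

```python
class Hub:
    def __init__(self, total_slots: int):
        self.total_slots = total_slots
        self.accessories = set()

    @property
    def available_slots(self) -> int:
        """Slot information: total slots minus currently associated accessories."""
        return self.total_slots - len(self.accessories)

    def discard(self, accessory: str) -> str:
        """Blocks 1510-1514 (sketch): end the session, update slot
        information, and return a notification for the accessory. The
        relative order of the notify and slot-update steps may vary."""
        self.accessories.discard(accessory)  # close socket / end session
        return f"{accessory}: discarded"     # indication sent to the accessory


hub = Hub(total_slots=4)
hub.accessories.update({"thermostat", "lamp"})
note = hub.discard("lamp")
free = hub.available_slots  # one slot freed by dropping the accessory
```

As the text notes, an implementation may update the slot information before or after notifying the accessory; this sketch simply does both in one step.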
At block 1516, upon receiving from the first hub device 1504 the indication that it is being discarded, the accessory 1502 may update its status to reflect that it is no longer assigned to the first hub device. At block 1518, the accessory can request an assignment from the user device 1501. Blocks 1520-1526 may include one or more operations similar to blocks 1410-1416 of fig. 14. At block 1526, the user device 1501 may receive second hub information from the second hub device 1506 and third hub information from the third hub device 1508. The second hub device 1506 may report its hub information at block 1528, while the third hub device 1508 may report its hub information at block 1530. In some embodiments, although not depicted in fig. 15, the user device 1501 may also obtain first hub information from the first hub device 1504, as well as hub information corresponding to its own attributes. In these embodiments, the first hub device 1504 may not report updated hub information to the user device 1501, such that the user device 1501 does not know that the first hub device 1504 has recently discarded the accessory 1502; the first hub device 1504 is then likely to lose the subsequent scoring process performed among the hub devices. Such an embodiment may represent an example in which the scoring algorithm and the communication between the hub devices and the user device 1501 are more efficient when the user device 1501 is agnostic to the particular reason the accessory 1502 makes the assignment request.
At block 1532, the user device 1501 scores the hub devices based on the received accessory characteristics and the hub information. Blocks 1532-1536 may include one or more operations similar to blocks 1422-1436 of fig. 14. At block 1536, the user device 1501 may assign the accessory 1502 to the second hub device or the third hub device according to the connection scores of the second hub device 1506 and the third hub device 1508. If the second hub device 1506 has the higher connection score, it may receive an assignment instruction at block 1542, update its slots accordingly at endpoint 1544, and connect to the accessory device 1502 at endpoint 1548. Similarly, if the third hub device 1508 has the higher connection score, it may receive an assignment instruction at block 1538, update its slots at endpoint 1540, and connect to the accessory 1502 at endpoint 1548. The operations at endpoints 1544, 1540, and/or 1548 may be performed in any order. In other words, the second hub device 1506 may connect to the accessory 1502 at endpoint 1548 before or after updating its slot information at endpoint 1544, and the same is true for endpoints 1548 and 1540.
As with the several previous blocks, one or more of the operations of block 1546 and endpoint 1548 may be similar to one or more of the operations of block 1442 and endpoint 1444 as described with respect to FIG. 14.
Fig. 16 is another flow diagram illustrating an exemplary process 1600 for reassigning an accessory device from one hub device to another hub device, in accordance with an embodiment. Each of the elements and operations depicted in fig. 16 may be similar to one or more elements depicted in other figures described herein. For example, user device 1601 may be similar to other user devices (e.g., leader devices), while first hub device 1604, second hub device 1606, and third hub device 1608 may be similar to other hub devices, and so forth. As with the previously described process 1400, in some embodiments, the process 1600 may be performed within a residential environment (e.g., the residential environment 1100 of fig. 11) or within any type of network environment (e.g., an office network, a school network, a general purpose network, etc.).
At block 1610, the first hub device 1604 may lose connectivity to the network or lose its connection with the accessory 1602, for example, because the first hub device is turned off or moved. In some examples, a hub device (e.g., first hub device 1604) that acts as a hub for an accessory (e.g., accessory 1602) may be configured to ping the accessory at intervals (e.g., every 15 seconds to 30 seconds, etc.) to inform the accessory that the hub device is still acting as its hub. If the accessory 1602 does not detect a ping within a threshold period of time, the accessory device 1602 may enter a discarded state at block 1616. The ping may be a "keep-alive" message sent to each connected accessory. In some examples, the "keep-alive" message may be a request from the first hub device 1604 to read a "ping" characteristic of the accessory 1602. When the accessory 1602 detects that the first hub device 1604 has read the "ping" characteristic, this informs the accessory 1602 that the first hub device 1604 is still acting as a hub and that the session is still active. Alternatively, rather than the hub devices sending "ping" messages to the accessory (and reading the accessory characteristic), the accessory devices may in some cases be configured to initiate a ping to their assigned hubs to verify the connection to each assigned hub. If the accessory pings its respective hub and cannot verify the connection, the accessory device 1602 may enter the discarded state at block 1616.
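The keep-alive behavior described above — an accessory entering a discarded state when no ping arrives within a threshold — can be sketched with timestamps. The threshold value and class names are assumptions (the text gives only a 15-30 second ping interval):

```python
PING_THRESHOLD_S = 45.0  # assumed: somewhat longer than the 15-30 s ping interval


class AccessoryLink:
    """Tracks, from the accessory's side, whether its hub is still alive."""

    def __init__(self, now: float):
        self.last_ping = now
        self.discarded = False

    def on_ping(self, now: float) -> None:
        """The hub read the accessory's 'ping' characteristic: session alive."""
        self.last_ping = now

    def check(self, now: float) -> bool:
        """Enter the discarded state if no ping arrived within the threshold."""
        if now - self.last_ping > PING_THRESHOLD_S:
            self.discarded = True
        return self.discarded


link = AccessoryLink(now=0.0)
link.on_ping(now=30.0)
still_ok = link.check(now=60.0)   # 30 s since last ping: within threshold
dropped = link.check(now=120.0)   # 90 s since last ping: discarded state
```

The same logic works for the alternative direction described above (accessory-initiated pings); only which side records `last_ping` changes.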
In some examples, the accessory 1602 can stay in the discarded state for a period of time (e.g., a "grace period") at block 1616 before requesting a new hub. If the first hub device 1604 comes back online and attempts to connect with the accessory 1602 (e.g., establish a new socket and/or session), the accessory 1602 can be reassigned to the first hub device 1604 (e.g., reconnected to the first hub device 1604) without requesting a new hub from the leader (e.g., user device 1601). However, if the grace period expires, the accessory 1602 instead transitions to block 1618, where the accessory 1602 requests assignment of a new hub from the leader (e.g., user device 1601).
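The grace-period logic — reconnect to the original hub if it returns in time, otherwise request a new assignment from the leader — might be expressed as follows. The grace-period length and function names are illustrative assumptions:

```python
GRACE_PERIOD_S = 60.0  # assumed length of the grace period


def next_action(discarded_at: float, now: float, old_hub_back: bool) -> str:
    """Decide what a discarded accessory should do.

    - If the original hub reconnects within the grace period, rejoin it
      without involving the leader.
    - Once the grace period expires, request a new hub from the leader
      (block 1618).
    """
    if old_hub_back and now - discarded_at <= GRACE_PERIOD_S:
        return "reconnect_to_original_hub"
    if now - discarded_at > GRACE_PERIOD_S:
        return "request_new_hub_from_leader"
    return "wait"  # still inside the grace period, original hub not back yet


a = next_action(discarded_at=0.0, now=30.0, old_hub_back=True)
b = next_action(discarded_at=0.0, now=30.0, old_hub_back=False)
c = next_action(discarded_at=0.0, now=90.0, old_hub_back=False)
```

The grace period avoids a needless reassignment round-trip through the leader when a hub's outage is only transient (e.g., a brief network drop).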
Blocks 1620-1626 may include one or more operations similar to blocks 1410-1416 of fig. 14 or blocks 1520-1526 of fig. 15. At block 1626, the user device 1601 may receive second hub information from the second hub device 1606 and third hub information from the third hub device 1608. The second hub device 1606 may report its hub information at block 1628, and the third hub device 1608 may report its hub information at block 1630. In some implementations, although not depicted in fig. 16, the user device 1601 may also obtain first hub information from the first hub device 1604, as well as hub information corresponding to its own attributes. In these embodiments, the first hub device 1604 may not report updated hub information to the user device 1601, such that the user device 1601 is unaware that the first hub device 1604 has recently discarded the accessory 1602; the first hub device 1604 is then likely to lose the subsequent scoring process performed among the hub devices. Such an embodiment may represent an example in which the scoring algorithm and the communication between the hub devices and the user device 1601 are more efficient when the user device 1601 is agnostic to the particular reason the accessory 1602 issued the assignment request.
At block 1632, the user device 1601 scores the hub devices based on the received accessory characteristics and hub information. Blocks 1632-1636 may include one or more operations similar to blocks 1422-1436 of fig. 14. At block 1636, the user device 1601 may assign the accessory 1602 to the second hub device or the third hub device based on the connection scores of the second hub device 1606 and the third hub device 1608. If the second hub device 1606 has the higher connection score, it can receive an assignment instruction at block 1642, update its slots accordingly at endpoint 1644, and connect to the accessory device 1602 at endpoint 1648. Similarly, if the third hub device 1608 has the higher connection score, it may receive an assignment instruction at block 1638, update its slots at endpoint 1640, and connect to the accessory 1602 at endpoint 1648. The operations at endpoints 1644, 1640, and/or 1648 may be performed in any order. In other words, the second hub device 1606 may connect to the accessory 1602 at endpoint 1648 before or after updating its slot information at endpoint 1644, and the same is true for endpoints 1648 and 1640.
As with the previous blocks, one or more of the operations of block 1646 and endpoint 1648 may be similar to one or more of the operations of block 1442 and endpoint 1444 as described with respect to fig. 14.
Fig. 17 illustrates an exemplary process 1700 for transferring hub management from one user device to another user device according to an embodiment. The server device 1703 may participate in the transfer of hub management responsibilities from the first user device 1701 to the second user device 1702. The first user device 1701, the second user device 1702, and the server device 1703 may correspond to any one or more of the user devices, hub devices, and server devices described herein. In some embodiments, the process 1700 may be performed within a residential environment (e.g., the residential environment 1100 of fig. 11).
At block 1704, the first user device 1701 may receive information or instructions to transfer hub leadership capability to another hub device within the residential environment. In some implementations, the user device 1701 may determine that it is no longer able to function as the leader device for the hub devices in the residential environment. For example, the user device 1701 may be a tablet computer experiencing significant processing load from playing streaming media for a user in the home. Upon determining that it is no longer suitable to provide management for the other hub devices, the tablet may transmit a request to transfer the leadership capability to another device. In other embodiments, another user device or the server device 1703 may instruct the first user device 1701 to relinquish its leader role. This may occur when a user reconfigures the devices within the residential environment and selects a different device to act as the default leader device. The first user device 1701 may then transmit a leader reassignment request to the server device 1703.
At block 1708, the server device 1703 may identify a new leader device. This operation is similar to the user device reassigning an accessory device to another hub device. The server device may select a new leader device based on several criteria including, but not limited to: the processing power of the new leader device, the number of hub devices and accessory devices present within the home, whether the new leader device is a resident device of the home, and whether the user has indicated that a particular device should be the new leader device. In some embodiments, another user device or hub device within the residence may perform one or more of the operations of the server device 1703, including selecting a new leader device based on the criteria described above.
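A leader-selection rule over the criteria listed above (an explicit user choice, whether the device is a resident device of the home, and processing power) could be sketched as follows; the precedence order and field names are assumptions for illustration:

```python
from dataclasses import dataclass


@dataclass
class HubCandidate:
    name: str
    processing_power: float
    is_resident: bool     # e.g., a device that stays in the home
    user_preferred: bool  # the user explicitly chose this device as leader


def select_leader(candidates):
    """Pick a new leader: user choice first, then residency, then raw
    processing power as the tiebreaker (assumed precedence)."""
    return max(
        candidates,
        key=lambda c: (c.user_preferred, c.is_resident, c.processing_power),
    )


candidates = [
    HubCandidate("tablet", 10.0, is_resident=False, user_preferred=False),
    HubCandidate("hub_speaker", 4.0, is_resident=True, user_preferred=False),
]
leader = select_leader(candidates)
```

Under this assumed precedence, the resident hub speaker is chosen over the more powerful tablet, since a tablet may leave the home and interrupt leadership.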
Once an appropriate device is selected as the new leader, the first user device 1701 may assign leader capabilities to the selected device at block 1710. As shown in fig. 17, the new leader device may be the second user device 1702. As part of the assignment operation, the first user device 1701 may send information to the second user device 1702 regarding the hub devices, the accessory devices, and the associations between them. This information may include current hub attributes and accessory characteristics. In some embodiments, the first user device 1701 only sends the second user device 1702 an instruction to accept the leader role for the hub devices and accessories in the residential environment. In yet other embodiments, the server device 1703 may communicate directly with the second user device 1702 to provide the leadership capability assignment. For example, in some cases, the first user device 1701 may lose network connectivity with the other devices in the residential environment. The server device may receive information that the first user device is disconnected from the network and is no longer able to act as the leader device, even though no leader assignment was received from the first user device 1701. The server device 1703 may then instruct the second user device 1702 to act as the hub leader.
At block 1712, the second user device 1702 receives the leadership capability assignment. At decision 1714, the second user device determines whether it has received the hub and accessory information from the first user device 1701. If not, the process proceeds to block 1716, and the second user device may query the hub devices for their current hub attributes and accessory assignment information. Once the second user device 1702 has the current hub information, it may, at endpoint 1718, store the information at the second user device or at another device.
Fig. 18 is a flow chart illustrating an exemplary process 1800 for a user device to assign an accessory device to a hub device selected from a plurality of hub devices. In some implementations, one or more of the operations of process 1800 may be similar to those described with reference to fig. 10 and 14.
At block 1802, a user device may receive first information about a first hub device and second information about a second hub device. The first information and the second information may correspond to hub attribute information as described herein and may be similar to the hub attributes 1210 and slots 1216 of fig. 12 and the device attributes 1336 of fig. 13. In some implementations, one or more of the operations of block 1802 may be similar to one or more of the operations described with reference to process indicator 1240 of fig. 12 or with respect to block 1416 of fig. 14.
At block 1804, the user device may receive a connection request from the accessory. The connection request may be a request to connect to a hub device to which the user device will assign the accessory. In some embodiments, one or more of the operations of block 1804 may be similar to one or more of the operations described with respect to block 1412 of fig. 14.
At block 1806, the accessory device may send accessory information to the user device for use in determining the accessory assignment. The accessory information may include information regarding the accessory's attributes or characteristics and its requirements for an assigned hub device. In some implementations, one or more of the operations of block 1806 may be similar to one or more of the operations described with respect to process indicator 1230 of fig. 12 or block 1414 of fig. 14.
At block 1808, the user device may determine a score for each of the first hub device and the second hub device by comparing the accessory attribute information received at block 1806 with the first information and the second information received from the first hub device and the second hub device at block 1802. In some embodiments, one or more of the operations of block 1808 may be similar to one or more of the operations described for block 1424 of fig. 14.
At block 1810, the user device may compare the scores determined at block 1808 to determine whether to assign the accessory to the first hub device or the second hub device. The determination may be based on which of the hub devices has the higher score. In some implementations, one or more of the operations of block 1810 may be similar to one or more of the operations described for decision 1428 of fig. 14.
At block 1812, the selected hub device may receive an instruction to connect to the accessory. In some embodiments, connecting to the accessory may include creating, at the hub device, an accessory interaction instance corresponding to the assigned accessory device. In some embodiments, one or more of the operations of block 1812 may be similar to one or more of the operations described with respect to block 1430 of fig. 14 or block 1536 of fig. 15.
Exemplary techniques for load balancing between multiple hub devices and accessories are described above. Some or all of these techniques may be implemented, at least in part, by an architecture such as those illustrated at least in fig. 10-18 above, but need not be implemented by such an architecture. While many embodiments are described above with reference to server devices, accessory devices, user devices, and hub devices, it should be understood that other types of computing devices may be suitable for performing the techniques disclosed herein. Further, various non-limiting examples are described in the foregoing description. For purposes of explanation, numerous specific configurations and details are set forth in order to provide a thorough understanding of the examples. It will be apparent, however, to one skilled in the art that some examples may be practiced without these specific details. Furthermore, well-known features are sometimes omitted or simplified in order not to obscure the examples described herein.
While specific exemplary embodiments have been described, various modifications, alterations, alternative constructions, and equivalents are also encompassed within the scope of the disclosure. Embodiments are not limited to operation within certain specific data processing environments, but may be freely operable within multiple data processing environments. Furthermore, while embodiments have been described using a particular series of transactions and steps, it should be apparent to those of skill in the art that the scope of the present disclosure is not limited to the series of transactions and steps. The various features and aspects of the above-described embodiments may be used alone or in combination.
Examples of the techniques described may be illustrated by the following clauses.
Clause 1. A method comprising:
receiving, by a user device, first information about a first hub device of a plurality of hub devices and second information about a second hub device of the plurality of hub devices;
receiving, by the user device, a connection request from an accessory;
receiving accessory attribute information from the accessory;
comparing, by the user device, the accessory attribute information of the accessory with the first information about the first hub device and the second information about the second hub device;
determining whether the accessory is connected to the first hub device or the second hub device based at least in part on the comparison; and
in accordance with a determination that the accessory is to be connected to the first hub device, providing, to the first hub device, instructions to connect to the accessory.
Clause 2 the method of clause 1, wherein the first information about the first hub device comprises at least one of a set of capabilities of the first hub device or a number of available connection slots of the first hub device, and wherein the second information about the second hub device comprises at least one of a set of capabilities of the second hub device or a number of available connection slots of the second hub device.
Clause 3. The method of clause 2, wherein comparing the accessory attribute information with the first information about the first hub device and the second information about the second hub device comprises:
generating a first metric for the accessory potentially connected to the first hub device based at least in part on the accessory attribute information and the set of capabilities of the first hub device;
multiplying the first metric by the number of available connection slots of the first hub device to generate a first connection score;
generating a second metric for the accessory potentially connected to the second hub device based at least in part on the accessory attribute information and the set of capabilities of the second hub device; and
multiplying the second metric by the number of available connection slots of the second hub device to generate a second connection score.
Clause 4 the method of clause 3, wherein determining whether the accessory is connected to the first hub device or the second hub device comprises selecting which of the first connection score or the second connection score is higher.
Clause 5. The method of clause 1, further comprising:
receiving, by the user device, hub connection information regarding the first hub device; and
determining, based at least in part on the hub connection information, that the first hub device is no longer connected to the accessory.
Clause 6. The method of clause 5, further comprising: in accordance with the determination that the first hub device is no longer connected to the accessory:
comparing, by the user device, the accessory attribute information with the second information about the second hub device and third information about a third hub device;
determining whether the accessory is connected to the second hub device or the third hub device based at least in part on the comparison; and
providing, based on the determination, instructions to the second hub device or the third hub device to connect to the accessory.
Clause 7. The method of clause 1, further comprising:
receiving, by the user device, information identifying the user device as a leader hub device configured to manage connections between a plurality of hub devices and a plurality of accessory devices, the first hub device and the second hub device being hub devices of the plurality of hub devices, and the accessory being an accessory device of the plurality of accessory devices.
Clause 8. The method of clause 7, wherein the user device is one of the plurality of hub devices.
Clause 9. The method of clause 8, further comprising:
determining, by the user device, that the user device is no longer suitable as the leader hub device; and
in accordance with the determination, assigning hub leader responsibilities to another hub device of the plurality of hub devices.
Clause 10. The method of clause 9, wherein assigning the hub leader responsibilities to the other hub device comprises:
transmitting, by the user device, a hub leader reassignment request to a server device configured to provide second information identifying a second user device as the leader hub device.
Clause 11. The method of clause 1, further comprising:
receiving, by the user device, hub processor usage information regarding the first hub device; and
determining, based at least in part on the hub processor usage information, that the first hub device is no longer able to respond to the accessory within a threshold amount of time.
Clause 12. The method of clause 11, further comprising: in accordance with the determination that the first hub device is no longer able to respond to the accessory within a threshold amount of time:
comparing, by the user device, the accessory attribute information with the second information about the second hub device and third information about a third hub device;
determining whether the accessory is connected to the second hub device or the third hub device based at least in part on the comparison; and
providing, based on the determination, instructions to the second hub device or the third hub device to connect to the accessory.
Clause 13. The method of clause 1, further comprising:
receiving, by the user device, functional information about the first hub device; and
determining, based at least in part on the functional information, that the first hub device is no longer the best hub for the accessory.
Clause 14. The method of clause 13, further comprising: in accordance with the determination that the first hub device is no longer the best hub for the accessory:
comparing, by the user device, the accessory attribute information with the second information about the second hub device and third information about a third hub device;
determining whether the accessory is connected to the second hub device or the third hub device based at least in part on the comparison; and
providing, based on the determination, instructions to the second hub device or the third hub device to connect to the accessory.
Clause 15. The method of clause 1, further comprising:
receiving, by the user device, a second connection request from a second accessory;
receiving second accessory attribute information from the second accessory;
comparing, by the user device, the second accessory attribute information with the first information about the first hub device and the second information about the second hub device;
determining whether the second accessory is connected to the first hub device or the second hub device based at least in part on the comparison; and
in accordance with a determination that the second accessory is connected to the first hub device, providing instructions to the first hub device to connect to the second accessory.
Clause 16. The method of clause 1, further comprising:
receiving, by the user device, a second connection request from a second accessory of a plurality of accessory devices including at least the first accessory and the second accessory;
receiving second accessory attribute information from the second accessory;
comparing, by the user device, the second accessory attribute information with the first information about the first hub device and the second information about the second hub device;
determining whether the second accessory is connected to the first hub device or the second hub device based at least in part on the comparison;
in accordance with a determination that the second accessory is connected to the second hub device, providing instructions to the second hub device to connect to the second accessory; and
in accordance with a determination that the second accessory is connected to the first hub device, providing instructions to the first hub device to connect to the second accessory.
Clause 17. The method of clause 16, further comprising:
managing, by the user device, a first state of the accessory and a second state of the second accessory, the first state identifying that the accessory is connected to the first hub device, and the second state identifying that the second accessory is connected to one of the first hub device or the second hub device.
Clause 18. The method of clause 17, further comprising:
managing, by the user device, a third state of a third accessory of the plurality of accessory devices, the third state identifying that the third accessory is not connected to any hub device of the plurality of hub devices.
Clause 19. A user device comprising:
a memory configured to store computer-executable instructions; and
a processor configured to connect to the memory and execute the computer-executable instructions to perform at least the method of any one of clauses 1-18.
Clause 20. A computer-readable storage medium configured to store computer-executable instructions that, when executed by a user device, cause the user device to perform the method of any one of clauses 1 to 18.
Techniques for establishing communication with third-party accessories
Embodiments of the present disclosure may provide techniques for establishing communication between an accessory device and a cellular-enabled device. As a first example, consider a residential environment corresponding to a residence. A person in a home may want to make a telephone call using voice commands. The person may issue a verbal request (e.g., "computer, call mom") to an accessory device (e.g., a third party accessory that is not manufactured or designed by the same entity that manufactured or designed the residential device (e.g., hub) or cellular-enabled device (e.g., smart phone)). The accessory device can determine that the request is for the device and then transmit the received audio information to a hub device (e.g., a hub speaker). The hub device may process the audio information to determine the nature of the request and prepare a corresponding response (e.g., connect to a cellular telephone to initiate a call). Alternatively, or in part in combination with the above, the hub device may transmit some or all of the verbal request to a server computer (e.g., implementing a service provider), where the service provider may determine the nature of the request and/or prepare a corresponding response. The hub device may then communicate with the cellular-enabled device to instruct the cellular-enabled device to initiate a call and establish a separate audio communication channel with the accessory. The hub device may then enter a listening state to listen for a request from the user to end the telephone call (e.g., "computer, hang-up"). When the hub device receives the hang-up request, the hub device may send an instruction to the accessory device to terminate the call. The accessory device may then transmit information to the cellular-enabled device to end the call.
As an illustration of the above examples, a residential environment may include numerous "smart" devices, e.g., electronic devices having features that allow them to operate interactively and autonomously to some extent. The smart devices may have various functions including cameras, speakers, thermostats, headphones and headsets, telephones, or media players. The smart devices may also have various network communication capabilities including WiFi, Ethernet, Bluetooth, Zigbee, cellular, and the like. These devices may be produced by different manufacturers. In some cases, the smart devices may be categorized as hub devices and accessory devices. The hub device may be a resident device of the residence (e.g., a smart speaker, a smart digital media player configured to control a television (TV), a mobile phone, etc.). While not always so, in some examples, resident devices are contemplated to reside within the home and to move infrequently (e.g., within the home or outside the home). The hub device may have capabilities that equal or exceed those of the accessory device. For example, the hub device may be a mobile phone that includes wireless (e.g., WiFi) and cellular communication capabilities, multimedia capabilities, and a device assistant. In this same example, the accessory device may be a smart speaker that includes audio media and wireless communication capabilities but lacks a device assistant. The device assistant may be a virtual assistant program configured to interact with the user. In these examples, the smart speaker may be a hub device or an accessory device, depending on its capabilities. In some examples, if the accessory is manufactured by an entity other than the entity that manufactured the hub device, the accessory may not be initially configured with the capability to communicate with the user device.
In some cases, a hub device manufacturer may provide an accessory development kit ("ADK") for installation on an accessory that enables such communication after the accessory is manufactured, sold, supplied, or used. The controller device may be a hub device as described herein and may include user interface features. In some embodiments, the controller device is a leader device selected from hub devices in a residential environment. As used herein, the terms hub device, user device, leader device, and controller device may indicate one or more similar devices that are distinct from accessory devices. The cellular-enabled device may be any device associated with a residential environment that is capable of connecting to a cellular network. In some embodiments, the cellular-enabled device may be a hub device or an accessory device.
In some embodiments, the hub device may obtain information about accessory devices present in the residential environment. This information may be obtained by a hub device that communicates directly with accessory devices sharing the same network within the residential environment. In other embodiments, information about the accessory device may be sent to the hub device by a second hub device, a user device configured as a leader device, or a remote server device (e.g., a service provider). For example, a user in a home may add a new accessory device to the home environment. As part of this process, the user may interact with a second hub device (e.g., a mobile phone) to configure the new accessory device and send new accessory device information to the first hub device. As another example, a leader device in a residential environment may have information about a plurality of accessory devices in the residential environment and report information about some or all of the accessory devices to a hub device. The hub device may then use this information to form an association with the corresponding accessory device. The accessory information may be stored by the hub device.
The hub device may be associated with a plurality of accessory devices by creating an accessory interaction instance for each accessory device. An interaction instance may be a software module or process configured to perform tasks at the hub device. In some implementations, the interaction instances can each implement and/or communicate with a device assistant. For example, the hub device may receive information regarding an accessory smart speaker and a smart thermostat located in a residential environment. The hub device may create two interaction instances corresponding to the device assistant, one for the smart speaker and one for the smart thermostat. In some embodiments, the interaction instance may be a copy of the device assistant, while in other embodiments, the instance may be a collection of modules including the device assistant and other processes for performing tasks on the hub device. The interaction instance may include different modules or processes depending on the associated accessory and its capabilities. It should be appreciated that any suitable combination of processes running on the hub device may be included in the interaction instance corresponding to the accessory device.
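As a rough illustration (not the patented implementation), the per-accessory interaction instances described above might be modeled as follows; the class names, fields, and state values are hypothetical:

```python
# Sketch of per-accessory interaction instances on a hub device.
# All names and fields here are illustrative assumptions.

class InteractionInstance:
    """One software module/process per associated accessory."""
    def __init__(self, accessory_id, capabilities):
        self.accessory_id = accessory_id
        self.capabilities = capabilities
        self.state = "idle"  # e.g., "idle" or "call_listening"

class HubDevice:
    def __init__(self):
        # Map of accessory_id -> its dedicated interaction instance.
        self.instances = {}

    def associate(self, accessory_id, capabilities):
        # Create one interaction instance per associated accessory,
        # as described in the paragraph above.
        self.instances[accessory_id] = InteractionInstance(
            accessory_id, capabilities)

hub = HubDevice()
hub.associate("smart-speaker", ["audio", "microphone"])
hub.associate("thermostat", ["temperature"])
print(sorted(hub.instances))  # ['smart-speaker', 'thermostat']
```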
Continuing with the first example above, the user may speak a request to the accessory. For example, the user may say "Computer, call mom." In this example, the request ("call mom") may correspond to one portion of the user's audio input to the smart speaker. The start phrase ("computer") may correspond to a different portion of the user's audio input and may be a trigger or wake word. In some implementations, the smart speaker can perform voice recognition processing on the wake word. Based on this processing, the smart speaker may determine whether the user's voice is intended as a request or command that the speaker should respond to. If so identified, the smart speaker may then transmit the user audio to a hub device running an accessory interaction instance corresponding to the smart speaker. In some embodiments, the accessory device can temporarily store a copy of the audio input for transmission to the hub device after processing the wake word portion. In other implementations, upon processing the wake word portion of the audio input, the accessory device can establish a streaming audio connection with the hub device to relay the portion of the user's audio input subsequent to the wake word. In these embodiments, the accessory device can also transmit a copy of the stored wake word portion of the audio input to the hub device for additional processing.
Upon receiving the audio input from the smart speaker, the hub device may perform additional processing on both the wake word portion of the audio input and the portion corresponding to the request or command. For example, the hub device may perform natural language processing ("NLP") on the wake word. Based on this wake word processing, the hub device may then process the portion of the audio corresponding to the request. If the hub device determines that the wake word portion is not actually a wake word, it may ignore the remainder of the audio or terminate the audio stream from the accessory. In some embodiments, the voice processing module on the hub device may be part of the accessory interaction instance. The interaction instance may also transmit all or a portion of the audio input to another device for analysis (e.g., to a service provider device). The service provider device may be a remote server computer or cloud device that can perform voice processing and parse the request to provide an appropriate response. In some cases, the hub device performs the wake word processing while the remainder of the audio is processed remotely. However, in other examples, the hub device may handle all of the processing. Parsing the request may include determining the content and context of the audio spoken by the user and providing a response upon which the hub device may act. In the present example, the response may be to communicate with the cellular-enabled device to initiate a call to the mother, which may be performed by the hub device using an appropriate process, by the remote server device, or by a combination of both devices. In some embodiments, the hub device may relay instructions to the accessory device for establishing the call, including the identity of the selected cellular-enabled device. The accessory device may then establish communication with the cellular-enabled device and send an instruction to initiate the call.
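A minimal sketch of this hub-side flow, assuming simple stand-ins for the wake word confirmation and request parsing described above (the function names and matching logic are illustrative, not the actual NLP pipeline):

```python
# Sketch of hub-side processing: confirm the wake word, then parse
# the request portion. Matching logic here is a trivial stand-in
# for the NLP described in the text.

def second_level_wake_word(audio_prefix):
    # Stand-in for NLP-based wake word confirmation on the hub,
    # assumed more accurate than the accessory's first-level check.
    return audio_prefix.strip().lower() == "computer"

def parse_request(request_audio):
    # Stand-in for local or remote (service provider) parsing.
    if "call" in request_audio.lower():
        recipient = request_audio.lower().split("call", 1)[1].strip()
        return {"action": "initiate_call", "recipient": recipient}
    return {"action": "unknown"}

def handle_accessory_audio(wake_portion, request_portion):
    if not second_level_wake_word(wake_portion):
        return None  # ignore the audio / terminate the stream
    return parse_request(request_portion)

print(handle_accessory_audio("computer", "call mom"))
# {'action': 'initiate_call', 'recipient': 'mom'}
```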
In another example, the hub device may relay the instructions for establishing the call to the cellular-enabled device, including an identification of the accessory device; the cellular-enabled device may then establish communication with the accessory device and initiate the call. Calling the mother may include, for example, identifying the user and then accessing the user's contact information to find the entry identifying "mom" in the context of the requesting user. The mother's telephone number may then be obtained and sent to the cellular-enabled device.
Once the response has been determined, the hub device may execute the response. This may include identifying a cellular-enabled device and sending an instruction to it to initiate a call. The instruction may include information identifying the call recipient, such as a telephone number to dial or a recipient identifier that resolves to a telephone number stored locally at the cellular-enabled device. For example, in one embodiment, as part of processing the audio request, the hub device may obtain the phone number of the mother and then instruct the cellular-enabled device to dial that number. In another embodiment, the hub device may instruct the cellular-enabled device to call "mom" and let the cellular-enabled device use its own information about the identity of the mother to obtain the number. This latter embodiment applies to cases in which the hub device can identify the cellular-enabled device corresponding to the user making the call request but does not have access to the user's contact information, for example, if the contacts are stored only at the cellular-enabled device. The preparation and execution of the response may occur in the interaction instance corresponding to the accessory. A response requiring a particular action (e.g., initiating a telephone call) may be delegated from the interaction instance to another process on the hub device or to another device with which the hub device may communicate, as appropriate.
The hub device may also communicate with the accessory device to identify the cellular-enabled device that is initiating the call and instruct the accessory to establish a communication channel with the cellular-enabled device. The communication channel may be a real-time audio connection using the Real-time Transport Protocol ("RTP") or another suitable method. The audio channel may be used to send and receive audio between the accessory device and the cellular-enabled device. In some embodiments, the accessory device can establish a second communication channel with the cellular-enabled device to send telephony control instructions to the cellular-enabled device. These telephony control instructions may be instructions to end the call based on the accessory receiving information from the hub device that the user has requested to end the call. In some embodiments, the second communication channel may also be used by the accessory device to send instructions to the cellular-enabled device to initiate a call in instances where the hub device does not transmit the call instructions directly to the cellular-enabled device.
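The dual-channel arrangement described above can be sketched as follows; the transports, port numbers, and message format are assumptions chosen for illustration only:

```python
# Sketch of the two channels between accessory and cellular-enabled
# device: a real-time audio channel plus a separate control channel
# for telephony instructions. Ports and formats are assumptions.

class AccessoryCallSession:
    def __init__(self, cellular_device_addr):
        self.cellular_device_addr = cellular_device_addr
        self.audio_channel = None    # e.g., an RTP media stream
        self.control_channel = None  # e.g., a reliable message channel

    def establish(self):
        # Open the real-time audio channel for call audio and a
        # second channel for control instructions such as "end call".
        self.audio_channel = ("rtp", self.cellular_device_addr, 5004)
        self.control_channel = ("tcp", self.cellular_device_addr, 5005)

    def end_call(self):
        # Telephony control instruction sent over the control channel
        # after the hub signals that the user requested to hang up.
        return {"channel": self.control_channel, "command": "end_call"}

session = AccessoryCallSession("10.0.0.42")
session.establish()
print(session.end_call()["command"])  # end_call
```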
Once the call has been established, the hub device may enter a call listening state at the accessory interaction instance corresponding to the accessory device. While in the call listening state, the hub device listens only for "hang up," "end call," or other similar commands from the accessory device indicating that the user wishes to terminate the telephone call. In this way, the device assistant and other processes associated with the accessory interaction instance do not inadvertently capture, record, or process audio information from the telephone call. For example, at the end of a telephone call, the user may say "Computer, hang up." As with the call request described above, the "computer" portion of the command may correspond to a wake word that indicates to the accessory device that the audio following the wake word may be a user command rather than part of the telephone conversation. The phrase "hang up" may be an end word. While in the call listening state, the accessory may receive audio corresponding to wake words and end words. The wake word may be processed like the other wake words described herein. If a wake word is detected, the accessory interaction instance can process the end word. Because the hub device is in the call listening state, end word processing may be more limited than the audio processing for other received user requests. For example, the accessory interaction instance can perform end word detection locally without transmitting audio to a remote service provider for NLP. In some implementations, while in the call listening state, the accessory interaction instance at the hub device may only process a limited portion of the audio input after the wake word, such that the accessory interaction instance receives only a short audio segment sufficient to contain an end word like "hang up" or "end call." In this way, the accessory interaction instance does not capture user audio that does not correspond to an end word.
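One way to picture the call listening state is as a small state machine inside the accessory interaction instance; this sketch assumes a fixed end-word list and greatly simplified audio handling:

```python
# Sketch of the call listening state: while a call is active, the
# interaction instance only checks a short utterance for an end word
# after a confirmed wake word. All details here are assumptions.

END_WORDS = {"hang up", "end call"}

class AccessoryInteractionInstance:
    def __init__(self):
        self.state = "idle"

    def on_call_established(self):
        self.state = "call_listening"

    def handle_audio(self, wake_word_confirmed, utterance):
        if not wake_word_confirmed:
            return None  # ignore; never capture call audio
        if self.state == "call_listening":
            # Limited local processing: only a short segment is
            # checked for an end word; no remote NLP during a call.
            if utterance.strip().lower() in END_WORDS:
                self.state = "idle"
                return "terminate_call"
            return None
        return "process_request"  # normal request handling

inst = AccessoryInteractionInstance()
inst.on_call_established()
print(inst.handle_audio(True, "Hang up"))  # terminate_call
```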
In some embodiments, the call listening state may be a state of the particular accessory interaction instance associated with the accessory device connected to the call. Other accessory interaction instances present at the hub device may function normally and may not be limited by the call listening state. For example, continuing with the user call described above, the user may place a telephone call with the mother using the smart speaker. A second user in the home may issue a request (e.g., "Computer, turn the heat up to 72°F") to a smart thermostat also associated with the hub device. The accessory interaction instance corresponding to the smart thermostat may not be in a call listening state and may process the second user request normally, instructing the thermostat to change its temperature set point in accordance with the request. This allows the hub device to retain its accessory management functionality while ensuring data privacy for the first user and the external call recipient.
Fig. 19 is a simplified block diagram of an exemplary embodiment. Process 1900 is an exemplary high-level process flow for a system including an accessory device 1912 that can receive a call request and establish communication with a cellular-enabled device 1930 via a controller device 1920. Diagram 1901 shows the states of the system corresponding to the blocks of process 1900. Process 1900 may be performed within a residential environment that includes a plurality of accessory devices, a hub device, and a cellular-enabled device. As depicted herein, accessory device 1912 may be a smart speaker and controller device 1920 may be a hub speaker. Although described as specific devices, accessory device 1912 and controller device 1920 can be any of several types of smart devices in various combinations and numbers. For example, a smart phone, media device (e.g., a smart TV), or tablet (connected to a cellular network, to a local area network via WiFi of a residential network, or to a wide area network ("WAN")) may perform one or more of the operations of process 1900.
Turning in more detail to process 1900, at block 1902, accessory device 1912 can receive a call request 1916 from a user 1914. In some implementations, the call request 1916 can include a portion of audio corresponding to the request (e.g., "initiate call") and a second portion corresponding to the wake word (e.g., "computer"). The wake word need not be a single word and may be any word or phrase that signals to the system that the user has spoken or will speak a request, command, or other audible interaction to the system. Upon receiving input containing a wake word, accessory 1912 can process the wake word portion of call request 1916 at a first level to determine the presence of the wake word. The first level of processing may be performed in a time- and resource-efficient manner that determines whether the wake word may be present. For example, accessory 1912 can perform voice pattern matching using a stored voice pattern corresponding to a user speaking the wake word. The stored patterns may be associated with users in the residential environment that contains the system, or may be generic patterns applicable to a large number of possible users. In this way, accessory device 1912 is not burdened with complex voice detection procedures and does not respond to every extraneous audio input from the user or other sources in its vicinity.
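As a hedged illustration of this first-level check, the sketch below uses a simple string-similarity stand-in for the stored voice pattern matching; a real implementation would operate on acoustic features rather than text, and the threshold is an invented value:

```python
# Sketch of cheap first-level wake word detection at the accessory.
# SequenceMatcher on text is a stand-in for acoustic pattern
# matching; the pattern list and threshold are assumptions.

from difflib import SequenceMatcher

STORED_WAKE_PATTERNS = ["computer"]  # generic or per-user patterns
FIRST_LEVEL_THRESHOLD = 0.8  # assumed; tuned for low cost, not accuracy

def first_level_detect(transcribed_prefix):
    # Returns True when the wake word *may* be present; the hub
    # performs the more accurate second-level confirmation.
    for pattern in STORED_WAKE_PATTERNS:
        ratio = SequenceMatcher(
            None, transcribed_prefix.lower(), pattern).ratio()
        if ratio >= FIRST_LEVEL_THRESHOLD:
            return True
    return False

print(first_level_detect("Computer"))  # True
print(first_level_detect("xyz"))       # False
```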
Moving down to block 1904, upon detecting the wake word, accessory device 1912 may transmit the received call request 1916 to controller device 1920, where it will be processed. As shown, smart speaker 1912 has a corresponding accessory interaction instance 1922 on controller device 1920, such that accessory interaction instance 1922 manages the processing of call request 1916 received from smart speaker 1912. The accessory interaction instance 1922 can include modules configured to process the call request 1916. For example, accessory interaction instance 1922 can include a voice detection module that can analyze the portion of call request 1916 that corresponds to the wake word. The analysis may be performed at a second level, at which the presence of the wake word may be confirmed with a higher degree of probability than the wake word detection at smart speaker 1912. Further, in some embodiments, the voice detection module may determine a language of the user and perform wake word detection based on the determined language. If the wake word is not detected by the voice detection module of the accessory interaction instance 1922, the controller device 1920 may ignore the audio input.
The controller device 1920 may also access user profiles 1926, 1928. The user profiles 1926, 1928 may be stored at the controller device 1920 or at another device, such as a server device. The user profiles 1926, 1928 may correspond to users within the residential environment and include information that may be used to identify one or more cellular-enabled devices associated with those users. For example, user profile 1926 may correspond to user 1914 and may identify that cellular-enabled device 1930, depicted as a smartphone, is associated with user 1914. When processing call request 1916, accessory interaction instance 1922 can identify user 1914 as having made call request 1916 and access user profile 1926 to determine the appropriate cellular-enabled device to perform the call. In addition, the user profile 1926 may also include information regarding the potential recipient of the call. For example, user profile 1926 may include a contact list for user 1914. The accessory interaction instance 1922 can use the contact information when resolving the call request, for example, to determine a telephone number to be dialed by the cellular-enabled device 1930 when executing the call request 1916. In some embodiments, the user profiles 1926, 1928 may be stored at a remote server or other device and accessed by a remote service provider for processing call requests.
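The profile lookup described above might look like the following sketch; the profile structure, identifiers, and phone number are invented for illustration:

```python
# Sketch of resolving a call request against user profiles: find the
# user's cellular-enabled device and the recipient's number. All
# identifiers and the profile layout are illustrative assumptions.

user_profiles = {
    "user_1914": {
        "cellular_device": "smartphone-1930",
        "contacts": {"mom": "+1-555-0100"},
    },
}

def resolve_call_request(user_id, recipient_label):
    profile = user_profiles[user_id]
    return {
        # Device associated with the requesting user's profile.
        "device": profile["cellular_device"],
        # Contact-list lookup; None if the label is unknown.
        "number": profile["contacts"].get(recipient_label),
    }

print(resolve_call_request("user_1914", "mom"))
# {'device': 'smartphone-1930', 'number': '+1-555-0100'}
```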
Moving to block 1906, the controller device 1920 may process the call request to identify the cellular-enabled device 1930 from which the call is to be initiated. As described above with reference to block 1904, the controller device 1920 may access the user profiles 1926, 1928 to determine the appropriate cellular-enabled device 1930.
At block 1908, the controller device 1920 may instruct the cellular-enabled device 1930 to initiate a call corresponding to the call request. In some embodiments, this may include determining the telephone number to be dialed by cellular-enabled device 1930 when making the call. In other embodiments, the controller device 1920 may instruct the cellular-enabled device 1930 to initiate the call based on a tag or other identifier (e.g., "mom," "office," etc.) contained within the call request 1916. In addition to sending instructions to cellular-enabled device 1930, controller device 1920 can instruct accessory device 1912 to establish a communication channel with cellular-enabled device 1930. The communication channel may be a real-time audio channel that transmits and receives audio during the call.
At block 1910, the accessory interaction instance 1922 at the controller device 1920 may enter a call listening state to listen for a hang-up command (e.g., "computer, end call"). The hang-up command may be composed of a portion corresponding to a wake word (e.g., "computer") and a portion corresponding to an end word (e.g., "end call" or "hang up"). As with the wake word, the end word need not be a single word and may be any word or phrase that is identified as indicating the end of a telephone call. When user 1914 issues a hang-up command, accessory 1912 may process the wake word at the first level as described above with respect to block 1902. If the wake word is detected, accessory 1912 may send the audio of the hang-up command to controller device 1920. The controller device 1920 may process the wake word at the second level as described with respect to block 1904. Upon confirming the presence of the wake word, the accessory interaction instance 1922 can process the end word portion of the hang-up command. The processing of the end word may be performed in a limited manner such that the controller device 1920 does not receive or process additional audio information potentially captured from the ongoing phone call at the accessory device 1912. If the end word is detected, controller device 1920 may transmit instructions to accessory device 1912 to terminate the call. Accessory device 1912 can then issue a hang-up command to cellular-enabled device 1930 to close the cellular connection at cellular-enabled device 1930. Alternatively, in some embodiments, the controller device may directly instruct the cellular-enabled device 1930 to end the call and transmit an indication to accessory 1912 that the call has been successfully terminated.
Fig. 20 is a schematic diagram of a residential environment including a hub device, an accessory device, and a cellular-enabled device, in accordance with some embodiments. The hub devices may include a hub speaker 2002 and a media player 2004. These hub devices may correspond to the controller device 1920 from the embodiment described above with respect to fig. 19. The smart phone 2006 may be a cellular-enabled device, such as cellular-enabled device 1930 of fig. 19. In several implementations, the smart phone 2006 can act as a hub device. Accessory devices can include smart speakers 2012, 2014, smart watch 2016, and thermostat 2024. Similarly, these accessory devices may correspond to accessory device 1912 described with respect to fig. 19. All or some of these accessory devices may be third-party devices (e.g., not manufactured, programmed, or supplied by the manufacturer, programmer, or supplier of the hub device or user device). Thus, they may not be automatically and/or initially compatible with the user device. Each hub device in residential environment 2000 may be associated with zero, one, or more accessory devices. As shown by the long dashed lines, hub speaker 2002 is associated with smart speakers 2012, 2014 and smart watch 2016, while media player 2004 is associated with thermostat 2024. The smart phone 2006 is not associated with an accessory device. Devices within the residential environment 2000 may be configured to communicate over one or more networks associated with the residential environment 2000 using one or more network protocols. For example, the residential environment 2000 may be associated with a local area network ("LAN"), WAN, cellular network, personal area network, or other network, and the devices may communicate using WiFi connections, Bluetooth connections, Thread connections, Zigbee connections, or other communication methods.
The arrangement of accessory device associations with hub devices may take various different combinations and may be modified by another device associated with the residential environment (e.g., one of the hub devices or the user device). For example, the user device may associate a new accessory device with one of the hub devices in the home. In some embodiments, the allocation of an accessory device to a hub device may be based on a scoring algorithm applied by the user device. The user device may also use the scoring algorithm to transfer existing accessory devices from one hub device to another. The transfer may occur automatically based on information received by the hub device regarding the residential environment 2000, including but not limited to information that another hub device may be more suitable for association with one or more accessories or that accessories have been added to or removed from the residential environment 2000. When an accessory is assigned to or associated with a hub device, the hub device may create an accessory interaction instance corresponding to each assigned accessory. Thus, the hub device may include a unique software ecosystem for each assigned accessory. The suitability of any particular hub device for association with an accessory may be based at least in part on the capabilities of the hub device, the capabilities of the accessory device, the current processing load experienced by the hub device, the location of the device within the residential environment, and the status of communication between devices on the network. Many other criteria for rearranging device associations in a residential environment are contemplated.
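One way such a scoring algorithm could be realized is sketched below. The patent only lists example factors (hub capability, accessory capability, load, location, link state); the specific weights, fields, and the subset-of-capabilities rule here are assumptions for illustration.

```python
# Hypothetical accessory-to-hub scoring sketch: higher scores are better;
# unreachable or under-capable hubs are disqualified with a score of 0.
from dataclasses import dataclass

@dataclass
class Hub:
    name: str
    capabilities: set   # e.g., {"audio", "telephony-relay"}
    load: float         # current processing load, 0.0 (idle) .. 1.0 (saturated)
    same_room: bool     # located near the accessory?
    reachable: bool     # network link to the accessory is up?

def score(hub: Hub, required: set) -> float:
    """Score a hub for an accessory that needs the given capabilities."""
    if not hub.reachable or not required <= hub.capabilities:
        return 0.0
    # Favor lightly loaded hubs; give a bonus for physical proximity.
    return (1.0 - hub.load) + (0.5 if hub.same_room else 0.0)

def assign(hubs: list, required: set) -> Hub:
    """Pick the best-scoring hub for the accessory."""
    return max(hubs, key=lambda h: score(h, required))
```

The same `score` function could drive automatic transfers: re-scoring all hubs whenever the environment changes and moving the accessory if a different hub now scores highest.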
In some embodiments, accessory devices and hub devices may also leave the residential environment or lose network connectivity with the residential environment. Accessory devices that leave the residential environment can be disassociated from their previously associated hub devices. Accessory devices associated with a hub device that loses network connectivity to the residential environment may be reassigned to another hub device that maintains network connectivity. In this case, the other hub device may receive information that the first hub device is no longer able to communicate with the accessory device and reassign the accessory device. Some embodiments may have a hub device designated as a leader device to manage allocation of accessory devices between hub devices within the residential environment. In other embodiments, if a hub device and an accessory device are associated and leave the residential environment together such that network connectivity is lost, the hub device may maintain its association with the accessory device and perform the methods described herein.
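The leader-driven reassignment described above can be sketched as a small bookkeeping routine. The data structures and the "move to the least-loaded remaining hub" rule are illustrative assumptions; the patent does not specify how the leader picks the new hub.

```python
# Hypothetical leader-device sketch: when a hub drops off the network,
# move each of its accessories to the least-loaded hub still reachable.
def reassign_on_loss(assignments: dict, lost_hub: str, loads: dict) -> dict:
    """assignments: accessory -> hub; loads: hub -> current load (0..1)."""
    remaining = {h: l for h, l in loads.items() if h != lost_hub}
    updated = dict(assignments)
    for accessory, hub in assignments.items():
        if hub == lost_hub:
            # Reassign to the least-loaded hub that kept connectivity.
            target = min(remaining, key=remaining.get)
            updated[accessory] = target
            remaining[target] += 0.1  # account for the added instance
    return updated
```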
As a hub device, the smart phone 2006 may communicate with other hub devices within the residential environment, including receiving accessory assignments from user devices or leader devices. Thus, other hub devices may communicate with the smart phone 2006 to instruct the smart phone to initiate a call over the cellular network in response to a user call request. In some embodiments, the smart phone 2006 may not be able to act as a hub device, but remain known to the hub device, user device, or leader device within the residence so that the call request may be transmitted to the smart phone 2006 as a cellular-enabled device. In other embodiments, the smart phone 2006 may be identifiable by a remote device (e.g., a server device in communication with one or more of the networks associated with the residential environment).
Continuing with fig. 20, the residential environment 2000 can have a plurality of users 2030, 2034 issuing audio requests 2032, 2036 to accessories. The requests 2032, 2036 may occur separately or simultaneously and may be received by multiple accessory devices, as indicated by the short dashed lines. For example, request 2032 may be received by smart speaker 2014 or smart watch 2016, while request 2036 may be received by smart speaker 2012 and thermostat 2024. As previously mentioned, the accessory devices and their associated arrangement may take various forms and may change over time. Thus, a user request may be received by a plurality of accessory devices associated with different hub devices. For example, the user request 2036 is received by both the thermostat 2024 associated with the media player 2004 and the smart speaker 2012 associated with the hub speaker 2002. As depicted, depending on the location of user 2030 and user 2034 in the home, these users may not have direct access to a cellular-enabled device to make phone calls. Further, more than one accessory device may be assigned to each hub device. The ability to pair an accessory device with a cellular-enabled device via a corresponding accessory interaction instance to make a call allows the hub device to continue managing the functionality of other accessory devices while maintaining data privacy with respect to the ongoing call.
As a specific example of the foregoing embodiment, consider a case where the user 2030 issues a call request 2032. The receiving accessories, smart speaker 2014 and smart watch 2016, may lack cellular communication capabilities. Likewise, hub speaker 2002 may not be a cellular-enabled device. In some embodiments, the accessory device can coordinate with other accessory devices within the residential environment 2000 to determine which accessory device should respond to user requests received by one or more accessory devices. For purposes of this example, consider the case where smart speaker 2014 is the accessory device selected to process call request 2032. Upon receiving the request 2032 and detecting the wake word, the smart speaker may transmit the request 2032 to the hub speaker 2002. Hub speaker 2002 may process call request 2032 and identify that smart phone 2006 is a suitable cellular-enabled device for performing calls. Hub speaker 2002 may then instruct smart phone 2006 to initiate the call and establish a communication channel with smart speaker 2014 for relaying call audio. Alternatively, in some implementations, hub speaker 2002 may instruct smart speaker 2014 to communicate with smart phone 2006 to establish a call. The accessory interaction instance associated with smart speaker 2014 at hub speaker 2002 may then enter a call listening state and listen for user 2030 to speak an end word at smart speaker 2014.
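The routing step in this example can be sketched end to end. This is a hedged illustration, not the patented method: the dictionary shapes, field names, and the idea that the hub resolves the cellular device from a per-user profile are assumptions drawn from the surrounding description.

```python
# Hypothetical request-routing sketch: a non-cellular accessory forwards a
# call request to its hub; the hub resolves the requesting user's
# cellular-enabled device and returns the instruction it would issue.
def route_call_request(accessory: str, user: str,
                       hub_assignments: dict, user_profiles: dict) -> dict:
    hub = hub_assignments[accessory]                   # e.g., hub speaker 2002
    cellular = user_profiles[user]["cellular_device"]  # e.g., smart phone 2006
    return {
        "hub": hub,
        "instruct": cellular,
        "action": "initiate_call",
        "relay_audio_to": accessory,  # audio channel back to the accessory
    }
```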
Continuing with the example above, consider that while user 2030 is on a telephone call at smart speaker 2014, user 2034 issues a user request 2036 to query for the current time (e.g., "computer, what time is it?"). The request 2036 may be received by smart speaker 2012, which is associated with the hub speaker 2002. While the accessory interaction instance associated with smart speaker 2014 is in a call listening state at hub speaker 2002, the accessory interaction instance associated with smart speaker 2012 is not in a restricted state and can process request 2036 normally and relay the response to smart speaker 2012 (e.g., "it is now 10:30 PM").
Fig. 21 is another simplified block diagram illustrating a process 2100 for establishing communication between an accessory device and a cellular-enabled device in accordance with some embodiments. In some embodiments, the accessory device 2102 can correspond to an accessory device described herein (e.g., accessory device 1912 of fig. 19). Similarly, the controller device 2104 may correspond to other controller devices or hub devices, and the cellular-enabled device 2106 may correspond to other cellular-enabled devices as described herein. The controller device 2104 (depicted here as a hub speaker) may be a hub device, a leader device, a configuration device, or another device for determining the appropriate cellular-enabled device 2106 to initiate a call and directing the accessory device 2102 to connect to the cellular-enabled device 2106. The controller device 2104 may be configured to communicate with other hub devices, the accessory device 2102, other accessories, and the cellular-enabled device 2106 via one or more networks (including LANs or WANs) as described herein.
Turning to the various elements of the connection process 2100 in more detail, the accessory device 2102 can include an ADK 2108. The ADK 2108 may be a software development kit ("SDK") stored on the accessory device 2102 and configured to be executed or processed on the accessory device. As used herein, an SDK may include an application programming interface and associated software libraries sufficient to enable operation of other software within or associated with the SDK. In some implementations, the ADK 2108 may be provided by an entity associated with the controller device 2104 (e.g., its manufacturer). The ADK 2108 may include a telephony audio module 2110 and a telephony control module 2112. The telephony audio module 2110 can establish a real-time audio connection with the cellular-enabled device 2106 to send and receive audio during a telephone call. The telephony control module 2112 may send and receive instructions and indications corresponding to device control of the phone connection to and from the cellular-enabled device 2106. In some implementations, the telephony control module 2112 may send a signal to the cellular-enabled device 2106 to terminate the cellular connection upon receiving a hang-up instruction from the controller device 2104 to end the call.
The controller device 2104 may include an accessory management module 2114, which may be a software process running on the controller device 2104. In some implementations, the accessory management module 2114 can receive, process, store, update, and transmit accessory management settings. For a particular controller device, its accessory management settings may include a list of all accessories assigned to the controller device and other information regarding the capabilities of those assigned accessories. The accessory management module may include user profiles 2116. These user profiles 2116 may correspond to one or more users within the residential environment and may contain information associating each user with one or more cellular-enabled devices, including cellular-enabled device 2106. The user profiles 2116 may also include information identifying one or more contacts or other information that may be used by the controller device 2104 to respond to a call request and direct the establishment of a call. The accessory management module 2114 can also include accessory interaction instances 2118. An accessory interaction instance 2118 can be created by the controller device 2104 for each accessory assigned to the controller device 2104. Each of the accessory interaction instances 2118 can represent one or more different software ecosystems on the controller device. For example, an accessory interaction instance corresponding to accessory device 2102 can represent a first software ecosystem of the controller device, while another accessory interaction instance corresponding to another accessory device can represent a second software ecosystem of the controller device.
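The per-accessory bookkeeping described above, one interaction instance created per assigned accessory, can be sketched as below. Class and field names are illustrative assumptions, not the patent's terminology for internal data structures.

```python
# Hypothetical accessory-management sketch: the controller keeps one
# interaction instance per assigned accessory, each tracking its own
# software ecosystem and state (idle vs. call-listening).
class AccessoryManagement:
    def __init__(self):
        self.instances = {}  # accessory id -> interaction-instance record

    def assign(self, accessory_id: str, ecosystem: str):
        # An instance is created when the accessory is assigned.
        self.instances[accessory_id] = {
            "ecosystem": ecosystem,
            "state": "idle",  # becomes "call_listening" during a call
        }

    def unassign(self, accessory_id: str):
        # Disassociation removes the instance (e.g., accessory left the home).
        self.instances.pop(accessory_id, None)
```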
The cellular-enabled device 2106 may include a media module 2120 and a telephony control module 2122. In some embodiments, media module 2120 may send and receive media data, including phone audio, over one or more cellular networks to which cellular-enabled device 2106 may be connected. The media module 2120 may also connect to the accessory device 2102 via a real-time audio channel or other channel through which the cellular-enabled device 2106 may send and receive audio data corresponding to telephone audio. The telephony control module 2122 can send and receive instructions and indications corresponding to device control of a phone connection to and from the accessory device 2102. In some embodiments, upon receiving an instruction from the controller device 2104, the telephony control module 2122 may initiate a phone call by dialing a selected phone number or accessing contact information locally stored at the cellular-enabled device and dialing a phone number associated with the contact information.
To complete the detailed elements of fig. 21, progress indicators 2130, 2140, and 2150 represent data transfers between the accessory device 2102 and the controller device 2104, between the controller device 2104 and the cellular-enabled device 2106, and between the accessory device 2102 and the cellular-enabled device 2106, respectively. The progress indicators 2130, 2140, 2150 may indicate communication between various devices as described herein over one or more networks, including but not limited to a WiFi LAN or an internet WAN. In some embodiments, the progress indicator 2130 indicates the transmission of data corresponding to a user request to initiate a call at the accessory device 2102 and a subsequent response from the controller device 2104 providing an indication that the request was successfully processed. Similarly, the progress indicator 2140 indicates the transmission of data including instructions to the cellular-enabled device 2106 to initiate a call in response to a user request. In some embodiments, the cellular-enabled device 2106 does not transmit any corresponding data or information back to the controller device 2104. The progress indicator 2150 indicates transmission of data including real-time audio of a telephone call and telephone control instructions or indications, including instructions to establish or terminate a call, according to some embodiments.
Fig. 22 is a simplified block diagram 2200 illustrating at least some techniques for communicating between an accessory device 2201 and a cellular-enabled device 2203 to initiate a telephone call at the cellular-enabled device 2203. Diagram 2200 includes process flow arrows that provide a general indication of the transfer of data or information, and some detailed architecture of a representative device. Process flow arrows are not intended to imply any particular architectural connection between the elements detailed herein. Each of the elements depicted in fig. 22 may be similar to one or more elements depicted in other figures described herein. For example, accessory device 2201 can correspond to one or more of the accessory devices described herein. In some embodiments, at least some of the elements in diagram 2200 may operate within the context of a residential environment like residential environment 2000 of fig. 20.
Turning to each element in greater detail, accessory device 2201 can have audio input and output functionality, including accessory audio input/output ("I/O") 2204. The accessory audio I/O 2204 can include both hardware (e.g., speaker and microphone) and software/firmware necessary to provide audio input and output functionality. The accessory device 2201 also includes an ADK 2206. The ADK 2206 may be similar to ADK 2108 described above with respect to fig. 21. The ADK may include wake word detection module 2208, audio module 2210, and telephony control module 2212. The wake word detection module 2208 may perform a first process on a portion of the audio input corresponding to a trigger or wake word. The wake word detection module itself may contain information about the wake word and trigger, including, for example, trigger criteria and audio patterns corresponding to the particular wake word. The audio module 2210 and the telephony control module 2212 may be similar to the telephony audio module 2110 and the telephony control module 2112, respectively, as described with reference to fig. 21. As indicated by the process flow arrow, audio input received at the accessory device 2201 can be processed by the wake word detection module 2208. If a wake word is detected, the accessory device can transmit the audio input to the controller device 2202 for further processing. The accessory device 2201 may continue to listen for wake words during the phone call. Because the user will speak during the call and the spoken audio is transmitted from the audio module 2210 to the cellular-enabled device 2203, the wake word detection module 2208 may allow the accessory device to distinguish user phone audio from audio intended to convey a request or command to the accessory device.
The controller device may include a voice processing module 2214 that includes a wake word detection module 2216. As depicted in fig. 22, the wake word detection module 2216 can have instances corresponding to each of the accessory interaction instances 2218 such that each instance of the wake word detection module 2216 is part of a separate software ecosystem at the controller device. The wake word detection module 2216 can process the wake word audio at a second level, where the presence of the wake word can be confirmed with a higher degree of probability than by the wake word detection module 2208 at the accessory device 2201. If the voice processing module 2214 does not detect a wake word, the controller device 2202 may ignore the audio input. If a wake word is detected, the audio input may be further processed at the accessory interaction instance 2218.
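The two-level detection pipeline above can be sketched as a pair of threshold checks: a cheap, permissive first pass on the accessory and a stricter confirmation on the controller. The scores stand in for the output of real acoustic models, and the threshold values are assumptions for illustration.

```python
# Hypothetical two-level wake word pipeline.
ACCESSORY_THRESHOLD = 0.5   # first level: permissive, may false-trigger
CONTROLLER_THRESHOLD = 0.9  # second level: confirms with higher probability

def first_level(score: float) -> bool:
    """Accessory-side check; on success the audio is sent to the controller."""
    return score >= ACCESSORY_THRESHOLD

def second_level(score: float) -> bool:
    """Controller-side confirmation; on failure the audio is ignored."""
    return score >= CONTROLLER_THRESHOLD

def wake_word_detected(accessory_score: float, controller_score: float) -> bool:
    # Audio only reaches the second level if the first level triggered.
    return first_level(accessory_score) and second_level(controller_score)
```

This split keeps the expensive, accurate model off the (often resource-constrained) accessory while still filtering most audio locally.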
The accessory interaction instance 2218 can include a virtual device assistant 2220. During normal operation, the controller device 2202 may process audio from a user request or other audio input passed through the wake word detection module 2216 by connecting the accessory interaction instance 2218 to a remote service and transmitting a portion of the audio input to the remote service. The remote services may include NLP and other services for processing audio input. However, during a call, the accessory interaction instance corresponding to accessory device 2201 may be in a call listening state. In this state, the device assistant 2220 may not process any user audio other than the end word at the end word detection module 2222. In some implementations, the end word detection module can operate entirely within the corresponding accessory interaction instance such that the device assistant 2220 does not transmit any phone call audio to the remote service or other device, even if the wake word detection module 2216 indicates that the wake word was heard. In other embodiments, the device assistant 2220 may process only a limited portion of the audio input received after the wake word is detected. For example, the end word detection module may process a short audio portion sufficient to cover the end words "hang up" and "end call". If an end word is detected, the accessory interaction instance can transmit a hang-up command 2224 to the accessory device 2201. The accessory device 2201 may then signal the cellular-enabled device 2203 to end the call and terminate the audio between the devices.
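The call listening state described above can be sketched as a small state machine: while a call is in progress, the instance matches confirmed wake-word audio only against a tiny local end-word vocabulary and never forwards it to remote services. State names, return values, and the vocabulary are illustrative assumptions.

```python
# Hypothetical interaction-instance state machine for the call listening state.
END_WORDS = ("hang up", "end call")

class InteractionInstance:
    def __init__(self):
        self.state = "normal"

    def start_call(self):
        self.state = "call_listening"

    def handle_audio(self, text: str):
        """Process audio that already passed second-level wake word detection."""
        if self.state == "call_listening":
            # Restricted path: local end-word match only, no remote NLP.
            if text.lower().strip() in END_WORDS:
                self.state = "normal"
                return "hang_up_command"
            return None  # call audio is dropped, never processed or forwarded
        return "forward_to_remote_service"
```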
The cellular-enabled device 2203 may include a media module 2230 that may be configured to send, receive, and process audio and video data. The media module may include a call module 2232. The call module 2232 may be configured to send, receive, and process audio data of telephone calls made via the cellular network 2250 to which the cellular-enabled device is connected. The call module 2232 may transmit and receive audio data to and from the accessory device 2201 over an audio channel 2226, which may be a real-time audio channel using RTP or a similar communication protocol. The cellular-enabled device 2203 may also include a call service module 2234 that may be configured to perform processes including negotiating a telephone call (e.g., placing a telephone call) and receiving call instructions (e.g., terminating the call) from the accessory device. The call service module 2234 may include a telephony control module 2236 and an accessory discovery module 2238. The telephony control module 2236 may be similar to the telephony control module 2122 of fig. 21. The accessory discovery module 2238 can allow the cellular-enabled device 2203 to discover and negotiate a communication channel with accessory devices within the residential environment. For example, in some embodiments, the cellular-enabled device 2203 may receive instructions from the controller device 2202 to initiate a call and connect to the accessory device 2201. Accessory discovery module 2238 can locate accessory device 2201 within one of the networks of the residential environment and establish a communication connection with accessory device 2201, including call control channel 2228, in conjunction with telephony control module 2236.
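The discovery-and-connect step can be sketched as a registry lookup followed by opening the two channels named above (real-time audio and call control). The registry dictionary here stands in for actual service discovery on the LAN; channel representations and names are assumptions for illustration.

```python
# Hypothetical accessory discovery sketch: given the accessory id named in
# the controller's instruction, look it up on the residential network and
# record the audio and call-control channels to be established.
def connect_to_accessory(registry: dict, accessory_id: str) -> dict:
    """Locate the accessory on the residential network and open channels."""
    address = registry.get(accessory_id)
    if address is None:
        raise LookupError(f"accessory {accessory_id!r} not found on network")
    return {
        "audio_channel": (address, "rtp"),        # real-time call audio
        "control_channel": (address, "control"),  # e.g., hang-up signaling
    }
```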
Fig. 23 is a simplified block diagram 2300 showing an exemplary architecture of a system for establishing communication between an accessory device and a cellular-enabled device, according to an embodiment. The diagram 2300 includes a controller device 2302, one or more accessory devices 2304, a representative accessory device 2306, one or more networks 2308, a cellular-enabled device 2310, and a cellular network 2312. Each of these elements depicted in fig. 23 may be similar to one or more elements depicted in the other figures described herein. In some embodiments, at least some elements of diagram 2300 may operate within the context of a residential environment (e.g., residential environment 2000 of fig. 20).
The accessory device 2304 and representative accessory device 2306 may be any suitable computing device (e.g., smart speaker, smart watch, smart thermostat, camera, etc.). In some embodiments, the accessory devices 2304, 2306 may perform any one or more of the operations of the accessory devices described herein. Depending on the type of accessory device and/or the location of the accessory device (e.g., within or outside of a residential environment), accessory device 2306 can be enabled to communicate over network 2308 (e.g., including a LAN or WAN) using one or more network protocols (e.g., Bluetooth connection, Thread connection, ZigBee connection, WiFi connection, etc.) and network paths, as further described herein.
In some implementations, the controller device 2302 may correspond to any one or more of the controller devices or hub devices described herein. For example, the controller device 2302 may correspond to one or more of the hub devices of the residential environment 2000 of fig. 20. The controller device may be any suitable computing device (e.g., a mobile phone, a tablet, a smart hub speaker device, a smart media player communicatively connected to a TV, etc.).
In some embodiments, the one or more networks 2308 may include an internet WAN and LAN. As described herein, a residential environment may be associated with a LAN, wherein devices within the residential environment may communicate with each other through the LAN. As described herein, the WAN may be external to the residential environment. For example, a router associated with a LAN (and thus with a residential environment) may enable traffic from the LAN to be transmitted to a WAN, and vice versa.
As described herein, the controller device 2302 may represent one or more controller devices or hub devices connected to one or more of the networks 2308. The controller device 2302 has at least one memory 2314, a communication interface 2316, one or more processing units (or processors) 2318, a storage unit 2320, and one or more input/output (I/O) devices 2322.
Turning in further detail to each element of the controller device 2302, the processor 2318 may be implemented in hardware, computer-executable instructions, firmware, or a combination thereof, as appropriate. Computer-executable instructions or firmware implementations of the processor 2318 include computer-executable instructions or machine-executable instructions written in any suitable programming language to perform the various functions described.
The memory 2314 may store program instructions that may be loaded and executed on the processor 2318, as well as data generated during execution of such programs. Depending on the configuration and type of controller device 2302, memory 2314 may be volatile (such as random access memory ("RAM")) or non-volatile (such as read-only memory ("ROM"), flash memory, etc.). In some implementations, the memory 2314 may include a variety of different types of memory, such as static random access memory ("SRAM"), dynamic random access memory ("DRAM"), or ROM. The controller device 2302 may also include additional storage 2320, such as removable storage or non-removable storage, including, but not limited to, magnetic storage, optical disks, and/or tape storage. The disk drives and their associated computer-readable media may provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computing device. In some embodiments, the storage 2320 may be used to store data content received from one or more other devices (e.g., other controller devices, cellular-enabled devices 2310, accessory device 2304, or representative accessory device 2306).
The controller device 2302 may also contain a communication interface 2316 that allows the controller device 2302 to communicate with a storage database, another computing device or server, a user terminal, or other device via the network 2308. The controller device 2302 may also include I/O devices 2322, such as to enable connection with a keyboard, mouse, pen, voice input device, touch input device, display, speakers, printer, and so forth.
Memory 2314 may include an operating system 2324 and one or more applications or services for implementing the features disclosed herein, including a communication module 2326, a voice processing module 2328, and an accessory interaction instance 2330. The speech processing module 2328 also includes a wake word module 2332, and the accessory interaction instance 2330 also includes a digital assistant 2334 and an end word module 2336.
Communication module 2326 may include code that causes processor 2318 to generate instructions and messages, transmit messages, or otherwise communicate with other entities. For example, communication module 2326 can communicate data associated with establishing a telephone call to and receive data associated with establishing a telephone call from accessory device 2306 and cellular-enabled device 2310 in conjunction with digital assistant 2334. As described herein, the communication module 2326 may transmit messages via one or more network paths of the network 2308 (e.g., via a LAN or internet WAN associated with a residential environment).
According to some implementations, the voice processing module 2328 may include code that causes the processor 2318 to receive and process audio input corresponding to a dictation request to initiate a call or to end a call. Processing the spoken audio may include, for example, NLP or audio pattern matching. In some implementations, one or more of the operations of the voice processing module 2328 may be similar to those described with reference to the voice processing module 2214 of fig. 22. Wake word module 2332 may include code that causes processor 2318 to receive and process a portion of an audio input corresponding to a trigger or wake word. In some implementations, one or more of the operations of wake word module 2332 may be similar to those described with reference to wake word detection module 2216 of fig. 22. For example, wake word module 2332 may analyze a portion of the audio input to determine the presence of wake words.
The accessory interaction instance 2330 can include code that causes the processor 2318 to receive and process a portion of the audio input corresponding to a user request. In some implementations, one or more of the operations of accessory interaction instance 2330 can be similar to those described with reference to accessory interaction instance 2218 of fig. 22. For example, accessory interaction instance 2330 can include a number of processes or services that can cause processor 2318 to send and receive data to and from a remote service, identify an appropriate cellular-enabled device for initiating a call upon user request, or enter a call listening state that restricts the functionality of digital assistant 2334 while the call is in progress. Accessory interaction instance 2330 can include digital assistant 2334, which can perform one or more of these exemplary operations, as well as additional operations related to interactions between accessory devices 2304, 2306 and cellular-enabled device 2310 as described herein. The accessory interaction instance 2330 can also include an end word module 2336. While in the call listening state, the accessory interaction instance corresponding to the accessory device in the call may limit the voice processing functionality of its digital assistant 2334 to detecting only call end words (e.g., "hang up"). In those cases, after processing at wake word module 2332, end word module 2336 may receive the audio input. The end word module 2336 may process the audio input to detect end words. In some embodiments, the end word module 2336 performs speech analysis similar to the wake word module 2332. When an end word is detected, the digital assistant 2334 in conjunction with the communication module 2326 may send an instruction to the accessory device to end the call.
Turning now to the details of the representative accessory device 2306, in some embodiments the accessory device 2306 can have at least one memory 2340, a communication interface 2342, a processor 2344, a storage unit 2346, and an I/O device 2348. As described herein with respect to the controller device 2302, these elements of the accessory device may have the same appropriate hardware implementations as their counterparts on the controller device 2302.
The memory 2340 of the accessory device 2306 may include an operating system 2350 and one or more applications or services for implementing the features disclosed herein, including a communication module 2352, an audio module 2354, and an ADK 2356. As described herein with respect to controller device 2302, communication module 2352 may have similar appropriate functionality as its corresponding communication module 2326.
The audio module 2354 may include code that causes the processor 2344 to receive, process, and transmit audio signals in conjunction with the I/O device 2348. In some implementations, one or more of the operations of the audio module can be similar to those described with reference to the accessory audio module 2210 of fig. 22. For example, the audio module 2354 may receive user utterances or other audio inputs at a microphone using the I/O device 2348 and transmit the audio data to the controller device 2302 via a streaming audio channel or other suitable connection. The audio input may correspond to a wake word followed by an end word. The audio module 2354 may also receive user audio at the microphone and relay the audio to the cellular-enabled device as part of a telephone call. Similarly, the audio module may receive audio from the cellular-enabled device and play the audio at a speaker.
ADK 2356 may include code that causes processor 2344 to receive and process a portion of the audio input corresponding to the trigger or wake word. In some embodiments, one or more of the operations of ADK 2356 may be similar to those described with reference to ADK 2206 of fig. 22. ADK 2356 may include wake word module 2358. The wake word module 2358 may include code to cause the processor 2344 to receive and process wake words. In some embodiments, one or more of the operations of the wake word module 2358 may be similar to those described with reference to the wake word detection module 2208 of fig. 22. For example, the wake word module 2358 may analyze a portion of the audio input to determine the presence of wake words.
In some implementations, ADK 2356 may also include a telephony control module 2360. The telephony control module 2360 can include code that causes the processor 2344 to send and receive commands and instructions to and from the cellular-enabled device 2310. For example, upon receiving an audio input containing an end word, the controller device 2302 may transmit an instruction to the accessory device 2306 to end the call. The accessory device 2306, via its telephony control module 2360, can then signal the cellular-enabled device 2310 to end the call and close the audio connection between the two devices.
Turning now to the details of the cellular-enabled device 2310, similar to the other device architectures illustrated in fig. 23, in some embodiments the cellular-enabled device 2310 may have at least one memory 2362, a communication interface 2364, a processor 2366, a storage unit 2368, and I/O devices. As described herein with respect to the controller device 2302, these elements of the cellular-enabled device 2310 may have the same appropriate hardware implementations as their counterparts on the controller device 2302.
Memory 2362 of the cellular-enabled device 2310 may include an operating system 2372 and one or more applications or services for implementing the features disclosed herein, including a media module 2374, a telephony control module 2376, and an accessory discovery module 2376. As described herein with respect to the accessory device 2306, the telephony control module 2376 may have functionality similar to that of its counterpart, the telephony control module 2360.
Media module 2374 may include code that causes processor 2366 to send, receive, and process data contained in telephone calls. The data may be received from and transmitted to a server device or other device connected to the cellular network 2312. The media module 2374 can also transmit, receive, and process data corresponding to real-time audio of telephone calls sent to the accessory device 2306 over one of the networks 2308. In some implementations, one or more of the operations of media module 2374 may be similar to those described with reference to media module 2230 of fig. 22.
The accessory discovery module 2376 can include code that causes the processor 2366 to receive information about a selected accessory device 2306 on one of the networks 2308 within the residential environment, in order to establish a communication channel with the accessory device 2306 for directly relaying calls placed at the cellular-enabled device 2310. In some embodiments, one or more of the operations of the accessory discovery module 2376 may be similar to those described with reference to the accessory discovery module 2238 of fig. 22.
Fig. 24 is a flow diagram illustrating a particular example process 2400 for requesting a telephone call at an accessory 2401 and initiating the telephone call at a cellular-enabled device 2403, according to an embodiment.
Each of the elements and operations depicted in fig. 24 may be similar to one or more elements depicted in other figures described herein. For example, user device 2402 may be similar to other user devices, controller devices or hub devices, and so forth. In some embodiments, process 2400 may be performed within a residential environment (e.g., residential environment 2000 of fig. 20).
Process 2400 and processes 2500 and 2600 of figs. 25 and 26 (described below) are illustrated as logic flow diagrams, the operations of which represent sequences of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular abstract data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations may be combined in any order and/or in parallel to implement the processes.
Additionally, some, any, or all of these processes may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) that is executed jointly on one or more processors, by hardware, or a combination thereof. As described above, the code may be stored on a computer readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. The computer readable storage medium is non-transitory.
At block 2404, accessory 2401 may receive an audio input including a wake word and a call request. For example, the audio input may be a user utterance "computer, call mom," where "computer" includes a wake-up word and "call mom" includes a request.
At block 2406, the accessory 2401 may process the wake word in a first pass to determine the presence of the wake word. The first pass may be performed in a time- and resource-efficient manner to determine whether a wake word may be present. At decision 2408, based on the first pass, accessory 2401 determines whether a wake word is present. If not, the process may terminate at endpoint 2410 by ignoring the user utterance. If, according to the first pass, a wake word is present, the process continues to block 2412.
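The two-pass flow of blocks 2406 through 2416 can be sketched as follows. The detector functions and thresholds below are invented stand-ins for the accessory's lightweight on-device model and the hub's heavier, more accurate one; they are not from the specification:

```python
# Hypothetical sketch of the two-pass wake-word flow: the accessory runs a
# cheap, recall-oriented first pass and streams audio to the hub only when
# it fires; the hub then confirms with a stricter second pass. Thresholds
# and scoring functions are illustrative only.

FIRST_PASS_THRESHOLD = 0.5   # permissive: cheap detector tuned for recall
SECOND_PASS_THRESHOLD = 0.9  # strict: confirms with higher confidence

def first_pass_score(audio: bytes) -> float:
    """Stand-in for the accessory's lightweight detector."""
    return 0.7 if b"computer" in audio else 0.1

def second_pass_score(audio: bytes) -> float:
    """Stand-in for the hub's heavier speech model."""
    return 0.95 if audio.startswith(b"computer") else 0.2

def hub_handle(audio: bytes) -> str:
    # Blocks 2414/2416: confirm the wake word with higher confidence,
    # otherwise terminate the streaming audio connection (endpoint 2418).
    if second_pass_score(audio) < SECOND_PASS_THRESHOLD:
        return "stream terminated"
    return "process request"

def accessory_handle(audio: bytes) -> str:
    # Blocks 2406/2408/2410: ignore the utterance unless the first pass fires.
    if first_pass_score(audio) < FIRST_PASS_THRESHOLD:
        return "ignored"
    # Block 2412: stream the audio to the hub for second-pass confirmation.
    return hub_handle(audio)
```

The asymmetric thresholds capture the design point: the constrained accessory errs toward streaming, while the hub makes the final, higher-confidence decision.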
At block 2412, the accessory 2401 may transmit the audio input to the user device 2402 via a streaming audio connection. The connection may occur through one of the networks to which accessory 2401 and user device 2402 are connected (e.g., through a WiFi LAN). Streaming audio may use any number of methods or protocols including, but not limited to, AirPlay, the real-time transport protocol ("RTP"), the real-time streaming protocol ("RTSP"), and the like.
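For illustration, a minimal RTP-style packetization of one audio frame (following the fixed 12-byte header layout of RFC 3550) might look like the following. This is a toy sketch of one of the protocols the text mentions, not the actual transport used by the devices, and a real implementation would use a tested RTP library:

```python
import struct

# Toy RTP-style packetization for the accessory-to-hub audio stream.
# Only the fixed 12-byte header is built (V=2, no padding, no extension,
# no CSRCs, marker=0); field values below are illustrative.

RTP_VERSION = 2

def build_rtp_packet(seq: int, timestamp: int, ssrc: int, payload: bytes,
                     payload_type: int = 0) -> bytes:
    """Pack a minimal 12-byte RTP header followed by the audio payload."""
    byte0 = RTP_VERSION << 6      # V=2, P=0, X=0, CC=0
    byte1 = payload_type & 0x7F   # M=0, 7-bit payload type
    header = struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)
    return header + payload

# e.g., one 20 ms frame of 8 kHz mu-law audio (160 bytes)
packet = build_rtp_packet(seq=1, timestamp=160, ssrc=0xDEADBEEF,
                          payload=b"\x00" * 160)
```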
At block 2414, the user device 2402 receives the wake word and may process it in a second pass. This second pass may confirm the presence of the wake word with a higher degree of probability than the first-pass processing performed at the accessory 2401 at block 2406. At decision 2416, if the user device 2402 does not confirm the presence of the wake word, the process moves to endpoint 2418 and the audio input is ignored by terminating the streaming audio connection with the accessory 2401. If the user device 2402 confirms the presence of the wake word, the process moves to block 2420 and the call request is processed. In processing the call request, the user device 2402 may transmit some or all of the call request to a remote server device for speech analysis using NLP or other techniques. The analysis may identify the user making the call request. Upon identifying the user, the user device 2402 or the remote server device may access a user profile associated with the user in the residential environment. These user profiles may be stored at the user device 2402, at the remote server device, or at another device accessible by the device performing the speech analysis. Processing the call request may ultimately result in the identification of the call recipient (e.g., mom). The call recipient may be a telephone number associated with a name or other label spoken by the user making the call request. In some embodiments, the call recipient may be identifiable by the cellular-enabled device 2403, such that the portion of the call request identifying the recipient is sent to the cellular-enabled device 2403 along with the instruction to initiate the call.
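The recipient-resolution step of block 2420 can be sketched as a profile lookup. The profile layout, speaker identifiers, phone numbers, and the simple request parsing below are hypothetical stand-ins; in practice the speaker would be identified by voice analysis and the request parsed by NLP:

```python
from typing import Optional

# Illustrative sketch of resolving "call mom" into a dialable number via
# the requesting user's profile. All data and names here are invented.

USER_PROFILES = {
    "alice": {"contacts": {"mom": "+1-555-0100", "bob": "+1-555-0101"}},
    "carol": {"contacts": {"mom": "+1-555-0200"}},
}

def resolve_recipient(speaker_id: str, request: str) -> Optional[str]:
    """Look up the spoken label (e.g., 'mom') in the identified user's profile."""
    profile = USER_PROFILES.get(speaker_id)
    if profile is None:
        return None
    label = request.removeprefix("call ").strip()  # "call mom" -> "mom"
    return profile["contacts"].get(label)
```

Note that the same spoken label ("mom") resolves to different numbers for different identified speakers, which is why speaker identification precedes the lookup.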
At block 2422, the user device 2402 may determine the appropriate cellular-enabled device for placing the call. The determination may be based on information about the requesting user obtained from the user profile. In some embodiments, a suitable cellular-enabled device 2403 may be the user's personal cellular telephone. At block 2424, the user device 2402 may instruct the cellular-enabled device to establish a call. This may include sending information to the cellular-enabled device 2403 identifying the call recipient and indicating that the cellular-enabled device 2403 should connect to the accessory 2401 to relay the call audio. Although not depicted in fig. 24, in some embodiments the user device 2402 may alternatively instruct the accessory 2401 to communicate with the selected cellular-enabled device 2403 and instruct the cellular-enabled device 2403 to initiate the call. Such a scenario may occur in an environment where the user device 2402 can identify the appropriate cellular-enabled device 2403 but cannot communicate with it, for example, because the two devices do not have a trusted connection with each other. The user device 2402 may then enter a call listening state at endpoint 2426. While in the call listening state, the accessory interaction instance at the user device 2402 corresponding to accessory 2401 may perform only limited speech processing when analyzing user requests received from accessory 2401. Other accessory interaction instances, corresponding to different accessories associated with the user device 2402, may handle user requests at those accessories normally.
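Blocks 2422 and 2424 can be sketched as selecting a device from a per-user registry and building the instruction that names both the recipient to dial and the accessory to relay audio to. The registry layout and instruction field names are invented for illustration:

```python
from typing import Optional

# Hypothetical sketch of blocks 2422/2424: pick the requesting user's
# cellular-capable device, then build the call-setup instruction. Registry
# contents and instruction fields are illustrative only.

DEVICE_REGISTRY = {
    "alice": {"device_id": "phone-alice", "cellular": True},
    "bob": {"device_id": "tablet-bob", "cellular": False},
}

def select_cellular_device(user: str) -> Optional[str]:
    """Return the user's cellular-enabled device, if any (block 2422)."""
    entry = DEVICE_REGISTRY.get(user)
    if entry and entry["cellular"]:
        return entry["device_id"]
    return None

def build_call_instruction(user: str, recipient: str,
                           accessory_id: str) -> Optional[dict]:
    """Build the instruction sent to the cellular-enabled device (block 2424)."""
    device = select_cellular_device(user)
    if device is None:
        return None
    return {
        "target_device": device,
        "action": "initiate_call",
        "recipient": recipient,        # who to dial
        "relay_audio_to": accessory_id,  # which accessory carries the audio
    }
```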
At block 2428, the cellular-enabled device 2403 may initiate a call to the call recipient via the cellular network 2430. At block 2432, the cellular-enabled device 2403 may establish an audio channel with the accessory 2401. Accessory 2401 may then begin relaying audio to and from the cellular-enabled device 2403 to form a telephone conversation. While the call is in progress, the user device 2402 may stay in the listening state at block 2426 to listen for the "end" word to terminate the call. However, in other examples, accessory 2401 may be configured to also (or instead of the user device 2402) be in a listening state in order to listen for the "end" word to terminate the call, or to listen for other instructions (e.g., instructions potentially unrelated to the telephone call). Although both devices (e.g., the accessory 2401 and the user device 2402) are able to listen for the "end" word, the user device 2402 may be better suited to this task because the detector on the accessory 2401 may be less capable than the detector on the user device 2402.
Fig. 25 is another flow diagram illustrating an exemplary process 2500 for requesting termination of a telephone call at an accessory device 2501 and ending the call at a cellular-enabled device 2503 according to an embodiment. Each of the elements and operations depicted in fig. 25 may be similar to one or more elements depicted in other figures described herein. In some embodiments, process 2500 may be a continuation of process 2400 described above in fig. 24. Thus, accessory 2501 may correspond to accessory 2401, user device 2502 may correspond to user device 2402, and cellular-enabled device 2503 may correspond to cellular-enabled device 2403.
User device 2502 begins process 2500 in the call listening state 2504 while a call is in progress between the cellular-enabled device 2503 and one of its associated accessories (e.g., accessory 2501). In some implementations, the call listening state may be similar to the listening state described for endpoint 2426 of fig. 24. Alternatively, the accessory 2501 may also be in a listening state (for wake words and/or for "end" words). At block 2506, accessory 2501 may receive user audio input. The audio input may consist of a wake word and an end word, such as "computer, hang up." The accessory 2501 may process the wake word and transmit the audio input to the user device 2502 for further processing in blocks 2508-2520. In some embodiments, one or more of the operations of blocks 2508-2520 may be similar to one or more of the operations described with respect to blocks 2406-2418 with reference to fig. 24.
If the user device 2502 detects a wake word at block 2518, the user device may then process an end-word portion of the audio input at block 2522. Processing of end words may be limited to detecting only a specific end word (e.g., "hang up") or a small set of equivalent phrases (e.g., "end call," "end," etc.). At decision 2524, if an end word is not detected, the user device 2502 ignores the audio input and the process ends at endpoint 2526. Because the user device 2502 is in the call listening state (e.g., at block 2504), it ignores all audio inputs that do not contain an end word, even if those inputs otherwise contain a valid and determinable request. In many embodiments, if the end word is not detected within a portion of the input long enough to contain the expected end word, the user device 2502 may not process any portion of the audio input beyond that sufficient portion. In this manner, the user device 2502 is unable to monitor or record any portion of the call.
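The restricted end-word matching of blocks 2522 through 2526 might be sketched as follows. The vocabulary and prefix bound are invented for illustration; the point of the sketch is that nothing beyond a bounded prefix of the input is ever examined, mirroring the privacy behavior described above:

```python
# Hypothetical sketch of call-listening-state end-word detection: match
# only a small fixed vocabulary against a bounded prefix of the transcript
# and never inspect the remainder. Vocabulary and bound are illustrative.

END_WORDS = {"hang up", "end call", "end the call"}
MAX_INSPECT_CHARS = 24  # long enough to contain any expected end word

def detect_end_word(transcript: str) -> bool:
    """Return True only if a known end word appears in the bounded prefix."""
    prefix = transcript[:MAX_INSPECT_CHARS].lower()
    return any(word in prefix for word in END_WORDS)
```

Bounding the inspected window, rather than scanning the full input, is what lets the device discard in-call speech that happens to reach the detector.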
At block 2528, if an end word is detected, the user device 2502 may instruct the accessory 2501 to end the call. At block 2529, the accessory receives the instruction and communicates with the cellular-enabled device 2503 to terminate the call. Alternatively, in some examples, the user device 2502 may instead instruct the cellular-enabled device 2503 directly to end the call at block 2528. In that case, block 2529 is skipped and the cellular-enabled device 2503 is instructed to end the call without going through the accessory 2501. At block 2530, the cellular-enabled device 2503 may terminate the call connection with the cellular network 2532 and then close the communication channel with the accessory 2501 at endpoint 2534.
Fig. 26 is a flow diagram illustrating a process 2600 for a controller device to establish a connection between an accessory device and a cellular-enabled device, according to some embodiments. In some implementations, one or more of the operations of process 2600 may be similar to those described with reference to fig. 24 and 25.
At block 2602, the controller device may establish a first network connection with a cellular-enabled device and a second network connection with an accessory device. The network connection may occur over one or more networks of the networks associated with the residential environment. In the case of the accessory device, the second network connection may be a network connection through which the controller device communicates with the accessory device in response to a user request at the accessory device. In some implementations, one or more of the operations of block 2602 may be similar to one or more of the operations described with reference to fig. 21 for process indicators 2130 and 2140.
At block 2604, the controller device can listen for audio input from the accessory device over a second network connection. Such listening behavior is typical of a controller device acting as a hub device for one or more associated accessories. When the accessory receives an audio input containing a trigger or wake word, the accessory may transmit the audio input to its associated hub device for further processing.
At block 2606, the controller device can receive audio input from the accessory device over the second network connection. The audio input may include wake words and call requests. The call request may correspond to a request to make a telephone call to a cellular-enabled device. In some implementations, one or more of the operations of block 2606 may be similar to one or more of the operations described with respect to block 2420 with reference to fig. 24.
At block 2608, upon receiving the call request, the controller device may transmit, to the cellular-enabled device over the first network connection, an instruction to establish a third network connection with the accessory device and to initiate the call. In some implementations, one or more of the operations of block 2608 may be similar to one or more of the operations described with respect to block 2424 with reference to fig. 24.
At block 2610, the controller device may enter a call listening state. While in this state, the controller device may monitor for a second audio input from the accessory device. In some implementations, one or more of the operations of block 2610 may be similar to one or more of the operations described with respect to blocks 2516 and 2522 with reference to fig. 25.
At block 2612, if the controller device identifies an end word in the second audio input, the controller device may transmit an instruction to end the call and close the connection with the accessory device to the cellular-enabled device. In some implementations, one or more of the operations of block 2612 may be similar to one or more of the operations described with respect to process indicator 2140 with reference to fig. 21.
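The overall controller flow of process 2600 can be condensed into a small state-machine sketch. The state names, event strings, and emitted actions are invented for illustration; the comments map transitions back to the blocks of fig. 26:

```python
from enum import Enum, auto
from typing import Optional, Tuple

# Hypothetical state sketch of process 2600: the controller listens for a
# call request, routes it, enters the call listening state, and tears the
# call down on an end word. Events and actions are illustrative strings.

class HubState(Enum):
    LISTENING = auto()       # block 2604: listening for audio input
    CALL_LISTENING = auto()  # block 2610: monitoring only for an end word

def step(state: HubState, event: str) -> Tuple[HubState, Optional[str]]:
    """Advance the controller one event; return (new state, action or None)."""
    if state is HubState.LISTENING and event == "call_request":
        # Blocks 2606/2608: instruct the cellular-enabled device to set up the call.
        return HubState.CALL_LISTENING, "initiate_call"
    if state is HubState.CALL_LISTENING and event == "end_word":
        # Block 2612: end the call and close the accessory connection.
        return HubState.LISTENING, "end_call"
    # All other inputs are ignored in the current state.
    return state, None
```

Note that in `CALL_LISTENING` every event other than the end word is dropped, matching the restricted processing described for the call listening state.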
Exemplary techniques for communicating between an accessory device and a cellular-enabled device are described above. Some or all of these techniques may be implemented, at least in part, by architectures such as those illustrated at least in fig. 19-8 above, but need not be implemented by such architectures. While many embodiments are described above with reference to server devices, accessory devices, user devices, and hub devices, it should be understood that other types of computing devices may be suitable for performing the techniques disclosed herein. Further, various non-limiting examples are described in the foregoing description. For purposes of explanation, numerous specific configurations and details are set forth in order to provide a thorough understanding of the examples. It will be apparent, however, to one skilled in the art that some examples may be practiced without these specific details. Furthermore, well-known features are sometimes omitted or simplified in order not to obscure the examples described herein.
While specific exemplary embodiments have been described, various modifications, alterations, alternative constructions, and equivalents are also encompassed within the scope of the disclosure. Embodiments are not limited to operation within certain specific data processing environments, but may be freely operable within multiple data processing environments. Furthermore, while embodiments have been described using a particular series of transactions and steps, it should be apparent to those of skill in the art that the scope of the present disclosure is not limited to the series of transactions and steps. The various features and aspects of the above-described embodiments may be used alone or in combination.
As described above, one aspect of the present technology is to collect and use data from specific and legal sources to improve delivery of heuristic content or any other content that may be of interest to a user when updating firmware. The present disclosure contemplates that in some instances, the collected data may include personal information data that uniquely identifies or may be used to identify a particular person. Such personal information data may include demographic data, location-based data, online identifiers, telephone numbers, email addresses, home addresses, data or records related to the user's health or fitness level (e.g., vital sign measurements, medication information, and exercise information), date of birth, or any other personal information.
The present disclosure recognizes that the use of such personal information data in the present technology may be used to benefit users. For example, personal information data may be used to deliver targeted content that may be of greater interest to the user according to their preferences. Thus, the use of such personal information data enables a user to have greater control over the delivered content. In addition, the present disclosure contemplates other uses for personal information data that are beneficial to the user.
The present disclosure contemplates that the entities responsible for collecting, analyzing, disclosing, transmitting, storing, or otherwise using such personal information data will adhere to well-established privacy policies and/or privacy practices. In particular, such entities would be expected to implement and consistently apply privacy practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining user privacy. Such information regarding the use of personal data should be prominent and easily accessible to users, and should be updated as the collection and/or use of the data changes. Personal information from users should be collected only for legitimate uses. Further, such collection/sharing should occur only after receiving the consent of the users or on another legal basis specified in applicable law. Additionally, such entities should consider taking any steps necessary to safeguard and secure access to such personal information data and to ensure that others with access to the personal information data adhere to their privacy policies and procedures. Further, such entities can subject themselves to evaluation by third parties to certify their adherence to widely accepted privacy policies and practices. In addition, policies and practices should be adapted to the particular types of personal information data being collected and/or accessed, and adapted to applicable laws and standards, including jurisdiction-specific considerations that may serve to impose higher standards. For instance, in the United States, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly.
Notwithstanding the foregoing, the present disclosure also contemplates embodiments in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data. For example, in the case of an advertisement delivery service, the present technology can be configured to allow users to select to "opt in" or "opt out" of participation in the collection of personal information data during registration for the service or anytime thereafter. In another example, users can select not to provide mood-associated data for targeted content delivery services. In yet another example, users can select to limit the length of time mood-associated data is maintained, or to entirely prohibit the development of a baseline mood profile. In addition to providing "opt in" and "opt out" options, the present disclosure contemplates providing notifications relating to the access or use of personal information. For instance, a user may be notified upon downloading an app that their personal information data will be accessed, and then reminded again just before the personal information data is accessed by the app.
Moreover, it is the intent of the present disclosure that personal information data should be managed and handled in a way that minimizes risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed. In addition, and when applicable, including in certain health-related applications, data de-identification can be used to protect a user's privacy. De-identification may be facilitated, when appropriate, by removing identifiers, controlling the amount or specificity of data stored (e.g., collecting location data at a city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or by other methods such as differential privacy.
Therefore, although the present disclosure broadly covers the use of personal information data to implement one or more of the various disclosed embodiments, the present disclosure also contemplates that the various embodiments can be implemented without the need to access such personal information data. That is, the various embodiments of the present technology are not rendered inoperable due to the lack of all or a portion of such personal information data. For example, content can be selected and delivered to users based on aggregated non-personal information data or an absolute minimum amount of personal information, such as content handled only on the user's device, or other non-personal information available to the content delivery services.
The various embodiments may also be implemented in a variety of operating environments that may include, in some cases, one or more user computers, computing devices, or processing devices that may be used to operate any of a number of applications. The user device or client device may include any of a variety of different types of computers, such as desktop or laptop computers running a standard operating system, as well as cellular, wireless, and handheld devices running mobile software and capable of supporting multiple networking protocols and instant messaging protocols. This system may also include a plurality of workstations running any of a variety of commercially available operating systems and other known applications for purposes such as development and database management. These devices may also include other electronic devices such as virtual terminals, thin clients, gaming systems, and other devices capable of communicating via a network.
Most embodiments utilize at least one network familiar to those skilled in the art to support communications using any of a variety of commercially available protocols such as TCP/IP, OSI, FTP, UPnP, NFS, CIFS and AppleTalk. The network may be, for example, a local area network, a wide area network, a virtual private network, the internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network, and any combination thereof.
In embodiments utilizing web servers, the web server may run any of a variety of server or mid-tier applications, including HTTP servers, FTP servers, CGI servers, data servers, Java servers, and business application servers. The one or more servers may also be capable of executing programs or scripts in response to requests from user devices, such as by executing one or more applications that may be implemented in any programming language, such as C, C#, or C++, or any scripting language, such as Perl, Python, or TCL, or combinations thereof. The servers may also include database servers, including, but not limited to, those commercially available from various database vendors.
The environment may include various data stores and other memory and storage media, as described above. These may reside at various locations, such as on storage media local to one or more computers or on storage media remote from any or all of the computers on the network (and/or resident in one or more computers). In a particular set of embodiments, the information may reside in a Storage Area Network (SAN) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to a computer, server, or other network device may be stored locally and/or remotely as desired. When the system includes computerized devices, each such device may include hardware elements that may be electrically coupled via a bus, including, for example, at least one Central Processing Unit (CPU), at least one input device (e.g., a mouse, keyboard, controller, touch screen, or keypad), and at least one output device (e.g., a display device, printer, or speaker). Such systems may also include one or more storage devices, such as disk drives, optical storage devices, and solid-state storage devices such as RAM or ROM, as well as removable media devices, memory cards, flash memory cards, and the like.
Such devices may also include a computer-readable storage medium reader, a communication device (e.g., modem, network card (wireless or wired), infrared communication device, etc.), and working memory as described above. The computer-readable storage medium reader may be connected to or configured to receive non-transitory computer-readable storage media representing remote, local, fixed, and/or removable storage devices, as well as storage media for temporarily and/or more permanently containing, storing, transmitting, and retrieving computer-readable information. The system and various devices will typically also include a plurality of software applications, modules, services, or other elements, including an operating system and applications such as a client application or browser, located within at least one working memory device. It should be understood that alternative embodiments may have many variations according to the above description. For example, custom hardware may also be used, and/or certain elements may be implemented in hardware, software (including portable software, such as applets), or both. In addition, connections to other computing devices, such as network input/output devices, may be used.
Non-transitory storage media and computer-readable storage media for containing code or portions of code may include any suitable medium known or used in the art, such as, but not limited to, volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information (e.g., computer readable instructions, data structures, program modules, or other data), including RAM, ROM, electrically erasable programmable read-only memory ("EEPROM"), flash memory or other memory technology, CD-ROM, DVD or other optical memory, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the system device. Based at least in part on the disclosure and teachings provided herein, one of ordinary skill in the art will recognize other ways and/or methods of implementing various embodiments. However, computer-readable storage media do not include transitory media such as carrier waves and the like.
The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the disclosure as set forth in the claims.
Other variations are within the spirit of the disclosure. Thus, while the disclosed technology is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the disclosure to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the disclosure as defined by the appended claims.
The use of the terms "a" and "an" and "the" and similar referents in the context of describing the disclosed embodiments (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. Unless otherwise indicated, the terms "comprising," "having," "including," and "containing" are to be construed as open-ended terms (i.e., meaning "including, but not limited to"). The term "connected" is to be interpreted as including partially or wholly contained within, attached to, or joined together even if there is intervening matter. The phrase "based, at least in part, on" should be understood to be open ended, and not limited in any way, and is intended to be interpreted, or otherwise interpreted, as "based, at least in part, on" where appropriate. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate embodiments of the disclosure and does not pose a limitation on the scope of the disclosure unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the disclosure.
Unless specifically stated otherwise, disjunctive language such as the phrase "at least one of X, Y, or Z" is understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present. In addition, unless specifically stated otherwise, conjunctive language such as the phrase "at least one of X, Y, and Z" is also understood to mean X, Y, Z, or any combination thereof, including "X, Y, and/or Z."
Preferred embodiments of this disclosure are described herein, including the best mode. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. It is contemplated that skilled artisans will employ such variations as appropriate and that the disclosure may be practiced otherwise than as specifically described herein. Accordingly, this disclosure includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, unless indicated otherwise or clearly contradicted by context, this disclosure encompasses any combination of all possible variations of the above elements.
Examples of the techniques described may be illustrated by the following clauses:
Clause 1. A method comprising:
establishing, by a controller device, a first network connection with a cellular-enabled device and a second network connection with an accessory device;
listening, by the controller device, for a first audio input from the accessory device via the second network connection;
while listening, identifying the first audio input from the accessory device via the second network connection, the first audio input comprising at least a request to initiate a telephone call;
transmitting, to the cellular-enabled device via the first network connection or to the accessory device via the second network connection, an instruction for establishing a call between the cellular-enabled device and the accessory device via a third network connection;
listening for a second audio input from the accessory device via the second network connection during the call between the cellular-enabled device and the accessory device;
in accordance with a determination that the second audio input is recognized, transmitting, to the cellular-enabled device via the first network connection or to the accessory device via the second network connection, an instruction to end the call with the accessory device.
Clause 2. The method of clause 1, wherein the second audio input comprises an end word configured to identify that the call is to be ended.
Clause 3. The method of clause 2, wherein, during the call, the controller device listens only for the end word.
Clause 4. The method of clause 2, wherein, during the call, the controller device does not process any audio received from the accessory device via the second network connection other than the end word.
Clause 5. The method of clause 2, wherein the end word comprises at least one of "hang up" or "end call".
Clause 6. The method of clause 1, further comprising:
in accordance with the determination that the second audio input is recognized, transmitting information to the accessory device via the second network connection, the information informing the accessory device that the call is to be ended.
Clause 7. The method of clause 1, wherein the first audio input further comprises at least a wake word.
Clause 8. The method of clause 7, wherein the wake word indicates that a second portion of the first audio input identifies an action to be performed by the controller device.
Clause 9. The method of clause 8, wherein the action comprises transmitting the instruction to establish the call via the third network connection.
Clause 10. The method of clause 7, wherein the wake word corresponds to a first software ecosystem of the controller device and does not correspond to a second software ecosystem of the accessory device.
Clause 11. The method of clause 1, wherein the accessory device is a third-party device built by a first entity that is different from a second entity that builds the controller device.
Clause 12. The method of clause 11, wherein the accessory device is configured to implement a software development kit provided by the second entity associated with the controller device.
Clause 13. The method of clause 1, wherein the accessory device is not cellular-enabled.
Clause 14. The method of clause 1, wherein the accessory device is configured with a speaker and a microphone.
Clause 15. The method of clause 1, wherein the cellular-enabled device is configured to relay the call from a service provider to the accessory device.
Clause 16. The method of clause 1, wherein the controller device is one of a plurality of hub devices in a residential environment.
Clause 17. A controller device comprising:
a memory configured to store computer-executable instructions; and
a processor configured to access the memory and execute the computer-executable instructions to perform at least the method of any one of clauses 1-16.
Clause 18. A computer-readable storage medium configured to store computer-executable instructions that, when executed by a controller device, cause the controller device to perform the method of any one of clauses 1-16.
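Purely as an illustrative aid, and not as a description of any actual implementation, the controller-device call flow of clauses 1-16 can be sketched in Python. All names here (the `ControllerDevice` class, the wake word, the end words) are hypothetical placeholders:

```python
# Hypothetical sketch of the call flow in clauses 1-16. The class name,
# wake word, and end words are illustrative assumptions, not taken from
# the disclosure.

WAKE_WORD = "hey assistant"          # assumed wake word (clause 7)
END_WORDS = ("hang up", "end call")  # end words, per clause 5

class ControllerDevice:
    """Hub bridging an accessory device and a cellular-enabled device."""

    def __init__(self, cellular_device, accessory_device):
        # First network connection: controller <-> cellular-enabled device.
        # Second network connection: controller <-> accessory device.
        self.cellular = cellular_device
        self.accessory = accessory_device
        self.in_call = False

    def handle_audio(self, audio_text: str) -> str:
        """Handle audio heard from the accessory over the second connection."""
        text = audio_text.lower()
        if self.in_call:
            # During the call, only the end word is acted on (clauses 3-4);
            # all other accessory audio is left unprocessed.
            if any(word in text for word in END_WORDS):
                self.in_call = False
                return "end-call instruction sent"
            return "ignored"
        # Outside a call, the wake word plus a call request triggers the
        # instruction to establish the call (clauses 1 and 7-9).
        if text.startswith(WAKE_WORD) and "call" in text:
            self.in_call = True
            return "establish-call instruction sent"
        return "ignored"
```

In this sketch the return strings stand in for the instructions that the clauses transmit over the first or second network connection; an actual hub would send them over those connections rather than return them.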

Claims (20)

1. A method, comprising:
receiving, by a user device, information identifying a plurality of accessories configured to communicate with the user device;
implementing, by the user device, a respective accessory interaction instance for each accessory of the plurality of accessories;
receiving a first audio input from a first accessory of the plurality of accessories and a second audio input from a second accessory of the plurality of accessories;
processing at least a portion of the first audio input by a first one of the respective accessory interaction instances;
receiving, by the first one of the respective accessory interaction instances, a first response from a server computer, the first response corresponding to a processed portion of the first audio input; and
transmitting, by the user device, the first response to the first accessory of the plurality of accessories.
2. The method of claim 1, wherein the processing of the portion of the first audio input by the first one of the respective accessory interaction instances comprises:
transmitting, by the first accessory interaction instance, the processed portion of the first audio input to the server computer.
3. The method of claim 2, wherein the processing of the portion of the first audio input by the first one of the respective accessory interaction instances further comprises:
delegating, by the first accessory interaction instance, an action to one or more other processes.
4. The method of claim 3, wherein the one or more other processes comprise at least one of a music service or a voice communication service, and wherein the action comprises instructions for the one or more other processes to provide audio content for the first accessory.
5. The method of claim 1, further comprising:
determining, by the user device, whether at least a portion of the second audio input matches a wake word; and
determining, by the user device, whether the second accessory is authorized to interact with a second accessory interaction instance of the respective accessory interaction instances.
6. The method of claim 5, further comprising:
processing, by the second one of the respective accessory interaction instances, at least a portion of the second audio input in accordance with at least one of: determining that the portion of the second audio input matches the wake word or determining that the second accessory is authorized to interact with the second one of the respective accessory interaction instances.
7. The method of claim 1, wherein each accessory of the plurality of accessories is configured to implement a software development kit provided by an entity associated with the user device.
8. The method of claim 7, wherein each of the respective accessory interaction instances is configured to communicate with a corresponding software development kit of a respective accessory of the plurality of accessories.
9. The method of claim 1, further comprising managing, by the user device, respective accessory settings for each accessory of the plurality of accessories.
10. The method of claim 1, wherein the user device is a first user device, and wherein the information comprises at least one of:
a request from at least one of the plurality of accessories to connect to the first user device; or
an instruction from a second user device to connect the first user device to at least one accessory of the plurality of accessories.
11. A user device, comprising:
a memory configured to store computer-executable instructions; and
a processor configured to connect to the memory and execute the computer-executable instructions to at least:
receiving information identifying a plurality of accessories configured to communicate with the user device;
implementing a respective accessory interaction instance for each accessory of the plurality of accessories;
receiving a first audio input from a first accessory of the plurality of accessories and a second audio input from a second accessory of the plurality of accessories;
processing at least a portion of the first audio input by a first one of the respective accessory interaction instances;
receiving, by the first one of the respective accessory interaction instances, a first response from a server computer, the first response corresponding to a processed portion of the first audio input; and
transmitting the first response to the first accessory of the plurality of accessories.
12. The user device of claim 11, wherein the processing of the portion of the first audio input by the first one of the respective accessory interaction instances comprises at least one of:
transmitting, by the first accessory interaction instance, the processed portion of the first audio input to the server computer; or
delegating, by the first accessory interaction instance, an action to one or more other processes.
13. The user device of claim 11, wherein the processor is configured to execute the computer-executable instructions to at least:
determining whether at least a portion of the second audio input matches a wake word; and
determining whether the second accessory is authorized to interact with a second accessory interaction instance of the respective accessory interaction instances.
14. The user device of claim 13, wherein the processor is configured to execute the computer-executable instructions to at least:
processing, by the second one of the respective accessory interaction instances, at least a portion of the second audio input in accordance with at least one of: determining that the portion of the second audio input matches the wake word or determining that the second accessory is authorized to interact with the second one of the respective accessory interaction instances.
15. The user device of claim 11, wherein each accessory of the plurality of accessories is configured to implement a software development kit provided by an entity associated with the user device.
16. A computer-readable storage medium configured to store computer-executable instructions that, when executed by a user device, cause the user device to perform operations comprising:
receiving information identifying a plurality of accessories configured to communicate with the user device;
implementing a respective accessory interaction instance for each accessory of the plurality of accessories;
receiving a first audio input from a first accessory of the plurality of accessories and a second audio input from a second accessory of the plurality of accessories;
processing at least a portion of the first audio input by a first one of the respective accessory interaction instances;
receiving, by the first one of the respective accessory interaction instances, a first response from a server computer, the first response corresponding to a processed portion of the first audio input; and
transmitting the first response to the first accessory of the plurality of accessories.
17. The computer-readable storage medium of claim 16, wherein the processing of the portion of the first audio input by the first one of the respective accessory interaction instances comprises at least one of:
transmitting, by the first accessory interaction instance, the processed portion of the first audio input to the server computer; or
delegating, by the first accessory interaction instance, an action to one or more other processes.
18. The computer-readable storage medium of claim 16, wherein the operations further comprise:
determining, by the user device, whether at least a portion of the second audio input matches a wake word; and
determining, by the user device, whether the second accessory is authorized to interact with a second accessory interaction instance of the respective accessory interaction instances.
19. The computer-readable storage medium of claim 18, wherein the operations further comprise:
processing, by the second one of the respective accessory interaction instances, at least a portion of the second audio input in accordance with at least one of: determining that the portion of the second audio input matches the wake word or determining that the second accessory is authorized to interact with the second one of the respective accessory interaction instances.
20. The computer-readable storage medium of claim 16, wherein each accessory of the plurality of accessories is configured to implement a software development kit provided by an entity associated with the user device.
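As a non-authoritative sketch only, the per-accessory routing recited in the claims can be pictured as a hub that implements one interaction instance per accessory and forwards each accessory's audio to its own instance. The class names and the locally faked server round-trip are assumptions for illustration, not the claimed implementation:

```python
# Hypothetical sketch of per-accessory interaction instances (claims 1, 11, 16).
# All names are illustrative; the server round-trip is simulated locally.

class AccessoryInteractionInstance:
    """One instance per accessory; processes that accessory's audio."""

    def __init__(self, accessory_id: str):
        self.accessory_id = accessory_id

    def process(self, audio: str) -> str:
        # In the claims, the processed audio goes to a server computer and a
        # response comes back; here the round-trip is faked in place.
        return f"response[{self.accessory_id}]: {audio.strip().lower()}"

class UserDevice:
    """Hub ("user device") holding one interaction instance per accessory."""

    def __init__(self, accessory_ids):
        # One interaction instance for each identified accessory (claim 1).
        self.instances = {a: AccessoryInteractionInstance(a) for a in accessory_ids}

    def route_audio(self, accessory_id: str, audio: str) -> str:
        # Audio from an accessory is processed by that accessory's own
        # instance, and the response is returned to the same accessory.
        return self.instances[accessory_id].process(audio)
```

The dictionary keyed by accessory identifier mirrors the claims' requirement that audio from the first accessory and the second accessory be handled by distinct interaction instances.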
CN202280028296.0A 2021-04-15 2022-04-13 Techniques for communication between a hub device and multiple endpoints Pending CN117136352A (en)

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US202163175480P 2021-04-15 2021-04-15
US202163175478P 2021-04-15 2021-04-15
US202163175473P 2021-04-15 2021-04-15
US63/175,478 2021-04-15
US63/175,473 2021-04-15
US63/175,480 2021-04-15
US17/718,984 US11914537B2 (en) 2021-04-15 2022-04-12 Techniques for load balancing with a hub device and multiple endpoints
US17/718,977 2022-04-12
US17/719,086 US20220337691A1 (en) 2021-04-15 2022-04-12 Techniques for establishing communications with third-party accessories
US17/719,086 2022-04-12
US17/718,977 US20220335938A1 (en) 2021-04-15 2022-04-12 Techniques for communication between hub device and multiple endpoints
US17/718,984 2022-04-12
PCT/US2022/024531 WO2022221360A1 (en) 2021-04-15 2022-04-13 Techniques for communication between hub device and multiple endpoints

Publications (1)

Publication Number Publication Date
CN117136352A true CN117136352A (en) 2023-11-28

Family

ID=88863279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280028296.0A Pending CN117136352A (en) 2021-04-15 2022-04-13 Techniques for communication between a hub device and multiple endpoints

Country Status (2)

Country Link
CN (1) CN117136352A (en)
GB (1) GB2619894A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220337691A1 (en) * 2021-04-15 2022-10-20 Apple Inc. Techniques for establishing communications with third-party accessories

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102338376B1 (en) * 2017-09-13 2021-12-13 삼성전자주식회사 An electronic device and Method for controlling the electronic device thereof
WO2019231537A1 (en) * 2018-06-01 2019-12-05 Apple Inc. Virtual assistant operation in multi-device environments
KR102574903B1 (en) * 2018-08-08 2023-09-05 삼성전자주식회사 Electronic device supporting personalized device connection and method thereof


Also Published As

Publication number Publication date
GB2619894A (en) 2023-12-20

Similar Documents

Publication Publication Date Title
US10602321B2 (en) Audio systems and methods
JP7130637B2 (en) Focus session on voice interface device
US20200021627A1 (en) Communication system and method
US10291660B2 (en) Communication system and method
US20100015976A1 (en) System and method for sharing rights-enabled mobile profiles
JP6909311B2 (en) A method of providing a personalized voice recognition service using an artificial intelligence automatic speaker identification method and a service providing server used for this method.
US10750000B1 (en) Opportunistic initiation of voice or video calls between smart speaker devices
JP2016518744A (en) Handling multiple voice calls on multiple SIM mobile phones
CN104640046B (en) The method and system of wireless music system alarm clock are set
US10104524B2 (en) Communications via a receiving device network
US10236016B1 (en) Peripheral-based selection of audio sources
US11736361B1 (en) Techniques for sharing device capabilities over a network of user devices
CN114245328A (en) Voice call transfer method and electronic equipment
CN117136352A (en) Techniques for communication between a hub device and multiple endpoints
WO2019055090A1 (en) Notifications to all devices to update state
US20220335938A1 (en) Techniques for communication between hub device and multiple endpoints
US11914537B2 (en) Techniques for load balancing with a hub device and multiple endpoints
CN107395493B (en) Method and device for sharing message based on intention
WO2022221360A1 (en) Techniques for communication between hub device and multiple endpoints
CN115552869A (en) Techniques for relaying audio messages to devices
CN113840164A (en) Voice control method, device, terminal equipment and storage medium
US20220337691A1 (en) Techniques for establishing communications with third-party accessories
US20220217191A1 (en) Method and device to manage audio and/or video sources during a streaming session
CN106385496B (en) Method and device for establishing a communication connection
WO2017166985A1 (en) Call management method, device and system, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination