CA3129236C - Devices, systems, and methods for distributed voice processing - Google Patents
Devices, systems, and methods for distributed voice processing
- Publication number
- CA3129236C
- Authority
- CA
- Canada
- Prior art keywords
- playback device
- wake word
- playback
- remote computing
- wake
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R27/00—Public address systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers
- H04R3/005—Circuits for transducers for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R2227/00—Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
- H04R2227/005—Audio distribution systems for home, i.e. multi-room use
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Otolaryngology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Selective Calling Equipment (AREA)
- Telephonic Communication Services (AREA)
- Circuit For Audible Band Transducer (AREA)
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]
The present application claims priority to U.S. Patent Application No.
16/271,550, filed February 8, 2019, and U.S. Patent Application No. 16/271,560, filed February 8, 2019.
TECHNICAL FIELD
The present technology relates to consumer goods and, more particularly, to methods, systems, products, features, services, and other elements directed to voice-controllable media playback systems or some aspect thereof.
BACKGROUND
Options for accessing and listening to digital audio in an out-loud setting were limited until 2003, when SONOS, Inc. filed for one of its first patent applications, entitled "Method for Synchronizing Audio Playback between Multiple Networked Devices,"
and began offering a media playback system for sale in 2005. The SONOS Wireless HiFi System enables people to experience music from many sources via one or more networked playback devices.
Through a software control application installed on a smartphone, tablet, or computer, one can play what he or she wants in any room that has a networked playback device.
Additionally, using a controller, for example, different songs can be streamed to each room that has a playback device, rooms can be grouped together for synchronous playback, or the same song can be heard in all rooms synchronously.
Given the ever-growing interest in digital media, there continues to be a need to develop consumer-accessible technologies to further enhance the listening experience.
BRIEF DESCRIPTION OF THE DRAWINGS
Features, aspects, and advantages of the presently disclosed technology may be better understood with regard to the following description and accompanying drawings where:
Figure 1A is a partial cutaway view of an environment having a media playback system configured in accordance with aspects of the disclosed technology.
Figure 1B is a schematic diagram of the media playback system of Figure 1A and one or more networks;
DETAILED DESCRIPTION
I. Overview
In some implementations, network microphone devices may be used to control smart home devices.
Extracting the detected sound may include reading out and packaging a stream of the detected-sound data according to a particular format and transmitting the packaged sound data to an appropriate VAS for interpretation.
A VAS traditionally takes the form of a remote service implemented using one or more cloud servers configured to process voice inputs (e.g., AMAZON's ALEXA, APPLE's SIRI, MICROSOFT's CORTANA, GOOGLE's ASSISTANT, etc.). In some instances, certain components and functionality of the VAS may be distributed across local and remote devices.
Additionally, or alternatively, a VAS may take the form of a local service implemented at an NMD
or a media playback system comprising the NMD such that a voice input or certain types of voice input (e.g., rudimentary commands) are processed locally without intervention from a remote VAS.
The VAS will typically process this data, which involves identifying the voice input and determining an intent of words captured in the voice input. The VAS may then provide a response back to the NMD with some instruction according to the determined intent. Based on that instruction, the NMD may cause one or more smart devices to perform an action. For example, in accordance with an instruction from a VAS, an NMD may cause a playback device to play a particular song or an illumination device to turn on/off, among other examples. In some cases, an NMD, or a media system with NMDs (e.g., a media playback system with NMD-equipped playback devices) may be configured to interact with multiple VASes. In practice, the NMD may select one VAS over another based on the particular wake word identified in the sound detected by the NMD.
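Where an NMD or media playback system interacts with multiple VASes, the selection step described above reduces to a dispatch keyed on the identified wake word. The Python sketch below is purely illustrative; the mapping, endpoint strings, and function names are hypothetical and not drawn from the patent.

```python
# Hypothetical sketch of wake-word-based VAS selection; the mapping and
# endpoints are placeholders, not the patent's implementation.

WAKE_WORD_TO_VAS = {
    "alexa": "https://vas.example/amazon",
    "ok google": "https://vas.example/google",
    "hey sonos": "local",  # e.g., rudimentary commands handled on-device
}

def route_voice_input(wake_word: str, sound_data: bytes) -> str:
    """Select a VAS based on the wake word spotted in the detected sound."""
    endpoint = WAKE_WORD_TO_VAS.get(wake_word.lower())
    if endpoint is None:
        raise ValueError(f"no VAS associated with wake word {wake_word!r}")
    if endpoint == "local":
        return "processed locally"
    # A real NMD would packetize and stream the sound data here.
    return f"sent {len(sound_data)} bytes to {endpoint}"
```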
For instance, in querying AMAZON, a user might speak the wake word "Alexa"
followed by a voice command. Other examples include "Ok, Google" for querying GOOGLE and "Hey, Siri"
for querying APPLE.
In some cases, this is a manufacturer-specific wake word rather than a wake word tied to any particular voice service (e.g., "Hey, Sonos" where the NMD is a SONOS playback device). Given such a wake word, the NMD can identify a particular voice service to process the request. For instance, if the voice input following the wake word is related to a particular type of command (e.g., music playback), then the voice input is sent to a particular voice service associated with that type of command (e.g., a streaming music service having voice command capabilities).
In such instances, a user may wish to replace the pre-associated VAS with a different VAS of the user's choosing. Additionally, some voice-enabled playback devices may be sold without any pre-associated VAS, in which cases a user may wish to manage the selection and association of a particular VAS with the playback device.
Example Operating Environment
Referring first to Figure 1A, the MPS 100 as shown is associated with an example home environment having a plurality of rooms and spaces, which may be collectively referred to as a "home environment," "smart home," or "environment 101." The environment 101 comprises a household having several rooms, spaces, and/or playback zones, including a master bathroom 101a, a master bedroom 101b (referred to herein as "Nick's Room"), a second bedroom 101c, a family room or den 101d, an office 101e, a living room 101f, a dining room 101g, a kitchen 101h, and an outdoor patio 101i. While certain embodiments and examples are described below in the context of a home environment, the technologies described herein may be implemented in other types of environments. In some embodiments, for example, the MPS 100 can be implemented in one or more commercial settings (e.g., a restaurant, mall, airport, hotel, a retail or other store), one or more vehicles (e.g., a sports utility vehicle, bus, car, a ship, a boat, an airplane), multiple environments (e.g., a combination of home and vehicle environments), and/or another suitable environment where multi-zone audio may be desirable.
The remote computing devices 106 may be configured to interact with computing devices in the environment 101 in various ways. For example, the remote computing devices 106 may be configured to facilitate streaming and/or controlling playback of media content, such as audio, in the home environment 101.
MUSIC, or other media content services.
For example, the playback devices 102a–e include or are otherwise equipped with corresponding NMDs 103a–e, respectively. A playback device that includes or is otherwise equipped with an NMD may be referred to herein interchangeably as a playback device or an NMD unless indicated otherwise in the description. In some cases, one or more of the NMDs 103 may be a stand-alone device. For example, the NMDs 103f and 103g may be stand-alone devices. A stand-alone NMD
may omit components and/or functionality that is typically included in a playback device, such as a speaker or related electronics. For instance, in such cases, a stand-alone NMD may not produce audio output or may produce limited audio output (e.g., relatively low-quality audio output).
"Dining Room," "Living Room," and "Office," respectively. Further, certain playback devices may have functionally descriptive names. For example, the playback devices 102a and 102b are assigned the names "Right" and "Front," respectively, because these two devices are configured to provide specific audio channels during media playback in the zone of the Den 101d (Figure 1A). The playback device 102c in the Patio may be named portable because it is battery-powered and/or Date Recue/Date Received 2021-11-17 readily transportable to different areas of the environment 101. Other naming conventions are possible.
Interactions with the VAS
190 may be initiated, for example, when an NMD identifies in the detected sound a potential wake word. The identification causes a wake-word event, which in turn causes the NMD to begin transmitting detected-sound data to the VAS 190. In some implementations, the various local network devices 102-105 (Figure 1A) and/or remote computing devices 106c of the MPS 100 may exchange various feedback, information, instructions, and/or related data with the remote computing devices associated with the selected VAS. Such exchanges may be related to or independent of transmitted messages containing voice inputs. In some embodiments, the remote computing device(s) and the media playback system 100 may exchange data via communication paths as described herein and/or using a metadata exchange channel as described in U.S.
Application No. 15/438,749 filed February 21, 2017, and titled "Voice Control of a Media Playback System".
For instance, the MPS 100 may direct an assigned playback device to play audio in response to a remote VAS receiving a voice input from the NMD to play the audio, which the NMD might have sent to the VAS in response to a user speaking a command to play a certain song, album, playlist, etc. Additional details regarding assigning NMDs and playback devices as designated or default devices may be found, for example, in previously mentioned U.S. Patent Application No.
15/438,749.
While discussions herein may refer to the example MPS 100, technologies described herein are not limited to applications within, among other things, the home environment described above. For instance, the technologies described herein may be useful in other home environment configurations comprising more or fewer of any of the playback, network microphone, and/or controller devices 102-104. For example, the technologies herein may be utilized within an environment having a single playback device 102 and/or a single NMD
103. In some examples of such cases, the LAN 111 (Figure 1B) may be eliminated and the single playback device 102 and/or the single NMD 103 may communicate directly with the remote computing devices 106a–d. In some embodiments, a telecommunication network (e.g., an LTE
network, a 5G network, etc.) may communicate with the various playback, network microphone, and/or controller devices 102-104 independent of a LAN.
a. Example Playback & Network Microphone Devices
Such a playback device may be referred to as an "NMD-equipped" playback device because it includes components that support the functionality of an NMD, such as one of the NMDs 103 shown in Figure 1A.
U.S. Patent No. 8,234,395, filed on April 4, 2004, and titled "System and method for synchronizing operations among a plurality of independently clocked digital data processing devices," provides in more detail some examples of audio playback synchronization among playback devices.
In some implementations, one or more of the audio processing components 216 may be a subcomponent of the processor 212. In operation, the audio processing components 216 receive analog and/or digital audio and process and/or otherwise intentionally alter the audio to produce audio signals for playback.
A wired interface may provide network interface functions for the playback device 102 to communicate over a wired connection with other devices in accordance with a communication protocol (e.g., IEEE 802.3).
While the network interface 224 shown in Figure 2A includes both wired and wireless interfaces, the playback device 102 may in some implementations include only wireless interface(s) or only wired interface(s).
In this respect, certain voice processing components 220 may be configured with particular parameters (e.g., gain and/or spectral parameters) that may be modified or otherwise tuned to achieve particular functions. In some implementations, one or more of the voice processing components 220 may be a subcomponent of the processor 212.
The user interface 240 may further include one or more of lights (e.g., LEDs) and the speakers to provide visual and/or audio feedback to a user.
"BEAM," "CONNECT," and "SUB." Any other past, present, and/or future playback devices may additionally or alternatively be used to implement the playback devices of example embodiments disclosed herein. Additionally, it should be understood that a playback device is not limited to the examples illustrated in Figures 2A or 2B or to the SONOS product offerings.
For example, a playback device may include, or otherwise take the form of, a wired or wireless headphone set, which may operate as a part of the media playback system 100 via a network interface or the like.
In another example, a playback device may include or interact with a docking station for personal mobile media playback devices. In yet another example, a playback device may be integral to another device or component such as a television, a lighting fixture, or some other device for indoor or outdoor use.
b. Example Playback Device Configurations
Referring first to Figure 3A, in some example instances, a single playback device may belong to a zone. For example, the playback device 102c (Figure 1A) on the Patio may belong to Zone A. In some implementations described below, multiple playback devices may be "bonded" to form a "bonded pair," which together form a single zone. For example, the playback device 102f (Figure 1A) named "Bed 1" in Figure 3A may be bonded to the playback device 102g (Figure 1A) named "Bed 2" in Figure 3A to form Zone B. Bonded playback devices may have different playback responsibilities (e.g., channel responsibilities). In another implementation described below, multiple playback devices may be merged to form a single zone. For example, the playback device 102d named "Bookcase" may be merged with the playback device 102m named "Living Room"
to form a single Zone C. The merged playback devices 102d and 102m may not be specifically assigned different playback responsibilities. That is, the merged playback devices 102d and 102m may, aside from playing audio content in synchrony, each play audio content as they would if they were not merged.
In contrast to certain bonded playback devices, playback devices that are merged may not have assigned playback responsibilities, but may each render the full range of audio content that each respective playback device is capable of. Nevertheless, merged devices may be represented as a single UI
entity (i.e., a zone, as discussed above). For instance, Figure 3E shows the playback devices 102d and 102m in the Living Room merged, which would result in these devices being represented by the single UI entity of Zone C. In one embodiment, the playback devices 102d and 102m may play back audio in synchrony, during which each outputs the full range of audio content that each respective playback device 102d and 102m is capable of rendering.
An NMD may also be bonded or merged with another device so as to form a zone. For example, the NMD 103f named "Island" may be bonded with the Kitchen playback device 102i, which together form Zone F, which is also named "Kitchen." Additional details regarding assigning NMDs and playback devices as designated or default devices may be found, for example, in previously mentioned U.S. Patent Application No. 15/438,749. In some embodiments, a stand-alone NMD may not be assigned to a zone.
For example, three, four, five, or more (e.g., all) of the Zones A-I may be grouped. When grouped, the zones of individual and/or bonded playback devices may play back audio in synchrony with one another, as described in previously mentioned U.S. Patent No. 8,234,395.
Grouped and bonded devices are example types of associations between portable and stationary playback devices that may be caused in response to a trigger event, as discussed above and described in greater detail below.
Kitchen," as shown in Figure 3A. In some embodiments, a zone group may be given a unique name selected by a user, such as "Nick's Room," as also shown in Figure 3A. The name "Nick's Room" may be a name chosen by a user over a prior name for the zone group, such as the room name "Master Bedroom."
Identifiers associated with the Living Room may indicate that the Living Room is not grouped with other zones but includes bonded playback devices 102a, 102b, 102j, and 102k. Identifiers associated with the Dining Room may indicate that the Dining Room is part of the Dining Room +
Kitchen group and that devices 103f and 102i are bonded. Identifiers associated with the Kitchen may indicate the same or similar information by virtue of the Kitchen being part of the Dining Room + Kitchen zone group. Other example zone variables and identifiers are described below.
Such data may pertain to audio sources accessible by the playback device 102 or a playback queue that the playback device (or some other playback device(s)) may be associated with. In embodiments described below, the memory 213 is configured to store a set of command data for selecting a particular VAS when processing voice inputs.
Synchronization among playback zones may be achieved in a manner similar to that of synchronization among playback devices, as described in previously mentioned U.S. Patent No. 8,234,395.
c. Example Controller Devices
may include components that are generally similar to certain components of the network devices described above, such as a processor 412, memory 413 storing program software 414, at least one network interface 424, and one or more microphones 422. In one example, a controller device may be a dedicated controller for the MPS 100. In another example, a controller device may be a network device on which media playback system controller application software may be installed, such as, for example, an iPhone™, iPad™, or any other smartphone, tablet, or network device (e.g., a networked computer such as a PC or Mac™).
The memory 413 may be loaded with instructions in software 414 that are executable by the processor 412 to achieve certain functions, such as facilitating user access, control, and/or configuration of the MPS 100. The controller device 104 is configured to communicate with other network devices via the network interface 424, which may take the form of a wireless interface, as described above.
For instance, the controller device 104 may receive playback zone and zone group configurations in the MPS 100 from a playback device, an NMD, or another network device.
Likewise, the controller device 104 may transmit such system information to a playback device or another network device via the network interface 424. In some cases, the other network device may be another controller device.
The controller device 104 may also include a user interface 440 configured to facilitate user access and control of the MPS 100. The user interface 440 may include a touch-screen display or other physical interface configured to provide various graphical controller interfaces, such as the controller interfaces 440a and 440b shown in Figures 4B and 4C. Referring to Figures 4B and 4C together, the controller interfaces 440a and 440b include a playback control region 442, a playback zone region 443, a playback status region 444, a playback queue region 446, and a sources region 448. The user interface as shown is just one example of an interface that may be provided on a network device, such as the controller device shown in Figure 4A, and accessed by users to control a media playback system, such as the MPS
100. Other user interfaces of varying formats, styles, and interactive sequences may alternatively be implemented on one or more network devices to provide comparable control access to a media playback system.
In another example, audio items in a playback queue may be saved as a playlist. In a further example, a playback queue may be empty, or populated but "not in use" when the playback zone or zone group is playing continuously streamed audio content, such as Internet radio that may continue to play until otherwise stopped, rather than discrete audio items that have playback durations. In an alternative embodiment, a playback queue can include Internet radio and/or other streaming audio content items and be "in use" when the playback zone or zone group is playing those items. Other examples are also possible.
playback queues associated with the affected playback zones or zone groups may be cleared or re-associated. For example, if a first playback zone including a first playback queue is grouped with a second playback zone including a second playback queue, the established zone group may have an associated playback queue that is initially empty, that contains audio items from the first playback queue (such as if the second playback zone was added to the first playback zone), that contains audio items from the second playback queue (such as if the first playback zone was added to the second playback zone), or a combination of audio items from both the first and second playback queues. Subsequently, if the established zone group is ungrouped, the resulting first playback zone may be re-associated with the previous first playback queue or may be associated with a new playback queue that is empty or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped.
Similarly, the resulting second playback zone may be re-associated with the previous second playback queue or may be associated with a new playback queue that is empty or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped. Other examples are also possible.
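To make the queue re-association rules above concrete, here is a minimal sketch under assumed helper names (group_queues, ungroup_queue); the disclosure does not prescribe any particular implementation.

```python
# Illustrative only: queue handling when zones are grouped and ungrouped.

def group_queues(first_queue, second_queue, mode):
    """Build the playback queue for a newly established zone group."""
    if mode == "empty":
        return []
    if mode == "first":      # second zone was added to the first zone
        return list(first_queue)
    if mode == "second":     # first zone was added to the second zone
        return list(second_queue)
    if mode == "combined":
        return list(first_queue) + list(second_queue)
    raise ValueError(f"unknown grouping mode: {mode}")

def ungroup_queue(previous_queue, group_queue, keep_group_items):
    """On ungrouping, a zone may get back its previous queue or keep the
    group's queue as it stood before the group was dissolved."""
    return list(group_queue) if keep_group_items else list(previous_queue)
```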
In one example, graphical representations of audio content may be selectable to bring up additional selectable icons to manage and/or manipulate the playback queue and/or audio content represented in the playback queue. For instance, represented audio content may be removed from the playback queue, moved to a different position within the playback queue, or selected to be played immediately, or after any currently playing audio content, among other possibilities. A playback queue associated with a playback zone or zone group may be stored in a memory on one or more playback devices in the playback zone or zone group, on a playback device that is not in the playback zone or zone group, and/or some other designated device. Playback of such a playback queue may involve one or more playback devices playing back media items of the queue, perhaps in sequential or random order.
d. Example Audio Content Sources
One or more playback devices in a zone or zone group may be configured to retrieve for playback audio content (e.g., according to a corresponding URI or URL for the audio content) from a variety of available audio content sources. In one example, audio content may be retrieved by a playback device directly from a corresponding audio content source (e.g., via a line-in connection). In another example, audio content may be provided to a playback device over a network via one or more other playback devices or network devices. As described in greater detail below, in some embodiments audio content may be provided by one or more media content services.
Indexing of audio items may involve scanning for identifiable audio items in all folders/directories shared over a network accessible by playback devices in the media playback system and generating or updating an audio content database comprising metadata (e.g., title, artist, album, track length, among others) and other associated information, such as a URI or URL for each identifiable audio item found. Other examples for managing and maintaining audio content sources may also be possible.
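As a rough illustration of such an indexing pass, the sketch below walks shared folders and builds a URI-keyed metadata database. The extension list and the stub tag reader are assumptions; a real indexer would parse the audio files' embedded tags.

```python
# Hedged sketch of indexing shared folders into a URI-keyed metadata database.
import os

AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a", ".wav"}

def index_shared_folders(roots):
    database = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS:
                    path = os.path.join(dirpath, name)
                    database[f"file://{path}"] = read_metadata(path)
    return database

def read_metadata(path):
    # Stand-in for a tag parser (title, artist, album, track length, etc.).
    return {"title": os.path.basename(path)}
```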
e. Example Network Microphone Devices
Figure 5 is a functional block diagram showing an NMD 503 configured in accordance with embodiments of the disclosure. The NMD 503, for example, may be configured for use with the MPS 100 and may be in communication with any of the playback and/or network microphone devices described herein. As noted above, in some implementations an NMD may be a stand-alone device, while in other implementations it may be incorporated into a playback device or a different device, such as a smart household appliance (e.g., a smart washing machine, microwave, etc.). As shown in Figure 5, the NMD 503 includes a voice processor 560, a wake-word engine 570, and at least one voice extractor 572, each of which is operably coupled to the voice processor 560.
The NMD 503 may be NMD-equipped such that it includes the microphones 222 and the at least one network interface 224 described above. The NMD 503 may also include other components, such as audio amplifiers, etc., which are not shown in Figure 5 for purposes of clarity.
The microphones 222 of the NMD 503 are configured to provide detected sound, SD, from the environment of the NMD 503 to the voice processor 560. The detected sound SD may take the form of one or more analog or digital signals. In example implementations, the detected sound SD may be composed of a plurality of signals associated with respective channels 562 that are fed to the voice processor 560. Each channel 562 may provide all or a portion of the detected sound SD to the voice processor 560.
Each channel 562 may correspond to a particular microphone 222. For example, an NMD having six microphones may have six corresponding channels. Each channel of the detected sound SD may bear certain similarities to the other channels but may differ in certain regards, which may be due to the position of the given channel's corresponding microphone relative to the microphones of other channels. For example, one or more of the channels of the detected sound SD may have a greater signal-to-noise ratio ("SNR") of speech to background noise than other channels.
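A minimal sketch of ranking channels by speech SNR follows, under the assumption that a noise-only reference segment is available per channel; the estimate and names are illustrative, not the patent's method.

```python
# Illustrative per-channel SNR estimate for favoring cleaner microphone channels.
import numpy as np

def channel_snr_db(channel: np.ndarray, noise_segment: np.ndarray) -> float:
    signal_power = float(np.mean(channel ** 2))
    noise_power = float(np.mean(noise_segment ** 2)) + 1e-12  # avoid divide-by-zero
    return 10.0 * np.log10(signal_power / noise_power)

def rank_channels(channels, noise_segments):
    """Return channel indices ordered best-first by estimated SNR."""
    snrs = [channel_snr_db(c, n) for c, n in zip(channels, noise_segments)]
    return sorted(range(len(channels)), key=lambda i: snrs[i], reverse=True)
```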
In operation, the AEC 564 receives the detected sound SD and filters or otherwise processes the sound to suppress echoes and/or to otherwise improve the quality of the detected sound SD. That processed sound may then be passed to the spatial processor 566.
The spatial processor 566 may be configured to analyze the detected sound SD and identify certain characteristics, such as a sound's amplitude (e.g., decibel level), frequency spectrum, directionality, etc. In one respect, the spatial processor 566 may help filter or suppress ambient noise in the detected sound SD from potential user speech based on similarities and differences in the constituent channels 562 of the detected sound SD, as discussed above. As one possibility, the spatial processor 566 may monitor metrics that distinguish speech from other sounds. Such metrics can include, for example, energy within the speech band relative to background noise and entropy within the speech band (a measure of spectral structure), which is typically lower in speech than in most common background noise. In some implementations, the spatial processor 566 may be configured to determine a speech presence probability; examples of such functionality are disclosed in U.S. Patent Application No. 15/984,073, filed May 18, 2018, titled "Linear Filtering for Noise-Suppressed Speech Detection".
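The two metrics named above can be sketched as follows: speech-band energy relative to a background-noise estimate, and spectral entropy over the speech band, which tends to be lower for structured speech than for broadband noise. The band limits, fusion rule, and constants below are assumptions for illustration, not the cited application's algorithm.

```python
# Hedged sketch of speech-presence metrics; constants are illustrative.
import numpy as np

def speech_metrics(frame: np.ndarray, sample_rate: int, noise_energy: float):
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    band = (freqs >= 300.0) & (freqs <= 3400.0)        # rough speech band
    band_energy = float(spectrum[band].sum())
    p = spectrum[band] / (band_energy + 1e-12)          # normalized band spectrum
    entropy = float(-(p * np.log2(p + 1e-12)).sum())    # spectral entropy (bits)
    return band_energy / (noise_energy + 1e-12), entropy

def speech_presence_probability(energy_ratio: float, entropy: float) -> float:
    # Toy fusion: high band energy and low entropy both raise the probability.
    score = 0.5 * min(energy_ratio / 10.0, 1.0) + 0.5 * max(0.0, 1.0 - entropy / 8.0)
    return max(0.0, min(1.0, score))
```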
More specifically, the one or more buffers 568 capture detected-sound data that was processed by the upstream AEC 564 and spatial processor 566. The detected sound and/or any associated data may be referred to as a "sound specimen" when retained in at least one buffer 568. A sound specimen may comprise, for example, (a) audio data or (b) audio data and metadata regarding the audio data. As an example, a first buffer may temporarily retain audio samples used for streaming audio data, as described below. A second buffer may temporarily retain metadata (e.g., spectral data, sound pressure level, etc.) regarding the current audio samples in the first buffer, a certain number of audio samples captured prior to the current audio samples, and/or a certain number of audio samples captured after the current audio samples. In some implementations, this type of second buffer may be referred to as a look-back buffer. Additional details describing buffers, including look-back buffers, and configurations of buffers with voice processors (e.g., spatial processors) may be found in, for example, U.S. Patent Application No.
15/989,715, filed May 25, 2018, titled "Determining and Adapting to Changes in Microphone Performance of Playback Devices," U.S. Patent Application No. 16/138,111, filed September 21, 2018, titled "Voice Detection Optimization Using Sound Metadata," and U.S. Patent Application No.
16/141,875, filed September 25, 2018, titled "Voice Detection Optimization Based on Selected Voice Assistant Service".
The sound-data stream SDS may be composed of frames, each of which may include one or more sound samples. The frames may be streamed (i.e., read out) from the one or more buffers 568 for further processing by downstream components, such as the wake-word engine 570 and the voice extractor 572 of the NMD 503.
Thus, in some embodiments, a frame may include a portion of sound (e.g., one or more samples of a given sound specimen) and metadata regarding the portion of sound. In other embodiments, a frame may only include a portion of sound (e.g., one or more samples of a given sound specimen) or metadata regarding a portion of sound.
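A look-back buffer of this kind can be sketched with two bounded queues, one for frames and one for their metadata. The capacity and metadata fields below are assumptions.

```python
# Illustrative look-back buffer retaining recent frames plus per-frame metadata.
from collections import deque

class LookBackBuffer:
    def __init__(self, capacity_frames: int = 50):
        self.frames = deque(maxlen=capacity_frames)    # audio samples
        self.metadata = deque(maxlen=capacity_frames)  # e.g., SPL, spectral data

    def push(self, samples, spl_db: float) -> None:
        self.frames.append(samples)
        self.metadata.append({"spl_db": spl_db})

    def stream(self):
        """Read frames out oldest-first for downstream components such as a
        wake-word engine or voice extractor."""
        while self.frames:
            yield self.frames.popleft(), self.metadata.popleft()
```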
The wake-word engine 570 is configured to monitor and analyze received audio (e.g., streamed sound frames) to spot potential wake words in the detected sound SD. Many first- and third-party wake word detection algorithms are known and commercially available. Different voice services (e.g.
AMAZON's ALEXA, APPLE's SIRI, MICROSOFT's CORTANA, GOOGLE's ASSISTANT, etc.), for example, each use a different wake word for invoking their respective voice service, and some voice services make their algorithms available for use in third-party devices. In some embodiments, the wake-word engine 570 is configured to run multiple wake word detection algorithms on the received audio simultaneously (or substantially simultaneously). To support multiple voice services, the wake-word engine 570 may run the received sound-data stream SDS
through the wake word detection algorithm for each supported voice service in parallel. In such embodiments, the NMD 503 may include VAS selector components (not shown) configured to pass voice input to the appropriate voice assistant service. In other embodiments, the VAS selector components may be omitted, such as when each of the NMD's wake-word engine(s) are dedicated to the same VAS.
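Running one detection algorithm per supported voice service over the same stream can be sketched with a thread pool; the detector callables stand in for vendor-supplied algorithms, and the dict-based interface is an assumption.

```python
# Hedged sketch: run several wake-word detectors on the same frame in parallel.
from concurrent.futures import ThreadPoolExecutor

def detect_any_wake_word(frame, detectors):
    """detectors maps a VAS name to a wake-word algorithm: callable(frame) -> bool.
    Returns the name of a VAS whose detector fired, or None."""
    with ThreadPoolExecutor(max_workers=max(1, len(detectors))) as pool:
        futures = {name: pool.submit(fn, frame) for name, fn in detectors.items()}
        for name, future in futures.items():  # detectors ran (near-)simultaneously
            if future.result():
                return name
    return None
```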
More specifically, in response to the wake-word event (e.g., in response to a signal Sw from the wake-word engine 570 indicating the wake-word event), the voice extractor 572 is configured to receive and format (e.g., packetize) the sound-data stream SDS. For instance, the voice extractor 572 may packetize the frames of the sound-data stream SDS into messages, Mv, for relaying the sound-data to a VAS over a network. In the example shown in Figure 5, the voice extractor 572 transmits or streams these messages in real time or near real time, to one or more remote computing devices associated with a VAS, such as the VAS 190 (Figure 1B), via the network interface 224.
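The packetizing step can be illustrated as wrapping each frame in a sequenced message; the JSON layout and field names here are assumptions, not the patent's wire format.

```python
# Illustrative packetizer turning sound frames into sequenced VAS messages.
import base64
import json
import time

def packetize(frames, device_id: str):
    """Yield one JSON message per frame of the sound-data stream; `frames`
    is an iterable of raw byte strings."""
    for seq, frame in enumerate(frames):
        yield json.dumps({
            "device_id": device_id,
            "seq": seq,                     # ordering for reassembly at the VAS
            "timestamp": time.time(),
            "audio": base64.b64encode(frame).decode("ascii"),
        })
```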
Upon receiving the messages Mv sent from the NMD 503, the one or more remote computing devices associated with the VAS process the voice input contained in those messages. More specifically, the VAS is configured to identify any voice input based on the sound-data stream SDS and/or data derived from the sound-data stream SDS. Referring to Figure 6A, a voice input 680 may include a wake-word portion 680a and an utterance portion 680b. The wake-word portion 680a corresponds to detected sound that caused the wake-word event. For instance, the wake-word portion 680a corresponds to detected sound that caused the wake-word engine 570 to provide an indication of a wake-word event to the voice extractor 572. The utterance portion 680b corresponds to detected sound that potentially comprises a user request following the wake-word portion 680a.
when the word "Alexa" is the target wake word). In such an occurrence, the VAS
may send a response to the NMD 503 (Figure 5) with an indication for the NMD 503 to cease extraction of sound data, which may cause the voice extractor 572 to cease further streaming of the detected-sound data to the VAS. The wake-word engine 570 may resume or continue monitoring sound specimens until it spots another potential wake word, leading to another wake-word event. In some implementations, the VAS may not process or receive the wake-word portion 680a but instead processes only the utterance portion 680b.
The words of the utterance portion 680b may correspond to a certain command and certain keywords 684 (identified individually in Figure 6A as a first keyword 684a and a second keyword 684b).
A keyword may be, for example, a word in the voice input 680 identifying a particular device or group in the MPS
100. For instance, in the illustrated example, the keywords 684 may be one or more words identifying one or more zones in which the music is to be played, such as the Living Room and the Dining Room (Figure 1A).
Command criteria may be based on the inclusion of certain keywords within the voice input, among other possibilities. Additionally, or alternatively, command criteria for commands may involve identification of one or more control-state and/or zone-state variables in conjunction with identification of one or more particular commands. Control-state variables may include, for example, indicators identifying a level of volume, a queue associated with one or more devices, and playback state, such as whether devices are playing a queue, paused, etc.
Zone-state variables may include, for example, indicators identifying which, if any, zone players are grouped.
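Command criteria of this sort can be sketched as a predicate combining keyword inclusion with control-state and zone-state checks; the field names and criteria keys below are hypothetical.

```python
# Hedged sketch of evaluating command criteria against system state.

def command_satisfied(voice_words, required_keywords, state, criteria):
    words = {w.lower() for w in voice_words}
    if not all(k.lower() in words for k in required_keywords):
        return False
    if criteria.get("must_be_playing") and state.get("playback") != "playing":
        return False                      # control-state condition failed
    if criteria.get("must_be_grouped") and not state.get("grouped_zones"):
        return False                      # zone-state condition failed
    return True

# Example: "pause" only makes sense when the zone is currently playing.
state = {"playback": "playing", "grouped_zones": ["Dining Room", "Kitchen"]}
print(command_satisfied(["hey", "sonos", "pause"], ["pause"], state,
                        {"must_be_playing": True}))   # True
```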
One, some, or all of the playback components may be on-board a playback device comprising the NMD 503, or may be associated with a different playback device of MPS 100. The network interface 224 may communicate a signal Si to the audio interface 519 based on the response from the VAS, and the audio interface 519 may transmit an audio signal As to the audio-output processor 515. The audio-output processor 515, for example, may comprise one or more of the audio processing components 216 discussed above with reference to Figure 2A.
Finally, the audio-output processor 515 transmits the processed audio signal Ap to the speakers 218 of a playback device for playback. The audio-output processor 515 may also transmit one or more reference signals REF to the AEC 564 based on the processed audio signal Ap to suppress echoed audio components from the audio content played back by a playback device that may otherwise be present in detected sound SD.
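The role of the reference signal REF can be illustrated with a deliberately simplified canceller: subtract a scaled copy of the playback signal from the microphone signal. A real AEC adapts a multi-tap filter rather than a single gain; this sketch only shows the data flow.

```python
# Simplified illustration of echo suppression using a playback reference signal.
import numpy as np

def cancel_echo(detected: np.ndarray, reference: np.ndarray, echo_gain: float) -> np.ndarray:
    """Subtract a scaled playback reference from the detected sound. A
    production AEC would adapt a multi-tap filter instead of one gain."""
    n = min(len(detected), len(reference))
    return detected[:n] - echo_gain * reference[:n]
```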
"turn on," etc.) and/or certain keywords or phrases, such as the unique name assigned to a given playback device (e.g., "Bookcase," "Patio," "Office," etc.). In response to identifying one or more of these commands, keywords, and/or phrases, the NMD 503 may communicate a signal (not shown in Figure 5) that causes the audio processing components 216 (Figure 2A) to perform one or more actions. For instance, when a user says "Hey Sonos, stop the music in the office," the NMD 503 may communicate a signal to the office playback device 102n, either directly, or indirectly via one or more other devices of the MPS 100, which causes the office device 102n to stop audio playback. Reducing or eliminating the need for assistance from a remote VAS may reduce latency that might otherwise occur when processing voice input remotely. In some cases, the identification algorithms employed may be configured to identify commands that are spoken without a preceding wake word. For instance, in the example above, the NMD 503 may employ an identification algorithm that triggers an event to stop the music in the office without the user first saying "Hey Sonos" or another wake word.
III. Example Systems and Methods for Distributed Voice Processing
The playback devices 702 further include voice processing components that may be similar to some or all of the voice processing components of the NMD 503 described above with reference to Figure 5. For example, the first and second playback devices 702a and 702b include respective first and second voice processors 760a and 760b (collectively "voice processors 760"), first and second wake word engines 770a and 770b (collectively "wake word engines 770") associated with respective first and second VASes 790a and 790b. The first and second playback devices 702a and 702b further include respective first and second network interfaces 724a and 724b (collectively "network interfaces") configured to communicate with one another over local and/or wide area networks.
The first and second network interfaces 724a and 724b may also be configured to communicate with other computing devices of the MPS 100 and/or one or more remote servers (such as those associated with a VAS) over local and/or wide area networks.
The first wake-word engine 770a may be configured to detect a first wake word specific to the first VAS 790a.
For example, the first wake word engine 770a may be associated with AMAZON's ALEXA and be configured to run a corresponding wake word detection algorithm (e.g., configured to detect the wake word "Alexa" or other associated wake word). The first wake word engine 770a may be configured to detect only wake words associated with the first VAS 790a (such as the first wake word) and may be incapable of detecting wake words associated with a different VAS (such as a second VAS
790b, described below).
The second wake word engine 770b may be associated with a second VAS, such as GOOGLE's ASSISTANT, and configured to run a corresponding wake word detection algorithm (e.g., configured to detect the wake word "OK, Google" or other associated wake word). Thus, in some aspects of the technology, the first and second wake word engines 770a and 770b are configured to detect different wake words associated with different VASes.
"mic/ch. 2," etc.). In other embodiments, the first playback device 702a may have more or fewer than six microphones or channels. The sound detected by the microphones 722 may be processed by the first voice processor 760a and fed to the first wake-word engine 770a and the first network interface 724a. In the example depicted in Figure 7A, the first voice processor 760a transmits the processed detected sound from microphones 1-6 to the first wake word engine 770a, and transmits Date Recue/Date Received 2021-11-17 the processed detected sound from microphones 5 and 6 to the first network interface 724a (for subsequent transmission to the second playback device 702b, detailed below).
Processing the data to be transmitted may include compressing the data prior to transmission. In some implementations, it may be beneficial to perform acoustic echo cancellation (via the first AEC
764a) with the reference signal(s) before transmitting the detected sound, to reduce bandwidth. In some embodiments, the second AEC 764b may be bypassed or omitted from the second voice processor 760b in configurations in which acoustic echo cancellation is applied to sound data to be transmitted from the first playback device 702a to the second playback device 702b. In additional or alternate embodiments, spatial processing may be carried out on the data to be transmitted to the second playback device 702b, in which case the second spatial processor 766b may be bypassed or omitted from the second voice processor 760b.
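The bypass logic described above amounts to configuring the receiving device's pipeline based on what the sending device has already applied. A small sketch, with hypothetical stage names:

```python
# Illustrative pipeline selection on the receiving (second) playback device.

def build_receiving_pipeline(sender_applied_aec: bool, sender_applied_spatial: bool):
    stages = []
    if not sender_applied_aec:
        stages.append("aec")        # otherwise the second AEC is bypassed
    if not sender_applied_spatial:
        stages.append("spatial")    # otherwise the second spatial processor is bypassed
    stages += ["buffer", "wake_word_engine"]
    return stages

print(build_receiving_pipeline(True, False))
# ['spatial', 'buffer', 'wake_word_engine']
```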
The first voice processor 760a passes the detected sound data from microphones 1-6 to the first wake word engine 770a, and passes the detected sound data from microphones 5 and 6 to the first network interface 724a for transmission to the second playback device 702b via the second network interface 724b. The second network interface 724b feeds the detected sound data to the second voice processor 760b, which may apply one or more voice processing techniques before sending the data to the second wake word engine 770b for detection of the second wake word. Because the command includes the first wake word, the first wake word engine 770a triggers the voice extractor (for example, the voice extractor 572 in Figure 5) to stream messages (e.g., messages containing packetized frames of the detected sound) to the first VAS 790a via the first network interface 724a. As the command does not include the second wake word, the second wake word engine 770b does not trigger voice extraction to the second VAS 790b. The first VAS 790a processes the packetized voice data and sends a response to the first network interface 724a with instructions for the first playback device 702a to perform the action requested by the user, i.e., to play back music by the Beatles. The first VAS 790a may also send the first playback device 702a a voice response for playback by the first playback device 702a to acknowledge to the user that the MPS 100 and/or first VAS 790a has processed the user's request.
As shown in Figure 7B, in such a scenario the second wake word engine 770b detects the second wake word in the detected sound and triggers the voice extractor (such as the voice extractor 572 in Figure 5), which may then extract sound data (e.g., packetizing frames of the detected sound into messages).
In the example shown in Figure 7B, the voice extractor extracts sound data to one or more remote computing devices associated with the second VAS 790b (e.g., via second network interface 724b). The remote computing device(s) associated with the second VAS
790b are configured to process the sound data associated with the detected sound and send a response to the second playback device 702b (e.g., via the second network interface 724b) that may include instructions for the first playback device 702a, the second playback device 702b, and/or another playback device(s) of the MPS 100 to perform an action or series of actions (or, in some instances, do nothing). For the example command provided in Figure 7B ("play the Beatles"), the second VAS 790b sends a message to the second playback device 702b with instructions for the first playback device 702a to play music by the Beatles. The second playback device 702b may then forward the instructions to the first playback device 702a, and the first playback device 702a performs the action. The second VAS 790b may also send the second playback device 702b a voice response for playback by the first playback device 702a to acknowledge to the user that the MPS 100 and/or second VAS 790b has processed the user's request. As shown in Figure 7B, the first playback device 702a may then play back the voice response ("okay").
In some embodiments, the sound data transmitted to the second VAS 790b may include a tag T such that, when the second VAS 790b sends the response(s) 784, 785 containing the instructions for responding to the user's request, the instructions are identified to the second playback device 702b for playback by the first playback device 702a. In some embodiments, the tag T is only meaningful to the second playback device 702b and the second VAS 790b passively includes the tag in the responses without being aware of its function or implication. In other embodiments, the tag T also indicates to the second VAS
790b that the first playback device 702a will be performing the requested action (or at least that the second playback device 702b is not performing the requested action).
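The tag T behavior can be sketched as follows, assuming hypothetical message shapes: the second device tags outbound voice data, the VAS echoes the tag back, and the tag tells the second device to forward the instructions to the first device.

```python
# Hedged sketch of tag-based routing of VAS responses between playback devices.

def tag_outbound(message: dict, target_device: str) -> dict:
    message["tag"] = {"play_on": target_device}   # may be opaque to the VAS
    return message

def route_response(response: dict, local_device: str):
    target = (response.get("tag") or {}).get("play_on", local_device)
    if target != local_device:
        return ("forward", target, response["instructions"])
    return ("perform", local_device, response["instructions"])

resp = {"instructions": "play the Beatles", "tag": {"play_on": "702a"}}
print(route_response(resp, local_device="702b"))
# ('forward', '702a', 'play the Beatles')
```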
For all such non-auditory actions, the second playback device 702b may receive instructions to provide an audible acknowledgment (e.g., "okay," a chime, etc.) of the command.
In some embodiments, the second playback device 702b and/or the MPS 100 may temporarily disable the first wake word engine 770a while the second VAS 790b processes a voice input, to suppress inadvertent detection of the first wake word and prevent potentially conflicting actions and/or output by the first and/or second playback devices 702a, 702b. In some embodiments, once the second VAS 790b has completed processing of the voice input, the first wake word engine 770a may be re-enabled. Likewise, in some embodiments the first playback device 702a and/or the MPS 100 may temporarily disable the second wake word engine 770b when the first wake word engine 770a detects a wake word. Additionally or alternatively, the microphones assigned to the first or second playback device 702a, 702b may be temporarily disabled when the wake word engine of the other playback device detects its respective wake word. In some embodiments, disabling a wake-word engine may include allowing the wake-word engine to continue to monitor for wake words but temporarily muting the audio input upstream from the spatial processor, such as by inserting zeroes in the digital input stream, or inserting silence at a low noise level, such that the wake-word engine is less capable, or incapable, of detecting wake words while muted.
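Muting the input upstream of a still-running wake-word engine, as described above, can be sketched in a few lines; the noise-floor option and names are illustrative.

```python
# Illustrative upstream mute: the wake-word engine keeps running but sees
# zeroed (or very quiet) input, so it effectively cannot detect wake words.
import numpy as np

def mute_input(frame: np.ndarray, muted: bool, noise_floor: float = 0.0) -> np.ndarray:
    if not muted:
        return frame
    if noise_floor == 0.0:
        return np.zeros_like(frame)                        # insert zeroes
    rng = np.random.default_rng(0)
    return noise_floor * rng.standard_normal(frame.shape)  # low-level silence
```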
In some aspects, certain ones of the microphones 722 are assigned exclusively to the first playback device 702a (for example, by one or both of the playback devices 702, the MPS
100, and/or another playback device of the MPS 100), and certain ones of the microphones 722 are assigned exclusively to the second playback device 702b. In such embodiments, the first and second subsets of microphones have no microphones in common. In other embodiments, the first and second subsets of microphones may have at least one microphone in common.
The microphone selector, for example, may utilize a lookback buffer to provide feedback to one or more remote computing devices of the MPS 100 for determining if, when, and/or which of the microphones 722 of the first playback device 702a can be shared with, or assigned for exclusive use to, the second playback device 702b. Additional details regarding microphone selection and/or aggregation across multiple playback devices may be found, for example, in previously mentioned U.S. Patent Application Nos. 15/989,715; 16/138,111; and 16/141,875.
However, in Figure 7E, the first playback device 702a sends the second playback device 702b reference data from the first AEC 764a (represented by an arrow in Figure 7E), as well as the raw mic data from designated ones of the microphones (e.g., microphones 5 and 6, represented by arrows I(g) and I(h)). In such embodiments, the second voice processor 760b may include a second AEC 764b and a second spatial processor 766b in addition to the second buffer 768b. The second AEC 764b and the second spatial processor 766b may have generally similar components and functions to the respective first AEC 764a and first spatial processor 766a. The second voice processor 760b may be configured to receive and process the reference data and detected sound data before sending the detected sound data to the second wake word engine 770b for detection of the second wake word.
In some embodiments, the first playback device 702a may communicate with the first VAS 790a while the second playback device 702b is disabled or otherwise prevented from communicating with the second VAS 790b.
Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed based upon a desired implementation.
In some embodiments, the first playback device 702a may have limited processing resources (e.g., available system memory, power constraints, etc.) relative to the second playback device 702b.
As such, a playback device without sufficient resources to run microphone DSP, a wakeword engine, and an additional NLU/event-detection engine may offload the NLU/event-detection engine to another playback device. As an example, the first playback device 702a may be a portable playback device, such as a set of wireless headphones. In related embodiments, the second wake-word engine 770b may be able to detect wake words more accurately than the first wake-word engine 770a. In such instances, the second wake-word engine 770b may intervene if the first wake-word engine 770a fails to detect a certain wake word and/or if the first wake-word engine 770a is triggered by a wake word that the second wake-word engine 770b determines to be a false positive.
Examples
Example 8: A system comprising a first playback device and a second playback device. The first playback device may comprise one or more processors, a microphone array, and a first computer-readable medium storing instructions that, when executed by the one or more processors, cause the first playback device to perform first operations, the first operations comprising:
detecting sound via the microphone array; analyzing, via a first wake-word engine of the first playback device, the detected sound; and transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network. The second playback device may comprise one or more processors and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the second playback device to perform second operations, the second operations comprising: analyzing, via a second wake-word engine of the second playback device, the transmitted data associated with the detected sound; identifying that the detected sound contains a second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting sound data corresponding to the detected sound over a wide area network to a remote computing device associated with a particular voice assistant service. Example 9: the system of Example 8, wherein the sound data further contains a voice utterance and the second operations further comprise receiving at least one message from the remote computing device. The message may comprise a playback command that is based on the voice utterance. In such embodiments, the first operations may further comprise playing back audio content based on the playback command.
Example 10:
the system of Example 8 or Example 9, wherein identifying the second wake word is (i) based on the transmitted data associated with the detected sound and (ii) without detecting the sound via the second playback device. Example 11: the system of any one of Examples 8 to 10, wherein the microphone array comprises a plurality of individual microphones and the first playback device comprises a voice processor configured to receive portions of the detected sound from respective ones of the individual microphones. In such embodiments, the first operations may comprise processing, via the voice processor, one or more of the portions of the detected sound to produce the data associated with the detected sound that is transmitted to the second playback device.
Example 12: the system of any one of Examples 8 to 11, wherein processing the one or more portions of the detected sound comprises processing fewer than all of the portions of the detected sound. Example 13: the system of any one of Examples 8 to 12, wherein the first operations further comprise spatially processing, via the voice processor, the detected sound based on one or more of the portions of the detected sound. In such embodiments, analyzing the detected sound via the first wake-word engine comprises analyzing the spatially processed detected sound. Example 14:
the system of any one of Examples 8 to 13, wherein the first operations further comprise playing back, via the first playback device, audio content, and producing, via the first playback device, at least one reference signal based on the audio content. In such embodiments, the data associated with the detected sound that is transmitted to the second playback device comprises data that is based on the at least one reference signal.
Example 15: A plurality of non-transitory computer-readable media storing instructions for distributed wake-word detection, including a first computer-readable storage medium and a second computer-readable storage medium. The first computer-readable medium may store instructions that, when executed by one or more processors, cause the one or more processors to perform first operations. The first operations may comprise detecting sound via the microphone array; analyzing, via a first wake-word engine of the first playback device, the detected sound; and transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network. The second computer-readable medium may store instructions that, when executed by one or more processors, cause the one or more processors to perform second operations. The second operations may comprise: analyzing, via a second wake-word engine of the second playback device, the transmitted data associated with the detected sound; identifying that the detected sound contains a second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting sound data corresponding to the detected sound over a wide area network to a remote computing device associated with a particular voice assistant service. Example 16: the plurality of non-transitory computer-readable media of Example 15, wherein the sound data further contains a voice utterance, and wherein (a) the second operations further comprise receiving at least one message from the remote computing device, wherein the message comprises a playback command, and wherein the playback command is based on the voice utterance; and (b) the first operations further comprise playing back audio content based on the playback command. Example 17: the plurality of non-transitory computer-readable media of Example 15 or Example 16, wherein identifying the second wake word is (i) based on the transmitted data associated with the detected sound and (ii) without detecting the sound via the second playback device. Example 18:
the plurality of non-transitory computer-readable media of any one of Examples 15 to 17, wherein the microphone array comprises a plurality of individual microphones, the first playback device comprises a voice processor configured to receive portions of the detected sound from respective ones of the individual microphones, and the first operations comprise processing, via the voice processor, one or more of the portions of the detected sound to produce the data associated with the detected sound that is transmitted to the second playback device. Example 19: the plurality of non-transitory computer-readable media of any one of Examples 15 to 18, wherein processing the one or more portions of the detected sound comprises processing fewer than all of the portions of the detected sound. Example 20: the plurality of non-transitory computer-readable media of any one of Examples 15 to 19, wherein the first operations may further comprise spatially processing, via the voice processor, the detected sound based on one or more of the portions of the detected sound, and wherein analyzing the detected sound via the first wake-word engine comprises analyzing the spatially processed detected sound.
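The split of first and second operations in Examples 15 to 17 can be pictured with a short, hedged sketch; the engine, transport, and service interfaces below are invented for illustration and are not the patented implementation.

```python
# Hypothetical sketch of the split detection in Examples 15-17.
def first_operations(mic_array, first_engine, lan, second_device):
    sound = mic_array.detect()           # detect sound via the microphone array
    first_engine.analyze(sound)          # local analysis for the first wake word
    lan.send(second_device, sound)       # transmit data over the local network

def second_operations(lan, second_engine, wan, vas_endpoint):
    data = lan.receive()                 # data transmitted by the first device
    result = second_engine.analyze(data) # analysis for the second wake word
    if result.contains_wake_word:
        # Per Example 17, identification rests solely on the transmitted
        # data; the second device never captures the sound itself.
        wan.send(vas_endpoint, data)     # sound data to the voice assistant
```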
transmitting a message from the second playback device to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action; and performing the action via the first playback device. Example 22: the method of Example 21, wherein the action is a first action and the method further comprises performing a second action via the second playback device, wherein the second action is based on the response from the remote computing device. Example 23: the method of Example 21 or Example 22, further comprising disabling a wake word engine of the first playback device in response to the identification of the wake word via the wake word engine of the second playback device. Example 24: the method of any one of Examples 21 to 23, further comprising enabling a wake word engine of the first playback device after the second playback device receives the response from the remote computing device. Example 25: the method of Example 24, wherein the wake word is a second wake word, and the wake word engine of the first playback device is configured to detect a first wake word that is different than the second wake word. Example 26:
the method of any one of Examples 21 to 25, wherein the first playback device is configured to communicate with the remote computing device associated with the particular voice assistant service. Example 27: the method of any one of Examples 21 to 26, wherein the remote computing device is a first remote computing device and the voice assistant service is a first voice assistant service, and the first playback device is configured to detect a wake word associated with a second voice assistant service different than the first voice assistant service.
receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the second playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions for the second playback device to perform an action. Example 29: the first playback device of Example 28, wherein the action is a first action and the operations further comprise performing a second action via the first playback device, wherein the second action is based on the response from the remote computing device. Example 30: the first playback device of Example 28 or Example 29, wherein the operations may comprise disabling a wake word engine of the second playback device in response to the identification of the wake word via the wake word engine of the first playback device. Example 31: the first playback device of any one of Examples 28 to 30, wherein the operations of the first playback device may comprise enabling the wake word engine of the second playback device after the first playback device receives the response from the remote computing device. Example 32: the first playback device of any one of Examples 28 to 31, wherein the wake word is a first wake word and the wake word engine of the second playback device is configured to detect a second wake word that is different than the first wake word. Example 33:
the first playback device of any one of Examples 28 to 32, wherein the first playback device is configured to communicate with the remote computing device associated with the particular voice assistant service. Example 34: the first playback device of any one of Examples 28 to 33, wherein the remote computing device is a first remote computing device and the voice assistant service is a first voice assistant service. In such embodiments, the second playback device may be configured to detect a wake word associated with a second voice assistant service different than the first voice assistant service.
Example 35: A system comprising a first playback device and a second playback device. The first playback device may comprise one or more processors, a microphone array, and a first computer-readable medium storing instructions that, when executed by the one or more processors, cause the first playback device to perform first operations. The first operations may comprise: detecting sound via the microphone array; and transmitting data associated with the detected sound to the second playback device over a local area network. The second playback device may comprise one or more processors and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the second playback device to perform second operations. The second operations may comprise analyzing, via a wake word engine of the second playback device, the transmitted data associated with the detected sound from the first playback device for identification of a wake word; identifying that the detected sound contains the wake word based on the analysis via the wake word engine; based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service; receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action. The first computer-readable medium of the first playback device may cause the first playback device to perform the action from the instructions received from the second playback device. Example 36: the system of Example 35, wherein the action is a first action and the second operations further comprise performing a second action via the second playback device, wherein the second action is based on the response from the remote computing device. Example 37: the system of Example 35 or Example 36, wherein the second operations may further comprise disabling a wake word engine of the first playback device in response to the identification of the wake word via the wake word engine of the second playback device. Example 38: the system of any one of Examples 35 to 37, wherein the second operations may further comprise enabling the wake word engine of the first playback device after the second playback device receives the response from the remote computing device.
Example 39: the system of any one of Examples 35 to 38, wherein the first playback device may be configured to communicate with the remote computing device associated with the particular voice assistant service. Example 40: the system of any one of Examples 35 to 39, wherein the remote computing device is a first remote computing device and the voice assistant service is a first voice assistant service, and wherein the first playback device is configured to detect a wake word associated with a second voice assistant service different than the first voice assistant service.
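The engine gating of Examples 37 and 38 (and the parallel Examples 23, 24, 30, and 31) might look like the following minimal sketch, assuming hypothetical disable, enable, and delivery helpers; it is an illustration of the described behavior, not the patented implementation.

```python
# Once the second device identifies the wake word, it disables the first
# device's wake-word engine, and re-enables it after the service responds.
def handle_wake_word(first_device, vas, sound_data):
    first_device.wake_word_engine.disable()      # avoid duplicate triggers
    try:
        response = vas.process(sound_data)       # WAN round trip to the VAS
        first_device.deliver(response)           # e.g., a playback command
    finally:
        first_device.wake_word_engine.enable()   # resume normal listening
```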
Conclusion
The description above discloses, among other things, various example systems, methods, apparatus, and articles of manufacture including, among other components, firmware and/or software executed on hardware. It is understood that such examples are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the firmware, hardware, and/or software aspects or components can be embodied exclusively in hardware, exclusively in software, exclusively in firmware, or in any combination of hardware, software, and/or firmware. Accordingly, the examples provided are not the only way(s) to implement such systems, methods, apparatus, and/or articles of manufacture.
For example, a first playback device may be merged with a second playback device to form a single merged "device." The merged playback devices may not be specifically assigned different playback responsibilities. That is, aside from playing audio content in synchrony, the merged playback devices may each play audio content as they would if they were not merged. However, the merged devices may present to the media playback system and/or to the user as a single user interface (UI) entity for control.
Examples
analyzing, via a first wake-word engine of the first playback device, the detected sound; and transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network; the second playback device comprising: one or more processors; and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the second playback device to perform second operations, the second operations comprising: analyzing, via a second wake-word engine of the second playback device, the transmitted data associated with the detected sound; identifying that the detected sound contains a second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting sound data corresponding to the detected sound over a wide area network to a remote computing device associated with a particular voice assistant service.
detecting sound via the microphone array; analyzing, via a first wake-word engine of the first playback device, the detected sound; and transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network; a second computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform second operations, the second operations comprising:
analyzing, via a second wake-word engine of the second playback device, the transmitted data associated with the detected sound; identifying that the detected sound contains a second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting sound data corresponding to the detected sound over a wide area network to a remote computing device associated with a particular voice assistant service.
the second operations further comprise receiving at least one message from the remote computing device, wherein the message comprises a playback command, and wherein the playback command is based on the voice utterance; and the first operations further comprise playing back audio content based on the playback command.
transmitting a message from the second playback device to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action; and performing the action via the first playback device.
receiving, from a second playback device over a local area network, data associated with sound detected via a microphone array of the second playback device; analyzing, via a wake word engine of the first playback device, the data associated with the detected sound for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine; based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service; receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the second playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions for the second playback device to perform an action.
SUMMARY
detecting sound via a microphone array of a first playback device;
analyzing, via a first wake-word engine of the first playback device, the detected sound, wherein the first wake-word engine is configured to detect a first wake word;
transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network;
analyzing, via a second wake-word engine of the second playback device, the transmitted data associated with the detected sound, wherein the second wake-word engine is configured to detect a second wake word that is different from the first wake word;
when one of the first and second wake-word engines determines that the detected sound contains one of the first and second wake words based on the analysis by the first and second wake-word engines of the first and second playback devices, respectively, transmitting data corresponding to the detected sound over a wide area network to a remote computing device associated with a particular voice assistant service associated with the identified wake word.
receiving, from a first playback device by a second playback device over a local area network, data associated with sound detected via a microphone array of the first playback device, wherein the first playback device comprises a first wake-word engine configured to detect a first wake word;
analyzing, via a second wake-word engine of the second playback device, the data associated with the detected sound for identification of a second wake word that is different from the first wake word;
identifying, by the second playback device, that the detected sound contains the second wake word based on the analysis via the second wake-word engine;
based on the identification, transmitting, by the second playback device to a remote computing device over a wide area network, data corresponding to the detected sound, wherein the remote computing device is associated with a particular voice assistant service;
receiving, by the second playback device from the remote computing device, a response based on the detected sound; and transmitting, by the second playback device to the first playback device over the local area network, a message based on the response from the remote computing device that includes instructions for the first playback device to perform an action.
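One way to picture the LAN message of the preceding aspect is the sketch below; the JSON field names are assumptions for illustration, not taken from the disclosure.

```python
import json

# Hypothetical shape for a message based on the voice assistant service's
# response that carries instructions to perform an action.
def build_action_message(vas_response):
    message = {
        "based_on": vas_response.get("id"),     # ties the message to the response
        "action": vas_response.get("command"),  # e.g., "play" or "pause"
        "args": vas_response.get("args", {}),   # e.g., {"uri": "..."}
    }
    return json.dumps(message).encode("utf-8")  # sent over the local network
```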
one or more processors;
a computer-readable medium storing instructions that, when executed by the one or more processors, cause the playback device to perform the method of any one of the preceding aspects.
a first playback device according to any one of the preceding aspects; and a second playback device;
wherein the system is configured to perform the method defined in any one of the preceding aspects.
detecting sound via a microphone array of a first playback device;
transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network; analyzing, via a wake word engine of the second playback device, the transmitted data associated with the detected sound for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine; based on the identification, transmitting sound data corresponding to the detected sound from the second Date Recue/Date Received 2021-11-17 playback device to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service;
receiving via the second playback device a response from the remote computing device, wherein the response is based on the detected sound; transmitting a message from the second playback device to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action;
and performing the action via the first playback device.
one or more processors; a computer-readable medium storing processor-executable instructions that, when executed by the one or more processors, cause the first playback device to perform operations comprising:
receiving, from a second playback device over a local area network, data associated with sound detected via a microphone array of the second playback device;
analyzing, via a wake word engine of the first playback device, the data associated with the detected sound for identification of a wake word; identifying that the detected sound contains the wake word based on the analysis via the wake word engine; based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service; receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the second playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions for the second playback device to perform an action.
a first playback device comprising:
one or more processors; a microphone array; and a first computer-readable medium storing first processor-executable instructions that, when executed by the one or more processors, cause the first playback device to perform first operations, the first operations comprising:
detecting sound via the microphone array;
transmitting data associated with the detected sound to a second playback device over a local area network; the second playback device comprising:
one or more processors; and a second computer-readable medium storing second processor-executable instructions that, when executed by the one or more processors, cause the second playback device to perform second operations, the second operations comprising:
analyzing, via a wake word engine of the second playback device, the transmitted data associated with the detected sound from the first playback device for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine; based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service;
receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action, wherein the first computer-readable medium of the first playback device causes the first playback device to perform the action from the instructions received from the second playback device.
detecting sound via a microphone array of a first playback device to obtain sound data corresponding to the detected sound;
analyzing, via a first wake-word engine of the first playback device, the sound data, the first wake-word engine configured to detect a first wake word;
transmitting the sound data from the first playback device to a second playback device over a local area network;
analyzing, via a second wake-word engine of the second playback device, the transmitted sound data, the second wake-word engine being different from the first wake-word engine and being configured to detect a second wake word different from the first wake word;
identifying that the detected sound contains either (i) the first wake word based on the analysis via the first wake-word engine or (ii) the second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting the sound data over a wide area network to a remote computing device associated with a particular voice assistant service.
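The either-engine identification above amounts to routing the sound data to whichever voice assistant service is associated with the identified wake word. A hedged sketch, with invented engine and endpoint registries:

```python
# Hypothetical arbitration: each wake word maps to the engine configured to
# detect it and to the endpoint of its associated voice assistant service.
def identify_and_route(sound_data, engines, vas_endpoints, wan):
    # engines: e.g., {"wake_word_a": first_engine, "wake_word_b": second_engine}
    for wake_word, engine in engines.items():
        if engine.detects(sound_data):
            # Transmit to the remote computing device associated with the
            # voice assistant service for the identified wake word.
            wan.send(vas_endpoints[wake_word], sound_data)
            return wake_word
    return None  # neither wake word identified; nothing leaves the LAN
```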
a first playback device comprising:
one or more processors; a microphone array; and a first computer-readable medium storing instructions that, when executed by the one or more processors, cause the first playback device to perform first operations, the first operations comprising:
detecting sound via the microphone array to obtain sound data corresponding to the detected sound; analyzing, via a first wake-word engine of the first playback device, the sound data, the first wake-word engine configured to detect a first wake word; and transmitting the sound data from the first playback device to a second playback device over a local area network; the second playback device comprising:
one or more processors; and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the second playback device to perform second operations, the second operations comprising:
analyzing, via a second wake-word engine of the second playback device, the transmitted sound data, the second wake-word engine being different from the first wake-word engine and being configured to detect a second wake word different from the first wake word;
identifying that the detected sound contains the second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting the sound data over a wide area network to a remote computing device associated with a particular voice assistant service.
According to another aspect of the invention, there is provided a plurality of non-transitory computer-readable media storing instructions for distributed wake-word detection, comprising: a first computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform first operations, the first operations comprising:
detecting sound via a microphone array of the first playback device to obtain sound data corresponding to the detected sound; analyzing, via a first wake-word engine of the first playback device, the sound data, the first wake-word engine configured to detect a first wake word; and transmitting the sound data from the first playback device to a second playback device over a local area network; a second computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform second operations, the second operations comprising:
analyzing, via a second wake-word engine of the second playback device, the transmitted sound data, the second wake-word engine being different from the first wake-word engine and being configured to detect a second wake word different from the first wake word;
identifying that the detected sound contains the second wake word based on the analysis via the second wake-word engine; and based on the identification, transmitting the sound data over a wide area network to a remote computing device associated with a particular voice assistant service.
a network microphone device (NMD) and a playback device, the NMD comprising:
one or more processors; one or more microphones; and a first computer-readable medium storing instructions that, when executed by the one or more processors, cause the NMD
to perform first operations, the first operations comprising:
detecting sound via the one or more microphones; transmitting data associated with the detected sound to the playback device over a local area network; the playback device comprising: one or more processors; and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the playback device to perform second operations, the second operations comprising:
identifying, via a wake word engine of the playback device, a wake word based on the transmitted data associated with the detected sound from the NMD; based on the identification, transmitting sound data corresponding to the detected sound to one or more remote computing devices over a wide area network; after the transmitting, receiving a response from the one or more remote computing devices; and after receiving the response, transmitting a message to the NMD over the local area network, wherein the message includes instructions to perform an action, wherein the first computer-readable medium of the NMD causes the NMD to perform the action.
detecting sound via one or more microphones of a network microphone device (NMD);
transmitting data associated with the detected sound from the NMD to a playback device over a local area network; identifying, via a wake word engine of the playback device, a wake word based on the transmitted data associated with the detected sound; based on the identification, transmitting sound data corresponding to the detected sound from the playback device to one or more remote computing devices over a wide area network; after the transmitting, receiving, via the playback device, a response from the one or more remote computing devices;
after receiving the response, transmitting a message from the playback device to the NMD over the local area network, wherein the message includes instructions to perform an action; and performing the action via the NMD.
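As a rough sketch of this NMD variant, with all interfaces invented for illustration: the NMD only captures and forwards sound, while the playback device hosts the wake-word engine and the wide-area connection.

```python
# Hypothetical per-iteration steps for each device in the NMD variant.
def nmd_step(microphones, lan, playback_device, perform_action):
    frame = microphones.capture()                  # detect sound locally
    lan.send(playback_device, frame)               # LAN hop to the engine
    message = lan.poll()                           # check for instructions
    if message is not None and "action" in message:
        perform_action(message["action"])          # e.g., chime or duck audio

def playback_device_step(lan, engine, wan, remote, nmd):
    data = lan.receive()                           # sound data from the NMD
    if engine.identify(data):                      # wake word identified
        wan.send(remote, data)                     # WAN hop to the cloud
        response = wan.receive()                   # response from the cloud
        lan.send(nmd, {"action": response})        # instruct the NMD
```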
According to another aspect of the invention, there is provided a playback device comprising:
one or more processors; and a computer-readable medium storing instructions that, when executed by the one or more processors, cause the playback device to perform operations comprising:
receiving, from a network microphone device (NMD) over a local area network, data associated with sound detected via one or more microphones of the NMD; identifying, via a wake word engine of the playback device, a wake word based on the data associated with the detected sound;
after the identification, transmitting sound data corresponding to the detected sound to one or more remote computing devices over a wide area network; after the transmitting, receiving a response from the one or more remote computing devices; and after receiving the response, transmitting a message to the NMD over the local area network, wherein the message includes instructions for the playback device to perform an action.
Claims (41)
detecting sound via a microphone array of a first playback device;
transmitting data associated with the detected sound from the first playback device to a second playback device over a local area network;
analyzing, via a wake word engine of the second playback device, the transmitted data associated with the detected sound for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine;
based on the identification, transmitting sound data corresponding to the detected sound from the second playback device to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service;
receiving via the second playback device a response from the remote computing device, wherein the response is based on the detected sound;
transmitting a message from the second playback device to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action; and performing the action via the first playback device.
one or more processors;
a computer-readable medium storing processor-executable instructions that, when executed by the one or more processors, cause the first playback device to perform operations comprising:
receiving, from a second playback device over a local area network, data associated with sound detected via a microphone array of the second playback device;
analyzing, via a wake word engine of the first playback device, the data associated with the detected sound for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine;
based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service;
receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the second playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions for the second playback device to perform an action.
a first playback device comprising:
one or more processors;
a microphone array; and a first computer-readable medium storing first processor-executable instructions that, when executed by the one or more processors, cause the first playback device to perform first operations, the first operations comprising:
detecting sound via the microphone array;
transmitting data associated with the detected sound to a second playback device over a local area network;
the second playback device comprising:
one or more processors; and a second computer-readable medium storing second processor-executable instructions that, when executed by the one or more processors, cause the second playback device to perform second operations, the second operations comprising:
analyzing, via a wake word engine of the second playback device, the transmitted data associated with the detected sound from the first playback device for identification of a wake word;
identifying that the detected sound contains the wake word based on the analysis via the wake word engine;
based on the identification, transmitting sound data corresponding to the detected sound to a remote computing device over a wide area network, wherein the remote computing device is associated with a particular voice assistant service;
receiving a response from the remote computing device, wherein the response is based on the detected sound; and transmitting a message to the first playback device over the local area network, wherein the message is based on the response from the remote computing device and includes instructions to perform an action, wherein the first computer-readable medium of the first playback device causes the first playback device to perform the action from the instructions received from the second playback device.
a network microphone device (NMD) and a playback device, the NMD comprising:
one or more processors;
one or more microphones; and a first computer-readable medium storing instructions that, when executed by the one or more processors, cause the NMD to perform first operations, the first operations comprising:
detecting sound via the one or more microphones;
transmitting data associated with the detected sound to the playback device over a local area network;
the playback device comprising:
one or more processors; and a second computer-readable medium storing instructions that, when executed by the one or more processors, cause the playback device to perform second operations, the second operations comprising:
identifying, via a wake word engine of the playback device, a wake word based on the transmitted data associated with the detected sound from the NMD;
based on the identification, transmitting sound data corresponding to the detected sound to one or more remote computing devices over a wide area network;
after the transmitting, receiving a response from the one or more remote computing devices; and after receiving the response, transmitting a message to the NMD over the local area network, wherein the message includes instructions to perform an action, wherein the first computer-readable medium of the NMD causes the NMD to perform the action.
detecting sound via one or more microphones of a network microphone device (NMD);
transmitting data associated with the detected sound from the NMD to a playback device over a local area network;
identifying, via a wake word engine of the playback device, a wake word based on the transmitted data associated with the detected sound;
based on the identification, transmitting sound data corresponding to the detected sound from the playback device to one or more remote computing devices over a wide area network;
after the transmitting, receiving, via the playback device, a response from the one or more remote computing devices;
after receiving the response, transmitting a message from the playback device to the NMD over the local area network, wherein the message includes instructions to perform an action; and performing the action via the NMD.
one or more processors; and a computer-readable medium storing processor-executable instructions that, when executed by the one or more processors, cause the playback device to perform operations comprising:
receiving, from a network microphone device (NMD) over a local area network, data associated with sound detected via one or more microphones of the NMD;
identifying, via a wake word engine of the playback device, a wake word based on the data associated with the detected sound;
after the identification, transmitting sound data corresponding to the detected sound to one or more remote computing devices over a wide area network; after the transmitting, receiving a response from the one or more remote computing devices; and after receiving the response, transmitting a message to the NMD over the local area network, wherein the message includes instructions for the playback device to perform an action.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA3227238A CA3227238A1 (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/271,550 US11315556B2 (en) | 2019-02-08 | 2019-02-08 | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
| US16/271,550 | 2019-02-08 | ||
| US16/271,560 US10867604B2 (en) | 2019-02-08 | 2019-02-08 | Devices, systems, and methods for distributed voice processing |
| US16/271,560 | 2019-02-08 | ||
| PCT/US2020/017150 WO2020163679A1 (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CA3227238A Division CA3227238A1 (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CA3129236A1 CA3129236A1 (en) | 2020-08-13 |
| CA3129236C true CA3129236C (en) | 2024-04-16 |
Family
ID=69784528
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CA3227238A Pending CA3227238A1 (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
| CA3129236A Active CA3129236C (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
Family Applications Before (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CA3227238A Pending CA3227238A1 (en) | 2019-02-08 | 2020-02-07 | Devices, systems, and methods for distributed voice processing |
Country Status (6)
| Country | Link |
|---|---|
| EP (1) | EP3922047A1 (en) |
| KR (1) | KR20210125527A (en) |
| CN (1) | CN113711625B (en) |
| AU (1) | AU2020218258B2 (en) |
| CA (2) | CA3227238A1 (en) |
| WO (1) | WO2020163679A1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12322390B2 (en) * | 2021-09-30 | 2025-06-03 | Sonos, Inc. | Conflict management for wake-word detection processes |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8234395B2 (en) | 2003-07-28 | 2012-07-31 | Sonos, Inc. | System and method for synchronizing operations among a plurality of independently clocked digital data processing devices |
| US8483853B1 (en) | 2006-09-12 | 2013-07-09 | Sonos, Inc. | Controlling and manipulating groupings in a multi-zone media system |
| JP6437695B2 (en) * | 2015-09-17 | 2018-12-12 | ソノズ インコーポレイテッド | How to facilitate calibration of audio playback devices |
| CN105679310A (en) * | 2015-11-17 | 2016-06-15 | 乐视致新电子科技(天津)有限公司 | Method and system for speech recognition |
| US10134399B2 (en) * | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
| US10181323B2 (en) * | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
| US10074371B1 (en) * | 2017-03-14 | 2018-09-11 | Amazon Technologies, Inc. | Voice control of remote device by disabling wakeword detection |
| CN107526512B (en) * | 2017-08-31 | 2020-11-20 | 联想(北京)有限公司 | Switching method and system for electronic equipment |
2020
- 2020-02-07 CA CA3227238A patent/CA3227238A1/en active Pending
- 2020-02-07 CN CN202080026535.XA patent/CN113711625B/en active Active
- 2020-02-07 EP EP20710649.3A patent/EP3922047A1/en active Pending
- 2020-02-07 AU AU2020218258A patent/AU2020218258B2/en active Active
- 2020-02-07 CA CA3129236A patent/CA3129236C/en active Active
- 2020-02-07 WO PCT/US2020/017150 patent/WO2020163679A1/en not_active Ceased
- 2020-02-07 KR KR1020217028723A patent/KR20210125527A/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| AU2020218258B2 (en) | 2025-02-13 |
| CN113711625A (en) | 2021-11-26 |
| WO2020163679A1 (en) | 2020-08-13 |
| KR20210125527A (en) | 2021-10-18 |
| CN113711625B (en) | 2024-06-28 |
| CA3227238A1 (en) | 2020-08-13 |
| CA3129236A1 (en) | 2020-08-13 |
| EP3922047A1 (en) | 2021-12-15 |
| AU2020218258A1 (en) | 2021-09-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12165643B2 (en) | Devices, systems, and methods for distributed voice processing | |
| US11315556B2 (en) | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification | |
| US11817083B2 (en) | Networked microphone devices, systems, and methods of localized arbitration | |
| US12579978B2 (en) | Networked devices, systems, and methods for intelligently deactivating wake-word engines | |
| US12518756B2 (en) | Voice assistant persistence across multiple network microphone devices | |
| US20240114192A1 (en) | Networked devices, systems, & methods for associating playback devices based on sound codes | |
| US12518755B2 (en) | Voice verification for media playback | |
| CA3129236C (en) | Devices, systems, and methods for distributed voice processing | |
| US20240233721A1 (en) | Delay Gating for Voice Arbitration |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| EEER | Examination request |
Effective date: 20211117 |
|
| MPN | Maintenance fee for patent paid |
Free format text: FEE DESCRIPTION TEXT: MF (PATENT, 5TH ANNIV.) - STANDARD Year of fee payment: 5 |
|
| U00 | Fee paid |
Free format text: ST27 STATUS EVENT CODE: A-4-4-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED Effective date: 20250129 |
|
| U11 | Full renewal or maintenance fee paid |
Free format text: ST27 STATUS EVENT CODE: A-4-4-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT Effective date: 20250129 |