Current Assignee: Apple Inc
Original Assignee: Apple Inc
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=44304930&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US10706841(B2)
Application filed by Apple Inc
Priority to US15/394,162 (US10706841B2)
Publication of US20170178626A1
Priority to US16/879,643 (US11423886B2)
Application granted
Publication of US10706841B2
Priority to US17/732,011 (US20220254338A1)
Priority to US17/882,457 (US20220383864A1)
G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
G06F3/16—Sound input; Sound output
G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
B—PERFORMING OPERATIONS; TRANSPORTING
B60—VEHICLES IN GENERAL
B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
G06F16/33—Querying
G06F16/332—Query formulation
G06F16/3329—Natural language query formulation
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
G06F16/33—Querying
G06F16/3331—Query processing
G06F16/334—Query execution
G06F16/3344—Query execution using natural language analysis
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
G06F16/90—Details of database functions independent of the retrieved data types
G06F16/95—Retrieval from the web
G06F16/953—Querying, e.g. by the use of web search engines
G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
G06F3/16—Sound input; Sound output
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F40/00—Handling natural language data
G06F40/20—Natural language analysis
G06F40/279—Recognition of textual entities
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F40/00—Handling natural language data
G06F40/30—Semantic analysis
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F40/00—Handling natural language data
G06F40/30—Semantic analysis
G06F40/35—Discourse or dialogue representation
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F40/00—Handling natural language data
G06F40/40—Processing or translation of natural language
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F9/00—Arrangements for program control, e.g. control units
G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
G06F9/44—Arrangements for executing specific programs
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06F—ELECTRIC DIGITAL DATA PROCESSING
G06F9/00—Arrangements for program control, e.g. control units
G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
G06F9/46—Multiprogramming arrangements
G06F9/54—Interprogram communication
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
G06N3/00—Computing arrangements based on biological models
G06N3/004—Artificial life, i.e. computing arrangements simulating life
G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
G06N5/00—Computing arrangements using knowledge-based models
G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
G06N5/00—Computing arrangements using knowledge-based models
G06N5/04—Inference or reasoning models
G06N5/041—Abduction
G—PHYSICS
G06—COMPUTING; CALCULATING OR COUNTING
G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
G06Q10/00—Administration; Management
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L13/00—Speech synthesis; Text to speech systems
G10L13/02—Methods for producing synthetic speech; Speech synthesisers
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L13/00—Speech synthesis; Text to speech systems
G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/08—Speech classification or search
G10L15/18—Speech classification or search using natural language modelling
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/08—Speech classification or search
G10L15/18—Speech classification or search using natural language modelling
G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/08—Speech classification or search
G10L15/18—Speech classification or search using natural language modelling
G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/26—Speech to text systems
G10L15/265—
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/28—Constructional details of speech recognition systems
G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
H—ELECTRICITY
H04—ELECTRIC COMMUNICATION TECHNIQUE
H04M—TELEPHONIC COMMUNICATION
H04M1/00—Substation equipment, e.g. for use by subscribers
H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
H04M1/6033—Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
H04M1/6041—Portable telephones adapted for handsfree use
H04M1/6075—Portable telephones adapted for handsfree use adapted for handsfree use in a vehicle
H04M1/6083—Portable telephones adapted for handsfree use adapted for handsfree use in a vehicle by interfacing with the vehicle audio system
H04M1/6091—Portable telephones adapted for handsfree use adapted for handsfree use in a vehicle by interfacing with the vehicle audio system including a wireless interface
H—ELECTRICITY
H04—ELECTRIC COMMUNICATION TECHNIQUE
H04M—TELEPHONIC COMMUNICATION
H04M1/00—Substation equipment, e.g. for use by subscribers
H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
H04M1/724—User interfaces specially adapted for cordless or mobile telephones
H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
H—ELECTRICITY
H04—ELECTRIC COMMUNICATION TECHNIQUE
H04M—TELEPHONIC COMMUNICATION
H04M1/00—Substation equipment, e.g. for use by subscribers
H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
H04M1/724—User interfaces specially adapted for cordless or mobile telephones
H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
H—ELECTRICITY
H04—ELECTRIC COMMUNICATION TECHNIQUE
H04M—TELEPHONIC COMMUNICATION
H04M1/00—Substation equipment, e.g. for use by subscribers
H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
H04M1/724—User interfaces specially adapted for cordless or mobile telephones
H04M1/72484—User interfaces specially adapted for cordless or mobile telephones wherein functions are triggered by incoming communication events
H04M1/72547—
H04M1/72563—
H04M1/72597—
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L13/00—Speech synthesis; Text to speech systems
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/08—Speech classification or search
G10L15/18—Speech classification or search using natural language modelling
G10L15/1822—Parsing for meaning understanding
G—PHYSICS
G10—MUSICAL INSTRUMENTS; ACOUSTICS
G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
G10L15/00—Speech recognition
G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L2015/223—Execution procedure of a spoken command
Definitions
The present invention relates to intelligent systems, and more specifically to classes of applications for intelligent automated assistants.
Today's electronic devices are able to access a large, growing, and diverse quantity of functions, services, and information, both via the Internet and from other sources. Functionality for such devices is increasing rapidly, as many consumer devices, such as smartphones, tablet computers, and the like, are able to run software applications to perform various tasks and provide different types of information. Often, each application, function, website, or feature has its own user interface and its own operational paradigms, many of which can be burdensome to learn or overwhelming for users. In addition, many users may have difficulty even discovering what functionality and/or information is available on their electronic devices or on various websites; thus, such users may become frustrated or overwhelmed, or may simply be unable to use the resources available to them in an effective manner.
Novice users, or individuals who are impaired or disabled in some manner, and/or are elderly, busy, distracted, and/or operating a vehicle, may have difficulty interfacing with their electronic devices effectively and/or engaging online services effectively.
Such users are particularly likely to have difficulty with the large number of diverse and inconsistent functions, applications, and websites that may be available for their use.
An intelligent automated assistant is implemented on an electronic device to facilitate user interaction with the device and to help the user more effectively engage with local and/or remote services.
The intelligent automated assistant engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.
The intelligent automated assistant integrates a variety of capabilities provided by different software components (e.g., for supporting natural language recognition and dialog, multimodal input, personal information management, task flow management, orchestrating distributed services, and the like). Furthermore, to offer intelligent interfaces and useful functionality to users, the intelligent automated assistant of the present invention may, in at least some embodiments, coordinate these components and services.
The conversational interface, and the ability to obtain information and perform follow-on tasks, are implemented, in at least some embodiments, by coordinating various components such as language components, dialog components, task management components, information management components, and/or a plurality of external services.
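By way of illustration only, the following Python sketch shows one possible way such coordination could be arranged for a single conversational turn. The component names (LanguageInterpreter, TaskFlowManager, ServiceOrchestrator, DialogManager), the keyword-based parsing, and the weather example are hypothetical stand-ins and are not drawn from this disclosure.

    # Hypothetical sketch: coordinating language, task-flow, service, and
    # dialog components to handle one conversational turn.

    class LanguageInterpreter:
        def interpret(self, text):
            # Trivial keyword-based parse, standing in for full natural language analysis.
            if "weather" in text.lower():
                return {"intent": "get_weather", "city": "Cupertino"}
            return {"intent": "unknown"}

    class TaskFlowManager:
        def plan(self, intent):
            # Map an interpreted intent to the service steps needed to satisfy it.
            if intent["intent"] == "get_weather":
                return [("weather_service", intent["city"])]
            return []

    class ServiceOrchestrator:
        def invoke(self, step):
            service, arg = step
            # A real system would call a local API or remote web service here.
            return f"{service} result for {arg}"

    class DialogManager:
        def render(self, results):
            # Turn raw results back into a conversational reply.
            return results[0] if results else "Sorry, I did not understand that."

    def handle_turn(text):
        intent = LanguageInterpreter().interpret(text)
        steps = TaskFlowManager().plan(intent)
        results = [ServiceOrchestrator().invoke(step) for step in steps]
        return DialogManager().render(results)

    print(handle_turn("What is the weather today?"))

In a real assistant, each stub above would be replaced by the corresponding language, task-flow, service, and dialog component; the point of the sketch is only the flow of one turn through those components.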
Intelligent automated assistant systems may be configured, designed, and/or operable to provide various different types of operations, functionalities, and/or features, and/or to combine a plurality of features, operations, and applications of the electronic device on which they are installed.
The intelligent automated assistant systems of the present invention can perform any or all of: actively eliciting input from a user, interpreting user intent, disambiguating among competing interpretations, requesting and receiving clarifying information as needed, and performing (or initiating) actions based on the discerned intent. Actions can be performed, for example, by activating and/or interfacing with any applications or services that may be available on an electronic device, as well as services that are available over an electronic network such as the Internet.
Such activation of external services can be performed via APIs or by any other suitable mechanism.
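A minimal sketch of the disambiguation and clarification behavior described above follows. The candidate interpretations, their scores, and the 0.2 threshold are illustrative assumptions, not values taken from this disclosure.

    # Hypothetical sketch: disambiguating among competing interpretations and
    # asking a clarifying question when the user's intent is unclear.

    def interpret(utterance):
        # Stand-in parser returning scored candidate interpretations.
        candidates = []
        if "call" in utterance.lower():
            candidates.append(("phone_call", 0.6))
            candidates.append(("video_call", 0.5))
        return candidates

    def respond(utterance):
        candidates = interpret(utterance)
        if not candidates:
            return "Could you rephrase that?"
        candidates.sort(key=lambda c: c[1], reverse=True)
        best = candidates[0]
        runner_up = candidates[1] if len(candidates) > 1 else None
        # If two interpretations score closely, ask the user to choose.
        if runner_up and best[1] - runner_up[1] < 0.2:
            return f"Did you mean a {best[0]} or a {runner_up[0]}?"
        return f"Performing {best[0]}"

    print(respond("call Mom"))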
The intelligent automated assistant systems of various embodiments of the present invention can unify, simplify, and improve the user's experience with respect to many different applications and functions of an electronic device, and with respect to services that may be available over the Internet.
The user can thereby be relieved of the burden of learning what functionality may be available on the device and on web-connected services, how to interface with such services to get what he or she wants, and how to interpret the output received from such services; rather, the assistant of the present invention can act as an intermediary between the user and such diverse services.
The assistant of the present invention provides a conversational interface that the user may find more intuitive and less burdensome than conventional graphical user interfaces.
The user can engage in a form of conversational dialog with the assistant using any of a number of available input and output mechanisms, such as, for example, speech, graphical user interfaces (buttons and links), text entry, and the like.
The system can be implemented using any of a number of different platforms, such as device APIs, the web, email, and the like, or any combination thereof.
Requests for additional input can be presented to the user in the context of such a conversation. Short- and long-term memory can be engaged so that user input can be interpreted in proper context given previous events and communications within a given session, as well as historical and profile information about the user.
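As a rough illustration, short- and long-term memory might be modeled as in the sketch below, where a short history of recent turns is used to resolve an elliptical follow-up question. The Session class, its fields, and the example utterances are assumptions made for illustration only.

    # Hypothetical sketch: short-term session history plus long-term profile
    # data used to interpret a follow-up request in context.

    from collections import deque

    class Session:
        def __init__(self, profile):
            self.profile = profile              # long-term: user profile and preferences
            self.history = deque(maxlen=10)     # short-term: recent turns in this session

        def interpret(self, utterance):
            resolved = utterance
            # Resolve an elliptical follow-up using the most recent prior turn.
            if utterance.strip().lower() == "what about tomorrow?" and self.history:
                resolved = self.history[-1].replace("today", "tomorrow")
            self.history.append(utterance)
            return resolved

    session = Session(profile={"home_city": "Cupertino"})
    session.interpret("What is the weather today?")
    print(session.interpret("What about tomorrow?"))   # resolves against the prior turn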
Context information derived from user interaction with a feature, operation, or application on a device can be used to streamline the operation of other features, operations, or applications on the device or on other devices.
The intelligent automated assistant can use the context of a phone call (such as the person called) to streamline the initiation of a text message (for example, to determine that the text message should be sent to the same person, without the user having to explicitly specify the recipient of the text message).
The intelligent automated assistant of the present invention can thereby interpret instructions such as “send him a text message”, wherein the “him” is interpreted according to context information derived from a current phone call, and/or from any feature, operation, or application on the device.
The intelligent automated assistant takes into account various types of available context data to determine which address book contact to use, which contact data to use, which telephone number to use for the contact, and the like, so that the user need not re-specify such information manually.
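One way such context-based resolution of “him” could work is sketched below; the device_context structure, the contact name, and the telephone number are invented placeholders rather than actual device data or APIs.

    # Hypothetical sketch: resolving the recipient of "send him a text message"
    # from the context of the currently active phone call.

    device_context = {
        "current_call": {"contact": "John Appleseed", "number": "+1-555-0100"},  # placeholder data
    }

    def handle(utterance, context):
        text = utterance.lower()
        if "send him a text" in text or "send her a text" in text:
            call = context.get("current_call")
            if call is None:
                # No call context available, so ask the user instead of guessing.
                return "Who should I send the message to?"
            # Reuse the contact already associated with the active call.
            return f"Composing a text message to {call['contact']} at {call['number']}"
        return "Unrecognized request"

    print(handle("Send him a text message", device_context))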
The assistant can also take into account external events and respond accordingly, for example, to initiate action, initiate communication with the user, provide alerts, and/or modify previously initiated action in view of the external events. If input is required from the user, a conversational interface can again be used.
The system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.
These external services include web-enabled services, as well as functionality related to the hardware device itself.
The assistant can control many operations and functions of the device, such as dialing a telephone number, sending a text message, setting reminders, adding events to a calendar, and the like.
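A simple sketch of routing such device-control requests through a handler table follows. The task names and handler functions are hypothetical and do not correspond to any actual device API.

    # Hypothetical sketch: dispatching assistant tasks to device functions
    # through a registry of handlers.

    def dial(number):
        return f"Dialing {number}"

    def send_text(to, body):
        return f"Texting {to}: {body}"

    def set_reminder(when, what):
        return f"Reminder set for {when}: {what}"

    def add_event(when, title):
        return f"Calendar event '{title}' added for {when}"

    HANDLERS = {
        "dial": dial,
        "send_text": send_text,
        "set_reminder": set_reminder,
        "add_event": add_event,
    }

    def perform(task, **kwargs):
        handler = HANDLERS.get(task)
        if handler is None:
            return f"No handler registered for task '{task}'"
        return handler(**kwargs)

    print(perform("set_reminder", when="5 pm", what="pick up groceries"))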
The system of the present invention can be implemented to provide assistance in any of a number of different domains. Examples include:
The intelligent automated assistant systems disclosed herein may be configured or designed to include functionality for automating the application of data and services available over the Internet to discover, find, choose among, purchase, reserve, or order products and services.
At least one intelligent automated assistant system embodiment disclosed herein may also enable the combined use of several sources of data and services at once. For example, it may combine information about products from several review sites, check prices and availability from multiple distributors, check their locations and time constraints, and help a user find a personalized solution to their problem.
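As an illustrative sketch only, combining reviews, prices, and availability from several sources might be reduced to the following; the product, store names, prices, and ratings are made-up placeholder data.

    # Hypothetical sketch: merging review, price, and availability data from
    # several sources to produce a single personalized recommendation.

    reviews = {"camera-x": 4.5}
    prices = {"camera-x": {"Store A": 499, "Store B": 479}}
    availability = {"camera-x": {"Store A": True, "Store B": False}}

    def recommend(product, max_price):
        # Keep only offers that are in stock and within the user's budget.
        offers = [
            (store, price)
            for store, price in prices.get(product, {}).items()
            if availability.get(product, {}).get(store) and price <= max_price
        ]
        if not offers:
            return f"No in-stock offer for {product} under {max_price}"
        store, price = min(offers, key=lambda offer: offer[1])
        rating = reviews.get(product, "unrated")
        return f"{product} (rated {rating}) is in stock at {store} for {price}"

    print(recommend("camera-x", max_price=500))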
At least one intelligent automated assistant system embodiment disclosed herein may be configured or designed to include functionality for automating the use of data and services available over the Internet to discover, investigate, select among, reserve, and otherwise learn about things to do (including but not limited to movies, events, performances, exhibits, shows and attractions); places to go (including but not limited to travel destinations, hotels and other places to stay, landmarks and other sites of interest, etc.); places to eat or drink (such as restaurants and bars); times and places to meet others; and any other source of entertainment or social interaction which may be found on the Internet.
At least one intelligent automated assistant system embodiment disclosed herein may be configured or designed to include functionality for enabling the operation of applications and services via natural language dialog that might otherwise be provided by dedicated applications with graphical user interfaces, including search (including location-based search); navigation (maps and directions); database lookup (such as finding businesses or people by name or other properties); getting weather conditions and forecasts; checking the price of market items or the status of financial transactions; monitoring traffic or the status of flights; accessing and updating calendars and schedules; managing reminders, alerts, tasks, and projects; communicating over email or other messaging platforms; and operating devices locally or remotely (e.g., dialing telephones, controlling light and temperature, controlling home security devices, playing music or video, etc.).
At least one intelligent automated assistant system embodiment disclosed herein may be configured or designed to include functionality for identifying, generating, and/or providing personalized recommendations for activities, products, services, sources of entertainment, time management, or any other kind of recommendation service that benefits from an interactive dialog in natural language and automated access to data and services.