CN106383866B - Location-based conversational understanding - Google Patents
- Publication number: CN106383866B
- Application number: CN201610801496.1A
- Authority: CN (China)
- Prior art keywords: based query, location, speech, query, environmental context
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9032—Query formulation
- G06F16/90332—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
Location based conversational understanding may be provided. When a query is received from a user, an environmental context associated with the query may be generated. The query may be interpreted according to the environmental context. The interpreted query may be executed and at least one result associated with the query is provided to the user.
Description
This application is a divisional of the Chinese invention patent application of the same title, filed on March 29, 2012, with application number 201210087420.9.
Technical Field
The present application relates to environmental context, and in particular to location-based conversational understanding.
Background
Location-based conversational understanding may provide a mechanism to leverage environmental context to improve query execution and results. Conventional speech recognition programs do not have techniques to improve the quality and accuracy of new queries from new and/or existing users using information from one user to another (e.g., speech utterances, geographic data, acoustic environment of certain locations, typical queries made from a particular location). In some cases, the speech to text conversion must be made without such benefits as employing similar, potentially relevant queries to aid understanding.
Speech-to-text conversion (i.e., speech recognition) may include converting spoken phrases into text phrases that may be processed by a computing system. Acoustic modeling and/or language modeling may be used in modern statistically based speech recognition algorithms. Hidden Markov Models (HMMs) are widely used in many conventional systems. An HMM may comprise a statistical model that outputs a sequence of symbols or quantities. HMMs can be used for speech recognition because a speech signal can be considered a piecewise stationary signal or a short-time stationary signal. Over a short time (e.g., 10 milliseconds), speech can be approximated as a stationary process. Speech can therefore be considered a Markov model for many stochastic purposes.
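As a generic illustration of how an HMM yields a most likely sequence of hidden states from a sequence of observations, the sketch below runs Viterbi decoding on a toy two-state model. The states, the "low"/"high" frame-energy observations, and all probabilities are assumptions made up for the example; nothing here is taken from the disclosed system.

```python
from typing import Dict, List


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most likely hidden-state path for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path probability.
            prev, p = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda item: item[1],
            )
            best[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (hypothetical values): short audio frames are either silence or speech.
STATES = ["silence", "speech"]
START = {"silence": 0.6, "speech": 0.4}
TRANS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}

decoded = viterbi(["low", "high", "high"], STATES, START, TRANS, EMIT)
```

In a real recognizer the hidden states would typically correspond to sub-word units and the observations to acoustic feature vectors; the toy labels only keep the example readable.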
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter. This summary is also not intended to be used to limit the scope of the claimed subject matter.
Location based conversational understanding may be provided. When a query is received from a user, an environmental context associated with the query may be generated. The query may be interpreted according to the environmental context. The interpreted query may be executed and at least one result associated with the query is provided to the user.
Both the foregoing general description and the following detailed description provide examples, and are explanatory only. Accordingly, the foregoing general description and the following detailed description should not be considered to be restrictive. Further, other features or variations may be provided in addition to those set forth herein. For example, embodiments may be directed to various feature combinations and sub-combinations described in the detailed description.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention. In the drawings:
FIG. 1 is a block diagram of an operating environment;
FIG. 2 is a flow diagram of a method for providing location-based conversational understanding; and
FIG. 3 is a block diagram of a system including a computing device.
Detailed Description
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings and the following description to refer to the same or like elements. While embodiments of the invention may be described, modifications, adaptations, and other implementations are possible. For example, substitutions, additions, or modifications may be made to the elements illustrated in the drawings, and the methods described herein may be modified by substituting, reordering, or adding stages to the disclosed methods. The following detailed description, therefore, is not to be taken in a limiting sense. Rather, the proper scope of the invention is defined by the appended claims.
Location-based conversational understanding may be provided. For example, a speech-to-text system may be provided that correlates information from multiple users in order to improve the accuracy of the conversion and the results of the queries included in the converted sentences. According to embodiments of the present invention, a personal assistant program may receive speech-based queries from users at multiple locations. Each query may be analyzed for acoustic and/or environmental characteristics, and such characteristics may be stored and associated with the location from which the query was received. For example, a query received from a user at a subway station may reveal acoustic echoes off the tile walls and/or background environmental sounds of crowds or subway trains. Once known, these characteristics can be filtered out of future queries from that location to allow for more accurate conversion of those queries. According to embodiments of the present invention, a location may be defined, for example, by a Global Positioning System (GPS) position of the user, an area code associated with the user, a zip code associated with the user, and/or the user's proximity to a landmark (e.g., a train station, a stadium, a museum, an office building, etc.).
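One way the different location definitions might be reduced to a single lookup key is sketched below. The `Landmark` type, the `location_key` function, the key format, the precedence order (landmark, then GPS, then zip code, then area code), and the coordinates are all assumptions chosen for illustration; the description does not specify any of them.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Landmark:
    """Hypothetical landmark record: a named point with a proximity radius."""
    name: str
    lat: float
    lon: float
    radius_m: float


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def location_key(
    lat: Optional[float] = None,
    lon: Optional[float] = None,
    zip_code: Optional[str] = None,
    area_code: Optional[str] = None,
    landmarks: Iterable[Landmark] = (),
) -> Optional[str]:
    """Map whatever location signals are available to one context lookup key."""
    if lat is not None and lon is not None:
        # Prefer a nearby landmark so all users at that landmark share a context.
        for lm in landmarks:
            if haversine_m(lat, lon, lm.lat, lm.lon) <= lm.radius_m:
                return f"landmark:{lm.name}"
        return f"gps:{round(lat, 3)},{round(lon, 3)}"
    if zip_code:
        return f"zip:{zip_code}"
    if area_code:
        return f"area:{area_code}"
    return None


# Usage with made-up coordinates.
station = Landmark("train_station", 47.6, -122.33, radius_m=200.0)
key_near_station = location_key(47.6005, -122.33, landmarks=[station])
```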
Processing the query may include adjusting the query according to an acoustic model. For example, the acoustic model may include background sounds that are known to be present at a particular location. Applying the acoustic model may allow for more accurate conversion of queries by ignoring irrelevant sounds. The acoustic model may also allow changes to how any results associated with the query are presented. For example, in certain noisy environments, the results may be displayed on a screen rather than delivered as audio. An environmental context may also be associated with an understanding model to facilitate speech-to-text conversion. For example, the understanding model may include a Hidden Markov Model (HMM). The environmental context may also be associated with a semantic model to assist in executing the query. For example, the semantic model may include an ontology. Ontologies are described in related application S/N __/___, ___, _____, 2011, entitled "personalization of queries, sessions, and searches," which is incorporated herein by reference in its entirety.
Moreover, the subject matter of the query may be used to improve the results of future queries. For example, if a user at a subway station queries "when is the next one?", the personal assistant program may determine over the course of several queries that the user would like to know when the next train will arrive. This may be done by requesting clarification of the query from the first user and storing the clarification for future use. In another example, if one user queries "when is the next one?" and another user queries "when is the next train?", the program may correlate the queries and assume that both users are requesting the same information.
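Storing a clarification so that later ambiguous queries from the same location resolve to the same request could look roughly like the following. The `ClarificationStore` class, its methods, and the simple text normalization are hypothetical; the description does not say how clarifications are stored or matched.

```python
from typing import Dict, Tuple


class ClarificationStore:
    """Hypothetical store mapping (location, ambiguous query) to a clarified query."""

    def __init__(self) -> None:
        self._resolved: Dict[Tuple[str, str], str] = {}

    @staticmethod
    def _normalize(query: str) -> str:
        # Lowercase, trim surrounding punctuation, and collapse whitespace.
        return " ".join(query.lower().strip(" ?!.").split())

    def record(self, location: str, query: str, clarified: str) -> None:
        """Remember what a user at this location meant by an ambiguous query."""
        self._resolved[(location, self._normalize(query))] = clarified

    def resolve(self, location: str, query: str) -> str:
        """Return the stored clarification, or the query unchanged if none exists."""
        return self._resolved.get((location, self._normalize(query)), query)


store = ClarificationStore()
# A first user at the subway station clarifies what they meant.
store.record("subway_station", "When is the next one?", "when is the next train?")
# A later user at the same location asks the same ambiguous question.
resolved = store.resolve("subway_station", "when is the  next one")
```

The same query asked at a different location is left untouched, since a clarification learned at a subway station says nothing about, say, a stadium.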
FIG. 1 is a block diagram of an operating environment 100 for providing location-based conversational understanding. Operating environment 100 may include a Spoken Dialog System (SDS) 110 that includes a personal assistant program 112, a speech-to-text converter 114, and a context database 116. Personal assistant program 112 may receive queries over network 120 from a first plurality of users 130(A)-(C) located at a first location 140 and a second plurality of users 150(A)-(C) located at a second location 160. Context database 116 may be operative to store context data associated with queries received from users, such as first plurality of users 130(A)-(C) and/or second plurality of users 150(A)-(C). The context data may include acoustic and/or environmental characteristics as well as query context information, such as query subject matter, time/date of the query, user details, and/or location from which the query was made. According to embodiments of the invention, network 120 may include, for example, a private data network (e.g., Ethernet), a cellular data network, and/or a public network such as the Internet.
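The kind of record a context database might hold per location is sketched below as plain data classes. The class and field names (`QueryRecord`, `EnvironmentalContext`, `ContextDatabase`) and the in-memory dictionary are assumptions for illustration only; the description lists what context data may include but not how it is laid out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class QueryRecord:
    """Query context information: subject matter, time/date, and user details."""
    topic: str
    timestamp: datetime
    user_id: str


@dataclass
class EnvironmentalContext:
    """Context data aggregated for one location."""
    location: str
    acoustic_disturbances: List[str] = field(default_factory=list)
    queries: List[QueryRecord] = field(default_factory=list)


class ContextDatabase:
    """Hypothetical in-memory stand-in for a context database."""

    def __init__(self) -> None:
        self._by_location: Dict[str, EnvironmentalContext] = {}

    def add_query(
        self, location: str, record: QueryRecord, disturbances: List[str]
    ) -> EnvironmentalContext:
        # Create the location's context on first use, then fold in the new data.
        ctx = self._by_location.setdefault(location, EnvironmentalContext(location))
        ctx.queries.append(record)
        for d in disturbances:
            if d not in ctx.acoustic_disturbances:
                ctx.acoustic_disturbances.append(d)
        return ctx

    def get(self, location: str) -> Optional[EnvironmentalContext]:
        return self._by_location.get(location)


db = ContextDatabase()
db.add_query(
    "first_location",
    QueryRecord("train schedule", datetime(2012, 3, 29, 8, 30), "user_a"),
    ["crowd", "train"],
)
db.add_query(
    "first_location",
    QueryRecord("train schedule", datetime(2012, 3, 29, 8, 45), "user_b"),
    ["train"],
)
```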
An agent may be associated with a Spoken Dialog System (SDS). Such systems allow people to interact with computers by voice. The primary component that drives the SDS may include a dialog manager, which manages dialog-based sessions with the user. The dialog manager may determine the user's intent through a combination of multiple input sources, such as speech recognition and natural language understanding component output, context from previous dialog turns, user context, and/or results returned from a knowledge base (e.g., a search engine). Upon determining the intent, the dialog manager can take action, such as displaying the final result to the user and/or continuing the dialog with the user to satisfy their intent. The spoken dialog system may include a plurality of conversational understanding models, such as an acoustic model associated with a location and/or a spoken language understanding model for processing speech-based input.
FIG. 2 is a flow chart illustrating the general stages involved in a method 200 consistent with an embodiment of the invention for providing location-based conversational understanding. The method 200 may be implemented using a computing device 300, which will be described in more detail below with reference to FIG. 3. The manner in which the stages of method 200 are implemented will be described in greater detail below. Method 200 may begin at starting block 205 and continue to stage 210 where computing device 300 may receive a speech-based query from a user at a location. For example, user 130(A) may send a query to SDS 110 via a device such as a cellular phone.
From stage 210, method 200 may advance to stage 215 where computing device 300 may determine whether an environmental context associated with the location exists in a memory store. For example, the SDS 110 may identify a location (e.g., the first location 140) from which the query was received and determine whether an environmental context associated with the location exists in the context database 116.
If there is no context associated with the location, method 200 proceeds to stage 220 where computing device 300 may identify at least one acoustic disturbance in the voice-based query. For example, SDS 110 may analyze the audio of the query and identify background noise such as that associated with a large number of people around user 130(A) and/or passing trains.
If a context associated with the location exists, method 200 may advance to stage 235 where computing device 300 may load an environmental context associated with the location. For example, the SDS 110 may load the environmental context from the context database 116 as described above.
After creating a new environmental context or loading an existing one at stage 235, method 200 may then advance to stage 240 where computing device 300 may convert the speech-based query to a text-based query according to the environmental context. For example, the SDS 110 may convert the voice-based query to a text-based query by applying a filter to remove at least one acoustic disturbance associated with the environmental context.
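The branch between creating and loading a context, followed by filtered conversion, can be sketched as below. This is a heavily simplified assumption: audio is modeled as a list of pre-labeled `(label, text)` segments, the context database is a plain dictionary, and "filtering" just drops segments whose label is a known disturbance. A real system would operate on audio signals; the function name `process_query` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Segment = Tuple[str, str]  # (label, recognized text); label "speech" marks the user


def process_query(
    segments: List[Segment],
    location: str,
    context_db: Dict[str, Dict[str, Set[str]]],
) -> str:
    """Convert a labeled query to text using the location's environmental context."""
    # Stage 215: does an environmental context exist for this location?
    context = context_db.get(location)
    if context is None:
        # Stage 220: identify acoustic disturbances in the query itself.
        disturbances = {label for label, _ in segments if label != "speech"}
        # Create a new context for the location and store it for future queries.
        context = {"disturbances": disturbances}
        context_db[location] = context
    # Otherwise the existing context has been loaded (stage 235).

    # Stage 240: apply a filter that removes disturbances known to the context.
    kept = [text for label, text in segments if label not in context["disturbances"]]
    return " ".join(kept)


db: Dict[str, Dict[str, Set[str]]] = {}
first = process_query(
    [("speech", "when"), ("train", "<rumble>"), ("speech", "is the next train")],
    "station",
    db,
)
# A later query from the same location reuses the stored context.
second = process_query(
    [("train", "<rumble>"), ("speech", "how late is it")],
    "station",
    db,
)
```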
Embodiments consistent with the invention may comprise a system for providing location-based conversational understanding. The system may include a memory storage and a processing unit coupled to the memory storage. The processing unit is operative to receive a query from a user, generate an environmental context associated with the query, interpret the query in accordance with the environmental context, execute the interpreted query, and provide at least one result of the query to the user. The query may comprise, for example, a voice query that the processing unit is operable to convert into computer-readable text. According to embodiments of the invention, the conversion of speech to text may utilize a hidden Markov model algorithm that includes statistical weights for words likely associated with the understanding model and/or semantic concepts associated with the semantic model. The processing unit is operative to increase a statistical weight of at least one expected term based, for example, on at least one previous query received from the location, and to store the statistical weight as part of the environmental context.
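Increasing the statistical weight of expected terms from previous queries at a location might be approximated as follows. The multiplicative boost, the `boost` parameter, and the renormalization step are assumptions chosen to keep the example small; the description does not state how the weights are computed.

```python
from collections import Counter
from typing import Dict, List


def boost_weights(
    base_weights: Dict[str, float],
    previous_queries: List[str],
    boost: float = 0.1,
) -> Dict[str, float]:
    """Raise the weight of terms seen in earlier queries, then renormalize to sum to 1."""
    # Count how often each word appeared in previous queries from this location.
    counts = Counter(word for q in previous_queries for word in q.lower().split())
    # Counter returns 0 for unseen words, so their weight is left unscaled.
    boosted = {w: p * (1.0 + boost * counts[w]) for w, p in base_weights.items()}
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}


# "train" and "rain" are acoustically similar; earlier queries at a station favor "train".
weights = boost_weights(
    {"train": 0.5, "rain": 0.5},
    ["next train", "train times"],
    boost=0.5,
)
```

With these made-up numbers, "train" ends up with twice the weight of "rain", which is the effect the paragraph describes: words already common at a location become more likely candidates during conversion.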
The environmental context may include an acoustic model associated with a location from which the query was received. The processing unit is operable to adjust the query according to the acoustic model based on at least one background sound derived from the speech-based query. For example, it may be known that background sounds (e.g., train sirens) are present in a voice query received from a given location (e.g., a train station). Background sounds may be detected and measured for pitch, amplitude, and other acoustic characteristics. The query may be adjusted to ignore such sounds, and the sounds may be computed and stored for application to future queries from that location. The processing unit may also be operable to receive a second speech-based query from a second user and adjust that query to ignore the same background sound according to the updated acoustic model. The processing unit may also be operable to aggregate environmental contexts associated with a plurality of queries from a plurality of users and store the aggregated environmental context associated with the location.
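Aggregating measured characteristics of a background sound across queries could be done with a running mean, as in the sketch below. The `SoundProfile` fields, the `observe` function, and the choice of a simple incremental average are assumptions; the description only says that pitch, amplitude, and other characteristics may be measured and stored.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SoundProfile:
    """Running averages for one background sound at one location."""
    mean_pitch_hz: float
    mean_amplitude_db: float
    samples: int


# Hypothetical acoustic model: (location, sound name) -> aggregated profile.
AcousticModel = Dict[Tuple[str, str], SoundProfile]


def observe(
    model: AcousticModel,
    location: str,
    sound: str,
    pitch_hz: float,
    amplitude_db: float,
) -> SoundProfile:
    """Fold one new measurement into the location's profile for this sound."""
    key = (location, sound)
    profile = model.get(key)
    if profile is None:
        profile = SoundProfile(pitch_hz, amplitude_db, 1)
        model[key] = profile
        return profile
    # Incremental mean: new_mean = old_mean + (x - old_mean) / n.
    profile.samples += 1
    profile.mean_pitch_hz += (pitch_hz - profile.mean_pitch_hz) / profile.samples
    profile.mean_amplitude_db += (amplitude_db - profile.mean_amplitude_db) / profile.samples
    return profile


model: AcousticModel = {}
observe(model, "train station", "train siren", 400.0, 70.0)  # from a first user's query
profile = observe(model, "train station", "train siren", 420.0, 74.0)  # from a second user
```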
Other embodiments consistent with the invention may comprise a system for providing location-based conversational understanding. The system may include a memory storage and a processing unit coupled to the memory storage. The processing unit is operative to receive a voice-based query from a user at a location, load an environmental context associated with the location, convert the voice-based query to text according to the environmental context, execute the converted query according to the environmental context, and provide at least one result associated with the executed query to the user. The environmental context may include, for example, a time of at least one previous query, a date of the at least one previous query, a topic of the at least one previous query, a semantic model including an ontology, an understanding model, and an acoustic model of the location. The processing unit is operable to adjust the query based on a known acoustic disturbance associated with the location. The processing unit may also be operative to store a plurality of environmental contexts associated with a plurality of locations aggregated from a plurality of queries received from a plurality of users. The processing unit may also be operative to receive a correction to the converted text from the user and update the environmental context in accordance with the correction. The processing unit may be further operative to receive a second voice-based query from the user at a second location, load a second environmental context associated with the second location, convert the second voice-based query to text in accordance with the second environmental context, execute the converted query in accordance with the second environmental context, and provide at least one second result associated with the executed query to the user.
Yet another embodiment consistent with the invention may comprise a system for providing a context-aware environment. The system may include a memory storage and a processing unit coupled to the memory storage. The processing unit is operable to receive a voice-based query from a user at a location and determine whether an environmental context associated with the location exists in the memory storage. In response to determining that no environmental context exists, the processing unit may be operative to identify at least one acoustic disturbance in the voice-based query, identify at least one topic associated with the voice-based query, and create a new environmental context associated with the location for storage in the memory storage. In response to determining that an environmental context exists, the processing unit may be operative to load the environmental context. The processing unit may then be operable to convert the voice-based query to a text-based query according to the environmental context, which includes applying a filter to remove at least one acoustic disturbance associated with the environmental context; to execute the text-based query according to the environmental context; and to provide at least one result of the executed text-based query to the user. The at least one acoustic disturbance is associated with an acoustic model, and the at least one identified topic is associated with a semantic model that is associated with the environmental context.
FIG. 3 is a block diagram of a system including a computing device 300. In accordance with an embodiment of the present invention, the above-described memory storage and processing unit may be implemented in a computing device, such as computing device 300 of FIG. 3. The memory storage and processing unit may be implemented using any suitable combination of hardware, software, or firmware. For example, the memory storage and processing unit may be implemented with computing device 300 or any of the other computing devices 318 in combination with computing device 300. The above-described systems, devices, and processors are examples, and other systems, devices, and processors may include the above-described memory storage and processing unit, according to embodiments of the invention. Further, computing device 300 may include an operating environment for system 100 as described above. The system 100 may operate in other environments and is not limited to the computing device 300.
Referring to FIG. 3, a system according to an embodiment of the invention may include a computing device, such as computing device 300. In a basic configuration, computing device 300 may include at least one processing unit 302 and system memory 304. Depending on the configuration and type of computing device, system memory 304 may include, but is not limited to, volatile memory (e.g., Random Access Memory (RAM)), non-volatile memory (e.g., read-only memory (ROM)), flash memory, or any combination thereof. System memory 304 may include an operating system 305 and one or more programming modules 306, and may include a personal assistant program 112. For example, operating system 305 may be suitable for controlling the operation of computing device 300. Furthermore, embodiments of the invention may be practiced in conjunction with a graphics library, other operating systems, or any other application program, and are not limited to any particular application or system. This basic configuration is illustrated in FIG. 3 by those components within a dashed line 308.
As mentioned above, a number of program modules and data files may be stored in system memory 304, including operating system 305. When executed on processing unit 302, programming modules 306 (e.g., personal assistant program 112) may perform processes including, for example, one or more of the stages of method 200 as described above. The above process is one example, and processing unit 302 may perform other processes. Other programming modules that may be used in accordance with embodiments of the present invention may include email and contacts applications, word processing applications, spreadsheet applications, database applications, slide presentation applications, drawing or computer-aided application programs, and the like.
Generally, program modules may include routines, programs, components, data structures, and other types of structures that may perform particular tasks or that may implement particular abstract data types in accordance with embodiments of the invention. Moreover, embodiments of the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. Embodiments of the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Furthermore, embodiments of the invention may be practiced in a circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. Embodiments of the invention may also be practiced using other technologies capable of performing logical operations such as, for example, AND, OR, and NOT, including but not limited to mechanical, optical, fluidic, and quantum technologies. In addition, embodiments of the invention may be practiced in a general purpose computer or any other circuits or systems.
For example, embodiments of the invention may be implemented as a computer process (method), a computing system, or as an article of manufacture such as a computer program product or computer readable media. The computer program product may be a computer storage media readable by a computer system and encoding a computer program of instructions for executing a computer process. The computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process. Accordingly, the present invention may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). In other words, embodiments of the invention may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. A computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, and a portable compact disc read-only memory (CD-ROM). Note that the computer-usable or computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
Embodiments of the present invention are described above with reference to, for example, block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the invention. The functions/acts noted in the blocks may occur out of the order noted in any flowchart. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
While specific embodiments of the invention have been described, other embodiments are possible. In addition, although embodiments of the present invention have been described as being associated with data stored in memory and other storage media, data may also be stored on or read from other types of computer-readable media, such as secondary storage devices (like hard disks, floppy disks, or a CD-ROM), a carrier wave from the Internet, or other forms of RAM or ROM. In addition, steps of the disclosed methods may be modified in any manner, including by reordering steps and/or inserting or deleting steps, without departing from the invention.
All rights including copyrights in the code included herein are vested in and the property of the applicant. The applicant retains and reserves all rights in the code included herein and grants permission to reproduce the material only in connection with reproduction of the granted patent and for no other purpose.
Although the description includes examples, the scope of the invention is indicated by the appended claims. Furthermore, although the specification has been described in language specific to structural features and/or methodological acts, the claims are not limited to the features or acts described above. Rather, the specific features and acts described above are disclosed as examples of embodiments of the invention.
Claims (20)
1. A system for providing location-based conversational understanding, comprising:
at least one processor; and
a memory coupled to the at least one processor, the memory comprising computer-executable instructions that, when executed by the at least one processor, perform a method for providing location-based conversational understanding, the method comprising:
receiving, by a computing device, a voice-based query from a user located at a location;
determining whether an environmental context is associated with the location;
when it is determined that no environmental context is associated with the location:
identifying at least one acoustic disturbance in the speech-based query;
identifying a subject matter of the voice-based query;
creating an environmental context based at least on the identified topic of the voice-based query and the acoustic interference in the voice-based query; and
converting the speech-based query into a text-based query using the environmental context to suppress acoustic interference.
2. The system of claim 1, wherein the environmental context is associated with at least one of:
an understanding model that facilitates speech to text conversion; and
semantic models that facilitate query execution.
3. The system of claim 1, wherein the location is determined using at least one of:
Global Positioning System (GPS) coordinates;
a region code associated with the user;
a zip code associated with the user; and
proximity to landmarks.
4. The system of claim 1, wherein identifying at least one acoustic disturbance comprises: analyzing audio of the query and identifying background noise in the audio.
5. The system of claim 1, wherein identifying the topic of the speech-based query comprises:
requesting clarification of the voice-based query from the user; and
correlating a plurality of queries, wherein the plurality of queries are identified as requesting similar information.
6. The system of claim 1, wherein creating the environmental context comprises:
associating the identified acoustic interference and the identified subject with the location; and
storing information of the identified acoustic disturbance, the identified subject, and the location in a context database.
7. The system of claim 1, wherein converting the speech-based query comprises: applying a filter for removing acoustic interference associated with the environmental context.
8. The system of claim 1, wherein converting the speech-based query comprises: using a hidden Markov model algorithm that includes statistical weights for at least one of the terms and the semantic concepts.
9. The system of claim 1, the method further comprising:
executing the text-based query according to the environmental context within a search domain associated with the identified topic; and
providing results of the executed text-based query to the user.
10. A method for providing location-based conversational understanding, the method comprising:
receiving, by a computing device, a first speech-based query from a user located at a location;
determining whether an environmental context is associated with the location;
when it is determined that no environmental context is associated with the location:
identifying at least a first acoustic interference in the first speech-based query;
identifying a subject of the first speech-based query;
creating a first environmental context based at least on the identified subject of the first speech-based query and the first acoustic interference in the first speech-based query; and
converting the first speech-based query into a text-based query using the first environmental context.
11. The method of claim 10, wherein converting the first speech-based query comprises using a hidden Markov model algorithm comprising statistical weights for at least one of:
terms associated with an understanding model; and
semantic concepts associated with a semantic model.
12. The method of claim 11, further comprising: increasing a statistical weight of one or more predicted terms as a function of one or more previous queries received at the location.
13. The method of claim 10, wherein the first environmental context includes an acoustic model associated with the location, and wherein the first speech-based query is adjusted according to the first acoustic interference using the acoustic model.
14. The method of claim 13, wherein adjusting the first speech-based query comprises:
identifying at least one background sound from the first acoustic interference;
adjusting the first speech-based query to ignore the at least one background sound; and
storing the at least one background sound.
15. The method of claim 14, further comprising:
receiving a second speech-based query associated with the location;
applying the acoustic model associated with the location to the second speech-based query; and
adjusting the second speech-based query to ignore the stored at least one background sound.
16. The method of claim 15, further comprising:
identifying a second acoustic interference in the second speech-based query;
updating the acoustic model associated with the location based on the second acoustic interference; and
adjusting the second speech-based query to ignore one or more background sounds in the second acoustic interference.
17. The method of claim 10, further comprising:
receiving a second speech-based query associated with the location;
creating a second environmental context based on the second speech-based query;
aggregating the first environmental context and the second environmental context into an aggregated environmental context, wherein the aggregated environmental context is associated with the location; and
storing the aggregated environmental context.
18. The method of claim 17, wherein the aggregated environmental context comprises a subject of the first speech-based query and a subject of the second speech-based query.
19. The method of claim 18, wherein the subject of the first speech-based query is used to improve a result of the second speech-based query.
20. A computer-readable storage device storing computer-executable instructions that, when executed, cause a computing system to perform a method for providing location-based conversational understanding, the method comprising:
receiving, by a computing device, a speech-based query from a user located at a location;
determining whether an environmental context is associated with the location;
when it is determined that no environmental context is associated with the location:
identifying at least one acoustic interference in the speech-based query;
identifying a subject of the speech-based query;
creating an environmental context based at least on the identified subject of the speech-based query and the acoustic interference in the speech-based query; and
converting the speech-based query into a text-based query using the environmental context.
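The flow recited in claims 10 and 14 to 17 (look up an environmental context for a location, create one when none exists, and aggregate later queries into it) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `EnvironmentalContext`, `ContextStore`, and `sounds_to_ignore` are hypothetical, the in-memory dictionary merely stands in for a context database, and the background sounds and subject are assumed to have been identified already by some upstream audio and language analysis that is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentalContext:
    """Hypothetical record of what has been learned about one location."""
    location: str
    background_sounds: set = field(default_factory=set)
    subjects: list = field(default_factory=list)


class ContextStore:
    """In-memory stand-in for a context database (an assumption of this sketch)."""

    def __init__(self):
        self._by_location = {}

    def lookup(self, location):
        """Return the context for a location, or None if none exists yet."""
        return self._by_location.get(location)

    def record(self, location, sounds, subject):
        """Create the context on first use; otherwise aggregate into it."""
        ctx = self._by_location.get(location)
        if ctx is None:
            ctx = EnvironmentalContext(location)
            self._by_location[location] = ctx
        ctx.background_sounds.update(sounds)
        if subject not in ctx.subjects:
            ctx.subjects.append(subject)
        return ctx


def sounds_to_ignore(store, location, detected_sounds, subject):
    """Return every background sound known for this location so far.

    A recognizer (not modeled here) could use this set to filter the
    audio before converting the speech-based query to text.
    """
    ctx = store.record(location, detected_sounds, subject)
    return set(ctx.background_sounds)
```

In this sketch, a second query from the same location inherits the sounds stored for the first one, which mirrors the idea of adjusting the second query to ignore the stored background sound.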
Applications Claiming Priority (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/077,368 US9298287B2 (en) | 2011-03-31 | 2011-03-31 | Combined activation for natural user interface systems |
US13/077,431 US10642934B2 (en) | 2011-03-31 | 2011-03-31 | Augmented conversational understanding architecture |
US13/077,233 US20120253789A1 (en) | 2011-03-31 | 2011-03-31 | Conversational Dialog Learning and Correction |
US13/077,368 | 2011-03-31 | ||
US13/077,303 US9858343B2 (en) | 2011-03-31 | 2011-03-31 | Personalization of queries, conversations, and searches |
US13/076,862 US9760566B2 (en) | 2011-03-31 | 2011-03-31 | Augmented conversational understanding agent to identify conversation context between two humans and taking an agent action thereof |
US13/077,431 | 2011-03-31 | ||
US13/077,455 | 2011-03-31 | ||
US13/077,455 US9244984B2 (en) | 2011-03-31 | 2011-03-31 | Location based conversational understanding |
US13/077,303 | 2011-03-31 | ||
US13/077,233 | 2011-03-31 | ||
US13/077,396 US9842168B2 (en) | 2011-03-31 | 2011-03-31 | Task driven user intents |
US13/077,396 | 2011-03-31 | ||
US13/076,862 | 2011-03-31 | ||
CN201210087420.9A CN102737096B (en) | 2011-03-31 | 2012-03-29 | Location-based conversational understanding |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210087420.9A Division CN102737096B (en) | 2011-03-31 | 2012-03-29 | Location-based conversational understanding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106383866A (en) | 2017-02-08 |
CN106383866B (en) | 2020-05-05 |
Family
ID=46931884
Family Applications (8)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610801496.1A Active CN106383866B (en) | 2011-03-31 | 2012-03-29 | Location-based conversational understanding |
CN201210087420.9A Active CN102737096B (en) | 2011-03-31 | 2012-03-29 | Location-based conversational understanding |
CN201210090634.1A Active CN102750311B (en) | 2011-03-31 | 2012-03-30 | Augmented conversational understanding architecture |
CN201210090349.XA Active CN102737099B (en) | 2011-03-31 | 2012-03-30 | Personalization of queries, conversations, and searches |
CN201210091176.3A Active CN102737101B (en) | 2011-03-31 | 2012-03-30 | Combined activation for natural user interface systems |
CN201210092263.0A Active CN102750270B (en) | 2011-03-31 | 2012-03-31 | Augmented conversational understanding agent |
CN201210101485.4A Expired - Fee Related CN102750271B (en) | 2011-03-31 | 2012-03-31 | Conversational dialog learning and correction |
CN201210093414.4A Active CN102737104B (en) | 2011-03-31 | 2012-03-31 | Task driven user intents |
Family Applications After (7)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210087420.9A Active CN102737096B (en) | 2011-03-31 | 2012-03-29 | Location-based conversational understanding |
CN201210090634.1A Active CN102750311B (en) | 2011-03-31 | 2012-03-30 | Augmented conversational understanding architecture |
CN201210090349.XA Active CN102737099B (en) | 2011-03-31 | 2012-03-30 | Personalization of queries, conversations, and searches |
CN201210091176.3A Active CN102737101B (en) | 2011-03-31 | 2012-03-30 | Combined activation for natural user interface systems |
CN201210092263.0A Active CN102750270B (en) | 2011-03-31 | 2012-03-31 | Augmented conversational understanding agent |
CN201210101485.4A Expired - Fee Related CN102750271B (en) | 2011-03-31 | 2012-03-31 | Conversational dialog learning and correction |
CN201210093414.4A Active CN102737104B (en) | 2011-03-31 | 2012-03-31 | Task driven user intents |
Country Status (5)
Country | Link |
---|---|
EP (6) | EP2691949A4 (en) |
JP (4) | JP2014512046A (en) |
KR (3) | KR101963915B1 (en) |
CN (8) | CN106383866B (en) |
WO (7) | WO2012135218A2 (en) |
Families Citing this family (205)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US10032127B2 (en) | 2011-02-18 | 2018-07-24 | Nuance Communications, Inc. | Methods and apparatus for determining a clinician's intent to order an item |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9842168B2 (en) | 2011-03-31 | 2017-12-12 | Microsoft Technology Licensing, Llc | Task driven user intents |
US10642934B2 (en) | 2011-03-31 | 2020-05-05 | Microsoft Technology Licensing, Llc | Augmented conversational understanding architecture |
US9760566B2 (en) | 2011-03-31 | 2017-09-12 | Microsoft Technology Licensing, Llc | Augmented conversational understanding agent to identify conversation context between two humans and taking an agent action thereof |
US9064006B2 (en) | 2012-08-23 | 2015-06-23 | Microsoft Technology Licensing, Llc | Translating natural language utterances to keyword search queries |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
KR20150046100A (en) | 2012-08-10 | 2015-04-29 | 뉘앙스 커뮤니케이션즈, 인코포레이티드 | Virtual agent communication for electronic devices |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
CN113470640B (en) | 2013-02-07 | 2022-04-26 | 苹果公司 | Voice trigger of digital assistant |
EP2946322A1 (en) * | 2013-03-01 | 2015-11-25 | Nuance Communications, Inc. | Methods and apparatus for determining a clinician's intent to order an item |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US9436287B2 (en) * | 2013-03-15 | 2016-09-06 | Qualcomm Incorporated | Systems and methods for switching processing modes using gestures |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
KR101922663B1 (en) | 2013-06-09 | 2018-11-28 | 애플 인크. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US9728184B2 (en) | 2013-06-18 | 2017-08-08 | Microsoft Technology Licensing, Llc | Restructuring deep neural network acoustic models |
US9311298B2 (en) | 2013-06-21 | 2016-04-12 | Microsoft Technology Licensing, Llc | Building conversational understanding systems using a toolset |
US9589565B2 (en) * | 2013-06-21 | 2017-03-07 | Microsoft Technology Licensing, Llc | Environmentally aware dialog policies and response generation |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
CN104714954A (en) * | 2013-12-13 | 2015-06-17 | 中国电信股份有限公司 | Information searching method and system based on context understanding |
US20150170053A1 (en) * | 2013-12-13 | 2015-06-18 | Microsoft Corporation | Personalized machine learning models |
US10534623B2 (en) | 2013-12-16 | 2020-01-14 | Nuance Communications, Inc. | Systems and methods for providing a virtual assistant |
US10015770B2 (en) | 2014-03-24 | 2018-07-03 | International Business Machines Corporation | Social proximity networks for mobile phones |
US9529794B2 (en) | 2014-03-27 | 2016-12-27 | Microsoft Technology Licensing, Llc | Flexible schema for language model customization |
US20150278370A1 (en) * | 2014-04-01 | 2015-10-01 | Microsoft Corporation | Task completion for natural language input |
US10111099B2 (en) | 2014-05-12 | 2018-10-23 | Microsoft Technology Licensing, Llc | Distributing content in managed wireless distribution networks |
US9874914B2 (en) | 2014-05-19 | 2018-01-23 | Microsoft Technology Licensing, Llc | Power management contracts for accessory devices |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
TWI566107B (en) | 2014-05-30 | 2017-01-11 | 蘋果公司 | Method for processing a multi-part voice command, non-transitory computer readable storage medium and electronic device |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9355640B2 (en) * | 2014-06-04 | 2016-05-31 | Google Inc. | Invoking action responsive to co-presence determination |
US9717006B2 (en) | 2014-06-23 | 2017-07-25 | Microsoft Technology Licensing, Llc | Device quarantine in a wireless network |
JP6275569B2 (en) * | 2014-06-27 | 2018-02-07 | 株式会社東芝 | Dialog apparatus, method and program |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9582482B1 (en) | 2014-07-11 | 2017-02-28 | Google Inc. | Providing an annotation linking related entities in onscreen content |
US10146409B2 (en) * | 2014-08-29 | 2018-12-04 | Microsoft Technology Licensing, Llc | Computerized dynamic splitting of interaction across multiple content |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
KR102188268B1 (en) * | 2014-10-08 | 2020-12-08 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
WO2016065020A2 (en) | 2014-10-21 | 2016-04-28 | Robert Bosch Gmbh | Method and system for automation of response selection and composition in dialog systems |
KR102329333B1 (en) | 2014-11-12 | 2021-11-23 | 삼성전자주식회사 | Query processing apparatus and method |
US9836452B2 (en) | 2014-12-30 | 2017-12-05 | Microsoft Technology Licensing, Llc | Discriminating ambiguous expressions to enhance user experience |
WO2016112005A1 (en) | 2015-01-05 | 2016-07-14 | Google Inc. | Multimodal state circulation |
US10572810B2 (en) | 2015-01-07 | 2020-02-25 | Microsoft Technology Licensing, Llc | Managing user interaction for input understanding determinations |
WO2016129767A1 (en) * | 2015-02-13 | 2016-08-18 | 주식회사 팔락성 | Online site linking method |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10083688B2 (en) * | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US9792281B2 (en) * | 2015-06-15 | 2017-10-17 | Microsoft Technology Licensing, Llc | Contextual language generation by leveraging language understanding |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US10249297B2 (en) | 2015-07-13 | 2019-04-02 | Microsoft Technology Licensing, Llc | Propagating conversational alternatives using delayed hypothesis binding |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
KR20170033722A (en) * | 2015-09-17 | 2017-03-27 | 삼성전자주식회사 | Apparatus and method for processing user's locution, and dialog management apparatus |
US10262654B2 (en) * | 2015-09-24 | 2019-04-16 | Microsoft Technology Licensing, Llc | Detecting actionable items in a conversation among participants |
US10970646B2 (en) * | 2015-10-01 | 2021-04-06 | Google Llc | Action suggestions for user-selected content |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
KR102393928B1 (en) * | 2015-11-10 | 2022-05-04 | 삼성전자주식회사 | User terminal apparatus for recommanding a reply message and method thereof |
WO2017090954A1 (en) * | 2015-11-24 | 2017-06-01 | Samsung Electronics Co., Ltd. | Electronic device and operating method thereof |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
KR102502569B1 (en) | 2015-12-02 | 2023-02-23 | 삼성전자주식회사 | Method and apparuts for system resource managemnet |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US9905248B2 (en) | 2016-02-29 | 2018-02-27 | International Business Machines Corporation | Inferring user intentions based on user conversation data and spatio-temporal data |
US9978396B2 (en) | 2016-03-16 | 2018-05-22 | International Business Machines Corporation | Graphical display of phone conversations |
US10587708B2 (en) * | 2016-03-28 | 2020-03-10 | Microsoft Technology Licensing, Llc | Multi-modal conversational intercom |
US11487512B2 (en) | 2016-03-29 | 2022-11-01 | Microsoft Technology Licensing, Llc | Generating a services application |
US10158593B2 (en) * | 2016-04-08 | 2018-12-18 | Microsoft Technology Licensing, Llc | Proactive intelligent personal assistant |
US10945129B2 (en) * | 2016-04-29 | 2021-03-09 | Microsoft Technology Licensing, Llc | Facilitating interaction among digital personal assistants |
US10409876B2 (en) * | 2016-05-26 | 2019-09-10 | Microsoft Technology Licensing, Llc. | Intelligent capture, storage, and retrieval of information for task completion |
WO2017210613A1 (en) * | 2016-06-03 | 2017-12-07 | Maluuba Inc. | Natural language generation in a spoken dialogue system |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10282218B2 (en) * | 2016-06-07 | 2019-05-07 | Google Llc | Nondeterministic task initiation by a personal assistant module |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | Intelligent automated assistant in a home environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US10216269B2 (en) * | 2016-06-21 | 2019-02-26 | GM Global Technology Operations LLC | Apparatus and method for determining intent of user based on gaze information |
CA3033724A1 (en) * | 2016-08-23 | 2018-03-01 | Illumina, Inc. | Semantic distance systems and methods for determining related ontological data |
US10446137B2 (en) * | 2016-09-07 | 2019-10-15 | Microsoft Technology Licensing, Llc | Ambiguity resolving conversational understanding system |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10503767B2 (en) * | 2016-09-13 | 2019-12-10 | Microsoft Technology Licensing, Llc | Computerized natural language query intent dispatching |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US9940390B1 (en) * | 2016-09-27 | 2018-04-10 | Microsoft Technology Licensing, Llc | Control system using scoped search and conversational interface |
CN107885744B (en) | 2016-09-29 | 2023-01-03 | 微软技术许可有限责任公司 | Conversational data analysis |
US10535005B1 (en) | 2016-10-26 | 2020-01-14 | Google Llc | Providing contextual actions for mobile onscreen content |
JP6697373B2 (en) | 2016-12-06 | 2020-05-20 | カシオ計算機株式会社 | Sentence generating device, sentence generating method and program |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
CN110249326B (en) * | 2017-02-08 | 2023-07-14 | 微软技术许可有限责任公司 | Natural language content generator |
US10643601B2 (en) * | 2017-02-09 | 2020-05-05 | Semantic Machines, Inc. | Detection mechanism for automated dialog systems |
US10586530B2 (en) | 2017-02-23 | 2020-03-10 | Semantic Machines, Inc. | Expandable dialogue system |
EP3563375B1 (en) * | 2017-02-23 | 2022-03-02 | Microsoft Technology Licensing, LLC | Expandable dialogue system |
US10798027B2 (en) * | 2017-03-05 | 2020-10-06 | Microsoft Technology Licensing, Llc | Personalized communications using semantic memory |
US10237209B2 (en) * | 2017-05-08 | 2019-03-19 | Google Llc | Initializing a conversation with an automated agent via selectable graphical element |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | User interface for correcting recognition errors |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770429A1 (en) | 2017-05-12 | 2018-12-14 | Apple Inc. | Low-latency intelligent automated assistant |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | Far-field extension for digital assistant services |
US10664533B2 (en) * | 2017-05-24 | 2020-05-26 | Lenovo (Singapore) Pte. Ltd. | Systems and methods to determine response cue for digital assistant based on context |
US10679192B2 (en) * | 2017-05-25 | 2020-06-09 | Microsoft Technology Licensing, Llc | Assigning tasks and monitoring task performance based on context extracted from a shared contextual graph |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10742435B2 (en) * | 2017-06-29 | 2020-08-11 | Google Llc | Proactive provision of new content to group chat participants |
US11132499B2 (en) | 2017-08-28 | 2021-09-28 | Microsoft Technology Licensing, Llc | Robust expandable dialogue system |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10546023B2 (en) * | 2017-10-03 | 2020-01-28 | Google Llc | Providing command bundle suggestions for an automated assistant |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
CN110019718B (en) * | 2017-12-15 | 2021-04-09 | 上海智臻智能网络科技股份有限公司 | Method for modifying multi-turn question-answering system, terminal equipment and storage medium |
US11341422B2 (en) | 2017-12-15 | 2022-05-24 | SHANGHAI XIAOI ROBOT TECHNOLOGY CO., LTD. | Multi-round questioning and answering methods, methods for generating a multi-round questioning and answering system, and methods for modifying the system |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10839160B2 (en) * | 2018-01-19 | 2020-11-17 | International Business Machines Corporation | Ontology-based automatic bootstrapping of state-based dialog systems |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
KR102635811B1 (en) * | 2018-03-19 | 2024-02-13 | 삼성전자 주식회사 | System and control method of system for processing sound data |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10685075B2 (en) * | 2018-04-11 | 2020-06-16 | Motorola Solutions, Inc. | System and method for tailoring an electronic digital assistant query as a function of captured multi-party voice dialog and an electronically stored multi-party voice-interaction template |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | Virtual assistant operation in multi-device environments |
DK179822B1 (en) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
CN112567621A (en) | 2018-08-29 | 2021-03-26 | 松下知识产权经营株式会社 | Power conversion system and power storage system |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
CN111428721A (en) * | 2019-01-10 | 2020-07-17 | 北京字节跳动网络技术有限公司 | Method, device and equipment for determining word paraphrases and storage medium |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | User activity shortcut suggestions |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11183193B1 (en) | 2020-05-11 | 2021-11-23 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
US11783827B2 (en) | 2020-11-06 | 2023-10-10 | Apple Inc. | Determining suggested subsequent user actions during digital assistant interaction |
EP4174848A1 (en) * | 2021-10-29 | 2023-05-03 | Televic Rail NV | Improved speech to text method and system |
CN116644810B (en) * | 2023-05-06 | 2024-04-05 | 国网冀北电力有限公司信息通信分公司 | Power grid fault risk treatment method and device based on knowledge graph |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2122542A1 (en) * | 2006-12-08 | 2009-11-25 | Medhat Moussa | Architecture, system and method for artificial neural network implementation |
Family Cites Families (71)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5265014A (en) * | 1990-04-10 | 1993-11-23 | Hewlett-Packard Company | Multi-modal user interface |
US5748974A (en) * | 1994-12-13 | 1998-05-05 | International Business Machines Corporation | Multimodal natural language interface for cross-application tasks |
US5970446A (en) * | 1997-11-25 | 1999-10-19 | At&T Corp | Selective noise/channel/coding models and recognizers for automatic speech recognition |
EP1116134A1 (en) * | 1998-08-24 | 2001-07-18 | BCL Computers, Inc. | Adaptive natural language interface |
US6499013B1 (en) * | 1998-09-09 | 2002-12-24 | One Voice Technologies, Inc. | Interactive user interface using speech recognition and natural language processing |
US6332120B1 (en) * | 1999-04-20 | 2001-12-18 | Solana Technology Development Corporation | Broadcast speech recognition system for keyword monitoring |
JP3530109B2 (en) * | 1999-05-31 | 2004-05-24 | 日本電信電話株式会社 | Voice interactive information retrieval method, apparatus, and recording medium for large-scale information database |
CA2375222A1 (en) * | 1999-06-01 | 2000-12-07 | Geoffrey M. Jacquez | Help system for a computer related application |
US6598039B1 (en) * | 1999-06-08 | 2003-07-22 | Albert-Inc. S.A. | Natural language interface for searching database |
JP3765202B2 (en) * | 1999-07-09 | 2006-04-12 | 日産自動車株式会社 | Interactive information search apparatus, interactive information search method using computer, and computer-readable medium recording program for interactive information search processing |
JP2001125896A (en) * | 1999-10-26 | 2001-05-11 | Victor Co Of Japan Ltd | Natural language interactive system |
US7050977B1 (en) * | 1999-11-12 | 2006-05-23 | Phoenix Solutions, Inc. | Speech-enabled server for internet website and method |
JP2002024285A (en) * | 2000-06-30 | 2002-01-25 | Sanyo Electric Co Ltd | Method and device for user support |
JP2002082748A (en) * | 2000-09-06 | 2002-03-22 | Sanyo Electric Co Ltd | User support device |
US7197120B2 (en) * | 2000-12-22 | 2007-03-27 | Openwave Systems Inc. | Method and system for facilitating mediated communication |
GB2372864B (en) * | 2001-02-28 | 2005-09-07 | Vox Generation Ltd | Spoken language interface |
JP2003115951A (en) * | 2001-10-09 | 2003-04-18 | Casio Comput Co Ltd | Topic information providing system and topic information providing method |
US7224981B2 (en) * | 2002-06-20 | 2007-05-29 | Intel Corporation | Speech recognition of mobile devices |
US7693720B2 (en) * | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
EP1411443A1 (en) * | 2002-10-18 | 2004-04-21 | Hewlett Packard Company, a Delaware Corporation | Context filter |
JP2004212641A (en) * | 2002-12-27 | 2004-07-29 | Toshiba Corp | Voice input system and terminal device equipped with voice input system |
JP2004328181A (en) * | 2003-04-23 | 2004-11-18 | Sharp Corp | Telephone and telephone network system |
JP4441782B2 (en) * | 2003-05-14 | 2010-03-31 | 日本電信電話株式会社 | Information presentation method and information presentation apparatus |
JP2005043461A (en) * | 2003-07-23 | 2005-02-17 | Canon Inc | Voice recognition method and voice recognition device |
KR20050032649A (en) * | 2003-10-02 | 2005-04-08 | (주)이즈메이커 | Method and system for teaching artificial life |
US7747601B2 (en) * | 2006-08-14 | 2010-06-29 | Inquira, Inc. | Method and apparatus for identifying and classifying query intent |
US7720674B2 (en) * | 2004-06-29 | 2010-05-18 | Sap Ag | Systems and methods for processing natural language queries |
JP4434972B2 (en) * | 2005-01-21 | 2010-03-17 | 日本電気株式会社 | Information providing system, information providing method and program thereof |
ATE510259T1 (en) * | 2005-01-31 | 2011-06-15 | Ontoprise Gmbh | MAPPING WEB SERVICES TO ONTOLOGIES |
GB0502259D0 (en) * | 2005-02-03 | 2005-03-09 | British Telecomm | Document searching tool and method |
CN101120341A (en) * | 2005-02-06 | 2008-02-06 | 凌圭特股份有限公司 | Method and equipment for performing mobile information access using natural language |
US20060206333A1 (en) * | 2005-03-08 | 2006-09-14 | Microsoft Corporation | Speaker-dependent dialog adaptation |
US7409344B2 (en) * | 2005-03-08 | 2008-08-05 | Sap Aktiengesellschaft | XML based architecture for controlling user interfaces with contextual voice commands |
WO2006108061A2 (en) * | 2005-04-05 | 2006-10-12 | The Board Of Trustees Of Leland Stanford Junior University | Methods, software, and systems for knowledge base coordination |
US7991607B2 (en) * | 2005-06-27 | 2011-08-02 | Microsoft Corporation | Translation and capture architecture for output of conversational utterances |
US7640160B2 (en) * | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US7620549B2 (en) * | 2005-08-10 | 2009-11-17 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
US7822699B2 (en) * | 2005-11-30 | 2010-10-26 | Microsoft Corporation | Adaptive semantic reasoning engine |
US7627466B2 (en) * | 2005-11-09 | 2009-12-01 | Microsoft Corporation | Natural language interface for driving adaptive scenarios |
US20070136222A1 (en) * | 2005-12-09 | 2007-06-14 | Microsoft Corporation | Question and answer architecture for reasoning and clarifying intentions, goals, and needs from contextual clues and content |
US20070143410A1 (en) * | 2005-12-16 | 2007-06-21 | International Business Machines Corporation | System and method for defining and translating chat abbreviations |
CN100373313C (en) * | 2006-01-12 | 2008-03-05 | 广东威创视讯科技股份有限公司 | Intelligent recognition coding method for interactive input apparatus |
US8209407B2 (en) * | 2006-02-10 | 2012-06-26 | The United States Of America, As Represented By The Secretary Of The Navy | System and method for web service discovery and access |
CA2652150A1 (en) * | 2006-06-13 | 2007-12-21 | Microsoft Corporation | Search engine dash-board |
US20080005068A1 (en) * | 2006-06-28 | 2008-01-03 | Microsoft Corporation | Context-based search, retrieval, and awareness |
CN1963752A (en) * | 2006-11-28 | 2007-05-16 | 李博航 | Man-machine interactive interface technique of electronic apparatus based on natural language |
US20080172359A1 (en) * | 2007-01-11 | 2008-07-17 | Motorola, Inc. | Method and apparatus for providing contextual support to a monitored communication |
US20080172659A1 (en) | 2007-01-17 | 2008-07-17 | Microsoft Corporation | Harmonizing a test file and test configuration in a revision control system |
US20080201434A1 (en) * | 2007-02-16 | 2008-08-21 | Microsoft Corporation | Context-Sensitive Searches and Functionality for Instant Messaging Applications |
US20090076917A1 (en) * | 2007-08-22 | 2009-03-19 | Victor Roditis Jablokov | Facilitating presentation of ads relating to words of a message |
US7720856B2 (en) * | 2007-04-09 | 2010-05-18 | Sap Ag | Cross-language searching |
US8762143B2 (en) * | 2007-05-29 | 2014-06-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition |
US7788276B2 (en) * | 2007-08-22 | 2010-08-31 | Yahoo! Inc. | Predictive stemming for web search with statistical machine translation models |
AU2008292781B2 (en) * | 2007-08-31 | 2012-08-09 | Microsoft Technology Licensing, Llc | Identification of semantic relationships within reported speech |
US8165886B1 (en) * | 2007-10-04 | 2012-04-24 | Great Northern Research LLC | Speech interface system and method for control and interaction with applications on a computing system |
US8504621B2 (en) * | 2007-10-26 | 2013-08-06 | Microsoft Corporation | Facilitating a decision-making process |
JP2009116733A (en) * | 2007-11-08 | 2009-05-28 | Nec Corp | Application retrieval system, application retrieval method, monitor terminal, retrieval server, and program |
JP5158635B2 (en) * | 2008-02-28 | 2013-03-06 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Method, system, and apparatus for personal service support |
US20090234655A1 (en) * | 2008-03-13 | 2009-09-17 | Jason Kwon | Mobile electronic device with active speech recognition |
WO2009129315A1 (en) * | 2008-04-15 | 2009-10-22 | Mobile Technologies, Llc | System and methods for maintaining speech-to-speech translation in the field |
CN101499277B (en) * | 2008-07-25 | 2011-05-04 | 中国科学院计算技术研究所 | Service intelligent navigation method and system |
US8874443B2 (en) * | 2008-08-27 | 2014-10-28 | Robert Bosch Gmbh | System and method for generating natural language phrases from user utterances in dialog systems |
JP2010128665A (en) * | 2008-11-26 | 2010-06-10 | Kyocera Corp | Information terminal and conversation assisting program |
JP2010145262A (en) * | 2008-12-19 | 2010-07-01 | Pioneer Electronic Corp | Navigation apparatus |
US8326637B2 (en) * | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
JP2010230918A (en) * | 2009-03-26 | 2010-10-14 | Fujitsu Ten Ltd | Retrieving device |
US8700665B2 (en) * | 2009-04-27 | 2014-04-15 | Avaya Inc. | Intelligent conference call information agents |
US20100281435A1 (en) * | 2009-04-30 | 2010-11-04 | At&T Intellectual Property I, L.P. | System and method for multimodal interaction using robust gesture processing |
KR101622111B1 (en) * | 2009-12-11 | 2016-05-18 | 삼성전자 주식회사 | Dialog system and conversational method thereof |
KR101007336B1 (en) * | 2010-06-25 | 2011-01-13 | 한국과학기술정보연구원 | Personalizing service system and method based on ontology |
US20120253789A1 (en) * | 2011-03-31 | 2012-10-04 | Microsoft Corporation | Conversational Dialog Learning and Correction |

Date | Office | Application number | Publication | Status |
---|---|---|---|---|
2012-03-27 | WO | PCT/US2012/030740 | WO2012135218A2 | Active, Application Filing |
2012-03-27 | WO | PCT/US2012/030757 | WO2012135229A2 | Active, Application Filing |
2012-03-27 | JP | JP2014502721A | JP2014512046A | Active, Pending |
2012-03-27 | KR | KR1020137025586A | KR101963915B1 | Active, IP Right Grant |
2012-03-27 | JP | JP2014502718A | JP6105552B2 | Active |
2012-03-27 | KR | KR20137025578A | KR20140014200A | Not active, Application Discontinuation |
2012-03-27 | EP | EP12763866.6A | EP2691949A4 | Not active, Ceased |
2012-03-27 | EP | EP12763913.6A | EP2691885A4 | Not active, Ceased |
2012-03-27 | WO | PCT/US2012/030730 | WO2012135210A2 | Unknown |
2012-03-27 | JP | JP2014502723A | JP6087899B2 | Not active, Expired - Fee Related |
2012-03-27 | EP | EP12765896.1A | EP2691877A4 | Not active, Withdrawn |
2012-03-27 | KR | KR1020137025540A | KR101922744B1 | Active, IP Right Grant |
2012-03-27 | EP | EP12764494.6A | EP2691870A4 | Not active, Ceased |
2012-03-27 | WO | PCT/US2012/030636 | WO2012135157A2 | Unknown |
2012-03-27 | WO | PCT/US2012/030751 | WO2012135226A1 | Unknown |
2012-03-29 | CN | CN201610801496.1A | CN106383866B | Active |
2012-03-29 | CN | CN201210087420.9A | CN102737096B | Active |
2012-03-30 | CN | CN201210090634.1A | CN102750311B | Active |
2012-03-30 | EP | EP12764853.3A | EP2691875A4 | Not active, Ceased |
2012-03-30 | WO | PCT/US2012/031722 | WO2012135783A2 | Unknown |
2012-03-30 | CN | CN201210090349.XA | CN102737099B | Active |
2012-03-30 | CN | CN201210091176.3A | CN102737101B | Active |
2012-03-30 | WO | PCT/US2012/031736 | WO2012135791A2 | Unknown |
2012-03-30 | EP | EP12765100.8A | EP2691876A4 | Not active, Ceased |
2012-03-31 | CN | CN201210092263.0A | CN102750270B | Active |
2012-03-31 | CN | CN201210101485.4A | CN102750271B | Not active, Expired - Fee Related |
2012-03-31 | CN | CN201210093414.4A | CN102737104B | Active |
2017-03-01 | JP | JP2017038097A | JP6305588B2 | Active |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106383866B (en) | | Location-based conversational understanding |
US10049667B2 (en) | | Location-based conversational understanding |
JP6942841B2 (en) | | Parameter collection and automatic dialog generation in the dialog system |
US10585957B2 (en) | | Task driven user intents |
US9299342B2 (en) | | User query history expansion for improving language model adaptation |
CN107039050B (en) | | Automatic testing method and device for voice recognition system to be tested |
RU2571608C2 (en) | | Creating notes using voice stream |
US7966171B2 (en) | | System and method for increasing accuracy of searches based on communities of interest |
US9594744B2 (en) | | Speech transcription including written text |
JP2019503526A5 (en) | | |
US8688447B1 (en) | | Method and system for domain-specific noisy channel natural language processing (NLP) |
CN113838451B (en) | | Voice processing and model training method, device, equipment and storage medium |
KR101483945B1 (en) | | Method for spoken semantic analysis with speech recognition and apparatus thereof |
Bernsen et al. | | Building Usable Spoken Dialogue Systems. Some Approaches |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||