US7660715B1 - Transparent monitoring and intervention to improve automatic adaptation of speech models - Google Patents


Publication number
US7660715B1
Authority
US
United States
Prior art keywords
speech
user
transcription
user utterance
recognition result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US10/756,669
Inventor
David Preshan Thambiratnam
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Arlington Technologies LLC
Avaya Management LP
Original Assignee
Avaya Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Avaya Inc
Priority to US10/756,669
Assigned to AVAYA TECHNOLOGY reassignment AVAYA TECHNOLOGY ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: THAMBIRATNAM, DAVID PRESHAN
Assigned to CITIBANK, N.A., AS ADMINISTRATIVE AGENT reassignment CITIBANK, N.A., AS ADMINISTRATIVE AGENT SECURITY AGREEMENT Assignors: AVAYA TECHNOLOGY LLC, AVAYA, INC., OCTEL COMMUNICATIONS LLC, VPNET TECHNOLOGIES, INC.
Assigned to CITICORP USA, INC., AS ADMINISTRATIVE AGENT reassignment CITICORP USA, INC., AS ADMINISTRATIVE AGENT SECURITY AGREEMENT Assignors: AVAYA TECHNOLOGY LLC, AVAYA, INC., OCTEL COMMUNICATIONS LLC, VPNET TECHNOLOGIES, INC.
Assigned to AVAYA INC reassignment AVAYA INC REASSIGNMENT Assignors: AVAYA LICENSING LLC, AVAYA TECHNOLOGY LLC
Assigned to AVAYA TECHNOLOGY LLC reassignment AVAYA TECHNOLOGY LLC CONVERSION FROM CORP TO LLC Assignors: AVAYA TECHNOLOGY CORP.
Publication of US7660715B1
Application granted
Assigned to BANK OF NEW YORK MELLON TRUST, NA, AS NOTES COLLATERAL AGENT, THE reassignment BANK OF NEW YORK MELLON TRUST, NA, AS NOTES COLLATERAL AGENT, THE SECURITY AGREEMENT Assignors: AVAYA INC., A DELAWARE CORPORATION
Assigned to BANK OF NEW YORK MELLON TRUST COMPANY, N.A., THE reassignment BANK OF NEW YORK MELLON TRUST COMPANY, N.A., THE SECURITY AGREEMENT Assignors: AVAYA, INC.
Assigned to CITIBANK, N.A., AS ADMINISTRATIVE AGENT reassignment CITIBANK, N.A., AS ADMINISTRATIVE AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS INC., OCTEL COMMUNICATIONS CORPORATION, VPNET TECHNOLOGIES, INC.
Assigned to AVAYA INC. reassignment AVAYA INC. BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 025863/0535 Assignors: THE BANK OF NEW YORK MELLON TRUST, NA
Assigned to VPNET TECHNOLOGIES, INC., AVAYA INC., OCTEL COMMUNICATIONS LLC (FORMERLY KNOWN AS OCTEL COMMUNICATIONS CORPORATION), AVAYA INTEGRATED CABINET SOLUTIONS INC. reassignment VPNET TECHNOLOGIES, INC. BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 041576/0001 Assignors: CITIBANK, N.A.
Assigned to AVAYA INC. reassignment AVAYA INC. BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 030083/0639 Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A.
Assigned to GOLDMAN SACHS BANK USA, AS COLLATERAL AGENT reassignment GOLDMAN SACHS BANK USA, AS COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC, OCTEL COMMUNICATIONS LLC, VPNET TECHNOLOGIES, INC., ZANG, INC.
Assigned to CITIBANK, N.A., AS COLLATERAL AGENT reassignment CITIBANK, N.A., AS COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC, OCTEL COMMUNICATIONS LLC, VPNET TECHNOLOGIES, INC., ZANG, INC.
Assigned to WILMINGTON TRUST, NATIONAL ASSOCIATION reassignment WILMINGTON TRUST, NATIONAL ASSOCIATION SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC, AVAYA MANAGEMENT L.P., INTELLISIST, INC.
Assigned to AVAYA TECHNOLOGY LLC, AVAYA, INC., OCTEL COMMUNICATIONS LLC, VPNET TECHNOLOGIES reassignment AVAYA TECHNOLOGY LLC BANKRUPTCY COURT ORDER RELEASING THE SECURITY INTEREST RECORDED AT REEL/FRAME 020156/0149 Assignors: CITIBANK, N.A., AS ADMINISTRATIVE AGENT
Assigned to WILMINGTON TRUST, NATIONAL ASSOCIATION, AS COLLATERAL AGENT reassignment WILMINGTON TRUST, NATIONAL ASSOCIATION, AS COLLATERAL AGENT INTELLECTUAL PROPERTY SECURITY AGREEMENT Assignors: AVAYA CABINET SOLUTIONS LLC, AVAYA INC., AVAYA MANAGEMENT L.P., INTELLISIST, INC.
Assigned to VPNET TECHNOLOGIES, INC., AVAYA TECHNOLOGY, LLC, OCTEL COMMUNICATIONS LLC, SIERRA HOLDINGS CORP., AVAYA, INC. reassignment VPNET TECHNOLOGIES, INC. RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705 Assignors: CITICORP USA, INC.
Assigned to AVAYA INC., AVAYA MANAGEMENT L.P., AVAYA HOLDINGS CORP., AVAYA INTEGRATED CABINET SOLUTIONS LLC reassignment AVAYA INC. RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 45124/FRAME 0026 Assignors: CITIBANK, N.A., AS COLLATERAL AGENT
Assigned to WILMINGTON SAVINGS FUND SOCIETY, FSB [COLLATERAL AGENT] reassignment WILMINGTON SAVINGS FUND SOCIETY, FSB [COLLATERAL AGENT] INTELLECTUAL PROPERTY SECURITY AGREEMENT Assignors: AVAYA INC., AVAYA MANAGEMENT L.P., INTELLISIST, INC., KNOAHSOFT INC.
Assigned to CITIBANK, N.A., AS COLLATERAL AGENT reassignment CITIBANK, N.A., AS COLLATERAL AGENT INTELLECTUAL PROPERTY SECURITY AGREEMENT Assignors: AVAYA INC., AVAYA MANAGEMENT L.P., INTELLISIST, INC.
Assigned to INTELLISIST, INC., AVAYA MANAGEMENT L.P., AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC reassignment INTELLISIST, INC. RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 53955/0436) Assignors: WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT
Assigned to INTELLISIST, INC., AVAYA INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC, AVAYA MANAGEMENT L.P. reassignment INTELLISIST, INC. RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 61087/0386) Assignors: WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT
Assigned to AVAYA INC., VPNET TECHNOLOGIES, INC., AVAYA INTEGRATED CABINET SOLUTIONS LLC, CAAS TECHNOLOGIES, LLC, HYPERQUALITY, INC., ZANG, INC. (FORMER NAME OF AVAYA CLOUD INC.), HYPERQUALITY II, LLC, AVAYA MANAGEMENT L.P., OCTEL COMMUNICATIONS LLC, INTELLISIST, INC. reassignment AVAYA INC. RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001) Assignors: GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT
Assigned to AVAYA LLC reassignment AVAYA LLC (SECURITY INTEREST) GRANTOR'S NAME CHANGE Assignors: AVAYA INC.
Assigned to AVAYA LLC, AVAYA MANAGEMENT L.P. reassignment AVAYA LLC INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT Assignors: CITIBANK, N.A.
Assigned to AVAYA LLC, AVAYA MANAGEMENT L.P. reassignment AVAYA LLC INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT Assignors: WILMINGTON SAVINGS FUND SOCIETY, FSB
Assigned to ARLINGTON TECHNOLOGIES, LLC reassignment ARLINGTON TECHNOLOGIES, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AVAYA LLC

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065: Adaptation
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training


Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

A system and method to improve the automatic adaptation of one or more speech models in automatic speech recognition systems. After a dialog begins, for example, the dialog asks the customer to provide spoken input, which is recorded. If the speech recognizer determines it may not have correctly transcribed the verbal response, i.e., voice input, the invention uses monitoring and, if necessary, intervention to guarantee that the next transcription of the verbal response is correct. The dialog asks the customer to repeat his verbal response, which is recorded, and a transcription of the input is sent to a human monitor, i.e., agent or operator. If the transcription of the spoken input is correct, the human does not intervene and the transcription remains unmodified. If the transcription of the verbal response is incorrect, the human intervenes and the transcription of the misrecognized word is corrected. In both cases, the dialog asks the customer to confirm the unmodified or corrected transcription. If the customer confirms the unmodified or newly corrected transcription, the dialog continues and the customer does not hang up in frustration because most times only one misrecognition occurred. Finally, the invention uses the first and second customer recording of the misrecognized word or utterance along with the corrected or unmodified transcription to automatically adapt one or more speech models, which improves the performance of the speech recognition system.

Description

FIELD OF THE INVENTION
The present invention is directed generally to a speech recognition system and specifically to a two-stage system to further filter information before adapting one or more speech models.
BACKGROUND OF THE INVENTION
A typical speech recognition system uses one or more speech models developed from a large vocabulary stored in a speech recognition adaptation database. The vocabulary includes most common words and attempts to cover vast language differences within a single language due to voice characteristics, dialects, education, noisy environments, etc. When the speech recognition system is first installed, the performance is often very poor because the one or more speech models need to be trained for the speakers in the region. Over a long period of time, retraining the speech models will improve the speech recognition system performance. Even after training the speech models, the speech recognition system typically recognizes an average speaker's verbal response. However, the speech recognizer may still be unable to correctly transcribe the verbal response of all speakers due to the reasons listed previously. Additionally, technical terms and proper names that have not entered the common jargon may not be recognized. Hence, while the system undergoes this retraining process, which could take a significant period of time, the customer will continue to receive poor performance.
Typical speech recognition systems use unsupervised automatic adaptation, i.e., mathematical algorithms and/or confidence scores to determine whether to use a correctly or incorrectly recognized word or utterance and its transcript to update the vocabulary in the adaptation database. Mathematical algorithms determine the probability the transcription, i.e., text of the utterance or word is correct or incorrect. A high probability, such as 90%, would indicate the correct speech model was used to recognize the utterance or word. When the probability is high, it is likely the recognized utterance or word and transcript may be used to retrain one or more speech models.
The speech recognition system may assign a confidence score to each recognized utterance or word to provide a measure of the accuracy of the recognition for the utterance or word. Hence a confidence score of 30 or below would indicate the speech recognition system does not have much confidence the utterance or word was correctly recognized and should not be used to retrain one or more speech models. Whereas, a confidence score of 90 or above would indicate the utterance or word was correctly recognized and can be used to retrain one or more speech models.
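As a rough illustration of the confidence-score gate described above, the sketch below sorts a recognition result into one of three outcomes. The cutoffs 30 and 90 come from the example values in the text; the 0 to 100 scale, the names `Recognition` and `adaptation_decision`, and the handling of scores between the two cutoffs are assumptions made for illustration and are not part of the described system.

```python
from dataclasses import dataclass

# Example cutoffs from the text; the constant names are hypothetical.
LOW_CONFIDENCE_MAX = 30
HIGH_CONFIDENCE_MIN = 90


@dataclass
class Recognition:
    transcript: str
    confidence: int  # assumed to lie on a 0-100 scale


def adaptation_decision(result: Recognition) -> str:
    """Classify a result as 'use', 'skip', or 'unspecified' for retraining."""
    if result.confidence >= HIGH_CONFIDENCE_MIN:
        return "use"
    if result.confidence <= LOW_CONFIDENCE_MAX:
        return "skip"
    # The text gives no rule for scores between the two cutoffs.
    return "unspecified"
```

A gate of this kind sees only the score, not whether the transcript is actually right, so a wrong transcript with a high score would still be classified as "use".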
One of the problems faced by current speech recognition systems using unsupervised automatic adaptation is the speech recognizer has no way of determining if it correctly recognized the word or utterance it will use to retrain one or more speech models. For example, if the confidence score or probability of correctness is low, the utterance or word is not used to adapt a speech model even if it was recognized correctly. However, if the confidence score or probability is high, but the utterance or word was incorrectly recognized, it will be used to adapt one or more speech models. Unfortunately when using incorrectly recognized utterances or words to adapt one or more speech models, instead of improved speech recognition, there is a decrease of correctly recognized utterances or words by the speech recognition system.
In this unsupervised mode, a dialog needs to request from the speaker a confirmation that it correctly recognized the verbal response, i.e., utterance or word, such as “Did you mean X?”, where X is the recognized verbal response, i.e., transcription or text of the utterance or word, that has been converted to speech by a text-to-speech resource. Typically confirmation is requested for complicated dialogs, such as when a customer requests to transfer money between bank accounts and the dialog requests confirmation of the bank account numbers and the amount of the transfer. Asking for confirmation after every verbal response by the customer can be annoying to the customer and lengthen the amount of time the customer is using the speech recognition system.
Additionally, while the speech recognition system is undergoing the improvement process using unsupervised automatic adaptation of one or more speech models, the speaker will experience frustration and hang up if the speech recognition system misrecognizes too many words or utterances.
The following is an example of a speech recognition system where multiple misrecognitions have occurred and the customer hangs up in frustration:
IVR dialog: “Please state the name of the company you wish to find.”
Speaker: “Avaya.”
IVR dialog: “Was that Papaya Limited?”
Speaker: “No.”
IVR dialog: “Please state the name of the company you wish to find.”
Speaker: “Avaya.” (spoken in a louder tone)
IVR dialog: “Was that Avalon Labs?”
Speaker: “No.”
IVR dialog: “Please state the name of the company you wish to find.”
Speaker: “Avaya.” (spoken in a frustrated voice)
IVR dialog: “Was that Papaya Limited?”
Speaker hangs up.
Another mode, supervised monitoring and intervention, provides better input data to adapt one or more speech models. However, supervised monitoring and intervention has not been performed in real time; that is, monitoring a speaker's voice inputs has not been used to automatically adapt one or more speech models.
SUMMARY
These and other needs are addressed by the various embodiments of the present invention. The present invention is directed to the use of human operator-intervention in an automated speech recognition system to correct speech recognition errors.
In one embodiment of the present invention, an automatic speech recognition system (ASR) is provided that includes:
(a) a speech recognition resource operable to extract a first user utterance from a first input voice stream from a user, the first user utterance being in response to a query; select a first speech model as a first tentative recognition result characterizing the first user utterance; and determine that the first tentative recognition result does not correctly characterize the first user utterance; and
(b) a model adaptation agent operable, when the first tentative recognition result does not correctly characterize the first user utterance, to alert a human operator, based on the first user utterance, to select a second speech model correctly characterizing the first user utterance.
The improved system and method can provide a real-time, hybrid mode to filter out incorrectly recognized data and quickly retrain and improve one or more speech models using only correctly recognized words or utterances. Thus, accuracy of the speech recognition system can increase dramatically, even during retraining of the ASR system, and provide increased levels of user/customer satisfaction. Providing accurate information to the adaptation engine can result in the ASR performance accuracy improving more rapidly when compared to conventional systems that do not employ human operator intervention as a filter. Although conventional systems that employ confidence measures can significantly increase the quality of data provided to the adaptation engine, this method possesses greater risk than that of the present invention as the use of confidence can lead to false positives and false negatives, leading to an overall rate of accuracy improvement that is less than that realized by the present invention. The present invention allows the ASR system to guarantee no more than one misrecognition, thereby avoiding false positives and negatives.
In one configuration, the ASR process of the present invention uses a two-stage process to effectively filter out incorrectly recognized data.
In the first stage, mathematical algorithms and/or confidence scores are used by an automatic speech recognition system (ASR) to recognize utterances and words. The ASR enters a second stage when it believes that a word or utterance is incorrectly recognized. In this second stage, a human operator determines whether the ASR correctly or incorrectly recognized an utterance or word. If the ASR incorrectly recognized the word or utterance, the human operator intervenes to correct the transcription of the utterance or word. If the transcription did not require any correction, the dialog continues. The ASR uses the first incorrectly recognized verbal response along with the second recognized verbal response and either the corrected transcription or unmodified transcription of the second recognized verbal response to automatically retrain one or more speech models.
After the start of a dialog, for example, the IVR dialog asks the speaker to provide a verbal response and a model adaptation agent in the ASR records this spoken input, i.e., utterances or words. In a first stage of normal operation, a speech recognition engine in the ASR receives the verbal response and attempts to recognize the verbal response, i.e., provide a transcript of the spoken utterances or words. The speech recognition engine communicates with a text-to-speech (TTS) resource and uses the TTS resource to convert the transcript to speech, which is heard by the customer to confirm the speech recognition engine correctly recognized the spoken input.
In the second stage of normal operation, the speech recognition engine determines the speaker does not believe the verbal response was correctly recognized, i.e., an error is detected, such as when the speaker indicates that the first attempt at recognition was wrong. A model adaptation agent uses monitoring and intervention, if necessary, to guarantee that the next transcription of the spoken input, i.e., the actual text of the word or utterance, is correct.
Hence in the second stage of operation, the model adaptation agent alerts a human agent, for example an operator. The IVR dialog asks the speaker to repeat his verbal response. The model adaptation agent records this second verbal response and requests the speech recognition engine to provide a transcription of the verbal response. The model adaptation agent sends the recorded verbal response along with a transcription of this second verbal response to the operator's workstation. The operator listens to the recording of the speaker's second verbal response and reviews the transcription of the second verbal response. If the transcription of the verbal response is correct, the operator does not intervene. The transcription is sent to the TTS resource, which converts the transcription to speech. The next dialog message to the speaker asks the speaker to confirm the transcription. If the speaker confirms the transcription, the dialog continues. The model adaptation agent sends the first and second verbal responses along with the unmodified transcription of the second verbal response to the adaptation engine.
If the transcription of the input is incorrect, the human agent intervenes and the transcription of the misrecognized second verbal response is corrected. The corrected transcription is sent to the TTS resource, which converts the transcription to speech. The next dialog message to the speaker asks the speaker to confirm the corrected transcription of the second verbal response that was converted to speech. If the speaker confirms the newly corrected transcription, the dialog continues and the speaker does not hang up in frustration because only one misrecognition occurred. The model adaptation agent sends the first and second verbal responses along with the corrected transcription of the second verbal response to the adaptation engine. Hence, the model adaptation agent ensures only correct data is used to automatically adapt one or more speech models.
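The second-stage flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `AdaptationSample` and `second_stage`, the callback signatures, and the choice to return `None` when the speaker rejects the final transcription (a case the text does not describe) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AdaptationSample:
    """Data handed to the adaptation engine (hypothetical container)."""
    first_audio: bytes
    second_audio: bytes
    transcript: str
    operator_corrected: bool


def second_stage(
    first_audio: bytes,
    second_audio: bytes,
    recognize: Callable[[bytes], str],
    operator_review: Callable[[bytes, str], Optional[str]],
    speaker_confirms: Callable[[str], bool],
) -> Optional[AdaptationSample]:
    """Run the operator-monitored second attempt.

    operator_review returns None when the operator does not intervene,
    or the corrected text when the transcription was wrong.
    """
    transcript = recognize(second_audio)
    correction = operator_review(second_audio, transcript)
    final = transcript if correction is None else correction
    if not speaker_confirms(final):
        # Assumption: without confirmation, nothing is sent for adaptation.
        return None
    return AdaptationSample(
        first_audio=first_audio,
        second_audio=second_audio,
        transcript=final,
        operator_corrected=correction is not None,
    )
```

With stub callbacks, a misrecognized "Papaya Limited" that the operator corrects to "Avaya" yields a sample carrying both recordings and the corrected text, while an uncorrected transcription passes through unchanged.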
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a system utilizing the present invention.
FIG. 2 is a flow diagram of the present invention.
DETAILED DESCRIPTION
FIG. 1 is a block diagram of a system utilizing the automatic speech recognition (ASR) 50 of the present invention. System 91 includes telecommunications equipment found in an enterprise, such as a bank, small or large corporation, university, government facility, etc. When customer 10 dials an enterprise but does not know the direct telephone number of the individual he wants to speak to, customer 10 may place a telephone call via the public switched telephone network (PSTN) 20 to the enterprise's main telephone number. Similarly, customer 10 may use an IP softphone application on his personal computer 11 to allow his PC to function as a desktop IP phone or use a desktop IP telephone 10 to place a telephone call using the Internet 30. The enterprise most likely will have an interactive voice response (IVR) 70 system present a variety of menu options to the customer 10. If the customer's 10 telephone does not have touch-tone (dual tone multi-frequency “DTMF”) capability, the customer may still communicate with the interactive voice response system (IVR) 70 using voice response if the IVR 70 has automatic speech recognition (“ASR”) capability. In this case, the customer 10 may choose to select the enterprise's corporate name directory menu and be transferred to the desired individual within the enterprise when the correct name is found. Note that, rather than use a telephone, the caller may alternatively use a microphone associated with the caller's workstation 11, or a personal digital assistant with telephony capabilities, or any other communication device with telephony capabilities.
System 91 includes IVR 70, automatic speech recognition (ASR)/text-to-speech (TTS) resource server 50 with a hard disk drive 55 storing an adaptation database 56, a speech recognition engine 66, a text-to-speech engine 64, an adaptation engine 62, and a model adaptation agent 58, a PBX 40, a router/gateway 60, telephones 41, 42, 45 and a workstation for a human agent 80. The operator's workstation 80, IVR 70, ASR/TTS server 50 and PBX 40 communicate via LAN 90 to other servers within system 91. System 91 may also include unified messaging system 47 and workstation with IP softphone application 46, which also communicate via LAN 90.
The PBX (otherwise known as telephony server) 40 answers and transfers inbound and outbound telephone calls from the PSTN 20 or Internet 30 for internal users, such as, call center agents (not shown), operator 45, or employees (not shown). Calls are routed through the Internet 30 using the router/gateway 60. The telephony server 40 has analog and digital port boards (not shown) to support legacy analog 41 and digital telephones 42 as well as an IP board (not shown) to support IP desktop telephones 45 or IP softphone application software operating on a workstation 46. The telephony server software includes the APIs and drivers to communicate with other computers such as the management console (not shown), IVR 70, ASR/TTS server 50, etc.
The IVR 70 provides the appropriate dialog, i.e., a set menu of options that create navigation paths from one menu to another depending on the caller's menu choice. IVR 70 includes a CPU (not shown), memory (not shown), hard-disk drive (not shown), resource cards, such as to answer or transfer calls (not shown), and a LAN card (not shown), such as an Ethernet card. The IVR 70 hard-disk drive stores recorded prompts, platform software for the operation, administration and maintenance of the IVR, such as backup and speech administration, and interface software to communicate with other servers such as the ASR/TTS server and the telephony server 40. When the IVR 70 accesses and provides information from a web page, it uses a VXML (voice extensible markup language) interpreter (not shown). The VXML interpreter provides the functionality for parsing and processing the VXML documents that may be stored on a web server (not shown) or anywhere on the Internet 30.
The ASR/TTS resource server 50 includes the APIs and drivers to communicate with other computers such as the management console (not shown) to administer the ASR/TTS server 50, telephony server 40, IVR 70, router/gateway 60, and an operator's workstation 80, etc. The ASR/TTS resource server 50 also includes the software to provide the text-to-speech (TTS) resource and the automatic speech recognition (ASR) resource. The ASR/TTS resource server 50 communicates with the operator's workstation over the LAN using TCP/IP, for example.
The ASR/TTS server 50 has a CPU (not shown), memory (not shown), LAN card (not shown) and hard-disk drive 55. The hard-disk drive 55 also stores the speech recognition engine 66 (which extracts a user utterance from a sampled input voice stream, selects a speech model as a first tentative recognition result correctly characterizing the user utterance, generates a transcription of the utterance based on the selected speech model, and determines from a user's response, when presented with the first tentative recognition result, whether the first tentative recognition result correctly characterizes the utterance), the text-to-speech engine 64 (which converts the transcription into speech), the adaptation engine 62 (which retrains one or more speech models based on the user utterance and the transcription or a human operator-corrected version thereof), and the adaptation database 56 (which includes the various speech models particularly suited for one or more different voice-enabled applications that may reside on the ASR/TTS server 50, such as on-line banking, technical support, request for a name directory, accessing a desktop calendar or appointment scheduler, information from a web page, etc.). Other voice-enabled applications may include an individual calling from home or anywhere outside of the place of business to retrieve information such as voice, email or fax messages via his unified messaging system 47 and IVR 70. Alternatively, when customer 10 or an internal user within the enterprise is going to purchase items from a catalog, he may dial the main 800 telephone number for the catalog and wait for the next available agent to assist him with the order or use the voice response ordering system to order the item from the catalog. The adaptation database 56 also stores the vocabulary to support the various speech models. The ASR/TTS server 50 uses APIs and API calls to retrieve or store information within the adaptation database 56.
The ASR/TTS server 50 hard-disk drive 55 stores the ASR application which includes a model adaptation agent 58 as well as the maintenance, administration, and operating software. The ASR operating software allows customers to utilize voice-enabled applications such as those listed above. The speaker may either dial the IVR 70, dial the ASR/TTS server 50 directly, if configured in that manner, or speak directly to a voice input mechanism, such as a microphone, associated with an object utilizing an ASR/TTS application.
The ASR operating software also includes the model adaptation agent 58. The model adaptation agent 58 uses a human agent, such as an operator, to guarantee that only good data, i.e., data from which incorrectly recognized input has been filtered out, is provided to the adaptation engine to improve one or more speech models.
When the IVR 70 dialog requests and the customer provides voice input, the IVR interfaces with the ASR/TTS application to provide the voice response, i.e., audio stream to the ASR/TTS application. The model adaptation agent 58 has the drivers necessary to record and save the audio stream. This audio recording may be stored temporarily in the ASR/TTS memory (not shown) or stored on the ASR/TTS hard disk drive 55.
A management console (not shown) administers the telephony server 40, ASR/TTS resource server 50, IVR 70 and the router/gateway 60 and allows the administrator to retrieve reports regarding performance, system usage, etc. The management console (not shown) may use a command-line or graphical user interface to administer the various servers. Alternatively, the management console may have interface software and utilities, which allows an administrator to use an Internet browser to administer the various servers. In this case, the servers administered via the Internet browser also need to have the proper interface software to allow administration via an Internet browser.
In an alternate embodiment, the IVR 70 and ASR/TTS server 50 may be co-resident, i.e., configured on one server, and the telephony server 40 communicates directly with this co-resident IVR/ASR/TTS server. Operating these co-resident applications depends on the system performance requirements for the server and applications. For small enterprises with minimal ASR requests, which hence require fewer processing resources, operating these co-resident applications on a server may not affect the system performance of the applications or of the server. For large enterprises with presumably many more ASR requests, there may be a need to have the IVR and ASR/TTS applications operating on separate servers to maintain server performance levels.
FIG. 2 is a flow diagram of one embodiment of the present invention. In step 100, a customer dials the access telephone number for a business. In normal operations, the telephony server 40 answers a call and transfers the call to the IVR 70. If the customer uses speech recognition instead of DTMF to respond to the IVR dialog prompts, the IVR 70 uses API calls to communicate with the ASR application on the ASR/TTS server 50.
In step 110, the IVR 70 application initiates the appropriate voice-enabled dialog script particularly suited for the application requested, i.e., dialed by the customer. For example, the dialog script to retrieve names from a name directory would differ from the dialog script to transfer funds between bank accounts.
After the start of an IVR 70 dialog, the IVR 70 dialog prompt asks the customer, i.e., the speaker, to provide a verbal response. (FIG. 2, step 120). The ASR application instructs an ASR engine residing on the ASR/TTS server 50 to receive the verbal response, i.e., the audio stream, from the IVR 70. The ASR engine recognizes the audio stream as speech and the model adaptation agent 58 records the audio stream. (FIG. 2, step 120). This first recording is temporarily stored in the ASR/TTS server 50 in a storage device, such as memory (not shown) or hard-disk drive 55.
In a first stage of normal operation, the ASR engine receives the audio stream, i.e., voice response, and attempts to recognize the response, i.e., provide a transcription of the audio stream. The ASR engine provides the transcript to a text-to-speech (TTS) resource and uses the TTS resource to convert the transcript to speech, i.e., audio. The ASR application includes the API and drivers to communicate and send the audio back over the telephony server 40 voice channel established between the customer and the IVR, for presentation to the customer or user to confirm that the ASR engine correctly recognized the spoken input. (FIG. 2, step 130). Alternatively, the ASR application returns the speech to the software originating the ASR resource request.
In a second stage of normal operation, the ASR engine determines whether it correctly recognized the voice input, i.e., provided the customer a correct transcription of the audio stream. (FIG. 2, step 140). The ASR engine may determine that the speaker does not believe the audio was correctly recognized, i.e., an error is detected, based on the speaker's behavior. An error is detected, for example, when a speaker verbally confirms the first attempt at recognition was wrong by saying “No” as a response to the confirmation request. There are other methods to determine that the ASR engine did not correctly recognize the audio stream. For example, the speaker's voice level may get louder and more irate, or the customer may begin pressing a key on the touch-tone telephone to allow the customer to make another voice input, etc.
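The behavioral cues named above can be combined in a simple check, sketched here for illustration only. The names `ConfirmationTurn` and `error_detected`, the list of negative replies, and the 6 dB loudness threshold are all assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass

# Hypothetical set of replies treated as a verbal rejection.
NEGATIVE_REPLIES = {"no", "nope", "wrong", "incorrect"}


@dataclass
class ConfirmationTurn:
    reply_text: str             # recognized reply to the confirmation request
    loudness_db: float          # average level of this reply
    baseline_db: float          # average level of the speaker's earlier input
    dtmf_pressed: bool = False  # speaker pressed a touch-tone key


def error_detected(turn: ConfirmationTurn, loudness_jump_db: float = 6.0) -> bool:
    """Return True when the speaker's behavior suggests a misrecognition."""
    words = turn.reply_text.lower().split()
    if words and words[0].strip(".,!?") in NEGATIVE_REPLIES:
        return True                       # verbal rejection such as "No"
    if turn.dtmf_pressed:
        return True                       # key press to retry the input
    # A markedly louder reply is treated as a sign of frustration.
    return turn.loudness_db - turn.baseline_db >= loudness_jump_db


# Usage: the speaker answers "No." at an unchanged voice level.
rejected = error_detected(ConfirmationTurn("No.", 60.0, 60.0))
```

Any one cue triggers detection in this sketch; a real system might weigh the cues differently.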
The model adaptation agent 58 interfaces with the ASR engine to monitor and, if necessary, intervene to guarantee that the next transcription of the verbal response, i.e., the actual spelling of the word or utterance, is correct. Monitoring occurs when the ASR engine determines it may not have correctly recognized the customer's first voice response. Hence, in step 160, the model adaptation agent 58 alerts a human agent, for example an operator, by sending a pop-up message to the operator's console (not shown) notifying the operator that an audio stream and transcription are forthcoming, by sending a special tone to the operator's headset, or by using a special ring tone to notify the operator of the incoming information. In step 165, the model adaptation agent 58 communicates with the IVR 70 using API calls and instructs the IVR 70 dialog to request the speaker to repeat his verbal response. The model adaptation agent 58 records this second verbal response, i.e., audio stream, and provides the audio stream to the ASR engine. The model adaptation agent 58 provides the recorded audio stream of the second verbal response and a transcription of the audio stream from the ASR engine to the operator's workstation/console 80.
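The ordering of steps 160 and 165 can be sketched as a single routine. The function `start_monitoring` and its five callable parameters are hypothetical placeholders for the console, IVR, recorder and ASR engine interfaces, none of which are specified in the disclosure.

```python
def start_monitoring(alert_operator, reprompt_speaker, record_response,
                     transcribe, send_to_console):
    """Run steps 160 and 165 in order and return (audio, transcription)."""
    # Step 160: tell the operator that audio and a transcription are coming.
    alert_operator("Audio stream and transcription forthcoming")
    # Step 165: have the IVR dialog ask the speaker to repeat the response.
    reprompt_speaker()
    audio = record_response()         # second recording
    transcription = transcribe(audio)  # ASR engine's second attempt
    # Hand both the recording and its transcription to the console.
    send_to_console(audio, transcription)
    return audio, transcription


# Usage with stand-in callables that only log what happened.
log = []
audio, text = start_monitoring(
    alert_operator=lambda message: log.append(("alert", message)),
    reprompt_speaker=lambda: log.append(("reprompt",)),
    record_response=lambda: b"second-utterance-audio",
    transcribe=lambda recorded: "Avalon Labs",
    send_to_console=lambda recorded, shown: log.append(("console", shown)),
)
```

Passing the interfaces in as callables keeps the sketch independent of any particular alerting mechanism (pop-up, headset tone or ring tone).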
The operator workstation/console 80 allows the operator to view the transcription of the recognized utterance or word. The operator also hears the recording of the audio, i.e., the spoken utterance or word, using a headset (not shown) or via the telephone 41, 42, 45, 46 to determine whether the transcription is correct. The model adaptation agent 58 provides possible matches to the operator from the vocabulary stored in the adaptation database 56. If the transcription is not correct and there are several possible matches, the operator toggles through the choices and selects the best choice, or alternatively the operator edits the transcription if none of the choices is a match. The interface to the console/workstation 80 may use a command line or graphical user interface (GUI) to view and correct the transcription. Alternatively, the interface to the console/workstation may allow the use of an Internet browser to view the ASR transcription and correct the transcription if necessary.
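How the console might offer possible matches and accept the operator's decision is sketched below. This is an illustration only: it ranks vocabulary entries by spelling similarity using Python's `difflib`, whereas the disclosure does not say how matches are chosen. The function names, the cutoff value and the sample vocabulary are assumptions.

```python
import difflib


def possible_matches(asr_text, vocabulary, limit=3):
    """Return up to `limit` vocabulary entries spelled similarly to asr_text."""
    return difflib.get_close_matches(asr_text, vocabulary, n=limit, cutoff=0.3)


def resolve_transcription(asr_text, candidates, choice=None, edited_text=None):
    """Apply the operator's decision to the ASR transcription."""
    if edited_text is not None:
        return edited_text         # operator typed a correction
    if choice is not None:
        return candidates[choice]  # operator picked one of the choices
    return asr_text                # operator left the transcription unmodified


# Usage: the operator rejects the offered choices and edits the text.
vocabulary = ["Avaya", "Avalon Labs", "Papaya Limited"]
candidates = possible_matches("Avalon Labs", vocabulary)
final_text = resolve_transcription("Avalon Labs", candidates, edited_text="Avaya")
```

An edit takes precedence over a selected choice here, matching the case where none of the choices is a match.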
In step 170, the operator listens to the recording of the speaker's second verbal response and reviews the transcription of the verbal response to determine whether the transcription of the second verbal response is correct. If the transcription of the response is correct, the operator will not modify the transcription, i.e., the operator does not intervene. The transcription is sent to the TTS resource, which converts the transcription to speech. In step 220, the next dialog message asks the speaker to confirm the unmodified transcription converted to speech, such as “Was that X?” If the speaker confirms the unmodified transcription (step 230), the dialog continues (step 240) without on-line intervention by the operator. The model adaptation agent 58 instructs an adaptation engine on the ASR/TTS server 50 to use the first and second recordings of the speaker's verbal response along with the unmodified transcription of the second recording to update one or more speech models (step 210). Retraining the speech model with this information will allow the ASR engine to correctly recognize the utterance or word in the future. The newly adapted speech model is stored in the ASR/TTS server 50 hard-disk drive 55 adaptation database 56.
However, in step 220, the speaker may not confirm that the unmodified transcription of the second verbal response, i.e., word or utterance, is correct. (FIG. 2, step 230). In this case, the model adaptation agent 58 presents the transcription of the second verbal response and the voice recording of the second verbal response to the operator for reevaluation and correction (step 180). The operator corrects the transcription of the second verbal response. This may be simply correcting the text of the word or utterance. Alternatively, correcting the transcription may be more involved and include correcting the phoneme of the word or utterance, for example, depending on the sophistication of the editing tool on the operator's workstation 80. The operator temporarily stores the corrected transcription in the ASR/TTS server 50 hard-disk drive 55 or alternatively in any other storage means, such as memory (not shown) or flash card (not shown).
In step 190, the model adaptation agent 58 instructs the next IVR dialog message to ask the speaker to confirm the corrected transcription. If the speaker confirms the transcription is correct, the dialog continues. (FIG. 2, step 200). The model adaptation agent 58 sends the first and second recorded verbal responses and the corrected transcription of the second verbal response to the adaptation engine. The model adaptation agent 58 instructs the adaptation engine to retrain one or more speech models with this filtered information. The one or more updated speech models are stored in the ASR/TTS server 50 hard-disk drive 55 adaptation database 56.
In step 170, if the transcription of the verbal response is incorrect, the operator intervenes and the transcription of the misrecognized word or utterance is corrected (step 180). The operator temporarily stores the corrected transcription in the ASR/TTS server 50 hard-disk drive 55 or memory (not shown). The corrected transcription is sent to the TTS resource, which converts the transcription to speech. In step 190, the model adaptation agent 58 instructs the next IVR dialog message to ask the speaker to confirm the corrected transcription that was converted to speech. If the speaker confirms the newly corrected transcription, the dialog continues (step 200) and the speaker does not hang up in frustration because only one misrecognition occurred. In step 210, the model adaptation agent sends the first and second recordings of the misrecognized verbal response, i.e., word or utterance, along with the corrected transcription to the adaptation engine. The model adaptation agent 58 instructs an adaptation engine to use the first and second recorded verbal responses along with the corrected transcription to update one or more speech models. Retraining one or more speech models with this information will allow the ASR engine to correctly recognize the utterance or word in the future. The updated speech model is stored in the ASR/TTS server 50 hard-disk drive 55 adaptation database 56.
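The three paths through steps 170 to 240 described above can be summarized in one sketch that produces the data handed to the adaptation engine. The names `AdaptationSample` and `review_second_response` and the callable parameters are hypothetical. The disclosure does not say what happens when the speaker rejects a corrected transcription; returning `None` in that case is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class AdaptationSample:
    recordings: Tuple[bytes, bytes]  # first and second recorded responses
    transcription: str               # confirmed text for retraining
    operator_corrected: bool         # False when left unmodified


def review_second_response(
    first_rec: bytes,
    second_rec: bytes,
    asr_text: str,
    operator_accepts: Callable[[bytes, str], bool],  # step 170
    operator_corrects: Callable[[bytes, str], str],  # step 180
    speaker_confirms: Callable[[str], bool],         # steps 190 and 220
) -> Optional[AdaptationSample]:
    """Return confirmed data for step 210, or None if nothing was confirmed."""
    recordings = (first_rec, second_rec)
    # Step 170: the operator judges the ASR transcription correct, so the
    # unmodified text is read back to the speaker (steps 220 and 230).
    if operator_accepts(second_rec, asr_text) and speaker_confirms(asr_text):
        return AdaptationSample(recordings, asr_text, False)
    # Step 180: the operator corrects the text, either at once or after the
    # speaker rejected the unmodified transcription.
    corrected = operator_corrects(second_rec, asr_text)
    if speaker_confirms(corrected):                  # step 190
        return AdaptationSample(recordings, corrected, True)
    return None  # assumption: unconfirmed data is withheld from retraining


# Usage: the second response is misrecognized and the operator fixes it.
sample = review_second_response(
    b"first-audio", b"second-audio", "Avalon Labs",
    operator_accepts=lambda audio, text: text == "Avaya",
    operator_corrects=lambda audio, text: "Avaya",
    speaker_confirms=lambda text: text == "Avaya",
)
```

Only transcriptions the speaker has confirmed reach the returned sample, which reflects the filtering role the description assigns to the operator.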
Hence, using the example provided in the background of the invention where several misrecognitions occurred, the invention rapidly improves one or more speech models and the performance of the speech recognition system as shown:
IVR dialog: “Please state the name of the company you wish to find.”
Customer: “Avaya.”
(This first voice response is recorded by the model adaptation agent 58 and temporarily stored on the ASR/TTS server. FIG. 2, step 120.)
IVR dialog: “Was that Papaya Limited?”
(This is what the ASR engine recognized as the customer's voice response. FIG. 2, step 130.)
Customer: “No.”
(ASR application alerts an operator to monitor the call. FIG. 2, steps 140, 160. The ASR application requests the IVR dialog to repeat the question, which requires the customer to repeat the verbal response. FIG. 2, step 165.)
IVR dialog: “Please state the name of the company you wish to find.”
Customer: “Avaya.”
(The model adaptation agent 58 records the second verbal response and temporarily stores the recording on the ASR/TTS server 50. The model adaptation agent 58 sends the recorded audio stream to the operator workstation 80. The operator hears the recorded audio stream and sees on the console the ASR engine's transcription of the audio stream. The operator sees the ASR engine has misrecognized the second verbal response as “Avalon Labs” and corrects the transcription to “Avaya.” FIG. 2, steps 170, 180. The operator stores the corrected transcription on the ASR/TTS server 50. The model adaptation agent 58 sends the corrected transcription to the TTS resource. The TTS resource converts the corrected transcription to speech and sends the speech to the IVR.)
IVR dialog: “Was that Avaya?”
(The model adaptation agent 58 instructs the dialog to request the customer to confirm the corrected transcription. FIG. 2, step 190.)
Customer: “Yes.”
(Customer confirms corrected transcription. Now the dialog proceeds as usual. FIG. 2, step 200. The model adaptation agent 58 sends the first and second recorded voice responses along with the corrected transcription to the adaptation engine to update one or more speech models. FIG. 2, step 210. In the future, the speech recognizer should recognize the word “Avaya.”)
The present invention, in various embodiments, includes components, methods, processes, systems and/or apparatus substantially as depicted and described herein, including various embodiments, subcombinations, and subsets thereof. Those of skill in the art will understand how to make and use the present invention after understanding the present disclosure. The present invention, in various embodiments, includes providing devices and processes in the absence of items not depicted and/or described herein or in various embodiments hereof, including in the absence of such items as may have been used in previous devices or processes, e.g., for improving performance, achieving ease and/or reducing cost of implementation.
The foregoing discussion of the invention has been presented for purposes of illustration and description. The foregoing is not intended to limit the invention to the form or forms disclosed herein. Although the description of the invention has included description of one or more embodiments and certain variations and modifications, other variations and modifications are within the scope of the invention, e.g., as may be within the skill and knowledge of those in the art, after understanding the present disclosure. It is intended to obtain rights which include alternative embodiments to the extent permitted, including alternative, interchangeable and/or equivalent structures, functions, ranges or steps to those claimed, whether or not such alternative, interchangeable and/or equivalent structures, functions, ranges or steps are disclosed herein, and without intending to publicly dedicate any patentable subject matter.

Claims (24)

1. A method to retrain an automatic speech recognition system, in which automatic speech recognition system a plurality of speech models is stored, the method comprising:
(a) extracting, by the automatic speech recognition system, a first user utterance from a sampled first input voice stream received from a user in response to a query;
(b) selecting, by the automatic speech recognition system and based on the first user utterance, a first speech model from among the plurality of speech models, the first speech model producing a first tentative recognition result corresponding to the first user utterance;
(c) informing the user of the first tentative recognition result;
(d) determining, by the automatic speech recognition system, from the user's response whether the first tentative recognition result was correct;
(e) performing the following steps when the first tentative recognition result is not correct:
(i) requesting the user to repeat the response to the query;
(ii) extracting, by the automatic speech recognition system, a second user utterance from a sampled second input voice stream received from the user in response to the requesting step;
(iii) selecting, by the automatic speech recognition system and based on the second user utterance, a second speech model, different than the first speech model, the second speech model producing a second tentative recognition result corresponding to the second user utterance; and
(iv) determining, by a human operator, when the second speech model correctly corresponds to at least one of the first and second user utterances,
wherein the first and second speech models are selected from a plurality of speech models.
2. The method of claim 1, wherein the plurality of speech models are developed from a large vocabulary stored in a speech recognition adaptation database; and wherein step (e) further comprises:
(v) retraining the first speech model using the second speech model.
3. The method of claim 1, wherein the first tentative recognition result is at least one word.
4. The method of claim 1, wherein, when the first tentative recognition result is correct, steps (i)-(v) are not performed.
5. The method of claim 1, wherein the automatic speech recognition system provides a transcription of at least one of the first and second user utterances and wherein the informing step comprises:
converting, by a text-to-speech resource, the transcription into speech; and
communicating, by an interactive voice response unit, the speech to the user.
6. The method of claim 5, wherein the determining, by a human operator, step comprises:
displaying the transcription to the human operator;
playing a recording of the at least one of the first and second user utterances to the human operator; and
selecting, by the human operator, a third speech model as correctly corresponding to the recording, based on the transcription and recording.
7. The method of claim 1, further comprising:
selecting, by the human operator, a third speech model that correctly corresponds to the second user utterance, when the second speech model does not correctly correspond to the second user utterance.
8. The method of claim 7, further comprising:
retraining at least one speech model using said third speech model.
9. A computer readable medium comprising processor executable instructions that, when executed, perform the steps of claim 1.
10. A method to retrain an automatic speech recognition system, in which automatic speech recognition system a plurality of speech models is stored, the method comprising:
(a) extracting, by the automatic speech recognition system, a first user utterance from a first input voice stream from a user, the first user utterance being a response to a query;
(b) selecting, by the automatic speech recognition system, a first speech model, the first speech model producing a first tentative recognition result based on the first user utterance;
(c) determining, by the automatic speech recognition system, that the first tentative recognition result does not correctly characterize the first user utterance;
(d) selecting, by a human operator and based on at least one of the first user utterance and a second user utterance received from the user, a second speech model as correctly characterizing the first user utterance, the second speech model producing a second tentative recognition result; and
(e) retraining the first speech model using at least one of the first and second user utterances and the second tentative recognition result.
11. The method of claim 10, wherein, when the first tentative recognition result correctly characterizes the first user utterance, not performing the selecting step (d).
12. The method of claim 10, wherein the plurality of speech models are developed from a large vocabulary stored in a speech recognition adaptation database and wherein the determining step comprises:
(C1) informing the user of the first tentative recognition result; and
(C2) determining from the first user's response whether the first tentative recognition result correctly characterizes the first user utterance.
13. The method of claim 12, further comprising before the human operator selecting step (d):
(f) requesting the first user to repeat the response to the query;
(g) extracting the second user utterance from a sampled second input voice stream received from the first user in response to the requesting step (e); and
(h) selecting the second speech model producing the second tentative recognition result corresponding to the second user utterance.
14. The method of claim 10, wherein the automatic speech recognition system generates a transcription of the first user utterance and wherein the determining step (c) comprises:
(C1) converting, by a text-to-speech resource, the transcription into speech; and
(C2) communicating, by an interactive voice response unit, the speech to the user.
15. The method of claim 14, wherein the selecting step (d) comprises:
(D1) displaying the transcription to the human operator;
(D2) playing a recording of the first user utterance to the human operator; and
(D3) selecting, by the human operator and based on the transcription and recording, a third speech model as correctly corresponding to the recording.
16. The method of claim 14, wherein an adaptation agent is operable to provide an adaptation engine improved data to retrain at least one speech model, the improved data comprising said first user utterance and at least one of (i) a corrected transcription of said first user utterance when said human operator corrects said transcription; (ii) an unmodified transcription of said first user utterance when said human operator does not correct said transcription.
17. A computer readable medium comprising instructions that, when executed, perform the steps of claim 10.
18. The method of claim 10, wherein the first tentative recognition result is at least one word.
19. A speech recognition system comprising:
a speech recognition resource operable to extract a first user utterance from a first input voice stream from a user, the first user utterance being a response to a query; select a first speech model producing a first tentative recognition result characterizing the first user utterance; and
determine that the first tentative recognition result does not correctly characterize the first user utterance;
a model adaptation agent operable, when the first tentative recognition result does not correctly characterize the first user utterance, to alert a human operator, based on the first user utterance, to select a second speech model, different than the first speech model, to produce a second tentative recognition result correctly characterizing the first user utterance,
wherein the first and second speech models are selected from a plurality of speech models.
20. The system of claim 19, further comprising:
an interactive voice response unit operable to inform the user of the first tentative recognition result and wherein the speech recognition resource is operable to determine from the first user's response whether the first tentative recognition result correctly characterizes the first user utterance; and further comprising:
an adaptation engine operable to retrain at least one speech model using at least the second tentative recognition result.
21. The system of claim 19, further comprising:
an interactive voice response unit operable to request the first user to repeat the response to the query; and wherein the automatic speech recognition system is operable to extract a second user utterance from a sampled second input voice stream received from the first user in response to the request and select a third speech model to produce a third tentative recognition result corresponding to the second user utterance.
22. The system of claim 19 wherein, when the first tentative recognition result correctly characterizes the first user utterance, the adaptation engine does not alert the human operator.
23. The system of claim 19, wherein the speech recognition resource generates a transcription of the first user utterance and further comprising:
a text-to-speech resource operable to convert the transcription into speech; and
an interactive voice response unit operable to communicate the speech to the user.
24. The system of claim 23, wherein the adaptation agent is operable to display the transcription to the human operator and play a recording of the first user utterance to the human operator and wherein the human operator, based on the transcription and recording, selects a third speech model as correctly corresponding to the recording.
US10/756,669 2004-01-12 2004-01-12 Transparent monitoring and intervention to improve automatic adaptation of speech models Active 2027-07-28 US7660715B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/756,669 US7660715B1 (en) 2004-01-12 2004-01-12 Transparent monitoring and intervention to improve automatic adaptation of speech models

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/756,669 US7660715B1 (en) 2004-01-12 2004-01-12 Transparent monitoring and intervention to improve automatic adaptation of speech models

Publications (1)

Publication Number Publication Date
US7660715B1 true US7660715B1 (en) 2010-02-09

Family

ID=41646514

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/756,669 Active 2027-07-28 US7660715B1 (en) 2004-01-12 2004-01-12 Transparent monitoring and intervention to improve automatic adaptation of speech models

Country Status (1)

Country Link
US (1) US7660715B1 (en)

US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US10769385B2 (en) 2013-06-09 2020-09-08 Apple Inc. System and method for inferring user intent from speech inputs
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US10887452B2 (en) 2018-10-25 2021-01-05 Verint Americas Inc. System architecture for fraud detection
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US20210049927A1 (en) * 2019-08-13 2021-02-18 Vanderbilt University System, method and computer program product for determining a reading error distance metric
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
US10942702B2 (en) 2016-06-11 2021-03-09 Apple Inc. Intelligent device arbitration and control
US10956666B2 (en) 2015-11-09 2021-03-23 Apple Inc. Unconventional virtual assistant interactions
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US11017778B1 (en) 2018-12-04 2021-05-25 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11048473B2 (en) 2013-06-09 2021-06-29 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US11069336B2 (en) 2012-03-02 2021-07-20 Apple Inc. Systems and methods for name pronunciation
US11115521B2 (en) 2019-06-20 2021-09-07 Verint Americas Inc. Systems and methods for authentication and fraud detection
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11127397B2 (en) 2015-05-27 2021-09-21 Apple Inc. Device voice control
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11170761B2 (en) 2018-12-04 2021-11-09 Sorenson Ip Holdings, Llc Training of speech recognition systems
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US20220005478A1 (en) * 2009-02-27 2022-01-06 Nec Corporation Mobile wireless communications device with speech to text conversion and related methods
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US11227606B1 (en) * 2019-03-31 2022-01-18 Medallia, Inc. Compact, verifiable record of an audio communication and method for making same
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
US11314942B1 (en) * 2017-10-27 2022-04-26 Interactions Llc Accelerating agent performance in a natural language processing system
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11350253B2 (en) 2011-06-03 2022-05-31 Apple Inc. Active transport based notifications
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11398239B1 (en) 2019-03-31 2022-07-26 Medallia, Inc. ASR-enhanced speech compression
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11488604B2 (en) 2020-08-19 2022-11-01 Sorenson Ip Holdings, Llc Transcription of audio
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11501767B2 (en) * 2017-01-23 2022-11-15 Audi Ag Method for operating a motor vehicle having an operating device
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US11538128B2 (en) 2018-05-14 2022-12-27 Verint Americas Inc. User interface for fraud alert management
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11868453B2 (en) 2019-11-07 2024-01-09 Verint Americas Inc. Systems and methods for customer authentication based on audio-of-interest
WO2024018598A1 (en) * 2022-07-21 2024-01-25 Nttテクノクロス株式会社 Information processing system, information processing method, and program
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US12010262B2 (en) 2013-08-06 2024-06-11 Apple Inc. Auto-activating smart responses based on activities from remote devices
US12035070B2 (en) 2020-02-21 2024-07-09 Ultratec, Inc. Caption modification and augmentation systems and methods for use by hearing assisted user

Citations (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0076687A1 (en) 1981-10-05 1983-04-13 Signatron, Inc. Speech intelligibility enhancement system and method
US4468804A (en) 1982-02-26 1984-08-28 Signatron, Inc. Speech enhancement techniques
EP0140249A1 (en) 1983-10-13 1985-05-08 Texas Instruments Incorporated Speech analysis/synthesis with energy normalization
US4696039A (en) 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
US4852170A (en) 1986-12-18 1989-07-25 R & D Associates Real time computer speech recognition system
EP0360265A2 (en) 1988-09-21 1990-03-28 Nec Corporation Communication system capable of improving a speech quality by classifying speech signals
US5206903A (en) 1990-12-26 1993-04-27 At&T Bell Laboratories Automatic call distribution based on matching required skills with agents skills
US5583969A (en) 1992-04-28 1996-12-10 Technology Research Association Of Medical And Welfare Apparatus Speech signal processing apparatus for amplifying an input signal based upon consonant features of the signal
US5634086A (en) * 1993-03-12 1997-05-27 Sri International Method and apparatus for voice-interactive language instruction
US5644680A (en) * 1994-04-14 1997-07-01 Northern Telecom Limited Updating markov models based on speech input and additional information for automated telephone directory assistance
US5684872A (en) 1995-07-21 1997-11-04 Lucent Technologies Inc. Prediction of a caller's motivation as a basis for selecting treatment of an incoming call
JPH10124089A (en) 1996-10-24 1998-05-15 Sony Corp Processor and method for speech signal processing and device and method for expanding voice bandwidth
US5802149A (en) * 1996-04-05 1998-09-01 Lucent Technologies Inc. On-line training of an automated-dialing directory
US5828747A (en) 1997-01-28 1998-10-27 Lucent Technologies Inc. Call distribution based on agent occupancy
US5905793A (en) 1997-03-07 1999-05-18 Lucent Technologies Inc. Waiting-call selection based on anticipated wait times
US5982873A (en) 1997-03-07 1999-11-09 Lucent Technologies Inc. Waiting-call selection based on objectives
WO2000022611A1 (en) 1998-10-09 2000-04-20 Hejna Donald J Jr Method and apparatus to prepare listener-interest-filtered works
US6064731A (en) 1998-10-29 2000-05-16 Lucent Technologies Inc. Arrangement for improving retention of call center's customers
US6084954A (en) 1997-09-30 2000-07-04 Lucent Technologies Inc. System and method for correlating incoming and outgoing telephone calls using predictive logic
US6088441A (en) 1997-12-17 2000-07-11 Lucent Technologies Inc. Arrangement for equalizing levels of service among skills
US6122614A (en) * 1998-11-20 2000-09-19 Custom Speech Usa, Inc. System and method for automating transcription services
US6151571A (en) 1999-08-31 2000-11-21 Andersen Consulting System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters
US6163607A (en) 1998-04-09 2000-12-19 Avaya Technology Corp. Optimizing call-center performance by using predictive data to distribute agents among calls
US6178400B1 (en) 1998-07-22 2001-01-23 At&T Corp. Method and apparatus for normalizing speech to facilitate a telephone call
US6192122B1 (en) 1998-02-12 2001-02-20 Avaya Technology Corp. Call center agent selection that optimizes call wait times
US6243680B1 (en) * 1998-06-15 2001-06-05 Nortel Networks Limited Method and apparatus for obtaining a transcription of phrases through text and spoken utterances
US6259969B1 (en) 1997-06-04 2001-07-10 Nativeminds, Inc. System and method for automatically verifying the performance of a virtual robot
US6275991B1 (en) 1996-02-06 2001-08-14 Fca Corporation IR transmitter with integral magnetic-stripe ATM type credit card reader and method therefor
US6275806B1 (en) 1999-08-31 2001-08-14 Andersen Consulting, Llp System method and article of manufacture for detecting emotion in voice signals by utilizing statistics for voice signal parameters
US6278777B1 (en) 1998-03-12 2001-08-21 Ser Solutions, Inc. System for managing agent assignments background of the invention
US6292550B1 (en) 1998-06-01 2001-09-18 Avaya Technology Corp. Dynamic call vectoring
US6314165B1 (en) * 1998-04-30 2001-11-06 Matsushita Electric Industrial Co., Ltd. Automated hotel attendant using speech recognition
US20020019737A1 (en) * 1999-02-19 2002-02-14 Stuart Robert O. Data retrieval assistance system and method utilizing a speech recognition system and a live operator
US6353810B1 (en) 1999-08-31 2002-03-05 Accenture Llp System, method and article of manufacture for an emotion detection system improving emotion recognition
US6363346B1 (en) 1999-12-22 2002-03-26 Ncr Corporation Call distribution system inferring mental or physiological state
US6374221B1 (en) * 1999-06-22 2002-04-16 Lucent Technologies Inc. Automatic retraining of a speech recognizer while using reliable transcripts
US6389132B1 (en) 1999-10-13 2002-05-14 Avaya Technology Corp. Multi-tasking, web-based call center
US6408273B1 (en) 1998-12-04 2002-06-18 Thomson-Csf Method and device for the processing of sounds for auditory correction for hearing impaired individuals
US6427137B2 (en) 1999-08-31 2002-07-30 Accenture Llp System, method and article of manufacture for a voice analysis system that detects nervousness for preventing fraud
US6463415B2 (en) 1999-08-31 2002-10-08 Accenture Llp 69voice authentication system and method for regulating border crossing
US6480826B2 (en) 1999-08-31 2002-11-12 Accenture Llp System and method for a telephonic emotion detection that provides operator feedback
US20030191639A1 (en) * 2002-04-05 2003-10-09 Sam Mazza Dynamic and adaptive selection of vocabulary and acoustic models based on a call context for speech recognition
US6697457B2 (en) 1999-08-31 2004-02-24 Accenture Llp Voice messaging system that organizes voice messages based on detected emotion
WO2004056086A2 (en) 2002-12-16 2004-07-01 Koninklijke Philips Electronics N.V. Method and apparatus for selectable rate playback without speech distortion
US6766014B2 (en) 2001-01-09 2004-07-20 Avaya Technology Corp. Customer service by batch
US20040148161A1 (en) 2003-01-28 2004-07-29 Das Sharmistha S. Normalization of speech accent
US20040215453A1 (en) 2003-04-25 2004-10-28 Orbach Julian J. Method and apparatus for tailoring an interactive voice response experience based on speech characteristics
US6823312B2 (en) 2001-01-18 2004-11-23 International Business Machines Corporation Personalized system for providing improved understandability of received speech
US6839669B1 (en) * 1998-11-05 2005-01-04 Scansoft, Inc. Performing actions identified in recognized speech
US6847714B2 (en) 2002-11-19 2005-01-25 Avaya Technology Corp. Accent-based matching of a communicant with a call-center agent
US20050065789A1 (en) 2003-09-23 2005-03-24 Sherif Yacoub System and method with automated speech recognition engines
US6889186B1 (en) 2000-06-01 2005-05-03 Avaya Technology Corp. Method and apparatus for improving the intelligibility of digitally compressed speech
US20050094822A1 (en) 2005-01-08 2005-05-05 Robert Swartz Listener specific audio reproduction system
US6940951B2 (en) * 2001-01-23 2005-09-06 Ivoice, Inc. Telephone application programming interface-based, speech enabled automatic telephone dialer using names
US6999563B1 (en) * 2000-08-21 2006-02-14 Volt Delta Resources, Llc Enhanced directory assistance automation
US20060036437A1 (en) 2004-08-12 2006-02-16 Sbc Knowledge Ventures, Lp System and method for targeted tuning module of a speech recognition system
US7065485B1 (en) 2002-01-09 2006-06-20 At&T Corp Enhancing speech intelligibility using variable-rate time-scale modification
US20060252376A1 (en) 2005-04-21 2006-11-09 Kenny Fok Methods and apparatus for monitoring voice quality on a wireless communication device
US20070038455A1 (en) 2005-08-09 2007-02-15 Murzina Marina V Accent detection and correction system
US7180997B2 (en) 2002-09-06 2007-02-20 Cisco Technology, Inc. Method and system for improving the intelligibility of a moderator during a multiparty communication session
US7222074B2 (en) 2001-06-20 2007-05-22 Guojun Zhou Psycho-physical state sensitive voice dialogue system
US7222075B2 (en) 1999-08-31 2007-05-22 Accenture Llp Detecting emotions using voice signal analysis
US7267652B2 (en) 2003-04-10 2007-09-11 Vivometrics, Inc. Systems and methods for respiratory event detection

Patent Citations (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0076687A1 (en) 1981-10-05 1983-04-13 Signatron, Inc. Speech intelligibility enhancement system and method
US4468804A (en) 1982-02-26 1984-08-28 Signatron, Inc. Speech enhancement techniques
EP0140249A1 (en) 1983-10-13 1985-05-08 Texas Instruments Incorporated Speech analysis/synthesis with energy normalization
US4696039A (en) 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
US4852170A (en) 1986-12-18 1989-07-25 R & D Associates Real time computer speech recognition system
EP0360265A2 (en) 1988-09-21 1990-03-28 Nec Corporation Communication system capable of improving a speech quality by classifying speech signals
US5018200A (en) 1988-09-21 1991-05-21 Nec Corporation Communication system capable of improving a speech quality by classifying speech signals
CA1333425C (en) 1988-09-21 1994-12-06 Kazunori Ozawa Communication system capable of improving a speech quality by classifying speech signals
US5206903A (en) 1990-12-26 1993-04-27 At&T Bell Laboratories Automatic call distribution based on matching required skills with agents skills
US5583969A (en) 1992-04-28 1996-12-10 Technology Research Association Of Medical And Welfare Apparatus Speech signal processing apparatus for amplifying an input signal based upon consonant features of the signal
US5634086A (en) * 1993-03-12 1997-05-27 Sri International Method and apparatus for voice-interactive language instruction
US5644680A (en) * 1994-04-14 1997-07-01 Northern Telecom Limited Updating markov models based on speech input and additional information for automated telephone directory assistance
US5684872A (en) 1995-07-21 1997-11-04 Lucent Technologies Inc. Prediction of a caller's motivation as a basis for selecting treatment of an incoming call
US6275991B1 (en) 1996-02-06 2001-08-14 Fca Corporation IR transmitter with integral magnetic-stripe ATM type credit card reader and method therefor
US5802149A (en) * 1996-04-05 1998-09-01 Lucent Technologies Inc. On-line training of an automated-dialing directory
JPH10124089A (en) 1996-10-24 1998-05-15 Sony Corp Processor and method for speech signal processing and device and method for expanding voice bandwidth
US5828747A (en) 1997-01-28 1998-10-27 Lucent Technologies Inc. Call distribution based on agent occupancy
US5905793A (en) 1997-03-07 1999-05-18 Lucent Technologies Inc. Waiting-call selection based on anticipated wait times
US5982873A (en) 1997-03-07 1999-11-09 Lucent Technologies Inc. Waiting-call selection based on objectives
US6259969B1 (en) 1997-06-04 2001-07-10 Nativeminds, Inc. System and method for automatically verifying the performance of a virtual robot
US6084954A (en) 1997-09-30 2000-07-04 Lucent Technologies Inc. System and method for correlating incoming and outgoing telephone calls using predictive logic
US6088441A (en) 1997-12-17 2000-07-11 Lucent Technologies Inc. Arrangement for equalizing levels of service among skills
US6192122B1 (en) 1998-02-12 2001-02-20 Avaya Technology Corp. Call center agent selection that optimizes call wait times
US6278777B1 (en) 1998-03-12 2001-08-21 Ser Solutions, Inc. System for managing agent assignments background of the invention
US6163607A (en) 1998-04-09 2000-12-19 Avaya Technology Corp. Optimizing call-center performance by using predictive data to distribute agents among calls
US6173053B1 (en) 1998-04-09 2001-01-09 Avaya Technology Corp. Optimizing call-center performance by using predictive data to distribute calls among agents
US6314165B1 (en) * 1998-04-30 2001-11-06 Matsushita Electric Industrial Co., Ltd. Automated hotel attendant using speech recognition
US6292550B1 (en) 1998-06-01 2001-09-18 Avaya Technology Corp. Dynamic call vectoring
US6243680B1 (en) * 1998-06-15 2001-06-05 Nortel Networks Limited Method and apparatus for obtaining a transcription of phrases through text and spoken utterances
US6178400B1 (en) 1998-07-22 2001-01-23 At&T Corp. Method and apparatus for normalizing speech to facilitate a telephone call
US6801888B2 (en) 1998-10-09 2004-10-05 Enounce Incorporated Method and apparatus to prepare listener-interest-filtered works
WO2000022611A1 (en) 1998-10-09 2000-04-20 Hejna Donald J Jr Method and apparatus to prepare listener-interest-filtered works
US6064731A (en) 1998-10-29 2000-05-16 Lucent Technologies Inc. Arrangement for improving retention of call center's customers
US6839669B1 (en) * 1998-11-05 2005-01-04 Scansoft, Inc. Performing actions identified in recognized speech
US6122614A (en) * 1998-11-20 2000-09-19 Custom Speech Usa, Inc. System and method for automating transcription services
US6408273B1 (en) 1998-12-04 2002-06-18 Thomson-Csf Method and device for the processing of sounds for auditory correction for hearing impaired individuals
US20020019737A1 (en) * 1999-02-19 2002-02-14 Stuart Robert O. Data retrieval assistance system and method utilizing a speech recognition system and a live operator
US6374221B1 (en) * 1999-06-22 2002-04-16 Lucent Technologies Inc. Automatic retraining of a speech recognizer while using reliable transcripts
US6353810B1 (en) 1999-08-31 2002-03-05 Accenture Llp System, method and article of manufacture for an emotion detection system improving emotion recognition
US6427137B2 (en) 1999-08-31 2002-07-30 Accenture Llp System, method and article of manufacture for a voice analysis system that detects nervousness for preventing fraud
US6463415B2 (en) 1999-08-31 2002-10-08 Accenture Llp 69voice authentication system and method for regulating border crossing
US6480826B2 (en) 1999-08-31 2002-11-12 Accenture Llp System and method for a telephonic emotion detection that provides operator feedback
US7222075B2 (en) 1999-08-31 2007-05-22 Accenture Llp Detecting emotions using voice signal analysis
US6697457B2 (en) 1999-08-31 2004-02-24 Accenture Llp Voice messaging system that organizes voice messages based on detected emotion
US6275806B1 (en) 1999-08-31 2001-08-14 Andersen Consulting, Llp System method and article of manufacture for detecting emotion in voice signals by utilizing statistics for voice signal parameters
US6151571A (en) 1999-08-31 2000-11-21 Andersen Consulting System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters
US6389132B1 (en) 1999-10-13 2002-05-14 Avaya Technology Corp. Multi-tasking, web-based call center
US6363346B1 (en) 1999-12-22 2002-03-26 Ncr Corporation Call distribution system inferring mental or physiological state
US6889186B1 (en) 2000-06-01 2005-05-03 Avaya Technology Corp. Method and apparatus for improving the intelligibility of digitally compressed speech
US6999563B1 (en) * 2000-08-21 2006-02-14 Volt Delta Resources, Llc Enhanced directory assistance automation
US6766014B2 (en) 2001-01-09 2004-07-20 Avaya Technology Corp. Customer service by batch
US6823312B2 (en) 2001-01-18 2004-11-23 International Business Machines Corporation Personalized system for providing improved understandability of received speech
US6940951B2 (en) * 2001-01-23 2005-09-06 Ivoice, Inc. Telephone application programming interface-based, speech enabled automatic telephone dialer using names
US7222074B2 (en) 2001-06-20 2007-05-22 Guojun Zhou Psycho-physical state sensitive voice dialogue system
US7065485B1 (en) 2002-01-09 2006-06-20 At&T Corp Enhancing speech intelligibility using variable-rate time-scale modification
US20030191639A1 (en) * 2002-04-05 2003-10-09 Sam Mazza Dynamic and adaptive selection of vocabulary and acoustic models based on a call context for speech recognition
US7180997B2 (en) 2002-09-06 2007-02-20 Cisco Technology, Inc. Method and system for improving the intelligibility of a moderator during a multiparty communication session
US6847714B2 (en) 2002-11-19 2005-01-25 Avaya Technology Corp. Accent-based matching of a communicant with a call-center agent
WO2004056086A2 (en) 2002-12-16 2004-07-01 Koninklijke Philips Electronics N.V. Method and apparatus for selectable rate playback without speech distortion
US20040148161A1 (en) 2003-01-28 2004-07-29 Das Sharmistha S. Normalization of speech accent
US7267652B2 (en) 2003-04-10 2007-09-11 Vivometrics, Inc. Systems and methods for respiratory event detection
US20040215453A1 (en) 2003-04-25 2004-10-28 Orbach Julian J. Method and apparatus for tailoring an interactive voice response experience based on speech characteristics
US20050065789A1 (en) 2003-09-23 2005-03-24 Sherif Yacoub System and method with automated speech recognition engines
US20060036437A1 (en) 2004-08-12 2006-02-16 Sbc Knowledge Ventures, Lp System and method for targeted tuning module of a speech recognition system
US20050094822A1 (en) 2005-01-08 2005-05-05 Robert Swartz Listener specific audio reproduction system
US20060252376A1 (en) 2005-04-21 2006-11-09 Kenny Fok Methods and apparatus for monitoring voice quality on a wireless communication device
US20070038455A1 (en) 2005-08-09 2007-02-15 Murzina Marina V Accent detection and correction system

Non-Patent Citations (23)

* Cited by examiner, † Cited by third party
Title
Arslan, Levent M., "Foreign Accent Classification in American English," thesis, pp. 1-200, Department of Electrical Computer Engineering, Duke University, 1996.
Arslan, Levent M., et al., "Language Accent Classification in American English," Robust Speech Processing Laboratory, Department of Electrical Engineering, Durham, North Carolina, Technical Report RSPL-96-7 (1996).
Background of The Invention for the above-captioned application (previously provided).
Entwistle, "Training Methods and Enrollment Techniques to Improve the Performance of Automated Speech Recognition Systems Under Conditions of Human Exertion", A Dissertation Submitted in Partial Fulfillment of The Requirements for the Degree of Doctor of Philosophy, University of South Dakota, Jul. 2005.
Entwistle, The performance of automated speech recognition systems under adverse conditions of human exertion. Int. J. Hum.-Comput. Interact. 16 (2003) (2), pp. 127-140.
Hansen, John H.L., et al., "Foreign Accent Classification Using Source Generator Based Prosodic Features," IEEE Proc. ICASSP, vol. 1, Detroit U.S.A., (1995), pp. 836-839.
Hosom, John-Paul, et al., "Training Neural Networks for Speech Recognition," Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology (Feb. 2, 1999), 51 pages.
Jackson, Philip J.B., et al., "Aero-Acoustic Modeling of Voiced and Unvoiced Fricatives Based on MRI Data," University of Birmingham and University of Southampton, (undated), 4 pages.
Kirriemuir, John, "Speech Recognition Technologies," TSW 03-03 (Mar. 2003), 13 pages.
Lamel, L.F., et al., "Language Identification Using Phone-based Acoustic Likelihoods," ICASSP-94.
Landauer et al., "An Introduction to Latent Semantic Analysis", Discourse Processes, 1998, 41 pages.
Lin et al., "Phoneme-less Hierarchical Accent Classification", HP Laboratories Palo Alto, Oct. 4, 2004, 5 pages.
Loizou, Philip, "Speech Production and Perception," EE 6362 Lecture Notes (Fall 2000), pp. 1-30.
Michaelis, "Speech Digitization and Compression", In W. Karwowski (Ed.), International Encyclopedia of Ergonomics and Human Factors. London: Taylor & Francis, 2001, 683-686.
Novak, D Cuesta-Frau, and L. Lhotska: Speech recognition methods applied to biomedical signals processing. Engineering in Medicine and Biology Society. 2004; 1: 118-121.
Pervasive, Human-Centered Computing, MIT Project Oxygen, MIT Laboratory for Computer Science, Jun. 2000.
U.S. Appl. No. 10/882,975, filed Jun. 30, 2004, Becker et al.
U.S. Appl. No. 11/131,108, filed May 16, 2005, Michaelis.
U.S. Appl. No. 11/388,694, filed Mar. 24, 2006, Blair et al.
U.S. Appl. No. 11/508,442, filed Aug. 22, 2006, Coughlan.
U.S. Appl. No. 11/508,477, filed Aug. 22, 2006, Michaelis.
U.S. Appl. No. 11/768,567, filed Jun. 26, 2007, Coughlan.
Zue, Victor, "The MIT Oxygen Project," MIT Laboratory for Computer Science, Apr. 25-26, 2000.

Cited By (279)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
USD951298S1 (en) 1991-11-29 2022-05-10 Google Llc Panel of a voice interface device
US20050215239A1 (en) * 2004-03-26 2005-09-29 Nokia Corporation Feature extraction in a networked portable device
US9368111B2 (en) * 2004-08-12 2016-06-14 Interactions Llc System and method for targeted tuning of a speech recognition system
US20140236599A1 (en) * 2004-08-12 2014-08-21 AT&T Intellectual Property I, L.P. System and method for targeted tuning of a speech recognition system
US8751232B2 (en) 2004-08-12 2014-06-10 At&T Intellectual Property I, L.P. System and method for targeted tuning of a speech recognition system
US9350862B2 (en) 2004-12-06 2016-05-24 Interactions Llc System and method for processing speech
US9112972B2 (en) 2004-12-06 2015-08-18 Interactions Llc System and method for processing speech
US9088652B2 (en) 2005-01-10 2015-07-21 At&T Intellectual Property I, L.P. System and method for speech-enabled call routing
US8824659B2 (en) 2005-01-10 2014-09-02 At&T Intellectual Property I, L.P. System and method for speech-enabled call routing
US20090041212A1 (en) * 2005-04-15 2009-02-12 Avaya Inc. Interactive Voice Response System With Prioritized Call Monitoring
US8085927B2 (en) * 2005-04-15 2011-12-27 Avaya Inc. Interactive voice response system with prioritized call monitoring
US9571652B1 (en) 2005-04-21 2017-02-14 Verint Americas Inc. Enhanced diarization systems, media and methods of use
US8280030B2 (en) 2005-06-03 2012-10-02 At&T Intellectual Property I, Lp Call routing system and method of using the same
US20100091978A1 (en) * 2005-06-03 2010-04-15 At&T Intellectual Property I, L.P. Call routing system and method of using the same
US8619966B2 (en) 2005-06-03 2013-12-31 At&T Intellectual Property I, L.P. Call routing system and method of using the same
US8526577B2 (en) * 2005-08-25 2013-09-03 At&T Intellectual Property I, L.P. System and method to access content from a speech-enabled automated system
US20070047718A1 (en) * 2005-08-25 2007-03-01 Sbc Knowledge Ventures, L.P. System and method to access content from a speech-enabled automated system
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8027457B1 (en) * 2005-12-01 2011-09-27 Cordell Coy Process for automated deployment of natural language
US20180081869A1 (en) * 2006-04-17 2018-03-22 Iii Holdings 1, Llc Methods and systems for correcting transcribed audio files
US11594211B2 (en) 2006-04-17 2023-02-28 Iii Holdings 1, Llc Methods and systems for correcting transcribed audio files
US10861438B2 (en) * 2006-04-17 2020-12-08 Iii Holdings 1, Llc Methods and systems for correcting transcribed audio files
US20080140410A1 (en) * 2006-12-06 2008-06-12 Soonthorn Ativanichayaphong Enabling grammars in web page frame
US7827033B2 (en) * 2006-12-06 2010-11-02 Nuance Communications, Inc. Enabling grammars in web page frames
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US20120237007A1 (en) * 2008-02-05 2012-09-20 Htc Corporation Method for setting voice tag
US8964948B2 (en) * 2008-02-05 2015-02-24 Htc Corporation Method for setting voice tag
US9444939B2 (en) 2008-05-23 2016-09-13 Accenture Global Services Limited Treatment processing of a plurality of streaming voice signals for determination of a responsive action thereto
US20090292531A1 (en) * 2008-05-23 2009-11-26 Accenture Global Services Gmbh System for handling a plurality of streaming voice signals for determination of responsive action thereto
US8751222B2 (en) 2008-05-23 2014-06-10 Accenture Global Services Limited Dublin Recognition processing of a plurality of streaming voice signals for determination of a responsive action thereto
US20090292533A1 (en) * 2008-05-23 2009-11-26 Accenture Global Services Gmbh Treatment processing of a plurality of streaming voice signals for determination of a responsive action thereto
US8676588B2 (en) * 2008-05-23 2014-03-18 Accenture Global Services Limited System for handling a plurality of streaming voice signals for determination of responsive action thereto
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US10643611B2 (en) 2008-10-02 2020-05-05 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US9300794B2 (en) * 2009-02-12 2016-03-29 At&T Intellectual Property I, L.P. Universal access to caller-specific ringtones
US20100202604A1 (en) * 2009-02-12 2010-08-12 Siegel Laurence R Universal Access to Caller-Specific Ringtones
US20220005478A1 (en) * 2009-02-27 2022-01-06 Nec Corporation Mobile wireless communications device with speech to text conversion and related methods
US9218807B2 (en) * 2010-01-08 2015-12-22 Nuance Communications, Inc. Calibration of a speech recognition engine using validated text
US12087308B2 (en) 2010-01-18 2024-09-10 Apple Inc. Intelligent automated assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10741185B2 (en) 2010-01-18 2020-08-11 Apple Inc. Intelligent automated assistant
US10692504B2 (en) 2010-02-25 2020-06-23 Apple Inc. User profiling for voice input processing
US9263034B1 (en) * 2010-07-13 2016-02-16 Google Inc. Adapting enhanced acoustic models
US9858917B1 (en) 2010-07-13 2018-01-02 Google Inc. Adapting enhanced acoustic models
US10417405B2 (en) 2011-03-21 2019-09-17 Apple Inc. Device access using voice authentication
US10522133B2 (en) * 2011-05-23 2019-12-31 Nuance Communications, Inc. Methods and apparatus for correcting recognition errors
US11350253B2 (en) 2011-06-03 2022-05-31 Apple Inc. Active transport based notifications
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US20130159000A1 (en) * 2011-12-15 2013-06-20 Microsoft Corporation Spoken Utterance Classification Training for a Speech Recognition System
US9082403B2 (en) * 2011-12-15 2015-07-14 Microsoft Technology Licensing, Llc Spoken utterance classification training for a speech recognition system
US11069336B2 (en) 2012-03-02 2021-07-20 Apple Inc. Systems and methods for name pronunciation
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US10431235B2 (en) * 2012-05-31 2019-10-01 Elwha Llc Methods and systems for speech adaptation data
US20130325449A1 (en) * 2012-05-31 2013-12-05 Elwha Llc Speech recognition adaptation systems based on adaptation data
US9620128B2 (en) * 2012-05-31 2017-04-11 Elwha Llc Speech recognition adaptation systems based on adaptation data
US20170069335A1 (en) * 2012-05-31 2017-03-09 Elwha Llc Methods and systems for speech adaptation data
US20130325448A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability company of the State of Delaware Speech recognition adaptation systems based on adaptation data
US20130325453A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability company of the State of Delaware Methods and systems for speech adaptation data
US9899040B2 (en) * 2012-05-31 2018-02-20 Elwha, Llc Methods and systems for managing adaptation data
US9495966B2 (en) 2012-05-31 2016-11-15 Elwha Llc Speech recognition adaptation systems based on adaptation data
US10395672B2 (en) * 2012-05-31 2019-08-27 Elwha Llc Methods and systems for managing adaptation data
US20130325451A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability company of the State of Delaware Methods and systems for speech adaptation data
US20130325452A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability company of the State of Delaware Methods and systems for speech adaptation data
US20130325454A1 (en) * 2012-05-31 2013-12-05 Elwha Llc Methods and systems for managing adaptation data
US9305565B2 (en) * 2012-05-31 2016-04-05 Elwha Llc Methods and systems for speech adaptation data
US20130325450A1 (en) * 2012-05-31 2013-12-05 Elwha LLC, a limited liability company of the State of Delaware Methods and systems for speech adaptation data
US20130325441A1 (en) * 2012-05-31 2013-12-05 Elwha Llc Methods and systems for managing adaptation data
US9875739B2 (en) 2012-09-07 2018-01-23 Verint Systems Ltd. Speaker separation in diarization
US20190066691A1 (en) * 2012-11-21 2019-02-28 Verint Systems Ltd. Diarization using linguistic labeling
US10438592B2 (en) * 2012-11-21 2019-10-08 Verint Systems Ltd. Diarization using speech segment labeling
US10692500B2 (en) * 2012-11-21 2020-06-23 Verint Systems Ltd. Diarization using linguistic labeling to create and apply a linguistic model
US10902856B2 (en) 2012-11-21 2021-01-26 Verint Systems Ltd. System and method of diarization and labeling of audio data
US10134400B2 (en) * 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using acoustic labeling
US10692501B2 (en) * 2012-11-21 2020-06-23 Verint Systems Ltd. Diarization using acoustic labeling to create an acoustic voiceprint
US10134401B2 (en) * 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
US10950242B2 (en) 2012-11-21 2021-03-16 Verint Systems Ltd. System and method of diarization and labeling of audio data
US10522152B2 (en) 2012-11-21 2019-12-31 Verint Systems Ltd. Diarization using linguistic labeling
US11227603B2 (en) * 2012-11-21 2022-01-18 Verint Systems Ltd. System and method of video capture and search optimization for creating an acoustic voiceprint
US20140142944A1 (en) * 2012-11-21 2014-05-22 Verint Systems Ltd. Diarization Using Acoustic Labeling
US10522153B2 (en) * 2012-11-21 2019-12-31 Verint Systems Ltd. Diarization using linguistic labeling
US11367450B2 (en) * 2012-11-21 2022-06-21 Verint Systems Inc. System and method of diarization and labeling of audio data
US11380333B2 (en) * 2012-11-21 2022-07-05 Verint Systems Inc. System and method of diarization and labeling of audio data
US10650826B2 (en) * 2012-11-21 2020-05-12 Verint Systems Ltd. Diarization using acoustic labeling
US11322154B2 (en) * 2012-11-21 2022-05-03 Verint Systems Inc. Diarization using linguistic labeling
US20200105275A1 (en) * 2012-11-21 2020-04-02 Verint Systems Ltd. Diarization using linguistic labeling
US20220139399A1 (en) * 2012-11-21 2022-05-05 Verint Systems Ltd. System and method of video capture and search optimization for creating an acoustic voiceprint
US11776547B2 (en) * 2012-11-21 2023-10-03 Verint Systems Inc. System and method of video capture and search optimization for creating an acoustic voiceprint
US20140142940A1 (en) * 2012-11-21 2014-05-22 Verint Systems Ltd. Diarization Using Linguistic Labeling
US10593332B2 (en) * 2012-11-21 2020-03-17 Verint Systems Ltd. Diarization using textual and audio speaker labeling
US10720164B2 (en) * 2012-11-21 2020-07-21 Verint Systems Ltd. System and method of diarization and labeling of audio data
US20200043501A1 (en) * 2012-11-21 2020-02-06 Verint Systems Ltd. Diarization using acoustic labeling
US10950241B2 (en) 2012-11-21 2021-03-16 Verint Systems Ltd. Diarization using linguistic labeling with segmented and clustered diarized textual transcripts
US20200035246A1 (en) * 2012-11-21 2020-01-30 Verint Systems Ltd. Diarization using acoustic labeling
US10446156B2 (en) * 2012-11-21 2019-10-15 Verint Systems Ltd. Diarization using textual and audio speaker labeling
US20200035245A1 (en) * 2012-11-21 2020-01-30 Verint Systems Ltd. Diarization using linguistic labeling
US10714117B2 (en) 2013-02-07 2020-07-14 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US20140249811A1 (en) * 2013-03-01 2014-09-04 Google Inc. Detecting the end of a user question
US9123340B2 (en) * 2013-03-01 2015-09-01 Google Inc. Detecting the end of a user question
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US9966060B2 (en) * 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US20170178619A1 (en) * 2013-06-07 2017-06-22 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US11048473B2 (en) 2013-06-09 2021-06-29 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10769385B2 (en) 2013-06-09 2020-09-08 Apple Inc. System and method for inferring user intent from speech inputs
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US20150019221A1 (en) * 2013-07-15 2015-01-15 Chunghwa Picture Tubes, Ltd. Speech recognition system and method
US10109280B2 (en) 2013-07-17 2018-10-23 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US9460722B2 (en) 2013-07-17 2016-10-04 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US9881617B2 (en) 2013-07-17 2018-01-30 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US9984706B2 (en) 2013-08-01 2018-05-29 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US11670325B2 (en) 2013-08-01 2023-06-06 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US10665253B2 (en) 2013-08-01 2020-05-26 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US12010262B2 (en) 2013-08-06 2024-06-11 Apple Inc. Auto-activating smart responses based on activities from remote devices
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
US10657966B2 (en) 2014-05-30 2020-05-19 Apple Inc. Better resolution when referencing to concepts
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US10878809B2 (en) 2014-05-30 2020-12-29 Apple Inc. Multi-command single utterance input method
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US10417344B2 (en) 2014-05-30 2019-09-17 Apple Inc. Exemplar-based natural language processing
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10699717B2 (en) 2014-05-30 2020-06-30 Apple Inc. Intelligent assistant for home automation
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US10714095B2 (en) 2014-05-30 2020-07-14 Apple Inc. Intelligent assistant for home automation
US20160027442A1 (en) * 2014-07-25 2016-01-28 International Business Machines Corporation Summarization of audio data
US9728190B2 (en) * 2014-07-25 2017-08-08 International Business Machines Corporation Summarization of audio data
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10438595B2 (en) 2014-09-30 2019-10-08 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10453443B2 (en) 2014-09-30 2019-10-22 Apple Inc. Providing an indication of the suitability of speech recognition
US10390213B2 (en) 2014-09-30 2019-08-20 Apple Inc. Social reminders
US10726848B2 (en) 2015-01-26 2020-07-28 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US10366693B2 (en) 2015-01-26 2019-07-30 Verint Systems Ltd. Acoustic signature building for a speaker from multiple sessions
US9875743B2 (en) 2015-01-26 2018-01-23 Verint Systems Ltd. Acoustic signature building for a speaker from multiple sessions
US11636860B2 (en) 2015-01-26 2023-04-25 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US9875742B2 (en) 2015-01-26 2018-01-23 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US10930282B2 (en) 2015-03-08 2021-02-23 Apple Inc. Competing devices responding to voice triggers
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US10529332B2 (en) 2015-03-08 2020-01-07 Apple Inc. Virtual assistant activation
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11127397B2 (en) 2015-05-27 2021-09-21 Apple Inc. Device voice control
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10681212B2 (en) 2015-06-05 2020-06-09 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10956666B2 (en) 2015-11-09 2021-03-23 Apple Inc. Unconventional virtual assistant interactions
US10354652B2 (en) 2015-12-02 2019-07-16 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US20170169822A1 (en) * 2015-12-14 2017-06-15 Hitachi, Ltd. Dialog text summarization device and method
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
US10332516B2 (en) 2016-05-10 2019-06-25 Google Llc Media transfer among media output devices
US10535343B2 (en) 2016-05-10 2020-01-14 Google Llc Implementations for voice assistant on devices
US11990126B2 (en) 2016-05-10 2024-05-21 Google Llc Voice-controlled media play in smart media environment
US10304450B2 (en) 2016-05-10 2019-05-28 Google Llc LED design language for visual affordance of voice user interfaces
US10861461B2 (en) 2016-05-10 2020-12-08 Google Llc LED design language for visual affordance of voice user interfaces
US11935535B2 (en) 2016-05-10 2024-03-19 Google Llc Implementations for voice assistant on devices
US10235997B2 (en) 2016-05-10 2019-03-19 Google Llc Voice-controlled closed caption display
US11922941B2 (en) 2016-05-10 2024-03-05 Google Llc Implementations for voice assistant on devices
US11355116B2 (en) 2016-05-10 2022-06-07 Google Llc Implementations for voice assistant on devices
US11341964B2 (en) 2016-05-10 2022-05-24 Google Llc Voice-controlled media play in smart media environment
USD927550S1 (en) 2016-05-13 2021-08-10 Google Llc Voice interface device
USD979602S1 (en) 2016-05-13 2023-02-28 Google Llc Panel of a voice interface device
US10402450B2 (en) * 2016-05-13 2019-09-03 Google Llc Personalized and contextualized audio briefing
US20170329848A1 (en) * 2016-05-13 2017-11-16 Google Inc. Personalized and Contextualized Audio Briefing
US11860933B2 (en) 2016-05-13 2024-01-02 Google Llc Personalized and contextualized audio briefing
USD885436S1 (en) 2016-05-13 2020-05-26 Google Llc Panel of a voice interface device
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10580409B2 (en) 2016-06-11 2020-03-03 Apple Inc. Application integration with a digital assistant
US10942702B2 (en) 2016-06-11 2021-03-09 Apple Inc. Intelligent device arbitration and control
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US11656884B2 (en) 2017-01-09 2023-05-23 Apple Inc. Application integration with a digital assistant
US11501767B2 (en) * 2017-01-23 2022-11-15 Audi Ag Method for operating a motor vehicle having an operating device
US10660344B1 (en) * 2017-02-16 2020-05-26 Tyson Foods, Inc. Method of making a meat product and a meat product
US10417266B2 (en) 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
US10332518B2 (en) 2017-05-09 2019-06-25 Apple Inc. User interface for correcting recognition errors
US10741181B2 (en) 2017-05-09 2020-08-11 Apple Inc. User interface for correcting recognition errors
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
US10847142B2 (en) 2017-05-11 2020-11-24 Apple Inc. Maintaining privacy of personal information
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US10403278B2 (en) 2017-05-16 2019-09-03 Apple Inc. Methods and systems for phonetic matching in digital assistant services
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
US10909171B2 (en) 2017-05-16 2021-02-02 Apple Inc. Intelligent automated assistant for media exploration
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US10657328B2 (en) 2017-06-02 2020-05-19 Apple Inc. Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling
US10445429B2 (en) 2017-09-21 2019-10-15 Apple Inc. Natural language understanding using vocabularies with compressed serialized tries
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US11314942B1 (en) * 2017-10-27 2022-04-26 Interactions Llc Accelerating agent performance in a natural language processing system
US10902039B2 (en) 2017-11-16 2021-01-26 International Business Machines Corporation Automatic identification of retraining data in a classifier-based dialogue system
US10372737B2 (en) * 2017-11-16 2019-08-06 International Business Machines Corporation Automatic identification of retraining data in a classifier-based dialogue system
US10636424B2 (en) 2017-11-30 2020-04-28 Apple Inc. Multi-turn canned dialog
US10733982B2 (en) 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10733375B2 (en) 2018-01-31 2020-08-04 Apple Inc. Knowledge-based framework for improving natural language understanding
US11710488B2 (en) 2018-02-26 2023-07-25 Sorenson Ip Holdings, Llc Transcription of communications using multiple speech recognition systems
US10192554B1 (en) 2018-02-26 2019-01-29 Sorenson Ip Holdings, Llc Transcription of communications using multiple speech recognition systems
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US11538128B2 (en) 2018-05-14 2022-12-27 Verint Americas Inc. User interface for fraud alert management
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US10684703B2 (en) 2018-06-01 2020-06-16 Apple Inc. Attention aware virtual assistant dismissal
US10720160B2 (en) 2018-06-01 2020-07-21 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US10403283B1 (en) 2018-06-01 2019-09-03 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US10504518B1 (en) 2018-06-03 2019-12-10 Apple Inc. Accelerated task performance
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US10944859B2 (en) 2018-06-03 2021-03-09 Apple Inc. Accelerated task performance
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US10887452B2 (en) 2018-10-25 2021-01-05 Verint Americas Inc. System architecture for fraud detection
US11240372B2 (en) 2018-10-25 2022-02-01 Verint Americas Inc. System architecture for fraud detection
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11594221B2 (en) * 2018-12-04 2023-02-28 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
US11170761B2 (en) 2018-12-04 2021-11-09 Sorenson Ip Holdings, Llc Training of speech recognition systems
US10573312B1 (en) 2018-12-04 2020-02-25 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
US11017778B1 (en) 2018-12-04 2021-05-25 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US11935540B2 (en) 2018-12-04 2024-03-19 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US11145312B2 (en) 2018-12-04 2021-10-12 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US10971153B2 (en) 2018-12-04 2021-04-06 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
US10672383B1 (en) 2018-12-04 2020-06-02 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
US20210233530A1 (en) * 2018-12-04 2021-07-29 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
US10388272B1 (en) 2018-12-04 2019-08-20 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11398239B1 (en) 2019-03-31 2022-07-26 Medallia, Inc. ASR-enhanced speech compression
US11227606B1 (en) * 2019-03-31 2022-01-18 Medallia, Inc. Compact, verifiable record of an audio communication and method for making same
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11360739B2 (en) 2019-05-31 2022-06-14 Apple Inc. User activity shortcut suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11652917B2 (en) 2019-06-20 2023-05-16 Verint Americas Inc. Systems and methods for authentication and fraud detection
US11115521B2 (en) 2019-06-20 2021-09-07 Verint Americas Inc. Systems and methods for authentication and fraud detection
US20210049927A1 (en) * 2019-08-13 2021-02-18 Vanderbilt University System, method and computer program product for determining a reading error distance metric
US10665241B1 (en) * 2019-09-06 2020-05-26 Verbit Software Ltd. Rapid frontend resolution of transcription-related inquiries by backend transcribers
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11868453B2 (en) 2019-11-07 2024-01-09 Verint Americas Inc. Systems and methods for customer authentication based on audio-of-interest
US12035070B2 (en) 2020-02-21 2024-07-09 Ultratec, Inc. Caption modification and augmentation systems and methods for use by hearing assisted user
US11488604B2 (en) 2020-08-19 2022-11-01 Sorenson Ip Holdings, Llc Transcription of audio
WO2024018598A1 (en) * 2022-07-21 2024-01-25 Nttテクノクロス株式会社 Information processing system, information processing method, and program

Similar Documents

Publication Publication Date Title
US7660715B1 (en) Transparent monitoring and intervention to improve automatic adaptation of speech models
US7542904B2 (en) System and method for maintaining a speech-recognition grammar
US9571638B1 (en) Segment-based queueing for audio captioning
US8374317B2 (en) Interactive voice response (IVR) system call interruption handling
JP4247929B2 (en) A method for automatic speech recognition in telephones.
US20090326939A1 (en) System and method for transcribing and displaying speech during a telephone call
US7346151B2 (en) Method and apparatus for validating agreement between textual and spoken representations of words
US8457964B2 (en) Detecting and communicating biometrics of recorded voice during transcription process
US8073699B2 (en) Numeric weighting of error recovery prompts for transfer to a human agent from an automated speech response system
US8731937B1 (en) Updating speech recognition models for contacts
US7907705B1 (en) Speech to text for assisted form completion
US6891932B2 (en) System and methodology for voice activated access to multiple data sources and voice repositories in a single session
US9183834B2 (en) Speech recognition tuning tool
US7318029B2 (en) Method and apparatus for a interactive voice response system
US20060093097A1 (en) System and method for identifying telephone callers
US20030191639A1 (en) Dynamic and adaptive selection of vocabulary and acoustic models based on a call context for speech recognition
US20060217978A1 (en) System and method for handling information in a voice recognition automated conversation
US20070121893A1 (en) Optimal call speed for call center agents
US20100063815A1 (en) Real-time transcription
US20090097634A1 (en) Method and System for Call Processing
US20020069060A1 (en) Method and system for automatically managing a voice-based communications systems
JPH10215319A (en) Dialing method and device by voice
US20180255180A1 (en) Bridge for Non-Voice Communications User Interface to Voice-Enabled Interactive Voice Response System
EP2124427A2 (en) Treatment processing of a plurality of streaming voice signals for determination of responsive action thereto
US6813342B1 (en) Implicit area code determination during voice activated dialing

Legal Events

Date Code Title Description
AS Assignment

Owner name: AVAYA TECHNOLOGY, NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:THAMBIRATNAM, DAVID PRESHAN;REEL/FRAME:014931/0565

Effective date: 20031218

AS Assignment

Owner name: CITIBANK, N.A., AS ADMINISTRATIVE AGENT, NEW YORK

Free format text: SECURITY AGREEMENT;ASSIGNORS:AVAYA, INC.;AVAYA TECHNOLOGY LLC;OCTEL COMMUNICATIONS LLC;AND OTHERS;REEL/FRAME:020156/0149

Effective date: 20071026

AS Assignment

Owner name: CITICORP USA, INC., AS ADMINISTRATIVE AGENT, NEW YORK

Free format text: SECURITY AGREEMENT;ASSIGNORS:AVAYA, INC.;AVAYA TECHNOLOGY LLC;OCTEL COMMUNICATIONS LLC;AND OTHERS;REEL/FRAME:020166/0705

Effective date: 20071026

AS Assignment

Owner name: AVAYA INC, NEW JERSEY

Free format text: REASSIGNMENT;ASSIGNORS:AVAYA TECHNOLOGY LLC;AVAYA LICENSING LLC;REEL/FRAME:021156/0082

Effective date: 20080626

AS Assignment

Owner name: AVAYA TECHNOLOGY LLC, NEW JERSEY

Free format text: CONVERSION FROM CORP TO LLC;ASSIGNOR:AVAYA TECHNOLOGY CORP.;REEL/FRAME:022677/0550

Effective date: 20050930

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: BANK OF NEW YORK MELLON TRUST, NA, AS NOTES COLLATERAL AGENT, THE, PENNSYLVANIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:AVAYA INC., A DELAWARE CORPORATION;REEL/FRAME:025863/0535

Effective date: 20110211

AS Assignment

Owner name: BANK OF NEW YORK MELLON TRUST COMPANY, N.A., THE, PENNSYLVANIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:AVAYA, INC.;REEL/FRAME:030083/0639

Effective date: 20130307

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: CITIBANK, N.A., AS ADMINISTRATIVE AGENT, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNORS:AVAYA INC.;AVAYA INTEGRATED CABINET SOLUTIONS INC.;OCTEL COMMUNICATIONS CORPORATION;AND OTHERS;REEL/FRAME:041576/0001

Effective date: 20170124

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: AVAYA INTEGRATED CABINET SOLUTIONS INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 041576/0001;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:044893/0531

Effective date: 20171128

Owner name: OCTEL COMMUNICATIONS LLC (FORMERLY KNOWN AS OCTEL COMMUNICATIONS CORPORATION), CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 041576/0001;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:044893/0531

Effective date: 20171128

Owner name: AVAYA INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 025863/0535;ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST, NA;REEL/FRAME:044892/0001

Effective date: 20171128

Owner name: AVAYA INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 041576/0001;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:044893/0531

Effective date: 20171128

Owner name: VPNET TECHNOLOGIES, INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 041576/0001;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:044893/0531

Effective date: 20171128

Owner name: AVAYA INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING ALL LIENS INCLUDING THE SECURITY INTEREST RECORDED AT REEL/FRAME 030083/0639;ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A.;REEL/FRAME:045012/0666

Effective date: 20171128

AS Assignment

Owner name: GOLDMAN SACHS BANK USA, AS COLLATERAL AGENT, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNORS:AVAYA INC.;AVAYA INTEGRATED CABINET SOLUTIONS LLC;OCTEL COMMUNICATIONS LLC;AND OTHERS;REEL/FRAME:045034/0001

Effective date: 20171215

AS Assignment

Owner name: CITIBANK, N.A., AS COLLATERAL AGENT, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNORS:AVAYA INC.;AVAYA INTEGRATED CABINET SOLUTIONS LLC;OCTEL COMMUNICATIONS LLC;AND OTHERS;REEL/FRAME:045124/0026

Effective date: 20171215

AS Assignment

Owner name: WILMINGTON TRUST, NATIONAL ASSOCIATION, MINNESOTA

Free format text: SECURITY INTEREST;ASSIGNORS:AVAYA INC.;AVAYA MANAGEMENT L.P.;INTELLISIST, INC.;AND OTHERS;REEL/FRAME:053955/0436

Effective date: 20200925

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12

AS Assignment

Owner name: VPNET TECHNOLOGIES, CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING THE SECURITY INTEREST RECORDED AT REEL/FRAME 020156/0149;ASSIGNOR:CITIBANK, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:060953/0412

Effective date: 20171128

Owner name: OCTEL COMMUNICATIONS LLC, CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING THE SECURITY INTEREST RECORDED AT REEL/FRAME 020156/0149;ASSIGNOR:CITIBANK, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:060953/0412

Effective date: 20171128

Owner name: AVAYA TECHNOLOGY LLC, CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING THE SECURITY INTEREST RECORDED AT REEL/FRAME 020156/0149;ASSIGNOR:CITIBANK, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:060953/0412

Effective date: 20171128

Owner name: AVAYA, INC., CALIFORNIA

Free format text: BANKRUPTCY COURT ORDER RELEASING THE SECURITY INTEREST RECORDED AT REEL/FRAME 020156/0149;ASSIGNOR:CITIBANK, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:060953/0412

Effective date: 20171128

AS Assignment

Owner name: WILMINGTON TRUST, NATIONAL ASSOCIATION, AS COLLATERAL AGENT, DELAWARE

Free format text: INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNORS:AVAYA INC.;INTELLISIST, INC.;AVAYA MANAGEMENT L.P.;AND OTHERS;REEL/FRAME:061087/0386

Effective date: 20220712

AS Assignment

Owner name: VPNET TECHNOLOGIES, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705;ASSIGNOR:CITICORP USA, INC.;REEL/FRAME:061328/0074

Effective date: 20171215

Owner name: OCTEL COMMUNICATIONS LLC, CALIFORNIA

Free format text: RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705;ASSIGNOR:CITICORP USA, INC.;REEL/FRAME:061328/0074

Effective date: 20171215

Owner name: AVAYA TECHNOLOGY, LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705;ASSIGNOR:CITICORP USA, INC.;REEL/FRAME:061328/0074

Effective date: 20171215

Owner name: SIERRA HOLDINGS CORP., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705;ASSIGNOR:CITICORP USA, INC.;REEL/FRAME:061328/0074

Effective date: 20171215

Owner name: AVAYA, INC., CALIFORNIA

Free format text: RELEASE OF SECURITY INTEREST ON REEL/FRAME 020166/0705;ASSIGNOR:CITICORP USA, INC.;REEL/FRAME:061328/0074

Effective date: 20171215

AS Assignment

Owner name: AVAYA INTEGRATED CABINET SOLUTIONS LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 45124/FRAME 0026;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063457/0001

Effective date: 20230403

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 45124/FRAME 0026;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063457/0001

Effective date: 20230403

Owner name: AVAYA INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 45124/FRAME 0026;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063457/0001

Effective date: 20230403

Owner name: AVAYA HOLDINGS CORP., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 45124/FRAME 0026;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063457/0001

Effective date: 20230403

AS Assignment

Owner name: WILMINGTON SAVINGS FUND SOCIETY, FSB (COLLATERAL AGENT), DELAWARE

Free format text: INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNORS:AVAYA MANAGEMENT L.P.;AVAYA INC.;INTELLISIST, INC.;AND OTHERS;REEL/FRAME:063742/0001

Effective date: 20230501

AS Assignment

Owner name: CITIBANK, N.A., AS COLLATERAL AGENT, NEW YORK

Free format text: INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNORS:AVAYA INC.;AVAYA MANAGEMENT L.P.;INTELLISIST, INC.;REEL/FRAME:063542/0662

Effective date: 20230501

AS Assignment

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: CAAS TECHNOLOGIES, LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: HYPERQUALITY II, LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: HYPERQUALITY, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: ZANG, INC. (FORMER NAME OF AVAYA CLOUD INC.), NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: VPNET TECHNOLOGIES, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: OCTEL COMMUNICATIONS LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: AVAYA INTEGRATED CABINET SOLUTIONS LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: INTELLISIST, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: AVAYA INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 045034/0001);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063779/0622

Effective date: 20230501

Owner name: AVAYA INTEGRATED CABINET SOLUTIONS LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 53955/0436);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063705/0023

Effective date: 20230501

Owner name: INTELLISIST, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 53955/0436);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063705/0023

Effective date: 20230501

Owner name: AVAYA INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 53955/0436);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063705/0023

Effective date: 20230501

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 53955/0436);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063705/0023

Effective date: 20230501

Owner name: AVAYA INTEGRATED CABINET SOLUTIONS LLC, NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 61087/0386);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063690/0359

Effective date: 20230501

Owner name: INTELLISIST, INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 61087/0386);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063690/0359

Effective date: 20230501

Owner name: AVAYA INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 61087/0386);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063690/0359

Effective date: 20230501

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 61087/0386);ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION, AS NOTES COLLATERAL AGENT;REEL/FRAME:063690/0359

Effective date: 20230501

AS Assignment

Owner name: AVAYA LLC, DELAWARE

Free format text: (SECURITY INTEREST) GRANTOR'S NAME CHANGE;ASSIGNOR:AVAYA INC.;REEL/FRAME:065019/0231

Effective date: 20230501

AS Assignment

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT;ASSIGNOR:WILMINGTON SAVINGS FUND SOCIETY, FSB;REEL/FRAME:066894/0227

Effective date: 20240325

Owner name: AVAYA LLC, DELAWARE

Free format text: INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT;ASSIGNOR:WILMINGTON SAVINGS FUND SOCIETY, FSB;REEL/FRAME:066894/0227

Effective date: 20240325

Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY

Free format text: INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:066894/0117

Effective date: 20240325

Owner name: AVAYA LLC, DELAWARE

Free format text: INTELLECTUAL PROPERTY RELEASE AND REASSIGNMENT;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:066894/0117

Effective date: 20240325

AS Assignment

Owner name: ARLINGTON TECHNOLOGIES, LLC, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AVAYA LLC;REEL/FRAME:067022/0780

Effective date: 20240329