US20150142436A1 - Speech recognition in automated information services systems - Google Patents

Speech recognition in automated information services systems

Publication number
US20150142436A1
Authority
US
United States
Prior art keywords
speech
information services
information
operator
update
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/610,124
Inventor
Bruce Bokish
Michael Craig Presnell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rockstar Consortium US LP
Original Assignee
Rockstar Consortium US LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rockstar Consortium US LP
Priority to US14/610,124
Publication of US20150142436A1
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0638 Interactive procedures

Definitions

  • the recorded speech that was unable to be processed by the information services automation system 16 is compared in light of an entry provided by the operator at the operator workstation 18 .
  • the comparison may be between the given speech and the entry, the recorded speech and the words of the entry, as well as the recorded speech and the phonemes corresponding to the words of the entry.
  • the actual speech recognition aspect associated with the dictionary database 28 and the grammar database 32 may be updated as well as the basic information database 36 .
  • the operator workstation 18 may take the form of a personal computer or workstation having a control system 40 , which is associated with an operator interface 42 and one or more communication interfaces, such as a voice interface 44 and an automation system interface 46 .
  • the voice interface 44 will support the actual communication session or call to allow the operator to communicate with the caller.
  • the automation system interface 46 will allow direct or indirect communications with the information services automation system 16 , the automated audio system 20 , or a combination thereof.
  • the control system 40 may also be associated with memory 48 with sufficient software 50 to facilitate the functionality described above. Again, the term “operator” is used only to indicate a human agent who is involved in providing any type of information services.
  • the information services automation system 16 may include a control system 52 associated with a voice interface 54 for receiving the audible speech in association with an information services request from a telephony user, and a communication interface 56 to facilitate communications with the operator workstations 18 , databases 28 , 32 , and 36 , or any other entities with which communications are required.
  • the control system 52 will include sufficient memory 58 having the requisite software 60 to facilitate the operation described above.

Abstract

The present invention allows feedback from operator workstations to be used to update databases used for providing automated information services. When an automated process fails, recorded speech of the caller is passed on to the operator for decision making. Based on the selections made by the operator in light of the speech or other interactions with the caller, a comparison is made between the speech and the selections made by the operator to arrive at information to update the databases in the information services automation system. Thus, when the operator inputs the words corresponding to the speech provided at the information services automation system, the speech may be associated with those words. The association between the speech and the words may be used to update different databases in the information services automation system.

Description

  • CROSS-REFERENCE TO RELATED APPLICATION
  • The present application is a continuation of U.S. patent application Ser. No. 10/805,975, filed on Mar. 22, 2004, now U.S. Pat. No. 8,954,325, which is set to issue on Feb. 10, 2015, the disclosure of which is hereby incorporated by reference in its entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to information services, and in particular to improving speech recognition in information services automation systems.
  • BACKGROUND OF THE INVENTION
  • Information services systems have been implemented since the beginning of telephony communications. For various reasons, and historically based on the need for directory assistance, telephony subscribers could call an information services system, request particular information, and receive the information. As communications evolve, the sophistication of the information services systems and the type of information provided with these systems have significantly increased. Currently, information services systems provide all types of information, from traditional directory numbers and addresses to driving directions and movie listings.
  • As the need for information services increases, information services providers have implemented automated systems that are capable of handling certain requests in a fully automated fashion, without requiring operator assistance, by utilizing technologies such as speech recognition, speech synthesis, recorded speech playback, and digit detection. Naturally, there are numerous reasons, such as varying accents, dialects, and languages, which prevent these automated systems from being able to properly respond to all requests. As such, the requests that are not recognized or otherwise handled properly may be sent to a human operator, who will interact with the caller and provide the requested information.
  • Given the significant cost savings associated with automation, there is a continuing need to provide more accurate and reliable automation. The primary hurdle in automation is the difficulty in recognizing speech due to the various languages, accents, dialects, and pronunciations of words that formulate the caller's request for information. At this time, the speech recognition engines in these information services automation systems are only updated periodically, and these updates are not necessarily based on actual use, but rather on general predictions involving speech recognition patterns. Further, there is no mechanism to provide feedback to the automation system based on actions taken by the operator after the automation system has failed. There is a need to provide feedback to the automation system based on the operator's interaction with the caller to improve speech recognition, and thus the ability to automate future requests in a more effective manner.
  • SUMMARY OF THE INVENTION
  • The present invention allows feedback from operator workstations to be used to update databases used for providing automated information services. When an automated process fails, recorded speech of the caller is passed on to the operator for decision making. Based on the selections made by the operator in light of the speech of or other interactions with the caller, a comparison is made between the speech and the selections made by the operator to arrive at information to update the databases in the information services automation system. Thus, when the operator inputs the words corresponding to the speech provided at the information services automation system, the speech may be associated with those words. The association between the speech and the words may be used to update different databases in the information services automation system.
  • In one embodiment, the automation process involves processing the speech to detect phonemes, using the phonemes to detect words, and using the words to detect an entry that is associated with the information being requested by the caller. If there is a failure at any one of these detection stages, the speech is sent to the operator. When the operator listens to the speech and provides operator input corresponding to the words or entries, the various databases used to look up words based on phonemes, entries based on words, or information based on entries may be updated. As such, a word typed in by the operator may be associated with a group of phonemes for the speech. Similarly, an entry may be associated with a new word or group of words. The information services automation system may send information identifying the step in the automation process where the automation failed. As such, the particular database to update based on the operator input can be selected based on the point of failure.
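  • The selection of a database by point of failure can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `FailurePoint` and `apply_operator_feedback`, the in-memory dictionaries standing in for the databases, and the restriction to a single operator word are all hypothetical, and the phoneme symbols are made up.

```python
from enum import Enum


class FailurePoint(Enum):
    """Hypothetical labels for the stage at which automation failed."""
    WORD_DETECTION = "word"    # phonemes could not be mapped to a word
    ENTRY_DETECTION = "entry"  # words could not be mapped to an entry


def apply_operator_feedback(failure_point, phonemes, operator_word,
                            operator_entry, dictionary, grammar):
    """Update the database that matches the reported point of failure.

    dictionary: dict mapping word -> set of phoneme groups (tuples)
    grammar:    dict mapping entry -> set of words
    Returns the name of the database that was updated.
    """
    if failure_point is FailurePoint.WORD_DETECTION:
        # Associate the operator's word with the phoneme group of the speech.
        dictionary.setdefault(operator_word, set()).add(tuple(phonemes))
        return "dictionary"
    if failure_point is FailurePoint.ENTRY_DETECTION:
        # Associate the operator's entry with the word that was recognized.
        grammar.setdefault(operator_entry, set()).add(operator_word)
        return "grammar"
    raise ValueError(f"unsupported failure point: {failure_point!r}")
```

  • A real system would also need to align multi-word operator input with the corresponding phoneme groups; the sketch sidesteps that by handling one word at a time.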
  • Those skilled in the art will appreciate the scope of the present invention and realize additional aspects thereof after reading the following detailed description of the preferred embodiments in association with the accompanying drawing figures.
  • BRIEF DESCRIPTION OF THE DRAWING FIGURES
  • The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the invention, and together with the description serve to explain the principles of the invention.
  • FIG. 1 is a block representation of an information services environment according to one embodiment of the present invention.
  • FIG. 2 is a block representation of an information services automation system according to one embodiment of the present invention.
  • FIG. 3 is a flow diagram providing an overview of the operation of the present invention according to one embodiment.
  • FIG. 4 is a block representation of an operator workstation according to one embodiment of the present invention.
  • FIG. 5 is a block representation of an information services automation system according to one embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the invention and illustrate the best mode of practicing the invention. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the invention and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.
  • Turning now to FIG. 1, an information services environment 10 is illustrated according to one embodiment of the present invention. In general, a communication network 12 may include an information services switch 14, such as a circuit-switched based operator services switch or analogous cellular or packet-based switch, wherein incoming information services requests result in a voice-based communication session with an information services automation system 16. The communication network 12 may include any one or a combination of a Public Switched Telephone Network (PSTN), a cellular network, or a packet network. The information services automation system 16 will provide automated greetings and questions to which the caller will respond to formulate the information request. Initially, the information services automation system 16 will attempt to recognize the information spoken by the caller and provide the requested information. If the caller's request cannot be recognized or otherwise processed by the information services automation system 16, a voice session between the caller and an operator workstation 18 is established, wherein an operator will attempt to respond to the caller's request. The term “operator” is used to describe any human agent capable of providing any type of information services, including but not limited to directory assistance, traditional operator assistance, and enhanced information services. The information services do not need to be telephony based, and may include technical support, customer support, and the like.
  • During the transition from the information services automation system 16 to the operator workstation 18, the initial audible information provided by the caller that was recorded by the information services automation system 16 will be transferred to the operator workstation 18, such that the operator may listen to the recorded information without having to ask the caller to repeat the information. If necessary, the operator may communicate with the caller to clarify information or obtain additional information to assist in obtaining the requested information.
  • Once the requested information is obtained by the information services automation system 16 or by an operator at one of the operator workstations 18, a voice session is established between the caller and an automated audio system 20. The automated audio system 20 will then interact with the information services automation system 16 or the operator workstation 18 to obtain the requested information and deliver the requested information to the caller in a synthesized fashion. Notably, the functionality of the automated audio system 20 may be integrated with the information services automation system 16 or the operator workstations 18.
  • The information services automation system 16, operator workstations 18, and automated audio system 20 may communicate and cooperate with each other via any number of networks or signaling conventions. For the present invention, when the information services automation system 16 fails to fully automate a request, the results of the subsequent operator assistance are fed back to the information services automation system 16 to update the various databases used for automation in a manner that increases the likelihood that subsequent information requests will be automated. To initiate information services requests, a caller may use any type of telephony terminal 22 and initiate a voice session, such as a traditional telephone call, to information services wherein the call will be directed to the information services automation system 16 via the information services switch 14.
  • Turning now to FIG. 2, an overview of the information services automation system 16 is illustrated according to one embodiment. Initially, the speech from the caller is received and processed by a speech detection function 24, which attempts to recognize phonemes of the incoming speech. Phonemes represent the basic elements of a spoken language. Accordingly, the speech detection function 24 will provide a sequence of defined phonemes corresponding to the incoming speech. The sequence of phonemes is sent to an endpoint detection function 26, which will detect the beginning and ending of words within the sequence of phonemes. Thus, there may be one or more groups of phonemes that correspond to words in the original speech. The endpoint detection function 26 will access a dictionary database 28 to determine actual words associated with the groups of phonemes. Accordingly, the dictionary database 28 will include a list of words and their associated groups of phonemes. Notably, any word may be associated with multiple groups of phonemes, which may correspond to different languages, accents, dialects, or pronunciations of the word. The words are then provided to a recognition detection function 30, which will process the words by accessing a grammar database 32 in an effort to determine an associated entry corresponding to the words. The resultant entries are then provided to a search function 34, which will access an information database 36 to obtain information associated with the determined entry. Thus, the grammar database 32 will list associations of words and corresponding entries, which will be found in the information database 36.
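  • The chain of lookups described for FIG. 2 can be sketched with three in-memory tables. This is an illustrative sketch only: the table contents, phoneme symbols, directory number, and function names are assumptions, and the sketch takes phoneme groups that are already segmented into words rather than performing the boundary detection of the endpoint detection function 26.

```python
# Dictionary database: each word maps to one or more phoneme groups,
# for example to cover different accents or pronunciations (made-up symbols).
DICTIONARY = {
    "joeys": [("JH", "OW", "IY", "Z"), ("JH", "OW", "Z")],
    "pizza": [("P", "IY", "T", "S", "AH")],
    "pub": [("P", "AH", "B")],
}

# Grammar database: word sequences mapped to an entry.
GRAMMAR = {
    ("joeys", "pizza"): "Joey's Pub and Pizza",
    ("joeys", "pub"): "Joey's Pub and Pizza",
}

# Information database: entry mapped to the information for the caller
# (the directory number is fictional).
INFORMATION = {
    "Joey's Pub and Pizza": {"directory_number": "555-0100"},
}


def words_from_phoneme_groups(groups):
    """Look up the word for each phoneme group in the dictionary."""
    reverse = {group: word
               for word, word_groups in DICTIONARY.items()
               for group in word_groups}
    return tuple(reverse[tuple(group)] for group in groups)


def entry_from_words(words):
    """Look up the entry for a word sequence in the grammar."""
    return GRAMMAR[words]


def information_for(entry):
    """Look up the information for an entry."""
    return INFORMATION[entry]


def handle_request(groups):
    """Run the three lookups in order; a KeyError marks a failed stage."""
    words = words_from_phoneme_groups(groups)
    entry = entry_from_words(words)
    return information_for(entry)
```

  • Both pronunciations of the first word resolve to the same entry, which is the point of storing several phoneme groups per word.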
  • The original speech is broken into phonemes, which are converted to words by the endpoint detection function 26. The words are then converted to available entries by the recognition detection function 30. Different words or word sequences may be associated with a given entry. For example, the entry of “Joey's Pub and Pizza” may be associated with the following word or words: 1) Joey's Restaurant, 2) Joey's Pub, 3) Joey's Bar, 4) Joey's Pizza, 5) Joey's Pizza Pub, and 6) Joey's Pizza and Pub. The recognition detection function 30 and the grammar database 32 may be configured such that an exact match is not necessary; if a certain number of words match, a decision is made on the desired entry. Once the entry is determined, the search function 34 will access the information database 36 to obtain the associated information. In this instance, the information may include directory assistance information including the directory number and address for Joey's Pub and Pizza, driving directions, menu information, specials, or any other information that may be desirable to provide to the caller or requested by the caller. Once the requested information is obtained, it is sent to the automated audio system 20 for delivery to the caller.
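  • One way to picture a match on "a certain number of words" is a simple overlap count, sketched below. The scoring rule, the `min_matching_words` parameter, and the function name are assumptions made for illustration; the specification does not say how the match count is computed.

```python
# Grammar sketch: each entry lists the word sequences associated with it.
GRAMMAR = {
    "Joey's Pub and Pizza": [
        ("joey's", "restaurant"),
        ("joey's", "pub"),
        ("joey's", "bar"),
        ("joey's", "pizza"),
        ("joey's", "pizza", "pub"),
        ("joey's", "pizza", "and", "pub"),
    ],
}


def match_entry(words, grammar, min_matching_words=2):
    """Return the entry whose phrases share the most words with the request.

    An entry is chosen only if at least `min_matching_words` words match;
    otherwise None is returned, which a real system would treat as a failure.
    """
    spoken = {word.lower() for word in words}
    best_entry, best_score = None, 0
    for entry, phrases in grammar.items():
        for phrase in phrases:
            score = len(spoken & set(phrase))
            if score > best_score:
                best_entry, best_score = entry, score
    return best_entry if best_score >= min_matching_words else None
```

  • With this rule a request such as "Joey's famous pizza" still resolves to the entry, while a single matching word does not.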
  • There are many potential points of failure in the automation process. For example, the speech detection function 24 may not be able to select phonemes, the endpoint detection function 26 may not be able to determine a word or words, the recognition detection function 30 may not be able to detect an entry, and the search function 34 may not be able to determine information for a given entry. If there is a failure at any of these points, a store and forward function 38 will send a recording of the speech to an available operator workstation 18. The store and forward function 38 may also indicate the type of failure or the point of failure in the automation process for the associated speech.
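  • The store and forward behavior can be sketched as a pipeline runner that stops at the first failing stage and packages the recording with the point of failure. The `ForwardedRequest` structure, the stage names, and the convention that a stage returns `None` on failure are assumptions for illustration, and the two stand-in stages do no real speech processing.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ForwardedRequest:
    """What a store and forward step might send to an operator workstation."""
    recording: bytes
    failure_point: str
    partial_result: Optional[Any]  # last successful intermediate result, if any


def run_pipeline(recording, stages):
    """Run (name, function) stages in order.

    Returns (information, None) on success, or (None, ForwardedRequest)
    as soon as a stage returns None.
    """
    value = recording
    for name, stage in stages:
        result = stage(value)
        if result is None:
            partial = None if value is recording else value
            return None, ForwardedRequest(recording, name, partial)
        value = result
    return value, None


# Stand-in stages: phonemes are detected, but no word is found for them.
def detect_phonemes(recording):
    return ["P", "AH", "B"] if recording else None


def detect_words(phonemes):
    return None  # simulate a dictionary miss


STAGES = [("speech_detection", detect_phonemes),
          ("endpoint_detection", detect_words)]
information, forwarded = run_pipeline(b"caller audio", STAGES)
```

  • Carrying the last successful intermediate result along with the recording gives the later comparison something to work with at the phoneme, word, or entry level.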
  • Turning now to FIG. 3, a flow diagram is provided to illustrate the operation of the present invention according to one embodiment. In the illustrated embodiment, this process is implemented in the operator workstation 18, but those skilled in the art will recognize that the various steps may be implemented in or distributed among the operator workstation 18, the information services automation system 16, a third entity, or a combination thereof. Initially, the recorded speech from the caller, and possibly the failure information, is received at the operator workstation 18 (step 100). The operator will listen to the recorded speech, and either interpret the recorded speech or interact with the caller to obtain additional information to determine an entry to provide to the information database 36. Based on the recorded speech or information provided from the caller, the operator will provide an operator entry corresponding to the caller's request (step 102). Accordingly, the operator workstation 18 will then generate an information database request based on the operator entry (step 104). If the information for the entry is unavailable (step 106), the information services process will end (step 116), or the operator may ask the caller for additional information or clarification. If the information associated with the operator entry is available (step 106), the operator may recite the information or may initiate an automated response for the requested information from the automated audio system 20 (step 108). As such, the operator workstation 18 will cause the requested information to be provided to the automated audio system 20, which will then deliver the requested information in an audible format to the caller via the telephony terminal 22.
  • Instead of stopping the automated processing system at this point as with traditional systems, the present invention continues by providing feedback to the information services automation system 16 based on the decisions made by the operator. Accordingly, a comparison is made between the operator input and the provided failure information, which may relate to the phonemes, words, or entries, depending on when the failure occurred (step 110). The failure information may represent the point of failure, or may include the phonemes, words, or entries associated with the automation failure. Thus, the comparison of the failure information to the operator entry can take place on the appropriate level, such as the phoneme level, the word level, or the entry level. Once the comparison is made, database information is generated to update the pertinent databases, such as the dictionary database 28, the grammar database 32, and the information database 36 (step 112). The databases are updated (step 114) and the process ends. For example, if the endpoint detection function 26 was unable to detect a word based on the given phonemes, the words associated with the entry ultimately provided by the operator may be associated with the group of phonemes of the request and added to the dictionary database 28. Thus, a new group of phonemes may be associated with an existing word, or a new word may be added to the dictionary database 28 in association with the group of phonemes. In the latter case, the grammar database 32 would be updated with a new word to associate with the entry as well. In an effort to keep the databases from growing too large, the additional information resulting from feedback from the operator workstations 18 may be removed after a certain period of time or if available memory stores become low.
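The feedback and clean-up behavior can be sketched as two small operations: record the operator's correction in the table that matches the level of failure, and later prune corrections older than some age. This is a simplified illustration under stated assumptions: the `Learned` record, the `"word"` and `"entry"` level labels, the explicit `now` timestamps, and the `max_age` parameter are all hypothetical, and the operator is assumed to supply the word or entry directly.

```python
from dataclasses import dataclass


@dataclass
class Learned:
    """A correction learned from operator feedback, with the time it was added."""
    value: str
    added_at: float


def apply_feedback(failure_level, failure_data, operator_value,
                   dictionary, grammar, now):
    """Store the operator's correction in the table matching the failure level.

    "word":  failure_data is the phoneme group that could not be resolved and
             operator_value is the word to associate with it.
    "entry": failure_data is the word sequence that could not be resolved and
             operator_value is the entry to associate with it.
    """
    if failure_level == "word":
        dictionary[tuple(failure_data)] = Learned(operator_value, now)
    elif failure_level == "entry":
        grammar[tuple(failure_data)] = Learned(operator_value, now)
    else:
        raise ValueError(f"unsupported failure level: {failure_level!r}")


def prune(table, now, max_age):
    """Remove learned associations older than max_age (same units as now)."""
    expired = [key for key, item in table.items() if now - item.added_at > max_age]
    for key in expired:
        del table[key]
```

Keeping a timestamp on each learned association is one simple way to support removal "after a certain period of time"; pruning on low memory would need a different trigger and is not shown.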
  • From the above, the recorded speech that could not be processed by the information services automation system 16 is compared with an entry provided by the operator at the operator workstation 18. The comparison may be between the recorded speech and the entry, the recorded speech and the words of the entry, as well as the recorded speech and the phonemes corresponding to the words of the entry. Thus, the actual speech recognition aspect associated with the dictionary database 28 and the grammar database 32 may be updated as well as the basic information database 36. Those skilled in the art will recognize variations in the different information services automation systems 16 and realize different ways to provide feedback for updating the information services automation system 16 in light of the above teachings.
  • Turning now to FIG. 4, a block representation of an operator workstation 18 is provided according to one embodiment of the present invention. The operator workstation 18 may take the form of a personal computer or workstation having a control system 40, which is associated with an operator interface 42 and one or more communication interfaces, such as a voice interface 44 and an automation system interface 46. The voice interface 44 will support the actual communication session or call to allow the operator to communicate with the caller. The automation system interface 46 will allow direct or indirect communications with the information services automation system 16, the automated audio system 20, or a combination thereof. The control system 40 may also be associated with memory 48 with sufficient software 50 to facilitate the functionality described above. Again, the term “operator” is used only to indicate a human agent who is involved in providing any type of information services.
  • Turning now to FIG. 5, a basic block representation of an information services automation system 16 is illustrated. The information services automation system 16 may include a control system 52 associated with a voice interface 54 for receiving the audible speech in association with an information services request from a telephony user, and a communication interface 56 to facilitate communications with the operator workstations 18, databases 28, 32, and 36, or any other entities with which communications are required. The control system 52 will include sufficient memory 58 having the requisite software 60 to facilitate the operation described above.
  • Those skilled in the art will recognize improvements and modifications to the preferred embodiments of the present invention. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.

Claims (17)

What is claimed is:
1. A method comprising:
a) receiving speech in association with a request for information services;
b) forwarding the speech to an operator workstation;
c) forwarding failure indicia identifying a step in an automation process for providing the information services in which automation failed for the speech; and
d) receiving an update for a database used for providing information services based on operator input provided to the operator workstation when determining a response to the request for information services, wherein the update corresponds to the failure indicia.
2. The method of claim 1 further comprising updating the database with the update.
3. The method of claim 1 wherein the update is further based on the speech.
4. The method of claim 1 wherein the update is a word to associate with a group of phonemes in the speech.
5. The method of claim 1 wherein the update is an entry to associate with a word in the speech, the entry corresponding to the operator input.
6. The method of claim 1 wherein the update is an entry to associate with a group of words in the speech, the entry corresponding to the operator input.
7. The method of claim 1 wherein the update relates to effectively associating a group of phonemes with an entry, the entry corresponding to the operator input.
8. The method of claim 1 wherein the database receiving the update is one of a plurality of databases based on a step in an automation process for providing the information services in which automation failed for the speech.
9. The method of claim 1 further comprising:
a) determining phonemes for the speech;
b) attempting to determine at least one word for the phonemes; and
c) if the at least one word is determined, attempting to determine an entry for the at least one word.
10. A method comprising:
a) receiving speech in association with a request for information services;
b) capturing the received speech;
c) initiating processing of the received speech;
d) detecting a failure in processing the received speech;
e) forwarding the captured received speech to an operator workstation;
f) receiving operator input from the operator workstation; and
g) updating a database used for providing the information services based on the operator input.
11. The method of claim 10, wherein capturing the received speech comprises storing information characterizing the received speech.
12. The method of claim 10, wherein capturing the received speech comprises associating phonemes with the received speech.
13. The method of claim 10, wherein updating the database comprises updating at least one record in the database associated with the captured speech.
14. The method of claim 12, wherein updating the database comprises associating a word with phonemes associated with the received speech.
15. The method of claim 10, further comprising initiating delivery of a response to the request for information services.
16. The method of claim 10, further comprising:
h) modifying the request for information services based on the operator input; and
i) sending the modified request for information services to an information service provider.
17. The method of claim 16, further comprising:
j) receiving an information service response from the information service provider; and
k) providing the information service response to an initiator of the request for information services.
US14/610,124 2004-03-22 2015-01-30 Speech recognition in automated information services systems Abandoned US20150142436A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/610,124 US20150142436A1 (en) 2004-03-22 2015-01-30 Speech recognition in automated information services systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/805,975 US8954325B1 (en) 2004-03-22 2004-03-22 Speech recognition in automated information services systems
US14/610,124 US20150142436A1 (en) 2004-03-22 2015-01-30 Speech recognition in automated information services systems

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/805,975 Continuation US8954325B1 (en) 2004-03-22 2004-03-22 Speech recognition in automated information services systems

Publications (1)

Publication Number Publication Date
US20150142436A1 true US20150142436A1 (en) 2015-05-21

Family

ID=52443705

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/805,975 Expired - Fee Related US8954325B1 (en) 2004-03-22 2004-03-22 Speech recognition in automated information services systems
US14/610,124 Abandoned US20150142436A1 (en) 2004-03-22 2015-01-30 Speech recognition in automated information services systems

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/805,975 Expired - Fee Related US8954325B1 (en) 2004-03-22 2004-03-22 Speech recognition in automated information services systems

Country Status (1)

Country Link
US (2) US8954325B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11966439B2 (en) 2021-03-25 2024-04-23 Samsung Electronics Co., Ltd. Method for providing voice assistant service and electronic device supporting the same

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9524717B2 (en) * 2013-10-15 2016-12-20 Trevo Solutions Group LLC System, method, and computer program for integrating voice-to-text capability into call systems
US11076219B2 (en) * 2019-04-12 2021-07-27 Bose Corporation Automated control of noise reduction or noise masking

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7401023B1 (en) * 2000-09-06 2008-07-15 Verizon Corporate Services Group Inc. Systems and methods for providing automated directory assistance using transcripts

Family Cites Families (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4979206A (en) * 1987-07-10 1990-12-18 At&T Bell Laboratories Directory assistance systems
US5033088A (en) * 1988-06-06 1991-07-16 Voice Processing Corp. Method and apparatus for effectively receiving voice input to a voice recognition system
US5163083A (en) * 1990-10-12 1992-11-10 At&T Bell Laboratories Automation of telephone operator assistance calls
US5497319A (en) * 1990-12-31 1996-03-05 Trans-Link International Corp. Machine translation and telecommunications system
CA2088080C (en) * 1992-04-02 1997-10-07 Enrico Luigi Bocchieri Automatic speech recognizer
US6073101A (en) * 1996-02-02 2000-06-06 International Business Machines Corporation Text independent speaker recognition for transparent command ambiguity resolution and continuous access control
US5875426A (en) * 1996-06-12 1999-02-23 International Business Machines Corporation Recognizing speech having word liaisons by adding a phoneme to reference word models
US6911916B1 (en) * 1996-06-24 2005-06-28 The Cleveland Clinic Foundation Method and apparatus for accessing medical data over a network
US5915001A (en) * 1996-11-14 1999-06-22 Vois Corporation System and method for providing and using universally accessible voice and speech data files
US5991364A (en) * 1997-03-27 1999-11-23 Bell Atlantic Network Services, Inc. Phonetic voice activated dialing
US5897616A (en) * 1997-06-11 1999-04-27 International Business Machines Corporation Apparatus and methods for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases
US6195641B1 (en) * 1998-03-27 2001-02-27 International Business Machines Corp. Network universal spoken language vocabulary
US6138100A (en) * 1998-04-14 2000-10-24 At&T Corp. Interface for a voice-activated connection system
US6470357B1 (en) 1998-07-28 2002-10-22 International Bussiness Machines Corp. System and method of enhanced directory services for telecommunications management network applications
US6493671B1 (en) * 1998-10-02 2002-12-10 Motorola, Inc. Markup language for interactive services to notify a user of an event and methods thereof
US6182045B1 (en) * 1998-11-02 2001-01-30 Nortel Networks Corporation Universal access to audio maintenance for IVR systems using internet technology
US6446076B1 (en) * 1998-11-12 2002-09-03 Accenture Llp. Voice interactive web-based agent system responsive to a user location for prioritizing and formatting information
US6253181B1 (en) * 1999-01-22 2001-06-26 Matsushita Electric Industrial Co., Ltd. Speech recognition and teaching apparatus able to rapidly adapt to difficult speech of children and foreign speakers
US6480819B1 (en) * 1999-02-25 2002-11-12 Matsushita Electric Industrial Co., Ltd. Automatic search of audio channels by matching viewer-spoken words against closed-caption/audio content for interactive television
WO2000058946A1 (en) * 1999-03-26 2000-10-05 Koninklijke Philips Electronics N.V. Client-server speech recognition
US6487530B1 (en) * 1999-03-30 2002-11-26 Nortel Networks Limited Method for recognizing non-standard and standard speech by speaker independent and speaker dependent word models
DE60026637T2 (en) * 1999-06-30 2006-10-05 International Business Machines Corp. Method for expanding the vocabulary of a speech recognition system
US6553113B1 (en) * 1999-07-09 2003-04-22 First Usa Bank, Na System and methods for call decisioning in a virtual call center integrating telephony with computers
US6442519B1 (en) * 1999-11-10 2002-08-27 International Business Machines Corp. Speaker model adaptation via network of similar users
US7725307B2 (en) * 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US6532446B1 (en) * 1999-11-24 2003-03-11 Openwave Systems Inc. Server based speech recognition user interface for wireless devices
US6999923B1 (en) * 2000-06-23 2006-02-14 International Business Machines Corporation System and method for control of lights, signals, alarms using sound detection
EP1325611A4 (en) 2000-09-15 2004-08-04 Grape Technology Group Inc Enhanced directory assistance system
GB0027178D0 (en) * 2000-11-07 2000-12-27 Canon Kk Speech processing system
US7277851B1 (en) * 2000-11-22 2007-10-02 Tellme Networks, Inc. Automated creation of phonemic variations
FR2820872B1 (en) * 2001-02-13 2003-05-16 Thomson Multimedia Sa VOICE RECOGNITION METHOD, MODULE, DEVICE AND SERVER
US7191133B1 (en) * 2001-02-15 2007-03-13 West Corporation Script compliance using speech recognition
US6882707B2 (en) * 2001-02-21 2005-04-19 Ultratec, Inc. Method and apparatus for training a call assistant for relay re-voicing
US7197459B1 (en) * 2001-03-19 2007-03-27 Amazon Technologies, Inc. Hybrid machine/human computing arrangement
US6944447B2 (en) * 2001-04-27 2005-09-13 Accenture Llp Location-based services
US7437295B2 (en) * 2001-04-27 2008-10-14 Accenture Llp Natural language processing for a location-based services system
US7127397B2 (en) * 2001-05-31 2006-10-24 Qwest Communications International Inc. Method of training a computer system via human voice input
US6671670B2 (en) * 2001-06-27 2003-12-30 Telelogue, Inc. System and method for pre-processing information used by an automated attendant
WO2004023455A2 (en) * 2002-09-06 2004-03-18 Voice Signal Technologies, Inc. Methods, systems, and programming for performing speech recognition
JP3997459B2 (en) * 2001-10-02 2007-10-24 株式会社日立製作所 Voice input system, voice portal server, and voice input terminal
US20030105638A1 (en) * 2001-11-27 2003-06-05 Taira Rick K. Method and system for creating computer-understandable structured medical data from natural language reports
US7174300B2 (en) * 2001-12-11 2007-02-06 Lockheed Martin Corporation Dialog processing method and apparatus for uninhabited air vehicles
US7149694B1 (en) * 2002-02-13 2006-12-12 Siebel Systems, Inc. Method and system for building/updating grammars in voice access systems
US6792096B2 (en) * 2002-04-11 2004-09-14 Sbc Technology Resources, Inc. Directory assistance dialog with configuration switches to switch from automated speech recognition to operator-assisted dialog
US7398209B2 (en) * 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
AU2002326879A1 (en) * 2002-06-05 2003-12-22 Vas International, Inc. Biometric identification system
US7680649B2 (en) * 2002-06-17 2010-03-16 International Business Machines Corporation System, method, program product, and networking use for recognizing words and their parts of speech in one or more natural languages
US7047193B1 (en) * 2002-09-13 2006-05-16 Apple Computer, Inc. Unsupervised data-driven pronunciation modeling
US6714631B1 (en) * 2002-10-31 2004-03-30 Sbc Properties, L.P. Method and system for an automated departure strategy
US20040162724A1 (en) * 2003-02-11 2004-08-19 Jeffrey Hill Management of conversations
US7103553B2 (en) * 2003-06-04 2006-09-05 Matsushita Electric Industrial Co., Ltd. Assistive call center interface
US7243072B2 (en) * 2003-06-27 2007-07-10 Motorola, Inc. Providing assistance to a subscriber device over a network
US7590533B2 (en) * 2004-03-10 2009-09-15 Microsoft Corporation New-word pronunciation learning using a pronunciation graph
US7447636B1 (en) * 2005-05-12 2008-11-04 Verizon Corporate Services Group Inc. System and methods for using transcripts to train an automated directory assistance service
US7542904B2 (en) * 2005-08-19 2009-06-02 Cisco Technology, Inc. System and method for maintaining a speech-recognition grammar
US7756708B2 (en) * 2006-04-03 2010-07-13 Google Inc. Automatic language model update
US8401847B2 (en) * 2006-11-30 2013-03-19 National Institute Of Advanced Industrial Science And Technology Speech recognition system and program therefor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BPAI Decision for Application 10/805975 mailed 2/24/14 *

Also Published As

Publication number Publication date
US8954325B1 (en) 2015-02-10

Similar Documents

Publication Publication Date Title
JP4247929B2 (en) A method for automatic speech recognition in telephones.
US7542904B2 (en) System and method for maintaining a speech-recognition grammar
US6243684B1 (en) Directory assistance system and method utilizing a speech recognition system and a live operator
US9112972B2 (en) System and method for processing speech
US8369492B2 (en) Directory dialer name recognition
US7450698B2 (en) System and method of utilizing a hybrid semantic model for speech recognition
US6885733B2 (en) Method of providing a user interface for audio telecommunications systems
US6744861B1 (en) Voice dialing methods and apparatus implemented using AIN techniques
US6687673B2 (en) Speech recognition system
US6643622B2 (en) Data retrieval assistance system and method utilizing a speech recognition system and a live operator
US7966176B2 (en) System and method for independently recognizing and selecting actions and objects in a speech recognition system
US6963633B1 (en) Voice dialing using text names
US20060235684A1 (en) Wireless device to access network-based voice-activated services using distributed speech recognition
US20060093097A1 (en) System and method for identifying telephone callers
US6650738B1 (en) Methods and apparatus for performing sequential voice dialing operations
US20130346080A1 (en) System And Method For Performing Distributed Speech Recognition
US7318029B2 (en) Method and apparatus for a interactive voice response system
US6690772B1 (en) Voice dialing using speech models generated from text and/or speech
JPH06242793A (en) Speaker certification using companion normalization scouring
JPH10215319A (en) Dialing method and device by voice
US6665377B1 (en) Networked voice-activated dialing and call-completion system
EP1466319A1 (en) Network-accessible speaker-dependent voice models of multiple persons
US20150142436A1 (en) Speech recognition in automated information services systems
US8654959B2 (en) Automated telephone attendant
US8213966B1 (en) Text messages provided as a complement to a voice session

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION