WO2002089451A2 - Method and apparatus for automatically processing a user's communication - Google Patents

Method and apparatus for automatically processing a user's communication

Info

Publication number
WO2002089451A2
Authority
WO
WIPO (PCT)
Prior art keywords
symbol strings
gram
listings
stored
recognized
Prior art date
Application number
PCT/US2002/010169
Other languages
English (en)
Other versions
WO2002089451A3 (fr)
Inventor
Yevgenly Lyudovyk
Esther Levin
Original Assignee
Telelogue, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telelogue, Inc.
Priority to EP02766736A (patent EP1388097A2)
Publication of WO2002089451A2
Publication of WO2002089451A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99934 Query formulation, input preparation, or translation
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access
    • Y10S707/99936 Pattern matching access

Definitions

  • the present invention relates to automatically processing a user's communication.
  • the present invention relates to method and apparatus for automatically recognizing and/or processing possibly erroneous or incomplete user's communication.
  • automated attendants have become very popular. Many individuals or organizations use automated attendants to automatically provide information to callers or to route incoming calls. Typically, a user places a call and reaches an automated attendant (e.g. an Interactive Voice Recognition (IVR) system) that prompts the user for desired information and searches an informational database for the requested information. The user enters the request, for example, a name of a business or individual via a keyboard, keypad or spoken inputs. The automated attendant searches for a match based on the user's input and outputs a result if a match can be found.
  • IVR Interactive Voice Recognition
  • the invention concerns a method and apparatus for processing a user's communication.
  • the invention may include receiving a list of recognized symbol strings of one or more recognized entries.
  • the list of recognized symbol strings may include a first similarity score associated with each recognized entry.
  • From each recognized symbol string one or more contiguous sequences of N-symbols may be extracted.
  • At least one extracted contiguous sequence of N-symbols may be matched with at least one stored contiguous sequence of N-symbols from a first database. Based on those matched N-symbols and first similarity scores, a preliminary set of symbol strings and associated second similarity scores may be generated.
  • the preliminary set of symbol strings may include one or more stored symbol strings from a second database that contain one or more matched contiguous sequences of N-symbols.
  • a third similarity score associated with the one or more stored symbol strings included in the preliminary set of symbol strings may be computed.
  • a refined set of symbol strings from the preliminary set of symbol strings based on the computed third similarity score may be output.
  • FIG. 1 is an exemplary block diagram of a communication processing system in accordance with an embodiment of the present invention.
  • FIG. 2 is a detailed block diagram of a database entry matcher, shown in FIG. 1, in accordance with an exemplary embodiment of the present invention.
  • FIG. 3 is a flowchart illustrating an automatic input recognition process in accordance with an exemplary embodiment of the present invention.
  • FIG. 4 illustrates the application of the automatic input recognition process to an exemplary input in accordance with an embodiment of the present invention.
  • Embodiments of the present invention relate to a method and apparatus for automatically recognizing and/or processing a user's communication.
  • the user's communication may be erroneous or incomplete, but in other cases, the user's communication may be correct or complete.
  • the invention may utilize whole words, parts of words, or one or more contiguous sequences of symbols or characters to search one or more databases for entries that best match the user's input communication. Based on the user's communication, embodiments of the invention may reduce the volume of computations that may be required when processing, for example, a user's request for information. Accordingly, embodiments of the present invention may provide a more efficient and effective system for automatically processing the user's request with minimal external intervention.
  • one or more contiguous sequences of listings N-symbols may be extracted from each entry in a listings database.
  • the entries of the listings database may be stored symbol string entries corresponding to information that may be desired by users.
  • a user's communication is received and a ranked list of one or more recognized symbol string entries may be generated.
  • Each recognized symbol string entry in the list may be ranked according to a corresponding first similarity score.
  • One or more recognized contiguous sequences of N-symbols (recognized N-grams) may be extracted from each recognized symbol string entry in the list.
  • the extracted recognized N-grams for each entry may be matched with the identical listings N-grams.
  • a second similarity score may be calculated for one or more database entries mapped to the matched listings N-grams.
  • the second similarity score may be used to generate a preliminary set of symbol strings that may include one or more database entries mapped to the matched listings N-grams.
  • the second similarity score can be used to further narrow the number of listings that need to be searched to generate one or more entries that may be similar or equivalent to the user's communication.
  • a refined set of symbol strings may be generated.
  • the refined set of symbol strings may be generated based on associated third similarity scores.
  • entries of the refined list of symbol strings may include one or more database entries that have a favorable third similarity score.
  • the refined set of symbol strings may be output to the user for selection. If a selection is made, corresponding information may be retrieved and presented to the user. In further embodiments of the invention, further processing may occur by evaluating the refined set of symbol strings with respect to the recognized symbol strings before an output is presented to the user.
  • FIG. 1 is an exemplary block diagram of a communication processing system 100 for processing a user's communication in accordance with an embodiment of the present invention.
  • a recognizer 120 is coupled to a database 110, an output manager 130, and a database entry matcher 140.
  • the recognizer 120 may also receive a user's communication or inputs in the form of speech, text, digital signals, analog signals and/or any other forms of communications or communications signals.
  • user's communication can be a user's input in any form that represents, for example, a single word, multiple words, a single syllable, multiple syllables, a single phoneme and/or multiple phonemes.
  • the user's communication may include a request for information, products, services and/or any other suitable requests.
  • a user's communication may be input via a communication device such as a wired or wireless phone, a pager, a personal digital assistant, a personal computer, and/or any other device capable of sending and/or receiving communications.
  • the user's communication could be a search request to search the World Wide Web (WWW), a Local Area Network (LAN), and/or any other private or public network for the desired information.
  • WWW World Wide Web
  • LAN Local Area Network
  • the recognizer 120 may be any type of recognizer known to those skilled in the art.
  • the recognizer may be an automated speech recognizer (ASR) such as the type developed by Nuance Communications.
  • ASR automated speech recognizer
  • the communication processing system 100 where the recognizer 120 is an ASR, may operate similar to an IVR but includes the advantages of a database entry matcher 140 in accordance with embodiments of the present invention.
  • the recognizer 120 can be a text recognizer, optical character recognizer and/or another type of recognizer or device that recognizes and/or processes a user's inputs, and/or a device that receives a user's input, for example, a keyboard or a keypad.
  • the recognizer 120 may be incorporated within a personal computer, a telephone switch or telephone interface, and/or an Internet, Intranet and/or other type of server.
  • the recognizer 120 may include and/or may operate in conjunction with, for example, an Internet search engine that receives text, speech, etc. from an Internet user.
  • the recognizer 120 may receive user's communication via an Internet connection and operate in accordance with embodiments of the invention as described herein.
  • the recognizer 120 receives the user's communication and generates a list of ranked recognized symbol strings using known methods.
  • the symbol strings may be text or character strings that represent individual or business listings and/or other information for which the user desires additional information.
  • the recognized symbol string may be the name of a business for which the user desires a telephone number.
  • recognizer may generate a list of ranked symbol strings that are similar to the user's input. Each symbol string generated by the recognizer may be a hypothesis of what was originally input by the user. Each symbol string may be ranked by associated first similarity or probability scores.
  • the database 110 may include a listings database that has stored symbol strings or information entries (L(listings a n)) that represent information relating to a particular subject matter.
  • the listings database may include residential, governmental, and/or business listings for a particular town, city, state, and/or country. It is recognized that the stored symbol strings L(listings a n) could represent or include a myriad of other types of information such as individual directory information, specific business or vendor information, postal addresses, e-mail addresses, etc.
  • the database 110 can be part of a larger database of listings information such as a database or other information resource that may be searched by, for example, any Internet search engine when performing a user's search request.
  • the database 110 may also include a recognizer grammar list generated from the symbol strings L(listings a n) stored in the listings database.
  • the recognizer grammar list may include, for example, a plurality of different ways in which symbol strings stored in the listings database 110 may be referred to by users.
  • the recognizer 120 may generate the list of recognized symbol strings and associated first similarity scores based on the recognizer grammar list stored in the database 110.
  • the database entry matcher 140 receives the stored symbol strings L(listings a n) and may extract one or more contiguous sequences of N-symbols (listings N-grams) for each symbol string entry stored in the database 110 (i.e., each entry of stored symbol strings L(listings a n)).
  • the database entry matcher 140 also maps the listings N-gram with the corresponding symbol string from which it was extracted.
  • a list of listings symbol strings containing a particular N-gram is stored as well as elementary second similarity scores. This list may be full, that is, it may include all listings symbol strings from the database containing a particular N-gram.
  • the list may be short, that is, the list may include only a part of all listings symbol strings from the database containing that particular N-gram.
  • the elementary second similarity scores for a particular N-gram may be different for different listings symbol strings containing that N-gram, or the elementary second similarity scores may be the same for all listings symbol strings containing that N-gram.
  • full lists may significantly reduce the volume of computations that may be required to process a user's communication and present a more desirable search result.
  • the use of short lists may further reduce the volume of computations.
  • all of the listings N-grams may be mapped and stored with corresponding stored symbol strings entries in a master list.
  • the database entry matcher 140 extracts one or more recognized contiguous sequences of N-symbols (recognized N-grams) for each recognized symbol string entry in the list of recognized symbol strings generated by the recognizer 120.
  • the database entry matcher 140 may search all of the listings N-grams to find an identical match for each of the recognized N-grams.
  • the database entry matcher 140 may generate a preliminary set of symbol strings based on associated second similarity scores. Entries of the preliminary set of symbol strings include the stored symbol strings mapped to the matched N-grams.
  • The database entry matcher 140 may compute a third similarity score for each entry in the preliminary set of symbol strings. Based on the third similarity score, the database entry matcher 140 may generate a refined set of symbol strings. The refined set of symbol strings may typically include the best or closest match for the user's communication originally received by the recognizer 120.
  • the database entry matcher 140 outputs the refined set of symbol strings to the output manager 130 for processing.
  • the output manager 130 may forward the refined set of symbol strings to the user for selection. Based on the user's selection, the output manager 130 may route a call for the user, retrieve and present additional information to the user, present another prompt to the user, terminate the call if the desired results have been achieved, or perform other steps to output a desired result for the user.
  • the database entry matcher 140 may include, for example, an N-gram map generator 210, an N-gram database 240, a rough and fast matcher 220, and a refined matcher 230.
  • the N-gram map generator 210 is coupled to the database 110, the N-gram database 240 and the rough and fast matcher 220.
  • the rough and fast matcher 220 is coupled to the recognizer 120 and the refined matcher 230.
  • the refined matcher 230 is also coupled to the recognizer 120 and to the output manager 130.
  • the term coupled as used herein implies direct or indirect coupling.
  • the rough and fast matcher 220 is further coupled to, for example, the N-gram database 240 and the output manager 130.
  • the N-gram map generator 210 may extract one or more listings N-grams from each of the listings in the database 110. As indicated above, for one or more extracted listings N-grams, a list (for example, a full list and/or a short list) of listings symbol strings containing a particular N-gram, along with corresponding elementary second similarity scores, may be stored, for example, in the N-gram database 240. Additionally or optionally, this mapping of N-grams to, for example, full lists or short lists may be stored in the database 110.
  • the rough and fast matcher 220 may receive a list of recognized symbol strings with associated first similarity scores. The rough and fast matcher 220 may extract from each recognized symbol string one or more recognized N-grams.
  • the rough and fast matcher 220 may match at least one of the recognized N-grams with at least one of the listings N-grams.
  • the rough and fast matcher 220 may further generate a preliminary set of symbol strings and associated second similarity scores.
  • the preliminary set of symbol strings may include one or more listings from the database 110 that are mapped to the matched N-grams.
  • the refined matcher 230 may receive the preliminary set of symbol strings and may compute a third similarity score for each listing from the database 110 that is included in the preliminary set.
  • the refined matcher 230 may output a refined set of symbol strings from the preliminary set of symbol strings based on the computed third similarity score.
  • the N-gram map generator 210 receives the plurality of symbol strings L(listings a n) stored in the database 110.
  • the N-gram map generator 210 may extract one or more listings N-grams from each symbol string entry stored in the database 110.
  • N can be, for example, 1, 2, 3, 4, 5, or more symbols or characters in length.
  • a symbol as used herein can be any character, sign, mark, figure or other representation in, for example, the English language or any other language.
  • For example, with N = 4, a symbol string entry such as "Park Flowers" would yield the following N-grams (g_k): "(par", "park", "ark_", "rk_f", "k_fl", "_flo", "flow", "lowe", "ower", "wers", and "ers)".
  • the symbol “_” may indicate that the symbol string contains a space which implies that the symbol string is more than one word.
  • the symbol ")" in “ers)” may indicate that this is the last N-gram of the corresponding symbol string. It is recognized that any symbol or character may be chosen to designate the beginning, end, space or other characteristic of the symbol strings.
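  • The extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `extract_ngrams`, the lower-casing step, and the default N = 4 are choices made here for the example, not details fixed by the text.

```python
def extract_ngrams(text: str, n: int = 4) -> list:
    """Return the contiguous n-symbol sequences of a symbol string.

    Assumed conventions (taken from the "Park Flowers" example):
    "(" marks the beginning, ")" marks the end, "_" replaces a space.
    Lower-casing is an assumption of this sketch.
    """
    body = text.lower().replace(" ", "_")
    marked = "(" + body + ")"
    # Slide a window of width n over the marked string.
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


print(extract_ngrams("Park Flowers"))
```

  A string shorter than N symbols (after the markers are added) simply yields no N-grams in this sketch.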
  • Each listings N-gram may be stored in the N-gram database 240 and may be mapped with a list of listings symbol strings stored in the database 110 containing this N-gram and with corresponding elementary second similarity scores.
  • This list may be full, that is, it may include all listings symbol strings from the database containing a particular N-gram. Alternatively it may be short, that is, include only a part of all listings symbol strings from the database containing that particular N-gram.
  • the elementary second similarity scores for a particular N-gram may be different for different listings symbol strings containing that N-gram, or those scores may be the same for all listings symbol strings containing that N-gram.
  • Each list List(g) may include all entries stored in the database 110 that contain a particular listings N-gram g, or it may include only a part of such entries.
  • Short lists that are subsets of full lists may be created in many ways.
  • the number of elements in short lists may be limited to some fixed number that may be the same for all N-grams, for example, 200.
  • Creating a short list for a particular N-gram g may be controlled by listings symbol string priorities.
  • a listings symbol string priority may be defined as the ratio of the number of all N-grams that can be extracted from this listings symbol string and the number of short lists in which this listings symbol string has been included before the processing of this particular N-gram g. So all the listings symbol strings containing this particular N-gram g may be ordered according to their priorities, and the top predefined number, for example 200, of them may be included in the short list for this N-gram g.
  • the priorities of the included listings symbol strings may be recomputed. In additional embodiments of the present invention, the priorities may be computed as some other function of the number of all N-grams that can be extracted from this listings symbol string and the number of short lists in which this listings symbol string has been included before the processing of a particular N-gram g.
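  • One possible reading of the priority-controlled short lists is sketched below. Several points are assumptions of this sketch rather than statements of the text: the helper names `ngrams` and `build_short_lists`, the alphabetical processing order of the N-grams, and the use of `included + 1` in the denominator so that a listing not yet placed in any short list does not cause a division by zero.

```python
from collections import defaultdict


def ngrams(text, n=4):
    """Set of contiguous n-symbol sequences, with "(", ")" and "_" markers."""
    marked = "(" + text.lower().replace(" ", "_") + ")"
    return {marked[i:i + n] for i in range(len(marked) - n + 1)}


def build_short_lists(listings, max_len=200, n=4):
    """Build one short list per N-gram, limited to max_len listings."""
    grams = {s: ngrams(s, n) for s in listings}
    full = defaultdict(list)              # full list List(g) for each g
    for s in listings:
        for g in grams[s]:
            full[g].append(s)

    included = defaultdict(int)           # short lists each listing is already in
    short = {}
    for g in sorted(full):                # processing order is an assumption
        def priority(s):
            # Ratio of extractable N-grams to short lists already joined
            # (+1 is an assumed guard against division by zero).
            return len(grams[s]) / (included[s] + 1)

        chosen = sorted(full[g], key=priority, reverse=True)[:max_len]
        for s in chosen:
            included[s] += 1              # priorities change for later N-grams
        short[g] = chosen
    return short


short = build_short_lists(
    ["Park Flowers", "Park Florist", "Park Flower Shop"], max_len=2
)
```

  Because the inclusion counts are updated after each N-gram is processed, listings that already appear in many short lists lose priority for later N-grams, which spreads coverage across the listings.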
  • each listings N-gram may be mapped to its corresponding list List(g).
  • Each list and corresponding mapping may be stored in the N-gram database 240.
  • Lists List(g) both full and/or short, can significantly reduce the volume of computations that may be required to process a user's communication and present a more desirable search result.
  • computations can be further reduced by creating sub-lists of the listings database that contain information related to a particular subject matter. For example, restaurant listings may be placed into a separate sub-list that can include further lists List(g) for corresponding listings N-grams. Thus, if a user requests a restaurant listing, only the sub-list for restaurants and corresponding lists may be evaluated in accordance with embodiments of the present invention.
  • the N-gram map generator 210 may not be case and/or punctuation sensitive so that punctuation may be removed and capital letters may be changed to lower case letters when N-grams are extracted. In an alternative embodiment, the N-gram map generator 210 may be case and/or punctuation sensitive so that punctuation and capital letters are retained when the N-grams are extracted.
  • the N-gram map generator 210 may also calculate a listings N-gram frequency score M(g), where g indicates the particular N-gram.
  • M(g) may designate the number of stored symbol strings L(listings a n) that contain, for example, N-gram g from the total number M of stored symbol strings L(listings a n) in the database 110.
  • the listings frequency score M(g) represents the total number of database listings in which the particular listings N-gram g appears. For example, a listings database containing business listings for Morristown, NJ may contain 5,743 total entries M.
  • the listings frequency score M("k_fl") may equal 4, indicating that 4 entries out of 5,743 total entries, for example, contain the N-gram "k_fl".
  • the frequency score M("ion)") may equal 116, indicating that 116 out of 5,743 total listings contain the N-gram "ion)".
  • the frequency score may be stored in the N-gram database 240 with the corresponding contiguous sequence of N-symbols and the corresponding list of database listings.
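  • A minimal sketch of the N-gram map and the frequency score M(g) follows. The four listings are invented for illustration, and the names `ngrams`, `build_ngram_map`, `index` and `M` are hypothetical; a Python dictionary stands in for the N-gram database 240.

```python
from collections import defaultdict


def ngrams(text, n=4):
    """Set of contiguous n-symbol sequences, with "(", ")" and "_" markers."""
    marked = "(" + text.lower().replace(" ", "_") + ")"
    return {marked[i:i + n] for i in range(len(marked) - n + 1)}


def build_ngram_map(listings, n=4):
    """Map every listings N-gram g to the full list List(g) containing it."""
    index = defaultdict(set)
    for listing in listings:
        for g in ngrams(listing, n):
            index[g].add(listing)
    return index


# Illustrative listings only; not data from the source.
listings = ["Park Flowers", "Oak Flowers", "Park Florist", "City Construction"]
index = build_ngram_map(listings)

# M(g): number of stored symbol strings that contain the N-gram g.
M = {g: len(entries) for g, entries in index.items()}
print(M["k_fl"], M["ion)"])
```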
  • the N-gram map generator 210 may generate a listings N-gram frequency ratio R(g).
  • the frequency ratio R(g) may be a function of the number of the one or more symbol strings stored in the database 110 that contain the listings N-gram g and the total number of the one or more symbol strings L(listings a n) stored in the database 110.
  • the frequency ratio R(g) indicates the frequency with which a particular N-gram g appears in the entire database listings (i.e., L(listings a n)).
  • a lower ratio indicates that the particular N-gram g appears less frequently in the entire database listings.
  • the higher ratio indicates that the particular N-gram g appears more frequently in the entire database listings.
  • the value of the frequency ratio R(g) can be used to evaluate the "distinguishing power" of a particular N-gram g. Thus, the lower the frequency ratio R(g), the more distinguishing power the particular N-gram has.
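  • The "distinguishing power" comparison can be illustrated with the Morristown figures quoted above. The sketch assumes the simplest form of the ratio, R(g) = M(g) / M; the text only says that R(g) may be a function of these two numbers, so plain division is an assumption, as is the function name `frequency_ratio`.

```python
def frequency_ratio(count_with_gram: int, total_listings: int) -> float:
    """R(g) as the share of listings containing the N-gram g (assumed form)."""
    if total_listings <= 0:
        raise ValueError("total_listings must be positive")
    return count_with_gram / total_listings


TOTAL = 5743                               # total entries M in the example
r_k_fl = frequency_ratio(4, TOTAL)         # N-gram "k_fl"
r_ion = frequency_ratio(116, TOTAL)        # N-gram "ion)"

# The lower ratio marks the more distinguishing N-gram.
more_distinguishing = "k_fl" if r_k_fl < r_ion else "ion)"
print(more_distinguishing)
```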
  • the recognizer 120 may receive a user's communication and may generate a list of recognized symbol strings ($S_1, S_2, S_3, \ldots, S_k$) and associated first similarity scores ($C_1, C_2, C_3, \ldots, C_k$).
  • $C_1$ is the first similarity score associated with $S_1$
  • $C_2$ is the first similarity score associated with $S_2$
  • the recognizer forwards the list of recognized symbol strings to the rough and fast matcher 220.
  • the rough and fast matcher 220 receives the list of recognized symbol strings and associated first similarity scores.
  • the rough and fast matcher 220 may extract one or more recognized N-grams from each of the recognized symbol strings $S_1$ through $S_k$, where N can be, for example, 1, 2, 3, 4, 5, or more symbols or characters in length.
  • the value of N will be the same length for the recognized symbol strings and the stored symbol strings extracted by the N-gram map generator. In alternative embodiments, the value of N may be different lengths for the recognized symbol strings than the value of N for the stored symbol strings extracted by the N-gram map generator.
  • the value of N may be fixed or may vary in length. In other words, the value of N may be any fixed value such as 1, 2, 3, 4, 5 symbols, etc. Alternatively, the value of N may vary from 1 to 3, 1 to 4, 2 to 5, 3 to 4, 3 to 5, 3 to 6 characters and/or any other range of values or any subset of any range; for example, the value of N may vary over 1, 2, 4, 7 characters.
  • the rough and fast matcher 220 compares each of recognized N-grams g extracted from the recognized symbol strings with the listings N-grams g stored in the N-gram database 240 to find one or more identical matches. For every database entry stored in the listings database 110 (or every entry in the one or more full or short lists List(g)) mapped to the matched listings N-gram, a second similarity score is created. The rough and fast matcher 220 may generate a preliminary set of symbol strings based on the second similarity score associated with each entry in the preliminary set of symbol strings. The preliminary set of symbol strings may include one or more symbol strings from the full or short lists List(g) that are mapped to the matched listings N-grams.
  • the second similarity score for a listings symbol string from a preliminary set may be calculated based on elementary second similarity scores, for example, as a sum of the elementary second similarity scores for the recognized and matched N-grams that appear in this listings symbol string.
  • the frequency ratio R(g) may be used to generate an N-gram elementary second similarity score ESSS(g) for each of the recognized N-grams.
  • the N-gram elementary second similarity score ESSS(g) may be equal to the frequency ratio R(g), or may be calculated as a function of the frequency ratio R(g).
  • the N-gram elementary second similarity score ESSS(g) may be calculated as -log(R(g)) or any other function of the ratio R(g).
  • the N-gram elementary second similarity score ESSS(g) for each of the extracted N-grams may be a predetermined fixed value such as 1, 2, 3, etc.
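  • As a sketch of the -log(R(g)) option mentioned above, the following computes an elementary second similarity score from a count and a total. It assumes R(g) is the plain ratio of the two numbers and uses the natural logarithm; neither choice is fixed by the text, and the function name `esss` is hypothetical.

```python
import math


def esss(count_with_gram: int, total_listings: int) -> float:
    """Elementary second similarity score as -log(R(g)), R(g) = count / total."""
    if count_with_gram <= 0 or total_listings <= 0:
        raise ValueError("counts must be positive")
    return -math.log(count_with_gram / total_listings)


# Rare N-grams receive larger scores than common ones.
score_rare = esss(4, 5743)       # e.g. "k_fl" in the Morristown example
score_common = esss(116, 5743)   # e.g. "ion)"
print(score_rare > score_common)
```

  With this choice an N-gram that occurs in every listing scores zero, so it contributes nothing when the scores are summed.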
  • the N-gram elementary second similarity score ESSS(g) for each matched N-gram that belongs to the same stored symbol strings may be added together to generate the second similarity scores.
  • the N-gram elementary second similarity scores ESSS(g) for N-grams from different recognized symbol strings that belong to a particular stored symbol string may be multiplied by the first similarity scores associated with the recognized symbol strings and added up to obtain a second similarity score for this particular stored symbol string.
  • the rough and fast matcher takes as input a list of pairs $\{(S_1, C_1), \ldots, (S_k, C_k)\}$, where $S_i$ is a recognized symbol string and $C_i$ is the first similarity score associated with it. Then, for all different listings $\text{listing}_j$ from the database, a second similarity score $\mathrm{SSS2}(\text{listing}_j)$ is computed. For example, the symbol strings $S_i$ are scanned one by one, extracting all N-grams.
  • the second similarity score $\mathrm{SSS2}(\text{listing}_j)$ is updated by adding the elementary second similarity score ESSS(g) multiplied by the first similarity score of the symbol string $S_i$, that is, $C_i$:
  • $\mathrm{SSS2}(\text{listing}_j) = \mathrm{SSS2}(\text{listing}_j) + \mathrm{ESSS}(g) \cdot C_i$
  • the starting or initial values of second similarity scores may be set to the same value, for example, 0.
  • the starting values of second similarity scores may be different for different listings symbol strings reflecting the a priori information about the probabilities of all listings symbol strings, thus giving some advantage to more probable listings symbol strings.
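  • The accumulation of second similarity scores can be sketched as below. The listings, recognized strings and first similarity scores are invented for illustration; the names `ngrams`, `build_index` and `rough_match` are hypothetical, and every N-gram is given a fixed elementary score of 1 unless a table of scores is supplied.

```python
from collections import defaultdict


def ngrams(text, n=4):
    """Contiguous n-symbol sequences, with "(", ")" and "_" markers."""
    marked = "(" + text.lower().replace(" ", "_") + ")"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


def build_index(listings, n=4):
    """Map each listings N-gram g to the listings that contain it."""
    index = defaultdict(set)
    for listing in listings:
        for g in ngrams(listing, n):
            index[g].add(listing)
    return index


def rough_match(recognized, index, esss=None, initial=None, n=4):
    """Accumulate the second score: score += ESSS(g) * C_i per matched N-gram."""
    esss = esss or {}
    # Optional a priori starting values; everything else starts at 0.
    scores = defaultdict(float, initial or {})
    for s_i, c_i in recognized:
        for g in ngrams(s_i, n):
            for listing in index.get(g, ()):       # listings mapped to g
                scores[listing] += esss.get(g, 1.0) * c_i
    return dict(scores)


listings = ["Park Flowers", "Park Florist", "Oak Hardware"]
recognized = [("park flowers", 0.6), ("bark flowers", 0.3)]
scores = rough_match(recognized, build_index(listings))
```

  Only listings that share at least one N-gram with a recognized string receive a score, which is what keeps the rough pass cheap: unrelated listings are never visited.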
  • a threshold limit may be established to determine the preliminary set of symbol strings.
  • second similarity score threshold limit may establish that any stored listing symbol string having a corresponding second similarity score that meets or exceeds the threshold may be included in the preliminary set of symbol strings.
  • any stored listings symbol string having a corresponding second similarity score that does not meet or exceed the threshold may not be included in the preliminary symbol string set.
  • for example, if the second similarity score threshold is set at 50 points (where 100 points is the highest second similarity score), any stored symbol string having a second similarity score equal to or exceeding 50 points would be included in the preliminary set of symbol strings.
  • conversely, if a stored symbol string has a corresponding second similarity score that is less than, for example, 50 points, then the corresponding stored symbol string may not be included in the preliminary set of symbol strings.
  • the second similarity threshold may be an absolute threshold or may be a relative threshold (e.g., relative to the maximum value of the second similarity scores for those stored symbol strings).
  • other suitable methods may be used to determine which symbol strings may be included in the preliminary set of symbol strings. For example, the symbol strings with the top ten (10) highest second similarity scores may be included in the preliminary set of symbol strings.
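The three selection rules mentioned above (absolute threshold, threshold relative to the maximum score, and top-ranked entries) can be illustrated with one small helper. This is a hedged sketch: the function name `select_candidates` and its parameters are hypothetical, and combining the rules in one function is a convenience of the example rather than something the text prescribes.

```python
def select_candidates(scores, absolute=None, relative=None, top_k=None):
    """Pick listings for the preliminary set from a dict of similarity scores.

    absolute: keep listings whose score meets or exceeds this value.
    relative: keep listings whose score is at least this fraction of the maximum.
    top_k:    keep at most this many of the highest-scoring listings.
    Any rule left as None is skipped.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if absolute is not None:
        ranked = [(listing, s) for listing, s in ranked if s >= absolute]
    if relative is not None and ranked:
        cutoff = relative * max(scores.values())
        ranked = [(listing, s) for listing, s in ranked if s >= cutoff]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [listing for listing, _ in ranked]
```

With scores of 80, 50 and 49 points and `absolute=50`, the first two listings are kept and the third is dropped, matching the 50-point example in the text.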
  • the refined matcher 230 receives the preliminary set of symbol strings from the rough and fast matcher 220 and may compute a third similarity score associated with the one or more symbol strings included in the preliminary set of symbol strings.
  • the refined matcher 230 may also receive one or more N-grams associated with entries included in the preliminary set of symbol strings. Based on the third similarity score, the refined matcher 230 may output a refined set of symbol strings.
  • the refined set of symbol strings may include the best or closest match for the user's communication originally received by the recognizer 120.
  • the third similarity score may be determined by evaluating the list of recognized symbol strings with respect to the preliminary set of symbol strings including the one or more stored symbol strings.
  • the refined matcher 230 may also calculate a refined N-gram frequency score m(g), where g is any N-gram from the recognized symbol string.
  • m(g) is the number of stored symbol strings, out of the total number m of stored symbol strings included in the preliminary set of symbol strings, that contain the recognized N-gram g.
  • the refined N-gram frequency score m(g) represents the number of listings in the preliminary set of symbol strings in which the recognized N-gram g appears.
  • the refined matcher 230 may generate a refined N-gram frequency ratio r(g).
  • the refined matcher 230 may calculate the refined frequency ratio r(g) as the number of stored symbol strings in the preliminary set of symbol strings that contain the recognized N-gram g divided by the total number of stored symbol strings that appear in the preliminary set of symbol strings, that is, r(g) = m(g) / m.
  • the refined frequency ratio r(g) may be used to evaluate the relative importance of a particular N-gram g with respect to the stored listings symbol string in the preliminary set of symbol strings.
  • the refined frequency ratio r(g) indicates the frequency by which a particular N-gram g appears in the preliminary set of symbol strings.
  • a lower ratio indicates that the particular N-gram g appears less frequently in the preliminary set of symbol strings.
  • a higher ratio indicates that the particular N-gram g appears more frequently in the preliminary set of symbol strings.
  • the value of the refined frequency ratio r(g) can be used to evaluate the "distinguishing power" of a particular N-gram g with respect to the preliminary set of symbol strings.
  • the lower the refined frequency ratio r(g) the more distinguishing power the particular N-gram has.
  • the third similarity score TSS3(listing_j) for a stored listing listing_j from the preliminary set of symbol strings may be calculated based on elementary third similarity scores ETSS(g), for example, as a sum of the elementary third similarity scores ETSS(g) for the recognized and matched N-grams that appear in this listing's symbol string.
  • the refined frequency ratio r(g) may be used to generate an N-gram elementary third similarity score ETSS(g) for each of the recognized N-grams.
  • the N-gram elementary third similarity score ETSS(g) may be equal to the refined frequency ratio r(g), or may be calculated as a function of the refined frequency ratio r(g).
  • the elementary third similarity score ETSS(g) for an N-gram g may be calculated as -log(r(g)) or some other function of the refined ratio r(g).
  • the N-gram elementary third similarity score ETSS(g) for each of the extracted N-grams may be a predetermined fixed value such as 1, 2, 3, etc.
  • the refined matcher takes as an input a list of pairs {(S_1, C_1), ..., (S_k, C_k)}, where S_i is the symbol string and C_i is the first similarity score associated with the symbol string. Then, for all different listings listing_j from the preliminary set, the third similarity score TSS3(listing_j) is computed. For example, the symbol strings S_i may be scanned one by one, extracting all N-grams.
  • the third similarity score TSS3(listing_j) is updated by adding the elementary third similarity score ETSS(g) multiplied by the first similarity score of the symbol string S_i, that is, C_i:
  • TSS3(listing_j) = TSS3(listing_j) + ETSS(g) * C_i.
  • the starting values or initial values of third similarity scores may be set to the same value, for example, 0.
  • the starting values of third similarity scores may be different for different listings symbol strings reflecting the a priori information about the probabilities of all listings symbol strings, thus giving some advantage to more probable listings symbol strings.
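The refined pass can be sketched along the same lines. The sketch below assumes character symbols, N = 3, the choice ETSS(g) = -log(r(g)) (one of the options the text allows), and starting values of 0; the names `extract_ngrams` and `refined_scores` are hypothetical. It recomputes the frequency ratio r(g) = m(g) / m over the preliminary set only.

```python
import math


def extract_ngrams(symbols, n=3):
    """Return the set of contiguous n-symbol sequences found in `symbols`."""
    return {symbols[i:i + n] for i in range(len(symbols) - n + 1)}


def refined_scores(recognized, preliminary, n=3):
    """Compute TSS3(listing_j) for every listing in the preliminary set.

    recognized:  list of (S_i, C_i) pairs from the recognizer.
    preliminary: list of stored symbol strings kept by the rough matcher.
    """
    m = len(preliminary)
    listing_ngrams = {listing: extract_ngrams(listing, n) for listing in preliminary}
    scores = {listing: 0.0 for listing in preliminary}  # starting values of 0
    for s_i, c_i in recognized:
        for g in extract_ngrams(s_i, n):
            holders = [l for l, grams in listing_ngrams.items() if g in grams]
            if not holders:
                continue  # N-gram matches nothing in the preliminary set
            r = len(holders) / m   # refined frequency ratio r(g) = m(g) / m
            etss = -math.log(r)    # rarer N-grams get larger elementary scores
            for listing in holders:
                # TSS3(listing_j) = TSS3(listing_j) + ETSS(g) * C_i
                scores[listing] += etss * c_i
    return scores
```

Under this choice an N-gram that occurs in every preliminary listing has r(g) = 1 and contributes nothing, which mirrors the statement that a lower ratio means more distinguishing power.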
  • a threshold limit, for example, may be established to determine the refined set of symbol strings.
  • the third similarity score threshold may be an absolute threshold or may be a relative threshold (e.g., relative to the maximum value of the third similarity scores).
  • other suitable methods may be used to determine which symbol strings may be included in the refined set of symbol strings. For example, the symbol strings with the top ten (10) highest third similarity scores may be included in the refined set of symbol strings.
  • further processing may occur by evaluating the recognized symbol strings with respect to the refined set of symbol strings before an output is presented to the user in the same way as it is done with the preliminary set of symbol strings.
  • This process can be implemented repeatedly in accordance with embodiments of the present invention.
  • the refined matcher outputs the refined set of symbol strings to the output manager 130 for processing.
  • the output manager 130 may decide which listing the user meant and may route a call for the user, or retrieve and/or present the requested information.
  • the output manager 130 may forward the refined set of symbol strings to the user for selection. Based on the user's selection, the output manager 130 may route a call for the user, retrieve and present the requested information.
  • the output manager 130 may present another prompt to the user, terminate the call if the desired results have been achieved, or perform other steps to output a desired result for the user. If the output manager 130 presents another prompt to the user, for example, asks the user to input the desired listing name once more, the new recognized symbol strings may be used to help the output manager make the final decision about the user's goal. This can be done by changing the distribution of the third similarity scores by, for example, adding up the third similarity scores for symbol strings from the refined set of symbol strings computed based on the first user input and the third similarity scores for symbol strings from the refined set of symbol strings computed based on the second user input.
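A minimal sketch of the score-combining step, assuming the third similarity scores from each user input are available as dictionaries keyed by listing. The function name `combine_scores` is hypothetical, and treating a listing that is missing from one pass as having a score of 0 in that pass is an assumption of this example.

```python
def combine_scores(first_pass, second_pass):
    """Add per-listing third similarity scores from two user inputs."""
    combined = dict(first_pass)
    for listing, score in second_pass.items():
        combined[listing] = combined.get(listing, 0.0) + score
    return combined


def best_listing(combined):
    """Return the listing with the highest combined score, or None if empty."""
    return max(combined, key=combined.get) if combined else None
```

A listing that scores well on both attempts rises above one that scored well only once, which is how the second prompt helps settle the final decision.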
  • the configuration of the communication(s) processing system 100 and the database entry matcher 140 as shown in FIGS. 1 and 2, and the corresponding description above, is given by example only and modifications can be made to the communication(s) processing system 100 and to the database entry matcher 140 that fall within the spirit of the invention.
  • the database entry matcher 140 and/or its functionality may be incorporated into the recognizer 120 or the output manager 130, and/or any combination(s) of these components may be formed.
  • the intelligence of the communication(s) processing system 100 may be integrated into one or more application specific integrated circuits (ASICs) and/or one or more software programs.
  • FIG. 3 illustrates a flow chart of a method in accordance with an exemplary embodiment of the present invention.
  • FIG. 4 is an example of a user's communication that is processed in accordance with the method described in the flow chart of FIG. 3.
  • a user may call, for example, directory assistance to locate the telephone number, address and/or other information for a particular individual, organization, agency, business, etc.
  • an automated communication processing system 100 may receive the call and request the user to enter search criteria.
  • the communication processing system 100 may include an automated attendant, an IVR or other suitable automated answering service.
  • the search criteria could be, for example, the name of a business for which additional information is required.
  • the search criteria could be provided as a user's communication, which can be a spoken input, an input entered via a keypad or keyboard, or another suitable input.
  • the recognizer 120 located in the communication processing system 100 may receive a user's communication.
  • the user's communication 401 may be a spoken request "Dantes Restaurant" 402 in response to a request for the search criteria from the communication processing system 100.
  • the recognizer 120 may search the recognizer grammar list 450 created from the recognizer listings database 453 for an N-best match.
  • the recognizer grammar list 450 and the listings database may be stored in, for example, the database 110.
  • the listings database 453 contains a plurality of symbol strings representing, for example, names of local restaurants.
  • the recognizer grammar list 450 includes entries that represent the different ways users may refer to the symbol strings stored in the listings database 453.
  • the N-gram map generator 210 may extract one or more contiguous sequences of N-symbols (listings N-grams) from the plurality of symbol strings stored in the listings database 453.
  • the N-gram map generator 210 may create an N-gram map 455 containing the extracted contiguous sequences of N-symbols that are mapped to corresponding symbol strings stored in the listings database 453.
  • the N-gram map 455 may be stored in, for example, N-gram database 240.
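One way to picture the N-gram map is as an inverted index from each N-gram to the listings that contain it. The sketch below is illustrative only: it assumes character symbols and N = 3, and `build_ngram_map` is a hypothetical name, not a function described in the source.

```python
from collections import defaultdict


def build_ngram_map(listings, n=3):
    """Map every contiguous n-symbol sequence to the listings containing it."""
    ngram_map = defaultdict(set)
    for listing in listings:
        for i in range(len(listing) - n + 1):
            ngram_map[listing[i:i + n]].add(listing)
    return dict(ngram_map)
```

The map is built once from the listings database, so matching a recognized N-gram at query time reduces to a dictionary lookup.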
  • the recognizer 120 may generate a list of recognized symbol strings 403 including one or more recognized entries 405 based on the received user's communication 402 and may compute the associated first similarity scores 407 for the one or more recognized entries.
  • the recognizer 120 may transmit the list of recognized symbol string entries 405 and the associated first similarity scores 407 to the database entry matcher 140.
  • the rough and fast matcher 220 located in the database entry matcher 140, receives the list of recognized symbol strings 405 and the associated first similarity scores 407 (3070).
  • the rough and fast matcher 220 extracts from each of the recognized symbol strings 405, in the list of recognized symbol strings 403, one or more contiguous sequences of N-symbols 411 (recognized N-grams) (3090).
  • the rough and fast matcher 220 may match at least one of the extracted contiguous sequences of N-symbols 411 with at least one stored contiguous sequence of N-symbols from the N-gram map 455 stored in the N-gram database 240 (3110).
  • the rough and fast matcher 220 further generates a preliminary set of symbol strings 413 based on the associated second similarity scores 417 (3130).
  • the preliminary set of symbol strings 413 may include one or more stored symbol strings from the listings database 453 that correspond to the matched contiguous sequence of N-symbols.
  • the refined matcher computes the third similarity scores 423 associated with the one or more stored symbol strings 415 included in the preliminary set of symbol strings 413 and outputs a refined set of symbol strings 421 (3150).
  • the refined set of symbol strings 421 may be output based on the computed third similarity scores 423 (3170).
  • the refined set of symbol strings 421 may be output to the output manager 130.
  • the output manager may directly output the refined set of symbol strings 421 to the user for selection.
  • the output manager 130 may retrieve additional information corresponding to the user's selection and present such information to the user. Such additional information may include, for example, a corresponding telephone number, mailing address, e-mail address, etc. This additional information may be located in, for example, database 110 and/or any other informational database.
  • the output manager may offer to connect the user with the selection if the user is satisfied with the resulting set of symbol strings presented. However, if the user is unsatisfied, the output manager 130 may return the refined set of symbol strings 421 to the refined matcher 230 and/or the rough and fast matcher 220 for further processing in accordance with embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Circuits Of Receivers In General (AREA)

Abstract

The invention relates to a method and an apparatus for processing a user's communication. The invention includes receiving a list of recognized symbol strings of one or more recognized entries. The list of recognized symbol strings may include a first similarity score associated with each recognized entry. From each recognized symbol string, one or more contiguous sequences of N symbols may be extracted. One of the extracted contiguous sequences of N symbols may be matched with at least one contiguous sequence of N symbols stored in a first database. A preliminary set of symbol strings and second similarity scores may be generated. The preliminary set of symbol strings may include one or more symbol strings stored in a second database that correspond to the matched contiguous sequence of N symbols. A third similarity score associated with one or more stored symbol strings included in the preliminary set of symbol strings may be computed. A refined set of symbol strings, drawn from the preliminary set of symbol strings and based on the computed third similarity score, may be output.
PCT/US2002/010169 2001-04-12 2002-04-02 Method and apparatus for automatically processing a user's communication WO2002089451A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP02766736A EP1388097A2 (fr) 2001-04-12 2002-04-02 Method and apparatus for automatically processing a user's communication

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US28318201P 2001-04-12 2001-04-12
US60/283,182 2001-04-12
US09/845,360 2001-05-01
US09/845,360 US6625600B2 (en) 2001-04-12 2001-05-01 Method and apparatus for automatically processing a user's communication

Publications (2)

Publication Number Publication Date
WO2002089451A2 (fr) 2002-11-07
WO2002089451A3 (fr) 2003-07-24

Family

ID=26961909

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/010169 WO2002089451A2 (fr) 2001-04-12 2002-04-02 Method and apparatus for automatically processing a user's communication

Country Status (3)

Country Link
US (2) US6625600B2 (fr)
EP (1) EP1388097A2 (fr)
WO (1) WO2002089451A2 (fr)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6625600B2 (en) * 2001-04-12 2003-09-23 Telelogue, Inc. Method and apparatus for automatically processing a user's communication
US6934675B2 (en) * 2001-06-14 2005-08-23 Stephen C. Glinski Methods and systems for enabling speech-based internet searches
US8700404B1 (en) * 2005-08-27 2014-04-15 At&T Intellectual Property Ii, L.P. System and method for using semantic and syntactic graphs for utterance classification
US20070078653A1 (en) * 2005-10-03 2007-04-05 Nokia Corporation Language model compression
JP4321518B2 (ja) * 2005-12-27 2009-08-26 Mitsubishi Electric Corporation Music section detection method and device, and data recording method and device
US7653183B2 (en) * 2006-04-06 2010-01-26 Cisco Technology, Inc. Method and apparatus to provide data to an interactive voice response (IVR) system
JP4442585B2 (ja) * 2006-05-11 2010-03-31 Mitsubishi Electric Corporation Music section detection method and device, and data recording method and device
US20080091427A1 (en) * 2006-10-11 2008-04-17 Nokia Corporation Hierarchical word indexes used for efficient N-gram storage
US7818170B2 (en) * 2007-04-10 2010-10-19 Motorola, Inc. Method and apparatus for distributed voice searching
ATE479983T1 (de) * 2007-10-24 2010-09-15 Harman Becker Automotive Sys Method and system for speech recognition for searching a database
US7925652B2 (en) * 2007-12-31 2011-04-12 Mastercard International Incorporated Methods and systems for implementing approximate string matching within a database
GB0905457D0 (en) 2009-03-30 2009-05-13 Touchtype Ltd System and method for inputting text into electronic devices
GB201016385D0 (en) 2010-09-29 2010-11-10 Touchtype Ltd System and method for inputting text into electronic devices
GB0917753D0 (en) 2009-10-09 2009-11-25 Touchtype Ltd System and method for inputting text into electronic devices
US9424246B2 (en) 2009-03-30 2016-08-23 Touchtype Ltd. System and method for inputting text into electronic devices
US9189472B2 (en) 2009-03-30 2015-11-17 Touchtype Limited System and method for inputting text into small screen devices
US10191654B2 (en) 2009-03-30 2019-01-29 Touchtype Limited System and method for inputting text into electronic devices
GB201003628D0 (en) 2010-03-04 2010-04-21 Touchtype Ltd System and method for inputting text into electronic devices
GB201200643D0 (en) 2012-01-16 2012-02-29 Touchtype Ltd System and method for inputting text
US8745061B2 (en) * 2010-11-09 2014-06-03 Tibco Software Inc. Suffix array candidate selection and index data structure
US8942981B2 (en) * 2011-10-28 2015-01-27 Cellco Partnership Natural language call router
US9779312B2 (en) * 2015-01-30 2017-10-03 Honda Motor Co., Ltd. Environment recognition system
GB201610984D0 (en) 2016-06-23 2016-08-10 Microsoft Technology Licensing Llc Suppression of input images
US11138966B2 (en) * 2019-02-07 2021-10-05 Tencent America LLC Unsupervised automatic speech recognition

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5418951A (en) * 1992-08-20 1995-05-23 The United States Of America As Represented By The Director Of National Security Agency Method of retrieving documents that concern the same topic
US5724593A (en) * 1995-06-07 1998-03-03 International Language Engineering Corp. Machine assisted translation tools
US5991714A (en) * 1998-04-22 1999-11-23 The United States Of America As Represented By The National Security Agency Method of identifying data type and locating in a file
US6006221A (en) * 1995-08-16 1999-12-21 Syracuse University Multilingual document retrieval system and method using semantic vector matching
US6029167A (en) * 1997-07-25 2000-02-22 Claritech Corporation Method and apparatus for retrieving text using document signatures
US6252988B1 (en) * 1998-07-09 2001-06-26 Lucent Technologies Inc. Method and apparatus for character recognition using stop words
US6269368B1 (en) * 1997-10-17 2001-07-31 Textwise Llc Information retrieval using dynamic evidence combination
US6286006B1 (en) * 1999-05-07 2001-09-04 Alta Vista Company Method and apparatus for finding mirrored hosts by analyzing urls
US6397205B1 (en) * 1998-11-24 2002-05-28 Duquesne University Of The Holy Ghost Document categorization and evaluation via cross-entrophy

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3928724A (en) 1974-10-10 1975-12-23 Andersen Byram Kouma Murphy Lo Voice-actuated telephone directory-assistance system
US5052038A (en) 1984-08-27 1991-09-24 Cognitronics Corporation Apparatus and method for obtaining information in a wide-area telephone system with digital data transmission between a local exchange and an information storage site
US4608460A (en) 1984-09-17 1986-08-26 Itt Corporation Comprehensive automatic directory assistance apparatus and method thereof
US4650927A (en) 1984-11-29 1987-03-17 International Business Machines Corporation Processor-assisted communication system using tone-generating telephones
US4674112A (en) 1985-09-06 1987-06-16 Board Of Regents, The University Of Texas System Character pattern recognition and communications apparatus
US4915546A (en) 1986-08-29 1990-04-10 Brother Kogyo Kabushiki Kaisha Data input and processing apparatus having spelling-check function and means for dealing with misspelled word
DE3777829D1 (de) 1986-10-08 1992-04-30 American Telephone & Telegraph Directory assistance call processing and non-supervisory signal monitoring arrangements
WO1989000793A1 (fr) 1987-07-10 1989-01-26 American Telephone & Telegraph Company Directory assistance systems
US4979206A (en) 1987-07-10 1990-12-18 At&T Bell Laboratories Directory assistance systems
US5218536A (en) 1988-05-25 1993-06-08 Franklin Electronic Publishers, Incorporated Electronic spelling machine having ordered candidate words
US5214689A (en) 1989-02-11 1993-05-25 Next Generaton Info, Inc. Interactive transit information system
US5255310A (en) 1989-08-11 1993-10-19 Korea Telecommunication Authority Method of approximately matching an input character string with a key word and vocally outputting data
US5261112A (en) 1989-09-08 1993-11-09 Casio Computer Co., Ltd. Spelling check apparatus including simple and quick similar word retrieval operation
US5203705A (en) 1989-11-29 1993-04-20 Franklin Electronic Publishers, Incorporated Word spelling and definition educational device
AU631276B2 (en) 1989-12-22 1992-11-19 Bull Hn Information Systems Inc. Name resolution in a directory database
US5131045A (en) 1990-05-10 1992-07-14 Roth Richard G Audio-augmented data keying
US5087913A (en) * 1990-08-27 1992-02-11 Unisys Corporation Short-record data compression and decompression system
JPH0576671A (ja) 1991-09-20 1993-03-30 Aisin Seiki Co Ltd Embroidery processing system for an embroidery machine
US5621857A (en) 1991-12-20 1997-04-15 Oregon Graduate Institute Of Science And Technology Method and system for identifying and recognizing speech
WO1994014270A1 (fr) 1992-12-17 1994-06-23 Bell Atlantic Network Services, Inc. Mechanized directory assistance service
US5412756A (en) * 1992-12-22 1995-05-02 Mitsubishi Denki Kabushiki Kaisha Artificial intelligence software shell for plant operation simulation
US5519608A (en) * 1993-06-24 1996-05-21 Xerox Corporation Method for extracting from a text corpus answers to questions stated in natural language by using linguistic analysis and hypothesis generation
JPH0756933A (ja) * 1993-06-24 1995-03-03 Xerox Corp Document retrieval method
US5457770A (en) 1993-08-19 1995-10-10 Kabushiki Kaisha Meidensha Speaker independent speech recognition system and method using neural network and/or DP matching technique
US5623578A (en) 1993-10-28 1997-04-22 Lucent Technologies Inc. Speech recognition system allows new vocabulary words to be added without requiring spoken samples of the words
US5634055A (en) * 1994-09-27 1997-05-27 Bidplus, Inc. Method for selecting assignments
AU3734395A (en) 1994-10-03 1996-04-26 Helfgott & Karas, P.C. A database accessing system
US5479489A (en) 1994-11-28 1995-12-26 At&T Corp. Voice telephone dialing architecture
US5940825A (en) * 1996-10-04 1999-08-17 International Business Machines Corporation Adaptive similarity searching in sequence databases
US5839107A (en) 1996-11-29 1998-11-17 Northern Telecom Limited Method and apparatus for automatically generating a speech recognition vocabulary from a white pages listing
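JP2996926B2 (ja) * 1997-03-11 2000-01-11 ATR Interpreting Telecommunications Research Laboratories Posterior probability calculation device for phoneme symbols and speech recognition device
JP3601350B2 (ja) * 1998-09-29 2004-12-15 Yamaha Corporation Performance image information creation device and playback device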
US6430559B1 (en) * 1999-11-02 2002-08-06 Claritech Corporation Method and apparatus for profile score threshold setting and updating
US6499029B1 (en) * 2000-03-29 2002-12-24 Koninklijke Philips Electronics N.V. User interface providing automatic organization and filtering of search criteria
US6625600B2 (en) * 2001-04-12 2003-09-23 Telelogue, Inc. Method and apparatus for automatically processing a user's communication

Also Published As

Publication number Publication date
US20040030695A1 (en) 2004-02-12
EP1388097A2 (fr) 2004-02-11
WO2002089451A3 (fr) 2003-07-24
US20020152207A1 (en) 2002-10-17
US6625600B2 (en) 2003-09-23

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2002766736

Country of ref document: EP

Ref document number: 529491

Country of ref document: NZ

Ref document number: 2002338597

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 2002766736

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2002766736

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Ref document number: JP