WO1995014974A1 - Input system for text retrieval
- Publication number
- WO1995014974A1 (PCT/US1994/013279)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- name
- word
- abbreviation
- inputs
- speaker
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90348—Query processing by searching ordered data, e.g. alpha-numerically ordered data
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07B—TICKET-ISSUING APPARATUS; FARE-REGISTERING APPARATUS; FRANKING APPARATUS
- G07B15/00—Arrangements or apparatus for collecting fares, tolls or entrance fees at one or more control points
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C15/00—Generating random numbers; Lottery apparatus
- G07C15/006—Generating random numbers; Lottery apparatus electronically
Definitions
- a fruitful approach to spelling names is to abbreviate them.
- a practical abbreviation method should combine user friendliness and efficient data compression.
- the basics of such a method, and the system for implementing it, are as follows: A speaker has in mind a name she would like to spell. The speaker chooses a word in the name by identifying the position of the word in the name. The speaker then spells the word using a recognizer. The system allows the speaker to go to another word at any time by identifying that word by its position in the name. Because the speaker can freely choose the most unusual word(s) in the name, this method minimizes the number of letters needed to be spelled to specify the name out of a list of names.
- Figure 1 shows a flow chart of the basic system for names.
- Figure 2 shows a flow chart of the system including additional, useful inputs.
- Figure 3 shows a flow chart of the system including erasure functions.
- Figure 4 shows a flow chart of the system in combination with a look-up system.
- Figures 5-5c show flow charts of the system together with an interactive look-up system.
- Figure 6 shows a flow chart of the system with a recognizer that does not confirm inputs.
- Figure 7 shows a flow chart of a probabilistic sorter of customer demand.
- The system creates abbreviations that represent names.
- The abbreviation system can be combined with a look-up system which uses the abbreviations to find names in a data-base.
- The invention can also be used in combination with an interactive look-up system that guides the speaker in the use of the abbreviation method.
- Recognizer Means that enable a computer to have a reasonable chance of converting a speaker's speech inputs into the symbols intended by the speaker.
- Recognizer IR that outputs its guesses in voice form.
- Word A string of characters.
- Name A string of words. A name here is not used in the sense of a proper name. A name may have multiple proper names within it.
- Compound Name A string of names. Having more than one name in a full name does not necessarily mean that the full name is compound. A name with multiple names in it can be interpreted as a single name or a compound name. The designer of a system must decide which interpretation is best. For example, Sony Walkman MegaBass could be considered one name or three.
- Abbreviation A compressed text string used as a set of search parameters.
- Letter Input Input used to spell words. "Letter input” will mean the alpha-numeric symbols and punctuation marks that make up words.
- Word Input Input that is itself a full word.
- Word Identifier Input Input that denotes the sequential position of a word in a name. Used to specify what word a speaker's letter inputs correspond to. According to the number it contains, a word identifier specifies the position of all the letters entered after it, up until another word identifier is entered. For example, "Second word," specifies that the next letters entered are spelling the second word in a name. The most useful word identifiers are the counting numbers; "Word" followed by the counting numbers; the ordinal numbers; and the ordinal numbers followed by "word". If the system includes full word inputs, a word identifier would also specify the position of a full word. For example, "Second Word" followed by a word would cause the word to be placed in the second word position of the abbreviation.
- Name Identifier Input Input that denotes what name a speaker's letter inputs correspond to. For example, "Second Name”, denotes that the next letters entered are spelling the second name in a compound name. Unlike a word identifier, a name identifier would not necessarily denote sequential position. A name identifier could be descriptive, for instance, "Manufacturer,” “Author,” “Street,” and so on.
- Step 1 The speaker starts by entering the word identifier corresponding to the word in the name that the speaker will be spelling first.
- Step 2 The speaker spells the word corresponding to the word identifier just entered. If a speaker returns to a word after having spelled it partially, the speaker continues where he left off.
- Step 3 The speaker can stop spelling a word by entering one of the following inputs: a. a word identifier, after which the speaker goes to step 2. b. "Done,” after which the speaker stops or goes to step 1.
- the basic system for implementing the method above requires the following elements in combination: A computer, an interactive recognizer (IR) 1 that recognizes and confirms inputs, and a program that executes the steps below for converting the confirmed inputs into an abbreviation:
- IR interactive recognizer
- the program waits for a word identifier 2. After this input is entered, the program begins building a set of search parameters, an abbreviation 3, for a name. The program stores the word identifier input as described in the step directly below. After this first input is stored,
- If a word identifier 5 is entered, the program stores 4 it in the abbreviation to specify the position of the next letters entered. The position is specified by the number of the identifier.
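- The storage steps of the basic program can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's program: the input encoding (an integer for a word identifier, a single character for a letter, the string "DONE" for the "Done" input) and the name `build_abbreviation` are hypothetical.

```python
def build_abbreviation(inputs):
    """Turn a stream of confirmed inputs into {word_position: spelled_prefix}.

    Assumed encoding (not from the source): an int is a word identifier,
    a one-character string is a letter, and "DONE" ends the name.
    """
    abbreviation = {}
    current = None  # word position chosen by the last word identifier
    for item in inputs:
        if item == "DONE":
            break
        if isinstance(item, int):
            # A word identifier fixes where the next letters are stored.
            current = item
            abbreviation.setdefault(current, "")
        elif current is None:
            # The program waits for a word identifier before storing letters.
            continue
        else:
            # Letters are appended in the order they are entered, so a
            # speaker who returns to a word continues where she left off.
            abbreviation[current] += item.upper()
    return abbreviation


# "Second word, B, O, first word, R, second word, W, Done"
print(build_abbreviation([2, "B", "O", 1, "R", 2, "W", "DONE"]))
```

Keying the abbreviation by word position is one possible design; it makes returning to a partially spelled word a plain append.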
- the system can include a Name Identifier input that signifies the name the speaker will be spelling.
- the addition of inputs and functions for handling compound names does not essentially change the system.
- the method simply includes steps enabling the speaker to signify what name she is spelling.
- In the first step, the speaker identifies the name being spelled, and in the second, identifies the word being spelled.
- the speaker can change names by saying a different name identifier followed by a word identifier.
- the name identifier specifies the word identifiers that are entered after it.
- the program waits for a name identifier to be entered. After the name identifier is entered, the program begins building an abbreviation. The program stores the name identifier in the abbreviation as the specifier of the next word identifiers entered. Then, a. If a name identifier is entered, the program stores it in the abbreviation as the specifier of the next word identifiers entered. b. If a word identifier is entered, the program stores it in the abbreviation in the field defined by the last name identifier entered. c. If a letter is entered, the program stores it in the abbreviation in the field defined by the last word identifier entered. d. If "Done" is entered, the program goes to Step 1.
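- The nesting of name identifiers, word identifiers, and letters can be illustrated with a small Python sketch. The `(kind, value)` tuple encoding and the function name `build_compound_abbreviation` are assumptions made for the example; the source does not prescribe a storage format.

```python
def build_compound_abbreviation(inputs):
    """Build {name_identifier: {word_position: spelled_prefix}}.

    Each input is an assumed (kind, value) pair, where kind is one of
    "name", "word", "letter", or "done".
    """
    abbreviation = {}
    name = None  # field opened by the last name identifier
    word = None  # field opened by the last word identifier
    for kind, value in inputs:
        if kind == "done":
            break
        if kind == "name":
            # A name identifier specifies the word identifiers that follow it.
            name, word = value, None
            abbreviation.setdefault(name, {})
        elif kind == "word" and name is not None:
            word = value
            abbreviation[name].setdefault(word, "")
        elif kind == "letter" and word is not None:
            abbreviation[name][word] += value.upper()
    return abbreviation


spoken = [
    ("name", "Manufacturer"), ("word", 1), ("letter", "S"),
    ("name", "Model"), ("word", 1), ("letter", "W"),
    ("done", None),
]
print(build_compound_abbreviation(spoken))
```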
- Part 3 System Including Additional Helpful Inputs
- Positional Letter Identifier Inputs (“letter identifiers"): The system can include inputs that denote the sequential position of letters in words.
- the most useful letter identifiers are those with counting numbers or ordinal numbers, for instance, "Letter three" or "Third letter.”
- the letter identifier is analogous to the word identifier and is nested within it.
- a letter identifier 31 is stored 32 in the field defined by the last word identifier entered. For example, "Third word,” “Fourth letter.” Then the letter following the letter identifier is stored 32 in the abbreviation as specified by the number in the letter identifier. For example, a person spelling "Rocklands” might say, “Fourth letter,” “K.” "K” is then stored as the fourth letter of the word.
- Last Letter As shown in figure 2, the system can include an input, call it "Last Letter,” that denotes that the next letter entered is the last letter of the word being spelled.
- the system program registers that if a letter is entered next, the letter is to be stored as the last letter 36 in the last word identified.
- the system can include an input, call it "Second- to-Last Letter,” that works similarly, but refers to the second to last letter of a word.
- the system can include an input, which might be called "Skip Letter," that denotes that the speaker is skipping spelling a letter in the word being spelled. When the speaker enters this input, the system stores a blank character in the abbreviation.
- the system can also include inputs, which might be called "Skip Vowel," and "Skip Consonant," that denote that the speaker is skipping a vowel and a consonant.
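- The positional inputs described in this part (letter identifiers, "Last Letter," "Second-to-Last Letter," and "Skip Letter") can be sketched for a single word as follows. The token names, the sparse `{position: letter}` pattern, and the choice to resume sequential spelling right after a positioned letter are all assumptions of this sketch, not details given in the source.

```python
def spell_word(inputs):
    """Build a sparse pattern {position: letter} for one word.

    Positions are 1-based; -1 is the last letter and -2 the second-to-last.
    Assumed tokens: ("LETTER", n), "LAST", "SECOND_TO_LAST", "SKIP",
    and single characters for letters.
    """
    pattern = {}
    next_pos = 1   # where the next plain letter goes
    forced = None  # position set by a pending letter identifier
    for item in inputs:
        if isinstance(item, tuple) and item[0] == "LETTER":
            forced = item[1]
        elif item == "LAST":
            forced = -1
        elif item == "SECOND_TO_LAST":
            forced = -2
        elif item == "SKIP":
            pattern[next_pos] = " "  # blank character for a skipped letter
            next_pos += 1
        elif forced is not None:
            pattern[forced] = item.upper()
            if forced > 0:
                next_pos = forced + 1  # assumption: spelling resumes here
            forced = None
        else:
            pattern[next_pos] = item.upper()
            next_pos += 1
    return pattern


def word_matches(pattern, word):
    """Check a word against a pattern; blanks match any letter."""
    word = word.upper()
    for pos, letter in pattern.items():
        if letter == " ":
            continue
        index = pos - 1 if pos > 0 else pos
        if not -len(word) <= index < len(word) or word[index] != letter:
            return False
    return True


pattern = spell_word(["R", "SKIP", ("LETTER", 4), "K", "LAST", "S"])
print(pattern, word_matches(pattern, "Rocklands"))
```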
- Word Done As shown in figure 2, the system can include an input 37, which might be called "Word Done," that denotes that all the letters in a word have been spelled. The system program stores 38 this input in the abbreviation under the last word identified.
- Words Done The system can include an input 39, which may be called "Words Done," that denotes that all the words in a name have been spelled to some extent.
- Numeric Value Inputs The system can include numeric value inputs that denote the number of letters in a word, the number of words in a name, and the number of letters in a name. A numeric input has two parts, a number and a description telling what the number applies to. The program registers the descriptive part and then stores the number as the descriptive part specifies. The program must be designed to recognize the descriptive part which, as mentioned, can vary. Described below are four ways of entering numeric value inputs.
- By sequential order, where the speaker has the option of entering a numeric value after a certain other input has been entered.
- the system can be designed to follow a convention whereby after a certain input, the speaker can enter an identifier, a letter, or a digit. If a digit, the digit is one of the three numeric values above. For example, after a word identifier, a digit could denote the number of letters in the word just identified. For example, "Second word,” "Four” would denote that the next letters entered belonged to the second word and that the word had four letters. It is also possible to have multiple digits where each digit denotes a different piece of information about the name.
- a speaker could begin by saying, "two, three, seven.”
- the "two” could denote that there are two words in the name.
- the "three” could denote that the next letter(s) spell the third word in the name.
- the "seven” could denote that there are seven letters in the third word.
- By word identifier code, where a word identifier denotes more information than just the position of a word in a name. Such a coding method introduces imaginary, null words into a name. These null words can go in front of the real words or can be interspersed between them. If in front, the sequential order of the real words is preserved. If in between, the order of the real words is preserved. From the example above, a speaker could use the word identifier "237th word" which would be equivalent to "2-3-7" above. The system program would be designed to recognize the code in order to store the numeric part of the input.
- Part 4: Helpful Functions
- the system can include a step and function with which the speaker can erase a previous input or inputs. As shown in figure 3, the erasure could be at the letter 51, 52, word 53, 54, 55, or name level. "Erase" 50 followed by any input(s) can signify that the input is to be erased from the abbreviation.
- the abbreviation system allows a speaker to return to a word that has been previously spelled to some extent and spell more of it.
- the system as described above has the speaker pick up where she left off because letters are stored sequentially in an abbreviation in the order they are entered. However, it may be easier for a speaker to start at the beginning of a word rather than try to remember her place.
- the system can include a function for eliminating re-confirmations of inputs such that: after a speaker enters a word identifier that has been previously entered, the system program counts the letters already spelled for that word; the program then detects the speaker's next inputs but does not confirm or store them until it detects as many inputs as there are letters already spelled; then it confirms and stores inputs as usual.
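- The function for eliminating re-confirmations amounts to ignoring as many inputs as there are letters already stored for the word. A minimal Python sketch follows; the function name `resume_word` and the action labels are hypothetical.

```python
def resume_word(stored, respelled):
    """Re-spell a word from its beginning without re-confirming old letters.

    stored:    letters already in the abbreviation for this word
    respelled: letters the speaker enters after repeating the word identifier
    Returns the updated letters and a log of what happened to each input.
    """
    already = len(stored)  # count the letters already spelled
    log = []
    for count, letter in enumerate(respelled, start=1):
        if count <= already:
            # Detected but neither confirmed nor stored.
            log.append((letter, "ignored"))
        else:
            stored += letter.upper()
            log.append((letter, "confirmed and stored"))
    return stored, log


letters, log = resume_word("BO", "BOWL")
print(letters)
for entry in log:
    print(entry)
```

This sketch counts inputs only, as the text describes; it does not check that the repeated letters agree with the stored ones.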
- a speaker returning to a word may need to be reminded of where he left off.
- the system can include a function for reminding the speaker. After a speaker enters a word identifier that has previously been entered, the system program outputs all the letters in the abbreviation stored under that word identifier. After this output, the program stores letters normally (in sequential order following the last letter already stored in the word).
- Full Word Acceptance Function While the system is designed to allow the speaker to abbreviate words and ultimately names, the method can include full word inputs. These can be very useful. Thus a speaker abbreviating "Reggies Bowling Alley,” might enter “First Word,” “R,” “Second Word,” “Bowling,” and so on. The recognizer would have to be able to recognize a vocabulary of full words in addition to those words used as program commands (such as "First,” “Word Done,” etc.). Words that are program commands might not be stored in the abbreviation. To accept word inputs as well as letter inputs, the system program would include a function that stores full words.
- the abbreviation system should be combined with a look-up system that includes: a) a data-base, b) a program for using the abbreviations created to search the data-base, and c) functions for outputting the results of the searches.
- This combined system would execute the same steps as the abbreviation system described above with the steps of the look-up system being added as shown in figure 4:
- the search program uses the abbreviation to search 81 the data-base for a name that uniquely matches the abbreviation. a. If no match is found 82, the program outputs a message 83 that the data could not be found. b. If a unique match is found 84, the program outputs the data 85 corresponding to the name. c. If a non-unique match is found, the abbreviation program waits for more inputs. As the steps above show, when a mismatch is reached, the system outputs a "no data" message. However, the system may include a "best match" function that outputs the best match for the abbreviation. Also, after a "no data" message, an erasure input could be entered, allowing the speaker to alter an abbreviation without starting all over.
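- The three outcomes of the search (no match, unique match, non-unique match) can be sketched as below. The abbreviation format `{word_position: prefix}`, the function names, and the listing data attached to each name are assumptions for illustration; the names themselves are taken from the toy examples used elsewhere in this description.

```python
def matches(abbreviation, name):
    """True if every spelled prefix fits the word at its position in the name."""
    words = name.upper().split()
    return all(
        pos <= len(words) and words[pos - 1].startswith(prefix.upper())
        for pos, prefix in abbreviation.items()
    )


def look_up(abbreviation, database):
    """Return (status, payload) for the abbreviation built so far."""
    hits = [name for name in database if matches(abbreviation, name)]
    if not hits:
        return "no data", None            # message that data could not be found
    if len(hits) == 1:
        return "unique", database[hits[0]]  # output the corresponding data
    return "need more input", hits        # wait for more inputs


# Hypothetical listing data keyed by name.
DATABASE = {
    "Federal Election Committee": "listing 1",
    "Federal Express": "listing 2",
    "Fetoosh Restaurant": "listing 3",
    "Zei Club": "listing 4",
    "Zei Club Vacations": "listing 5",
}

print(look_up({1: "FE"}, DATABASE))
print(look_up({1: "FET"}, DATABASE))
print(look_up({1: "Q"}, DATABASE))
```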
- When using an abbreviation method to look up a name, the speaker will usually not know enough to specify the name with a minimum number of inputs. Thus, it is useful for the look-up system to include interactive, guiding functions that reduce redundant inputs. Once the speaker has hit a dead-end, various functions can exit that dead-end. These will be described before interactive, guiding functions.
- the look-up system should include a function that outputs a message when a speaker has entered an incomplete abbreviation and has decided that he cannot further specify the name. Such a function can be called an exit and various exits are possible in a system.
- the look-up program can output: a) a message saying there are too many matched names, b) the data associated with the name that has been requested most, c) data associated with all the matched names, d) missing parts of the matched names and let the speaker choose one part.
- the corresponding name and data can then be outputted. (For example, the data-base might have two listings that are the same except for the address. The system can output one address or the other and have the speaker confirm it or not.)
- the speaker can get stuck spelling a part of a word that is fully specified by previous inputs. This happens when the part is common to other words in the data-base. For example, say a speaker is trying to match "Internal Medicine Group" above. Once he enters "I,” he has narrowed his search to three names. Yet if he continues to spell "N-T-E-R-N-A" he will make no progress until he gets to "L.” In other words, the "I” has specified "N-T-E-R-N-A" as well.
- a speaker can get stuck spelling more than one word that is fully specified by previous inputs. This can happen if the speaker does not know all the words in a name. For example, a speaker trying to match a Sony Walkman WFF24 might specify "Sony" and "Walkman,” but if he does not know the last word, the model number, he is stuck spelling the first two words to no avail.
- a speaker can get stuck spelling a name that has been fully specified. This can happen if he does not know all the names in a compound name. He may then be stuck at a multiple match. For example, say the speaker is seeking to match a McDonalds above. Once he enters "M,” he has specified the first name. Spelling the rest of "McDonalds" is unnecessary. What is needed is part of the second name, the street address.
- the look-up system can include functions that "look ahead" into the data-base and tell the speaker when he is stuck. These functions can also suggest inputs that will efficiently specify the name sought. These functions may be called guides. There are two types of guides: probabilistic guides and definitive guides. Probabilistic guides yield suggestions that are good guesses.
- Definitive guides yield suggestions that are certain.
- a system can, and usually would, include both types of guides. These guides each come in two types: guides that tell a speaker he is stuck and guides that output suggested inputs. Guides output advice messages such as, "Try the third word," or "What is the third word?" Messages should be kept short.
- a useful form of message is a message that is conveyed by a short tone. For example, a beep could mean that the speaker is stuck in a word or a name.
- an interactive guide can be triggered to output advice when: a) a certain number of inputs has been entered for a word or name, b) the number of names matched is less than a certain number, c) the speaker has made less than a certain amount of progress after having entered a certain number of inputs, d) the speaker enters a command, which might be called "Suggest," that denotes that the speaker desires the system to provide advice, e) the interactive guide finds that a certain input is expected to narrow down the list of matches by more than a certain amount, f) the interactive guide calculates the speaker's expected progress and that progress is below a certain amount.
- a look-up system in combination with the abbreviation system can include a function that: a. Examines the list of all the compound names in the data-base that match the abbreviation. b. Compares all the names in that list corresponding to the last name identifier entered. c. If all the names compared are identical, outputs a message telling the speaker that no more progress can be made in the name corresponding to that identifier.
- the look-up system in combination with the abbreviation system can include a function that: a. Examines 90 the list of all the names that match the abbreviation. b. Compares 91 all the words corresponding to the last word identifier entered. c. If all the words compared are identical 92, outputs a message 93 telling the speaker that no more progress can be made in the word corresponding to that identifier. For example, taking the toy data-base, if a speaker enters "First Word,” "Z,” the look-up system matches two names : “Zei Club” and "Zei Club Vacations.” The interactive guide compares the first words in the two names. The words are identical. Hence the guide outputs a message that no more progress can be made spelling the first word in the name.
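- Steps a through c of this definitive guide reduce to checking whether every matched name has the same word in the position last identified. A minimal Python sketch, with the function name `stuck_in_word` chosen for illustration:

```python
def stuck_in_word(matched_names, word_position):
    """True if all matched names share the same word at word_position.

    A name with no word in that position counts as different (None),
    which is an assumption of this sketch.
    """
    words = set()
    for name in matched_names:
        parts = name.upper().split()
        words.add(parts[word_position - 1] if word_position <= len(parts) else None)
    return len(words) == 1


matched = ["Zei Club", "Zei Club Vacations"]  # matches for "First Word," "Z"
if stuck_in_word(matched, 1):
    print("No more progress can be made spelling the first word.")
```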
- the function can also include these steps: d. If all the words compared are not identical,
- N M (a threshold value)
- Suggested inputs are inputs and sequences of inputs that a guide tests for possible outputting as advice.
- the term "suggested input” may be a bit confusing because it can refer to both a single input and a sequence of inputs.
- the reason that both are referred to as a suggested input is that in natural language a sequence of inputs and a single input are often phrased the same way. For example, a guide might output a message, "Enter the number of words in the name.” To enter this information the speaker may need only a single input.
- Another message might be, "Spell the second word.”
- a suggested input can involve multiple letters because a guide can test the value of spelling two or more letters in a row.
- When a guide compares the information value of one suggested input to another, a single input may be compared to a single input or to a sequence of inputs. There is no formula for deciding what inputs should be tested.
- both definitive and probabilistic guides must calculate the information value of different inputs in a set of suggested inputs.
- a guide can include a ranking of inputs according to user friendliness, so that, given two informationally equivalent inputs, the guide outputs the more user friendly choice.
- Information theory provides elementary formulas for measuring the value of a piece of information (an input). While a variety of these can be used in an interactive guide, all share the same idea. The idea is that the value of an input is measured by how many names it knocks out of a list of matches; by how much it narrows down the list of matches.
- A formula, call it INFO-VALUE(Input, Name), calculates the value of an input applied to a name. The name matters because the value of an input depends on what name the speaker is spelling. "Applying an input to a name" means that the function assumes that the speaker is spelling a certain name. The function then finds the letter or numeric value that corresponds to that input in that name. This letter or numeric value might be called the Resulting Input. An example: If the name is "Zei Club" and the function applies "Second Word," "Last Letter," then the Resulting Input is "B."
- the ratio in step 7 measures the value of the input selected applied to the name selected. For example, taking the toy data-base, say a speaker enters "First Word," "F-E." These inputs result in three matches: "Federal Election Committee," "Federal Express," and "Fetoosh Restaurant." If INFO-VALUE applies Next Letter to the name "Fetoosh Restaurant," the Resulting Input is "T," which eliminates two matches and leaves just one, "Fetoosh Restaurant." Thus, INFO-VALUE of Next Letter applied to "Fetoosh" is (2 / (3-1)) or 1.0 ("T" has eliminated 100% of the "false" matches).
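- The ratio in the example (matches eliminated divided by the number of false matches) can be reproduced with a short Python sketch for the Next Letter input. The function names and the convention of returning 0.0 when there are no false matches left are assumptions of this sketch.

```python
def word_at(name, position):
    """Upper-cased word at a 1-based position, or "" if the name is shorter."""
    parts = name.upper().split()
    return parts[position - 1] if position <= len(parts) else ""


def info_value_next_letter(target, matched_names, position, prefix):
    """Fraction of false matches eliminated by the next letter of target."""
    others = [name for name in matched_names if name != target]
    if not others:
        return 0.0  # assumed convention: nothing left to eliminate
    # Resulting Input: the letter that follows the prefix in the target word.
    extended = word_at(target, position)[: len(prefix) + 1]
    eliminated = sum(
        1 for name in others if not word_at(name, position).startswith(extended)
    )
    return eliminated / len(others)


matched = ["Federal Election Committee", "Federal Express", "Fetoosh Restaurant"]
print(info_value_next_letter("Fetoosh Restaurant", matched, 1, "FE"))  # 2 / (3 - 1)
print(info_value_next_letter("Federal Express", matched, 1, "FE"))     # 1 / (3 - 1)
```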
- a look-up system in combination with the abbreviation system can include a function that: a. Examines the set of all the names in the data-base that match the abbreviation. b. For each name, calculates 100 the value of the next letter being entered in the word corresponding to the last word identifier entered. (In other words, the function first checks all the names to find the value in each case of the next letter in the word currently being spelled.) c. If INFO-VALUE(Next Letter) is 1.0 for all names 101, does nothing, d. If not 101,
- the definitive guide described above suggests to the speaker an input that uniquely specifies a name in all cases. It may be though that no input will do this. And yet it may also be that certain inputs will have much higher values in all cases than the next letter of the word currently being spelled. In this case, a definitive guide can check the value of alternative inputs and suggest the one with the highest value.
- Probabilistic guides give advice that is probable (e.g., "You are probably stuck,” or "You should probably spell the second word.”).
- the information value of an input may vary widely, being high when applied to certain names and low when applied to others. A conclusion about every name is often not possible. And so, what a probabilistic guide does is sum the value of an input over all the names the speaker might be abbreviating. The input with the highest total is the one that has the highest value, on average (an average can be taken yielding the expected value of the input).
- a second toy data-base is introduced to illustrate points about probabilistic guides: Sony W2FF Sony W2FG Sony W2FH Sony Z9LH
- the input "First Letter," "Second Word," applied to the first three names has an INFO-VALUE of 1/3 (it knocks out only one name, the fourth name).
- Applied to the fourth name, the INFO-VALUE is 1 (it knocks out the other three names).
- weight means the probability of being abbreviated. What is needed is a weighted sum. How the weights are determined depends on the application of course. It is worth noting though that in many situations, the weights should come from the actual usage of the data-base itself. A guide can therefore include a demand function that measures how much a name is requested over time.
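- A weighted sum of this kind can be sketched in Python using the second toy data-base. The function names are hypothetical, and the demand weights in the second call are made-up numbers chosen only to show how weighting shifts the result.

```python
def info_value(resulting_input, target, names):
    """Fraction of the other names knocked out when target is being spelled."""
    others = [name for name in names if name != target]
    if not others:
        return 0.0
    wanted = resulting_input(target)
    return sum(1 for name in others if resulting_input(name) != wanted) / len(others)


def expected_value(resulting_input, names, weights):
    """Weighted average of INFO-VALUE over all names the speaker might mean."""
    total = sum(weights[name] for name in names)
    return sum(
        weights[name] * info_value(resulting_input, name, names) for name in names
    ) / total


NAMES = ["Sony W2FF", "Sony W2FG", "Sony W2FH", "Sony Z9LH"]


def first_letter_second_word(name):
    return name.split()[1][0]


equal = {name: 1 for name in NAMES}
print(expected_value(first_letter_second_word, NAMES, equal))

# Hypothetical demand counts: the fourth name is requested far more often.
demand = {"Sony W2FF": 1, "Sony W2FG": 1, "Sony W2FH": 1, "Sony Z9LH": 7}
print(expected_value(first_letter_second_word, NAMES, demand))
```

With equal weights the result is (1/3 + 1/3 + 1/3 + 1) / 4 = 0.5; with the made-up demand counts it is (1 + 7) / 10 = 0.8.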
- a function can find the percentage that have identical names in the position specified by the last name identifier entered.
- a function can find the percentage that have identical words in the position specified by the last word identifier entered.
- a function can find the percentage of names that have identical word parts in the words specified by the last word identifier entered. When the percentage is above a threshold, the guide can declare that the speaker is probably stuck in the relevant name, word, or word part.
- a function can calculate the expected value of spelling the next N letters of a name, word, or word part. If the expected value of the sequence of letters is below a threshold, the guide declares that the speaker is probably stuck in the relevant part (see example below).
- a function can calculate not only the expected value of a sequence of letters but also the probability of a speaker entering any given sequence of inputs. This probability can be factored into an expected value function to yield the expected value of the speaker's input. If this expected value is below a threshold, the guide declares that the speaker is probably stuck. Rather than calculate all these probabilities, a guide can use historical data to supply the expected value of a speaker's input at a given stage of entering inputs. The point is that this expected value can be determined in various ways.
- a look-up system in combination with the abbreviation system can include a function that: a. Examines the set of names that match the abbreviation created thus far. b. Calculates the expected information value of the next N letters in all the words specified by the last word identifier entered. c. If the expected value is below a certain threshold, outputs a message telling the speaker that she will probably make little progress for the next N letters.
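- Steps a through c above can be sketched as follows, again on the second toy data-base. The names `expected_value_next_letters` and `probably_stuck`, the equal default weights, and the threshold of 0.5 are assumptions made for the example.

```python
def word_at(name, position):
    """Upper-cased word at a 1-based position, or "" if the name is shorter."""
    parts = name.upper().split()
    return parts[position - 1] if position <= len(parts) else ""


def expected_value_next_letters(matched, position, prefix, n, weights=None):
    """Expected fraction of false matches eliminated by the next n letters."""
    if not matched:
        return 0.0
    weights = weights or {name: 1.0 for name in matched}
    total = sum(weights[name] for name in matched)
    value = 0.0
    for target in matched:
        others = [name for name in matched if name != target]
        if not others:
            continue
        extended = word_at(target, position)[: len(prefix) + n]
        eliminated = sum(
            1 for name in others if not word_at(name, position).startswith(extended)
        )
        value += weights[target] * eliminated / len(others)
    return value / total


def probably_stuck(matched, position, prefix, n, threshold):
    """True if spelling the next n letters is expected to make little progress."""
    return expected_value_next_letters(matched, position, prefix, n) < threshold


NAMES = ["Sony W2FF", "Sony W2FG", "Sony W2FH", "Sony Z9LH"]
# After "First Word," "S": spelling "O-N-Y" eliminates nothing.
print(probably_stuck(NAMES, 1, "S", 3, threshold=0.5))
# Spelling all four characters of the second word separates every name.
print(expected_value_next_letters(NAMES, 2, "", 4))
```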
- the procedures above for defining whether a speaker is stuck include threshold values. If the speaker's expected progress is below a threshold, the guide declares that the speaker is probably stuck.
- a threshold can be a constant. Or, it can vary depending on a variety of factors, such as how many inputs have been entered. Or, it can depend on the expected value of other input sequences. Checking the value of other sequences can be useful, for it is often counterproductive to tell the speaker he is probably stuck in, say, a word if he is only going to try another word where he is equally stuck.
- a guide can include steps for checking the "level of stuckness" in each word in a name.
- In order to suggest inputs, a probabilistic guide must evaluate a set of alternative input sequences and select the one that has the highest expected value (though user-friendliness can be taken into account).
- the best probability function to apply in a given situation is often a subjective matter.
- a quick analogy to baseball makes the point.
- Batting average = Hits / At Bats
- a guide can include many factors. At least though, all will contain a core that measures the number of names a suggested input will eliminate if a given name is being abbreviated.
- a probabilistic guide that suggests inputs needs to: a. Examine 120 a set of suggested inputs. b. Calculate 121 the expected information value of each. c. Select 122 the suggested input with the highest value.
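- The examine, calculate, and select steps can be sketched in Python, with a user-friendliness ranking used only to break ties. The candidate inputs, the ranking table, equal weighting of names, and all function names are assumptions of this sketch.

```python
def info_value(resulting_input, target, names):
    """Fraction of the other names knocked out when target is being spelled."""
    others = [name for name in names if name != target]
    if not others:
        return 0.0
    wanted = resulting_input(target)
    return sum(1 for name in others if resulting_input(name) != wanted) / len(others)


def expected_value(resulting_input, names):
    """Average INFO-VALUE over a non-empty list of equally likely names."""
    return sum(info_value(resulting_input, name, names) for name in names) / len(names)


def suggest_input(candidates, names, friendliness_rank):
    """Pick the candidate with the highest expected value.

    Ties go to the candidate with the lower (friendlier) rank number.
    """
    def key(label):
        value = round(expected_value(candidates[label], names), 9)
        return (value, -friendliness_rank[label])

    return max(candidates, key=key)


NAMES = ["Sony W2FF", "Sony W2FG", "Sony W2FH", "Sony Z9LH"]
CANDIDATES = {
    "first letter of second word": lambda name: name.split()[1][0],
    "second letter of second word": lambda name: name.split()[1][1],
    "last letter of second word": lambda name: name.split()[1][-1],
}
RANK = {  # hypothetical user-friendliness table, 1 = friendliest
    "first letter of second word": 1,
    "second letter of second word": 2,
    "last letter of second word": 3,
}

print(suggest_input(CANDIDATES, NAMES, RANK))
```

Here the last letter wins on expected value alone; the ranking matters only when two candidates score the same, as the first and second letters do.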
- Since 0 is less than this amount, the guide then compares 125 the number of inputs already entered, which is two, and finds that that number is below a threshold of, say, five inputs entered. (If it is above, we assume the guide suggests an input.) Thus, the guide outputs a message 126 telling the speaker she is stuck.
- the guide then takes 120 a set of suggested inputs which consists, in this imaginary case, of the next two letters of the remaining words.
- the guide calculates 121 the expected value of each sequence and finds 122 that spelling the second and third words yields the same value.
- the guide checks its table ranking the input sequences by user friendliness and finds that spelling the second word is slightly more user friendly than spelling the third. Hence, the guide outputs, "Spell the second word."
- The threshold for triggering the guide is a comparison between the best suggested input and the spelling of the next N letters of the current word.
- The thresholds were an expected value constant (0.5) and the number of inputs entered.
- The guide calculates 130 the expected value of spelling the next, say, three letters of this word and finds that the expected value is 2/3.
- The guide also calculates 131, 132 the value of the set of suggested inputs, which consists now of spelling the second word. The guide finds that the value of spelling the second word is 2/3 as well. The difference between the two values is taken 133.
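The comparison in this example can be sketched as follows. The passage does not say what the guide does with the difference, so the rule below is an assumption: the guide speaks up only when the best suggested input beats continuing with the current word by more than a margin. The names `value_difference`, `should_suggest_switch`, and `margin` are hypothetical.

```python
from fractions import Fraction


def value_difference(best_suggested_value: Fraction,
                     current_word_value: Fraction) -> Fraction:
    """Difference between the best suggested input and spelling on."""
    return best_suggested_value - current_word_value


def should_suggest_switch(best_suggested_value: Fraction,
                          current_word_value: Fraction,
                          margin: Fraction = Fraction(0)) -> bool:
    """Assumed trigger rule: suggest only if the gain exceeds the margin."""
    return value_difference(best_suggested_value, current_word_value) > margin


# Values from the example: both options have an expected value of 2/3.
spell_next_three_letters = Fraction(2, 3)
spell_second_word = Fraction(2, 3)
print(value_difference(spell_second_word, spell_next_three_letters))
print(should_suggest_switch(spell_second_word, spell_next_three_letters))
```

Exact fractions are used so that two equal expected values compare as equal rather than differing by floating-point rounding.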
- The system described above uses an interactive recognizer that confirms inputs. By confirming inputs, a single abbreviation is created that is then built upon. However, it is readily apparent that the system could also use a recognizer that does not confirm inputs. In this case, the recognizer would have multiple guesses about a speaker's inputs. As shown in Figure 6, the system would use the guesses and permute 140 the possibilities to create multiple abbreviations based on the recognizer's guesses. The abbreviations would be matched against the database and the non-matching ones eliminated 141. This "post-processing" of inputs is well known in the field. When a non-confirming recognizer is used in the system, the multiple abbreviations can be weighted according to their probability of being correct. These weights can then be used by the system's interactive guides that calculate the expected value of inputs.
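A minimal sketch of this post-processing step, under several assumptions that are not in the source: each recognizer guess is a (symbol, probability) pair, guesses for different inputs are treated as independent so their probabilities multiply, and an abbreviation "matches" a name when it is a prefix of the name with spaces removed. The function names and the sample database are hypothetical.

```python
from itertools import product
from math import prod
from typing import Iterator, List, Tuple

Guess = Tuple[str, float]  # (recognized symbol, probability it is correct)


def expand_guesses(guesses: List[List[Guess]]) -> Iterator[Tuple[str, float]]:
    """Permute the per-input guesses into weighted candidate abbreviations."""
    for combo in product(*guesses):
        abbreviation = "".join(symbol for symbol, _ in combo)
        weight = prod(p for _, p in combo)  # assumes independent guesses
        yield abbreviation, weight


def matches(abbreviation: str, name: str) -> bool:
    """Assumed matching rule: prefix of the name with spaces removed."""
    return name.replace(" ", "").upper().startswith(abbreviation.upper())


def surviving_abbreviations(guesses: List[List[Guess]],
                            database: List[str]) -> List[Tuple[str, float]]:
    """Drop abbreviations that match no name; sort the rest by weight."""
    kept = [(abbr, w) for abbr, w in expand_guesses(guesses)
            if any(matches(abbr, name) for name in database)]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Two inputs, each with two recognizer guesses.
guesses = [[("B", 0.6), ("D", 0.4)], [("A", 0.9), ("E", 0.1)]]
database = ["Bakers Dozen", "Dean Foods", "Best Buy"]
for abbreviation, weight in surviving_abbreviations(guesses, database):
    print(abbreviation, round(weight, 2))
```

Here "DA" is eliminated because no name starts with it, and the remaining abbreviations keep weights that a guide could use when computing expected values.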
- A system may include both confirming and non-confirming modes.
- Inputs with a high probability of being accurately recognized, such as numbers and word identifiers, would usually not be confirmed.
- Another possibility for combining modes is to have users first narrow down the search by using confirmed inputs and then continue with non-confirmed inputs. Or, the system can ask for a confirmation when the recognizer determines that the accuracy of the recognition is below a certain percentage. Or, the system could switch from non-confirming mode to confirming mode, in order to knock out certain false matches. The point is that confirming and non-confirming modes can be used together.
- Part 8: Function That Juggles Word Order
- When using the abbreviation method, a speaker might know a word or words in a name but may not be sure where the word or words go. For example, say a speaker wants to abbreviate "Herman's World of Sports," but the speaker only knows that "Herman's" and "Sports" are in the name. The speaker would then only want to abbreviate these two words.
- The system can include a function that "juggles" the word order entered by the speaker, thereby creating multiple abbreviations. If the speaker is unsure of the word order in a name, she can enter an input, call it "Juggle," that causes the function to use the speaker's inputs to create an abbreviation for every possible combination of word orders.
- The number of words in the name has to be established. Either the speaker will have entered the number of words or the function has a pre-set limit. This set of combinations (multiple abbreviations) is used to search a database. The combinations not matching any names are eliminated. Each input entered is placed by the juggle function in all the remaining combinations. After an input is placed, another search is executed and more abbreviations may then be eliminated. (A system could allow a speaker to invoke the juggle function after an abbreviation has been rejected for having no matches.)
- The juggle function can include a feature that allows a speaker to fix the position of a word. In all combinations the word would then have the position specified.
- The system program would include an input, call it "Sure," signifying that the speaker is certain of the position of a word.
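The juggle function and the "Sure" feature can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: each entered word is represented by its leading letters, the number of words in the name is known, unknown positions are left as `None`, and a combination matches a name when every filled position is a prefix of the name's word at that position. The names `juggle`, `fits`, `surviving`, and the `fixed` parameter are hypothetical.

```python
from itertools import permutations
from typing import Dict, Iterator, List, Optional, Tuple

Arrangement = Tuple[Optional[str], ...]


def juggle(words: List[str], total_words: int,
           fixed: Optional[Dict[int, int]] = None) -> Iterator[Arrangement]:
    """Yield every placement of the entered words into total_words positions.

    fixed maps an index into `words` to a 0-based position the speaker is
    sure of (the "Sure" input); those words stay put in every combination.
    """
    fixed = fixed or {}
    free_words = [i for i in range(len(words)) if i not in fixed]
    free_slots = [s for s in range(total_words) if s not in fixed.values()]
    for slots in permutations(free_slots, len(free_words)):
        arrangement: List[Optional[str]] = [None] * total_words
        for word_index, slot in fixed.items():
            arrangement[slot] = words[word_index]
        for word_index, slot in zip(free_words, slots):
            arrangement[slot] = words[word_index]
        yield tuple(arrangement)


def fits(arrangement: Arrangement, name: str) -> bool:
    """Assumed matching rule: each filled position prefixes the name's word."""
    name_words = name.lower().split()
    if len(name_words) != len(arrangement):
        return False
    return all(entry is None or word.startswith(entry.lower())
               for entry, word in zip(arrangement, name_words))


def surviving(arrangements, names: List[str]) -> List[Arrangement]:
    """Eliminate combinations that match no name in the database."""
    return [a for a in arrangements if any(fits(a, n) for n in names)]


names = ["Herman's World of Sports", "Sports World of Herman",
         "Harry's House of Pancakes"]
print(surviving(juggle(["Her", "Spo"], 4), names))
print(surviving(juggle(["Her", "Spo"], 4, fixed={0: 0}), names))
```

Without a fixed word, two of the twelve combinations survive; once "Her" is fixed in the first position, only the combination for "Herman's World of Sports" remains.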
- The juggle function is best combined with an interactive look-up system that guides the speaker.
- The key guide is the one telling the speaker she is stuck.
- These guides work basically the same way as those described in section 6, except that they have to try more abbreviations. How, though, does the program suggest an input to enter when the speaker does not know the word order? One way is for the guide to select an input to suggest and then find the word the speaker has entered that corresponds to that input more closely than any other word the speaker has entered. (We presume the suggested input includes a letter.) The function then suggests to the speaker that he enter the next (or last) letter in that word. Thus the suggested input is translated to apply to a word the speaker has identified.
- Probabilistic Sorter of Customer Demand
- A random number generator can be used to sort customers according to whether they will pay a price, x, called the "full price," or a lower price, y, called the "discounted price."
- Customers are sorted by the uncertainty of receiving a commodity. It is assumed that customers who would pay full price need the commodity and would not tolerate the uncertainty of only possibly receiving it.
- A general method for implementing this principle has three conditions: 1) A customer signs up for the chance to win. 2) The customer pays nothing up front, paying only if he or she is picked randomly to have the right to buy at the discounted price. 3) The probability of the customer winning is set precisely.
- A second problem is that full payers will try to get the discounted prices.
- One way to stop full payers from doing this is to use uncertainty along with time.
- The idea here is to make the customer uncertain as to when the commodity can be purchased. It is assumed that the customer who will pay full price will not tolerate an uncertain delay. Thus, a business can set an undisclosed time period and then let potential customers have a random chance of buying the product or service at the discounted price after that period.
- A person can have a probability of getting the discount at a certain date, or the date itself can be random. It is also possible to make it nearly certain (or even certain) that a person will win the right to buy at a discount but to make the date random, thereby creating the necessary uncertainty.
- The airline would use customers such as this one to fill planes. Thus, if a flight from NY to LA was two-thirds full, the airline could check its list of "random buyers" and pick some randomly to win the discounted seats. The winners would get discounted tickets, while the airline would get extra revenue above variable costs without losing full payers. (Note: A business can let customers specify the discounted price they are willing to pay. The business could then also sort the customers by how much they say they will pay.)
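The seat-filling step can be sketched as a random draw from the list of "random buyers". This is a simplified illustration: the function name `pick_discount_winners` and the parameter `seats_offered` are hypothetical, and how many seats the airline releases at the discounted price is treated as a business decision supplied by the caller.

```python
import random
from typing import List, Optional


def pick_discount_winners(random_buyers: List[str], seats_offered: int,
                          rng: Optional[random.Random] = None) -> List[str]:
    """Pick winners uniformly at random, without repeats."""
    rng = rng or random.Random()
    count = max(0, min(seats_offered, len(random_buyers)))
    return rng.sample(list(random_buyers), count)


buyers = ["customer-1", "customer-2", "customer-3", "customer-4", "customer-5"]
winners = pick_discount_winners(buyers, seats_offered=3)
print(winners)
```

Passing a seeded `random.Random` instance makes a draw repeatable, which may help when a draw has to be audited.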
- A probabilistic sorter includes at minimum an RNG and may include means for: registering 200 customer identification information, setting 201 time periods when a customer is eligible to win, setting 202 the chances of winning, operating 203 the RNG to determine whether a customer has won, and recording 204 how many times and when a customer has engaged in the RSP.
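The means listed above can be sketched as a small Python class. This is an assumption-laden outline rather than the patented apparatus: the class and field names are hypothetical, the eligibility period is modeled as a single date range, and the chance of winning is a fixed probability per customer. The comments tie each part to the reference numerals 200 to 204.

```python
import random
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Customer:
    customer_id: str      # 200: identification information
    eligible_from: date   # 201: start of the eligibility period
    eligible_to: date     # 201: end of the eligibility period
    chance: float         # 202: probability of winning, between 0 and 1
    history: List[Tuple[date, bool]] = field(default_factory=list)  # 204


class ProbabilisticSorter:
    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self.rng = rng or random.Random()
        self.customers: Dict[str, Customer] = {}

    def register(self, customer_id: str, eligible_from: date,
                 eligible_to: date, chance: float) -> None:
        """Register a customer with an eligibility period and a chance."""
        if not 0.0 <= chance <= 1.0:
            raise ValueError("chance must be between 0 and 1")
        self.customers[customer_id] = Customer(
            customer_id, eligible_from, eligible_to, chance)

    def draw(self, customer_id: str, today: date) -> bool:
        """203: operate the RNG; 204: record when the customer took part."""
        customer = self.customers[customer_id]
        eligible = customer.eligible_from <= today <= customer.eligible_to
        won = eligible and self.rng.random() < customer.chance
        customer.history.append((today, won))
        return won


sorter = ProbabilisticSorter()
sorter.register("customer-1", date(1994, 1, 1), date(1994, 12, 31), chance=0.25)
print(sorter.draw("customer-1", date(1994, 6, 1)))
print(len(sorter.customers["customer-1"].history))
```

Recording every draw, including losses, lets the business limit how often a customer may take part.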
- a probabilistic system for sorting customers can include the steps of:
- This RSP can take place in two basic ways:
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Business, Economics & Management (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Finance (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Emergency Alarm Devices (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU12102/95A AU1210295A (en) | 1993-11-29 | 1994-11-29 | Input system for text retrieval |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/158,297 | 1993-11-29 | ||
US08/158,297 US5454063A (en) | 1993-11-29 | 1993-11-29 | Voice input system for data retrieval |
US08/165,676 US5620182A (en) | 1990-08-22 | 1993-12-13 | Expected value payment method and system for reducing the expected per unit costs of paying and/or receiving a given ammount of a commodity |
US08/165,676 | 1993-12-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1995014974A1 (fr) | 1995-06-01 |
Family
ID=26854909
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US1994/013279 WO1995014974A1 (fr) | 1993-11-29 | 1994-11-29 | Input system for text retrieval |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN1136356A (fr) |
AU (1) | AU1210295A (fr) |
WO (1) | WO1995014974A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5903864A (en) * | 1995-08-30 | 1999-05-11 | Dragon Systems | Speech recognition |
CN105389305A (zh) * | 2015-10-30 | 2016-03-09 | 北京奇艺世纪科技有限公司 | Text recognition method and device |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7212968B1 (en) * | 1999-10-28 | 2007-05-01 | Canon Kabushiki Kaisha | Pattern matching method and apparatus |
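CN1835077B (zh) * | 2005-03-14 | 2011-05-11 | 台达电子工业股份有限公司 | Method and system for automatic speech recognition input of Chinese personal names |
CN111061925B (zh) * | 2019-12-16 | 2021-02-19 | 珠海格力电器股份有限公司 | Contact search method, apparatus, terminal device, and readable storage medium |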
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4433392A (en) * | 1980-12-19 | 1984-02-21 | International Business Machines Corp. | Interactive data retrieval apparatus |
US5228133A (en) * | 1990-10-01 | 1993-07-13 | Carl Oppedahl | Method to perform text search in application programs in computer by selecting a character and scanning the text string to/from the selected character offset position |
US5278980A (en) * | 1991-08-16 | 1994-01-11 | Xerox Corporation | Iterative technique for phrase query formation and an information retrieval system employing same |
US5309359A (en) * | 1990-08-16 | 1994-05-03 | Boris Katz | Method and apparatus for generating and utlizing annotations to facilitate computer text retrieval |
-
1994
- 1994-11-29 AU AU12102/95A patent/AU1210295A/en not_active Abandoned
- 1994-11-29 WO PCT/US1994/013279 patent/WO1995014974A1/fr active Application Filing
- 1994-11-29 CN CN94194316A patent/CN1136356A/zh active Pending
Also Published As
Publication number | Publication date |
---|---|
CN1136356A (zh) | 1996-11-20 |
AU1210295A (en) | 1995-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5454063A (en) | Voice input system for data retrieval | |
US6778951B1 (en) | Information retrieval method with natural language interface | |
JP3759242B2 (ja) | Method and system for automatic generation of feature probabilities | |
Gärdenfors et al. | Decision, probability and utility: Selected readings | |
US5062143A (en) | Trigram-based method of language identification | |
US7028907B2 (en) | Method and device for data input | |
US4373726A (en) | Automatic gaming system | |
EP1198272B1 (fr) | System for linking a unique identifier to an instant lottery ticket | |
US6763338B2 (en) | Machine decisions based on preferential voting techniques | |
US4771385A (en) | Word recognition processing time reduction system using word length and hash technique involving head letters | |
US9818405B2 (en) | Dialog management system | |
EP0635795A2 (fr) | Method and computer for detecting spelling errors in a text | |
US20060116862A1 (en) | System and method for tokenization of text | |
Walker et al. | Using natural language processing and discourse features to identify understanding errors in a spoken dialogue system | |
US4689743A (en) | Method and an apparatus for validating the electronic encoding of an ideographic character | |
US20020116191A1 (en) | Augmentation of alternate word lists by acoustic confusability criterion | |
Krahmer et al. | Error detection in spoken human-machine interaction | |
US8428933B1 (en) | Usage based query response | |
US6738515B1 (en) | Pattern string matching apparatus and pattern string matching method | |
WO1995014974A1 (fr) | Input system for text retrieval | |
WO2002056266A1 (fr) | Gaming method and device | |
Niewiadomska-Bugaj et al. | Probability and statistical inference | |
Feldman et al. | Probability: the mathematics of uncertainty | |
WO2022019275A1 (fr) | Document search device, document search system, document search program, and document search method | |
JP3945075B2 (ja) | Electronic device with dictionary function and storage medium storing an information retrieval processing program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 94194316.X Country of ref document: CN |
|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW NL NO NZ PL PT RO RU SD SE SI SK TJ TT UA UZ VN |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): KE MW SD SZ AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |
|
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: CA |