GB2365551A

GB2365551A - Machine interface

Info

Publication number: GB2365551A
Application number: GB0007658A
Authority: GB
Inventors: Wide Roeland Hogenhout
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2000-03-29
Filing date: 2000-03-29
Publication date: 2002-02-20
Anticipated expiration: 2020-03-29
Also published as: GB0007658D0; GB2365551B

Abstract

A machine interface allows a user to select a machine operation. A plurality of questions are stored for output to a user. A score indicating the likelihood that the user will select a machine operation is stored for each corresponding machine operation. A next question for output to the user is selected from the stored questions by determining, for each of a plurality of the questions, an average of the least number of questions required to be answered by the user to arrive at each machine operation weighted by the respective scores, and selecting the question having the lowest average number. The selected question is output and an answer is received from the user. In response to the input answer a machine operation is carried out and/or the stored scores for each of the plurality of machine operations are adjusted. At least one further selection of a next question is then carried out for output to the user using the adjusted stored scores. This machine interface is applicable to both databases and call centres.

Description

MACHINE INTERFACE

The present invention generally relates to a machine interface to allow a user to select a machine operation from amongst a plurality of possible machine operations. A great deal of effort has been expended in the prior art in order to solve the problem of how to interface a machine to a user to enable a user to more readily control the functioning of a machine.

When there are a plurality of possible machine operations which can be carried out and a user cannot uniquely and immediately identify the operation which the user requires to be carried out, it becomes a problem as to how to interface the machine to the user to enable a user to quickly and efficiently select a desired machine operation. For example, when accessing a database which contains retrievable information, a user may not know the exact identity of the data which is required, e.g. the file name of a picture or a document. The problem is thus how to interact with the user to extract the necessary information to identify a required record. This problem is also applicable to directing incoming calls in a call centre.

A solution to this problem is disclosed in a paper entitled "A Vector-Based Natural Language Call Routing" by J. Chu-Carroll and B. Carpenter (Computational Linguistics 1999). The solution described in this paper is to receive a query from a user and to calculate and compare a vector for the query with vectors for the nearest documents. If there is uncertainty about the nearest document vector, a new vector is generated which would help to distinguish the nearest document. This vector is then used to generate a further question. The limitation of this system is that the confirming question only allows retrieval terms to be explicitly confirmed or rejected by the user.

One aspect of the present invention provides a machine interface for a machine which allows a user to select a machine operation. The machine operation can comprise any operations which can be carried out by machine such as the retrieval of data e.g. text, audio, video and images, or the execution of an instruction such as the routing of incoming calls in a call centre, the printing of a document, or the transmission of a facsimile. Thus a machine operation can comprise any event which a user wishes to take place.

In the present invention a plurality of questions for output to a user are stored. This library of questions comprises a set of questions which aim to extract a response from the user which will enable the system to uniquely identify the machine operation which a user wishes to select. The stored questions can thus be tailored to provide the most efficient selection of machine operations. This flexibility allows for a system administrator to modify the database of questions as desired.

A score is stored for each of a plurality of machine operations. The score indicates the likelihood that the user will select a corresponding machine operation.

A question for output to a user is selected from the stored questions by determining, for each of a plurality of said questions, an average of the least number of questions required to be answered by a user to arrive at each machine operation. The average is a weighted average which is weighted by the respective scores for the machine operations. The question having the lowest average number of questions is then selected as the next question to be output to the user. Thus this question selection process identifies a question which is likely to most quickly result in the selection of a machine operation.

The selected question is then output to the user and an input answer is received in response. In response to a received input answer, a machine operation can be carried out. Alternatively or in addition, the scores for each of the stored plurality of machine operations are adjusted and a selection of a further question takes place using the adjusted scores. This process will repeat until the score for a particular machine operation leads the question selection process to ask the user a question which enables a machine operation to be identified as the desired machine operation to be selected.

The present invention thus comprises a machine interface which uses a plurality of questions to identify a desired machine operation. Questions to be asked of a user are adaptively selected based on previous inputs by a user. The advantage of the present invention is flexibility in the design since the number and type of questions can be tailored as required, which leads to more natural, focussed and effective interaction with the user.

In an embodiment of the present invention, the stored questions include expected answers. Any specified answer can have associated with it an identifier for a corresponding machine operation which is to be carried out in response to the input of the specified answer. Thus, each machine operation will have associated with it a 'final' question which will allow the unique identification of the machine operation as the selected machine operation.

In an alternative embodiment, a machine operation is carried out in response to an answer when the score for the machine operation is significantly different from the scores for other machine operations: thus indicating the unique identification of the machine operation. For example, the score for a machine operation may be required to reach a threshold level greater than the other scores by a threshold amount.
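The threshold test described above can be sketched as follows. This is a minimal illustration only: the function name `distinctive_operation`, the `margin` parameter and the example scores are hypothetical, and the sketch assumes that "significantly different" means the top score leads the runner-up by at least a fixed margin.

```python
from typing import Dict, Optional


def distinctive_operation(scores: Dict[str, float], margin: float) -> Optional[str]:
    """Return the top-scoring operation if it leads the runner-up by at least `margin`.

    Returns None when no operation stands out, meaning another question is needed.
    """
    if not scores:
        return None
    # Rank operations from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]
    (best, best_score), (_, second_score) = ranked[0], ranked[1]
    return best if best_score - second_score >= margin else None


# Hypothetical scores for three machine operations.
example_scores = {"print": 0.7, "fax": 0.2, "route_call": 0.1}
chosen = distinctive_operation(example_scores, margin=0.3)
```

With a stricter margin the same scores yield no selection, so the dialogue would continue with a further question.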

In one embodiment of the present invention, where expected answers to the questions are stored, question selection takes place using the determination of the least number of questions by predicting the expected answers input by a user to select each of the machine operations.

In an embodiment of the present invention, keywords are stored for each of the machine operations and keywords are determined using the input answer from the user. The system responds to the input answer by matching the determined keywords to the stored keywords and adjusting the scores for each of the plurality of machine operations in dependence on the matching. Preferably, scores for the keywords for each of the plurality of machine operations are stored with the keywords. Scores for the keywords determined from the input answers can then be determined for each of the plurality of machine operations by matching the determined keywords to the stored keywords. The scores of each of the plurality of machine operations are then adjusted using the determined scores for the keywords.

Thus in the embodiment of the present invention, the questions asked of the user are used to extract the necessary keywords from the user in order to be able to perform a keyword search to identify a desired machine operation.

The keywords need not be input by the user. The keywords can be stored in association with the expected answers to at least some of the questions. For example, a user may be asked the question 'Do you want music?' and the expected answer would be 'yes' or 'no'. The user has not entered keywords, but the keyword 'music' can be stored in association with the expected answer 'yes'. This keyword can then be used for searching.

In an alternative embodiment, instead of storing keywords in association with the expected answers, instructions can be stored for the extraction of the keywords from the question.

The algorithm performed to determine the next question to ask a user is a recursive process in which sequences of questions to reach each machine operation are processed in order to identify the shortest path to each machine operation from each question. However, this recursive algorithm requires evaluation for all questions and for all machine operations.

In a preferred embodiment, in order to reduce the processing, only machine operations having the highest scores are used in the recursive process.

In another embodiment of the present invention, in order to reduce processing, the path length, i.e. the number of questions in a sequence, is only allowed to reach a threshold length. Processing is not carried on above a threshold question sequence length.

In a further embodiment of the present invention, the recursive process is only performed for questions, the answers to which will cause the scores of a most likely machine operation to increase.

In yet another embodiment of the present invention, in order to reduce the number of questions in the recursive process, questions can be preselected. Questions can be preselected on the basis of three criteria: (i) by taking the score of the machine instruction having the highest score after asking the question and predicting a received answer; (ii) by assigning a high score to questions relating to the same topic as a previous input answer; and (iii) by assigning a high score to questions relating to the same topic as any previous input answers.

In a preferred embodiment, questions are preselected on the basis of a weighted average of all three of these techniques.
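A weighted combination of the three preselection criteria might look like the sketch below. The description does not give the weights or the way each criterion is scored, so the function `preselect`, the weights and the per-question scores are all assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence


def preselect(criteria: Sequence[Dict[int, float]],
              weights: Sequence[float],
              keep: int) -> List[int]:
    """Rank questions by a weighted average of per-criterion scores.

    `criteria` holds one {question number: score} mapping per criterion;
    the `keep` best questions are returned for the recursive process.
    """
    total_weight = sum(weights)
    questions = criteria[0].keys()
    combined = {
        q: sum(w * c.get(q, 0.0) for w, c in zip(weights, criteria)) / total_weight
        for q in questions
    }
    return sorted(combined, key=combined.get, reverse=True)[:keep]


# Hypothetical scores for questions 1 to 3 under criteria (i), (ii) and (iii).
criterion_i = {1: 0.9, 2: 0.2, 3: 0.5}
criterion_ii = {1: 0.0, 2: 1.0, 3: 1.0}
criterion_iii = {1: 0.0, 2: 1.0, 3: 0.0}
shortlist = preselect([criterion_i, criterion_ii, criterion_iii],
                      weights=[0.5, 0.3, 0.2], keep=2)
```

Only the shortlisted questions would then be passed to the recursive path evaluation, which is where the saving in processing comes from.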

In one embodiment of the present invention, the system is able to indicate to a user when it is operating with a high degree of uncertainty e.g. when no machine operations have distinctive scores. In order to do this, scores for hierarchical classifications of the machine operations are stored, where each hierarchical classification comprises a topic to which the machine operations in the hierarchy below relate and each hierarchical classification has a score comprising the sum of the scores for the machine operations in the hierarchy below. When the score for any of the hierarchical classifications at a predetermined level of hierarchical classification is below a threshold, the system can indicate uncertainty to the user. This indication can help a user to more carefully input a query (an answer to a question) which will more quickly result in the selection of a machine operation.
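The hierarchical scoring can be illustrated with a short sketch. The names `classification_scores` and `is_uncertain`, the topics and the threshold are hypothetical. The sketch also assumes one reading of the text, namely that uncertainty is flagged when no classification at the chosen level reaches the threshold; other readings are possible.

```python
from typing import Dict, List


def classification_scores(hierarchy: Dict[str, List[int]],
                          scores: Dict[int, float]) -> Dict[str, float]:
    """Score each classification as the sum of the scores of the operations below it."""
    return {
        topic: sum(scores.get(op, 0.0) for op in members)
        for topic, members in hierarchy.items()
    }


def is_uncertain(class_scores: Dict[str, float], threshold: float) -> bool:
    """Assumed reading: uncertain when no classification reaches the threshold."""
    return all(score < threshold for score in class_scores.values())


# Hypothetical single-level hierarchy over three machine operations.
hierarchy = {"music": [1, 2], "speech": [3]}
focused = classification_scores(hierarchy, {1: 0.5, 2: 0.25, 3: 0.25})
spread = classification_scores(hierarchy, {1: 0.3, 2: 0.2, 3: 0.5})
```

In this example the first set of scores concentrates on "music", while the second is split evenly between the two topics and would trigger the uncertainty indication at a threshold of 0.6.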

In order to allow a user to change the direction in which a search for a desired machine operation is being carried out, in an embodiment of the present invention, the stored scores for each of the machine operations are decayed by a predetermined amount after each question has been answered. Thus, if a user changes the subject of their queries, the change in the score brought about by previous queries will gradually decay, thereby allowing a user's more recent answers to predominate in the search for the desired machine operation.

A feature of the present invention is the facility to enable the questions to be added to and updated. Also, keywords used in an embodiment of the present invention can be added to and updated, as well as their scores. The present invention is particularly suited to a dialogue system in which a dialogue is entered into between a user and a machine in order to achieve the implementation of the machine operation. The present invention is particularly suitable for, although not limited to, implementation in a spoken dialogue system in which the questions are generated as a speech output and the answers are received as speech and processed by a speech recogniser.

The present invention can be implemented by dedicated hardware or by a suitably programmed processing apparatus, e.g. a programmed general purpose computer. The present invention thus encompasses computer program code for controlling a processor in a machine, e.g. in a computer, to carry out the method. The present invention thus encompasses providing the computer code to the processing apparatus in any conventional form, such as a signal, e.g. an electrical signal carried over a communications network such as the Internet, or on a storage medium such as a floppy disc, CD ROM, magnetic tape, or solid state memory device. The computer program code can be provided on any suitable carrier medium to the processing apparatus to be loaded in the processing apparatus to implement the method.

Embodiments of the present invention will now be described with reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram of an embodiment of the present invention;

Figure 2 is a schematic diagram of an implementation of an embodiment of the present invention on a general purpose computer;

Figure 3A is a schematic illustration of a question data structure;

Figure 3B is a schematic illustration of a 'final' question data structure;

Figure 4 is a flow diagram illustrating the method of accessing a record of the database in accordance with an embodiment of the present invention;

Figure 5 is a flow diagram of the process for selecting the next question in accordance with an embodiment of the present invention;

Figure 6 is a flow diagram illustrating the path prediction step in the flow diagram of Figure 5 in more detail; and

Figure 7 is a schematic diagram of the hierarchical classification of records in accordance with an embodiment of the present invention.

Figure 1 is a schematic illustration of an embodiment of the present invention for accessing records in a database by receiving user input queries and answers to questions generated by the system in order to aid the identification of the desired record.

The user input device 1 receives user input. The input device will provide text based on the input to an answer translator 2, which interprets the answer by comparing it to the expected answers which are stored with the questions in a question data structure database 5. If the user input does not match the expected answers for the question which was asked, the user input is passed to a keyword extractor 4 to extract the keywords from the user input. The keywords are then stored in a keyword list storage device 3. If on the other hand the user input matches an expected answer, the answer can simply be translated into a set of keywords associated with the expected answer, which is output to the keyword list storage device 3. If an answer which is matched to a user input indicates that a record should be rejected, the identity of the rejected record is stored in a rejected record storage device 6. If the user input matches an answer which has associated with it the identity of a record which is to be selected, i.e. the user input is sufficient to identify a record, the answer translator 2 will access the database of records 12 in order to cause the record to be retrieved and output to an output device 11.
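The dispatch performed by the answer translator can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the action tuples, the function names and the stop-word list are hypothetical, and the keyword extractor is reduced to a trivial word filter.

```python
from typing import Dict, List, Tuple

# Hypothetical stop list; a real keyword extractor would be more elaborate.
STOPWORDS = {"i", "a", "an", "the", "some", "want", "would", "like", "please"}

# An action is ("keywords", [...]), ("reject", [record ids]) or ("select", record id).
Action = Tuple[str, object]


def extract_keywords(text: str) -> List[str]:
    """Stand-in for keyword extractor 4: lower-cased words minus the stop list."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def translate_answer(text: str, expected: Dict[str, Action]) -> Action:
    """Stand-in for answer translator 2.

    If the input matches an expected answer, return the stored action;
    otherwise fall back to extracting keywords from the free text.
    """
    key = text.strip().lower()
    if key in expected:
        return expected[key]
    return ("keywords", extract_keywords(text))


# Expected answers for a hypothetical yes/no question about pop music.
expected_answers: Dict[str, Action] = {
    "yes": ("keywords", ["pop music", "rock music"]),
    "no": ("reject", [18, 22, 36]),
}
matched = translate_answer("Yes", expected_answers)
free_text = translate_answer("I want some jazz", expected_answers)
```

A "select" action would correspond to retrieving the record from the database of records 12, and a "reject" action to adding entries to the rejected record storage device 6.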

Each record of the database 12 has a score stored in an initial record scores database 13. The score for each record indicates the likelihood that a user will wish to access the record. The initial scores can be used to identify popular records which are often accessed by users.

A keyword scores database 8 is provided which stores a score for keywords for each record. Thus, for example, for a keyword 'book', scores for the keyword for records which have information on or relate to books will be high. A score adjustment engine 7 is provided to read the keyword list from the keyword list storage device 3 and to identify if any records have been rejected by reading the rejected records storage device 6. If any records have been rejected, their score is set to zero, indicating that the user does not wish to access these records. The score adjustment engine 7 accesses the keyword scores database 8 using the keywords in the keyword list read from the keyword storage device 3 in order to determine keyword scores for records. The score adjustment engine 7 also accesses current scores for records from a record scores storage device 9. Initially, the current scores in the record scores storage device 9 can be set to the initial record scores from the initial record scores database 13. The score adjustment engine 7 then adjusts the current scores for each record in dependence upon the scores determined for each keyword for each record. The adjusted score is then stored as the current score for each record in the record scores storage device 9.
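The behaviour of the score adjustment engine can be sketched as follows. The sketch assumes a multiplicative update of the current score by each keyword score, and it assumes a small `floor` value for records that a known keyword does not mention; the function name, the floor and the example data are hypothetical.

```python
from typing import Dict, Iterable, Set


def adjust_scores(current: Dict[int, float],
                  keyword_scores: Dict[str, Dict[int, float]],
                  keywords: Iterable[str],
                  rejected: Set[int],
                  floor: float = 0.01) -> Dict[int, float]:
    """Return new record scores given the keyword list and the rejected records."""
    keyword_list = list(keywords)
    adjusted: Dict[int, float] = {}
    for record, score in current.items():
        if record in rejected:
            adjusted[record] = 0.0  # rejected records are fixed at zero
            continue
        for keyword in keyword_list:
            per_record = keyword_scores.get(keyword)
            if per_record is None:
                continue  # unknown keyword: no evidence either way
            score *= per_record.get(record, floor)
        adjusted[record] = score
    return adjusted


# Hypothetical data: record 1 is strongly associated with 'book'.
new_scores = adjust_scores(
    current={1: 0.5, 2: 0.5, 3: 0.5},
    keyword_scores={"book": {1: 0.8, 2: 0.1}},
    keywords=["book", "unknownword"],
    rejected={2},
)
```

The returned mapping would then be written back as the current scores used by the question selector.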

When a record has not been identified as a desired record as a result of a user input, the system requires more information to enable it to identify a desired record. This information is obtained by asking the user a next question retrieved from the question data structure database 5. The next question to be retrieved from the question data structure database 5 is determined by a question selector 10, which selects the question on the basis of the current scores for each record stored in the record scores storage device 9. Once a question has been selected by the question selector 10, it is retrieved from the question data structure database 5 and output to the output device 11.

A reconfiguration interface 14 is provided and allows for the adjustment of the initial scores for records in the initial record scores database 13. The scores for records can be adjusted to take into account changes in user behaviour, e.g. the popularity of particular records, or to add initial scores for records which have been added to the database of records 12.

The reconfiguration interface 14 is also provided to allow an administrator or manager of the system to reconfigure any of the data in the databases 5, 8, 12 and 13. This provides a system with a great deal of flexibility since it allows the records which can be selected by a user to be updated. It also allows the questions to be updated as necessary, either to improve the ability of the system to identify a record or to add new questions when new records are added to the database of records 12. Also, the reconfiguration interface 14 allows the keyword scores database 8 to be updated to take into account changes in user behaviour and changes in the records in the database of records 12.

Thus this embodiment of the present invention will continue to ask the user questions selected by the question selector 10 in order to extract more keywords which will help to identify a desired record by adjusting the scores appropriately for the records.

Figure 2 is a schematic diagram of an implementation of the system of Figure 1 in a general purpose computer which interfaces to a user using speech.

The computer includes an audio input device 20 such as a microphone and suitable analogue to digital conversion means in order to input spoken words into the computer. An audio output device 21 such as a loudspeaker and suitable digital to analogue means is provided to generate spoken words comprising questions or output audio data records to a user.

A question database 22, a record database 23, and a keyword database 24 are provided, stored in conventional non-volatile memory means such as a hard disc drive, CD ROM, floppy disc drive or solid state device. A working memory 26 is provided to store data used during the implementation of the system. A program memory 27 is also provided to store the computer program code for the implementation of the system. The working memory 26 and the program memory 27 can be provided on any conventional volatile or non-volatile memory means, e.g. hard disc drive, CD ROM, floppy disc drive or solid state device. The computer program code can be provided to the program memory 27 using any conventional carrier medium. In Figure 2 a floppy disc drive 29 is illustrated. However, any other carrier medium, such as a carrier signal, e.g. an electrical signal on the Internet, or any type of storage medium, e.g. CD ROM, tape device, or solid state device, can be used.

A processor 25 is provided and comprises the conventional CPU of a general purpose computer. The processor 25 implements various functions by loading and running computer program code stored in the program memory 27. In the present embodiment, the processor 25 implements a speech recognition engine 250 by loading and implementing speech recognition engine code from the program memory 27. This enables the audio input received from the audio input device 20 to be converted into text. The processor 25 also implements an answer translator 251 by loading and implementing answer translator code from the program memory 27. The answer translator 251 receives the output of the speech recognition engine 250.

The processor 25 further implements a keyword extractor 252 by loading and implementing keyword extractor code from the program memory 27. Also the processor 25 implements a score adjustment engine 253 by loading and implementing score adjustment engine code from the program memory 27. Further, the processor 25 implements a question selector 254 by loading and implementing question selector code from the program memory 27. Also, the processor 25 implements an audio output driver 256 by loading and implementing audio output driver code from the program memory 27. The audio output driver 256 can cause the retrieval of audio data as the selected record from the record database 23 for output by the audio output device 21. In an alternative arrangement, the audio output driver 256 can include a text to speech synthesiser if the records in the record database 23 comprise text. The text to speech synthesiser of the audio output driver 256 can then convert the text to speech data for output by the audio output device 21. The processor 25 also implements a reconfiguration interface 257 by loading and implementing reconfiguration interface code from the program memory 27. The reconfiguration interface can be used for reconfiguring the data in any of the databases 22, 23 and 24.

The operation of the system will now be described.

The records in the database of this embodiment comprise audio files in the "wave" file format. Each record is identified by a record number to allow for ease of access.

The question data is formed into question data structures as illustrated in Figures 3A and 3B. Each question is identified by a question number. Associated with the question is a question prompt as an audio file in the "wave" format, e.g. QUESTION10.WAV. Associated with each question are expected answers. In the embodiment illustrated in Figure 3A the expected answers are "yes" or "no". The question output in this example could be an audio question "Do you want pop music?". If the user answers "yes", associated with the expected answer "yes" are the keywords "pop music" and "rock music". If the user answers "no", associated with the expected answer "no" is an instruction to reject three records as not being records which will be desired by the user, i.e. records 18, 22 and 36. This list of rejected records is stored in the rejected record list. The question data structure also includes an indication of the topic of the question, which in this case generally comprises the topic "music". The question data structure illustrated in Figure 3A comprises a question data structure which does not result in the selection of a record as a result of an answer. Instead, the answer will result in the rejection of some records and the input of keywords which can be used to adjust the scores for records, which will then be used to select the next question to ask the user.

Figure 3B illustrates another question data structure which is termed the "final" question data structure for a record. The question data structure is the same as that of Figure 3A except that, in the example given, the question to be output to the user is of course a different audio file related to question number 15. Also the expected answers result in different operations. For example, the question could be "Do you want pop artist 1?", where record number 20 contains a piece of music by pop artist 1. If the answer to this question is "yes", in the question data structure there is an instruction to set the selected record identifier to record number 20. If on the other hand the answer is "no", the rejected record identifier is set to record number 20.

The other difference between the question data structure of Figure 3A and the question data structure of Figure 3B is that the topic is more narrowly defined as "pop music".
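The two question data structures can be modelled in code roughly as follows. The class and field names are hypothetical, the question number 10 for the Figure 3A example is inferred from the file name QUESTION10.WAV, and the file name QUESTION15.WAV is assumed by analogy; only the answers, keywords, record numbers and topics come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnswerAction:
    """What an expected answer triggers: keywords, rejections or a selection."""
    keywords: List[str] = field(default_factory=list)
    reject: List[int] = field(default_factory=list)
    select: Optional[int] = None


@dataclass
class Question:
    number: int
    prompt: str  # name of the audio file holding the spoken question
    topic: str
    answers: Dict[str, AnswerAction]

    def is_final_for(self, record: int) -> bool:
        """True if some expected answer selects the given record."""
        return any(action.select == record for action in self.answers.values())


# Figure 3A style: answers yield keywords or rejections, never a selection.
q10 = Question(10, "QUESTION10.WAV", "music", {
    "yes": AnswerAction(keywords=["pop music", "rock music"]),
    "no": AnswerAction(reject=[18, 22, 36]),
})

# Figure 3B style: the 'final' question for record 20.
q15 = Question(15, "QUESTION15.WAV", "pop music", {
    "yes": AnswerAction(select=20),
    "no": AnswerAction(reject=[20]),
})
```

Keeping the action separate from the answer text means the same structure serves both kinds of question, which mirrors the statement that Figure 3B has the same layout as Figure 3A.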

Figure 4 is a flow diagram illustrating the operation of the system according to this embodiment of the present invention. In step S1 an initial question is output to the user. This question can simply be an initial prompt, e.g. "What would you like?", and the scores for the records are set to the initial record scores. In step S2 the system awaits the user input and when this is received, in step S3, the answer translator determines whether the input matches an expected answer. If it does not, in step S4 keywords are extracted from the input and in step S5 the keywords are added to the keyword list. The keywords are then used in step S6 to search in the keyword database for scores for the words for each record. These scores are then used to determine a revised score for each record.

In step S7 a next question to ask a user is selected using the revised scores for each record. The selected question is then output in step S8 to the user and the process returns to step S2 to await a user input.

In this embodiment the initial scores for the records are set as an initial probability p(x). The scores stored for keywords comprise the probability of a word given a record, p(w|x). The probability is thus updated by multiplying the current probability p(x) by the word probability p(w|x).

In order to take into account the possibility that a user changes the target record during the question and answer session, the current probability for records is allowed to decay back towards the initial probability. For example, the new probability can be calculated from:

$$p(x) = \bigl(0.2 \times p_i(x) + 0.8 \times p(x)\bigr) \times p(w \mid x)$$

where p_i(x) is the initial probability.

It can be seen that with a decay set by the numbers 0.2 and 0.8, the current probability can be made to decay towards the initial probability if the word probability does not modify the current probability.
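The decay can be checked numerically with a short sketch. The function name `decayed_update` and the starting values are hypothetical; the 0.2 and 0.8 factors are the ones given in the description, expressed here through a single `keep` parameter.

```python
def decayed_update(p: float, p_initial: float, p_word: float, keep: float = 0.8) -> float:
    """Blend the current probability with the initial one, then apply p(w|x)."""
    return ((1.0 - keep) * p_initial + keep * p) * p_word


# One update from a current probability of 0.9 with an initial probability of 0.1,
# using a word probability of 1.0 so that only the decay acts.
one_step = decayed_update(0.9, 0.1, 1.0)

# Repeating the update with an uninformative word probability pulls the
# current probability back towards the initial probability.
p = 0.9
for _ in range(50):
    p = decayed_update(p, 0.1, 1.0)
```

After one step the value has moved from 0.9 to about 0.74, and after many steps it settles close to the initial probability of 0.1, which is the behaviour the description attributes to the decay.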

If in step S3 the user's input does match an expected answer, the answer translator, in step S9, translates the input to keywords if the answer has keywords associated with it. Alternatively, if there is an instruction associated with an expected answer to set the selected record identifier to a record number or to set the rejected record identifier to a record number, this is done. In step S10 it is then determined whether the selected record identifier identifies a record. If so, this means that a record has been selected and in step S11 the record is output. If in step S10 no record is identified by the selected record identifier, in step S12 it is determined whether the rejected record identifier identifies a record. If not, this means that the answer translator has determined keywords associated with the answer and these are added to the keyword list in step S5 to be used in step S6 to revise the score for each record. The revised score can then be used in step S7 to select the next question to ask a user for output in step S8.

If in step S12 it is determined that there is a record identified by the rejected record identifier, in step S13 the score for the record is fixed to zero and the process to select a next question to ask a user in step S7 is carried out with the score for the record fixed to zero. If step S13 has been carried out a number of times, there can be a number of rejected records listed for which the scores are fixed to zero. These are listed in the rejected record list to ensure that their scores remain fixed at zero in the current scores used by the question selector to determine the next question to ask a user.

The method by which the next question is determined using the scores for each record will now be described with reference to the flow diagrams of Figures 5 and 6.

The algorithm to identify the next question is based on a recursive process wherein for each question and for each record an estimate is made as to the least number of questions needed to arrive at the record after answering the question.

The algorithm for selecting the next question to ask the user is based on an optimal or best answer assumption. Each question has associated with it expected answers. The assumption is that if a question is asked, the user will give the best answer to reach a target record.

Because the expected answers to the questions are known, it is possible to generate a sequence of questions and predicted answers in order to reach a target record. Using these predicted answers it is possible to select a good path, i.e. a path having the least number of questions for every record. Thus the algorithm operates by looking at each question and predicting a response. The predicted response is then used to calculate predicted scores for the records and the predicted scores for the records are then used to select a next question.

This process repeats to find paths using a sequence of questions and predicted answers to reach each record. The shortest path length to each record is selected and an average of the shortest path lengths is taken, wherein the average is weighted by the current probability for each respective record.

This process will be described in more detail with reference to Figures 5 and 6.

In step S20 the question index Q is set to 1 to start with the first question, and in step S21 the record index x is set to 1 to start with the first record. In step S22, assuming a user wishes to retrieve record x, the path after question Q needed to retrieve record x is predicted and the path length is stored as DL(x). Then in step S23 the record index is incremented and in step S24 it is determined whether there are more records to process; if so, the process returns to step S22 to predict path lengths DL(x) for these records. Once all the records have been processed, in step S25 the path length for the question is taken as the weighted average of the shortest path lengths to each record. The weighting used is the probability for each respective record. The equation for the calculation of the path length for each question is given by:

$$\sum_{x} p(x)\, DL(x)$$

In step S26 the question index Q is incremented and in step S27 it is determined whether all questions have been evaluated. If not, the process returns to step S21 to continue the evaluation for each question and for each record. If all questions have been evaluated to determine the weighted average of the shortest path lengths, in step S28 the question is selected which has the shortest path length.

Thus for each question the path length is determined as an average path length taking into account the likelihood that the user will take a particular path to a particular record (because its probability is higher). Thus the process uses a statistical process to select the most suitable next question to arrive most quickly at a record.
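The outer loop of Figure 5 can be sketched as below. The function `select_next_question` and the toy path-length table are hypothetical; `path_length(q, x)` stands in for the path prediction of step S22 and is supplied by the caller. Records with zero probability are skipped, which is an assumption made so that rejected records do not contribute.

```python
from typing import Callable, Dict, Iterable, Optional


def select_next_question(questions: Iterable[int],
                         probs: Dict[int, float],
                         path_length: Callable[[int, int], float]) -> Optional[int]:
    """Pick the question with the lowest probability-weighted average path length."""
    best_question, best_average = None, float("inf")
    for q in questions:                       # steps S20, S26, S27
        average = sum(p * path_length(q, x)   # steps S21 to S25
                      for x, p in probs.items() if p > 0.0)
        if average < best_average:            # step S28
            best_question, best_average = q, average
    return best_question


# Toy table of predicted shortest path lengths DL(x) after each question.
table = {(1, 20): 1, (1, 36): 3, (2, 20): 2, (2, 36): 2}


def lookup(q: int, x: int) -> float:
    return table[(q, x)]


likely_20 = select_next_question([1, 2], {20: 0.9, 36: 0.1}, lookup)
likely_36 = select_next_question([1, 2], {20: 0.1, 36: 0.9}, lookup)
```

When record 20 is the likely target, question 1 wins with an average of 1.2 against 2.0; when record 36 is the likely target, question 2 wins with 2.0 against 2.8. The same questions are therefore ranked differently as the record probabilities change.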

The process of predicting the path to a record in step S22 of Figure 5 will now be described in more detail with reference to Figure 6.

In step S30 an answer is chosen which a user would give to arrive at the record x. The predicted answer is then translated in step S31 to look up the keywords or action relating to the answer. If in step S32 the action selects a record, in step S33 a path length equal to 1 is returned. If a record is not selected, in step S34 the keywords are used to update the predicted probability for the record p(x). The process then has to select the next question in the path and this is done by initially setting the question index to 1 and predicting the path length again using the same process as step S22. In step S37 the predicted path length DL(x) is stored and in step S38 the question index is incremented. In step S39 it is then determined whether all of the questions have been processed and if not the process returns to step S36. If all of the questions have been processed, in step S40, the shortest path length is selected and in step S41 the shortest path length +1 is returned.

Thus Figures 5 and 6 comprise a recursive process used to identify path lengths for each next question to each record, where the path length comprises a sequence of questions of shortest length.
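The recursion can be sketched as follows under several simplifying assumptions that are not part of the embodiment: a hypothetical "final" question CONFIRM asks whether the user wants the currently top-scored record, a predicted "no" sets that record's score to zero, keyword questions multiply the scores of matching records by an assumed factor BOOST, no keyword question is asked twice on one path, and the target record has a positive score.

```python
CONFIRM = "confirm"  # hypothetical "final" question: "Do you want <top record>?"
BOOST = 3.0          # assumed multiplicative boost for records matching an answer


def apply_answer(question, target, scores, questions):
    """Predict the answer of a user who wants `target`; return (selected, new_scores)."""
    if question == CONFIRM:
        top = max(scores, key=scores.get)
        if top == target:
            return True, scores      # predicted "yes" selects the record
        rejected = dict(scores)
        rejected[top] = 0.0          # predicted "no" rejects the top record
        return False, rejected
    # Keyword question: find the expected answer that applies to the target.
    matching = next(
        (recs for recs in questions[question].values() if target in recs), set()
    )
    boosted = {x: s * (BOOST if x in matching else 1.0) for x, s in scores.items()}
    total = sum(boosted.values())
    return False, {x: s / total for x, s in boosted.items()}


def path_length(question, target, scores, questions, asked=frozenset()):
    """Predicted number of questions, starting with `question`, needed to reach `target`."""
    selected, new_scores = apply_answer(question, target, scores, questions)
    if selected:
        return 1
    asked = asked | {question}
    # CONFIRM may always be asked again; keyword questions only once per path.
    candidates = [q for q in list(questions) + [CONFIRM]
                  if q == CONFIRM or q not in asked]
    shortest = min(path_length(q, target, new_scores, questions, asked)
                   for q in candidates)
    return shortest + 1


# Hypothetical data: one keyword question and three records.
scores = {"salsa": 0.2, "jazz": 0.5, "opera": 0.3}
questions = {"Is it dance music?": {"yes": {"salsa"}, "no": {"jazz", "opera"}}}
```

In this illustrative data, asking the keyword question first reaches "salsa" in two questions, whereas starting with the confirming question takes three.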

In the algorithms described with reference to Figures 5 and 6, the processing can be quite demanding where there are a large number of questions and/or a large number of records. The following embodiments describe optimisation techniques in order to reduce the processing requirements of the algorithm.

In a first optimisation embodiment, instead of including all records in the algorithm, at each loop in the algorithm only records which are likely to be selected by the user are processed. These records are identified by their probabilities. Thus only records which have a probability above a threshold are included in the list of records at each loop of the procedure. This avoids having to process unlikely records.
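A minimal sketch of this pruning is given below. The function name, the threshold value and the example probabilities are illustrative assumptions; whether the remaining probabilities are renormalised is not stated in the description, so the sketch leaves them unchanged.

```python
def likely_records(probs, threshold=0.05):
    """Keep only records whose probability exceeds the threshold."""
    return {x: p for x, p in probs.items() if p > threshold}


# Hypothetical probabilities: record "c" is too unlikely to be worth processing.
probs = {"a": 0.60, "b": 0.38, "c": 0.02}
kept = likely_records(probs)
```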

A second optimisation embodiment comprises limiting the recursion depth. The number of questions included in the path length can be set to a threshold, e.g. 10. Once the algorithm recursively calculates the length DL(x) as reaching 10, it can stop and simply return a maximum value, i.e. 10. This assumes that there will be some other question which will provide a lower expected path length DL(x) which will be selected.
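The depth limit can be sketched as a budget that is reduced at each level of the recursion. The callback follow_ups is a hypothetical stand-in for the prediction of Figure 6: it is assumed to return None when the predicted answer selects the target, and otherwise a list of candidate next questions.

```python
def capped_path_length(question, target, follow_ups, budget=10):
    """Path length from `question` to `target`, never exceeding `budget`."""
    nxt = follow_ups(question, target)
    if nxt is None:             # predicted answer selects the target
        return 1
    if budget <= 1 or not nxt:  # limit reached or dead end: report the maximum
        return budget
    return 1 + min(
        capped_path_length(q, target, follow_ups, budget - 1) for q in nxt
    )


def chain(question, target):
    """Hypothetical question chain: question n leads only to question n + 1."""
    return None if question == target else [question + 1]
```

For the hypothetical chain above, a target four questions away returns 4, while a target beyond the limit returns the maximum value of 10.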

A third optimisation embodiment comprises only selecting questions which are likely to have a positive effect.

Since the questions have expected answers and actions or keywords associated with the expected answers, the predicted effect on the record having the highest probability can be determined. If the expected answer to a question generates keywords which can increase the probability for the record with the highest probability, this indicates that it could be a useful question to ask. In this way the number of questions which have to be processed can be reduced by ignoring questions which cannot increase the probability of the highest probability record.
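One possible form of this filter is sketched below. It assumes, for illustration only, that each question maps its expected answers to a set of keywords and that each record has a set of keywords; a question is retained if any expected answer shares a keyword with the most probable record. All names and data are hypothetical.

```python
def useful_questions(questions, record_keywords, probs):
    """Return questions with an expected answer that can favour the top record."""
    top = max(probs, key=probs.get)
    top_words = record_keywords[top]
    return [
        q for q, answers in questions.items()
        if any(top_words & keywords for keywords in answers.values())
    ]


record_keywords = {"salsa_hits": {"salsa", "dance"}, "jazz_night": {"jazz"}}
probs = {"salsa_hits": 0.6, "jazz_night": 0.4}
questions = {
    "Do you want dance music?": {"yes": {"dance"}, "no": set()},
    "Do you want opera?": {"yes": {"opera"}, "no": set()},
}
kept = useful_questions(questions, record_keywords, probs)
```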

A fourth optimisation embodiment involves the pre-selection of questions. Thus rather than calculating the path DL(x) for every question, it is only calculated for a small number of questions. The questions can be pre-selected by giving the questions a score and only selecting the questions which have a score above a threshold or selecting a group of questions which have the highest scores.

There are many ways in which scores can be attributed to questions. One way is to assign a question the probability for the record having the highest probability after answering the question. Another method is to assign a high score to questions which relate to the same topic as a previous dialogue, i.e. question and an answer. Another method is to assign a high score to questions that relate to the same topic as all previous dialogues, i.e. questions and answers.

Questions that relate to the same topic can be determined by comparing the topic data entry for the question (see Figures 3A and 3B) with the topic data entry for one or more previous user inputs.

Thus by assigning scores to questions it is possible to select questions with a score over a threshold, or a fixed number of questions with the highest score and calculations need only be carried out for these.
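Both forms of pre-selection can be sketched briefly. The question scores are assumed to have been computed already by one of the methods above; the function names and values are hypothetical.

```python
import heapq


def above_threshold(question_scores, threshold):
    """Select every question whose score exceeds the threshold."""
    return [q for q, s in question_scores.items() if s > threshold]


def top_k(question_scores, k):
    """Select the k questions with the highest scores, best first."""
    return heapq.nlargest(k, question_scores, key=question_scores.get)


# Hypothetical question scores.
question_scores = {"q1": 0.9, "q2": 0.4, "q3": 0.7}
```

Only the questions returned by either function would then be passed to the path length calculation.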

Any one of these processes for calculating probabilities or scores for questions can be used alone or in combination. For example, a weighted average of the three determined scores can be assigned to each question.

A further feature of an embodiment of the present invention will now be described with reference to Figure 7.

Imagine a user says "I want Salsa" but the speech recognition misunderstands this as "I one these are". This can easily happen because of noise, deformations of the voice over a telephone line, unclear pronunciation or accent and many other reasons. In this case the system will not have good terms for deciding what the user wants.

Also, the user could change interest during the dialogue with the system e.g. a user first asks for Salsa and later for jazz. Further, a user input such as an initial input could lack useful keywords e.g. "I would like to buy a record with some dance music". The keywords "dance music" may not be particularly distinctive if there are thousands of records relating to dance music.

It can thus be useful to indicate to the user that the system is uncertain, or in other words that the supplied terms are not effective in differentiating between records. The system can indicate uncertainty by feedback to the user such as by saying "I am not very sure. Do you want classical music?".

Thus this further feedback to a user can help a user to try to think of a more useful input to identify the record being sought.

In this embodiment records are organised in a hierarchical structure by calculating the similarity between records e.g. classifying the records. This can result in a tree as illustrated in Figure 7 wherein the records comprise the leaves of the tree. Every node in the tree is assigned a value calculated as the sum of the probabilities or scores of the leaves of which it is a parent. In Figure 7 probability values are used and hence at any level in the hierarchy the probability adds up to 1.0.

Thus in the storage of the current record scores, scores for nodes of the hierarchical tree can also be stored to facilitate the feature of indicating uncertainty to a user. The probability values for each node can be adjusted in accordance with the adjustments made to the probability values for the records in the hierarchy below the nodes.
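The node values can be computed recursively from the record probabilities, as in the following sketch. The tree representation (a leaf is a record name, an internal node is a pair of a label and a list of children), the example hierarchy and the probabilities are assumptions made for illustration. The sketch also includes a simple threshold test at one level of the hierarchy, of the kind this structure is intended to support.

```python
def node_score(node, probs):
    """Score of a node: a leaf's own probability, or the sum over its leaves."""
    if isinstance(node, str):
        return probs[node]
    _label, children = node
    return sum(node_score(child, probs) for child in children)


def is_uncertain(nodes_at_level, probs, threshold=0.6):
    """True if no node at the given level has a score above the threshold."""
    return all(node_score(n, probs) <= threshold for n in nodes_at_level)


# Hypothetical hierarchy with two classifications below the root.
dance = ("dance", ["salsa", "disco"])
classical = ("classical", ["opera"])
root = ("music", [dance, classical])

confident = {"salsa": 0.5, "disco": 0.2, "opera": 0.3}
unsure = {"salsa": 0.3, "disco": 0.2, "opera": 0.5}
```

With the first set of probabilities the "dance" node exceeds the threshold, so no uncertainty is signalled; with the second set neither node does.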

This hierarchical structure of probabilities can be used for indicating uncertainty to a user by setting a threshold probability value, e.g. 0.6 shown in Figure 7. Using this threshold probability value, nodes in the hierarchical structure which have probabilities above and below the threshold can be identified. If no nodes can be identified having a probability above the threshold value at a suitable level in the hierarchy, this indicates that no distinctive records have been identified and uncertainty can be indicated to a user.

Although the present invention has been described hereinabove with reference to specific embodiments, modifications will be apparent to a skilled person in the art which lie within the spirit and scope of the present invention.

Although in the embodiments probability is used for words and records, any form of score can be used.

In the embodiment, a "final" question is used to cause the selection of a record for output. However, the selection of a record for output can alternatively take place by selecting a record which has a score which is significantly high, e.g. has a score above a threshold which is greater than other scores by a threshold amount.

In the embodiment described with reference to Figure 4, expected answers have keywords, rejected records or selected records associated with them. The present invention also encompasses a combination e.g. keywords and records to be rejected in response to the associated answer.

The present invention is applicable to any means by which questions and answers can be conveyed between a user and the system. The user interface can comprise speech or text for example.

The present invention is applicable to the selection of any type of machine operation from a number of possible machine operations. For example, the present invention is applicable to the selection of data records for retrieval, e.g. the retrieval of images, text, audio and video. Alternatively, the machine operation can simply comprise the marking or identification of a selected record. Further, the machine operation can be the selection and execution of a spoken dialogue module such as a VoXML file. Also, the present invention is applicable to call centre technology wherein the selected machine operation is the routing of a telephone call or the selection of the service.

The present invention can be implemented by dedicated hardware configured to perform the functions of the system. More preferably, the present invention is implemented in a processing system by computer program code. Such a processing system can be provided in any form of apparatus such as in a photocopying machine, facsimile machine, mobile telephone, or a general purpose computer. The present invention thus encompasses program code for controlling a processor to implement the method. The program code can be loaded into the processing system from any conventional carrier medium such as a transient carrier medium (e.g. an electrical signal carrying the program code) or a storage medium such as a floppy disc drive, CD ROM, magnetic tape device, or solid state device.

Claims

CLAIMS: 1. A machine having a machine interface to allow a user to select a machine operation, the machine interface comprising: question storage means for storing a plurality of questions for output to the user; score storage means for storing a score for each of a plurality of machine operations, said score indicating the likelihood that the user will select a corresponding machine operation; question selection means for selecting a next question for output to the user from said question storage means by determining, for each of a plurality of said questions, an average of the least number of questions required to be answered by the user to arrive at each said machine operation weighted by the respective scores, and selecting a question having the lowest average number; outputting means for outputting the selected question to the user; inputting means for receiving an input answer to the question from the user; and processing means for responding to the input answer by carrying out a said machine operation and/or by adjusting the scores for each of the plurality of machine operations stored in said operation storage means; said question selection means being adapted to carry out at least one further selection of a said next question using the adjusted scores stored in said operation storage means for output by said outputting means.
2. A machine according to claim 1 wherein said question storage means is adapted to store, for a specified answer, for each of a plurality of said questions, an identifier for a corresponding machine operation to be carried out in response to input of said specified answer, and said processing means is responsive to a said specified answer to a said question to carry out the machine operation identified by a corresponding said identifier for the specified answer.
3. A machine according to claim 1 or claim 2 wherein said processing means is responsive to the input answer to carry out a said machine operation having the most significant score stored in said operation storage means.
4. A machine according to any preceding claim wherein said question storage means is adapted to store expected answers to said questions from the user, and said question selection means is adapted to determine said least number of questions by predicting the expected answers input by the user to select each of said machine operations.
5. A machine according to any preceding claim including word storage means for storing keywords for each of said plurality of machine operations; and keyword determining means for determining keywords using said input answer; wherein said processing means is adapted to match the determined keywords to the keywords stored in said word storage means, and to adjust the scores for each of the plurality of machine operations in dependence upon the matching.
6. A machine according to claim 5 wherein said word storage means is adapted to store scores for the keywords for each of said plurality of machine operations, and said processing means is adapted to determine scores for determined keywords for each of said plurality of machine operations by matching the determined keywords to the keywords stored in said word storage means, and to adjust the scores for each of said plurality of machine operations using the determined scores for keywords.
7. A machine according to claim 5 or claim 6 wherein said question storage means is adapted to store keywords associated with expected answers to at least some of the questions, and said keyword determining means is adapted to determine keywords from the association with an input answer using said question storage means.
8. A machine according to any preceding claim wherein said question selection means is adapted to use a recursive process for the determining process to identify sequences of questions to select each said machine operation.
9. A machine according to claim 8 wherein said question selection means is adapted to carry out the recursive process for each sequence until the sequence length reaches a threshold length.
10. A machine according to any preceding claim wherein said question selection means is adapted to perform the determining for each of a plurality of questions, by determining an average of the least number of questions required to be answered to arrive at only the machine operations having the highest scores weighted by the respective scores.
11. A machine according to any preceding claim wherein said question selection means is adapted to perform the determining only for questions the answers to which can cause the score of a most likely machine operation to increase.
12. A machine according to any one of claims 1 to 10 wherein said question selection means is adapted to select a plurality of said questions for use as the plurality of questions in the determining process by selecting a plurality of questions assigned the highest score, and to determine scores for the questions by using at least one of three techniques, namely: (i) taking the score of the machine instruction having the highest score after asking the question and predicting an answer, (ii) assigning a high score to questions relating to the same topic as a previous input answer, and (iii) assigning a high score to questions relating to the same topic as any previous answers.
13. A machine according to claim 12 wherein said question selection means is adapted to determine scores for the questions by using all three techniques and taking a weighted average of the determined scores.
14. A machine according to any preceding claim wherein said score storage means is adapted to store scores for hierarchical classifications of said machine operations, each hierarchical classification comprising a topic to which the machine operations in the hierarchy below relate and having a score comprising the sum of the scores for the machine operations in the hierarchy below, the machine interface including uncertainty means for indicating uncertainty to a user if the score for any of the hierarchical classifications at a predetermined level of hierarchical classification is below a threshold.
15. A machine according to any preceding claim including means for uniformly decaying the scores for each said machine operation stored in said score storage means by a predetermined amount after a question has been answered.
16. A machine according to any preceding claim including means to allow questions to be entered into or adjusted in said question storage means.
17. A machine according to any preceding claim including means to allow scores to be entered into or adjusted in said score storage means.
18. A machine according to claim 5 or claim 6 including means to allow words to be entered into or adjusted in said word storage means.
19. A machine according to any preceding claim wherein said outputting means is adapted to generate speech and said inputting means is adapted to recognise speech.
20. A method of providing a machine interface to allow a user to select a machine operation, the method comprising: providing a stored plurality of questions for output to the user; providing a stored score for each of a plurality of machine operations, each score indicating the likelihood that the user will select a corresponding machine operation; selecting a next question for output to the user from the stored questions by determining, for each of a plurality of said questions, an average of the least number of questions required to be answered by the user to arrive at each said machine operation weighted by the respective scores, and selecting a question having the lowest average number; outputting the selected question to the user; receiving an input answer from the user; and responding to the input answer by carrying out a said machine operation and/or by adjusting the stored scores for each of the plurality of machine operations; and repeating the selecting step using the adjusted scores and subsequently repeating the outputting, receiving and responding steps.
21. A method according to claim 20, wherein for a specified answer, for each of a plurality of said questions, an identifier for a corresponding machine operation to be carried out in response to input of said specified answer is stored, and in response to said specified answer being received from the user to a said question, the machine operation identified by a corresponding said identifier for a specified answer is executed.
22. A method according to claim 20 or claim 21, wherein a said machine operation having a stored score which is of a threshold significance is executed in response to the input answer.
23. A method according to any one of claims 20 to 22, wherein expected answers to said questions from the user are stored with the questions, and in said selecting step, the least number of questions is determined by predicting the expected answers input by the user to select each of said machine operations.
24. A method according to any one of claims 20 to 23 including providing stored keywords for each of said plurality of machine operations, and determining keywords using said input answer, wherein the input answers are responded to by matching the determined keywords to the stored keywords and adjusting the scores for each of the plurality of machine operations in dependence upon the matching step.
25. A method according to claim 24, wherein scores for the keywords for each of said plurality of machine operations are stored, scores for determined keywords for each of said plurality of machine operations are determined by matching the determined keywords to the stored keywords, and the scores for each of said plurality of machine operations are adjusted using the determined scores for keywords.
26. A method according to claim 24 or claim 25, wherein keywords associated with expected answers to at least some of the questions are stored, and the step of determining keywords comprises determining keywords from the association of stored keywords with an input answer.
27. A method according to any one of claims 20 to 26, wherein the selecting step comprises a recursive process to identify sequences of questions to select each said machine operation.
28. A method according to claim 27, wherein the selecting step carries out the recursive process for each sequence until the sequence length reaches a threshold length.
29. A method according to any one of claims 20 to 28, wherein in the selecting step the determination for each of a plurality of questions is carried out by determining an average of the least number of questions required to be answered to arrive at only the machine operations having the highest scores weighted by the respective scores.
30. A method according to any one of claims 20 to 29, wherein the selecting step performs the determination only for questions the answer to which can cause the score for a most likely machine operation to increase.
31. A method according to any one of claims 20 to 29, wherein said selecting step selects a plurality of said questions for use as the plurality of questions in the determination by selecting a plurality of questions assigned the highest score, and determines scores for the questions by using at least one of three techniques, namely: (i) taking the score of the machine instruction having the highest score after asking the questions and predicting an answer, (ii) assigning a high score to questions relating to the same topic as a previous input answer, and (iii) assigning a high score to questions relating to the same topic as any previous input answers.
32. A method according to claim 31, wherein said selecting step determines scores for the questions by using all three techniques and taking a weighted average of the determined scores.
33. A method according to any one of claims 20 to 32, wherein scores for hierarchical classifications of said machine operations are stored, each hierarchical classification comprising a topic to which the machine operations in the hierarchy below relate and having a score comprising the sum of the scores for the machine operations in the hierarchy below; the method including indicating uncertainty to a user if the score for any of the hierarchical classifications at a predetermined level of hierarchical classification is below a threshold.
34. A method according to any one of claims 20 to 33, including uniformly decaying the stored scores for each said machine operation by a predetermined amount after a question has been answered.
35. A method according to any one of claims 20 to 34, including receiving and storing new questions, or receiving instructions to adjust stored questions.
36. A method according to any one of claims 20 to 35 including receiving and storing new scores for new machine operations, or instructions to adjust stored scores for current machine operations.
37. A method according to claim 24 or claim 25, including receiving and storing new keywords or receiving instructions to adjust stored keywords.
38. A method according to any one of claims 20 to 37, wherein the outputting step includes the generation of speech and the inputting step includes the recognition of speech.
39. Program code for controlling a processor to carry out the method of any one of claims 20 to 38.
40. A carrier medium for carrying the program code according to claim 39.