WO2010116785A1 - Search device - Google Patents
- Publication number
- WO2010116785A1 (application PCT/JP2010/051874, JP2010051874W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- candidates
- narrowing
- candidate
- input
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
Definitions
- The present invention relates to a search device that searches for names, such as facility names, in response to text input or to input obtained by pattern recognition, such as speech input.
- Name search technology based on a character string index searches for names using an index keyed on search units such as morphemes or character N-gram partial character strings of the search target.
- Japanese Patent No. 3665112 discloses a method in which a score is aggregated for each candidate based on partial character string matching and the top-scoring candidates are returned as the search result. This allows a fuzzy search that extracts similar candidates even when no character string exactly matches the input. A fuzzy search must hold multiple candidates with different scores, so it uses more memory and computation than an exact-match search.
- The size of the character string index is proportional to the number of search units in the search target character strings. When the search target is large, the index must therefore be placed on secondary storage such as a DVD (Digital Versatile Disc) or a hard disk, and the processing time required to read from secondary storage increases.
- The number of dictionary reads equals the number of distinct partial character strings and, for short inputs such as names, is roughly proportional to the length of the input character string.
- As noted above, a fuzzy search must hold multiple candidates with different scores, and it requires more memory and computation than an exact-match search.
- Japanese Patent Laid-Open No. 2008-262279 discloses a speech-based search method that takes into account the difference between the unit used for speech recognition and the unit used for search. Because the search allows for misrecognition during speech recognition, the number of candidates increases further.
- Japanese Patent No. 3134204 discloses a method that lets the user select, by an instruction operation, between a hierarchical search mode, which narrows down using the document set of the immediately preceding search result as the population, and a universe search mode, which always searches a fixed set of documents in each search.
- Each narrowing method has the following problems.
- When the document population is searched again on each re-search, the cost of managing the search history is small, because only the user's inputs need to be held as the search history. However, all inputs must always be processed against all candidates. The number of index reads and the number of candidates to aggregate are both large, so the processing time becomes long and responsiveness decreases.
- Japanese Patent Laid-Open No. 2008-262279 creates a recognition dictionary that covers the entire search target. This dictionary does not take the narrowing-down result into account, so the recognition rate does not improve even as the candidates are narrowed down.
- An object of the present invention is to improve the average search time during a narrowing-down search without increasing the management cost. Another object is to improve recognition accuracy when narrowing down by voice.
- The search device according to the present invention comprises: input means for accepting user input and outputting a search request; search history storage means for storing a search history that includes the input contents from the input means and a candidate list; narrowing-down method selection means for selecting, in response to the search request and according to the contents of the search history stored in the search history storage means, one of two narrowing methods, namely limiting the search target to upper candidates or re-searching based on past input; candidate score updating means for setting the search candidates and their scores from the search history based on the selected narrowing method, and for updating the candidate scores by referring to the search index based on the character string received from the input means; candidate determination means for determining the candidates to be presented based on the number of candidates updated by the candidate score updating means and the distribution of their scores; and candidate presentation means for presenting the candidates determined by the candidate determination means to the user with reference to the name information data.
- According to the present invention, the narrowing method is selected, according to the contents of the search history stored in the search history storage means, from two methods: limiting the search target to upper candidates, and re-searching based on past input. When there are few candidates with high validity, the search target can be restricted, which shortens the calculation time. When there are many candidates with high validity, the search range can be expanded by referring to the input character strings in the search history. As a result, a search with a short calculation time is possible.
- FIG. 1 is an overall configuration diagram of a search device assumed by the present invention.
- FIG. 2 is a functional block diagram showing the configuration of the search device according to Embodiment 1 of the present invention.
- FIG. 3 is an explanatory diagram of an example of the name information dictionary.
- FIG. 4 is an explanatory diagram of an example of a search index based on character 2-grams.
- FIG. 5 is an explanatory diagram of an example of a search history.
- FIG. 6 is an explanatory diagram of the table for aggregation, showing the total score and the total flag.
- FIG. 7 is a flowchart showing the search processing operation of the search device according to Embodiment 1.
- FIG. 8 is a characteristic diagram of the ranking and score of candidate search results for two inputs.
- FIG. 9 is a functional block diagram showing the configuration of the search device according to Embodiment 2 of the present invention.
- FIG. 10 is a diagram illustrating the connection probability of the bigram language model.
- FIG. 11 is an explanatory diagram of an example of a narrowing-down recognition dictionary.
- FIG. 12 is a flowchart showing the search processing operation of the search device according to Embodiment 2.
- FIG. 13 is a functional block diagram showing the configuration of the search device according to Embodiment 3 of the present invention.
- FIG. 14 is a flowchart showing the search processing operation of the search device according to Embodiment 3.
- FIG. 1 shows the overall configuration of a search device assumed by the present invention.
- The input unit 10 accepts input by text, speech, or the like and converts it into a format that the search unit 20 can accept, referring to the large vocabulary speech recognition dictionary 103 as necessary.
- The search unit 20 performs a fuzzy search with reference to the search index 102.
- The presentation unit 30 refers to the name information dictionary 101 and presents to the user the names in the search result from the search unit 20, together with accompanying information.
- The name information dictionary 101, the search index 102, and the large vocabulary speech recognition dictionary 103 are created from the search target data. Because their data size grows with the size of the search target, they are placed on the secondary storage device 40.
- FIG. 2 is a functional block diagram showing the configuration of the search device according to Embodiment 1 of the present invention.
- The search device includes a name information dictionary 101, a search index 102, an input unit 201, which is an example of the input unit 10, a search history storage unit 202, a narrowing-down method selection unit 203, a candidate score update unit 204, a candidate determination unit 205, and a candidate presentation unit 206, which is an example of the presentation unit 30.
- A characteristic feature of the present invention is that it includes the narrowing-down method selection unit 203, which determines the narrowing-down method according to the search history read from the search history storage unit 202.
- The operation of each functional block is described below.
- The name information dictionary 101 holds name information, such as notation and pronunciation, to be presented to the user for each name ID (identification).
- FIG. 3 shows an example of the name information dictionary 101, consisting of name IDs and name readings. Any other information associated with a name ID, such as a word segmentation result or a notation, can also be registered in the name information dictionary 101.
- The search index 102 stores, for each partial character string, the IDs of the names that contain it.
- By looking up name IDs from the partial character strings of the input, the score of each name ID can be updated.
- The unit of the partial character string must be determined in advance; a word (a morpheme in the case of Japanese) or a character N-gram is used.
- Position information within the name and an importance weight for information retrieval, such as tf-idf, can also be attached.
- FIG. 4 is an example of the search index 102 based on character 2-grams, corresponding to FIG. In this search index, the corresponding name IDs can be looked up from any two-character string.
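- The character 2-gram index described above can be illustrated with a minimal sketch. The romanized readings, the name IDs, and the function names `char_ngrams` and `build_index` are assumptions made for illustration and do not come from the patent; position information and tf-idf weights are omitted.

```python
from collections import defaultdict

def char_ngrams(text, n=2):
    """Return the character n-grams of text in order of appearance."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_index(names, n=2):
    """Map each character n-gram to the set of name IDs whose reading contains it."""
    index = defaultdict(set)
    for name_id, reading in names.items():
        for gram in char_ngrams(reading, n):
            index[gram].add(name_id)
    return index

# Hypothetical name information dictionary: name ID -> reading.
names = {1: "kawasakikouen", 2: "yokohamakouen", 3: "kawasakieki"}
index = build_index(names)
print(sorted(index["ka"]))  # name IDs whose reading contains "ka"
print(sorted(index["ko"]))  # name IDs whose reading contains "ko"
```

- A lookup costs one index read per distinct 2-gram of the input, which is why the number of reads grows with the input length.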
- The input means 201 accepts the user input and outputs a search character string to the candidate score update means 204.
- The search history storage means 202 stores the user's search history up to that point.
- The search history consists of an input ID, the user's input character string, the name IDs that made up the search result at that time, and their scores. An entry is added to the search history each time narrowing occurs, and the entire search history is cleared when narrowing is canceled. The candidates in the search history are truncated at an appropriate score threshold and at the number of candidates that can be presented.
- FIG. 5 is an example of a search history.
- The narrowing-down method selection unit 203 selects a narrowing-down method based on the number of candidates, their scores, and other information stored in the search history storage unit 202.
- The candidate score update unit 204 takes the character string acquired from the input unit 201 and, based on the partial character strings that make it up, updates the scores of the name IDs in the tabulation table that it holds.
- For each name ID, the tabulation table holds a score and a tabulation flag indicating that the name ID is a tabulation target under the current narrowing.
- FIG. 6 is an example showing the total scores and total flags of the tabulation table. If there is no search history, the scores of all name IDs in the tabulation table are cleared and all tabulation flags are set.
- The candidate determination means 205 extracts from the tabulation table, from among the candidates whose score acquired by the candidate score updating means 204 exceeds a predetermined value, no more than a predetermined number of candidates for presentation to the user, together with the candidate IDs and scores to be held for later searches, and outputs them to the candidate presentation means 206 and the search history storage means 202.
- The candidate presenting means 206 refers to the name information dictionary 101 and presents to the user the names corresponding to the name ID list acquired from the candidate determining means 205.
- FIG. 7 is a flowchart showing the search processing operation of the search device according to the first embodiment.
- The input unit 201 acquires the user's input character string and issues a search request (step S1001).
- The narrowing-down method selection unit 203 refers to the search history storage unit 202 and checks whether a search history exists for the input character string, that is, whether the history count h is 1 or more (step S1002). If the history count is 0, the aggregation flag is set for all candidates in the aggregation table, their scores are cleared to 0, and the process proceeds to step S1008.
- The narrowing-down method selection unit 203 refers to at least one of the total length of the input character strings stored in the search history storage unit 202, the number of candidates in the final history, and the candidate score distribution of the final history, and selects a narrowing-down method from (1) re-search based on past inputs, that is, recalculating the scores of the tabulation table, and (2) limiting the search target to upper candidates, that is, restricting the search to the candidates held by the search history storage means 202 (step S1003). Details of the selection are described later. If the scores are to be recalculated, the process proceeds to step S1004. If the search is to be limited to the candidates held by the search history storage unit 202, the process proceeds to step S1007.
- In the recalculation case, the score for each name ID in the tabulation table is recalculated from the past history inputs.
- The aggregation flag is set for all candidates in the aggregation table, and the history number i to be referenced is set to 1 (step S1004).
- The candidate score update means 204 reads, from the search index 102, the index entries for the partial character strings of the input character string contained in the history information S[i], and adds to the score of each candidate (step S1005).
- If the referenced history number i is smaller than the stored history count h, 1 is added to i and the process returns to step S1005. Otherwise, the process proceeds to step S1008 (step S1006). As a result, each candidate name ID receives a score that reflects the input character strings of all histories.
- In the limiting case, the candidate score update unit 204 sets the aggregation flag in the aggregation table for the name IDs held in the latest search history S[h] and updates their scores (step S1007).
- The candidate score update unit 204 obtains the partial character strings of the character string acquired from the input unit 201, looks them up in the search index 102, and adds the corresponding scores (step S1008).
- The candidate determination means 205 extracts from the tabulation table, from among the candidates whose score acquired by the candidate score updating means 204 exceeds a predetermined value, no more than a predetermined number of name IDs to present to the user together with their scores, and fixes the presentation candidates (step S1009).
- The search history storage unit 202 stores the input character string together with the name IDs and scores of the presentation candidates extracted from the tabulation table by the candidate determination unit 205 (step S1010).
- The candidate presenting means 206 refers to the name information dictionary 101, acquires the presentation content, such as the names corresponding to the name IDs to be presented, and presents it to the user (step S1011).
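- Steps S1002 to S1010 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: scoring adds one point per matching 2-gram, the selection rule is passed in as a hypothetical callback `select_method`, and the names `search`, `add_scores`, `min_score`, and `max_present` as well as the sample data are invented for the example.

```python
from collections import defaultdict

def bigrams(text):
    """Character 2-grams of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

def add_scores(table, index, text):
    """S1005/S1008: add one point per matching 2-gram to flagged candidates only."""
    for gram in set(bigrams(text)):
        for name_id in index.get(gram, ()):
            entry = table[name_id]
            if entry["flag"]:
                entry["score"] += 1

def search(text, index, all_ids, history, select_method, min_score=0, max_present=10):
    # S1002: start with every candidate flagged and every score cleared.
    table = {i: {"score": 0, "flag": True} for i in all_ids}
    if history:
        if select_method(history) == "recalculate":    # S1003
            for past in history:                        # S1004 to S1006
                add_scores(table, index, past["input"])
        else:                                           # S1007
            for entry in table.values():
                entry["flag"] = False
            for name_id, score in history[-1]["candidates"]:
                table[name_id] = {"score": score, "flag": True}
    add_scores(table, index, text)                      # S1008
    hits = [(i, e["score"]) for i, e in table.items()
            if e["flag"] and e["score"] > min_score]
    ranked = sorted(hits, key=lambda c: (-c[1], c[0]))[:max_present]  # S1009
    history.append({"input": text, "candidates": ranked})             # S1010
    return ranked

# Hypothetical data: name ID -> romanized reading.
names = {1: "kawasakikouen", 2: "yokohamakouen", 3: "kawasakieki"}
index = defaultdict(set)
for name_id, reading in names.items():
    for gram in bigrams(reading):
        index[gram].add(name_id)

history = []
first = search("kouen", index, names, history, lambda h: "limit")
second = search("kawasaki", index, names, history, lambda h: "limit")
print(first)   # candidates after the first input
print(second)  # candidates after narrowing within the held candidates
```

- With the "limit" rule, the second input only touches the candidates held from the first result, whereas the "recalculate" rule replays every past input against all candidates and can bring back names that the first result did not hold.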
- FIG. 8 shows the search result candidates for two inputs, (A) and (B), with the X axis representing the rank and the Y axis representing the score.
- A score threshold is set based on the validity of the candidates.
- An upper limit on the number of candidates presented at one time is set in order to ensure reasonable responsiveness.
- The input means 201 acquires the user's text input.
- The narrowing-down method is controlled based on the candidate score distribution and the number of candidates. When there are few candidates with high validity, the search target is restricted, which shortens the calculation time. When there are many candidates with high validity, the search range is expanded by referring to the input character strings in the search history. As a result, no candidates are missed even when the search history is small, and a search with a short average calculation time is realized.
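- One possible selection rule consistent with this description is sketched below. The threshold values, the function name `select_method`, and the exact comparison are assumptions for illustration; the source only states that the choice depends on the number of candidates and their score distribution.

```python
def select_method(history, score_threshold=3, max_candidates=10):
    """Return 'limit' when few high-validity candidates remain, else 'recalculate'.

    history is a list of entries shaped like
    {"input": str, "candidates": [(name_id, score), ...]} (a hypothetical layout).
    """
    latest = history[-1]["candidates"]
    valid = [c for c in latest if c[1] >= score_threshold]
    # If the valid candidates fill the stored list, further valid candidates may
    # have been cut off, so limiting to the stored list could miss results.
    if len(valid) < max_candidates:
        return "limit"
    return "recalculate"

few = [{"input": "kouen", "candidates": [(1, 5), (2, 4)]}]
many = [{"input": "kouen", "candidates": [(i, 5) for i in range(10)]}]
print(select_method(few))
print(select_method(many))
```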
- FIG. 9 is a functional block diagram showing the configuration of the search device according to Embodiment 2 of the present invention.
- The search device according to the second embodiment adds a narrowing-down recognition dictionary generating means 302 to the search device of the first embodiment. The input is assumed to be voice.
- Components identical to those of the first embodiment are given the same reference numerals as in FIG. 2, and their description is omitted or simplified.
- The large vocabulary recognition dictionary 103 is a speech recognition dictionary created in advance for recognizing the expressions a user may use to search the target name information. In general, the more tightly the speech recognition dictionary constrains the next word, the higher the recognition rate that can be expected.
- The N-gram language model estimates the probability of the next word from the immediately preceding N-1 words.
- When N = 2, the next word is predicted from the immediately preceding word, and the model is called a bigram.
- The bigram language model predicts the next word from the word being recognized based on the connection probability P(w2 | w1).
- FIG. 10 is a diagram illustrating the connection probability P(w2 | w1).
- The words START and END are pseudo words representing the beginning and end of a sentence.
- The connection probability P(w2 | w1) is calculated from appearance frequencies in learning data such as actual utterances and the names to be searched.
- However, the amount of learning data is limited, whereas the number of possible combinations is huge: for 5000 words, for example, there are 25 million bigrams (5000 squared).
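- The frequency-based estimate of the connection probability can be sketched as follows. The relative-frequency formula count(w1, w2) / count(w1) is a standard estimate assumed here for illustration (the source does not give the formula), and the training data and the name `bigram_probabilities` are hypothetical.

```python
from collections import Counter

def bigram_probabilities(sentences):
    """Estimate P(w2 | w1) as count(w1, w2) / count(w1) from word sequences."""
    pair_counts = Counter()
    first_counts = Counter()
    for words in sentences:
        padded = ["START", *words, "END"]  # pseudo words for sentence boundaries
        for w1, w2 in zip(padded, padded[1:]):
            pair_counts[(w1, w2)] += 1
            first_counts[w1] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical learning data: names split into constituent words.
data = [["kawasaki", "kouen"], ["yokohama", "kouen"], ["kawasaki", "eki"]]
probs = bigram_probabilities(data)
print(probs[("kawasaki", "kouen")])  # 0.5
print(5000 ** 2)                     # 25000000 possible bigrams for 5000 words
```

- Most of the 25 million possible pairs never occur in limited learning data, which is why an unsmoothed estimate like this one assigns them no probability at all.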
- The voice input unit 301, which is an embodiment of the input unit 10, receives the user's voice input, recognizes the speech by referring to the recognition dictionary, and outputs a character string.
- The recognition dictionary raises the recognition rate by constraining the utterances the user is assumed to make.
- If the narrowing-down recognition dictionary generating means 302 has output a recognition dictionary, that dictionary is referred to. If not, the large vocabulary recognition dictionary 103, created in advance to cover the user's various search expressions, is referred to.
- General speech recognition methods using recognition dictionaries are described in detail in Non-Patent Document 4 and Non-Patent Document 5.
- Non-Patent Document 4: Lawrence Rabiner and Biing-Hwang Juang, “Basics of Speech Recognition (Vols. 1 and 2)”, translation supervised by Sadaoki Furui, NTT Advanced Technology Co., Ltd.
- Non-Patent Document 5: Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, “Spoken Language Processing: A Guide to Theory, Algorithm, and System Development”, Prentice Hall.
- When a narrowing-down input occurs, the narrowing-down method selection unit 203 determines, based on the search history stored in the search history storage unit 202 and according to the narrowing-down method, whether to generate a narrowing-down recognition dictionary.
- The narrowing-down recognition dictionary generation unit 302 acquires the name information corresponding to the target name IDs from the name information dictionary 101 and generates a narrowing-down recognition dictionary from it.
- FIG. 11 is an example of a narrowing-down recognition dictionary that recognizes the three names and constituent words shown in FIG.
- The speech recognition targets are the routes from the node labeled “START” to the node labeled “END”.
- Each node written in katakana along a route represents a unit of speech recognition.
- Routes that skip in units of words are provided, so partial expressions can also be accepted.
- The syllables shared by “Kawasaki” and “Yokohama”, and the trailing “Kouen”, are merged to reduce the size of the network.
- A recognition dictionary expressed as such a network can be built to recognize only utterances related to the narrowing targets. It is therefore far more compact than the large vocabulary speech recognition dictionary 103, which assumes all search targets and accepts a wide variety of expressions, and its recognition rate for the narrowing targets is high. However, creating the dictionary requires computation proportional to the number of target names, so it is difficult to generate the dictionary in a short time when there are many narrowing targets.
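- The effect of word-level skip routes can be illustrated by enumerating the expressions such a network would accept, rather than building the syllable network itself. The sketch assumes that any word of a name may be skipped while word order is kept; the word lists and the name `accepted_expressions` are hypothetical, and the syllable-level merging of the actual network is not modeled.

```python
from itertools import combinations

def accepted_expressions(name_words):
    """Enumerate utterances accepted when any word of a name may be skipped."""
    accepted = set()
    for words in name_words.values():
        for k in range(1, len(words) + 1):
            # combinations() keeps the original word order within each name.
            for kept in combinations(words, k):
                accepted.add(" ".join(kept))
    return accepted

# Hypothetical narrowing targets: name ID -> constituent words.
targets = {1: ["kawasaki", "kouen"], 2: ["yokohama", "kouen"], 3: ["kawasaki", "eki"]}
expressions = accepted_expressions(targets)
print(len(expressions))
print(sorted(expressions))
```

- Shared expressions such as "kouen" are stored once, loosely mirroring the merging of common parts, and the work grows with the number of target names, which matches the cost noted above.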
- FIG. 12 is a flowchart showing the search processing operation of the search device according to the second embodiment.
- The narrowing-down recognition dictionary generating unit 302 refers to the states of the search history storage unit 202 and the narrowing-down method selection unit 203 and checks whether the current process is limited to the candidates held by the search history storage unit 202 (step S2001). If so, the narrowing-down recognition dictionary generation unit 302 refers to the name information dictionary 101 and the search history storage unit 202, generates a recognition dictionary that accepts the expressions that can appear in the target candidates, and sets it as the recognition dictionary of the voice input means 301 (step S2002). Otherwise, the voice input means 301 reads the large vocabulary voice recognition dictionary 103 (step S2003).
- The voice input means 301 recognizes the user's speech based on the recognition dictionary that has been set, obtains a recognition result character string, outputs the character string to the candidate score update means 204, and issues a search request (step S2004).
- The candidate score update unit 204 first checks whether there is a search history in the search history storage unit 202, that is, whether the history count h is 1 or more (step S2005). If the history count is 0, the aggregation flag is set for all candidates in the aggregation table, their scores are cleared to 0, and the process proceeds to step S2012.
- The narrowing-down method selection unit 203 refers to at least one of the total length of the input character strings stored in the search history storage unit 202, the number of candidates in the final history, and the candidate score distribution of the final history, and selects a narrowing-down method from (1) re-search based on past input, that is, recalculating the scores of the table for aggregation, and (2) limiting the search target to upper candidates, that is, restricting the search to the candidates held by the search history storage means 202 (step S2006). In the case of score recalculation, the process proceeds to step S2007. In the case of limiting to the candidates held in the search history storage means 202, the process proceeds to step S2010.
- In the recalculation case, the aggregation flag is set for all candidates in the aggregation table, and the scores are recalculated with reference to the past search history stored in the search history storage means 202.
- The history number i to be referenced is set to 1 (step S2007).
- The candidate score update means 204 reads, from the search index, the index entries for the partial character strings of the input character string contained in the history information S[i], and adds to the score of each candidate (step S2008). If the referenced history number i is smaller than the history count h, i is incremented by 1 and the process returns to step S2008. Otherwise, the process proceeds to step S2011 (step S2009). As a result, each candidate name ID in the tabulation table receives a score that reflects all histories.
- In the limiting case, the candidate score update unit 204 sets the aggregation flag for the name IDs held in the latest search history S[h] and updates their scores (step S2010).
- The candidate score update unit 204 obtains the partial character strings of the character string acquired from the input unit 301, looks them up in the search index 102, and adds the corresponding scores (step S2011).
- The candidate determination means 205 extracts from the tabulation table, from among the candidates whose score acquired by the candidate score updating means 204 exceeds a predetermined value, no more than a predetermined number of name IDs to present to the user together with their scores, and fixes the presentation candidates (step S2012).
- The search history storage unit 202 stores the input character string together with the name IDs and scores of the presentation candidates extracted by the candidate determination unit 205 (step S2013).
- The candidate presentation unit 206 refers to the name information dictionary 101, acquires the presentation content, such as the names corresponding to the name IDs extracted for presentation by the candidate determination unit 205, and presents it to the user (step S2014).
- In Embodiment 2, a narrowing-down dictionary is generated according to the search history, taking the number of candidates into account. A recognition dictionary for the limited names is generated dynamically only when the targets are limited, so recognition accuracy improves without requiring a large processing time.
- When the number of candidates is large, generating a recognition dictionary takes time while the benefit of limiting recognition to the narrowing candidates is relatively small, so no narrowing-down recognition dictionary is generated.
- FIG. 13 is a functional block diagram showing the configuration of the search device according to Embodiment 3 of the present invention.
- The search device according to the third embodiment adds a narrowing-down recognition dictionary adaptation means 401 to the search device of the second embodiment.
- Components identical to those of the second embodiment are given the same reference numerals as in FIG. 9, and their description is omitted or simplified.
- The voice input unit 301 receives the user's voice input, recognizes the speech by referring to the recognition dictionary, and outputs a character string.
- If there is no search history, the large vocabulary recognition dictionary 103 is referred to as the recognition dictionary. If there is a search history, the recognition dictionary output from either the narrowing-down recognition dictionary generation means 302 or the narrowing-down recognition dictionary adaptation means 401, as chosen by the narrowing-down method selection means 203, is referred to.
- The narrowing-down recognition dictionary adaptation means 401 refers to the input character strings of the search history according to the instruction of the narrowing-down method selection means 203 and adapts, for narrowing down, the probabilities of the words or word strings given by the large vocabulary recognition dictionary 103.
- the recognition dictionary is a bigram language model
- The adaptation described above corrects part of the probabilities of the built-in large vocabulary recognition dictionary based on the search history character strings held in the search history storage means 202. Although the accuracy gain from narrowing down is smaller than with dictionary re-creation, the adaptation can be performed with a fixed amount of computation regardless of the number of search result candidates.
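- As a rough illustration of adapting bigram probabilities from the search history, the sketch below boosts the probability of words that appear in past inputs and renormalizes per preceding word. The boosting rule, the `boost` factor, and the name `adapt_bigrams` are assumptions for illustration; the source does not specify the correction formula.

```python
def adapt_bigrams(probs, history_words, boost=2.0):
    """Boost P(w2 | w1) for w2 seen in the search history, then renormalize per w1."""
    scaled = {(w1, w2): p * (boost if w2 in history_words else 1.0)
              for (w1, w2), p in probs.items()}
    totals = {}
    for (w1, _), p in scaled.items():
        totals[w1] = totals.get(w1, 0.0) + p
    # Dividing by the per-w1 total keeps each conditional distribution summing to 1.
    return {(w1, w2): p / totals[w1] for (w1, w2), p in scaled.items()}

# Hypothetical fragment of a large vocabulary bigram model.
base = {("START", "kawasaki"): 0.5, ("START", "yokohama"): 0.5}
adapted = adapt_bigrams(base, {"kawasaki"})
print(adapted)
```

- The cost of this sketch depends on the size of the language model and the history, not on the number of search result candidates.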
- FIG. 14 is a flowchart showing the search processing operation of the search device according to the third embodiment.
- The narrowing-down recognition dictionary generation unit 302 refers to the states of the search history storage unit 202 and the narrowing-down method selection unit 203 and checks whether the current process narrows down by restricting the candidates (step S3001).
- If so, the narrowing-down recognition dictionary generating unit 302 refers to the name information dictionary 101 and the search history storage unit 202, generates a recognition dictionary that accepts the expressions that can appear in the target candidates, and sets it as the recognition dictionary of the voice input unit 301 (step S3002). Otherwise, the narrowing-down recognition dictionary adaptation unit 401 reads the large vocabulary recognition dictionary 103, adapts the word chain probabilities of the recognition dictionary based on the character strings recorded in the search history, and sets the result as the adapted narrowing-down recognition dictionary of the voice input unit 301 (step S3003).
- The voice input means 301 recognizes the user's utterance based on the recognition dictionary that has been set and acquires a recognition result character string (step S3004).
- The candidate score update unit 204 first checks whether there is an input history in the search history storage unit 202, that is, whether the history count h is 1 or more (step S3005). If the history count is 0, the aggregation flag is set for all candidates, their scores are cleared to 0, and the process proceeds to step S3012.
- The narrowing-down method selection unit 203 refers to at least one of the total length of the input character strings stored in the input history, the number of candidates in the final history, and the candidate score distribution of the final history, and selects a narrowing-down method from (1) re-search based on past input, that is, recalculating the scores of the table for aggregation, and (2) limiting the search target to upper candidates, that is, restricting the search to the candidates held by the search history storage means 202 (step S3006). If the scores are to be recalculated, the process proceeds to step S3007. If the search is to be limited to the held candidates, the process proceeds to step S3010.
- In the recalculation case, the aggregation flag is set for all candidates in the aggregation table, and the scores are recalculated with reference to the past history.
- The history number i to be referenced is set to 1 (step S3007).
- The candidate score update unit 204 reads the index entries for the partial character strings of the input character string contained in the history information S[i] and adds to the score of each candidate (step S3008). If the referenced history number i is smaller than the history count h, i is incremented by 1 and the process returns to step S3008. Otherwise, the process proceeds to step S3011 (step S3009). As a result, each candidate name ID in the tabulation table receives a score that reflects all histories.
- In the limiting case, the candidate score update unit 204 sets the aggregation flag for the name IDs held in the latest search history S[h] and updates their scores (step S3010).
- The candidate score update unit 204 obtains the partial character strings of the character string acquired from the voice input unit 301, looks them up in the search index 102, and adds the corresponding scores (step S3011).
- The candidate determination means 205 extracts from the tabulation table, from among the candidates whose score acquired by the candidate score updating means 204 exceeds a predetermined value, no more than a predetermined number of name IDs to present to the user together with their scores, and fixes the presentation candidates (step S3012).
- The search history storage unit 202 stores the input character string together with the name IDs and scores of the presentation candidates extracted by the candidate determination unit 205 (step S3013).
- The candidate presenting means 206 refers to the name information dictionary 101, acquires the presentation content, such as the names corresponding to the name IDs selected for presentation by the candidate determining means 205, and presents it to the user (step S3014).
- a narrowing-down speech recognition dictionary limited to the target candidates is generated, and when the number of candidates is large, the large vocabulary recognition dictionary 103 is used. Is adapted based on the input of the search history. Since a narrowing-down recognition dictionary matched to the narrowing target is used, the recognition accuracy is improved as compared with the case of referring to the large vocabulary recognition dictionary without requiring a large amount of processing time.
- the search device according to the present invention is applied to a text and facility name search device, and may be particularly suitable for a relatively small search device incorporated in another device.
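The dictionary choice in steps S3002 and S3003 can be illustrated with a minimal Python sketch. It assumes that names are already split into words and that a recognition dictionary can be approximated by a plain set of words; the function names `build_narrowing_vocabulary` and `choose_recognition_vocabulary` and all data are hypothetical, not taken from the patent.

```python
def build_narrowing_vocabulary(candidate_ids, name_words):
    """Collect every word that can appear in the target candidates' names (cf. S3002)."""
    vocabulary = set()
    for name_id in candidate_ids:
        vocabulary.update(name_words[name_id])
    return vocabulary


def choose_recognition_vocabulary(limited_to_held_candidates, candidate_ids,
                                  name_words, large_vocabulary):
    """Return the vocabulary the recognizer would use for the next utterance."""
    if limited_to_held_candidates:
        # Narrowing within held candidates: a small dictionary is enough.
        return build_narrowing_vocabulary(candidate_ids, name_words)
    # Otherwise fall back to the large vocabulary. Adapting its word chain
    # probabilities (cf. S3003) is outside the scope of this sketch.
    return set(large_vocabulary)


# Hypothetical name dictionary: name ID -> words of the name.
name_words = {1: ["tokyo", "tower"], 2: ["tokyo", "dome"], 3: ["osaka", "dome"]}
large = ["tokyo", "osaka", "tower", "dome", "castle"]
small = choose_recognition_vocabulary(True, [1, 2], name_words, large)
```

The smaller the vocabulary, the fewer competing hypotheses the recognizer has to consider, which is the accuracy and speed benefit the text describes.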
Description
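However, once a candidate is excluded it never becomes a candidate again, so candidates must be kept from dropping out. For example, when facility names in Tokyo are the search target, the input "Tokyo" (東京) produces an enormous number of candidates. In this case, although it is difficult for the user to check every candidate, the search history contains a large number of candidates, which raises the management cost. In addition, if there is an upper limit on the number of candidates that can be held, candidates may drop out. Allowing for repeated narrowing and for cancellation of a narrowing step requires storing the history of multiple searches, which raises the management cost further.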
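The search device consists of: input means that accepts user input and outputs a search request;
search history storage means that stores a search history containing the input content from the input means and a candidate list;
narrowing-down method selection means that, according to the content of the search history stored in the search history storage means at the time of the search request, selects a narrowing-down method from two methods: limiting the search target to the top candidates, and re-searching based on past input;
candidate score update means that sets the search candidates and their scores from the search history according to the selected narrowing-down method, and updates the candidate scores by referring to the search index based on the character string accepted from the input means;
candidate determination means that determines the candidates to present based on the number of candidates and the score distribution updated by the candidate score update means; and
candidate presenting means that presents the candidates determined by the candidate determination means to the user by referring to the name information data.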
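As a rough illustration of the narrowing-down method selection means, the following Python sketch chooses between the two methods from the stored history. The thresholds `min_total_length` and `max_candidates`, the decision rule itself, and all names are assumptions made for this example; the text only states which quantities the selection may refer to.

```python
from dataclasses import dataclass, field
from enum import Enum


class NarrowingMethod(Enum):
    RESEARCH_FROM_PAST_INPUT = "re-search"    # recompute scores from all past inputs
    LIMIT_TO_TOP_CANDIDATES = "limit-to-top"  # search only within the held candidates


@dataclass
class HistoryEntry:
    input_text: str                                 # input string of one search
    candidates: dict = field(default_factory=dict)  # name ID -> score


def select_narrowing_method(history, min_total_length=4, max_candidates=100):
    """Pick a narrowing method from the history (thresholds are hypothetical)."""
    total_length = sum(len(entry.input_text) for entry in history)
    last_count = len(history[-1].candidates)
    # Assumed rule: a short total input or a very long final candidate list
    # suggests that valid candidates may have been cut off, so search again.
    if total_length < min_total_length or last_count >= max_candidates:
        return NarrowingMethod.RESEARCH_FROM_PAST_INPUT
    return NarrowingMethod.LIMIT_TO_TOP_CANDIDATES


broad = [HistoryEntry("東京", {i: 1.0 for i in range(150)})]
narrow = [HistoryEntry("東京タワー", {1: 3.0, 2: 2.0})]
```

With these assumed thresholds, the broad query "東京" leads to a re-search, while the more specific history is narrowed within the held candidates.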
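Embodiment 1.
FIG. 1 shows the overall configuration of the search device assumed by this invention. The input unit 10 accepts input such as text or speech and, as necessary, refers to the large vocabulary speech recognition dictionary 103 to convert it into a form that the search unit 20 can accept. The search unit 20 performs a fuzzy search by referring to the search index 102. The presentation unit 30 refers to the name information dictionary 101 and presents the names and supplementary information of the search results from the search unit 20 to the user.
The search device comprises the name information dictionary 101, the search index 102, input means 201 (an example of a component of the input unit 10), search history storage means 202, narrowing-down method selection means 203, candidate score update means 204, candidate determination means 205, and candidate presenting means 206 (an example of a component of the presentation unit 30).
FIG. 7 is a flowchart showing the search processing operation of the search device according to Embodiment 1. Here it is assumed that h search history entries S[i] (i = 1..h) are stored in the search history storage means 202.
The criteria by which the narrowing-down method selection means 203 of FIG. 2 selects a narrowing-down method are described next.
FIG. 8 plots the search result candidates for inputs (A) and (B), with rank on the X axis and score on the Y axis. A threshold is set on the basis of candidate validity. An upper limit on the number of candidates presented at once is also set to ensure adequate responsiveness.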
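The combination of a score threshold and an upper limit on the number of presented candidates can be sketched as follows. This is a minimal Python illustration; the function name `decide_candidates`, the tie-breaking by name ID, and the sample values are assumptions, not part of the described device.

```python
def decide_candidates(scores, score_threshold, max_presented):
    """Return (name_id, score) pairs above the threshold, best first, capped in number."""
    passing = [(name_id, score) for name_id, score in scores.items()
               if score > score_threshold]
    # Sort by descending score; ties are broken by name ID (an assumption).
    passing.sort(key=lambda pair: (-pair[1], pair[0]))
    return passing[:max_presented]


# Hypothetical aggregation table: name ID -> score.
table = {1: 5.0, 2: 3.0, 3: 1.0, 4: 4.0}
presented = decide_candidates(table, score_threshold=2.0, max_presented=2)
```

Here candidate 3 falls below the threshold and candidate 2 is cut by the upper limit, so only candidates 1 and 4 would be presented.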
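FIG. 9 is a functional block diagram showing the configuration of the search device according to Embodiment 2 of this invention. The search device of Embodiment 2 adds narrowing-down recognition dictionary generating means 302 to the search device of Embodiment 1. The input is assumed to be speech. In the following, components identical to those of Embodiment 1 are given the same reference numerals as in FIG. 2, and their description is omitted or simplified.
Non-Patent Literature 5: Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, "SPOKEN LANGUAGE PROCESSING -A guide to Theory, Algorithm and System Development-", Prentice Hall.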
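When the narrowing-down method selected by the narrowing-down method selection means 203 limits the search to the candidates stored in the search history storage means 202, the narrowing-down recognition dictionary generating means 302 acquires the name information corresponding to the target name IDs and generates a narrowing-down dictionary from the name information dictionary 101.
In the case of narrowing down limited to the candidates held by the search history storage means 202, the narrowing-down recognition dictionary generating means 302 refers to the name information dictionary 101 and the search history storage means 202, generates a recognition dictionary that can accept the expressions that may appear in the target candidates, and sets it as the recognition dictionary of the voice input means 301 (step S2002).
Otherwise, the voice input means 301 reads the large vocabulary speech recognition dictionary 103 (step S2003).
In response to the search request, the candidate score update means 204 first checks whether the search history storage means 202 holds any search history (whether the history count h is 1 or more) (step S2005). If the history count is 0, it sets the search-target aggregation flag for all candidates in the aggregation table, clears the scores to 0, and proceeds to step S2012.
Next, the candidate score update means 204 reads the partial character string index of the search index for the input character string contained in the history information S[i] and adds the score for each candidate (step S2008).
If the referenced history number i is smaller than the history count h, i is incremented by 1 and the process returns to step S2008. Otherwise, the process proceeds to step S2011 (step S2009). As a result, a score that takes all of the history into account is assigned to each candidate name ID in the aggregation table.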
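The loop over the history entries S[1..h] in steps S2008 and S2009 can be sketched in Python as below. The sketch assumes that the partial character string index is a character-bigram inverted index and that each matching bigram adds 1.0 to a candidate's score; both are illustrative assumptions, and the helper names are hypothetical.

```python
from collections import defaultdict


def char_bigrams(text):
    """Split a string into overlapping 2-character substrings (assumed index unit)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def build_index(names):
    """Map each bigram to the set of name IDs whose name contains it."""
    index = defaultdict(set)
    for name_id, name in names.items():
        for gram in char_bigrams(name):
            index[gram].add(name_id)
    return index


def rescore_from_history(index, history_inputs):
    """Accumulate candidate scores over all past inputs S[1..h]."""
    scores = defaultdict(float)
    for text in history_inputs:          # corresponds to i = 1 .. h
        for gram in char_bigrams(text):  # partial strings of the input
            for name_id in index.get(gram, ()):
                scores[name_id] += 1.0   # assumed unit score per match
    return dict(scores)


names = {1: "東京タワー", 2: "東京ドーム", 3: "大阪ドーム"}
index = build_index(names)
scores = rescore_from_history(index, ["東京", "ドーム"])
```

Because every past input is replayed against the full index, a candidate that an earlier cut-off removed from the history (here name ID 3 for the input "ドーム") can still receive a score.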
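The candidate score update means 204 acquires the partial character strings used to look up the search index for the character string acquired from the input means 301, and adds scores based on those partial character strings by referring to the search index 102 (step S2011).
The search history storage means 202 stores the input character string together with the name IDs and scores of the presentation candidates extracted by the candidate determination means 205 (step S2013).
The candidate presenting means 206 refers to the name information dictionary 101, acquires presentation content such as the names corresponding to the name IDs extracted for presentation by the candidate determination means 205, and presents it to the user (step S1014).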
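FIG. 13 is a functional block diagram showing the configuration of the search device according to Embodiment 3 of this invention. The search device of Embodiment 3 adds narrowing-down recognition dictionary adaptation means 401 to the search device of Embodiment 2. In the following, components identical to those of Embodiment 2 are given the same reference numerals as in FIG. 9, and their description is omitted or simplified.
The narrowing-down recognition dictionary generating means 302 refers to the states of the search history storage means 202 and the narrowing-down method selection means 203 and checks whether the process is a narrowing down limited to the held candidates (step S3001).
Otherwise, the narrowing-down recognition dictionary adaptation means 401 reads the large vocabulary recognition dictionary 103, adapts the word chain probabilities of the recognition dictionary for narrowing down based on the character strings recorded in the search history, and sets the result as the adapted recognition dictionary of the voice input means 301 (step S3003).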
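The text does not specify how the word chain probabilities are adapted. One common technique, shown here purely as an assumed stand-in, is linear interpolation between the base bigram probabilities and relative frequencies counted from the history. The sketch assumes the history strings are already segmented into words; the function name and the weight are hypothetical.

```python
from collections import Counter, defaultdict


def adapt_bigram_probs(base_probs, history_word_sequences, weight=0.3):
    """Interpolate base word-chain (bigram) probabilities with history frequencies."""
    counts = defaultdict(Counter)
    for words in history_word_sequences:
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1

    adapted = {}
    for prev, dist in base_probs.items():
        # Only successors already in the dictionary are counted, so each
        # adapted distribution keeps the same total mass as the base one.
        total = sum(counts[prev][nxt] for nxt in dist)
        if total == 0:
            adapted[prev] = dict(dist)  # no evidence in the history: unchanged
            continue
        adapted[prev] = {
            nxt: (1 - weight) * p + weight * counts[prev][nxt] / total
            for nxt, p in dist.items()
        }
    return adapted


base = {"tokyo": {"tower": 0.5, "dome": 0.5}}
adapted = adapt_bigram_probs(base, [["tokyo", "tower"]], weight=0.3)
```

In this example the history contains "tokyo tower" once, so the probability of "tower" after "tokyo" rises while the distribution still sums to one.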
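In response to the search request, the candidate score update means 204 first checks whether the search history storage means 202 holds any input history (whether the history count h is 1 or more) (step S3005). If the history count is 0, it sets the search-target aggregation flag for all candidates, clears the scores to 0, and proceeds to step S3012.
Next, the candidate score update means 204 reads the partial character string index for the input character string contained in the history information S[i] and adds the score for each candidate (step S3008).
If the referenced history number i is smaller than the history count h, i is incremented by 1 and the process returns to step S3008. Otherwise, the process proceeds to step S3011 (step S3009). As a result, a score that takes all of the history into account is assigned to each candidate name ID in the aggregation table.
The candidate score update means 204 acquires the partial character strings used to look up the search index for the character string acquired from the input means 201, and adds scores based on those partial character strings by referring to the search index 102 (step S3011).
The search history storage means 202 stores the input character string together with the name IDs and scores of the presentation candidates extracted by the candidate determination means 205 (step S3013).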
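The stored history (input character string, presented name IDs, and scores) can be pictured as a simple stack of entries. The Python sketch below is an assumed data layout, not the patent's; in particular the `cancel_last` operation is a hypothetical way to support the cancellation of a narrowing step mentioned in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryEntry:
    input_text: str    # input character string of this search
    candidates: tuple  # ((name_id, score), ...) presented for this input


class SearchHistory:
    """Keeps one entry per search so that narrowing can be replayed or undone."""

    def __init__(self):
        self._entries = []

    def store(self, input_text, candidates):
        self._entries.append(HistoryEntry(input_text, tuple(candidates)))

    def cancel_last(self):
        """Drop the latest entry (assumed cancel behaviour); return the new latest."""
        if self._entries:
            self._entries.pop()
        return self._entries[-1] if self._entries else None

    def inputs(self):
        return [entry.input_text for entry in self._entries]

    def __len__(self):
        return len(self._entries)  # the history count h


history = SearchHistory()
history.store("東京", [(1, 1.0), (2, 1.0)])
history.store("ドーム", [(2, 3.0)])
```

Storing only the presented candidates per step keeps each entry small, which relates to the management cost concern raised in the description.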
Claims (3)
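- A search device comprising:
input means that accepts user input and outputs a search request;
search history storage means that stores a search history containing the input content from the input means and a candidate list;
narrowing-down method selection means that, according to the content of the search history stored in the search history storage means at the time of the search request, selects a narrowing-down method from two methods: limiting the search target to the top candidates, and re-searching based on past input;
candidate score update means that sets the search candidates and their scores from the search history according to the selected narrowing-down method, and updates the candidate scores by referring to the search index based on the character string accepted from the input means;
candidate determination means that determines the candidates to present based on the number of candidates and the score distribution updated by the candidate score update means; and
candidate presenting means that presents the candidates determined by the candidate determination means to the user by referring to the name information data.
- The search device according to claim 1, further comprising:
a large vocabulary recognition dictionary for speech recognition; and
narrowing-down recognition dictionary generating means that generates a narrowing-down recognition dictionary based on the name information of the target candidates when the narrowing-down method selection means has selected the method of limiting to the top candidates,
wherein the input means receives speech, performs speech recognition using the narrowing-down recognition dictionary when the narrowing-down method selection means has selected the method of limiting to the top candidates and using the large vocabulary recognition dictionary otherwise, and outputs text.
- The search device according to claim 2, further comprising:
narrowing-down recognition dictionary adaptation means that, when the narrowing-down method selection means has selected re-search based on past input, modifies the large vocabulary recognition dictionary based on the search history so as to adapt it to the expected narrowing-down utterances, thereby producing an adapted recognition dictionary,
wherein the input means receives speech, reads the narrowing-down recognition dictionary or the adapted recognition dictionary according to the narrowing-down method selection means, recognizes the speech, and outputs text.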
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP10761487A EP2418589A4 (en) | 2009-04-06 | 2010-02-09 | SEARCH DEVICE |
JP2011508269A JP5300974B2 (ja) | 2009-04-06 | 2010-02-09 | Search device |
US13/255,517 US20110320464A1 (en) | 2009-04-06 | 2010-02-09 | Retrieval device |
CN201080015020.6A CN102365639B (zh) | 2009-04-06 | 2010-02-09 | Retrieval device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009092138 | 2009-04-06 | ||
JP2009-092138 | 2009-04-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010116785A1 true WO2010116785A1 (ja) | 2010-10-14 |
Family
ID=42936074
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/051874 WO2010116785A1 (ja) | Search device | 2009-04-06 | 2010-02-09 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20110320464A1 (ja) |
EP (1) | EP2418589A4 (ja) |
JP (1) | JP5300974B2 (ja) |
CN (1) | CN102365639B (ja) |
WO (1) | WO2010116785A1 (ja) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5083367B2 (ja) * | 2010-04-27 | 2012-11-28 | カシオ計算機株式会社 | Search device, search method, and computer program |
US8805828B1 (en) * | 2012-01-13 | 2014-08-12 | Google Inc. | Providing information regarding prior searches |
US8918408B2 (en) * | 2012-08-24 | 2014-12-23 | Microsoft Corporation | Candidate generation for predictive input using input history |
CN103077718B (zh) * | 2013-01-09 | 2015-11-25 | 华为终端有限公司 | Speech processing method, system, and terminal |
JP6064629B2 (ja) * | 2013-01-30 | 2017-01-25 | 富士通株式会社 | Voice input/output database search method, program, and device |
JP5951105B2 (ja) * | 2013-03-04 | 2016-07-13 | 三菱電機株式会社 | Search device |
JP2014229272A (ja) * | 2013-05-27 | 2014-12-08 | 株式会社東芝 | Electronic device |
US10452695B2 (en) * | 2017-09-22 | 2019-10-22 | Oracle International Corporation | Context-based virtual assistant implementation |
CN107731229B (zh) * | 2017-09-29 | 2021-06-08 | 百度在线网络技术(北京)有限公司 | Method and apparatus for recognizing speech |
CN109840062B (zh) * | 2017-11-28 | 2022-10-28 | 株式会社东芝 | Input assistance device and recording medium |
CN111274265B (zh) * | 2020-01-19 | 2023-09-19 | 支付宝(杭州)信息技术有限公司 | Method and device for fused retrieval based on multiple retrieval modes |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02189680A (ja) * | 1989-01-18 | 1990-07-25 | Nec Corp | Information retrieval system |
JPH0528190A (ja) * | 1991-07-19 | 1993-02-05 | Hitachi Ltd | Terminal device for information retrieval |
JP2001282285A (ja) * | 2000-03-31 | 2001-10-12 | Matsushita Electric Ind Co Ltd | Speech recognition method and speech recognition device, and program specifying device using the same |
JP2001357064A (ja) * | 2001-04-09 | 2001-12-26 | Toshiba Corp | Information sharing support system |
JP3665112B2 (ja) | 1995-09-26 | 2005-06-29 | 新日鉄ソリューションズ株式会社 | Character string search method and device |
JP2008262279A (ja) | 2007-04-10 | 2008-10-30 | Mitsubishi Electric Corp | Voice search device |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5835667A (en) * | 1994-10-14 | 1998-11-10 | Carnegie Mellon University | Method and apparatus for creating a searchable digital video library and a system and method of using such a library |
US6757718B1 (en) * | 1999-01-05 | 2004-06-29 | Sri International | Mobile navigation of network-based electronic information using spoken input |
DE102005030967B4 (de) * | 2005-06-30 | 2007-08-09 | Daimlerchrysler Ag | Verfahren und Vorrichtung zur Interaktion mit einem Spracherkennungssystem zur Auswahl von Elementen aus Listen |
US20090287626A1 (en) * | 2008-05-14 | 2009-11-19 | Microsoft Corporation | Multi-modal query generation |
2010
- 2010-02-09 EP EP10761487A patent/EP2418589A4/en not_active Withdrawn
- 2010-02-09 CN CN201080015020.6A patent/CN102365639B/zh not_active Expired - Fee Related
- 2010-02-09 WO PCT/JP2010/051874 patent/WO2010116785A1/ja active Application Filing
- 2010-02-09 JP JP2011508269A patent/JP5300974B2/ja not_active Expired - Fee Related
- 2010-02-09 US US13/255,517 patent/US20110320464A1/en not_active Abandoned
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
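JPH02189680A (ja) * | 1989-01-18 | 1990-07-25 | Nec Corp | Information retrieval system |
JPH0528190A (ja) * | 1991-07-19 | 1993-02-05 | Hitachi Ltd | Terminal device for information retrieval |
JP3134204B2 (ja) | 1991-07-19 | 2001-02-13 | 株式会社日立製作所 | Terminal device for information retrieval and information display and input/output method in the terminal device for information retrieval |
JP3665112B2 (ja) | 1995-09-26 | 2005-06-29 | 新日鉄ソリューションズ株式会社 | Character string search method and device |
JP2001282285A (ja) * | 2000-03-31 | 2001-10-12 | Matsushita Electric Ind Co Ltd | Speech recognition method and speech recognition device, and program specifying device using the same |
JP2001357064A (ja) * | 2001-04-09 | 2001-12-26 | Toshiba Corp | Information sharing support system |
JP2008262279A (ja) | 2007-04-10 | 2008-10-30 | Mitsubishi Electric Corp | Voice search device |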
Non-Patent Citations (3)
Title |
---|
LAWRENCE RABINER, BIING-HWANG JUANG, FUNDAMENTALS OF SPEECH RECOGNITION (VOL. 1 & 2, vol. 1, 2 |
See also references of EP2418589A4 |
XUEDONG HUANG, ALEX ACERO, HSIAO-WUEN HON: "SPOKEN LANGUAGE PROCESSING -A guide to Theory, Algorithm and System Development", PRENTICE HALL |
Also Published As
Publication number | Publication date |
---|---|
CN102365639B (zh) | 2014-11-26 |
US20110320464A1 (en) | 2011-12-29 |
CN102365639A (zh) | 2012-02-29 |
JPWO2010116785A1 (ja) | 2012-10-18 |
JP5300974B2 (ja) | 2013-09-25 |
EP2418589A4 (en) | 2012-09-12 |
EP2418589A1 (en) | 2012-02-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 201080015020.6; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10761487; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2011508269; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 13255517; Country of ref document: US |
| | WWE | WIPO information: entry into national phase | Ref document number: 2010761487; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |