WO2021250837A1 - Search device, search method, and recording medium - Google Patents


Info

Publication number
WO2021250837A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
voice data
text data
character string
data
Prior art date
Application number
PCT/JP2020/022971
Other languages
French (fr)
Japanese (ja)
Inventor
秀治 古明地
靖夫 飯村
Original Assignee
NEC Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to PCT/JP2020/022971 priority Critical patent/WO2021250837A1/en
Priority to JP2022530448A priority patent/JP7485030B2/en
Publication of WO2021250837A1 publication Critical patent/WO2021250837A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/60: Information retrieval of audio data
    • G06F 16/63: Querying

Definitions

  • the present invention relates to a search device or the like that generates a search query using text data converted from voice data.
  • in such situations, the identity of the person to be inquired about is confirmed by contacting the main office by voice call via wireless communication.
  • if the database in which the inquiry information is stored can be searched using text data converted from voice data based on the speaker's voice, immediate inquiries can be automated.
  • however, since pronunciation and accent vary depending on the speaker, erroneous conversion may occur when the voice data is converted into text data, and the inquiry may then be made based on the erroneous text data.
  • Patent Document 1 discloses a sequence signal search device for efficiently processing a plurality of search term candidates from sequence signals, such as voice data, that include errors.
  • the apparatus of Patent Document 1 plots the syllable sequence of the speech recognition result of the voice data and the syllable sequence of the search term on a plane based on the distance (similarity) between syllables.
  • the apparatus of Patent Document 1 realizes a search process of voice data by a search term by detecting a straight line on that plane.
  • Patent Document 2 discloses a navigation device that searches for place names and road names based on voice recognition.
  • the device of Patent Document 2 accepts the first character string included in the search target character string and narrows down the candidates for the search target character string. Then, the device of Patent Document 2 extracts the search target character string from the narrowed-down candidates based on voice data input thereafter.
  • with the method of Patent Document 1, it is possible to extract a plurality of search term candidates based on the voice recognition result.
  • however, the method of Patent Document 1 is difficult to apply when the voice data is not known before the search, and is not suitable for an immediate inquiry in which a search term must be extracted from text data based on arbitrary voice data.
  • with the method of Patent Document 2, the certainty of extracting the search target character string can be improved by narrowing down the search target character string candidates in advance.
  • An object of the present invention is to provide a search device or the like capable of generating a plurality of search queries composed of search terms corresponding to search items based on arbitrary voice data.
  • a search device of one aspect of the present invention includes a conversion unit that converts input voice data into text data by voice recognition, and a search unit that extracts a character string corresponding to a search item from the text data, generates, for each search item, search terms related to the character string based on the distance from the extracted character string, and combines the search terms generated for each search item to generate a plurality of search queries.
  • in a search method of one aspect of the present invention, a computer converts input voice data into text data by voice recognition, extracts a character string corresponding to a search item from the text data, generates, for each search item, search terms related to the character string based on the distance from the extracted character string, and combines the search terms generated for each search item to generate a plurality of search queries.
  • the program of one aspect of the present invention causes a computer to execute a process of converting input voice data into text data by voice recognition, a process of extracting a character string corresponding to a search item from the text data, a process of generating, for each search item, search terms related to the character string based on the distance from the extracted character string, and a process of generating a plurality of search queries by combining the search terms generated for each search item.
  • according to the present invention, it is possible to provide a search device or the like that can generate a plurality of search queries composed of search terms corresponding to search items based on arbitrary voice data.
  • This is an example of a table included in a dictionary used by the search device according to the third embodiment to generate search terms.
  • This is another example of a table included in a dictionary used by the search device according to the third embodiment to generate search terms.
  • This is a conceptual diagram showing an example in which the search device according to the third embodiment converts text data for confirming the correctness of a search term into voice data and outputs it.
  • This is a conceptual diagram showing an example in which reply voice data is input to the search device in response to the voice data transmitted from the search device according to the third embodiment.
  • the search device of the present embodiment converts voice data into text data by using voice recognition technology, and recognizes at least one character string from the converted text data.
  • the search device of the present embodiment generates a plurality of search terms related to the at least one recognized character string based on the distance between pronunciations, and generates a plurality of search queries (also referred to as search patterns) including those search terms.
  • characters and symbols such as hiragana, katakana, kanji, and the alphabet may be used instead of the phonemes used by speech recognition.
  • FIG. 1 is a block diagram showing an example of the configuration of the search device 10 of the present embodiment.
  • the search device 10 includes an acquisition unit 11, a first conversion unit 12, a search unit 13, a second conversion unit 18, and an output unit 19.
  • the acquisition unit 11 and the output unit 19 constitute an input / output unit 110.
  • the first conversion unit 12 and the second conversion unit 18 constitute a conversion unit 120.
  • FIG. 1 also shows a database (DB100) connected to the search device 10.
  • the DB 100 is connected to the search unit 13 via a network such as the Internet or an intranet. A plurality of collation data are stored in the DB 100.
  • the search device 10 transmits / receives voice data to / from a radio device (not shown).
  • the radio has a microphone and a speaker.
  • the radio converts the voice input by voice via the microphone into an electric signal (voice data).
  • a radio device transmits a radio signal including voice data by wireless communication in a specific frequency band.
  • a radio signal transmitted from a radio is converted into an electric signal via an antenna, an amplifier, a demodulator, or the like (not shown), and is input to the search device 10 via a network such as the Internet or an intranet.
  • the search device 10 outputs voice data to the radio.
  • although the search device 10 and the radio device are not directly connected, the search device and the radio device will be described below as exchanging voice data.
  • the acquisition unit 11 acquires voice data from the radio.
  • the acquisition unit 11 outputs the acquired voice data to the first conversion unit 12.
  • the first conversion unit 12 converts voice data into text data by voice recognition.
  • the first conversion unit 12 converts voice data into text data by using an algorithm of an acoustic model or a language model.
  • the first conversion unit 12 converts voice data into text data by using a method such as a statistical method or dynamic time warping.
  • the first conversion unit 12 converts voice data into text data by using a technique such as deep learning or a hidden Markov model.
  • the first conversion unit 12 converts voice data into text data using a voice recognition dictionary including an acoustic model, a language model, a pronunciation dictionary, and the like.
  • the first conversion unit 12 calculates a speech recognition score (also referred to as a score) by text analysis for a character string (word) included in the text data.
  • the first conversion unit 12 converts voice data into text data based on the score in voice recognition.
  • the voice recognition method described here is an example, and does not limit the conversion method from voice data to text data by the first conversion unit 12.
  • the search unit 13 recognizes at least one character string for each search item from the text data converted by the first conversion unit 12. For example, when the search items are a name, a date of birth, and a registered domicile, the search unit 13 detects character strings that can correspond to those search items. For example, based on a character string that corresponds to a certain search item recognized from the text data, the search unit 13 treats character strings that are highly likely to appear before and after that character string as candidates for the character strings corresponding to the other search items. For example, the search unit 13 selects, as a search term, at least one character string from the search term candidates for each search item based on the speech recognition scores (scores) given to the character strings (words) extracted from the text data. For example, the search unit 13 selects, as the search term, the character string having the highest score among the search term candidates for each search item. The character string having the highest score among the search term candidates for each search item corresponds to the character string of the recognition result.
  • the search unit 13 generates a search term from the recognized character string based on the distance between pronunciations.
  • the distance between pronunciations is the distance between two character strings based on pronunciation.
  • the search unit 13 compares the phoneme strings constituting two character strings, and sets the number of differing phonemes as the distance between pronunciations.
  • the distance between pronunciations is defined in consideration of the order in which phonemes appear in the character string.
  • the search unit 13 generates, as search terms, character strings whose distance between pronunciations from the recognized character string is small.
  • the recognized character string itself is also generated as a search term, because its distance between pronunciations from itself is 0.
  • the distance between pronunciations will be explained below with some examples.
  • the example given below is an example, and does not limit the distance between pronunciations used by the search unit 13 of the present embodiment when generating a search term.
  • the phonemes of Sato are “s”, “a”, “t”, and “o”.
  • the phonemes of Saito are “s”, “a”, “i”, “t”, and “o”. Sato and Saito differ in that Saito has one extra phoneme. That is, for Sato and Saito, the distance between pronunciations is 1 because there is only one excess or missing phoneme.
  • the phonemes of Sato are “s”, “a”, “t”, and “o”.
  • the phonemes of Suzuki are “s”, “u”, “z”, “u”, “k”, and “i”. Sato and Suzuki have three differing phonemes and two excess or missing phonemes, so the distance between pronunciations is 5.
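As a sketch (the patent does not prescribe a specific algorithm), the distance between pronunciations described above behaves like an edit distance over phoneme sequences: substitutions count the differing phonemes, and insertions or deletions count the excess or missing phonemes. A Levenshtein distance over phoneme lists reproduces both examples:

```python
def pronunciation_distance(p1, p2):
    """Levenshtein distance over two phoneme sequences: substitutions
    count differing phonemes, insertions/deletions count excess or
    missing phonemes (an assumed formalization of the text's rule)."""
    m, n = len(p1), len(p2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of p1's first i phonemes
    for j in range(n + 1):
        d[0][j] = j  # insert all of p2's first j phonemes
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if p1[i - 1] == p2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

sato = ["s", "a", "t", "o"]
saito = ["s", "a", "i", "t", "o"]
suzuki = ["s", "u", "z", "u", "k", "i"]
print(pronunciation_distance(sato, saito))   # 1 (one extra phoneme)
print(pronunciation_distance(sato, suzuki))  # 5 (3 substitutions + 2 insertions)
```

A refinement mentioned later in the text, a predefined inter-phoneme distance between phoneme pairs, could replace the fixed substitution cost of 1.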
  • the search unit 13 generates a plurality of search queries using the generated search terms. For example, when the search items are a name, a date of birth, and a registered domicile, the search unit 13 generates a search query that combines the search terms for each of those search items.
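The combination step can be sketched as a Cartesian product over per-item candidate lists; the item names and candidate values below are illustrative assumptions, not data from the patent:

```python
from itertools import product

# Hypothetical search term candidates generated for each search item
candidates = {
    "name": ["Sato", "Kato"],
    "date_of_birth": ["January 1, 1990", "July 1, 1990"],
    "registered_domicile": ["A-ku, Tokyo"],
}

# Each combination of one search term per search item is one search query
search_queries = [dict(zip(candidates, combo))
                  for combo in product(*candidates.values())]
print(len(search_queries))  # 2 * 2 * 1 = 4 queries
```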
  • the search item may include a search item other than the name, date of birth, and address.
  • the search unit 13 searches the DB 100 using at least one of the generated plurality of search queries. For example, the search unit 13 searches the DB 100 using some of the generated search queries that satisfy predetermined criteria. For example, the search unit 13 may search the DB 100 using all of the generated search queries.
  • the DB 100 is constructed in association with the type of collation target (also referred to as the inquiry type).
  • the DB 100 stores a plurality of data (also referred to as collation data) including search items for searching the collation target.
  • the collation data stored in the DB 100 is searched using one of several search items as a key. If at least one piece of the searched collation data matches, the search is a hit. In the present embodiment, a piece of collation data is said to match when all of the search items match.
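A minimal sketch of this matching rule, where a collation record counts as a hit only when every search item in the query matches (the record fields and values are illustrative assumptions):

```python
# Hypothetical collation data stored in the database
collation_db = [
    {"name": "Shibataro", "date_of_birth": "January 1, 1990",
     "registered_domicile": "A-ku, Tokyo"},
]

def search(db, query):
    """Return the records that hit: all search items in the query match."""
    return [rec for rec in db
            if all(rec.get(item) == term for item, term in query.items())]

hits = search(collation_db, {"name": "Shibataro",
                             "date_of_birth": "January 1, 1990",
                             "registered_domicile": "A-ku, Tokyo"})
print(len(hits))  # 1
```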
  • the DB 100 is constructed for each inquiry type.
  • the search unit 13 selects at least one search query according to the score of the key search term included in the search query. For example, the search unit 13 inputs a plurality of search terms generated from the character string that is the recognition result of voice recognition into a voice recognition engine that outputs a score according to the recognition result, and ranks the search terms according to the output scores.
  • the search unit 13 searches the DB 100 using the selected search query. For example, the search unit 13 selects the search query having the highest score of the key search term included in the search query, and searches the DB 100 using the selected search query.
  • FIG. 2 is an example (table 131) of a table in which the search terms for the surname included in the name, which is a search item, are ranked according to the score. For example, the search unit 13 selects a search query according to the score of the surname search term included in the search query.
  • the search unit 13 generates search terms with scores based on the recognition results of voice recognition for a plurality of search terms. For example, in a certain voice recognition, it is assumed that the recognition results of "Sato (0.41)" and "Kato (0.65)" are obtained for the surname (scores in parentheses). Regarding the date of birth, it is assumed that the recognition results of "January 1, 1990 (0.56)" and "July 1, 1990 (0.92)" are obtained (scores in parentheses). Further, it is assumed that the recognition result of "Tokyo A-ku (0.43)" is obtained for the registered domicile (score in parentheses). For example, the search unit 13 generates the following text data.
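Ranking scored candidates per search item and taking the top one as the recognition result can be sketched as follows, using the scores assumed in the example above:

```python
# Recognition candidates with speech recognition scores (example values)
candidates = {
    "surname": [("Sato", 0.41), ("Kato", 0.65)],
    "date_of_birth": [("January 1, 1990", 0.56), ("July 1, 1990", 0.92)],
    "registered_domicile": [("Tokyo A-ku", 0.43)],
}

# For each search item, rank the candidates by score and pick the best
best = {item: max(cands, key=lambda c: c[1])[0]
        for item, cands in candidates.items()}
print(best["surname"])        # Kato
print(best["date_of_birth"])  # July 1, 1990
```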
  • the above text data is converted into voice data by the second conversion unit 18, and is output from the output unit 19 to a radio (not shown).
  • the above text data converted into voice data may be output in an order according to the score of any one search item, or may be output in an order according to the total value of the scores of all the search items.
  • the second conversion unit 18 acquires text data from the search unit 13.
  • the second conversion unit 18 converts the acquired text data into voice data.
  • the second conversion unit 18 converts the text data into speech data by using a rule-based synthesis method such as formant speech synthesis or articulatory speech synthesis.
  • the second conversion unit 18 converts text data into speech data by using a waveform-concatenation speech synthesis method such as unit selection speech synthesis, diphone speech synthesis, or limited-domain speech synthesis.
  • the second conversion unit 18 converts text data into speech data by using a statistical parametric speech synthesis method such as neural network speech synthesis or hidden Markov model speech synthesis.
  • the speech synthesis method described here is an example, and does not limit the speech synthesis method used by the second conversion unit 18. Further, if the first conversion unit 12 can also convert text data into voice data, the second conversion unit 18 may be omitted.
  • the output unit 19 outputs the voice data converted by the second conversion unit 18.
  • the voice data output from the output unit 19 is transmitted to the radio, and is output as voice by the radio.
  • the output unit 19 outputs voice data based on the collation data that hits the search in an order according to the accuracy of the search query.
  • FIG. 3 is a conceptual diagram showing an example in which the search device 10 generates a plurality of search queries using voice data acquired from a radio device.
  • the search device 10 acquires the voice data "Mr. Shibataro, born on January 1, 1990, registered domicile address A-ku, Tokyo".
  • the search device 10 recognizes the character strings for each search item such as "Shibataro", "January 1, 1990", and "A-ku, Tokyo” from the acquired voice data by voice recognition.
  • the search device 10 generates a plurality of search terms related to the recognized character string based on the distance between pronunciations.
  • the search device 10 generates a search query that combines a plurality of generated search terms.
  • the search device 10 generates a plurality of search queries such as (Shibataro, January 1, 1990, A-ku, Tokyo), (Shibataro, July 1, 1990, A-ku, Tokyo), and so on.
  • FIG. 4 is an example of collation data (collation table 101) stored in the DB 100.
  • the collation table 101 stores collation data including search items (name, date of birth, registered domicile, etc.).
  • the collation table 101 stores the collation data of a person named "Shibataro" whose date of birth is January 1, 1990 and whose registered domicile is A-ku, Tokyo.
  • in this case, the person named "Shibataro" is hit by the search.
  • FIG. 5 is an example in which the search device 10 converts text data corresponding to the search result of the DB 100 into voice data and outputs the data.
  • the search device 10 acquires the search result "Shibataro, January 1, 1990, A-ku, Tokyo, XX" from the DB 100.
  • "XX" is the inquiry type of the person who is hit by the search.
  • the search device 10 generates text data such as "Registered domicile address A-ku, Tokyo, Mr. Shibataro, born on January 1, 1990, corresponds to XX" from the acquired search results.
  • the search device 10 converts the generated text data into voice data.
  • the search device 10 outputs the converted voice data to the radio.
  • the voice data output from the search device 10 is output as voice in a radio (not shown).
  • the search device 10 may generate text data for inquiring whether the search term is correct or not by using at least a part of the search terms related to the character string recognized in the text data.
  • FIG. 6 is an example of converting text data for confirming the correctness of a search term into voice data and outputting it.
  • the search device 10 outputs voice data based on text data that contains the search term having the highest score and asks again about the search term that could not be voice-recognized.
  • the search device 10 outputs voice data having the content "Search by Kibataro, registered address A-ku, Tokyo. Please give me your date of birth again.”
  • audio data is output to a radio.
  • the voice data output from the search device 10 is output as voice for confirming the correctness of the search term in a radio (not shown).
  • FIG. 7 is an example in which, in the example of FIG. 6, response voice data is returned from a radio (not shown) or the like with respect to the voice data transmitted from the search device 10.
  • the search device 10 acquires the voice data of the reply "The name is Shibataro and the date of birth is January 1, 1990".
  • the search device 10 obtains the recognition results of "Shibataro" and "January 1, 1990" from the acquired voice data by voice recognition.
  • the search device 10 generates a search term according to the recognition result, generates a search query including the generated search term, and generates the text data of a response. If the correctness of a search term can be confirmed with the sender of the voice data before or at the time of generating the search query, the generation of erroneous search queries can be reduced.
  • FIG. 8 is a flowchart for explaining an example of generating a search query by the search device 10.
  • in the description according to the flowchart of FIG. 8, the search device 10 is the main operating body.
  • the search device 10 acquires voice data (step S111).
  • the search device 10 acquires voice data output from a radio (not shown).
  • the search device 10 converts the acquired voice data into text data by voice recognition (step S112).
  • the search device 10 extracts the character string corresponding to the search item from the text data (step S113).
  • the search device 10 generates a search term related to the extracted character string for each search item based on the distance between pronunciations (step S114).
  • the search device 10 generates a plurality of search queries that combine search terms for each search item (step S115).
  • FIG. 9 is a flowchart for explaining an example in which the search device 10 confirms the correctness of the search term.
  • the flowchart of FIG. 9 shows a process following step S114 of the flowchart of FIG. 8. In the example of FIG. 9, it is assumed that the search terms are given scores or rankings. In the description according to the flowchart of FIG. 9, the search device 10 is the main operating body.
  • the search device 10 generates text data for confirming the correctness of the search term having the maximum score (step S121).
  • the search device 10 converts the generated text data into voice data and outputs the converted voice data (step S122).
  • the voice data output from the search device 10 is output as voice in a radio (not shown).
  • the search device 10 acquires the voice data of the response and converts it into text data by voice recognition (step S123). For example, the search device 10 acquires voice data from a radio (not shown). For example, if the search device 10 does not obtain a response for a predetermined period, the voice data may be retransmitted or the process may proceed to step S125.
  • the search device 10 generates a plurality of search queries that combine the search terms for each search item (step S126).
  • when the search term is incorrect (No in step S124), the search device 10 generates text data for confirming the correctness of the search term having the next highest score (step S125). After step S125, the process returns to step S122. The series of processes from step S122 to step S125 is continued until it is confirmed that the search term is correct. If it cannot be confirmed that a search term is correct even after repeating the series of processes of steps S122 to S125 a predetermined number of times or for a predetermined time, the process may proceed to step S126 or return to step S113 of FIG. 8.
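The confirmation loop of steps S121 to S125 can be sketched as follows; the `ask` callback is a hypothetical placeholder for the voice round trip (the real device synthesizes a question, sends it to the radio, and voice-recognizes the reply):

```python
def confirm_search_term(ranked_terms, ask, max_attempts=3):
    """Ask about candidates in descending score order until one is
    confirmed correct (steps S121-S125). `ask` stands in for the voice
    round trip and returns True if the speaker confirms the term."""
    for attempt, term in enumerate(ranked_terms):
        if attempt >= max_attempts:
            break  # give up after a predetermined number of tries
        if ask(term):
            return term  # confirmed: proceed to query generation (S126)
    return None  # unconfirmed: fall back (e.g., re-extract at S113)

# Example: the speaker rejects the top candidate and confirms the second
terms = ["Kibataro", "Shibataro"]
result = confirm_search_term(terms, ask=lambda t: t == "Shibataro")
print(result)  # Shibataro
```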
  • FIG. 10 is a flowchart for explaining an example in which the search device 10 asks again about an unrecognized search item.
  • in the description according to the flowchart of FIG. 10, the search device 10 is the main operating body.
  • the search device 10 acquires voice data (step S131).
  • the search device 10 acquires voice data transmitted from a radio (not shown).
  • the search device 10 converts the acquired voice data into text data by voice recognition (step S132).
  • the search device 10 extracts the character string corresponding to the search item from the text data (step S133).
  • when a search item is insufficient (Yes in step S134), the search device 10 outputs voice data for asking again about the missing search item (step S135). After step S135, the process returns to step S131.
  • the voice data output from the search device 10 is output as voice in a radio (not shown).
  • when no search item is insufficient (No in step S134), the search device 10 generates search terms related to the extracted character strings for each search item based on the distance between pronunciations (step S136).
  • the search device 10 generates a plurality of search queries that combine search terms for each search item (step S137).
  • FIG. 11 is a flowchart for explaining an example in which the search device 10 searches the DB 100 using the generated search query.
  • in the description according to the flowchart of FIG. 11, the search device 10 is the main operating body.
  • the search device 10 searches the DB 100 using the generated search query (step S151).
  • when the search is a hit (Yes in step S152), the search device 10 generates text data including the hit search result (step S153). On the other hand, when the search is not a hit (No in step S152), the search device 10 generates text data indicating that no search result was obtained (step S154).
  • after step S153 or step S154, the search device 10 converts the generated text data into voice data (step S155).
  • the search device 10 outputs voice data (step S156).
  • voice data output from the search device 10 is output as voice in a radio (not shown).
  • the search device of the present embodiment includes an acquisition unit, a first conversion unit, a search unit, a second conversion unit, and an output unit.
  • the acquisition unit inputs voice data.
  • the first conversion unit converts the voice data acquired by the acquisition unit into text data by voice recognition.
  • the search unit extracts the character string corresponding to the search item from the text data.
  • the search unit generates search terms related to the character string for each search item based on the distance from the extracted character string.
  • the search unit generates a plurality of search queries by combining the search terms generated for each search item.
  • the search unit searches a database in which collation data including search items are accumulated.
  • the search unit generates text data according to the search results.
  • the second conversion unit converts the generated text data into voice data.
  • the output unit outputs the voice data converted from the text data.
  • the search unit generates, as a search term, a character string whose distance between pronunciations from the character string extracted from the text data is small, where the distance between pronunciations is based on differences in the phonemes constituting the character strings.
  • the search unit calculates the distance between pronunciations using a predefined inter-phoneme distance between two phonemes.
  • the search unit assigns a score based on voice recognition to the search term, and generates text data including the search term to which the score is assigned.
  • the conversion unit converts the generated text data into voice data, and outputs the voice data converted from the text data.
  • since police officers using radios or the like can input accident information and the like only by voice, it is desirable to reduce the frequency of repeated input and confirmation. Further, in such input, it is desirable to make inquiries about a person or the like quickly and accurately based on the information input by voice.
  • (Second embodiment) Next, the search device according to the second embodiment will be described with reference to the drawings.
  • the present embodiment is different from the first embodiment in that the generated search query is ranked according to the accuracy.
  • FIG. 12 is a block diagram showing an example of the configuration of the search device 20 of the present embodiment.
  • the search device 20 includes an acquisition unit 21, a first conversion unit 22, a search unit 23, a second conversion unit 28, and an output unit 29.
  • the acquisition unit 21 and the output unit 29 constitute an input / output unit 210.
  • the first conversion unit 22 and the second conversion unit 28 constitute a conversion unit 220.
  • FIG. 12 also shows a database (DB200) connected to the search device 20.
  • the DB 200 is connected to the search unit 23 via a network such as the Internet or an intranet.
  • a plurality of collation data are stored in the DB 200. Since the configurations other than the search unit 23 included in the search device 20 are the same as the configurations included in the search device 10 of the first embodiment, detailed description thereof will be omitted. In the following, the description will be focused on the search unit 23.
  • the search unit 23 selects a search query according to the accuracy of the search term.
  • the search unit 23 searches the DB 200 using the selected search query.
  • for example, the accuracy is the sum of the distances between pronunciations of the search terms for each search item, such as the name, date of birth, and registered domicile.
  • the accuracy of the character string (search term) that is itself the recognition result of the voice recognition by the search unit 23 is 0.
  • for the other words ranked based on the score of the recognition result, the distance between pronunciations is used as the accuracy.
  • the accuracy may be weighted for each search item. For example, there are far more variations in names than variations in dates of birth. Therefore, if the weight of the name is increased compared to the date of birth, the search accuracy will be improved.
  • rare surnames are unlikely to be encountered, so they may be excluded from the search terms regardless of their accuracy.
  • the search items are the name (also referred to as the name query), the date of birth (also referred to as the date of birth query), and the registered domicile (also referred to as the registered domicile query).
  • it is assumed that the recognition results of the name query, the date of birth query, and the registered domicile query were "Sato", "January 1, 1990", and "A-ku, Tokyo (Tokyotoake)", respectively.
  • Each recognition result is used as a search term.
  • the accuracy of "Sato", which is the recognition result, is 0.
  • the accuracy of "January 1, 1990 (Heisei ninen ichigatsu tsuitachi)", which is the recognition result, is 0.
  • the accuracy of "A-ku, Tokyo", which is the recognition result, is 0.
  • Search Query 1 “Sato, January 1, 1990, A-ku, Tokyo”
  • Search Query 2 “Kato, January 1, 1990, A-ku, Tokyo”
  • Search Query 3 “Sato, July 1, 1990, A-ku, Tokyo”
  • Search Query 4 “Kato, July 1, 1990, A-ku, Tokyo”
  • Search Query 5 “Sato, January 1, 1990, D-ku, Tokyo”
  • Search Query 6 “Kato, January 1, 1990, D-ku, Tokyo”
  • Search Query 7 “Sato, July 1, 1990, D-ku, Tokyo”
  • Search Query 8 “Kato, July 1, 1990, D-ku, Tokyo”
  • the notation of phonemes and the like of each search term is omitted.
  • Search Query 5: α1 × 0 + α2 × 0 + α3 × 1 = α3
  • Search Query 6: α1 × 1 + α2 × 0 + α3 × 1 = α1 + α3
  • Search Query 7: α1 × 0 + α2 × 1 + α3 × 1 = α2 + α3
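The weighted-sum accuracy above can be sketched as follows. This is a minimal illustration, not the patented implementation; the weights (α1 for name, α2 for date of birth, α3 for registered domicile) and the per-term pronunciation distances are illustrative values with all weights set to 1.

```python
# Hypothetical sketch: the accuracy of a search query is the weighted sum of
# the per-item pronunciation distances. All names and distances are
# illustrative, taken from the example queries in the text.

def query_accuracy(distances, weights):
    """Weighted sum of pronunciation distances over the search items."""
    return sum(w * d for w, d in zip(weights, distances))

weights = (1, 1, 1)  # alpha1 (name), alpha2 (date of birth), alpha3 (domicile)

# (name distance, date-of-birth distance, domicile distance) per query
query5 = (0, 0, 1)  # "Sato, January 1, 1990, D-ku, Tokyo"
query6 = (1, 0, 1)  # "Kato, January 1, 1990, D-ku, Tokyo"
query7 = (0, 1, 1)  # "Sato, July 1, 1990, D-ku, Tokyo"

print(query_accuracy(query5, weights))  # alpha3 -> 1 with unit weights
print(query_accuracy(query6, weights))  # alpha1 + alpha3 -> 2
print(query_accuracy(query7, weights))  # alpha2 + alpha3 -> 2
```

A smaller accuracy means the query is closer to the raw recognition result, so queries are ranked in ascending order of this value.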
  • FIG. 13 is a conceptual diagram for explaining the accuracy of the search query calculated by the search device 20.
  • the voice data "Mr. Shibataro, born on January 1, 1990, registered domicile is A-ku, Tokyo" is input to the search device 20.
  • the weight of each verification item is 1.
  • the search device 20 extracts the character strings "Shibataro”, “Heiseininenichigatsutsuitachi”, and "Tokyotoake” from the text data based on the input voice data for each verification item.
  • the search device 20 generates a search term for each verification item based on the distance between pronunciations. For example, the search device 20 generates search terms such as "Shibataro" having a distance between pronunciations of 0, "Kibataro" having a distance between pronunciations of 1, and so on for a name query.
  • the search device 20 generates search terms such as "Heiseininenichigatsutsuitachi" with a pronunciation distance of 0, "Heiseininenshichigatsutsuitachi" with a pronunciation distance of 1, and so on, for the date of birth query.
  • the search device 20 generates search terms such as "Tokyotoake" having a pronunciation distance of 0, "Tokyotodake" having a pronunciation distance of 1, and so on, for the registered domicile query.
  • the search device 20 generates search queries by combining the plurality of search terms generated for each search item. For example, the search device 20 generates the search query "Shibataro, Heiseininenichigatsutsuitachi, Tokyotoake". The accuracy of this search query is 0, the sum of the pronunciation distances of its search terms. For example, the search device 20 also generates the search query "Kibataro, Heiseininenshichigatsutsuitachi, Tokyotodake". The accuracy of this search query is 3, the sum of the pronunciation distances of its search terms.
  • the search unit 23 searches the DB 200 based on the accuracy of the plurality of generated search queries, and generates the text data that is the source of the voice data to be output to a radio (not shown).
  • FIG. 14 is a flowchart for explaining an example of generating a search query by the search device 20.
  • the search device 20 is the main operating subject in the following description.
  • the search device 20 acquires voice data (step S211).
  • the search device 20 acquires voice data transmitted from a radio (not shown).
  • the search device 20 converts the acquired voice data into text data by voice recognition (step S212).
  • the search device 20 extracts the character string corresponding to the search item from the text data (step S213).
  • the search device 20 generates a search term related to the extracted character string for each search item based on the distance between pronunciations (step S214).
  • the search device 20 generates a plurality of search queries that combine search terms for each search item (step S215).
  • the search device 20 calculates the accuracy of the generated search query based on the distance between pronunciations for each search term (step S216).
  • the search device 20 ranks a plurality of search queries according to the accuracy (step S217).
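Steps S214 through S217 above can be sketched as a small pipeline. This is a hedged illustration only: the per-item candidate terms and their pronunciation distances are hard-coded stand-ins for the recognition output, and unit weights are assumed.

```python
# Hypothetical sketch of steps S214-S217: combine per-item search terms into
# queries (S215), score each query by the sum of pronunciation distances
# (S216, unit weights), and rank the queries by accuracy (S217).
from itertools import product

# per search item: list of (term, pronunciation distance) candidates (S214)
terms = {
    "name": [("Sato", 0), ("Kato", 1)],
    "dob": [("January 1, 1990", 0), ("July 1, 1990", 1)],
    "domicile": [("A-ku, Tokyo", 0), ("D-ku, Tokyo", 1)],
}

queries = []
for combo in product(*terms.values()):           # step S215
    accuracy = sum(d for _, d in combo)          # step S216
    queries.append(([t for t, _ in combo], accuracy))

queries.sort(key=lambda q: q[1])                 # step S217: smallest first
for q, acc in queries[:3]:
    print(acc, q)
```

With two candidates per item this yields the eight queries of the earlier example, with the all-distance-0 combination ranked first.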
  • the search device 20 searches the DB 200 using the generated search query (step S251).
  • When a plurality of search results are hit (Yes in step S252), the search device 20 generates a plurality of text data including the hit search results (step S253). Next, the search device 20 converts each of the generated text data into voice data (step S254). Then, the search device 20 outputs the plurality of voice data in order of the accuracy of the search queries (step S255). For example, the plurality of voice data output from the search device 20 are output as voice by a radio (not shown) in order of the accuracy of the corresponding search queries.
  • When a plurality of search results are not hit (No in step S252), the search device 20 generates text data according to the search result (step S256).
  • the case where a plurality of search results are not hit includes the case where the search result is not hit and the case where only one search result is hit. If the search result is not hit, the search device 20 generates text data indicating that the search result was not obtained. On the other hand, when only one search result is hit, the search device 20 generates text data including the hit search result.
  • the search device 20 converts the generated text data into voice data (step S257). Then, the search device 20 outputs voice data (step S258). For example, the voice data output from the search device 20 is output as voice in a radio (not shown).
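The branching of steps S251 through S258 can be sketched as follows. This is an illustrative sketch, not the claimed implementation: DB access is simulated, hits are given as (accuracy, record) pairs, and the response strings are invented placeholders.

```python
# Hypothetical sketch of steps S251-S258: branch on the number of hits and
# build the text data that will be converted to voice. Multiple hits are
# ordered by the accuracy of the search query that produced them.

def build_responses(hits):
    """hits: list of (query accuracy, record). Returns text data lines."""
    if not hits:                       # no hit (part of step S256)
        return ["No matching record was found."]
    if len(hits) == 1:                 # single hit (part of step S256)
        return [f"1 record found: {hits[0][1]}"]
    # multiple hits (step S253): one text per hit, best accuracy first
    return [f"Candidate ({acc}): {rec}" for acc, rec in sorted(hits)]

print(build_responses([]))
print(build_responses([(0, "Shibataro")]))
print(build_responses([(2, "Kato"), (0, "Sato")]))
```

Each returned line would then pass through the second conversion unit to become voice data, preserving the accuracy ordering.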
  • the search unit of the present embodiment calculates, for each search query, an accuracy that is the sum of the pronunciation distances of its search terms, weighted for each search item.
  • the search unit ranks search queries according to their accuracy.
  • the search unit generates text data including search terms constituting a search query ranked according to accuracy.
  • the conversion unit converts the generated text data into voice data, and outputs the voice data in order of the accuracy of the search queries from which the text data were generated.
  • the search device of this embodiment can output voice data according to the accuracy of the search query. For example, a user who hears the voice data output by the search device can recognize the search result according to the order of the accuracy of the search query.
  • the search device of the present embodiment differs from the first and second embodiments in that search queries are ranked based on a dictionary of distances between pronunciations (also referred to as an inter-pronunciation distance dictionary) for at least one search item.
  • FIG. 16 is a block diagram showing an example of the configuration of the search device 30 of the present embodiment.
  • the search device 30 includes an acquisition unit 31, a first conversion unit 32, a search unit 33, a dictionary 34, a second conversion unit 38, and an output unit 39.
  • the acquisition unit 31 and the output unit 39 constitute an input / output unit 310.
  • the first conversion unit 32 and the second conversion unit 38 constitute a conversion unit 320.
  • FIG. 16 also shows a database (DB300) connected to the search device 30.
  • the DB 300 is connected to the search unit 33 via a network such as the Internet or an intranet. A plurality of collation data are stored in the DB 300.
  • the dictionary 34 is a dictionary (also referred to as an inter-pronunciation distance dictionary) that summarizes the inter-pronunciation distances of character strings corresponding to search items.
  • the dictionary 34 is prepared in advance for each search item. For example, when the search item is the surname of a name, surnames recorded in a national biographical dictionary, a family register, or the like, together with rankings according to the pronunciation distance between surnames, are registered in the dictionary 34. For example, when N surnames are stored (N is a natural number), a ranking according to the distance between pronunciations with respect to the other surnames is registered in the dictionary 34 for each of the N surnames. However, even if the kanji used for a surname are the same, the reading may differ; here, one reading shall be associated with one set of kanji.
  • the dictionary 34 includes a data sequence in which other character strings are arranged in order of order according to the distance between pronunciations for each of the stored character strings.
  • for example, for the three surnames Yamada, Sato, and Kato, the pronunciation distance between Yamada and Sato, the pronunciation distance between Sato and Kato, and the pronunciation distance between Kato and Yamada are defined.
  • the distance between pronunciations between Sato (SATOU) and Kato (KATOU) is 1 because one phoneme is replaced.
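One common way to realize such a "distance between pronunciations" is an edit (Levenshtein) distance over romanized phoneme strings. The patent does not specify the metric, so the following is only a plausible sketch; it reproduces the SATOU/KATOU example, where one substituted phoneme gives a distance of 1.

```python
# Minimal Levenshtein distance over romanized phoneme strings; one possible
# (assumed) realization of the inter-pronunciation distance in the text.

def pronunciation_distance(a, b):
    """Edit distance: substitutions, insertions, and deletions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(pronunciation_distance("SATOU", "KATOU"))  # 1
```

Any metric with this shape would let the dictionary 34 pre-rank neighboring surnames by distance.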
  • FIG. 17 is an example of a table (inter-pronunciation distance dictionary 340) included in the dictionary 34 whose search item is the surname. For example, in the inter-pronunciation distance dictionary 340, Kato is ranked first, Saito is ranked second, and so on. Note that FIG. 17 is a conceptual illustration of the dictionary 34 and does not accurately reflect the ranking of actual surnames.
  • the search unit 33 extracts character strings ranked high in the field of "Sato" as search terms. For example, the search unit 33 extracts the character strings ranked up to the Mth place in the field of "Sato" of the inter-pronunciation distance dictionary 340 as search terms (M is a natural number). For example, the search unit 33 may extract the character strings whose rank according to the inter-pronunciation distance is within the Xth place in the field of "Sato" of the inter-pronunciation distance dictionary 340 as search terms (X is a natural number).
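The top-M selection from the pre-ranked dictionary can be sketched as a simple lookup. The rankings below are invented for illustration, not real surname data; the point is that no distance computation is needed at query time.

```python
# Hypothetical sketch of the inter-pronunciation distance dictionary 340:
# for each stored surname, the other surnames are pre-ranked by pronunciation
# distance, and the search unit takes the top-M entries as search terms.

DICTIONARY = {
    "Sato": ["Kato", "Saito", "Naito"],   # rank 1, 2, 3 (illustrative)
    "Kato": ["Sato", "Saito", "Goto"],
}

def select_search_terms(extracted, m):
    """The extracted string itself plus its top-M ranked neighbors."""
    return [extracted] + DICTIONARY.get(extracted, [])[:m]

print(select_search_terms("Sato", 2))  # ['Sato', 'Kato', 'Saito']
```

Because the ranking is precomputed, this replaces the per-query distance calculation of the earlier embodiments with a constant-time lookup.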
  • the dictionary 34 may include a spelling alphabet.
  • the spelling alphabet is a table that summarizes rules established to prevent mishearing of voice in wireless communication and the like.
  • FIG. 18 is a spelling alphabet 360 including the contents of the spelling alphabet described in Appendix 5 of the Radio Station Operation Regulations. For example, to prevent mishearing of "a", "Asahinoa" is uttered. For example, to prevent mishearing of "shi", "shinbunnoshi" is uttered.
  • the spelling alphabet 360 may include not only characters but also data related to numbers and symbols. Further, the spelling alphabet 360 may include not only Japanese characters but also characters, numbers, and symbols related to European languages such as alphabets.
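Resolving a spelling-alphabet utterance back to its character can be sketched as a table lookup. The entries below are illustrative romanizations of the "Asahinoa" / "shinbunnoshi" readings mentioned above, not the actual contents of Appendix 5.

```python
# Hypothetical sketch of the spelling alphabet 360 as a lookup table mapping
# a spelling-alphabet utterance to the character it disambiguates.

SPELLING_ALPHABET = {
    "asahinoa": "a",        # "a" as in Asahi (illustrative entry)
    "shinbunnoshi": "shi",  # "shi" as in shinbun (illustrative entry)
}

def resolve_spelling(utterance):
    """Return the character the utterance spells, or None if unknown."""
    return SPELLING_ALPHABET.get(utterance.lower())

print(resolve_spelling("Shinbunnoshi"))  # shi
```

In the FIG. 20 scenario, such a lookup would let the search device map "Shinbunnoshi" to "shi" and correct "Kibataro" to "Shibataro".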
  • FIG. 19 is a conceptual diagram showing an example in which the search device 30 listens back to the name recognized by voice recognition.
  • the search device 30 outputs back-listening voice data according to the score and accuracy of the search term.
  • FIG. 20 is an example in which, since the name in the voice data read back in FIG. 19 is incorrect, the correct voice data is returned via a radio (not shown).
  • the search device 30 can refer to the voice data of "Shinbunnoshi" registered in the spelling alphabet 360 and recognize the exact name "Shibataro".
  • FIG. 21 is a flowchart for explaining an example of generating a search query by the search device 30.
  • the search device 30 is the main operating subject in the following description.
  • the search device 30 acquires voice data (step S311).
  • the search device 30 acquires voice data output from a radio (not shown).
  • the search device 30 converts the acquired voice data into text data by voice recognition (step S312).
  • the search device 30 extracts the character string corresponding to the search item from the text data (step S313).
  • the search device 30 refers to the pronunciation distance dictionary for each search item, and selects a search term based on the order according to the pronunciation distance with the extracted character string (step S314).
  • the search device 30 generates a plurality of search queries that combine search terms selected for each search item (step S315).
  • in the inter-pronunciation distance dictionary of the present embodiment, for each of a plurality of character strings corresponding to a search item, a plurality of other character strings corresponding to the search item are ranked according to the distance between pronunciations.
  • the search unit refers to the inter-pronunciation distance dictionary and selects, as a search term, a character string ranked high in the inter-pronunciation distance dictionary with respect to the character string extracted from the text data.
  • since the search terms are selected by referring to the inter-pronunciation distance dictionary, processing such as calculation of the inter-pronunciation distance can be omitted. Therefore, according to the present embodiment, the generation of search terms and search queries can be sped up.
  • FIG. 22 is a block diagram showing an example of the configuration of the search device 40 of the present embodiment.
  • the search device 40 includes an input / output unit 41, a first conversion unit 42, a search unit 43, a registration information recording unit 44, and a second conversion unit 48.
  • the first conversion unit 42 and the second conversion unit 48 constitute a conversion unit 420.
  • FIG. 22 also shows a radio 450 that exchanges voice data with the search device 40, and a database group (DB group 400) connected to the search device 40.
  • Each of the plurality of DBs constituting the DB group 400 is connected to the search unit 43 via a network such as the Internet or an intranet.
  • the DB group 400 includes a plurality of DBs for each inquiry type.
  • a plurality of collation data for each inquiry type is stored in each of the plurality of DBs included in the DB group.
  • the radio 450 exchanges voice data with the search device 40. Although only one radio 450 is shown in FIG. 22, the search device 40 can exchange voice data with a plurality of radios 450. Further, the radio 450 may include a part or all of the configuration of the search device 40. Since the main configuration of the search device 40 is the same as the configuration included in the search device 10 of the first embodiment, detailed description thereof will be omitted. In the following, the explanation will focus on the exchange of voice data between the radio 450 and the search device 40.
  • the input / output unit 41 acquires voice data based on a radio signal transmitted from the radio 450.
  • the input / output unit 41 outputs voice data to the first conversion unit 42. Further, the input / output unit 41 outputs the voice data acquired from the second conversion unit 48.
  • the radio 450 transmits a radio signal including voice data by wireless communication in a specific frequency band.
  • the radio signal transmitted from the radio 450 is converted into an electric signal via an antenna, an amplifier, a demodulator, and the like (not shown), and is input to the input / output unit 41 of the search device 40 via a network such as the Internet or an intranet.
  • the input / output unit 41 outputs the voice data acquired from the second conversion unit 48 toward the radio 450.
  • the registration information of the radio 450 is registered in the registration information recording unit 44.
  • the registration information is a user identifier of a user who uses the radio 450 or a device identifier of the radio 450.
  • the search device 40 exchanges voice data with the radio 450 of the transmission source of the identification information matching the registration information recorded in the registration information recording unit 44.
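The matching against the registration information recording unit 44 can be sketched as a membership check. The identifier values below are invented placeholders; the text only specifies that a user identifier or device identifier must match a recorded entry.

```python
# Hypothetical sketch of the check against the registration information
# recording unit 44: voice data is exchanged only when the transmitted
# identification information matches a recorded entry.

REGISTERED = {"P-1234", "P-5678"}  # illustrative user/device identifiers

def is_authorized(identification):
    """True if the identifier is recorded in the registration information."""
    return identification in REGISTERED

print(is_authorized("P-1234"))  # True
print(is_authorized("P-9999"))  # False
```

An unmatched identifier would trigger the notification and retransmission-request text data described below.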
  • the search unit 43 executes processing according to the content of the text data acquired from the first conversion unit 42.
  • an example will be described in which the text data acquired by the search unit 43 includes identification information, an inquiry type, inquiry information, and the like.
  • according to the content of the acquired text data, the search unit 43 executes processing such as generating text data including response content for the source of the voice data from which the text data was converted, and searching the DBs included in the DB group 400.
  • the text data generated by the search unit 43 is converted into voice data by the second conversion unit 48, and is output from the input / output unit 41 toward the radio 450.
  • the search unit 43 refers to the registration information recording unit 44 and determines whether the identification information is recorded in the registration information recording unit 44.
  • when the identification information is recorded in the registration information recording unit 44, the search unit 43 generates text data for inquiring about the inquiry type. For example, when the identification information is not recorded in the registration information recording unit 44, the search unit 43 generates text data notifying that the identification information does not match and text data instructing retransmission of the identification information.
  • the text data generated by the search unit 43 is converted into voice data by the second conversion unit 48 and output to the radio 450.
  • when the inquiry type is included in the voice data from the radio 450, the search unit 43 generates text data for inquiring about the inquiry content to the sender of the inquiry type. For example, the search unit 43 may generate text data for inquiring about the inquiry content that also includes content confirming the inquiry type.
  • the text data generated by the search unit 43 is converted into voice data by the second conversion unit 48 and output to the radio 450.
  • the search unit 43 extracts the character string corresponding to the search item from the text data.
  • the search unit 43 generates a search term related to the extracted character string based on the distance between pronunciations.
  • according to the extraction status of the character strings corresponding to the search items, the search unit 43 generates text data including content for confirming the search term whose pronunciation distance to the extracted character string is smallest, and text data including content for asking again about a search item that could not be extracted.
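The two kinds of response text (confirmation vs. asking again) can be sketched as a branch on the extraction status. The message wordings are invented placeholders; the text only specifies the distinction between a confirmable closest term and a missing item.

```python
# Hypothetical sketch: per search item, either confirm the closest search
# term or ask again for an item whose character string was not extracted.

def response_texts(extracted):
    """extracted: item -> closest search term, or None if not extracted."""
    texts = []
    for item, term in extracted.items():
        if term is None:
            texts.append(f"Please repeat the {item}.")
        else:
            texts.append(f"Searching with {item} '{term}'. Correct?")
    return texts

print(response_texts({"name": "Shibataro", "date of birth": None}))
```

Each returned line would be converted to voice data by the second conversion unit 48 and sent back toward the radio 450.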
  • FIG. 23 is a sequence diagram showing an example of the flow from the connection to the search device 40 by the radio device 450 to the generation of the search term by the search device 40.
  • the radio 450 connects to the search device 40 (step S411).
  • the connection method of the radio 450 to the search device 40 is not particularly limited.
  • when the search device 40 detects the connection of the radio 450, the search device 40 outputs voice data including a request for identification information to the radio 450 (step S412). For example, when the search device 40 detects the connection of the radio 450, it outputs voice data including a request for identification information, such as "Automatic response. Please give me your P number, affiliation, and name.", to the radio 450.
  • the P number is an identifier for uniquely identifying a police officer.
  • when the radio 450 receives the voice data including the request for identification information, the radio 450 outputs voice data including the identification information input by voice in response to the voice data to the search device 40 (step S413). For example, in response to the request for identification information, information including identification information such as "P number XX, XX station, area XX, XX" is input to the radio 450 by voice. For example, the radio 450 outputs voice data including identification information such as "P number XX, XX station, area XX, XX" to the search device 40.
  • when the search device 40 receives the voice data including the identification information, it confirms whether the identification information included in the voice data is registered in the registration information recording unit 44 (step S414). When the identification information included in the voice data received from the radio 450 is registered in the registration information recording unit 44, the search device 40 outputs voice data requesting the inquiry type to the radio 450 (step S415). For example, when the identification information included in the voice data received from the radio 450 is registered, the search device 40 outputs voice data including a request for the inquiry type, such as "P number OOOO, OOOO, please give me the inquiry type.", to the radio 450.
  • when the radio 450 receives the voice data including the request for the inquiry type, the radio 450 outputs voice data including the inquiry type input by voice in response to the voice data to the search device 40 (step S416). For example, the radio 450 outputs voice data including an inquiry type such as "It is a comprehensive inquiry due to exemption from liability" to the search device 40.
  • when the search device 40 receives the voice data including the inquiry type, the search device 40 confirms the inquiry type (step S417).
  • the search device 40 outputs voice data requesting inquiry information to the radio 450 (step S418).
  • for example, the search device 40 outputs voice data requesting inquiry information, such as "Comprehensive inquiry, isn't it? If there is an error in the inquiry type, please correct it. If it is correct, please give us the name, date of birth, and so on of the other party.", to the radio 450.
  • when the radio 450 receives the voice data requesting the inquiry information, it outputs voice data including the inquiry information input by voice in response to the voice data to the search device 40 (step S419). For example, the radio 450 outputs voice data including inquiry information such as "Mr. Shibataro, January 1, 1990, registered domicile A-ku, Tokyo" to the search device 40.
  • when the search device 40 receives the voice data including the inquiry information, the search device 40 extracts the character strings corresponding to the search items of the inquiry information from the text data based on the acquired voice data. The search device 40 generates search terms from the extracted character strings based on the distance between pronunciations (step S420).
  • FIG. 24 is a sequence diagram showing an example of the flow from the generation of the search term by the search device 40 to the output of the collation result.
  • the sequence diagram of FIG. 24 relates to a process following the generation of the search term in step S420 of FIG.
  • when the search device 40 generates the search terms (step S420), the search device 40 outputs voice data including confirmation content for the search terms to the radio 450 (step S421). For example, the search device 40 outputs voice data including confirmation content such as "Searching for Kibataro, A-ku, Tokyo. If there is an error, please correct the erroneous part." to the radio 450.
  • the search device 40 combines the generated search terms for each search item to generate a plurality of search queries (step S422).
  • the search device 40 uses at least one of the generated plurality of search queries to search, among the DBs included in the DB group 400, the DB in which the collation data of the inquiry type being inquired is stored (step S423).
  • when the radio 450 receives the voice data including the confirmation content of the search terms, it outputs voice data including a response input by voice in response to the voice data to the search device 40 (step S424). For example, the radio 450 outputs voice data including a response such as "There is no doubt" to the search device 40. If there is no error in the confirmation content of the search terms, step S424 may be omitted.
  • the search device 40 acquires the search result from the DB (step S425).
  • the search device 40 outputs the collation result according to the search result to the radio 450 (step S426). For example, if the search hits, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, corresponds to XX." to the radio 450.
  • for example, if the search does not hit, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, does not correspond to XX." to the radio 450.
  • FIG. 25 is a sequence diagram showing another example of the flow from the generation of the search term by the search device 40 to the output of the collation result.
  • FIG. 25 is an example in which the search term confirmation and the DB search using a plurality of search queries are performed in parallel.
  • the search term confirmation and the DB search using a plurality of search queries may be performed at the same timing, or may be performed at slightly different timings.
  • the sequence diagram of FIG. 25 relates to a process following the generation of the search term in step S420 of FIG.
  • when the search device 40 generates the search terms (step S420), the search device 40 combines the generated search terms for each search item to generate a plurality of search queries (step S431).
  • the search device 40 outputs voice data including the confirmation content of the search term to the radio 450 (step S432).
  • for example, the search device 40 outputs voice data including confirmation content of the search terms, such as "Searching for Kibataro, A-ku, Tokyo. If there is an error, please correct the erroneous part.", to the radio 450.
  • Step S433 may be performed in parallel with step S432, or may be started before step S432.
  • the search device 40 searches, among the DBs included in the DB group 400, the DB in which the collation data of the inquiry type being inquired is stored, using the plurality of generated search queries (step S433). At this time, the search device 40 searches the DB using all of the generated search queries. When the collation data of the inquiry type being inquired is stored in the DB, the search device 40 acquires the search results from the DB (step S434).
  • when the radio 450 receives the voice data including the confirmation content of the search terms from the search device 40, the radio 450 outputs voice data including a response (confirmation result) input by voice in response to the voice data to the search device 40 (step S435). For example, if there is an error in a search term, voice information including a correction of the search term, such as "The name is Shibataro. Shinbun no shi.", is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40. If there is no error in the confirmation content of the search terms, step S435 may be omitted.
  • the search device 40 receives voice data including correction of the search term from the radio 450.
  • the search device 40 outputs the collation result corresponding to the search result hit by the search using the search query composed of the correct search terms to the radio 450 (step S436). For example, if the search hits, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, corresponds to XX." to the radio 450. For example, if the search does not hit, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, does not correspond to XX." to the radio 450.
  • according to the example of FIG. 25, the search result obtained with the search query composed of the correct search terms can be selected according to the confirmation result of the search terms, so the search efficiency may be improved.
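The parallel flow of FIG. 25 can be sketched with a thread pool: the DB search over all generated queries runs while the confirmation result is awaited, and the confirmed term selects which finished result to return. This is a hedged illustration; the DB lookup and the confirmation are simulated stand-ins.

```python
# Hypothetical sketch of FIG. 25: search all queries in parallel with the
# search-term confirmation, then pick the result for the confirmed term.
from concurrent.futures import ThreadPoolExecutor

def search_db(query):
    """Stand-in for a DB lookup; returns (query, hit record or None)."""
    return query, {"Sato": "record A", "Kato": "record B"}.get(query[0])

queries = [("Sato", "A-ku"), ("Kato", "A-ku")]
with ThreadPoolExecutor() as pool:
    futures = {q: pool.submit(search_db, q) for q in queries}
    confirmed_name = "Sato"  # confirmation result arriving in parallel
    # select the result of the query built from the confirmed search term
    for q, fut in futures.items():
        if q[0] == confirmed_name:
            print(fut.result()[1])
```

By the time the confirmation arrives, the searches for all candidate queries have typically completed, so only the selection step remains.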
  • FIG. 26 is a sequence diagram showing still another example of the flow from the generation of the search term by the search device 40 to the output of the collation result.
  • the sequence diagram of FIG. 26 relates to a process following the generation of the search term in step S420 of FIG.
  • when the search device 40 generates the search terms (step S420), the search device 40 outputs voice data including confirmation content for the search terms to the radio 450 (step S441). For example, the search device 40 outputs voice data including confirmation content such as "Searching for Kibataro, A-ku, Tokyo. If there is an error, please correct the erroneous part." to the radio 450.
  • the search device 40 combines the generated search terms for each search item to generate a plurality of search queries (step S442).
  • the search device 40 uses at least one of the generated plurality of search queries to search, among the DBs included in the DB group 400, the DB in which the collation data of the inquiry type being inquired is stored (step S443).
  • when the radio 450 receives the voice data including the confirmation content of the search terms from the search device 40, the radio 450 outputs voice data including a response (confirmation result) input by voice in response to the voice data to the search device 40 (step S444). For example, if there is an error in a search term, voice information including a correction such as "The name is Shibataro. Shinbun no shi." is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40.
  • when there is an error in a search term, the search device 40 receives voice data including the correction from the radio 450.
  • the search device 40 selects another search term based on the correction, and outputs voice data including the confirmation content of the search term to the radio 450 (step S445).
  • for example, the search device 40 outputs voice data including confirmation content of the search term, such as "Searching for Shibataro. If there is an error, please correct it.", to the radio 450.
  • the search device 40 searches, among the DBs included in the DB group 400, the DB in which the collation data of the inquiry type being inquired is stored, using the search query composed of the search terms selected based on the correction (step S446).
  • when the radio 450 receives the voice data including the reconfirmation content of the search term from the search device 40, the radio 450 outputs voice data including a response (confirmation result) input by voice in response to the voice data to the search device 40 (step S447). For example, if the search term is correct, voice information such as "There is no doubt" is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40.
  • the search device 40 receives voice data indicating that the search term is correct from the radio 450.
  • the search device 40 acquires the search results hit by the search using the search query composed of the correct search terms from the DB (step S448).
  • the search device 40 outputs the collation result corresponding to the search result hit by the search using the search query composed of the correct search terms to the radio 450 (step S449). For example, if the search hits, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, corresponds to XX." to the radio 450.
  • for example, if the search does not hit, the search device 40 outputs voice data including a collation result such as "Mr. Shibataro, born on January 1, 1990, whose registered domicile is A-ku, Tokyo, does not correspond to XX." to the radio 450.
  • according to the example of FIG. 26, the DB can be searched using a search query composed of search terms confirmed to be correct, so the search accuracy is improved. Further, according to the example of FIG. 26, when a search term is wrong, the search using the search query including the wrong search term can be stopped, so the search efficiency is improved.
  • FIG. 27 is a block diagram showing an example of the configuration of the search device 50 of the present embodiment.
  • the search device 50 includes a conversion unit 52 and a search unit 53.
  • the conversion unit 52 converts the input voice data into text data by voice recognition.
  • the search unit 53 extracts the character string corresponding to the search item from the text data.
  • the search unit 53 generates a search term related to the character string for each search item based on the distance from the extracted character string.
  • the search unit 53 generates a plurality of search queries by combining the search terms generated for each search item.
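The pipeline formed by the conversion unit 52 and the search unit 53 can be sketched structurally as follows. The class and the recognizer/term-generator callbacks are hypothetical stand-ins introduced for illustration; only the combination step (one term per search item, combined into each query) mirrors the text above.

```python
# Structural sketch of search device 50, assuming a pluggable speech
# recognizer and candidate generator; every body here is illustrative.
from itertools import product

class SearchDevice50:
    def __init__(self, recognizer, term_generator):
        self.recognizer = recognizer          # stands in for conversion unit 52
        self.term_generator = term_generator  # per-item candidate expansion

    def convert(self, voice_data):
        # conversion unit 52: voice data -> text data by voice recognition
        return self.recognizer(voice_data)

    def build_queries(self, extracted):
        # search unit 53: expand each extracted character string into related
        # search terms, then combine one term per item into each search query
        per_item = {item: self.term_generator(item, s)
                    for item, s in extracted.items()}
        keys = list(per_item)
        return [dict(zip(keys, combo))
                for combo in product(*(per_item[k] for k in keys))]

dev = SearchDevice50(
    recognizer=lambda audio: "surname Sato, born January 1, 1990",
    term_generator=lambda item, s: [s, "Kato"] if item == "surname" else [s],
)
queries = dev.build_queries({"surname": "Sato", "birth": "January 1, 1990"})
# two queries: one with the recognized surname, one with the nearby term
```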
  • the information processing device 90 of FIG. 28 is a configuration example for executing the processing of the search device of each embodiment, and does not limit the scope of the present invention.
  • the information processing device 90 includes a processor 91, a main storage device 92, an auxiliary storage device 93, an input / output interface 95, and a communication interface 96.
  • the interface is abbreviated as I / F (Interface).
  • the processor 91, the main storage device 92, the auxiliary storage device 93, the input / output interface 95, and the communication interface 96 are connected to each other via the bus 98 so as to be capable of data communication. Further, the processor 91, the main storage device 92, the auxiliary storage device 93, and the input / output interface 95 are connected to a network such as the Internet or an intranet via the communication interface 96.
  • the processor 91 expands the program stored in the auxiliary storage device 93 or the like to the main storage device 92, and executes the expanded program.
  • the program executed by the processor 91 may be a software program installed in the information processing apparatus 90.
  • the processor 91 executes the process by the search device according to the present embodiment.
  • the main storage device 92 has an area in which the program is expanded.
  • the main storage device 92 may be a volatile memory such as a DRAM (Dynamic Random Access Memory). Further, a non-volatile memory such as an MRAM (Magnetoresistive Random Access Memory) may be configured or added as the main storage device 92.
  • the auxiliary storage device 93 stores various data.
  • the auxiliary storage device 93 is composed of a local disk such as a hard disk or a flash memory. It is also possible to store various data in the main storage device 92 and omit the auxiliary storage device 93.
  • the input / output interface 95 is an interface for connecting the information processing device 90 and peripheral devices.
  • the communication interface 96 is an interface for connecting to an external system or device through a network such as the Internet or an intranet based on a standard or a specification.
  • the input / output interface 95 and the communication interface 96 may be shared as an interface for connecting to an external device.
  • the information processing device 90 may be configured to connect an input device such as a keyboard, a mouse, or a touch panel, if necessary. These input devices are used to input information and settings. When the touch panel is used as an input device, the display screen of the display device may also serve as the interface of the input device. Data communication between the processor 91 and the input device may be mediated by the input / output interface 95.
  • the information processing apparatus 90 may be equipped with a display device for displaying information.
  • when a display device is used, it is preferable that the information processing device 90 be provided with a display control device (not shown) for controlling the display of the display device.
  • the display device may be connected to the information processing device 90 via the input / output interface 95.
  • the above is an example of the hardware configuration for enabling the search device according to each embodiment of the present invention.
  • the hardware configuration of FIG. 28 is an example of the hardware configuration for executing the arithmetic processing of the search device according to each embodiment, and does not limit the scope of the present invention.
  • the scope of the present invention also includes a program for causing a computer to execute a process related to the search device according to each embodiment.
  • a recording medium on which a program according to each embodiment is recorded is also included in the scope of the present invention.
  • the recording medium can be realized by an optical recording medium such as a CD (Compact Disc) or a DVD (Digital Versatile Disc).
  • the recording medium may be realized by a semiconductor recording medium such as a USB (Universal Serial Bus) memory or an SD (Secure Digital) card, a magnetic recording medium such as a flexible disk, or another recording medium.
  • the components of the search device of each embodiment can be arbitrarily combined. Further, the components of the search device of each embodiment may be realized by software or by a circuit.

Abstract

A search device provided with: a conversion unit that converts input voice data into text data by means of voice recognition in order to generate, on the basis of given voice data, a plurality of search queries configured from search terms corresponding to search items; and a search unit that extracts a character string corresponding to a search item from the text data, generates, for each search item, a search term related to the character string, on the basis of a distance from the extracted character string and, for each search item, combines the generated search terms, thereby generating a plurality of search queries.

Description

Search device, search method, and recording medium
The present invention relates to a search device or the like that generates a search query using text data converted from voice data.
Normally, in an identity inquiry such as during questioning conducted by a police officer, the officer contacts the main office by voice call via wireless communication to confirm the identity of the person being questioned. In such inquiries, applying text analysis technology that converts voice data into text data may make it possible to automate immediate inquiries. For example, if a database in which inquiry information is stored can be searched using text data converted from voice data based on a speaker's voice, immediate inquiries can be automated. In reality, however, pronunciation and accent vary from speaker to speaker, so erroneous conversion can occur when the voice data is converted into text data, and an inquiry may then be made based on the erroneous text data.
Patent Document 1 discloses a sequence signal search device for efficiently processing a plurality of search term candidates from a sequence signal, such as voice data, that contains errors. The device of Patent Document 1 plots the syllable sequence of the voice recognition result of the voice data and the syllable sequence of a search term on a plane based on the distance (similarity) between syllables. By detecting straight lines on that plane, the device realizes a search of the voice data by the search term.
Patent Document 2 discloses a navigation device that searches for place names and road names based on voice recognition. The device of Patent Document 2 accepts the leading character string included in a search target character string and narrows down the candidates for the search target character string. The device then extracts the search target character string from the narrowed-down candidates based on voice data input thereafter.
Japanese Unexamined Patent Publication No. 2011-128903; Japanese Unexamined Patent Publication No. 2010-038751
According to the method of Patent Document 1, it is possible to extract a plurality of search term candidates based on a voice recognition result. However, the method of Patent Document 1 is difficult to apply when the voice data is not known before the search, and is therefore unsuitable for immediate inquiries in which search terms must be extracted from text data based on arbitrary voice data.
According to the method of Patent Document 2, the certainty of extracting the search target character string can be improved by narrowing down the candidates in advance. However, the method of Patent Document 2 requires the leading character string included in the search target character string to be input in advance. It therefore cannot extract a plurality of search terms from text data composed of a plurality of search terms.
An object of the present invention is to provide a search device or the like capable of generating, based on arbitrary voice data, a plurality of search queries composed of search terms corresponding to search items.
A search device according to one aspect of the present invention includes: a conversion unit that converts input voice data into text data by voice recognition; and a search unit that extracts a character string corresponding to a search item from the text data, generates, for each search item, search terms related to the character string based on the distance from the extracted character string, and combines the search terms generated for each search item to generate a plurality of search queries.
In a search method according to one aspect of the present invention, a computer converts input voice data into text data by voice recognition, extracts a character string corresponding to a search item from the text data, generates, for each search item, search terms related to the character string based on the distance from the extracted character string, and combines the search terms generated for each search item to generate a plurality of search queries.
A program according to one aspect of the present invention causes a computer to execute: a process of converting input voice data into text data by voice recognition; a process of extracting a character string corresponding to a search item from the text data; a process of generating, for each search item, search terms related to the character string based on the distance from the extracted character string; and a process of combining the search terms generated for each search item to generate a plurality of search queries.
According to the present invention, it is possible to provide a search device or the like capable of generating, based on arbitrary voice data, a plurality of search queries composed of search terms corresponding to search items.
FIG. 1 is a block diagram showing an example of the configuration of the search device according to the first embodiment.
FIG. 2 is an example of a table in which search terms generated by the search device according to the first embodiment are ranked according to their scores.
FIG. 3 is a conceptual diagram showing an example in which the search device according to the first embodiment generates a plurality of search queries based on input voice data.
FIG. 4 is a conceptual diagram showing an example of collation data stored in a database searched by the search device according to the first embodiment.
FIG. 5 is a conceptual diagram showing an example in which the search device according to the first embodiment converts text data corresponding to a search result into voice data and outputs it.
FIG. 6 is a conceptual diagram showing an example in which the search device according to the first embodiment converts text data for confirming the correctness of a search term into voice data and outputs it.
FIG. 7 is a conceptual diagram showing an example in which voice data of a reply to voice data transmitted from the search device according to the first embodiment is input to the search device.
FIG. 8 is a flowchart showing an example of search query generation by the search device according to the first embodiment.
FIG. 9 is a flowchart showing an example of confirmation of the correctness of search terms by the search device according to the first embodiment.
FIG. 10 is a flowchart showing an example of asking back about a search term by the search device according to the first embodiment.
FIG. 11 is a flowchart showing an example of a database search by the search device according to the first embodiment.
FIG. 12 is a block diagram showing an example of the configuration of the search device according to the second embodiment.
FIG. 13 is a conceptual diagram for explaining the accuracy of a search query calculated by the search device according to the second embodiment.
FIG. 14 is a flowchart showing an example of search query generation by the search device according to the second embodiment.
FIG. 15 is a flowchart showing an example of a database search by the search device according to the second embodiment.
FIG. 16 is a block diagram showing an example of the configuration of the search device according to the third embodiment.
FIG. 17 is an example of a table included in a dictionary used by the search device according to the third embodiment to generate search terms.
FIG. 18 is another example of a table included in a dictionary used by the search device according to the third embodiment to generate search terms.
FIG. 19 is a conceptual diagram showing an example in which the search device according to the third embodiment converts text data for confirming the correctness of a search term into voice data and outputs it.
FIG. 20 is a conceptual diagram showing an example in which voice data of a reply to voice data transmitted from the search device according to the third embodiment is input to the search device.
FIG. 21 is a flowchart showing an example of search query generation by the search device according to the third embodiment.
FIG. 22 is a block diagram showing an example of the configuration of the search device according to the fourth embodiment.
FIG. 23 is a sequence diagram showing an example of the flow from connection of a radio to the search device according to the fourth embodiment to generation of search terms by the search device.
FIG. 24 is a sequence diagram showing an example of the flow from generation of search terms by the search device according to the fourth embodiment to output of a collation result.
FIG. 25 is a sequence diagram showing another example of the flow from generation of search terms by the search device according to the fourth embodiment to output of a collation result.
FIG. 26 is a sequence diagram showing yet another example of the flow from generation of search terms by the search device according to the fourth embodiment to output of a collation result.
FIG. 27 is a block diagram showing an example of the configuration of the search device according to the fifth embodiment.
FIG. 28 is a block diagram showing an example of a hardware configuration that realizes the search device according to each embodiment.
Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. Although the embodiments described below include technically preferable limitations for carrying out the present invention, they do not limit the scope of the invention. In all drawings used in the description of the following embodiments, the same reference numerals are given to similar parts unless there is a particular reason otherwise. In the following embodiments, repeated descriptions of similar configurations and operations may be omitted. The directions of the arrows in the drawings show an example and do not limit the directions of signals between blocks.
(First Embodiment)
 First, the search device according to the first embodiment will be described with reference to the drawings. The search device of the present embodiment converts voice data into text data using voice recognition technology and recognizes at least one character string from the converted text data. The search device of the present embodiment generates, based on the distance between pronunciations, a plurality of search terms related to the at least one recognized character string, and generates a plurality of search queries (also referred to as search patterns) including those search terms. In the following description, characters and symbols such as hiragana, katakana, kanji, and the alphabet may be used in place of the phonemes used in voice recognition.
(Configuration)
 FIG. 1 is a block diagram showing an example of the configuration of the search device 10 of the present embodiment. The search device 10 includes an acquisition unit 11, a first conversion unit 12, a search unit 13, a second conversion unit 18, and an output unit 19. The acquisition unit 11 and the output unit 19 constitute an input/output unit 110. The first conversion unit 12 and the second conversion unit 18 constitute a conversion unit 120. FIG. 1 also shows the database (DB 100) connected to the search device 10. The DB 100 is connected to the search unit 13 via a network such as the Internet or an intranet. A plurality of pieces of collation data are stored in the DB 100.
The search device 10 transmits and receives voice data to and from a radio (not shown). For example, the radio has a microphone and a speaker, and converts voice input via the microphone into an electric signal (voice data). For example, the radio transmits a radio signal including the voice data by wireless communication in a specific frequency band. The radio signal transmitted from the radio is converted into an electric signal via an antenna, an amplifier, a demodulator, and the like (not shown), and is input to the search device 10 via a network such as the Internet or an intranet. For example, the search device 10 outputs voice data toward the radio. In the following description, although the search device 10 and the radio are not directly connected, they are described as exchanging voice data with each other.
The acquisition unit 11 acquires voice data from the radio and outputs the acquired voice data to the first conversion unit 12.
The first conversion unit 12 converts voice data into text data by voice recognition. For example, the first conversion unit 12 converts voice data into text data using acoustic-model and language-model algorithms, using methods such as statistical methods or dynamic time warping, or using methods such as deep learning or hidden Markov models. For example, the first conversion unit 12 converts voice data into text data using a voice recognition dictionary including an acoustic model, a language model, a pronunciation dictionary, and the like. For example, the first conversion unit 12 calculates a voice recognition score (also simply called a score) by text analysis for each character string (word) included in the text data, and converts the voice data into text data based on the scores. The voice recognition methods mentioned here are examples and do not limit the conversion method from voice data to text data by the first conversion unit 12.
The search unit 13 recognizes at least one character string for each search item from the text data converted by the first conversion unit 12. For example, when the search items are name, date of birth, and registered domicile, the search unit 13 detects character strings that can correspond to those search items. For example, based on a character string recognized from the text data as corresponding to a certain search item, the search unit 13 treats character strings likely to appear before and after it as candidates for the other search items. For example, based on the voice recognition score given to each character string (word) extracted from the text data, the search unit 13 selects at least one character string as a search term from among the search term candidates for each search item. For example, the search unit 13 selects the character string with the highest score among the candidates for each search item; this highest-scoring candidate corresponds to the character string of the recognition result.
The search unit 13 generates search terms from the recognized character string based on the distance between pronunciations. The distance between pronunciations is the distance between two character strings based on their pronunciation. For example, the search unit 13 compares the phoneme sequences constituting two character strings and takes the number of differing phonemes as the distance between pronunciations. In the present embodiment, the distance between pronunciations is defined taking into account the order in which phonemes appear in the character string. For example, the search unit 13 generates, as search terms, character strings whose distance between pronunciations from the recognized character string is small. Since the distance between a recognized character string and itself is 0, the recognized character string itself is also generated as a search term.
Some examples of the distance between pronunciations are described below. These examples do not limit the distance between pronunciations used by the search unit 13 of the present embodiment when generating search terms.
First, the distance between the pronunciations of Sato and Kato will be described. The phonemes of Sato are "s", "a", "t", and "o". The phonemes of Kato are "k", "a", "t", and "o". Sato and Kato differ only in the first phoneme. That is, since there is one differing phoneme between Sato and Kato, the distance between pronunciations is 1.
Next, the distance between the pronunciations of Sato and Saito will be described. The phonemes of Sato are "s", "a", "t", and "o". The phonemes of Saito are "s", "a", "i", "t", and "o". Sato and Saito differ in that Saito has one more phoneme. That is, since there is one excess phoneme, the distance between pronunciations is 1.
Next, the distance between the pronunciations of Sato and Suzuki will be described. The phonemes of Sato are "s", "a", "t", and "o". The phonemes of Suzuki are "s", "u", "z", "u", "k", and "i". Between Sato and Suzuki there are three differing phonemes and two excess phonemes, so the distance between pronunciations is 5.
The search unit 13 may also generate search terms from the recognized character string using, as the distance between pronunciations, a distance defined by the acoustic distance between two phonemes (also called the inter-phoneme distance). That is, the distance between pronunciations may be an inter-phoneme distance defined in advance between arbitrary phonemes. For example, the inter-phoneme distance pd(p1, p2) between two phonemes (p1, p2) is defined in advance, such as pd(s, s) = 0.0, pd(s, k) = 1.2, pd(o, o) = 0.1, and pd('', k) = 4.0. For example, since Sato (sato) and Kato (kato) differ in the first phoneme (s, k), the distance between pronunciations D corresponds to the inter-phoneme distance pd(s, k) = 1.2.
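The pronunciation distance described above behaves like an edit distance over phoneme sequences, optionally weighted by a predefined inter-phoneme distance table pd(p1, p2). The sketch below is one illustrative formulation under that assumption, not the patented algorithm; '' stands for an insertion or deletion, and unit costs reproduce the worked examples in the text (Sato/Kato = 1, Sato/Saito = 1, Sato/Suzuki = 5).

```python
# Sketch: pronunciation distance as an edit distance over phoneme lists,
# with an optional inter-phoneme cost table pd (keys are phoneme pairs;
# '' marks insertion/deletion). Unweighted, every edit costs 1.
def pronunciation_distance(a, b, pd=None):
    def cost(x, y):
        if pd is None:
            return 0 if x == y else 1
        # fall back to unit/zero cost for pairs missing from the table
        return pd.get((x, y), pd.get((y, x), 0 if x == y else 1))

    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + cost(a[i - 1], '')   # deletions
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + cost('', b[j - 1])   # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + cost(a[i - 1], ''),
                          d[i][j - 1] + cost('', b[j - 1]),
                          d[i - 1][j - 1] + cost(a[i - 1], b[j - 1]))
    return d[m][n]

d1 = pronunciation_distance(list("sato"), list("kato"))    # Sato vs Kato
d2 = pronunciation_distance(list("sato"), list("saito"))   # Sato vs Saito
d3 = pronunciation_distance(list("sato"), list("suzuki"))  # Sato vs Suzuki
```

Because the dynamic program aligns the sequences position by position, the order in which phonemes appear is taken into account, as the embodiment requires.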
The search unit 13 generates a plurality of search queries using the generated search terms. For example, when the search items are name, date of birth, and registered domicile, the search unit 13 generates search queries that combine the search terms for each of those search items. The search items may include search items other than name, date of birth, and address.
The search unit 13 searches the DB 100 using at least one of the generated search queries. For example, the search unit 13 searches the DB 100 using those generated search queries that satisfy a predetermined criterion. The search unit 13 may also search the DB 100 using all of the generated search queries.
The DB 100 is constructed in association with the type of collation target (also called the inquiry type). The DB 100 stores a plurality of pieces of data (also called collation data) including the search items for searching for the collation target. For example, the collation data stored in the DB 100 is searched using one of several search items as a key. If at least one search item of the retrieved collation data matches, the search has hit; in the present embodiment, a search is said to have hit when all the search items match. For example, a DB 100 is constructed for each inquiry type.
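The hit criterion used in the present embodiment (a record hits only when all search items of the query match) can be sketched as follows; the record fields and the data are illustrative, not taken from any actual collation database.

```python
# Toy sketch of the hit criterion: a collation record hits only when
# every search item contained in the query matches the record.
def is_hit(record, query):
    """True when all search items of the query match the record."""
    return all(record.get(item) == term for item, term in query.items())

db = [
    {"name": "Sato Taro", "birth": "1990-01-01", "domicile": "A-ku, Tokyo"},
    {"name": "Kato Taro", "birth": "1990-07-01", "domicile": "B-ku, Tokyo"},
]
query = {"name": "Kato Taro", "birth": "1990-07-01"}
hits = [r for r in db if is_hit(r, query)]  # only the second record hits
```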
For example, the search unit 13 selects at least one search query according to the score of the key search term included in the search query. For example, the search unit 13 inputs the plurality of search terms generated from the character string of the voice recognition result into a voice recognition engine that outputs a score according to the recognition result, and ranks the search terms according to the output scores. The search unit 13 then searches the DB 100 using the selected search query. For example, the search unit 13 selects the search query whose key search term has the highest score and searches the DB 100 using that query. FIG. 2 is an example of a table (table 131) in which the search terms for the surname included in the name, which is a search item, are ranked according to their scores. For example, the search unit 13 selects a search query according to the score of the surname included in the name in the search query.
 For example, the search unit 13 generates, for a plurality of search terms, search terms to which scores based on the voice recognition result are assigned. For example, suppose that a certain voice recognition run yields the recognition results "Sato (0.41)" and "Kato (0.65)" for the surname (the scores are given in parentheses), the recognition results "January 1, 1990 (0.56)" and "July 1, 1990 (0.92)" for the date of birth, and the recognition result "A-ku, Tokyo (0.43)" for the registered domicile. In that case, the search unit 13 generates, for example, the following text data.
"The surname is Sato (0.41), the date of birth is January 1, 1990 (0.56), and the registered domicile is A-ku, Tokyo (0.43)."
"The surname is Kato (0.65), the date of birth is January 1, 1990 (0.56), and the registered domicile is A-ku, Tokyo (0.43)."
"The surname is Sato (0.41), the date of birth is July 1, 1990 (0.92), and the registered domicile is A-ku, Tokyo (0.43)."
"The surname is Kato (0.65), the date of birth is July 1, 1990 (0.92), and the registered domicile is A-ku, Tokyo (0.43)."
For example, the above text data is converted into voice data by the second conversion unit 18 and output from the output unit 19 to a radio (not shown). The text data converted into voice data may be output in an order according to the score of any one of the search items, or in an order according to the sum of the scores of all the search items.
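The combination of scored candidates into response sentences, ordered by the sum of the item scores, can be sketched as follows. This is an illustrative Python sketch; the candidate values and scores mirror the example above, while the sentence template is an assumption.

```python
# Illustrative sketch: combine scored recognition candidates per search item
# into response sentences and order them by the sum of the item scores.
# Candidate values and scores follow the example above; the sentence
# template is hypothetical.
from itertools import product

SURNAMES = [("Sato", 0.41), ("Kato", 0.65)]
BIRTHS = [("January 1, 1990", 0.56), ("July 1, 1990", 0.92)]
DOMICILES = [("A-ku, Tokyo", 0.43)]

def scored_sentences():
    out = []
    for (s, ss), (b, bs), (d, ds) in product(SURNAMES, BIRTHS, DOMICILES):
        text = (f"The surname is {s} ({ss}), the date of birth is {b} ({bs}), "
                f"and the registered domicile is {d} ({ds}).")
        out.append((ss + bs + ds, text))
    # Highest combined score first.
    return sorted(out, reverse=True)
```

Under these scores, the sentence combining "Kato (0.65)" and "July 1, 1990 (0.92)" has the largest total and would be output first.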
 The second conversion unit 18 acquires text data from the search unit 13 and converts the acquired text data into voice data. For example, the second conversion unit 18 converts the text data into voice data using a rule-based synthesis method such as formant speech synthesis or articulatory speech synthesis. For example, the second conversion unit 18 converts the text data into voice data using a concatenative (waveform-connection) speech synthesis method such as unit selection synthesis, diphone synthesis, or domain-limited synthesis. For example, the second conversion unit 18 converts the text data into voice data using a statistical parametric speech synthesis method such as neural network speech synthesis or hidden Markov model speech synthesis. The speech synthesis methods listed here are examples and do not limit the method used by the second conversion unit 18. Further, if the first conversion unit 12 can convert text data into voice data, the second conversion unit 18 may be omitted.
 The output unit 19 outputs the voice data converted by the second conversion unit 18. The voice data output from the output unit 19 is sent to the radio and reproduced as voice by the radio. For example, the output unit 19 outputs voice data based on the collation data hit by the search, in an order according to the accuracy of the corresponding search queries.
 FIG. 3 is a conceptual diagram showing an example in which the search device 10 generates a plurality of search queries using voice data acquired from a radio. In the example of FIG. 3, the search device 10 acquires the voice data "Mr. Shibataro, born on January 1, 1990, registered domicile A-ku, Tokyo." For example, the search device 10 recognizes, by voice recognition, the character strings "Shibataro," "January 1, 1990," and "A-ku, Tokyo" for the respective search items from the acquired voice data. Based on the inter-pronunciation distance, the search device 10 generates a plurality of search terms related to each recognized character string. The search device 10 then generates search queries by combining the generated search terms, for example (Shibataro, January 1, 1990, A-ku, Tokyo), (Shibataro, July 1, 1990, A-ku, Tokyo), and so on.
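The combination of per-item candidate terms into queries, as in FIG. 3, amounts to a Cartesian product and can be sketched as follows. This is an illustrative Python sketch; the candidate lists are assumptions drawn from the example.

```python
# Illustrative sketch: build every search query as one combination of the
# per-item candidate search terms, as in FIG. 3. Candidate lists are
# hypothetical examples.
from itertools import product

def build_queries(candidates_per_item):
    """candidates_per_item: dict mapping a search item to its candidates."""
    items = list(candidates_per_item)
    return [dict(zip(items, combo))
            for combo in product(*candidates_per_item.values())]
```

With two name candidates, two date-of-birth candidates, and one domicile candidate, this yields 2 x 2 x 1 = 4 queries.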
 FIG. 4 shows an example of the collation data (collation table 101) stored in the DB 100. The collation table 101 stores collation data including search items (name, date of birth, registered domicile, etc.). For example, the collation table 101 stores the collation data of a person named "Shibataro" whose date of birth is January 1, 1990 and whose registered domicile is A-ku, Tokyo. For example, when the DB 100 is searched using the search queries generated in the example of FIG. 3, this person "Shibataro" is hit.
 FIG. 5 shows an example in which the search device 10 converts text data corresponding to the search result from the DB 100 into voice data and outputs it. In the example of FIG. 5, the search device 10 acquires the search result "Shibataro, January 1, 1990, A-ku, Tokyo, XX" from the DB 100, where "XX" is the inquiry type of the person hit by the search. For example, the search device 10 generates from the acquired search result text data such as "Mr. Shibataro, registered domicile A-ku, Tokyo, born on January 1, 1990, corresponds to XX." The search device 10 converts the generated text data into voice data and outputs the converted voice data to the radio. The voice data output from the search device 10 is reproduced as voice by the radio (not shown).
 The search device 10 may also generate text data for inquiring whether a search term is correct, using at least some of the search terms related to the character strings recognized in the text data.
 FIG. 6 shows an example of converting text data for confirming the correctness of search terms into voice data and outputting it. In the example of FIG. 6, the search device 10 outputs voice data based on text data that presents the search term with the highest score and asks again for the search term that could not be recognized. In the example of FIG. 6, the search device 10 outputs voice data saying "Searching for Kibataro, registered domicile A-ku, Tokyo. Please state the date of birth again." For example, the voice data is output to the radio, where it is reproduced as voice for confirming the correctness of the search terms.
 FIG. 7 shows an example in which, following the example of FIG. 6, reply voice data is returned from the radio (not shown) in response to the voice data transmitted by the search device 10. In the example of FIG. 7, the search device 10 acquires the reply voice data "The name is Shibataro, and the date of birth is January 1, 1990." For example, by voice recognition, the search device 10 obtains the recognition results "Shibataro" and "January 1, 1990" from the acquired voice data. The search device 10 generates search terms according to the recognition results, generates a search query including the generated search terms, and generates reply text data. If the correctness of the search terms can be confirmed with the sender of the voice data before or while a search query is generated, the generation of erroneous search queries can be reduced.
 (Operation)
 Next, the operation of the search device 10 will be described with reference to the drawings. In the following, the generation of search queries, the confirmation of the correctness of search terms, the re-asking of missing search items, the search using the generated search queries, and so on are described individually. The following operations are examples and do not limit the operation of the search device 10.
 [Search query generation]
 FIG. 8 is a flowchart for explaining an example of search query generation by the search device 10. In the description along the flowchart of FIG. 8, the search device 10 is treated as the acting subject.
 In FIG. 8, the search device 10 first acquires voice data (step S111). For example, the search device 10 acquires voice data output from a radio (not shown).
 Next, the search device 10 converts the acquired voice data into text data by voice recognition (step S112).
 Next, the search device 10 extracts the character strings corresponding to the search items from the text data (step S113).
 Next, the search device 10 generates, for each search item, search terms related to the extracted character string based on the inter-pronunciation distance (step S114).
 Next, the search device 10 generates a plurality of search queries by combining the search terms of the respective search items (step S115).
 [Correctness confirmation]
 FIG. 9 is a flowchart for explaining an example in which the search device 10 confirms the correctness of a search term. The flowchart of FIG. 9 follows step S114 of the flowchart of FIG. 8. In the example of FIG. 9, it is assumed that scores or ranks have been assigned to the search terms. In the description along the flowchart of FIG. 9, the search device 10 is treated as the acting subject.
 In FIG. 9, the search device 10 first generates text data for confirming the correctness of the search term with the highest score (step S121).
 Next, the search device 10 converts the generated text data into voice data and outputs the converted voice data (step S122). For example, the voice data output from the search device 10 is reproduced as voice by a radio (not shown).
 Next, the search device 10 acquires the reply voice data and converts it into text data by voice recognition (step S123). For example, the search device 10 acquires the voice data from the radio (not shown). If no reply is obtained within a predetermined period, the search device 10 may retransmit the voice data or may proceed to step S125.
 If the search term is correct (Yes in step S124), the search device 10 generates a plurality of search queries by combining the search terms of the respective search items (step S126).
 If the search term is not correct (No in step S124), the search device 10 generates text data for confirming the correctness of the search term with the next highest score (step S125), and then returns to step S122. The series of processes of steps S122 to S125 is continued until a search term is confirmed to be correct. If no search term can be confirmed to be correct even after the series of processes of steps S122 to S125 has been repeated a predetermined number of times or for a predetermined time, the search device 10 may proceed to step S126 or return to step S113 of FIG. 8.
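The confirmation loop of FIG. 9 can be sketched as follows. This is an illustrative Python sketch; the ask() callback standing in for the radio round trip, and the retry cap, are assumptions.

```python
# Illustrative sketch of the confirmation loop of FIG. 9: propose search
# terms in descending-score order until the far end confirms one, with a
# bounded number of attempts. ask() is a hypothetical stand-in for the
# radio round trip (steps S122-S124).

def confirm_term(ranked_terms, ask, max_tries=3):
    """ranked_terms: search terms sorted by descending score.
    ask(term) -> True if the speaker confirms the term is correct."""
    for term in ranked_terms[:max_tries]:
        if ask(term):
            return term
    return None  # e.g. proceed to query generation or re-extraction anyway
```

Returning None here corresponds to the case where the loop is abandoned after the predetermined number of attempts.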
 [Re-asking]
 FIG. 10 is a flowchart for explaining an example in which the search device 10 asks again for a search item that could not be recognized. In the description along the flowchart of FIG. 10, the search device 10 is treated as the acting subject.
 In FIG. 10, the search device 10 first acquires voice data (step S131). For example, the search device 10 acquires voice data transmitted from a radio (not shown).
 Next, the search device 10 converts the acquired voice data into text data by voice recognition (step S132).
 Next, the search device 10 extracts the character strings corresponding to the search items from the text data (step S133).
 If any search item is missing (Yes in step S134), the search device 10 outputs voice data for asking again for the missing search item (step S135), and then returns to step S131. For example, the voice data output from the search device 10 is reproduced as voice by the radio (not shown).
 If no search item is missing (No in step S134), the search device 10 generates, for each search item, search terms related to the extracted character string based on the inter-pronunciation distance (step S136).
 Next, the search device 10 generates a plurality of search queries by combining the search terms of the respective search items (step S137).
 [Search]
 FIG. 11 is a flowchart for explaining an example in which the search device 10 searches the DB 100 using the generated search queries. In the description along the flowchart of FIG. 11, the search device 10 is treated as the acting subject.
 In FIG. 11, the search device 10 first searches the DB 100 using the generated search queries (step S151).
 If the search hits (Yes in step S152), the search device 10 generates text data including the hit search result (step S153). If the search does not hit (No in step S152), the search device 10 generates text data indicating that no search result was obtained (step S154).
 After step S153 or step S154, the search device 10 converts the generated text data into voice data (step S155).
 Next, the search device 10 outputs the voice data (step S156). For example, the voice data output from the search device 10 is reproduced as voice by a radio (not shown).
 As described above, the search device of the present embodiment includes an acquisition unit, a first conversion unit, a search unit, a second conversion unit, and an output unit. The acquisition unit receives voice data as input. The first conversion unit converts the voice data acquired by the acquisition unit into text data by voice recognition. The search unit extracts the character strings corresponding to the search items from the text data, generates, for each search item, search terms related to the extracted character string based on the distance from that character string, and generates a plurality of search queries by combining the search terms generated for each search item. The search unit searches a database in which collation data including the search items are accumulated, and generates text data according to the search result. The second conversion unit converts the generated text data into voice data. The output unit outputs the voice data converted from the text data.
 In one aspect of the present embodiment, the search unit generates, as a search term, a character string whose inter-pronunciation distance from the extracted character string is small, the inter-pronunciation distance being based on the differences between the phonemes constituting the character string extracted from the text data. For example, the search unit calculates the inter-pronunciation distance using inter-phoneme distances defined in advance between pairs of phonemes.
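The patent does not give a concrete formula for the inter-pronunciation distance. One common realization consistent with the description, assumed here, is a weighted edit distance over phoneme sequences using a predefined inter-phoneme substitution cost table; the toy cost table below is an assumption.

```python
# Illustrative sketch: inter-pronunciation distance as a weighted edit
# distance over phoneme sequences. The cost table is a toy assumption;
# the embodiment only states that inter-phoneme distances are predefined
# between pairs of phonemes.

PHONEME_COST = {("sh", "k"): 1.0, ("b", "d"): 1.0}  # hypothetical costs

def phoneme_cost(a, b):
    if a == b:
        return 0.0
    return PHONEME_COST.get((a, b), PHONEME_COST.get((b, a), 1.0))

def pronunciation_distance(p, q):
    """Dynamic-programming edit distance between phoneme lists p and q."""
    m, n = len(p), len(q)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)
    for j in range(1, n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1.0,          # deletion
                          d[i][j - 1] + 1.0,          # insertion
                          d[i - 1][j - 1] + phoneme_cost(p[i - 1], q[j - 1]))
    return d[m][n]
```

Under this sketch, "Shibataro" and "Kibataro" differ by one substitution (sh -> k), giving a distance of 1, matching the examples in this disclosure.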
 According to the present embodiment, a plurality of search queries composed of search terms corresponding to the search items can be generated based on arbitrary voice data.
 In one aspect of the present embodiment, the search unit assigns scores based on the voice recognition to the search terms and generates text data including the scored search terms. The conversion unit converts the generated text data into voice data, and the voice data converted from the text data is output.
 For example, when a police officer enters accident information or the like using a radio, the only input means is voice, and it is therefore desirable to reduce the frequency of input and of confirmation. In such input, it is also desirable that inquiries about persons and the like be made quickly and accurately based on the information input by voice. According to the present embodiment, a large number of search queries composed of search terms for the search items can be generated based on voice data. Since those search queries are generated according to the distance from the character strings extracted from the voice data, there is a high possibility that they include a search query containing the correct search terms. Furthermore, in the present embodiment, the correctness of the generated search terms can be confirmed, so that search queries based on accurate search terms can be generated.
 (Second embodiment)
 Next, a search device according to the second embodiment will be described with reference to the drawings. The present embodiment differs from the first embodiment in that the generated search queries are ranked according to their accuracy.
 (Configuration)
 FIG. 12 is a block diagram showing an example of the configuration of the search device 20 of the present embodiment. The search device 20 includes an acquisition unit 21, a first conversion unit 22, a search unit 23, a second conversion unit 28, and an output unit 29. The acquisition unit 21 and the output unit 29 constitute an input/output unit 210. The first conversion unit 22 and the second conversion unit 28 constitute a conversion unit 220. FIG. 12 also shows the database (DB 200) connected to the search device 20. For example, the DB 200 is connected to the search unit 23 via a network such as the Internet or an intranet. The DB 200 stores a plurality of pieces of collation data. Since the components of the search device 20 other than the search unit 23 are the same as those of the search device 10 of the first embodiment, detailed description thereof is omitted. The following description focuses on the search unit 23.
 The search unit 23 selects a search query according to the accuracy of its search terms, and searches the DB 200 using the selected search query. The accuracy is the sum of the inter-pronunciation distances of the individual search items such as the name, date of birth, and address. For example, the accuracy of the character string (search term) that is itself the voice recognition result obtained by the search unit 23 is 0, and each of the words ranked based on the scores of the recognition result takes its inter-pronunciation distance as its accuracy. The accuracy may be weighted for each search item. For example, there are far more variations of names than of dates of birth, so giving the name a larger weight than the date of birth improves the search precision. In addition, since a rare surname is unlikely to be encountered, it may be excluded from the search terms regardless of its accuracy.
 Here, the accuracy of the search queries generated by the search unit 23 will be described with an example. In the following example, the search items are the name (also referred to as the name query), the date of birth (also referred to as the date-of-birth query), and the registered domicile (also referred to as the registered-domicile query). To simplify the explanation, only the surname is used for the name query.
 For example, suppose that the recognition results of the name query, the date-of-birth query, and the registered-domicile query are "Sato," "January 1, 1990," and "A-ku, Tokyo," respectively. Each recognition result is used as a search term. For the name query, the accuracy of the recognition result "Sato" is 0. For the date-of-birth query, the accuracy of the recognition result "January 1, 1990" is 0. For the registered-domicile query, the accuracy of the recognition result "A-ku, Tokyo" is 0.
 For example, suppose that "Kato," "July 1, 1990," and "D-ku, Tokyo" are generated as further search term candidates for the name query, the date-of-birth query, and the registered-domicile query, respectively. Based on the inter-pronunciation distance, the accuracy of "Kato" is 1, the accuracy of "July 1, 1990" is 1, and the accuracy of "D-ku, Tokyo" is 1.
 Since each of the above search items has two candidates (two for the name query, two for the date-of-birth query, and two for the registered-domicile query), the following eight search queries are generated.
Search query 1: "Sato, January 1, 1990, A-ku, Tokyo"
Search query 2: "Kato, January 1, 1990, A-ku, Tokyo"
Search query 3: "Sato, July 1, 1990, A-ku, Tokyo"
Search query 4: "Kato, July 1, 1990, A-ku, Tokyo"
Search query 5: "Sato, January 1, 1990, D-ku, Tokyo"
Search query 6: "Kato, January 1, 1990, D-ku, Tokyo"
Search query 7: "Sato, July 1, 1990, D-ku, Tokyo"
Search query 8: "Kato, July 1, 1990, D-ku, Tokyo"
In the above, the notation of the phonemes and the like of each search term is omitted.
 Here, let λ1 be the weight of the name query, λ2 the weight of the date-of-birth query, and λ3 the weight of the registered-domicile query. The accuracies of the above search queries are then calculated as follows.
Search query 1: λ1 × 0 + λ2 × 0 + λ3 × 0 = 0
Search query 2: λ1 × 1 + λ2 × 0 + λ3 × 0 = λ1
Search query 3: λ1 × 0 + λ2 × 1 + λ3 × 0 = λ2
Search query 4: λ1 × 1 + λ2 × 1 + λ3 × 0 = λ1 + λ2
Search query 5: λ1 × 0 + λ2 × 0 + λ3 × 1 = λ3
Search query 6: λ1 × 1 + λ2 × 0 + λ3 × 1 = λ1 + λ3
Search query 7: λ1 × 0 + λ2 × 1 + λ3 × 1 = λ2 + λ3
Search query 8: λ1 × 1 + λ2 × 1 + λ3 × 1 = λ1 + λ2 + λ3
For example, it is preferable to set each of the weights λ1, λ2, and λ3 to a large value for a search item that is prone to recognition errors. The weights λ1, λ2, and λ3 may all be 1.
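The weighted accuracy defined by the eight formulas above can be sketched as follows. This is an illustrative Python sketch; the weight values used in the usage example are assumptions.

```python
# Illustrative sketch: the accuracy of a search query is the weighted sum
# of the per-item inter-pronunciation distances, matching the formulas
# above. The weight values in the test are hypothetical.

def query_accuracy(distances, weights):
    """distances, weights: dicts keyed by search item."""
    return sum(weights[item] * dist for item, dist in distances.items())
```

For example, with all distances 0 (search query 1) the accuracy is 0 regardless of the weights, and with all distances 1 (search query 8) the accuracy is λ1 + λ2 + λ3.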
 FIG. 13 is a conceptual diagram for explaining the accuracy of the search queries calculated by the search device 20. In the example of FIG. 13, the voice data "Mr. Shibataro, born on January 1, 1990, registered domicile A-ku, Tokyo" is input to the search device 20. In the example of FIG. 13, the weight of each search item is 1.
 The search device 20 extracts the character strings "Shibataro," "Heiseininen'ichigatsutsuitachi" (January 1, 1990), and "Tokyoto-eku" (A-ku, Tokyo) for the respective search items from the text data based on the input voice data. The search device 20 generates search terms for each search item based on the inter-pronunciation distance. For example, for the name query, the search device 20 generates the search terms "Shibataro" with an inter-pronunciation distance of 0, "Kibataro" with an inter-pronunciation distance of 1, and so on. For the date-of-birth query, it generates "Heiseininen'ichigatsutsuitachi" with a distance of 0, "Heiseininen'shichigatsutsuitachi" (July 1, 1990) with a distance of 1, and so on. For the registered-domicile query, it generates "Tokyoto-eku" with a distance of 0, "Tokyoto-deku" (D-ku, Tokyo) with a distance of 1, and so on.
 The search device 20 generates search queries by combining the search terms generated for the respective search items. For example, the search device 20 generates the search query "Shibataro, Heiseininen'ichigatsutsuitachi, Tokyoto-eku," whose accuracy is 0, the sum of the inter-pronunciation distances of its search terms. For example, the search device 20 also generates the search query "Kibataro, Heiseininen'shichigatsutsuitachi, Tokyoto-deku," whose accuracy is 3, the sum of the inter-pronunciation distances of its search terms.
 For example, based on the accuracies of the generated search queries, the search unit 23 searches the DB 200 and generates the text data that serves as the source of the voice data output to a radio (not shown).
 (Operation)
 Next, the operation of the search device 20 will be described with reference to the drawings. The generation of a search query and a search using the generated query are described separately below. The following operations are examples and do not limit the operation of the search device 20.
 [Search query generation]
 FIG. 14 is a flowchart for explaining an example of search query generation by the search device 20. In the description along the flowchart of FIG. 14, the search device 20 is treated as the acting subject.
 In FIG. 14, the search device 20 first acquires voice data (step S211). For example, the search device 20 acquires voice data transmitted from a radio (not shown).
 Next, the search device 20 converts the acquired voice data into text data by voice recognition (step S212).
 Next, the search device 20 extracts the character strings corresponding to the search items from the text data (step S213).
 Next, based on the inter-pronunciation distance, the search device 20 generates search terms related to each extracted character string, for each search item (step S214).
 Next, the search device 20 generates a plurality of search queries by combining the search terms of the individual search items (step S215).
 Next, the search device 20 calculates the certainty of each generated search query from the inter-pronunciation distances of its search terms (step S216).
 Next, the search device 20 ranks the plurality of search queries according to their certainties (step S217).
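Step S213, extracting the character string for each search item from the recognized text, might look as follows in a minimal sketch. The transcript layout and the regular expressions are assumptions made for illustration; the patent does not specify an extraction method.

```python
import re

# Hypothetical transcript layout: "name <...>, born <...>, domicile <...>".
transcript = "name SHIBATAROU, born HEISEI2-01-01, domicile TOKYO-TO A-KU"

# One illustrative pattern per search item.
patterns = {
    "name": r"name\s+([A-Z0-9-]+)",
    "date_of_birth": r"born\s+([A-Z0-9-]+)",
    "domicile": r"domicile\s+([A-Z0-9- ]+)",
}

# Keep only the items whose pattern matched the transcript.
extracted = {item: m.group(1).strip()
             for item, pat in patterns.items()
             if (m := re.search(pat, transcript))}
print(extracted)
```

Each extracted string then seeds the search-term generation of step S214.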
 [Search]
 FIG. 15 is a flowchart for explaining an example in which the search device 20 searches the DB 200 using a generated search query. In the description along the flowchart of FIG. 15, the search device 20 is treated as the acting subject.
 In FIG. 15, the search device 20 first searches the DB 200 using a generated search query (step S251).
 When the search returns multiple hits (Yes in step S252), the search device 20 generates a plurality of text data items containing the hit search results (step S253). Next, the search device 20 converts each of the generated text data items into voice data (step S254). Then, the search device 20 outputs the plurality of voice data items in the certainty order of the search queries (step S255). For example, the plurality of voice data items output from the search device 20 are played back as voice by a radio (not shown) in the order corresponding to the certainty ranking of the search queries.
 On the other hand, when the search does not return multiple hits (No in step S252), the search device 20 generates text data according to the search result (step S256). "Not multiple hits" covers both the case of no hit and the case of exactly one hit. When there is no hit, the search device 20 generates text data indicating that no search result was obtained; when there is exactly one hit, it generates text data containing that search result. Next, the search device 20 converts the generated text data into voice data (step S257) and outputs the voice data (step S258). For example, the voice data output from the search device 20 is played back as voice by a radio (not shown).
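The branching of FIG. 15 (steps S252 to S258) can be sketched as below. The function name and the response strings are hypothetical; only the branching logic follows the text.

```python
def build_responses(hits, certainty_rank=None):
    """Return the texts to be converted to voice, in output order.

    hits           -- list of matching records from the DB search
    certainty_rank -- optional index order derived from query certainties
    """
    if len(hits) > 1:
        # Multiple hits: one text per hit, read back in certainty order.
        order = certainty_rank if certainty_rank is not None else range(len(hits))
        return [f"Candidate: {hits[i]}" for i in order]
    if len(hits) == 1:
        # Exactly one hit: a single text containing that result.
        return [f"Match found: {hits[0]}"]
    # No hit: a single text saying no result was obtained.
    return ["No matching record was found."]

print(build_responses([]))
print(build_responses(["record A"]))
print(build_responses(["record A", "record B"], certainty_rank=[1, 0]))
```

Each returned text would then be converted to voice data (steps S254/S257) before output.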
 As described above, the search unit of the present embodiment calculates, for each search query, a certainty that is the sum of the inter-pronunciation distances of its search terms, weighted per search item. The search unit ranks the search queries according to their certainties.
 According to the present embodiment, depending on the certainty of a search query, the device can either ask the speaker to confirm the search terms constituting the query or search the database directly. The present embodiment can therefore improve both search efficiency and search accuracy.
 In one aspect of the present embodiment, the search unit generates text data containing the search terms of the queries ranked by certainty. The conversion unit converts the generated text data into voice data and outputs the converted voice data in the certainty order of the originating search queries.
 The search device of the present embodiment can thus output voice data according to the certainty of the search queries. A user who hears the voice data output by the search device can, for example, recognize the search results in certainty order.
 (Third embodiment)
 Next, the search device of the third embodiment will be described with reference to the drawings. The search device of the present embodiment differs from the first and second embodiments in that it ranks search queries based on a dictionary of inter-pronunciation distances (also called an inter-pronunciation distance dictionary) for at least one search item.
 (Configuration)
 FIG. 16 is a block diagram showing an example of the configuration of the search device 30 of the present embodiment. The search device 30 includes an acquisition unit 31, a first conversion unit 32, a search unit 33, a dictionary 34, a second conversion unit 38, and an output unit 39. The acquisition unit 31 and the output unit 39 constitute an input/output unit 310. The first conversion unit 32 and the second conversion unit 38 constitute a conversion unit 320. FIG. 16 also shows the database (DB 300) connected to the search device 30. The DB 300 is connected to the search unit 33 via a network such as the Internet or an intranet, and stores a plurality of collation data items. The components of the search device 30 other than the search unit 33 and the dictionary 34 are the same as those of the search device 10 of the first embodiment, so their detailed description is omitted; the following description focuses on the search unit 33 and the dictionary 34.
 The dictionary 34 compiles the inter-pronunciation distances of the character strings corresponding to a search item (and is also called an inter-pronunciation distance dictionary). It is prepared in advance for each search item. For example, when the search item is a surname, the dictionary 34 registers the surnames recorded in a national name dictionary, family registers, and the like, together with a ranking of those surnames according to their mutual inter-pronunciation distances. For example, when N surnames are stored, a ranking of the other surnames by inter-pronunciation distance is registered in the dictionary 34 for each of the N surnames (N is a natural number). Although a kanji used in surnames can have more than one reading, one reading is assumed to be associated with each kanji here. In other words, the dictionary 34 contains, for each stored character string, a data series in which the other character strings are arranged in rank order according to their inter-pronunciation distance from it.
 For example, for each of Yamada (YAMADA), Satou (SATOU), and Katou (KATOU), the Yamada-Satou, Satou-Katou, and Katou-Yamada inter-pronunciation distances are defined. The distance between Satou (SATOU) and Katou (KATOU), for example, is 1 because one phoneme is substituted.
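The inter-pronunciation distance described here behaves like a phoneme-level edit (Levenshtein) distance over romanized readings. A minimal sketch, treating each letter of the romanized reading as one phoneme-like unit (an assumption for illustration):

```python
def pronunciation_distance(a: str, b: str) -> int:
    """Minimum number of unit insertions, deletions, or substitutions
    needed to turn reading a into reading b (Wagner-Fischer DP)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a's prefix
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

print(pronunciation_distance("SATOU", "KATOU"))  # 1 (one substitution)
```

This reproduces the SATOU/KATOU example: the two readings differ by exactly one substituted unit.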
 FIG. 17 shows an example of a table contained in the dictionary 34 for the surname search item (inter-pronunciation distance dictionary 340). In the inter-pronunciation distance dictionary 340, the ranking associated with Satou is, for example, Katou in first place, Saitou in second place, and so on. Note that FIG. 17 is conceptual and does not accurately reflect the ranking of actual surnames.
 For example, when the character string recognized from the voice data is "Satou", the search unit 33 extracts character strings highly ranked for "Satou" as search terms. For example, the search unit 33 extracts, from the "Satou" field of the inter-pronunciation distance dictionary 340, the character strings ranked up to the M-th place as search terms (M is a natural number). Alternatively, the search unit 33 may extract, from the same field, the character strings whose inter-pronunciation distance is within X as search terms (X is a natural number).
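The lookup described here can be sketched as a table of pre-ranked neighbours. The entries below are illustrative only, as the text notes for FIG. 17, and the function name is hypothetical:

```python
# Pre-ranked neighbours per surname reading, sorted by inter-pronunciation
# distance at dictionary-build time (ranks 1, 2, 3, ... left to right).
distance_dictionary = {
    "SATOU": ["KATOU", "SAITOU", "GOTOU"],
    "KATOU": ["SATOU", "SAITOU", "KITOU"],
}

def select_search_terms(recognized: str, top_m: int):
    """Return the recognized string plus its top-M ranked neighbours.
    No distance is computed at query time, which is the point of
    this embodiment."""
    return [recognized] + distance_dictionary.get(recognized, [])[:top_m]

print(select_search_terms("SATOU", 2))  # ['SATOU', 'KATOU', 'SAITOU']
```

A string absent from the dictionary simply yields itself, with no neighbours.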
 The dictionary 34 may also include a spelling alphabet (tsuuwahyou), a table of rules established to prevent mishearing in wireless communication and the like. FIG. 18 shows a spelling alphabet 360 containing the contents of the table in Appendix 5 of the Radio Station Operation Regulations. For example, to avoid mishearing "a", the speaker says "Asahi no a"; to avoid mishearing "shi", the speaker says "shinbun no shi". The spelling alphabet 360 may include data on numerals and symbols as well as characters, and may cover not only Japanese but also letters, numerals, and symbols of Western scripts such as the Latin alphabet.
 FIG. 19 is a conceptual diagram showing an example in which the search device 30 asks the speaker to confirm a name recognized by voice recognition. For example, the search device 30 outputs the confirmation voice data according to the score or certainty of the search term. FIG. 20 shows an example in which, because the name in the confirmation voice data of FIG. 19 was wrong, the correct voice data was returned via a radio (not shown). In the example of FIG. 20, the search device 30 refers to the entry "shinbun no shi" registered in the spelling alphabet 360 and recognizes the correct name "Shibatarou". If it is agreed in advance that replies to such confirmations follow the patterns registered in the spelling alphabet 360, the correctness of a character string recognized by voice recognition can easily be checked against those patterns.
 (Operation)
 Next, the operation of the search device 30 will be described with reference to the drawings. The generation of a search query is described below. The following operations are examples and do not limit the operation of the search device 30.
 [Search query generation]
 FIG. 21 is a flowchart for explaining an example of search query generation by the search device 30. In the description along the flowchart of FIG. 21, the search device 30 is treated as the acting subject.
 In FIG. 21, the search device 30 first acquires voice data (step S311). For example, the search device 30 acquires voice data output from a radio (not shown).
 Next, the search device 30 converts the acquired voice data into text data by voice recognition (step S312).
 Next, the search device 30 extracts the character strings corresponding to the search items from the text data (step S313).
 Next, the search device 30 refers to the inter-pronunciation distance dictionary of each search item and selects search terms based on their ranked inter-pronunciation distances from the extracted character string (step S314).
 Next, the search device 30 generates a plurality of search queries by combining the search terms selected for the individual search items (step S315).
 As described above, for each of the character strings relevant to a search item, the search unit of the present embodiment refers to an inter-pronunciation distance dictionary in which the other character strings relevant to that item are ranked according to inter-pronunciation distance. Referring to the dictionary, the search unit selects as search terms the character strings highly ranked with respect to the character string extracted from the text data.
 According to the present embodiment, since search terms are selected by referring to the inter-pronunciation distance dictionary, processing such as computing inter-pronunciation distances can be omitted. The present embodiment can therefore speed up the generation of search terms and search queries.
 (Fourth embodiment)
 Next, the search device of the fourth embodiment will be described with reference to the drawings. The present embodiment gives concrete form to the exchange between the search device and a radio, using an identity inquiry by a police officer as an example.
 (Configuration)
 FIG. 22 is a block diagram showing an example of the configuration of the search device 40 of the present embodiment. The search device 40 includes an input/output unit 41, a first conversion unit 42, a search unit 43, a registration information recording unit 44, and a second conversion unit 48. The first conversion unit 42 and the second conversion unit 48 constitute a conversion unit 420. FIG. 22 also shows a radio 450 that exchanges voice data with the search device 40, and a database group (DB group 400) connected to the search device 40. Each of the DBs constituting the DB group 400 is connected to the search unit 43 via a network such as the Internet or an intranet. The DB group 400 includes a plurality of DBs, one per inquiry type, each storing a plurality of collation data items for its inquiry type. The radio 450 exchanges voice data with the search device 40; although only one radio 450 is shown in FIG. 22, the search device 40 can exchange voice data with a plurality of radios 450, and the radio 450 may itself include part or all of the configuration of the search device 40. The main configuration of the search device 40 is the same as that of the search device 10 of the first embodiment, so its detailed description is omitted; the following description focuses on the exchange of voice data between the radio 450 and the search device 40.
 The input/output unit 41 acquires voice data based on a radio signal transmitted from the radio 450 and outputs the voice data to the first conversion unit 42. The input/output unit 41 also outputs the voice data acquired from the second conversion unit 48.
 For example, the radio 450 transmits a radio signal containing voice data by wireless communication in a specific frequency band. The radio signal transmitted from the radio 450 is converted into an electric signal via an antenna, an amplifier, a demodulator, and the like (not shown), and is input to the input/output unit 41 of the search device 40 via a network such as the Internet or an intranet. The input/output unit 41, in turn, outputs the voice data acquired from the second conversion unit 48 toward the radio 450.
 The registration information of the radio 450 is recorded in the registration information recording unit 44. For example, the registration information is the user identifier of the user of the radio 450 or the device identifier of the radio 450. The search device 40 exchanges voice data with a radio 450 whose transmitted identification information matches the registration information recorded in the registration information recording unit 44.
 The search unit 43 executes processing according to the content of the text data acquired from the first conversion unit 42. In the present embodiment, the text data acquired by the search unit 43 contains identification information, an inquiry type, inquiry information, and the like. According to the content of the acquired text data, the search unit 43 performs processing such as generating text data containing a response to the sender of the original voice data, or searching a DB in the DB group 400. The text data generated by the search unit 43 is converted into voice data by the second conversion unit 48 and output from the input/output unit 41 toward the radio 450.
 When the voice data from the radio 450 contains identification information, the search unit 43 refers to the registration information recording unit 44 and determines whether that identification information is recorded there. If it is, the search unit 43 generates text data asking for the inquiry type. If it is not, the search unit 43 generates, for example, text data notifying the sender that the identification information did not match, or text data instructing the sender to resend the identification information. The text data generated by the search unit 43 is converted into voice data by the second conversion unit 48 and output toward the radio 450.
 When the voice data from the radio 450 contains an inquiry type, the search unit 43 generates text data asking the sender of that inquiry type for the inquiry content. For example, the search unit 43 may include a read-back of the inquiry type in the text data asking for the inquiry content. The generated text data is converted into voice data by the second conversion unit 48 and output toward the radio 450.
 When the voice data from the radio 450 contains inquiry information, the search unit 43 extracts the character strings corresponding to the search items from the text data and generates search terms related to the extracted character strings based on the inter-pronunciation distance. Depending on how well the strings for the search items could be extracted, the search unit 43 also generates text data asking the sender to confirm the search term closest in inter-pronunciation distance to an extracted string, or text data asking the sender to restate a search item that could not be extracted.
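The content-dependent handling by the search unit 43 can be sketched as a simple dispatch. The message classification, identifiers, and reply strings below are hypothetical simplifications of the exchange described here and in FIGS. 23 and 24, not the patent's protocol.

```python
# Stand-in for the registration information recording unit 44.
REGISTERED_IDS = {"P-0001", "P-0002"}

def handle(kind: str, payload: str) -> str:
    """Return the response text to be converted to voice, depending on
    what the incoming (already recognized) message contained."""
    if kind == "identification":
        if payload in REGISTERED_IDS:
            return "Please state the inquiry type."
        return "Identification not recognized; please resend."
    if kind == "inquiry_type":
        # Read back the type and ask for the inquiry content.
        return f"{payload}, correct? If so, state the subject's details."
    if kind == "inquiry_info":
        return "Generating search terms from the stated details."
    return "Unrecognized message."

print(handle("identification", "P-0001"))
print(handle("identification", "P-9999"))
print(handle("inquiry_type", "comprehensive inquiry"))
```

Each returned string would pass through the second conversion unit 48 before being output toward the radio 450.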
 (Operation)
 Next, the interplay among the radio 450, the search device 40, and the DBs in the DB group 400 will be described using sequence diagrams. The flow up to the generation of search terms and the processing executed afterwards are described below with examples. In the following, it is assumed that the user handling the radio 450 speaks information into it in response to the voice data from the search device 40. The interplay described below is an example and does not limit the operations or their interrelations.
 FIG. 23 is a sequence diagram showing an example of the flow from the connection of the radio 450 to the search device 40 up to the generation of search terms by the search device 40.
 First, the radio 450 connects to the search device 40 (step S411). The method by which the radio 450 connects to the search device 40 is not particularly limited.
 When the search device 40 detects the connection of the radio 450, it outputs voice data containing a request for identification information to that radio 450 (step S412), for example, "This is an automated response. Please state your P number, affiliation, and name." The P number is an identifier that uniquely identifies a police officer.
 When the radio 450 receives the voice data requesting identification information, it outputs voice data containing the identification information spoken in response to the search device 40 (step S413). For example, identification information such as "P number XX, XX station, regional section X, XX" is spoken into the radio 450, which then outputs voice data containing that identification information to the search device 40.
 When the search device 40 receives the voice data containing the identification information, it checks whether that identification information is registered in the registration information recording unit 44 (step S414). If it is, the search device 40 outputs voice data requesting the inquiry type to the radio 450 (step S415), for example, "P number XXXX, Mr. XX, please state the inquiry type."
 When the radio 450 receives the voice data requesting the inquiry type, it outputs voice data containing the inquiry type spoken in response to the search device 40 (step S416), for example, "This is a comprehensive inquiry, based on exemption."
 When the search device 40 receives the voice data containing the inquiry type, it confirms the inquiry type (step S417) and outputs voice data requesting the inquiry information to the radio 450 (step S418), for example, "Comprehensive inquiry, correct? If the inquiry type is wrong, please correct it. Otherwise, please state the subject's name, date of birth, and so on."
 When the radio 450 receives the voice data requesting the inquiry information, it outputs voice data containing the inquiry information spoken in response to the search device 40 (step S419), for example, "Shibatarou, born January 1, Heisei 2, registered domicile A-ku, Tokyo."
 When the search device 40 receives the voice data containing the inquiry information, it extracts the character strings corresponding to the search items of the inquiry information from the text data based on the acquired voice data, and generates search terms from the extracted character strings based on the inter-pronunciation distance (step S420).
 図24は、検索装置40による検索語の生成から、照合結果の出力までの流れの一例を示すシーケンス図である。図24のシーケンス図は、図23のステップS420の検索語生成に後続する処理に関する。 FIG. 24 is a sequence diagram showing an example of the flow from the generation of the search term by the search device 40 to the output of the collation result. The sequence diagram of FIG. 24 relates to a process following the generation of the search term in step S420 of FIG.
 図24において、検索装置40は、検索語を生成すると(ステップS420)、検索語の確認内容を含む音声データを無線機450に出力する(ステップS421)。例えば、検索装置40は、「キバタロウ、東京都A区で検索します。誤りがあれば、誤り部分の訂正お願いします。」といった検索語の確認内容を含む音声データを無線機450に出力する。 In FIG. 24, when the search device 40 generates a search term (step S420), the search device 40 outputs voice data including the confirmation content of the search term to the radio device 450 (step S421). For example, the search device 40 outputs voice data including confirmation contents of a search term such as "Search in Kibataro, A-ku, Tokyo. If there is an error, please correct the error part." To the radio 450. ..
 The search device 40 also combines the generated search terms for each search item to generate a plurality of search queries (step S422). Using at least one of the generated search queries, the search device 40 searches, among the DBs included in the DB group 400, the DB storing the collation data of the inquiry type being processed (step S423).
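The query-combination step above amounts to taking the Cartesian product of the candidate terms generated for each search item. A minimal sketch, assuming example item names and candidate terms (the actual items and values depend on the inquiry):

```python
# Sketch of step S422: combine the search terms generated for each search
# item into a plurality of search queries (Cartesian product).
from itertools import product

# Candidate search terms per search item, e.g. produced in step S420.
terms_per_item = {
    "name": ["Kibataro", "Shibataro"],
    "address": ["A-ku, Tokyo"],
    "birth_date": ["1990-01-01"],
}

def build_queries(terms_per_item: dict) -> list:
    """Return one query dict per combination of per-item candidates."""
    items = sorted(terms_per_item)
    queries = []
    for combo in product(*(terms_per_item[i] for i in items)):
        queries.append(dict(zip(items, combo)))
    return queries

queries = build_queries(terms_per_item)
# Two candidate names x one address x one birth date -> two queries.
```

Any one of the resulting queries (or all of them, as in FIG. 25) can then be issued against the DB storing the collation data.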
 When the radio 450 receives the voice data containing the confirmation content of the search terms, it outputs voice data containing a response entered by voice in reply to that content to the search device 40 (step S424). For example, the radio 450 outputs voice data containing a response such as "That is correct." Note that step S424 may be omitted if there is no error in the confirmation content of the search terms.
 If collation data of the inquiry type being processed is stored in the DB, the search device 40 acquires the search result from the DB (step S425). The search device 40 outputs a collation result corresponding to the search result to the radio 450 (step S426). For example, if the search hits, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, corresponds to XX." If the search does not hit, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, does not correspond to XX."
 According to the example of FIG. 24, a search result can be obtained using a search query composed of search terms confirmed to be correct, so search accuracy can be improved.
 FIG. 25 is a sequence diagram showing another example of the flow from the generation of search terms by the search device 40 to the output of a collation result. FIG. 25 shows an example in which the confirmation of the search terms and the DB search using a plurality of search queries are performed in parallel. The confirmation and the search may be performed at the same timing or at slightly different timings. The sequence diagram of FIG. 25 relates to the processing that follows the search term generation in step S420 of FIG. 23.
 In FIG. 25, after generating the search terms (step S420), the search device 40 combines the generated search terms for each search item to generate a plurality of search queries (step S431).
 The search device 40 outputs voice data containing confirmation content for the search terms to the radio 450 (step S432). For example, the search device 40 outputs voice data containing confirmation content such as "Searching for Kibataro, A-ku, Tokyo. If there is an error, please correct the erroneous part." Step S432 may be performed in parallel with step S433, or may precede it.
 In parallel with step S432, the search device 40 uses the plurality of generated search queries to search, among the DBs included in the DB group 400, the DB storing the collation data of the inquiry type being processed (step S433). At this time, the search device 40 searches the DB using all of the generated search queries. If collation data of the inquiry type being processed is stored in the DB, the search device 40 acquires the search results from the DB (step S434).
 When the radio 450 receives the voice data containing the confirmation content of the search terms from the search device 40, it outputs voice data containing a response (confirmation result) entered by voice in reply to that content to the search device 40 (step S435). For example, if a search term is erroneous, voice information containing a correction of the search term, such as "The name is Shibataro, with shi as in shinbun (newspaper)," is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40. Note that step S435 may be omitted if there is no error in the confirmation content of the search terms.
 If a search term was erroneous, the search device 40 receives voice data containing the correction of the search term from the radio 450. The search device 40 outputs, to the radio 450, the collation result corresponding to the search result hit by the search using the search query composed of the correct search terms (step S436). For example, if the search hits, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, corresponds to XX." If the search does not hit, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, does not correspond to XX."
 According to the example of FIG. 25, the DB is searched using a plurality of search queries in parallel with the confirmation of the search terms; although searches using some of the queries may yield erroneous results, the correct search result is also obtained. According to the example of FIG. 25, it suffices to select, in accordance with the confirmation result of the search terms, the search result obtained with the search query composed of the correct search terms, so search efficiency is improved.
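The FIG. 25 pattern, searching with every candidate query while the spoken confirmation is still pending and then keeping only the confirmed query's result, can be sketched with a thread pool. This is an illustrative sketch only: the in-memory dictionary stands in for the collation-data DB, and the tuple-shaped queries are examples.

```python
# Sketch of the FIG. 25 flow (steps S433-S436): run the DB search for every
# candidate query without waiting for the spoken confirmation, then select
# the result of the query the speaker confirmed as correct.
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the DB storing collation data of the inquiry type.
FAKE_DB = {("Shibataro", "A-ku, Tokyo"): "corresponds to XX"}

def search_db(query):
    """One DB lookup for one candidate query."""
    return query, FAKE_DB.get(query, "no hit")

def parallel_search(queries, confirmed_query):
    # All candidate queries are searched concurrently (step S433).
    with ThreadPoolExecutor() as pool:
        results = dict(pool.map(search_db, queries))
    # Only the result for the confirmed query is reported (step S436).
    return results[confirmed_query]

queries = [("Kibataro", "A-ku, Tokyo"), ("Shibataro", "A-ku, Tokyo")]
result = parallel_search(queries, confirmed_query=("Shibataro", "A-ku, Tokyo"))
```

Because every candidate's result is already available by the time the confirmation arrives, selecting the correct one adds no further DB round trip, which is the efficiency gain the paragraph above describes.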
 FIG. 26 is a sequence diagram showing still another example of the flow from the generation of search terms by the search device 40 to the output of a collation result. The sequence diagram of FIG. 26 relates to the processing that follows the search term generation in step S420 of FIG. 23.
 In FIG. 26, after generating the search terms (step S420), the search device 40 outputs voice data containing confirmation content for the search terms to the radio 450 (step S441). For example, the search device 40 outputs voice data containing confirmation content such as "Searching for Kibataro, A-ku, Tokyo. If there is an error, please correct the erroneous part."
 The search device 40 also combines the generated search terms for each search item to generate a plurality of search queries (step S442). Using at least one of the generated search queries, the search device 40 searches, among the DBs included in the DB group 400, the DB storing the collation data of the inquiry type being processed (step S443).
 When the radio 450 receives the voice data containing the confirmation content of the search terms from the search device 40, it outputs voice data containing a response (confirmation result) entered by voice in reply to that content to the search device 40 (step S444). For example, if a search term is erroneous, voice information containing a correction such as "The name is Shibataro, with shi as in shinbun (newspaper)." is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40.
 If a search term was erroneous, the search device 40 receives voice data containing the correction from the radio 450. The search device 40 selects another search term based on the correction and outputs voice data containing confirmation content for the new search term to the radio 450 (step S445). For example, the search device 40 outputs voice data containing confirmation content such as "Searching for Shibataro. If there is an error, please correct it." The search device 40 also searches, using the search query composed of the search terms selected based on the correction, the DB storing the collation data of the inquiry type being processed among the DBs included in the DB group 400 (step S446).
 When the radio 450 receives the voice data containing the reconfirmation content of the search terms from the search device 40, it outputs voice data containing a response (confirmation result) entered by voice in reply to that content to the search device 40 (step S447). For example, if the search terms are correct, voice information such as "That is correct." is input to the radio 450. The radio 450 outputs voice data corresponding to the input voice information to the search device 40.
 If the search terms were correct, the search device 40 receives voice data indicating that the search terms were correct from the radio 450. The search device 40 acquires from the DB the search results hit by the search using the search query composed of the correct search terms (step S448), and outputs the collation result corresponding to those search results to the radio 450 (step S449). For example, if the search hits, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, corresponds to XX." If the search does not hit, the search device 40 outputs voice data containing a collation result such as "Mr. Shibataro, born January 1, 1990, registered domicile A-ku, Tokyo, does not correspond to XX."
 According to the example of FIG. 26, the DB can be searched using a search query composed of search terms confirmed to be correct, so search accuracy is improved. Furthermore, according to the example of FIG. 26, when a search term is wrong, the search using the search query containing the wrong search term can be aborted, so search efficiency is improved.
 (Fifth Embodiment)
 Next, the search device of the fifth embodiment will be described with reference to the drawings. The search device of the present embodiment is a simplified configuration of the search devices of the first to fourth embodiments. FIG. 27 is a block diagram showing an example of the configuration of the search device 50 of the present embodiment. The search device 50 includes a conversion unit 52 and a search unit 53.
 The conversion unit 52 converts input voice data into text data by voice recognition. The search unit 53 extracts character strings corresponding to search items from the text data. Based on the distance from each extracted character string, the search unit 53 generates search terms related to the character string for each search item. The search unit 53 then combines the search terms generated for each search item to generate a plurality of search queries.
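The data flow between the two units of the search device 50 can be sketched as follows. This is purely structural: the speech recognizer, the item-extraction rule, and the "key=value" text format are stubbed placeholders, and the distance-based generation of term variants (step S420) is omitted.

```python
# Structural sketch of the search device 50 of FIG. 27: a conversion unit
# (52) feeding a search unit (53). Recognition and extraction are stubbed.

class ConversionUnit:
    """Corresponds to the conversion unit 52: voice data -> text data."""
    def convert(self, voice_data: bytes) -> str:
        # Stand-in for a real speech recognizer.
        return voice_data.decode("utf-8")

class SearchUnit:
    """Corresponds to the search unit 53."""
    def __init__(self, search_items):
        self.search_items = search_items  # e.g. ["name", "address"]

    def extract(self, text: str) -> dict:
        # Stub extraction over "name=Shibataro;address=A-ku"-style text.
        pairs = (field.split("=") for field in text.split(";"))
        return {k: v for k, v in pairs if k in self.search_items}

    def build_queries(self, extracted: dict) -> list:
        # Single-candidate stub; the real unit would combine distance-based
        # term variants per item into a plurality of queries.
        return [extracted]

conv = ConversionUnit()
search = SearchUnit(["name", "address"])
text = conv.convert(b"name=Shibataro;address=A-ku")
queries = search.build_queries(search.extract(text))
```

The sketch shows only the interface between the units; any of the first to fourth embodiments' components (acquisition unit, second conversion unit, output unit) can be layered around the same two-stage core.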
 According to the present embodiment, a plurality of search queries composed of search terms corresponding to search items can be generated based on arbitrary voice data.
 (Hardware)
 Here, a hardware configuration for executing the processing of the search device according to each embodiment of the present invention will be described, taking the information processing device 90 of FIG. 28 as an example. The information processing device 90 of FIG. 28 is a configuration example for executing the processing of the search device of each embodiment, and does not limit the scope of the present invention.
 As shown in FIG. 28, the information processing device 90 includes a processor 91, a main storage device 92, an auxiliary storage device 93, an input/output interface 95, and a communication interface 96. In FIG. 28, interface is abbreviated as I/F (Interface). The processor 91, the main storage device 92, the auxiliary storage device 93, the input/output interface 95, and the communication interface 96 are connected to one another via a bus 98 so as to be capable of data communication. The processor 91, the main storage device 92, the auxiliary storage device 93, and the input/output interface 95 are also connected to a network such as the Internet or an intranet via the communication interface 96.
 The processor 91 loads a program stored in the auxiliary storage device 93 or the like into the main storage device 92 and executes the loaded program. In the present embodiment, a software program installed in the information processing device 90 may be used. The processor 91 executes the processing of the search device according to the present embodiment.
 The main storage device 92 has an area into which programs are loaded. The main storage device 92 may be a volatile memory such as a DRAM (Dynamic Random Access Memory). A non-volatile memory such as an MRAM (Magnetoresistive Random Access Memory) may also be configured or added as the main storage device 92.
 The auxiliary storage device 93 stores various data. The auxiliary storage device 93 is composed of a local disk such as a hard disk or a flash memory. It is also possible to store the various data in the main storage device 92 and omit the auxiliary storage device 93.
 The input/output interface 95 is an interface for connecting the information processing device 90 to peripheral devices. The communication interface 96 is an interface for connecting to external systems and devices through a network such as the Internet or an intranet, based on standards and specifications. The input/output interface 95 and the communication interface 96 may be unified as a common interface for connecting to external devices.
 Input devices such as a keyboard, a mouse, or a touch panel may be connected to the information processing device 90 as necessary. These input devices are used to input information and settings. When a touch panel is used as an input device, the display screen of the display device may also serve as the interface of the input device. Data communication between the processor 91 and the input devices may be mediated by the input/output interface 95.
 The information processing device 90 may also be equipped with a display device for displaying information. When a display device is provided, the information processing device 90 preferably includes a display control device (not shown) for controlling the display of the display device. The display device may be connected to the information processing device 90 via the input/output interface 95.
 The above is an example of a hardware configuration for enabling the search device according to each embodiment of the present invention. The hardware configuration of FIG. 28 is an example of a hardware configuration for executing the arithmetic processing of the search device according to each embodiment, and does not limit the scope of the present invention. A program that causes a computer to execute the processing relating to the search device according to each embodiment is also included in the scope of the present invention. Furthermore, a recording medium on which the program according to each embodiment is recorded is also included in the scope of the present invention. The recording medium can be realized, for example, by an optical recording medium such as a CD (Compact Disc) or a DVD (Digital Versatile Disc). The recording medium may also be realized by a semiconductor recording medium such as a USB (Universal Serial Bus) memory or an SD (Secure Digital) card, a magnetic recording medium such as a flexible disk, or another recording medium.
 The components of the search devices of the embodiments can be combined arbitrarily. The components of the search devices of the embodiments may be realized by software or by circuits.
 Although the present invention has been described above with reference to the embodiments, the present invention is not limited to the above embodiments. Various modifications that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
 10, 20, 30, 40  Search device
 11, 21, 31  Acquisition unit
 12, 22, 32, 42  First conversion unit
 13, 23, 33, 43  Search unit
 18, 28, 38, 48  Second conversion unit
 19, 29, 39  Output unit
 34  Dictionary
 41  Input/output unit
 44  Registration information recording unit
 52  Conversion unit
 53  Search unit
 90  Information processing device
 91  Processor
 92  Main storage device
 93  Auxiliary storage device
 95  Input/output interface
 96  Communication interface
 98  Bus
 100, 200, 300  DB
 400  DB group

Claims (10)

  1.  A search device comprising:
     a conversion means that converts input voice data into text data by voice recognition; and
     a search means that extracts a character string corresponding to a search item from the text data, generates, for each search item, a search term related to the character string based on a distance from the extracted character string, and combines the search terms generated for each search item to generate a plurality of search queries.
  2.  The search device according to claim 1, wherein the search means generates, as the search term, a character string whose inter-pronunciation distance from the extracted character string is small, based on an inter-pronunciation distance derived from differences between the phonemes constituting the character string extracted from the text data.
  3.  The search device according to claim 2, wherein the search means calculates the inter-pronunciation distance using inter-phoneme distances predefined between two phonemes.
  4.  The search device according to claim 2 or 3, wherein the search means refers to an inter-pronunciation distance dictionary in which, for each of a plurality of character strings corresponding to the search item, a plurality of other character strings corresponding to the search item are ranked according to the inter-pronunciation distance, and selects, as the search term, a character string having a high rank in the inter-pronunciation distance dictionary with respect to the character string extracted from the text data.
  5.  The search device according to any one of claims 2 to 4, wherein the search means calculates, for each search query, a confidence that is the sum of the inter-pronunciation distances of the search terms weighted for each search item, and ranks the search queries according to the confidence.
  6.  The search device according to claim 5, wherein the search means generates text data containing the search terms constituting the search queries ranked according to the confidence, and the conversion means converts the generated text data into the voice data and outputs, in order, the voice data converted from the text data according to the confidence ranking of the search query from which the text data was generated.
  7.  The search device according to any one of claims 1 to 6, wherein the search means assigns a score based on the voice recognition to the search term and generates text data containing the search term to which the score has been assigned, and the conversion means converts the generated text data into the voice data and outputs the voice data converted from the text data.
  8.  The search device according to any one of claims 1 to 7, wherein the search means searches, using at least one of the search queries, a database in which collation data including the search item is accumulated, and generates text data corresponding to the search result, and the conversion means converts the generated text data into the voice data and outputs the voice data converted from the text data.
  9.  A search method in which a computer:
     converts input voice data into text data by voice recognition;
     extracts a character string corresponding to a search item from the text data;
     generates, for each search item, a search term related to the character string based on a distance from the extracted character string; and
     combines the search terms generated for each search item to generate a plurality of search queries.
  10.  A non-transitory recording medium on which a program is recorded, the program causing a computer to execute:
     a process of converting input voice data into text data by voice recognition;
     a process of extracting a character string corresponding to a search item from the text data;
     a process of generating, for each search item, a search term related to the character string based on a distance from the extracted character string; and
     a process of combining the search terms generated for each search item to generate a plurality of search queries.
PCT/JP2020/022971 2020-06-11 2020-06-11 Search device, search method, and recording medium WO2021250837A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2020/022971 WO2021250837A1 (en) 2020-06-11 2020-06-11 Search device, search method, and recording medium
JP2022530448A JP7485030B2 (en) 2020-06-11 Search device, search method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/022971 WO2021250837A1 (en) 2020-06-11 2020-06-11 Search device, search method, and recording medium

Publications (1)

Publication Number Publication Date
WO2021250837A1 true WO2021250837A1 (en) 2021-12-16

Family

ID=78847087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/022971 WO2021250837A1 (en) 2020-06-11 2020-06-11 Search device, search method, and recording medium

Country Status (1)

Country Link
WO (1) WO2021250837A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012063536A (en) * 2010-09-15 2012-03-29 Ntt Docomo Inc Terminal device, speech recognition method and speech recognition program
WO2015037098A1 (en) * 2013-09-12 2015-03-19 株式会社 東芝 Electronic device, method and program
WO2015040793A1 (en) * 2013-09-20 2015-03-26 三菱電機株式会社 Character string retrieval device
JP2017167270A (en) * 2016-03-15 2017-09-21 本田技研工業株式会社 Sound processing device and sound processing method

Also Published As

Publication number Publication date
JPWO2021250837A1 (en) 2021-12-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20940078

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022530448

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20940078

Country of ref document: EP

Kind code of ref document: A1