JP6597156B2

JP6597156B2 - Information generation system

Info

Publication number: JP6597156B2
Application number: JP2015203864A
Authority: JP
Inventors: 貴裕岩田; 優樹瀬戸; 友美子越智; 哲朗石田; 翔太森口; 裕之岩瀬
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2015-10-15
Filing date: 2015-10-15
Publication date: 2019-10-30
Anticipated expiration: 2035-10-15
Also published as: JP2017076279A

Description

The present invention relates to a technique for generating information from a guidance voice.

Techniques that provide various kinds of information to a user according to the result of speech recognition on an uttered voice have been proposed. For example, Patent Literature 1 discloses a configuration that performs speech recognition on a user's uttered voice and specifies the display area, display scale, and the like of a map centered on a target point identified as the result of the speech recognition.

Patent Literature 1: Japanese Patent Laid-Open No. 03-175478

If a guidance voice broadcast in a commercial facility such as a shopping mall could be subjected to speech recognition, and related information such as the recognized character string or its translation could be provided to the terminal devices of visitors to the facility, this would be convenient for hearing-impaired people who have difficulty hearing the guidance voice and for foreigners who cannot understand its language. In practice, however, perfect speech recognition is difficult to achieve, and misrecognition may occur due to, for example, speaking characteristics (habits) peculiar to individual speakers or background noise at the time of sound collection. In view of these circumstances, an object of the present invention is to provide the user with appropriate related information corresponding to a guidance voice.

To solve the above problem, an information generation system according to the present invention is a system that generates distribution information to be transmitted to a terminal device so that the terminal device presents, to a user, related information relating to a guidance voice. The system comprises: a character string specifying unit that, for each of a plurality of recognized character strings obtained by speech recognition of the guidance voice, specifies, from among a plurality of registered character strings representing mutually different pronunciation contents, the registered character string similar to that recognized character string; and an information generation unit that generates distribution information for the terminal device to present to the user related information corresponding to, among a plurality of pieces of correspondence information designating mutually different combinations of registered character strings, the correspondence information that corresponds to the combination of registered character strings specified by the character string specifying unit.

In this configuration, the registered character string similar to each recognized character string obtained by speech recognition of the guidance voice is specified from among the plurality of registered character strings. Therefore, compared with a configuration that generates distribution information for presenting related information based on the recognized character strings themselves, it is possible to generate distribution information for presenting related information that is free from the influence of misrecognition. Furthermore, the distribution information is generated for related information that corresponds to correspondence information designating mutually different combinations of registered character strings. Therefore, compared with a configuration that generates distribution information for presenting related information based directly on the registered character strings specified by the character string specifying unit, even when the guide makes an error in units of sentences (for example, forgets to utter a necessary sentence or utters an unnecessary one), it is possible to generate distribution information for presenting appropriate related information according to the correspondence information that corresponds to the combination of registered character strings specified from the recognition result of the guidance voice.
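The selection of correspondence information described above can be illustrated with a short Python sketch. The description here does not state how the matching combination is chosen when a sentence is missing or extra, so the Jaccard overlap and the threshold below are assumptions made purely for illustration, and the identifiers (`X1`, `X2`, ...) and entry names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Hypothetical correspondence information: each entry designates a different
# combination of registered-string identifiers (all names are illustrative).
CORRESPONDENCE: Dict[str, FrozenSet[str]] = {
    "closing": frozenset({"X1", "X2", "X4"}),
    "lost_child": frozenset({"X1", "X3", "X5"}),
}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap of two identifier sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_correspondence(identified: Iterable[str],
                          threshold: float = 0.5) -> Optional[str]:
    """Pick the entry whose combination best overlaps the identified strings.

    Assumption: a tolerant overlap measure stands in for the matching rule,
    so that one forgotten or one extra sentence still selects the same entry.
    """
    found = frozenset(identified)
    best_name: Optional[str] = None
    best_score = 0.0
    for name, combination in CORRESPONDENCE.items():
        score = jaccard(found, combination)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

Under these assumptions, a guide who forgets sentence `X2` (identified strings `X1`, `X4`) still yields the `closing` entry, while a lone shared greeting `X1` is treated as too ambiguous and yields no entry.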

In a preferred aspect of the present invention, the information generation unit generates distribution information for the terminal device to present to the user related information indicating, among a plurality of guidance sentences prepared for the respective pieces of correspondence information, the guidance sentence of the correspondence information that corresponds to the combination of registered character strings specified by the character string specifying unit. In this configuration, the related information presented to the user indicates the guidance sentence associated with the matching correspondence information. Therefore, it is possible to generate distribution information for presenting to the user related information indicating a guidance sentence that is prepared independently of the combination of registered character strings designated by the correspondence information.

In a preferred aspect of the present invention, the information generation unit generates distribution information for the terminal device to present to the user related information indicating the plurality of registered character strings designated by the correspondence information that corresponds to the combination of registered character strings specified by the character string specifying unit. In this configuration, the related information presented to the user indicates the registered character strings that the correspondence information itself designates. Therefore, compared with a configuration that generates distribution information for presenting related information indicating the guidance sentence of the matching correspondence information, there is the advantage that information for uniquely designating a guidance sentence is unnecessary.

In a preferred aspect of the present invention, the correspondence information designates a combination of a plurality of registered character strings and the order of the registered character strings in that combination, and the information generation unit generates distribution information for the terminal device to present to the user related information indicating the registered character strings arranged in the order designated by the correspondence information. Therefore, regardless of the order in which the sentences constituting the guidance voice are uttered, it is possible to generate distribution information for presenting to the user related information indicating the registered character strings arranged in the order designated in advance by the correspondence information.

In a preferred aspect of the present invention, the system comprises a sound emitting unit that emits the guidance voice and a sound indicating the distribution information. In this configuration, the sound emitting unit that emits the guidance voice is also used to emit the sound of the distribution information (that is, for acoustic communication in which sound as air vibration serves as the transmission medium). Therefore, compared with a configuration in which a sound emitting device separate from the one used for emitting the guidance voice is used to emit the sound indicating the distribution information, the configuration of the information generation system can be simplified.

FIG. 1 is a schematic diagram of an information generation system according to a first embodiment.
FIG. 2 is a configuration diagram of a voice guidance system and a management device.
FIG. 3 is an explanatory diagram of an operation of dividing the analysis result of a guidance voice into a plurality of recognized character strings.
FIG. 4 is a schematic diagram of a character string table.
FIG. 5 is a schematic diagram of a guidance table.
FIG. 6 is a configuration diagram of a terminal device.
FIG. 7 is an explanatory diagram of the operation of the information generation system.
FIG. 8 is a schematic diagram of a guidance table according to a second embodiment.

<First Embodiment>
FIG. 1 is a configuration diagram of an information generation system 1 according to the first embodiment of the present invention. The information generation system 1 of the first embodiment is a computer system for generating information to be provided to a user U in a commercial facility 300 such as a shopping mall, and includes a voice guidance system 100 and a management device 10. The voice guidance system 100 is installed, for example, in the commercial facility 300 and communicates with the management device 10 via a communication network 200 that includes the Internet and the like. The management device 10 is, for example, a server (for example, a web server) connected to the communication network 200. The user U visits the commercial facility 300 carrying a terminal device 30. The terminal device 30 is a portable information processing device such as a mobile phone or a smartphone. In practice, a plurality of users U in the commercial facility 300 can use the service of the information generation system 1, but the following description focuses on a single terminal device 30 for convenience.

A guide OP, such as an employee of the commercial facility 300, utters various voices (hereinafter "guidance voice") V that guide the users U who are visitors. In the first embodiment, information for visitors to the commercial facility 300 that consists of a plurality of sentences, for example "Thank you very much for visiting us today. We will be closing shortly. Please take care on your way home.", is uttered as the guidance voice V. The first embodiment assumes a case in which the guide OP selectively utters, as the guidance voice V, one of a plurality of passages recorded in advance in an announcement book or the like that is consulted when uttering the guidance voice V. Each of the passages is composed of a combination of sentence-level character strings (hereinafter "registered character strings") that represent mutually different utterance contents.

The information generation system 1 generates information (hereinafter "distribution information") Q for presenting to the user U information (hereinafter "related information") Z relating to the guidance voice V uttered by the guide OP. The related information Z is, for example, a character string expressing the pronunciation content of the guidance voice V, or a character string or voice obtained by translating that content into another language. In the first embodiment, the system generates distribution information Q for presenting to the user U, as the related information Z, one of various guidance sentences A (A1, A2, ...) prepared in advance in relation to the guidance voice V.

<Voice guidance system 100>
FIG. 2 is a configuration diagram of the voice guidance system 100 and the management device 10. As illustrated in FIG. 2, the voice guidance system 100 includes a distribution terminal 20, a sound collection device 22, an adder 24, and a sound emitting device 26.

The sound collection device 22 is an audio device (microphone) that collects ambient sound. The guide OP selectively utters, as the guidance voice V, one of the passages recorded in advance in an announcement book or the like that is consulted when uttering the guidance voice V, chosen according to, for example, the information to be announced. That is, the guidance voice V is basically not something whose content the guide OP can decide freely; its content is known and prepared in advance.

The sound collection device 22 collects the guidance voice V uttered by the guide OP and generates a voice signal SG representing the time waveform of the guidance voice V. An A/D converter that converts the voice signal SG generated by the sound collection device 22 from analog to digital is omitted from the drawing for convenience.

The voice signal SG generated by the sound collection device 22 is supplied to the sound emitting device 26 via the adder 24 as an acoustic signal S1. The sound emitting device 26 is an audio device (speaker) that emits sound according to the acoustic signal S1 supplied from the adder 24. For example, the guidance voice V represented by the voice signal SG is emitted from the sound emitting device 26 to the user U. A D/A converter that converts the acoustic signal S1 from digital to analog is omitted from the drawing for convenience.

As understood from the above description, the voice guidance system 100 is an audio system in which the distribution terminal 20 and the adder 24 are connected to an existing in-facility broadcasting system that broadcasts, from the sound emitting device 26, the guidance voice V collected by the sound collection device 22. The form of the voice guidance system 100 is, however, arbitrary: for example, the elements of the distribution terminal 20, the sound collection device 22, the adder 24, and the sound emitting device 26 may be mounted in a single device, or the sound collection device 22 may be mounted in the distribution terminal 20.

The voice signal SG generated by the sound collection device 22 is branched from the path between the sound collection device 22 and the adder 24 and supplied to the distribution terminal 20. The voice signal SG may also be supplied to the distribution terminal 20 wirelessly.

The distribution terminal 20 is a device for providing the terminal device 30 with the distribution information Q corresponding to the guidance voice V of the voice signal SG supplied from the sound collection device 22, and is realized by, for example, a portable terminal device such as a mobile phone or a smartphone. As illustrated in FIG. 2, the distribution terminal 20 includes a control device 210 and a communication device 220.

As illustrated in FIG. 2, the communication device 220 is a device that communicates with the management device 10 via the communication network 200. The communication device 220 transmits the voice signal SG to the management device 10 and receives the distribution information Q that the management device 10 distributes in response to the voice signal SG.

The control device 210 is composed of a processing device such as a CPU (Central Processing Unit) and controls the overall operation of the distribution terminal 20. Specifically, as illustrated in FIG. 2, the control device 210 executes a program stored in a known recording medium (not shown), such as a semiconductor recording medium or a magnetic recording medium, and thereby realizes a plurality of functions (a voice acquisition unit 212 and a signal processing unit 214) for acquiring and distributing the distribution information Q corresponding to the guidance voice V.

The voice acquisition unit 212 acquires the voice signal SG of the guidance voice V from the sound collection device 22 and transmits the voice signal SG from the communication device 220 to the management device 10 via the communication network 200. The management device 10 receives the voice signal SG transmitted from the voice guidance system 100 and generates distribution information Q for presenting to the user U the related information Z corresponding to the guidance voice V of the voice signal SG. The distribution information Q generated by the management device 10 is transmitted from the management device 10 to the voice guidance system 100. The communication device 220 receives, from the communication network 200, the distribution information Q transmitted from the management device 10.

The signal processing unit 214 generates an acoustic signal SQ of a sound that contains the distribution information Q received by the communication device 220 from the management device 10. The acoustic signal SQ contains the distribution information Q as an acoustic component in a predetermined frequency band. Specifically, the frequency band of the acoustic signal SQ is a band in which the sound emitting device 26 can emit sound and the terminal device 30 can collect sound, and it falls within a range (for example, from 18 kHz to 20 kHz) above the frequency band of the sounds that the user U hears in an ordinary environment, such as voices (for example, the guidance voice V) and musical tones. Any known technique may be used by the signal processing unit 214 to generate the acoustic signal SQ. For example, the acoustic signal SQ may be generated by frequency-modulating a carrier wave, such as a sine wave of a predetermined frequency, with the distribution information Q, or by sequentially performing spread modulation of the distribution information Q using a spreading code and frequency conversion using a carrier wave of a predetermined frequency.

The adder 24 of the voice guidance system 100 generates the acoustic signal S1 by adding the acoustic signal SQ generated by the signal processing unit 214 and the voice signal SG generated by the sound collection device 22. Therefore, the acoustic component of the distribution information Q is emitted from the sound emitting device 26 together with the guidance voice V uttered by the guide OP. As understood from the above description, the sound emitting device 26 functions as an element that transmits the distribution information Q to the terminal device 30 by acoustic communication in which sound (sound waves) as air vibration serves as the transmission medium. That is, the sound emitting device 26 that emits the guidance voice V collected by the sound collection device 22 is also used for transmitting the distribution information Q. Therefore, compared with a configuration in which a sound emitting device separate from the sound emitting device 26 used for emitting the guidance voice V is used to emit the sound indicating the distribution information Q, the configuration of the information generation system 1 can be simplified.
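The signal path described above, in which a near-ultrasonic component carrying the distribution information Q is generated and then added to the voice signal, can be sketched in Python as follows. This is only an illustration under stated assumptions: the text permits any known modulation, and binary frequency-shift keying is chosen here as one simple form of frequency modulation. The sample rate, the two tone frequencies, the symbol length, and all function names are hypothetical.

```python
import math
from typing import List

# All constants below are illustrative assumptions, not values from the source,
# except that both tones lie inside the 18 kHz to 20 kHz example band.
SAMPLE_RATE = 48_000      # samples per second
FREQ_ZERO = 18_500.0      # tone (Hz) sent for a 0 bit
FREQ_ONE = 19_500.0       # tone (Hz) sent for a 1 bit
SYMBOL_SECONDS = 0.01     # duration of one bit


def bytes_to_bits(payload: bytes) -> List[int]:
    """Unpack bytes into bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]


def modulate_fsk(payload: bytes, amplitude: float = 0.1) -> List[float]:
    """Return samples of an acoustic signal (SQ) carrying the payload.

    The phase is accumulated across bits so the waveform stays continuous
    when the tone frequency switches.
    """
    samples_per_symbol = round(SAMPLE_RATE * SYMBOL_SECONDS)
    samples: List[float] = []
    phase = 0.0
    for bit in bytes_to_bits(payload):
        step = 2.0 * math.pi * (FREQ_ONE if bit else FREQ_ZERO) / SAMPLE_RATE
        for _ in range(samples_per_symbol):
            samples.append(amplitude * math.sin(phase))
            phase += step
    return samples


def mix(voice: List[float], data: List[float]) -> List[float]:
    """Adder: sample-wise sum of the voice signal (SG) and SQ, giving S1.

    The shorter input is treated as silence once it runs out.
    """
    length = max(len(voice), len(data))
    padded_voice = voice + [0.0] * (length - len(voice))
    padded_data = data + [0.0] * (length - len(data))
    return [v + d for v, d in zip(padded_voice, padded_data)]
```

A call such as `mix(voice_samples, modulate_fsk(b"A1"))` would yield the combined signal for the loudspeaker, where `b"A1"` stands in for whatever identifier the distribution information Q carries.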

<Management device 10>
The management device 10 is a device that manages the distribution information Q transmitted to the voice guidance system 100 and, as illustrated in FIG. 2, includes a control device 110, a storage device 120, and a communication device 130. The management device 10 may be realized as a single device or as a plurality of devices configured separately from one another. For example, the storage device 120 may be installed separately from the management device 10 (as cloud storage), with the control device 110 reading from and writing to the storage device 120 via, for example, the communication network 200. That is, the storage device 120 may be omitted from the management device 10.

The communication device 130 communicates with the distribution terminal 20 via the communication network 200. Specifically, the communication device 130 receives the voice signal SG transmitted from the distribution terminal 20 and transmits to the distribution terminal 20 the distribution information Q that the control device 110 generates for the voice signal SG.

The control device 110 is composed of a processing device such as a CPU and, by executing a program stored in the storage device 120, functions as a plurality of elements (a voice analysis unit 112, a character string specifying unit 114, and an information generation unit 116) that control the generation of the distribution information Q, as illustrated in FIG. 2. A configuration in which some of the functions of the control device 110 are realized by a dedicated electronic circuit, or a configuration in which the functions of the control device 110 are distributed among a plurality of devices, may also be employed.

The voice analysis unit 112 in FIG. 2 generates recognized character strings L by dividing the character string that represents the pronunciation content of the guidance voice V into a plurality of parts. Specifically, the voice analysis unit 112 analyzes the pronunciation content of the guidance voice V by speech recognition on the voice signal SG that the communication device 130 receives from the distribution terminal 20, and divides the result of that analysis (hereinafter "analysis result") K into a plurality of recognized character strings L in units of sentences. FIG. 3 is an explanatory diagram of the operation of dividing the analysis result K of the guidance voice V into a plurality of recognized character strings L. As illustrated in FIG. 3, each analysis result K (K1, K2, K3) can be divided into a plurality of recognized character strings L in units of sentences. Any known technique may be employed for the speech recognition of the voice signal SG, for example a recognition technique that uses an acoustic model such as an HMM (Hidden Markov Model) together with a language model representing linguistic constraints. Likewise, any known technique, such as morphological analysis, may be employed for dividing the analysis result K.
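The sentence-level division of the analysis result K can be sketched as follows. The text mentions morphological analysis as one possible technique; the sketch below instead assumes, purely for illustration, that the recognizer output already contains sentence-ending punctuation and splits on it. The function name is hypothetical.

```python
import re
from typing import List

# Assumption: the analysis result already contains sentence-ending punctuation.
# Real recognizer output may not, in which case a morphological analyzer or a
# similar technique would be needed instead of this pattern.
_SENTENCE_END = re.compile(r"(?<=[。！？.!?])\s*")


def split_into_recognized_strings(analysis_result: str) -> List[str]:
    """Split an analysis result K into sentence-level recognized strings L."""
    return [part for part in _SENTENCE_END.split(analysis_result) if part]
```

Each returned element keeps its terminating punctuation, so it can be compared directly with registered strings that are stored as full sentences. Splitting on an ASCII period is a simplification that would also break decimal numbers.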

The storage device 120 stores the program executed by the control device 110 and various data used by the control device 110. For example, a known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be employed as the storage device 120. As illustrated in FIG. 2, the storage device 120 stores a character string table TA and a guidance table TB.

FIG. 4 is a schematic diagram of the character string table TA. As illustrated in FIG. 4, the character string table TA is a data table in which the registered character strings X that make up the different passages expected as the guidance voice V are registered. Each registered character string X is associated with identification information DX (DX1, DX2, ...) for uniquely identifying that registered character string X. The registered character strings X that make up each of the passages recorded in the announcement book or the like (that is, the character strings that the guide OP is expected to utter) are registered in the character string table TA.

As described above, each of the passages recorded in advance in the announcement book or the like is composed of a combination of registered character strings X. A registered character string X may be shared among several passages. For example, as illustrated in FIG. 3, the sentence "Thank you very much for visiting us today." is common to the passage announcing closing time and the passage announcing a lost child. As illustrated in FIG. 4, because each registered character string X in the character string table TA is a single sentence, a shared registered character string X does not have to be registered in the character string table TA more than once. Therefore, the amount of data in the character string table TA can be reduced compared with a configuration in which the passages of the different guidance voices V are each registered in the character string table TA as a whole.
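The data saving from sentence-level registration can be made concrete with a small sketch. The table contents below are illustrative: the greeting and the closing sentences are taken from the example announcement in this description, while the lost-child sentence, the assignment of identifiers to particular sentences, and the names `PASSAGES` and `render_passage` are hypothetical.

```python
from typing import Dict, List

# Character string table TA: one entry per sentence, keyed by identification
# information DX. Which identifier belongs to which sentence is assumed here.
STRING_TABLE: Dict[str, str] = {
    "DX1": "本日はお越しくださいまして誠にありがとうございます。",  # greeting
    "DX2": "間もなく閉店のお時間となります。",                      # closing soon
    "DX3": "迷子のお知らせをいたします。",                          # lost child (hypothetical wording)
    "DX4": "お気をつけてお帰りください。",                          # take care going home
}

# Each passage in the announcement book is a sequence of identifiers.
PASSAGES: Dict[str, List[str]] = {
    "closing": ["DX1", "DX2", "DX4"],
    "lost_child": ["DX1", "DX3"],
}


def render_passage(name: str) -> str:
    """Rebuild the full text of a passage from its registered strings."""
    return "".join(STRING_TABLE[string_id] for string_id in PASSAGES[name])


def characters_stored_per_sentence() -> int:
    """Storage needed when each sentence is registered once."""
    return sum(len(text) for text in STRING_TABLE.values())


def characters_stored_per_passage() -> int:
    """Storage needed when every passage is registered as a whole."""
    return sum(len(render_passage(name)) for name in PASSAGES)
```

With these two passages the shared greeting `DX1` is stored once rather than twice, so the per-sentence total is smaller by exactly the length of that greeting.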

For each of the recognized character strings L obtained by the voice analysis unit 112, the character string specifying unit 114 in FIG. 2 specifies, from among the registered character strings X registered in the character string table TA, the registered character string X that is similar to that recognized character string L. Specifically, the character string specifying unit 114 calculates, for each of the registered character strings X in the character string table TA, an index of its similarity to the recognized character string L (hereinafter "similarity index"), and specifies the one registered character string X for which the similarity indicated by the similarity index is highest (that is, the registered character string X most similar to the recognized character string L).

前述の通り、案内者ＯPは発音時に参照するアナウンスブック等に事前に収録された文章（複数の登録文字列Ｘの組合せで構成される文章）を発音するから、理想的には、音声解析部１１２が音声信号ＳGから生成する複数の認識文字列Ｌの各々は、文字列テーブルＴAに登録された何れかの登録文字列Ｘと一致する。しかし、実際には、個々の案内者ＯPに特有の発話の特徴（くせ）や商業施設３００内の背景雑音等に起因して音声解析部１１２による解析には誤認識が発生し得る。したがって、各認識文字列Ｌと登録文字列Ｘとは、相互に類似するけれども必ずしも一致しない場合がある。例えば、案内者ＯPが登録文字列Ｘ4「お気をつけてお帰りください。」を含む文章を発音しても、実際に音声解析部１１２が生成する認識文字列Ｌは、図３に例示される通り、登録文字列Ｘに類似するけれども完全には一致しない「…おくをつけたおかいりください。」というような認識文字列Ｌを含んだ解析結果Ｋ1になり得る。第１実施形態では、認識文字列Ｌに類似する登録文字列Ｘを特定するので、図４に例示される通り、音声認識の誤認識の影響を含まない登録文字列Ｘ4が特定され得る。 As described above, the guide OP pronounces a passage recorded in advance in an announcement book or the like that is referred to at the time of pronunciation (a passage composed of a combination of a plurality of registered character strings X). Ideally, therefore, each of the plurality of recognized character strings L that the voice analysis unit 112 generates from the voice signal SG matches one of the registered character strings X registered in the character string table TA. In practice, however, misrecognition may occur in the analysis by the voice analysis unit 112 due to utterance characteristics (habits) peculiar to each guide OP, background noise in the commercial facility 300, and the like. Each recognized character string L and the corresponding registered character string X are therefore similar to each other but do not necessarily match. For example, even if the guide OP pronounces a passage including the registered character string X4 “お気をつけてお帰りください。” (“Please take care on your way home.”), the analysis result K1 actually generated by the voice analysis unit 112 may, as illustrated in FIG. 3, include a recognized character string L such as “…おくをつけたおかいりください。”, which is similar to the registered character string X but does not completely match it. In the first embodiment, a registered character string X similar to the recognized character string L is specified, so that, as illustrated in FIG. 4, the registered character string X4, which is free of the influence of misrecognition in speech recognition, can be specified.

類似指標の種類は任意であるが、例えば文字列間の類似性を評価するための編集距離（レーベンシュタイン距離）等の公知の指標が類似指標として任意に採用され得る。文字列特定部１１４による登録文字列Ｘの特定は、音声解析部１１２が生成した複数の認識文字列Ｌの各々を、当該認識文字列Ｌに類似する登録文字列Ｘに補正する処理とも換言され得る。文字列特定部１１４は、以上の手順で特定した複数の登録文字列Ｘの各々について識別情報ＤXを文字列テーブルＴAから取得する。 The type of similarity index is arbitrary; for example, a known index for evaluating the similarity between character strings, such as the edit distance (Levenshtein distance), may be adopted as the similarity index. The specification of the registered character strings X by the character string specifying unit 114 can also be described as a process of correcting each of the plurality of recognized character strings L generated by the voice analysis unit 112 to the registered character string X similar to that recognized character string L. The character string specifying unit 114 acquires, from the character string table TA, the identification information DX for each of the plurality of registered character strings X specified by the above procedure.
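The specification step performed by the character string specifying unit 114 can be sketched as a nearest-neighbor search under the edit distance. The following is a minimal illustration only: the table contents, the English strings, and the function names are assumptions made for this example and do not appear in the specification, and the edit distance is just one of the similarity indices the text allows.

```python
from typing import Mapping


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Hypothetical excerpt of character string table TA:
# identification information DX -> registered character string X.
TABLE_TA = {
    "DX1": "Thank you very much for coming today.",
    "DX2": "The store will be closing shortly.",
    "DX4": "Please take care on your way home.",
}


def specify_registered_string(recognized: str, table: Mapping[str, str]) -> str:
    """Return the DX whose registered string X is closest to the recognized string L."""
    return min(table, key=lambda dx: levenshtein(recognized, table[dx]))
```

A smaller edit distance corresponds to a higher similarity, so taking the minimum distance plays the role of taking the maximum similarity in the text. A misrecognized string such as "Please take car on you way home." would be corrected to the entry for DX4 under this sketch.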

図５は、案内テーブルＴBの模式図である。図５に例示される通り、案内テーブルＴBは、案内者ＯPによる発音が想定される相異なる文章（複数の文の組合せ）に対応する複数の対応情報Ｃを含む。任意の１個の文章に対応する対応情報Ｃは、当該文章を構成する複数の登録文字列Ｘの識別情報ＤXの組合せを指定する。例えば、「本日はお越しくださいまして誠にありがとうございます。緊急のお知らせがございます。お車のヘッドライトがついたままになっております。」という文章に対応する対応情報Ｃ2は、第１文の登録文字列Ｘ1の識別情報ＤX1と、第２文の登録文字列Ｘ3の識別情報ＤX3と、第３文の登録文字列Ｘ6の識別情報ＤX6との組合せを指定する。案内テーブルＴBにおいて複数の対応情報Ｃの各々には識別情報ＤZ（ＤZ1，ＤZ2，……）が対応付けられる。識別情報ＤZは、案内音声Ｖに関連する予め用意された各種の案内文Ａを指定する。案内文Ａは、アナウンスブック等に事前に収録された適正な文章である。具体的には、任意の１個の対応情報Ｃに対応する識別情報ＤZは、当該対応情報Ｃが識別情報ＤXを指定する複数の登録文字列Ｘで構成される案内文Ａを指定する。なお、案内文Ａを構成する所定の単位（例えば、文）ごとに識別情報ＤZを有し、複数の識別情報ＤZが１個の案内文Ａを指定することも可能である。 FIG. 5 is a schematic diagram of the guide table TB. As illustrated in FIG. 5, the guide table TB includes a plurality of pieces of correspondence information C corresponding to different passages (each a combination of a plurality of sentences) that the guide OP is expected to pronounce. The correspondence information C for any one passage specifies the combination of the identification information DX of the plurality of registered character strings X constituting that passage. For example, the correspondence information C2 corresponding to the passage “Thank you very much for coming today. We have an urgent announcement. A car has been left with its headlights on.” specifies the combination of the identification information DX1 of the registered character string X1 of the first sentence, the identification information DX3 of the registered character string X3 of the second sentence, and the identification information DX6 of the registered character string X6 of the third sentence. In the guide table TB, identification information DZ (DZ1, DZ2, ...) is associated with each of the plurality of pieces of correspondence information C. The identification information DZ designates one of various guidance sentences A prepared in advance in relation to the guidance voice V. A guidance sentence A is a proper passage recorded in advance in an announcement book or the like. Specifically, the identification information DZ corresponding to any one piece of correspondence information C designates the guidance sentence A composed of the plurality of registered character strings X whose identification information DX that correspondence information C specifies. It is also possible to provide identification information DZ for each predetermined unit (for example, each sentence) constituting a guidance sentence A, so that a plurality of pieces of identification information DZ designate one guidance sentence A.

図２の情報生成部１１６は、複数の登録文字列Ｘの相異なる組合せを指定する複数の対応情報Ｃのうち、文字列特定部１１４が取得した複数の登録文字列Ｘの組合せに対応する対応情報Ｃに応じた関連情報Ｚを端末装置３０が利用者Ｕに提示するための配信情報Ｑを生成する。具体的には、情報生成部１１６は、複数の対応情報Ｃのうち、文字列特定部１１４が取得した複数の識別情報ＤXの組合せに類似または合致する組合せを指定する対応情報Ｃを特定し、案内テーブルＴBから、特定した対応情報Ｃの識別情報ＤZを取得する。すなわち、情報生成部１１６は、文字列特定部１１４が取得した複数の識別情報ＤXと同様の組合せを指定する対応情報Ｃを特定するほか、アナウンスブック等に事前に収録された文章を文単位で発音を案内者ＯPが誤った場合（例えば、必要な文の発音を忘れた場合や不要な文を発音してしまった場合）、情報生成部１１６は、当該文章の複数の識別情報ＤXに類似する組合せを指定する対応情報Ｃを特定し、案内テーブルＴBから、特定した対応情報Ｃの識別情報ＤZを取得する。 The information generation unit 116 in FIG. 2 generates distribution information Q with which the terminal device 30 presents to the user U the related information Z corresponding to the correspondence information C that, among the plurality of pieces of correspondence information C specifying different combinations of the plurality of registered character strings X, corresponds to the combination of the plurality of registered character strings X acquired by the character string specifying unit 114. Specifically, the information generation unit 116 identifies, from the plurality of pieces of correspondence information C, the correspondence information C specifying a combination that is similar to or matches the combination of the plurality of pieces of identification information DX acquired by the character string specifying unit 114, and acquires the identification information DZ of the identified correspondence information C from the guide table TB. That is, the information generation unit 116 identifies correspondence information C specifying the same combination as the plurality of pieces of identification information DX acquired by the character string specifying unit 114. In addition, when the guide OP makes a sentence-level mistake in pronouncing a passage recorded in advance in the announcement book or the like (for example, forgets to pronounce a necessary sentence or pronounces an unnecessary sentence), the information generation unit 116 identifies the correspondence information C specifying a combination similar to the plurality of pieces of identification information DX of that passage, and acquires the identification information DZ of the identified correspondence information C from the guide table TB.

例えば、アナウンスブック等に収録されている「本日はお越しくださいまして誠にありがとうございます。緊急のお知らせがございます。お車のヘッドライトがついたままになっております。」という文章を案内者ＯPが正確に発音した場合、図４に例示される通り、文字列特定部１１４は、文字列テーブルＴAから、識別情報ＤX1、ＤX3、およびＤX6を取得する。図５に例示される通り、文字列特定部１１４が取得した識別情報ＤX1、ＤX3、およびＤX6と対応情報Ｃ2が指定する識別情報ＤX1、ＤX3、およびＤX6とが合致するので、情報生成部１１６は、案内テーブルＴBから、対応情報Ｃ2の識別情報ＤZ2を取得する。他方、「緊急のお知らせがございます。本日はお越しくださいまして誠にありがとうございます。お車のヘッドライトがついたままになっております。」と文の順序を間違えて案内者ＯPが発音した場合でも、情報生成部１１６は、案内テーブルＴBから、対応情報Ｃ2の識別情報ＤZ2を取得することができる。また、「緊急のお知らせがございます。」の文を抜かして「本日はお越しくださいまして誠にありがとうございます。お車のヘッドライトがついたままになっております。」と案内者ＯPが誤って発音した場合、図４に例示される通り、文字列特定部１１４は、文字列テーブルＴAから、識別情報ＤX1とＤX6とを取得する。識別情報ＤX1とＤX6との両方を含む対応情報Ｃが識別情報ＤX1、ＤX3、およびＤX6を指定する対応情報Ｃ2しか存在しない場合、情報生成部１１６は、案内テーブルＴBから、対応情報Ｃ2の識別情報ＤZ2を取得する。したがって、適切な案内文Ａ2が対応づけられている識別情報ＤZ2を取得することができる。 For example, when the guide OP accurately pronounces the passage “Thank you very much for coming today. We have an urgent announcement. A car has been left with its headlights on.” recorded in the announcement book or the like, the character string specifying unit 114 acquires the identification information DX1, DX3, and DX6 from the character string table TA, as illustrated in FIG. 4. As illustrated in FIG. 5, the identification information DX1, DX3, and DX6 acquired by the character string specifying unit 114 matches the identification information DX1, DX3, and DX6 specified by the correspondence information C2, so the information generation unit 116 acquires the identification information DZ2 of the correspondence information C2 from the guide table TB. Even when the guide OP pronounces the sentences in the wrong order, as in “We have an urgent announcement. Thank you very much for coming today. A car has been left with its headlights on.”, the information generation unit 116 can acquire the identification information DZ2 of the correspondence information C2 from the guide table TB. Furthermore, when the guide OP mistakenly omits the sentence “We have an urgent announcement.” and pronounces “Thank you very much for coming today. A car has been left with its headlights on.”, the character string specifying unit 114 acquires the identification information DX1 and DX6 from the character string table TA, as illustrated in FIG. 4. When the only correspondence information C including both the identification information DX1 and DX6 is the correspondence information C2, which specifies the identification information DX1, DX3, and DX6, the information generation unit 116 acquires the identification information DZ2 of the correspondence information C2 from the guide table TB. The identification information DZ2, with which the appropriate guidance sentence A2 is associated, can therefore be acquired.
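The lookup performed by the information generation unit 116 can be sketched as follows. This is an illustration under stated assumptions, not the method of the specification: the text only requires a combination that is "similar to or matches" the acquired identification information DX, and the Jaccard index used here to score similarity, as well as the names `GUIDE_TABLE_TB` and `identify_dz`, are choices made for this example.

```python
# Hypothetical excerpt of guide table TB:
# correspondence information C -> (combination of DX, identification information DZ).
GUIDE_TABLE_TB = {
    "C1": ({"DX1", "DX2", "DX4"}, "DZ1"),
    "C2": ({"DX1", "DX3", "DX6"}, "DZ2"),
}


def identify_dz(acquired, table=GUIDE_TABLE_TB):
    """Return the DZ of the correspondence information C closest to the acquired DX set.

    Similarity is scored with the Jaccard index (an assumption of this sketch):
    |acquired ∩ registered| / |acquired ∪ registered|. An exact match scores 1.0.
    """
    acquired = set(acquired)  # sets ignore the order in which sentences were spoken

    def jaccard(c):
        registered = table[c][0]
        return len(acquired & registered) / len(acquired | registered)

    best = max(table, key=jaccard)
    return table[best][1]
```

Because the combination is held as a set, a passage spoken in the wrong order still matches C2 exactly, and a passage with one sentence omitted (DX1 and DX6 only) still scores higher against C2 than against C1.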

図２の情報生成部１１６は、以上の手順で取得した識別情報ＤZを含む配信情報Ｑを生成し、通信装置１３０を介して配信端末２０に配信情報Ｑを送信する。 The information generation unit 116 in FIG. 2 generates distribution information Q including the identification information DZ acquired by the above procedure, and transmits the distribution information Q to the distribution terminal 20 via the communication device 130.

以上の説明から理解される通り、対応情報Ｃが指定する識別情報ＤXの組合せとは独立に用意された案内文Ａを示す関連情報Ｚを利用者Ｕに提示するための配信情報Ｑを生成することが可能である。前述の通り、案内文Ａは、アナウンスブック等に事前に収録された文章（つまり対応情報Ｃが指定する登録文字列Ｘの組合せで構成される文章）が想定されるが、アナウンスブック等に事前に収録された文章とは異なる文章を案内文Ａとすることも可能である。例えば、アナウンスブック等に事前に収録された文章「本日はお越しくださいまして誠にありがとうございます。（＝Ｘ1）間もなく閉店のお時間となります。（＝Ｘ2）お気をつけてお帰りください。（＝Ｘ4）」を構成する３つの登録文字列Ｘの組合せを示す対応情報Ｃ1の識別情報ＤZ1に対応する案内文Ａ1を「間もなく閉店のお時間となるので、お気をつけてお帰りください。」といった内容にすることも可能である。つまり、アナウンスブック等に事前に収録された文章とは異なる文章を案内文Ａとすることも可能である。 As understood from the above description, it is possible to generate distribution information Q for presenting to the user U related information Z indicating a guidance sentence A prepared independently of the combination of the identification information DX specified by the correspondence information C. As described above, the guidance sentence A is assumed to be a passage recorded in advance in the announcement book or the like (that is, a passage composed of the combination of registered character strings X specified by the correspondence information C), but a passage different from the one recorded in advance in the announcement book or the like can also be used as the guidance sentence A. For example, for the passage recorded in advance in the announcement book or the like, “Thank you very much for coming today. (= X1) The store will be closing shortly. (= X2) Please take care on your way home. (= X4)”, the guidance sentence A1 corresponding to the identification information DZ1 of the correspondence information C1, which indicates the combination of the three registered character strings X constituting that passage, can be given content such as “The store will be closing shortly, so please take care on your way home.” In other words, the guidance sentence A need not be identical to the passage recorded in advance in the announcement book or the like.

図２の音声案内システム１００では、管理装置１０から通信装置２２０が受信した配信情報Ｑの音響成分と案内音声Ｖとが放音装置２６から放音される。案内者ＯPによる案内音声Ｖの発話の終了後に制御装置１１０で行われる処理により配信情報Ｑの生成が実行されるから、放音装置２６からは、案内音声Ｖの放音から遅延して配信情報Ｑの音響成分が放音される。 In the voice guidance system 100 of FIG. 2, the guidance voice V and the acoustic component of the distribution information Q that the communication device 220 receives from the management device 10 are emitted from the sound emitting device 26. Because the distribution information Q is generated by processing performed in the control device 110 after the guide OP finishes uttering the guidance voice V, the sound emitting device 26 emits the acoustic component of the distribution information Q with a delay relative to the emission of the guidance voice V.

＜端末装置３０＞ <Terminal device 30>
図６は、端末装置３０の構成図である。図６に例示される通り、端末装置３０は、収音装置３１０と制御装置３２０と記憶装置３３０と提示装置３４０とを含んで構成される。 FIG. 6 is a configuration diagram of the terminal device 30. As illustrated in FIG. 6, the terminal device 30 includes a sound collection device 310, a control device 320, a storage device 330, and a presentation device 340.

記憶装置３３０は、制御装置３２０が実行するプログラムや制御装置３２０が使用する各種のデータを記憶する。記憶装置３３０は、図６に例示される通り、提示テーブルＴCを記憶する。第１実施形態の提示テーブルＴCは、相異なる案内音声Ｖに対応する複数の案内文Ａ（Ａ1，Ａ2，……）の各々に識別情報ＤZ（ＤZ1，ＤZ2，……）が対応付けられたデータテーブルである。例えば、半導体記録媒体や磁気記録媒体等の公知の記録媒体または複数種の記録媒体の組合せが記憶装置３３０として任意に採用される。 The storage device 330 stores a program executed by the control device 320 and various data used by the control device 320. The storage device 330 stores a presentation table TC, as illustrated in FIG. 6. The presentation table TC of the first embodiment is a data table in which identification information DZ (DZ1, DZ2, ...) is associated with each of a plurality of guidance sentences A (A1, A2, ...) corresponding to different guidance voices V. For example, a known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of a plurality of types of recording media, may be employed as the storage device 330.

収音装置３１０は、周囲の音響を収音する音響機器（マイクロホン）であり、配信端末２０の放音装置２６から放音される音響を収音して音響信号Ｓ2を生成する。音響信号Ｓ2は、配信情報Ｑの音響成分（音響信号ＳQ）を含有する。なお、収音装置３１０が生成した音響信号Ｓ2をアナログからデジタルに変換するＡ/Ｄ変換器の図示は便宜的に省略されている。 The sound collecting device 310 is an acoustic device (microphone) that picks up surrounding sounds, and collects sound emitted from the sound emitting device 26 of the distribution terminal 20 to generate an acoustic signal S2. The acoustic signal S2 contains the acoustic component (acoustic signal SQ) of the distribution information Q. The A / D converter that converts the acoustic signal S2 generated by the sound pickup device 310 from analog to digital is not shown for convenience.

制御装置３２０は、例えばＣＰＵ等の処理装置で構成され、記憶装置３３０に記憶されたプログラムを実行することで、提示テーブルＴCに登録された複数の案内文Ａのうち配信情報Ｑにより指定される案内文Ａを関連情報Ｚとして利用者Ｕに提示するための複数の機能（情報抽出部３２２および情報管理部３２４）を実現する。 The control device 320 is constituted by a processing device such as a CPU, for example, and is designated by the distribution information Q among a plurality of guidance sentences A registered in the presentation table TC by executing a program stored in the storage device 330. A plurality of functions (information extraction unit 322 and information management unit 324) for presenting the guidance sentence A to the user U as related information Z are realized.

情報抽出部３２２は、収音装置３１０が生成した音響信号Ｓ2の復調で配信情報Ｑを抽出する。具体的には、情報抽出部３２２は、音響信号Ｓ2のうち配信情報Ｑを含む周波数帯域の帯域成分を例えば帯域通過フィルタで強調し、配信情報Ｑの拡散変調に利用された拡散符号を係数とする整合フィルタを通過させることで配信情報Ｑを抽出する。 The information extraction unit 322 extracts the distribution information Q by demodulating the acoustic signal S2 generated by the sound collection device 310. Specifically, the information extraction unit 322 emphasizes the band component of the acoustic signal S2 in the frequency band containing the distribution information Q, for example with a band-pass filter, and extracts the distribution information Q by passing the result through a matched filter whose coefficients are the spreading code used for the spread modulation of the distribution information Q.
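The matched-filter step can be illustrated with a heavily simplified baseband model. Everything in the sketch below is an assumption made for illustration: the 8-chip code, the bipolar chip representation, and block-aligned timing are not taken from the specification, and the band-pass filtering and carrier handling described in the text are omitted entirely.

```python
# Hypothetical 8-chip bipolar spreading code shared by sender and receiver.
SPREADING_CODE = [1, -1, 1, 1, -1, -1, 1, -1]


def spread(bits, code=SPREADING_CODE):
    """Spread each bit over one code period: bit 1 -> +code, bit 0 -> -code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def despread(samples, code=SPREADING_CODE):
    """Correlate each code-length block with the code; the sign gives the bit.

    This is the matched-filter output sampled once per code period, assuming
    the receiver is already aligned to the block boundaries.
    """
    n = len(code)
    bits = []
    for start in range(0, len(samples) - n + 1, n):
        correlation = sum(s * c for s, c in zip(samples[start:start + n], code))
        bits.append(1 if correlation > 0 else 0)
    return bits
```

A practical receiver would also have to find the code timing, for example by sliding the correlation over the samples and looking for peaks; that synchronization step is left out here.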

情報管理部３２４は、提示テーブルＴCの複数の案内文Ａのうち、情報抽出部３２２が抽出した配信情報Ｑに含まれる識別情報ＤZに対応する案内文Ａを選択する。すなわち、案内者ＯPが発音した案内音声Ｖに関連する案内文Ａが関連情報Ｚとして選択される。 The information management unit 324 selects the guide sentence A corresponding to the identification information DZ included in the distribution information Q extracted by the information extraction unit 322 from among the plurality of guide sentences A in the presentation table TC. That is, the guidance sentence A related to the guidance voice V pronounced by the guider OP is selected as the related information Z.

提示装置３４０は、情報管理部３２４が選択した案内文Ａを関連情報Ｚとして端末装置３０の利用者Ｕに提示する。提示装置３４０は、関連情報Ｚが示す案内文Ａを表示する表示装置（例えば液晶表示パネル等）である。以上の説明から理解される通り、案内者ＯPが発音した案内音声Ｖの発話内容に対応する案内文Ａが関連情報Ｚとして提示装置３４０により利用者Ｕに提示される。利用者Ｕは、提示装置３４０に提示された関連情報Ｚを視認することで、案内者ＯPが発音して放音装置２６から放音された案内音声Ｖに関連する関連情報Ｚを視覚的に確認することが可能である。 The presentation device 340 presents the guidance sentence A selected by the information management unit 324 to the user U of the terminal device 30 as related information Z. The presentation device 340 is a display device (for example, a liquid crystal display panel) that displays the guidance sentence A indicated by the related information Z. As understood from the above description, the guidance sentence A corresponding to the utterance content of the guidance voice V pronounced by the guide OP is presented to the user U by the presentation device 340 as the related information Z. The user U visually recognizes the related information Z presented on the presentation device 340, so that the related information Z related to the guidance voice V generated by the guide OP and emitted from the sound emitting device 26 is visually displayed. It is possible to confirm.

図７は、情報生成システム１の全体的な動作の説明図である。案内者ＯPがアナウンスブック等に事前に収録された文章に対応する案内音声Ｖを発音すると、音声案内システム１００の収音装置２２は、案内音声Ｖを収音して音声信号ＳGを生成する（ＳA1）。収音装置２２が生成した音声信号ＳGは、収音装置２２から放音装置２６に出力されて放音される一方（ＳA2）、音声取得部２１２によって取得されて通信装置２２０から通信網２００に送信される（ＳA3）。 FIG. 7 is an explanatory diagram of the overall operation of the information generation system 1. When the guider OP generates a guidance voice V corresponding to a sentence recorded in advance in an announcement book or the like, the sound collection device 22 of the voice guidance system 100 collects the guidance voice V and generates a voice signal SG ( SA1). The sound signal SG generated by the sound collection device 22 is output from the sound collection device 22 to the sound emission device 26 and emitted (SA2), while being acquired by the sound acquisition unit 212 and transmitted from the communication device 220 to the communication network 200. It is transmitted (SA3).

配信端末２０から送信された音声信号ＳGを通信装置１３０が通信網２００から受信すると、管理装置１０の音声解析部１１２は、案内音声Ｖの音声信号ＳGに対する音声認識で案内音声Ｖの発音内容を表す複数の認識文字列Ｌを生成する（ＳA4）。文字列特定部１１４は、文字列テーブルＴAの複数の登録文字列Ｘから、音声解析部１１２が生成した複数の認識文字列Ｌの各々に類似する登録文字列Ｘを特定する（ＳA5）。すなわち、認識文字列Ｌが、当該音声解析部１１２による誤認識を解消した登録文字列Ｘに補正される。情報生成部１１６は、案内テーブルＴBの複数の対応情報Ｃから、文字列特定部１１４が特定した複数の登録文字列Ｘの組合せに対応する対応情報Ｃを特定し（ＳA6）、特定した対応情報Ｃの識別情報ＤZを含む配信情報Ｑを生成する（ＳA7）。通信装置１３０は、情報生成部１１６が生成した配信情報Ｑを配信端末２０に送信する（ＳA8）。 When the communication device 130 receives, from the communication network 200, the voice signal SG transmitted from the distribution terminal 20, the voice analysis unit 112 of the management device 10 generates a plurality of recognized character strings L representing the pronounced content of the guidance voice V by speech recognition of the voice signal SG of the guidance voice V (SA4). The character string specifying unit 114 specifies, from the plurality of registered character strings X in the character string table TA, the registered character string X similar to each of the plurality of recognized character strings L generated by the voice analysis unit 112 (SA5). That is, each recognized character string L is corrected to a registered character string X in which the misrecognition by the voice analysis unit 112 has been eliminated. The information generation unit 116 identifies, from the plurality of pieces of correspondence information C in the guide table TB, the correspondence information C corresponding to the combination of the plurality of registered character strings X specified by the character string specifying unit 114 (SA6), and generates distribution information Q including the identification information DZ of the identified correspondence information C (SA7). The communication device 130 transmits the distribution information Q generated by the information generation unit 116 to the distribution terminal 20 (SA8).

管理装置１０から送信された配信情報Ｑを通信装置２２０が受信すると、配信端末２０の信号処理部２１４は、配信情報Ｑを音響成分として含有する音響信号ＳQを生成する（ＳA9）。配信端末２０の加算器２４は、収音装置２２が生成した音声信号ＳGと信号処理部２１４が生成した音響信号ＳQとを加算することで音響信号Ｓ1を生成する（ＳA10）。放音装置２６は、音響信号Ｓ1に応じた音響を放音する（ＳA11）。すなわち、収音装置２２が収音した案内音声Ｖと、配信情報Ｑの音響成分が放音装置２６から放音される。 When the communication device 220 receives the distribution information Q transmitted from the management device 10, the signal processing unit 214 of the distribution terminal 20 generates an acoustic signal SQ containing the distribution information Q as an acoustic component (SA9). The adder 24 of the distribution terminal 20 generates the acoustic signal S1 by adding the audio signal SG generated by the sound collection device 22 and the acoustic signal SQ generated by the signal processing unit 214 (SA10). The sound emitting device 26 emits sound according to the acoustic signal S1 (SA11). That is, the guidance voice V picked up by the sound pickup device 22 and the acoustic component of the distribution information Q are emitted from the sound emission device 26.

端末装置３０の収音装置３１０は、放音装置２６から放音された音響を収音して音響信号Ｓ2を生成する（ＳA12）。情報抽出部３２２は、収音装置３１０が生成した音響信号Ｓ2の復調で配信情報Ｑを抽出する（ＳA13）。情報管理部３２４は、情報抽出部３２２が抽出した配信情報Ｑに含まれる識別情報ＤZに対応する案内文Ａを提示テーブルＴCから選択する。提示装置３４０は、情報管理部３２４が選択した案内文Ａが示す文字列を関連情報Ｚとして表示させることで利用者Ｕに視覚的に提示する（ＳA14）。 The sound collecting device 310 of the terminal device 30 collects the sound emitted from the sound emitting device 26 and generates an acoustic signal S2 (SA12). The information extraction unit 322 extracts the distribution information Q by demodulating the acoustic signal S2 generated by the sound collection device 310 (SA13). The information management unit 324 selects the guidance sentence A corresponding to the identification information DZ included in the distribution information Q extracted by the information extraction unit 322 from the presentation table TC. The presentation device 340 visually presents to the user U by displaying the character string indicated by the guidance sentence A selected by the information management unit 324 as the related information Z (SA14).

以上の説明から理解される通り、第１実施形態では、複数の登録文字列Ｘのうち、案内音声Ｖに対する音声認識で解析された認識文字列Ｌに類似する登録文字列Ｘが特定されるので、音声認識の誤認識の影響を含まない関連情報Ｚを利用者Ｕに提示するための配信情報Ｑを生成することが可能である。また、複数の登録文字列Ｘの相異なる組合せを指定する対応情報Ｃに応じた関連情報Ｚを利用者Ｕに提示するための配信情報Ｑが生成されるので、案内者が例えば文単位で案内音声Ｖの発音を誤った場合（例えば、必要な文の発音を忘れた場合や不要な文を発音した場合）でも、案内音声Ｖの解析結果Ｋから特定される複数の登録文字列Ｘの組合せに対応する対応情報Ｃに応じた適切な関連情報Ｚを利用者Ｕに提示するための配信情報Ｑを生成することが可能である。 As understood from the above description, in the first embodiment, the registered character string X similar to the recognized character string L analyzed by speech recognition of the guidance voice V is specified from among the plurality of registered character strings X, so it is possible to generate distribution information Q for presenting to the user U related information Z that is free of the influence of misrecognition in speech recognition. In addition, distribution information Q is generated for presenting to the user U the related information Z corresponding to correspondence information C that specifies one of the different combinations of the plurality of registered character strings X. Therefore, even when the guide mispronounces the guidance voice V in units of sentences (for example, forgets to pronounce a necessary sentence or pronounces an unnecessary sentence), it is possible to generate distribution information Q for presenting to the user U appropriate related information Z corresponding to the correspondence information C that corresponds to the combination of the plurality of registered character strings X specified from the analysis result K of the guidance voice V.

＜第２実施形態＞ <Second Embodiment>
本発明の第２実施形態を説明する。以下に例示する各態様において作用や機能が第１実施形態と同様である要素については、第１実施形態の説明で使用した符号を流用して各々の詳細な説明を適宜に省略する。 A second embodiment of the present invention will be described. For elements whose operations and functions in each aspect exemplified below are the same as those of the first embodiment, the reference signs used in the description of the first embodiment are reused, and detailed description of each is omitted as appropriate.

第１実施形態の情報生成部１１６は、文字列特定部１１４が特定した複数の登録文字列Ｘの組合せを指定する対応情報Ｃの識別情報ＤZに応じた案内文Ａを関連情報Ｚとして利用者Ｕに提示するための配信情報Ｑを生成した。第２実施形態における情報生成部１１６は、文字列特定部１１４が特定した複数の登録文字列Ｘの組合せに対応する対応情報Ｃが指定する複数の登録文字列Ｘを示す関連情報Ｚを端末装置３０が利用者Ｕに提示するための配信情報Ｑを生成する。 The information generation unit 116 of the first embodiment generates distribution information Q for presenting to the user U, as related information Z, the guidance sentence A corresponding to the identification information DZ of the correspondence information C that specifies the combination of the plurality of registered character strings X specified by the character string specifying unit 114. The information generation unit 116 of the second embodiment generates distribution information Q with which the terminal device 30 presents to the user U related information Z indicating the plurality of registered character strings X specified by the correspondence information C that corresponds to the combination of the plurality of registered character strings X specified by the character string specifying unit 114.

図８は、第２実施形態に係る案内テーブルＴBの模式図である。図８に例示される通り、第２実施形態に係る案内テーブルＴBは、案内者ＯPによる発音が想定される相異なる文章に対応する複数の対応情報Ｃを含む。任意の１個の文章に対応する対応情報Ｃは、当該文章を構成する複数の登録文字列Ｘの識別情報ＤXの組合せを指定する。例えば、「本日はお越しくださいまして誠にありがとうございます。緊急のお知らせがございます。お車のヘッドライトがついたままになっております。」という文章に対応する対応情報Ｃ2は、第１文の登録文字列Ｘ1の識別情報ＤX1と、第２文の登録文字列Ｘ3の識別情報ＤX3と、第３文の登録文字列Ｘ6の識別情報ＤX6との組合せを指定する。図５の第１実施形態に係る案内テーブルＴBと比較して、第２実施形態に係る案内テーブルＴBは、任意の１個の対応情報Ｃに対応する識別情報ＤZを有していない。 FIG. 8 is a schematic diagram of a guide table TB according to the second embodiment. As illustrated in FIG. 8, the guide table TB according to the second embodiment includes a plurality of pieces of correspondence information C corresponding to different sentences assumed to be pronounced by the guide OP. Correspondence information C corresponding to an arbitrary sentence specifies a combination of identification information DX of a plurality of registered character strings X constituting the sentence. For example, correspondence information C2 corresponding to the sentence "Thank you for coming today. There is an urgent notice. Your car's headlight is still on." A combination of identification information DX1 of registered character string X1, identification information DX3 of registered character string X3 of the second sentence, and identification information DX6 of registered character string X6 of the third sentence is designated. Compared with the guide table TB according to the first embodiment of FIG. 5, the guide table TB according to the second embodiment does not have the identification information DZ corresponding to any one correspondence information C.

情報生成部１１６は、第１実施形態と同様に、複数の対応情報Ｃから文字列特定部１１４が取得した複数の識別情報ＤXに対応する対応情報Ｃを特定する。情報生成部１１６は、案内テーブルＴBから、特定した対応情報Ｃが指定する複数の識別情報ＤXを取得し、当該複数の識別情報ＤXを含む配信情報Ｑを生成する。例えば、対応情報Ｃ2を特定した場合、情報生成部１１６は、識別情報ＤX1と識別情報ＤX3と識別情報ＤX6とを含む配信情報Ｑを生成し、通信装置１３０を介して配信端末２０に配信情報Ｑを送信する。 As in the first embodiment, the information generation unit 116 identifies, from the plurality of pieces of correspondence information C, the correspondence information C corresponding to the plurality of pieces of identification information DX acquired by the character string specifying unit 114. The information generation unit 116 acquires from the guide table TB the plurality of pieces of identification information DX specified by the identified correspondence information C, and generates distribution information Q including that plurality of pieces of identification information DX. For example, when the correspondence information C2 is identified, the information generation unit 116 generates distribution information Q including the identification information DX1, the identification information DX3, and the identification information DX6, and transmits the distribution information Q to the distribution terminal 20 via the communication device 130.

端末装置３０において、配信端末２０の放音装置２６から放音される音響を収音装置３１０が収音して音響信号Ｓ2を生成し、情報抽出部３２２が音響信号Ｓ2から配信情報Ｑを抽出する構成および動作は、第１実施形態と同様である。 In the terminal device 30, the sound collecting device 310 picks up the sound emitted from the sound emitting device 26 of the distribution terminal 20 to generate the sound signal S 2, and the information extraction unit 322 extracts the distribution information Q from the sound signal S 2. The configuration and operation are the same as those in the first embodiment.

第２実施形態において端末装置３０の記憶装置３３０が記憶する提示テーブルＴCは、図４に例示した文字列テーブルＴAと同様に、複数の登録文字列Ｘの各々について識別情報ＤXを対応させたデータテーブルである。情報管理部３２４は、情報抽出部３２２が抽出した配信情報Ｑが指定する各識別情報ＤXの登録文字列Ｘを提示テーブルＴCから取得し、提示装置３４０は、情報管理部３２４が選択した複数の登録文字列Ｘを関連情報Ｚとして利用者Ｕに提示する。 In the second embodiment, the presentation table TC stored in the storage device 330 of the terminal device 30 is, like the character string table TA illustrated in FIG. 4, a data table in which identification information DX is associated with each of the plurality of registered character strings X. The information management unit 324 acquires from the presentation table TC the registered character string X of each piece of identification information DX specified by the distribution information Q extracted by the information extraction unit 322, and the presentation device 340 presents the plurality of registered character strings X selected by the information management unit 324 to the user U as the related information Z.

なお、前述の例示では、提示テーブルＴCを文字列テーブルＴAと同様の内容としたが、提示テーブルＴCを文字列テーブルＴAとは異なる内容とすることも可能である。例えば、文字列テーブルＴAの識別情報ＤX5に対応する登録文字列Ｘ5は「迷子のお知らせを申し上げます。」であるが、提示テーブルＴCの識別情報ＤX5に対応する文字列は「迷子のお知らせ。」とすることも可能である。情報管理部３２４は、提示テーブルＴCの複数の識別情報ＤXのうち、配信情報Ｑに含まれる複数の識別情報ＤXの各々が示す登録文字列Ｘを選択し、当該選択した複数の登録文字列Ｘを組み合わせた文章を関連情報Ｚとして生成する。提示装置３４０は、情報管理部３２４が生成した関連情報Ｚを端末装置３０の利用者Ｕに提示する。 In the above example, the presentation table TC has the same content as the character string table TA, but the presentation table TC may also have content different from that of the character string table TA. For example, the registered character string X5 corresponding to the identification information DX5 in the character string table TA is “We would like to make a lost child announcement.”, whereas the character string corresponding to the identification information DX5 in the presentation table TC may be “Lost child notice.” The information management unit 324 selects, from the presentation table TC, the registered character string X indicated by each of the plurality of pieces of identification information DX included in the distribution information Q, and generates, as the related information Z, a passage combining the selected plurality of registered character strings X. The presentation device 340 presents the related information Z generated by the information management unit 324 to the user U of the terminal device 30.
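On the terminal side, the second embodiment amounts to a table lookup followed by concatenation. The sketch below assumes a hypothetical presentation table and English display strings; the names `PRESENTATION_TABLE_TC` and `build_related_information` are illustrative only, and silently skipping an unknown identifier is a design choice of this example rather than something the specification states.

```python
# Hypothetical presentation table TC held by the terminal device:
# identification information DX -> character string to display.
PRESENTATION_TABLE_TC = {
    "DX1": "Thank you very much for coming today.",
    "DX3": "We have an urgent announcement.",
    "DX5": "Lost child notice.",  # may differ from the wording in table TA
    "DX6": "A car has been left with its headlights on.",
}


def build_related_information(distribution_q, table=PRESENTATION_TABLE_TC):
    """Join the display strings for the DX values carried in distribution information Q.

    Identifiers missing from the table are skipped (an assumption of this sketch).
    """
    return " ".join(table[dx] for dx in distribution_q if dx in table)
```

Because the terminal holds its own table, the displayed wording can be shortened or localized without changing what the management device transmits.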

The second embodiment achieves the same effects as the first embodiment. In addition, in the second embodiment, distribution information Q is generated for presenting to the user U related information Z indicating the plurality of registered character strings X designated by the correspondence information C. As described above, unlike the guidance table TB of the first embodiment shown in FIG. 5, the guidance table TB of the second embodiment has no identification information DZ corresponding to each piece of correspondence information C. The first embodiment generates distribution information Q for presenting to the user U related information Z indicating the guidance sentence A of the correspondence information C that corresponds to the combination of registered character strings X specified by the character string specifying unit 114. Compared with that configuration, the second embodiment has the advantage that information for uniquely designating a guidance sentence A (that is, identification information DZ) is unnecessary. The first embodiment, on the other hand, can generate distribution information Q for presenting to the user U related information Z indicating a guidance sentence A prepared independently of the combination of registered character strings X designated by the correspondence information C.
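
The terminal-side assembly described for the second embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the table contents, the dictionary representation of the presentation table TC, and the function name `build_related_info` are assumptions made for the example.

```python
# Hypothetical presentation table TC: identification information DX -> display string.
# The entries are illustrative; they are not taken from the specification.
PRESENTATION_TABLE_TC = {
    "DX1": "Thank you very much for visiting us today.",
    "DX3": "We have an urgent announcement.",
    "DX5": "Lost child notice.",
    "DX6": "A car has been left with its headlights on.",
}


def build_related_info(distribution_info_q, table=PRESENTATION_TABLE_TC):
    """Combine the strings selected by each DX in Q into related information Z.

    DX values that are missing from the table are skipped (an assumption;
    the specification does not say how unknown identifiers are handled).
    """
    return " ".join(table[dx] for dx in distribution_info_q if dx in table)
```

Because only identification information DX travels in the distribution information Q, the wording shown to the user can be changed by editing the presentation table alone.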

<Third Embodiment>
In the second embodiment, the correspondence information C designates a combination of a plurality of registered character strings X, and the information generation unit 116 generates distribution information Q for presenting to the user U of the terminal device 30 related information Z indicating the plurality of registered character strings X designated by the correspondence information C. In the third embodiment, the correspondence information C designates both a combination of a plurality of registered character strings X and the order of those registered character strings X, and the information generation unit 116 generates distribution information Q that allows the terminal device 30 to present to the user U related information Z indicating the registered character strings X arranged in the order designated by the correspondence information C.

The guidance table TB according to the third embodiment includes a plurality of pieces of correspondence information C corresponding to different sentences that the guide OP is expected to utter. The correspondence information C corresponding to any one sentence designates the combination of the identification information DX of the registered character strings X that make up the sentence, together with the order in which the registered character strings X indicated by that identification information DX appear in the sentence. Specifically, each piece of correspondence information C is information in which the identification information DX of the registered character strings X is arranged in the regular order of the registered character strings X in the corresponding sentence. However, the method of designating the order is not limited to this example. For example, information indicating the position of a registered character string X within the combination may be added to the identification information DX of each registered character string X in the combination designated by the correspondence information C.

The information generation unit 116 according to the third embodiment generates the distribution information Q by arranging the plurality of pieces of identification information DX designated by the correspondence information C in the order designated by the correspondence information C. For example, suppose the guide OP utters "We have an urgent announcement. (= DX3) Thank you very much for visiting us today. (= DX1) A car has been left with its headlights on. (= DX6)". If the correspondence information C designates the order "identification information DX1: first, identification information DX3: second, identification information DX6: third", distribution information Q is generated that allows the terminal device 30 to present to the user U the related information Z "Thank you very much for visiting us today. (= DX1) We have an urgent announcement. (= DX3) A car has been left with its headlights on. (= DX6)".
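
The reordering in this example can be sketched as below. The sketch assumes that the correspondence information C is represented as a list of identification information DX in the regular order, which is one of the representations the text mentions; the function name `order_by_correspondence` and the choice to return `None` on a mismatch are assumptions for illustration.

```python
def order_by_correspondence(identified_dx, correspondence_c):
    """Return the DX values in the order designated by correspondence information C.

    identified_dx:    DX values in the order the sentences were uttered.
    correspondence_c: DX values in the regular order (hypothetical list form of C).
    Returns None when the identified combination does not match C.
    """
    if set(identified_dx) != set(correspondence_c):
        return None
    # The utterance order is ignored; the order stored in C is used as-is.
    return list(correspondence_c)
```

With the example above, `order_by_correspondence(["DX3", "DX1", "DX6"], ["DX1", "DX3", "DX6"])` yields the DX1, DX3, DX6 order regardless of how the guide OP ordered the sentences.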

The third embodiment achieves the same effects as the second embodiment. In addition, distribution information Q is generated that allows the terminal device 30 to present to the user U related information Z indicating the registered character strings X arranged in the order designated by the correspondence information C. Therefore, regardless of the order in which the sentences making up the guidance voice V are uttered, it is possible to generate distribution information Q that allows the terminal device 30 to present to the user U related information Z indicating the registered character strings X arranged in the order designated in advance by the correspondence information C.

<Fourth Embodiment>
In the first to third embodiments, distribution information Q instructing presentation of the related information Z is transmitted from the information generation system 1 to the terminal device 30. In the fourth embodiment, the information generation system 1 itself generates the related information Z corresponding to the guidance voice V and provides it to the user U. The operations of the voice analysis unit 112 and the character string specifying unit 114, and the operation by which the information generation unit 116 identifies the correspondence information C, are the same as in the preceding embodiments. Therefore, as in those embodiments, appropriate related information Z corresponding to the correspondence information C can be presented to the user U with reduced influence from speech recognition errors.

The information generation unit 116 of the fourth embodiment generates, as the related information Z, a character string obtained by translating into another language the guidance sentence A indicated by the identification information DZ of the identified correspondence information C. Any known machine translation technique may be used for the translation. The related information Z generated by the information generation unit 116 is transmitted to the distribution terminal 20 of the voice guidance system 100.

The signal processing unit 214 of the distribution terminal 20 generates an acoustic signal SQ by speech synthesis applied to the related information Z. The acoustic signal SQ of the fourth embodiment represents speech uttering the character string designated by the related information Z. Any known speech synthesis technique may be used to generate the acoustic signal SQ. The acoustic signal SQ generated by the signal processing unit 214 is supplied to the sound emitting device 26 via the adder 24, so that speech uttering the character string designated by the related information Z is emitted from the sound emitting device 26. That is, following emission of the guidance voice V uttered by the guide OP, speech in which the guidance voice V is translated into another language is emitted from the sound emitting device 26 to the user U. For example, when a translation of the guidance sentence A is generated as the related information Z, speech of the translation of the guidance sentence A corresponding to the guidance voice V is emitted following that guidance voice V.

The above description illustrates a configuration based on the first embodiment, but the character string specification of the second or third embodiment can also be applied to the fourth embodiment. For example, the information generation unit 116 of the fourth embodiment may generate, as the related information Z, a character string obtained by translating into another language a sentence combining the registered character strings X indicated by the plurality of pieces of identification information DX designated by the identified correspondence information C. In that case, speech of the translation of the sentence combining the registered character strings X is emitted from the sound emitting device 26 following the guidance voice V. Assuming the third embodiment, the information generation unit 116 may also generate related information Z representing a character string obtained by translating into another language a sentence in which the registered character strings X indicated by the identification information DX designated by the identified correspondence information C are arranged in the order designated by the correspondence information C. In this configuration, speech of the translation of the sentence in which the registered character strings X are arranged in the order designated in advance by the correspondence information C is emitted from the sound emitting device 26 following the guidance voice V.
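
Applying the third embodiment's ordering to the fourth embodiment amounts to composing the sentence first and translating it afterwards. The sketch below illustrates only that sequencing. The function name `translated_related_info` is hypothetical, and `translate` is a caller-supplied stand-in for whatever known machine translation is used; no particular translation API is implied.

```python
def translated_related_info(ordered_dx, string_table, translate):
    """Build related information Z for the fourth embodiment.

    ordered_dx:   DX values already arranged in the order designated by C.
    string_table: mapping from DX to registered character string X.
    translate:    callable taking and returning a string (placeholder for
                  a machine translation step chosen by the implementer).
    """
    # Join the registered character strings X in the designated order.
    sentence = " ".join(string_table[dx] for dx in ordered_dx)
    # Translate the whole sentence, not each string separately.
    return translate(sentence)
```

Translating the combined sentence rather than each registered character string is a design choice of this sketch; the text only requires that the result be a translation of the combined sentence.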

As understood from the above description, the information generation system 1 of the fourth embodiment is a system that generates related information Z related to the guidance voice V (and provides it to the user U), and comprises: a character string specifying unit 114 that, for each of a plurality of recognized character strings L obtained by speech recognition of the guidance voice V, specifies, from among a plurality of registered character strings X representing different utterance contents, the registered character string X similar to that recognized character string L; and an information generation unit 116 that generates related information Z according to, among a plurality of pieces of correspondence information C designating different combinations of registered character strings X, the correspondence information C corresponding to the combination of registered character strings X specified by the character string specifying unit 114. Typical examples of related information Z according to the correspondence information C are a translation of the guidance sentence A illustrated in the first embodiment, and a translation of a sentence composed of the plurality of registered character strings X indicated by the correspondence information C (second or third embodiment). Although the above description illustrates a configuration in which speech of the character string indicated by the related information Z is emitted from the sound emitting device 26, the method of outputting the related information Z is not limited to this example. For example, the character string indicated by the related information Z may be displayed on a display device.

<Modifications>
Each of the aspects illustrated above can be modified in various ways. Specific modifications are illustrated below. Two or more aspects arbitrarily selected from the following examples may be combined as appropriate to the extent that they do not contradict each other.

(1) In the first to third embodiments, the information generation unit 116 generates, as the distribution information Q, the identification information DZ of the character string indicating the utterance content of the guidance voice V (identification information DX in the second and third embodiments), but the content of the distribution information Q is not limited to this example. For example, the presentation device 340 may display the character string itself indicating the utterance content of the guidance voice V, or a character string obtained by translating that utterance content into another language. The method of presentation to the user U is also not limited to display. For example, a sound emitting device that emits, as sound, the guidance sentence A or the registered character strings X designated by the distribution information Q may be used as the presentation device 340. As understood from these examples, the distribution information Q is comprehensively expressed as information relating to the guidance voice V.

(2) In each of the above embodiments, the information generation unit 116 generates the distribution information Q (related information Z in the fourth embodiment) after identifying the correspondence information C, but generation of the distribution information Q may also take into account the case where the correspondence information C cannot be identified. For example, based on the configurations of the first to third embodiments, when the information generation unit 116 cannot identify the correspondence information C, the analysis result K of the guidance voice V may be used as the distribution information Q, or no distribution information Q may be generated. Likewise, based on the configuration of the fourth embodiment, when the information generation unit 116 cannot identify the correspondence information C, a translation of the character string of the analysis result K of the guidance voice V may be used as the related information Z, or no related information Z may be generated.

(3) In each of the above embodiments, the sentences making up the passages recorded in an announcement book or the like are used as the registered character strings X, but a sentence composed of a fixed phrase with an insertion section and an insertion phrase may also be used as a registered character string X. For example, a sentence obtained by inserting, into the fixed phrase similar to the analysis result K, the insertion phrase similar to the portion of the analysis result K corresponding to the insertion section is used as the registered character string X. In this configuration, one fixed phrase is shared across a plurality of analysis results K (a character string common to a plurality of guidance voices V need not be included in each individual registered character string X of the character string table TA), which has the advantage of reducing the data size of the character string table TA.

(4) Each of the above embodiments illustrates a configuration in which the management device 10 includes the voice analysis unit 112, the character string specifying unit 114, and the information generation unit 116, but some or all of the functions of the management device 10 may instead be installed in the voice guidance system 100. For example, in a configuration based on the first to third embodiments in which the voice analysis unit 112, the character string specifying unit 114, and the information generation unit 116 are installed in the distribution terminal 20, analysis of the voice signal SG (voice analysis unit 112), specification of the registered character strings X (character string specifying unit 114), and generation of the distribution information Q (information generation unit 116) are executed in the distribution terminal 20, and the distribution information Q is transmitted from the sound emitting device 26 to the terminal device 30. Because this configuration requires no communication between the voice guidance system 100 and the management device 10, the distribution information Q can be provided to the terminal device 30 even in an environment where communication via the communication network 200 is unavailable. In a configuration based on the fourth embodiment in which the same three units are installed in the distribution terminal 20, on the other hand, analysis of the voice signal SG (voice analysis unit 112), specification of the registered character strings X (character string specifying unit 114), and generation of the related information Z (information generation unit 116) are executed in the distribution terminal 20, and the related information Z is emitted as sound from the sound emitting device 26.

(5) In the first to third embodiments, the presentation table TC is stored in the storage device 330 of the terminal device 30, but the storage location of the presentation table TC is not limited to this example. For example, the presentation table TC may be stored in a distribution server device that communicates with the terminal device 30 via a communication network such as a mobile communication network or the Internet. The terminal device 30 transmits to the distribution server device an information request designating the identification information DZ (identification information DX in the second and third embodiments) included in the distribution information Q, and the distribution server device transmits the guidance sentence A corresponding to the identification information DZ designated in the information request to the requesting terminal device 30. The presentation device 340 of the terminal device 30 presents the guidance sentence A received from the distribution server device to the user U. As understood from this description, storing the presentation table TC in the storage device 330 of the terminal device 30 is not essential.

(6) In each of the above embodiments, the voice analysis unit 112 divides the analysis result K obtained by speech recognition of the guidance voice V into sentences, but the unit of division is not limited to this example. For example, the analysis result K may be divided into phrases or words, or into units of a predetermined number of sentences.

(7) In each of the above embodiments, the information generation system 1 is used in a commercial facility 300 such as a shopping mall, but the places where the information generation system 1 can be used are not limited to this example. For example, the information generation system 1 may be used for announcements on transportation such as buses and trains.

(8) In the first to third embodiments, acoustic communication is used to distribute the distribution information Q to the terminal device 30, but the communication method is not limited to this example. For example, the distribution information Q may be distributed to the terminal device 30 by wireless communication using infrared light or radio waves (for example, short-range wireless communication).

(9) In the second embodiment, the correspondence information C designates a combination of a plurality of registered character strings X, but the information designated by the correspondence information C is not limited to this example. For example, the correspondence information C may designate, among the plurality of pieces of identification information DX that it indicates, the identification information DX of a registered character string X that is not to be included in the distribution information Q (that is, not presented to the user U). For example, suppose the guide OP utters "Thank you very much for visiting us today. (= DX1) We have an urgent announcement. (= DX3) A car has been left with its headlights on. (= DX6)". If the correspondence information C2 indicating identification information DX1, DX3, and DX6 designates "identification information DX1: not presented", distribution information Q can be generated that allows the terminal device 30 to present to the user U the related information Z "We have an urgent announcement. (= DX3) A car has been left with its headlights on. (= DX6)". With this configuration, a sentence that should be emitted as part of the guidance voice V but should not be distributed to the terminal device 30 (for example, a sentence containing personal information or a sentence of low importance) can be excluded from the distribution information Q.

(10) As illustrated in each of the above embodiments, the information generation system 1 is realized by cooperation between the control device 110 of the management device 10 and a program. For example, the program corresponding to the first to third embodiments is a program for generating distribution information Q that is transmitted to the terminal device 30 so that the terminal device 30 presents to the user U related information Z related to the guidance voice V. The program causes a computer to function as: a character string specifying unit 114 that, for each of a plurality of recognized character strings L obtained by speech recognition of the guidance voice V, specifies, from among a plurality of registered character strings X representing different utterance contents, the registered character string X similar to that recognized character string L; and an information generation unit 116 that generates distribution information Q for the terminal device 30 to present to the user U related information Z according to, among a plurality of pieces of correspondence information C designating different combinations of registered character strings X, the correspondence information C corresponding to the combination of registered character strings X specified by the character string specifying unit 114. The program corresponding to the fourth embodiment is a program for generating related information Z related to the guidance voice V (and providing it to the user U). It causes a computer to function as: a character string specifying unit 114 that, for each of a plurality of recognized character strings L obtained by speech recognition of the guidance voice V, specifies, from among a plurality of registered character strings X representing different utterance contents, the registered character string X similar to that recognized character string L; and an information generation unit 116 that generates related information Z according to, among a plurality of pieces of correspondence information C designating different combinations of registered character strings X, the correspondence information C corresponding to the combination of registered character strings X specified by the character string specifying unit 114. The programs illustrated above may be provided in a form stored on a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium, a good example being an optical recording medium (optical disc) such as a CD-ROM, but it may be any known type of recording medium such as a semiconductor recording medium or a magnetic recording medium. The program may also be delivered to a computer by distribution via a communication network.

(11) The present invention is also specified as an operation method (information generation method) of the information generation system 1 according to each of the above embodiments. For example, the information generation method corresponding to the first to third embodiments is a method for generating distribution information Q that is transmitted to the terminal device 30 so that the terminal device 30 presents to the user U related information Z related to the guidance voice V. The method specifies, for each of a plurality of recognized character strings L obtained by speech recognition of the guidance voice V, the registered character string X similar to that recognized character string L from among a plurality of registered character strings X representing different utterance contents, and generates distribution information Q for the terminal device 30 to present to the user U related information Z according to, among a plurality of pieces of correspondence information C designating different combinations of registered character strings X, the correspondence information C corresponding to the specified combination of registered character strings X. The information generation method corresponding to the fourth embodiment is a method for generating related information Z related to the guidance voice V (and providing it to the user U). The method specifies, for each of a plurality of recognized character strings L obtained by speech recognition of the guidance voice V, the registered character string X similar to that recognized character string L from among a plurality of registered character strings X representing different utterance contents, and generates related information Z according to, among a plurality of pieces of correspondence information C designating different combinations of registered character strings X, the correspondence information C corresponding to the specified combination of registered character strings X.
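
The information generation method summarized above, together with modifications (2) and (9), can be sketched end to end as follows. This is a rough illustration under stated assumptions: the table contents, the list-of-dictionaries form of the guidance table TB, the `hidden` field, and the function names are hypothetical, and the similarity measure (the ratio from Python's standard `difflib.SequenceMatcher`) is a stand-in chosen for the example rather than the measure used in the embodiments.

```python
from difflib import SequenceMatcher

# Hypothetical character string table TA: DX -> registered character string X.
STRING_TABLE_TA = {
    "DX1": "thank you very much for visiting us today",
    "DX3": "we have an urgent announcement",
    "DX5": "we would like to announce a lost child",
    "DX6": "a car has been left with its headlights on",
}

# Hypothetical guidance table TB: one entry per piece of correspondence
# information C. "dx" is the combination in regular order; "hidden" lists
# DX values marked "not presented" as in modification (9).
GUIDANCE_TABLE_TB = [
    {"dx": ["DX1", "DX3", "DX6"], "hidden": {"DX1"}},
]


def most_similar_dx(recognized):
    """Pick the DX whose registered string X is most similar to the input."""
    return max(
        STRING_TABLE_TA,
        key=lambda dx: SequenceMatcher(None, recognized, STRING_TABLE_TA[dx]).ratio(),
    )


def generate_distribution_info(recognized_strings, analysis_result_k=None):
    """Map recognized strings L to DX, then look up the matching C.

    Returns the DX list to distribute, or analysis_result_k as the fallback
    of modification (2) when no correspondence information C matches.
    """
    combination = {most_similar_dx(s) for s in recognized_strings}
    for c in GUIDANCE_TABLE_TB:
        if set(c["dx"]) == combination:
            return [dx for dx in c["dx"] if dx not in c["hidden"]]
    return analysis_result_k
```

Because `most_similar_dx` always returns some DX, a practical variant would probably reject matches below a similarity threshold; that threshold is left out here to keep the sketch short.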

DESCRIPTION OF SYMBOLS: 1 ... information generation system, 10 ... management device, 20 ... distribution terminal, 22 ... sound collection device, 24 ... adder, 26 ... sound emitting device, 30 ... terminal device, 100 ... voice guidance system, 110 ... control device, 112 ... voice analysis unit, 114 ... character string specifying unit, 116 ... information generation unit, 120 ... storage device, 130 ... communication device, 200 ... communication network, 210 ... control device, 212 ... voice acquisition unit, 214 ... signal processing unit, 220 ... communication device, 300 ... commercial facility, 310 ... sound collection device, 320 ... control device, 322 ... information extraction unit, 324 ... information management unit, 330 ... storage device, 340 ... presentation device

Claims

An information generation system for generating distribution information to be transmitted to a terminal device so that the terminal device presents, to a user, related information related to a guidance voice, the system comprising:
a character string specifying unit that, for each of a plurality of recognized character strings obtained by speech recognition of the guidance voice, specifies, from among a plurality of registered character strings representing different utterance contents, the registered character string similar to the recognized character string; and
an information generation unit that generates the distribution information for the terminal device to present to the user related information according to, among a plurality of pieces of correspondence information designating different combinations of the plurality of registered character strings, the correspondence information corresponding to the combination of the plurality of registered character strings specified by the character string specifying unit.

2. The information generation system according to claim 1, wherein the information generation unit generates the distribution information for causing the terminal device to present to the user related information indicating, among a plurality of guidance sentences prepared one for each piece of correspondence information, the guidance sentence of the correspondence information corresponding to the combination of the plurality of registered character strings specified by the character string specifying unit.

3. The information generation system according to claim 1, wherein the information generation unit generates the distribution information for causing the terminal device to present to the user related information indicating the plurality of registered character strings designated by the correspondence information corresponding to the combination of the plurality of registered character strings specified by the character string specifying unit.

4. The information generation system, wherein the correspondence information designates a combination of a plurality of registered character strings and an order of the plurality of registered character strings in that combination, and
the information generation unit generates the distribution information for causing the terminal device to present to the user related information indicating the plurality of registered character strings arranged in the order designated by the correspondence information.

5. The information generation system according to claim 1, further comprising a sound emitting unit that emits the guidance voice and sound indicating the distribution information.
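The two forms of related information recited in the claims (a guidance sentence prepared for each piece of correspondence information, or the registered character strings arranged in the order the correspondence information designates) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `Correspondence` record, the `TABLE` contents, and the function names do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Correspondence:
    """Hypothetical record for one piece of correspondence information."""
    ordered_strings: Tuple[str, ...]  # registered strings in designated order
    guidance_sentence: str            # sentence prepared for this combination


# Illustrative data only; not taken from the patent.
TABLE = [
    Correspondence(
        ("lost child", "red hat", "information desk"),
        "A lost child wearing a red hat is at the information desk.",
    ),
]


def find_correspondence(specified: Sequence[str]) -> Optional[Correspondence]:
    """Match on the combination, ignoring the order of recognition."""
    wanted = frozenset(specified)
    for entry in TABLE:
        if frozenset(entry.ordered_strings) == wanted:
            return entry
    return None


def related_info_as_sentence(specified: Sequence[str]) -> Optional[str]:
    """Related information given as the prepared guidance sentence."""
    entry = find_correspondence(specified)
    return entry.guidance_sentence if entry else None


def related_info_as_strings(specified: Sequence[str]) -> Optional[List[str]]:
    """Related information given as registered strings in designated order."""
    entry = find_correspondence(specified)
    return list(entry.ordered_strings) if entry else None
```

In this sketch the order in which the strings were recognized plays no role in the output; the designated order stored with the correspondence information decides the arrangement presented to the user.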