JP2018136472A - Language clarification device and loudspeaker broadcasting system

Info

Publication number: JP2018136472A
Application number: JP2017031763A
Authority: JP
Inventors: Hiroshi Hashimoto; Osamu Inukai; Tatsuya Miyamoto
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2017-02-23
Filing date: 2017-02-23
Publication date: 2018-08-30
Anticipated expiration: 2037-02-23
Also published as: JP6772889B2

Abstract

PROBLEM TO BE SOLVED: To make it possible to reduce the processing load of language clarification. SOLUTION: A language clarification device is provided which comprises: a fundamental tone extracting unit which divides a speech signal at predetermined time intervals and extracts a fundamental tone for each of the divided speech signals; a filter application unit which stores in advance a filter generated based on formant features, determines an application frequency of the filter according to the fundamental tone and the pitch of the divided speech signal, and applies the filter to the divided speech signal; and a maximization processing unit which emphasizes a predetermined frequency band of the speech signal processed by the filter application unit. SELECTED DRAWING: Figure 3

Description

The present invention relates to a language clarification device and a loudspeaker broadcasting system.

Loudspeaker broadcasting systems, such as disaster prevention administrative radio systems, broadly consist of slave stations that broadcast through loudspeakers and a master station that transmits information on the broadcast content to the slave stations. When a slave station is installed outdoors, for example, a poor external environment (noise such as traffic noise and everyday noise) can make the loudspeaker broadcast hard for users to hear.

To deal with this situation, various techniques for clarifying loudspeaker broadcasts have been studied. For example, Patent Document 1 below discloses a technique that achieves clarification by emphasizing formants in a disaster prevention administrative radio system or the like. Patent Document 2 discloses a technique that divides speech into phonemes, determines whether the speech needs to be emphasized, and corrects the speech using data stored in advance. Patent Document 3 discloses a technique that emphasizes formants using a formant filter bank set for each phoneme. Further, Patent Document 4 discloses a technique that emphasizes formants using a database of formant-emphasizing filters prepared in advance.

Patent Document 1: JP 2015-161911 A
Patent Document 2: JP 2008-70564 A
Patent Document 3: JP 2004-4952 A
Patent Document 4: JP 2003-271200 A

However, with some existing techniques the processing load of language clarification tends to be high. For example, the technique of Patent Document 1 extracts formants by spectral shaping based on the spectral envelope, records ambient noise for a process that thins out harmonics having little effect on the perception of speech in noise, and redistributes the energy of the thinned harmonics to other important components. This processing imposes a high load. As a result, a high-performance device may be required, or a processing delay may occur.

The present invention has been made in view of the above problem, and an object of the present invention is to provide a new and improved language clarification device and loudspeaker broadcasting system capable of further reducing the processing load of language clarification.

To solve the above problem, according to one aspect of the present invention, there is provided a language clarification device including: a fundamental tone extraction unit that divides an audio signal at predetermined time intervals and extracts a fundamental tone for each divided audio signal; a filter processing unit that stores in advance a filter generated based on formant characteristics, determines an applied frequency of the filter based on the fundamental tone and the pitch of the divided audio signal, and applies the filter to the divided audio signal; and a maximize processing unit that emphasizes a predetermined frequency band in the audio signal processed by the filter processing unit.

The filter processing unit may determine a frequency equal to or higher than the frequency of the fundamental tone as the applied frequency.

The filter may be a comb filter.

To solve the above problem, according to another aspect of the present invention, there is provided a loudspeaker broadcasting system including: a master station device that controls a slave station device with respect to a loudspeaker broadcast; the slave station device, which performs the loudspeaker broadcast; and a language clarification device that clarifies the speech of the loudspeaker broadcast. The language clarification device includes: a fundamental tone extraction unit that divides an audio signal at predetermined time intervals and extracts a fundamental tone for each divided audio signal; a filter processing unit that stores in advance a filter generated based on formant characteristics, determines an applied frequency of the filter based on the fundamental tone and the pitch of the divided audio signal, and applies the filter to the divided audio signal; and a maximize processing unit that emphasizes a predetermined frequency band in the audio signal processed by the filter processing unit. The master station device includes a transmission unit that transmits a signal including the audio signal to the slave station device. The slave station device includes a reception unit that receives the signal and a sounding control unit that controls the loudspeaker broadcast based on the signal.

As described above, the present invention makes it possible to further reduce the processing load of language clarification.

FIG. 1 is a diagram showing the configuration of the loudspeaker broadcasting system according to the present embodiment.
FIG. 2 is a block diagram showing the functional configuration of the language clarification device according to the present embodiment.
FIG. 3 is a block diagram showing the functional configuration of the processing unit of the language clarification device according to the present embodiment.
FIG. 4 is a diagram showing an example of the transition of the frequency to which the filter is applied.
FIG. 5 is a block diagram showing the functional configuration of the master station according to the present embodiment.
FIG. 6 is a block diagram showing the functional configuration of the slave station according to the present embodiment.
FIG. 7 is a sequence diagram showing the operation of the language clarification processing by the language clarification device according to the present embodiment.
FIG. 8 is a block diagram showing the hardware configuration of the language clarification device according to the present embodiment.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.

<1. Background>
As described above, various techniques for clarifying loudspeaker broadcasts have been studied. For example, Patent Document 1 discloses a technique that achieves clarification by emphasizing formants in a disaster prevention administrative radio system or the like. In Patent Document 1, speech clarification is achieved mainly by two techniques. One adapts the speech to the noise characteristics by spectral shaping based on the spectral envelope. The other thins out harmonics that have little effect on the perception of speech in noise and redistributes the energy of the thinned harmonics to other important components.

However, deriving formants by extracting the spectral envelope, and thinning out harmonics that have little effect on the perception of speech in noise and redistributing their energy to the remaining harmonics, both impose a large processing load. Moreover, in the technique of Patent Document 1, noise collected in the vicinity of the broadcast is used in the clarification processing in order to adapt the speech to the noise characteristics. If the broadcast area of the loudspeaker broadcasting system is wide and the noise characteristics differ from one broadcast point to another, using noise collected at only one location does not allow the speech to adapt sufficiently to the noise characteristics at each point. Conversely, if the clarification processing is performed using the noise at every broadcast point, the processing load becomes even larger.

In view of the above circumstances, the inventors arrived at the present invention. A language clarification device according to an embodiment of the present invention stores in advance, as a template, filter characteristics generated based on formant characteristics. The language clarification device then divides the audio signal to be broadcast at predetermined time intervals, extracts the fundamental tone of each divided audio signal (the fundamental tone is defined later), and determines the applied frequency of the filter based on that fundamental tone. In this way, the language clarification device can further reduce the processing load of clarification.

The present embodiment is described in detail below, mainly in the following order: "2. Overview of the loudspeaker broadcasting system according to the present embodiment", "3. Functional configuration of each device", "4. Operation of the language clarification device", and "5. Hardware configuration of the language clarification device".

<2. Overview of the Loudspeaker Broadcasting System According to the Present Embodiment>
The background of the present invention has been described above. Next, an overview of the loudspeaker broadcasting system according to an embodiment of the present invention is given.

(2-1. Configuration of the loudspeaker broadcasting system)
First, the configuration of the loudspeaker broadcasting system according to the present embodiment is described with reference to FIG. 1. FIG. 1 is a diagram showing the configuration of the loudspeaker broadcasting system according to the present embodiment. As shown in FIG. 1, the system includes a language clarification device 100, a console 200, a master station 300, and a slave station 400. The master station 300 includes an antenna 310 used for wireless communication, and the slave station 400 includes an antenna 410 used for wireless communication and a loudspeaker 420 used for loudspeaker broadcasting. Each of these is described below.

(Language clarification device 100)
The language clarification device 100 clarifies the broadcast speech by performing various kinds of processing on the audio signal used for the loudspeaker broadcast. More specifically, the language clarification device 100 performs fundamental tone extraction processing, filter processing, maximize processing, and the like on the audio signal that the person performing the loudspeaker broadcast inputs using a microphone or the like (not shown), and provides the processed audio signal to the console 200 described later. Details of the processing by the language clarification device 100 are described later.

(Console 200)
The console 200 is a device on which various settings for the loudspeaker broadcast are made. More specifically, the person performing the loudspeaker broadcast uses the console 200 to input the slave stations 400 targeted by the broadcast, the settings for the broadcast, and the like. The console 200 provides the information on these settings and the audio signal provided from the language clarification device 100 to the master station 300 described later.

(Master station 300)
The master station 300 transmits the information used for the loudspeaker broadcast as a radio signal. More specifically, the master station 300 generates a radio signal based on the information input from the console 200 and uses the antenna 310 to transmit the radio signal to a relay station (not shown) or to each slave station 400. The master station 300 transmits several types of radio signals. For example, the master station 300 may transmit to the slave station 400 a signal instructing a loudspeaker broadcast, a signal containing the broadcast content, a signal instructing the end of the loudspeaker broadcast, and the like. These signals are merely examples, and the signals transmitted change as appropriate depending on the type or method of the loudspeaker broadcast.

(Slave station 400)
The slave station 400 is a device that performs the loudspeaker broadcast. More specifically, the slave station 400 uses the antenna 410 to receive a radio signal from the master station 300 or a relay station, and performs the loudspeaker broadcast using the information contained in that signal. For example, the slave station 400 determines, based on the information contained in the radio signal, whether it is a target of the broadcast. If it is, the slave station 400 applies the operation settings contained in the signal and performs the loudspeaker broadcast using the loudspeaker 420. This specification describes, as an example, the case where the slave station 400 is a fixed station installed on a road, in a park, or the like, but the slave station 400 may be of any type.
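The target check performed by the slave station can be sketched as follows. This is only an illustration: the source does not specify the signal format, so the fields `target_ids`, `settings`, and `audio`, as well as the `SlaveStation` class and its `handle` method, are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class BroadcastSignal:
    """Hypothetical decoded radio signal; the field names are assumptions."""
    target_ids: set                                 # IDs of the stations addressed
    settings: dict = field(default_factory=dict)    # operation settings to apply
    audio: list = field(default_factory=list)       # audio samples to broadcast


class SlaveStation:
    """Minimal stand-in for a slave station; not the patent's implementation."""

    def __init__(self, station_id):
        self.station_id = station_id
        self.settings = {}
        self.played = []   # stands in for output through the loudspeaker

    def handle(self, signal):
        """Broadcast only if this station is a target; return True if it played."""
        if self.station_id not in signal.target_ids:
            return False                          # not addressed: ignore the signal
        self.settings.update(signal.settings)     # reflect the operation settings
        self.played.append(signal.audio)          # perform the broadcast
        return True
```

A station whose ID is absent from `target_ids` leaves its settings and output untouched, which mirrors the behavior described above.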

(2-2. Functional overview of the loudspeaker broadcasting system)
The configuration of the loudspeaker broadcasting system according to the present embodiment has been described above. Next, a functional overview of the system is given.

The language clarification device 100 of the loudspeaker broadcasting system according to the present embodiment first stores, as a template, filter characteristics generated based on formant characteristics. In this specification, the stored template of filter characteristics is hereinafter referred to as the filter. The filter may be of any type; this specification assumes that it is a comb filter. When the person performing the loudspeaker broadcast inputs an audio signal, the language clarification device 100 divides the audio signal at predetermined time intervals and extracts the fundamental tone by analyzing each divided audio signal. Here, the fundamental tone is the lowest-frequency component when the audio signal is represented as a sum of sine waves. The predetermined time, that is, the interval at which the audio signal is divided, may be any length; this specification assumes, for example, an interval of several tens of milliseconds.
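The division and fundamental tone extraction steps can be sketched as follows. The source does not say how the fundamental tone is extracted, so this sketch substitutes a common autocorrelation-based pitch estimate as an assumption. The function names, the 20 ms frame length, and the 70 to 400 Hz search range are also illustrative choices, not values from the source.

```python
import math


def split_frames(samples, sample_rate, frame_ms=20):
    """Split samples into consecutive, non-overlapping frames of frame_ms ms."""
    n = max(1, int(sample_rate * frame_ms / 1000))
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def estimate_fundamental(frame, sample_rate, f_min=70.0, f_max=400.0):
    """Estimate the fundamental frequency of one frame, in Hz.

    Picks the lag with the largest positive autocorrelation within the
    search range (an assumed method). Returns None when no positive
    correlation is found, for example for a silent frame.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else None
```

Each frame is analyzed independently, so the estimate can follow a pitch that changes from one frame to the next.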

The language clarification device 100 then applies the filter stored in advance to frequencies at or above the frequency of the fundamental tone. For each divided audio signal, this allows the language clarification device 100 to keep the components in the frequency bands that are important for speech and should be emphasized, while removing components in less important frequency bands, such as noise.

Because the fundamental tone basically changes with pitch, the language clarification device 100 determines, for each divided audio signal, the frequency at which to apply the filter according to that signal's pitch (in practice, this is equivalent to determining the frequency according to the fundamental tone). This allows the language clarification device 100 to handle audio signals whose pitch changes from moment to moment.

The language clarification device 100 then amplifies predetermined band components of the audio signal after the filter has been applied. Applying the filter removes the components in less important frequency bands, which reduces the amplitude of the audio signal (in other words, its acoustic energy). The language clarification device 100 can therefore amplify the audio signal by the amount of that reduction.

When the slave station 400 performs the loudspeaker broadcast using an audio signal processed in this way, the loudspeaker broadcasting system according to the present embodiment can deliver a broadcast that is easy for users to hear even in a poor external environment (noise such as traffic noise and everyday noise, rain, rainstorms, snowstorms, typhoons, lightning, and the like). In addition, applying the filter based on pitch (in practice, equivalent to applying the filter based on the fundamental tone) imposes a lower load than the processing of Patent Document 1 (such as extraction of the spectral envelope), so the system can further reduce the processing load required for language clarification.

The noise spectrum is not constant. However, because the language clarification device 100 moderately changes the applied frequency of the filter for each divided audio signal, the speech is less likely to be masked by the spectral components of the noise, which makes language clarification possible under noise.

The language clarification device 100 only needs to be provided on the source side of the signal used for the loudspeaker broadcast (the console 200, the master station 300, or the like), and need not be provided on the slave station 400 side. This reduces the resources required to install, adjust, maintain, and operate the language clarification device 100.

<3. Functional Configuration of Each Device>
The functional overview of the loudspeaker broadcasting system according to the present embodiment has been given above. Next, the functional configuration of each device is described.

(Functional configuration of the language clarification device 100)
First, the functional configuration of the language clarification device 100 according to the present embodiment is described with reference to FIGS. 2 and 3. FIG. 2 is a block diagram showing the functional configuration of the language clarification device 100. As shown in FIG. 2, the language clarification device 100 includes a switching unit 110 and a processing unit 120. FIG. 3 is a block diagram showing the functional configuration of the processing unit 120 of the language clarification device 100. As shown in FIG. 3, the processing unit 120 includes a preprocessing unit 121, a fundamental tone extraction unit 122, a filter processing unit 123, and a maximize processing unit 124. Each functional component is described below.

(Switching unit 110)
The switching unit 110 switches whether the processing by the language clarification device 100 (hereinafter referred to as "language clarification processing" for convenience) is performed on the audio signal. More specifically, when the switching unit 110 receives an audio signal and a switching control signal from an external device, it analyzes the switching control signal. If it determines that language clarification processing is requested, it sequentially provides the received audio signal to the processing unit 120. If it determines that language clarification processing is not requested, it sequentially provides the audio signal to the console 200 without language clarification processing. Instead of receiving the switching control signal from an external device, the switching unit 110 may generate the switching control signal itself when the person performing the loudspeaker broadcast operates an input unit (not shown; a button, switch, touch panel, or the like) provided on the language clarification device 100.
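The routing decision made by the switching unit can be sketched as follows. The function `route` and its parameters are hypothetical: `process` stands in for the processing unit 120, and `send_to_console` stands in for delivery to the console 200.

```python
def route(audio_chunk, clarify_requested, process, send_to_console):
    """Route one chunk of audio according to the switching control signal.

    When clarification is requested, the chunk passes through `process`
    before delivery; otherwise it is delivered unchanged.
    """
    out = process(audio_chunk) if clarify_requested else audio_chunk
    send_to_console(out)
    return out
```

Passing the two destinations in as callables keeps the switch itself free of any signal-processing detail, so the bypass path stays trivially cheap.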

(Processing unit 120)
The processing unit 120 performs language clarification processing on the audio signal. As described above, the processing unit 120 includes the preprocessing unit 121, the fundamental tone extraction unit 122, the filter processing unit 123, and the maximize processing unit 124. Each functional component of the processing unit 120 is described below.

(Preprocessing unit 121)
The preprocessing unit 121 performs various kinds of processing on the audio signal so that the subsequent processing runs smoothly. This processing may include any kind of signal processing. For example, the preprocessing unit 121 performs amplification, smoothing, shaping, or the like on the audio signal. The preprocessing unit 121 provides the processed audio signal to the fundamental tone extraction unit 122 and the filter processing unit 123 described later. The preprocessing unit 121 may also receive feedback from the fundamental tone extraction unit 122 or the filter processing unit 123 and, in response, redo its processing or change the parameters it uses.

(Fundamental tone extraction unit 122)
The fundamental tone extraction unit 122 extracts the fundamental tone of the audio signal. More specifically, the fundamental tone extraction unit 122 divides the audio signal processed by the preprocessing unit 121 and analyzes the divided signals, thereby extracting a fundamental tone for each divided audio signal. The fundamental tone extraction unit 122 then provides information on the fundamental tone of each divided audio signal, as a control signal, to the filter processing unit 123 described later.

(Filter processing unit 123)
The filter processing unit 123 applies the filter to the audio signal (hereinafter referred to as "filter processing" for convenience). More specifically, the filter processing unit 123 stores in advance a filter generated based on formant characteristics. When an audio signal is provided from the preprocessing unit 121, the filter processing unit 123 applies the filter to frequencies at or above the frequency of the fundamental tone, based on the control signal provided from the fundamental tone extraction unit 122.

The filter processing is now described with reference to FIG. 4. FIG. 4 is a diagram showing an example of the transition of the frequency to which the filter is applied. First, the fundamental tone extraction unit 122 analyzes the audio signal at time point 1 (for example, the time at which input of the audio signal starts) and extracts the fundamental tone. As shown in FIG. 4A, the filter processing unit 123 then applies the filter to frequencies at or above the fundamental frequency F0. Next, the fundamental tone extraction unit 122 analyzes the audio signal at time point 2, the segment immediately following time point 1. Because the pitch of the audio signal at time point 1 differs from that at time point 2, the fundamental tones at the two time points also differ. The fundamental tone extraction unit 122 extracts the fundamental tone of the audio signal at time point 2, and, as shown in FIG. 4B, the filter processing unit 123 applies the filter to frequencies at or above the fundamental frequency F0 + α. The fundamental tone extraction unit 122 and the filter processing unit 123 then process the audio signal at time point 3 in the same way, so that, as shown in FIG. 4C, the filter is applied to frequencies at or above the fundamental frequency F0 − β.
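How the applied frequency follows the fundamental tone can be sketched as follows. The source does not give the shape of the stored filter template, so this sketch assumes a hypothetical comb whose passbands sit at integer multiples of the fundamental frequency and which passes nothing below the fundamental. The names `comb_gain` and `apply_comb`, the 25 Hz half-width, and the use of a naive DFT are all illustrative assumptions.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def comb_gain(freq, f0, half_width):
    """Assumed template: pass bands around multiples of f0, nothing below f0."""
    if freq < f0 - half_width:
        return 0.0
    nearest = round(freq / f0) * f0   # closest integer multiple of f0
    return 1.0 if abs(freq - nearest) <= half_width else 0.0


def apply_comb(frame, sample_rate, f0, half_width=25.0):
    """Apply the comb to one frame, positioned by that frame's fundamental f0."""
    n = len(frame)
    spectrum = dft(frame)
    for k in range(n):
        freq = min(k, n - k) * sample_rate / n   # fold negative-frequency bins
        spectrum[k] *= comb_gain(freq, f0, half_width)
    return idft(spectrum)
```

Calling `apply_comb` once per frame with that frame's own `f0` reproduces the movement from F0 to F0 + α to F0 − β described above: the template stays fixed, and only its position on the frequency axis changes.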

In this way, by continually changing the frequency to which the filter is applied according to the fundamental tone, the filter processing unit 123 can handle audio signals whose pitch changes from moment to moment. The filter processing unit 123 provides the filtered audio signal to the maximize processing unit 124.

(Maximize processing unit 124)
The maximize processing unit 124 performs amplification processing (hereinafter called "maximize processing" for convenience) on the audio signal. More specifically, because the preceding filter processing reduces the amplitude of the audio signal, the maximize processing unit 124 amplifies the components in the frequency band important for speech recognition by the amount of that reduction.
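One way to read "amplify by the amount of the reduction" is as a make-up gain computed from the signal level before and after filtering. The Python sketch below takes that reading. The RMS measure, the function names, and the `max_gain` safety limit are assumptions, and for brevity the gain is applied to the whole frame, not only to the speech-relevant band that the text specifies.

```python
import math

def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def makeup_gain(before, after, max_gain=8.0):
    """Gain that restores the pre-filter RMS level, limited to max_gain.

    Returns 1.0 when the filtered frame is silent, to avoid dividing by zero.
    """
    level_after = rms(after)
    if level_after == 0.0:
        return 1.0
    return min(rms(before) / level_after, max_gain)

def maximize(before, after, max_gain=8.0):
    """Scale the filtered frame so that its level matches the unfiltered frame."""
    gain = makeup_gain(before, after, max_gain)
    return [gain * s for s in after]
```

The gain limit is a design choice of this sketch: without it, a frame that the filter removes almost entirely would be amplified without bound, raising residual noise instead of speech.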

(Functional configuration of master station 300)
The functional configuration of the language clarification device 100 has been described above. Next, the functional configuration of the master station 300 is described with reference to FIG. 5. FIG. 5 is a block diagram showing the functional configuration of the master station 300 according to the present embodiment.

As shown in FIG. 5, the master station 300 includes a wired communication unit 320, a generation unit 330, a wireless communication unit 340, and a control unit 350. The wireless communication unit 340 includes an antenna 310. Each functional component is described below.

(Wired communication unit 320)
The wired communication unit 320 is an interface to a wired network and receives the information used for loudspeaker broadcasting from the console 200. More specifically, when the person carrying out the broadcast uses the console 200 to enter the slave stations 400 to be addressed, the broadcast settings, and so on, the wired communication unit 320 receives this information from the console 200 and provides it to the generation unit 330 described below.

(Generation unit 330)
The generation unit 330 generates various signals based on the information received by the wired communication unit 320. More specifically, the generation unit 330 generates a signal instructing a loudspeaker broadcast, a signal containing the broadcast content, a signal instructing the end of the loudspeaker broadcast, and so on, and provides these signals to the wireless communication unit 340 described below.

(Wireless communication unit 340)
The wireless communication unit 340 functions as a transmission unit that transmits wireless signals to the slave station 400. Specifically, the wireless communication unit 340 converts the various signals provided by the generation unit 330 into wireless signals by means of the antenna 310 and transmits them.

(Control unit 350)
The control unit 350 performs overall control of the various processes in the master station 300. For example, when the console 200 provides the various kinds of information used for loudspeaker broadcasting, the control unit 350 controls each functional component so that a signal containing that information is transmitted to the slave station 400.

(Functional configuration of slave station 400)
The functional configuration of the master station 300 has been described above. Next, the functional configuration of the slave station 400 is described with reference to FIG. 6. FIG. 6 is a block diagram showing the functional configuration of the slave station 400 according to the present embodiment.

As shown in FIG. 6, the slave station 400 includes a wireless communication unit 430, a control unit 440, and a ringing control unit 450. The wireless communication unit 430 includes an antenna 410, and the ringing control unit 450 includes a loudspeaker 420. Each functional component is described below.

(Wireless communication unit 430)
The wireless communication unit 430 functions as a reception unit that receives wireless signals from the master station 300. More specifically, the wireless communication unit 430 receives the wireless signal transmitted from the master station 300 via the antenna 410 and obtains the information used for loudspeaker broadcasting by demodulating or decoding that signal. The wireless communication unit 430 provides this information to the control unit 440 described below.

(Control unit 440)
The control unit 440 performs overall control of the various processes in the slave station 400. For example, the control unit 440 realizes the loudspeaker broadcast by providing the information received from the wireless communication unit 430 to the ringing control unit 450. The control unit 440 may also perform various kinds of processing on the information provided by the wireless communication unit 430, or may control the ringing control unit 450 by generating a control signal for it based on that information.

(Ringing control unit 450)
The ringing control unit 450 performs control related to loudspeaker broadcasting. More specifically, the ringing control unit 450 realizes the loudspeaker broadcast by controlling the loudspeaker 420 based on the information provided by the control unit 440.

<4. Operation of language clarification device>
The functional configuration of each device has been described above. Next, the operation of the language clarification processing is described with reference to FIG. 7. FIG. 7 is a sequence diagram showing the operation of the language clarification processing performed by the language clarification device 100 according to the present embodiment.

In step S1000, the preprocessing unit 121 performs various kinds of processing (amplification, smoothing, shaping, and so on) on the sequentially input audio signal. In step S1004, the preprocessing unit 121 sequentially provides the processed audio signal to the fundamental tone extraction unit 122, and in step S1008 it sequentially provides the same audio signal to the filter processing unit 123. In step S1012, the fundamental tone extraction unit 122 divides the audio signal at predetermined time intervals, and in step S1016 it extracts the fundamental tone of each divided audio signal. In step S1020, the fundamental tone extraction unit 122 generates a control signal containing information on the fundamental tone of each divided audio signal and provides it to the filter processing unit 123.

In step S1024, the filter processing unit 123 filters each divided audio signal by applying the filter, based on the control signal, to frequencies at or above the frequency of the fundamental tone. In step S1028, the filter processing unit 123 provides the filtered audio signal to the maximize processing unit 124. In step S1032, the maximize processing unit 124 performs maximize processing, which amplifies a predetermined band component of the audio signal. The maximize processing unit 124 then provides the maximized audio signal to the console 200.

The above is the operation of the language clarification processing performed by the language clarification device 100. The operation shown in FIG. 7 continues for as long as an audio signal is input. The operations shown in FIG. 7 may also be performed in parallel. For example, the preprocessing by the preprocessing unit 121 (step S1000), the division of the audio signal (step S1012) or the extraction of the fundamental tone (step S1016) by the fundamental tone extraction unit 122, the filter processing by the filter processing unit 123 (step S1024), and the maximize processing by the maximize processing unit 124 (step S1032) may be performed in parallel. This allows the audio signal input to the language clarification device 100 to be processed smoothly in sequence, so that the loudspeaker broadcast can be carried out without the audio being delayed or interrupted. The operation described above is merely an example and may be changed as appropriate.
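The parallel operation mentioned above can be pictured as a pipeline in which each stage works on a different frame at the same time. The following Python sketch is one possible arrangement and is not taken from the embodiment: the use of threads and FIFO queues, the sentinel object, and the names `run_stage` and `run_pipeline` are all assumptions. Each stage function stands in for one processing unit (preprocessing, filtering, maximize processing).

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input stream

def run_stage(func, inbox, outbox):
    """Apply func to every frame from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))

def run_pipeline(frames, stages):
    """Run the stage functions concurrently, one thread per stage.

    Frame order is preserved because every stage is a single thread
    reading from and writing to FIFO queues.
    """
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(_DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results
```

While one frame is being maximized, the next can already be in the filter stage and a third in preprocessing, which is what keeps the output from stalling as long as every stage keeps up with the frame rate.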

<5. Hardware configuration of language clarification device>
The operation of the language clarification processing has been described above. Next, the hardware configuration of the language clarification device 100 according to the present embodiment is described with reference to FIG. 8.

FIG. 8 is a block diagram showing the hardware configuration of the language clarification device 100 according to the present embodiment; the language clarification device 100 is embodied by the information processing device 900 shown in FIG. 8. The information processing device 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, a host bus 904, a bridge 905, an external bus 906, an interface 907, an input device 908, an output device 909, a storage device 910, and a communication device 911.

The CPU 901 functions as an arithmetic processing device and a control device, and controls overall operation within the information processing device 900 in accordance with various programs. The CPU 901 may be a microprocessor. The ROM 902 stores programs, calculation parameters, and the like used by the CPU 901. The RAM 903 temporarily stores programs used in execution by the CPU 901 and parameters that change as appropriate during that execution. These components are connected to one another by the host bus 904, which consists of a CPU bus or the like. The switching unit 110, the processing unit 120, the preprocessing unit 121, the fundamental tone extraction unit 122, the filter processing unit 123, or the maximize processing unit 124 of the language clarification device 100 can be embodied by the CPU 901.

The host bus 904 is connected via the bridge 905 to the external bus 906, such as a PCI (Peripheral Component Interconnect/Interface) bus. The host bus 904, the bridge 905, and the external bus 906 do not necessarily have to be configured separately; their functions may be implemented on a single bus.

The input device 908 consists of input means with which a user enters information, such as a mouse, keyboard, touch panel, buttons, microphone, switches, and levers, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 901. By operating the input device 908, a user of the information processing device 900 can enter various kinds of data into the information processing device 900 and instruct it to perform processing operations.

The output device 909 includes, for example, display devices such as a CRT (Cathode Ray Tube) display device, a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, and lamps, as well as audio output devices such as speakers.

The storage device 910 is a device for storing data. The storage device 910 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 910 consists of, for example, flash memory. The storage device 910 may drive the flash memory and store programs executed by the CPU 901 and various kinds of data.

The communication device 911 is, for example, a communication interface consisting of a communication device or the like for connecting to a network. The communication device 911 may be a wireless LAN (Local Area Network) compatible communication device or a wired communication device that communicates over a cable.

<6. Conclusion>
As described above, the language clarification device 100 according to the present embodiment divides the audio signal at predetermined time intervals, extracts the fundamental tone of each divided audio signal, and applies a filter stored in advance to frequencies at or above the frequency of the fundamental tone. For each divided audio signal, the language clarification device 100 can thereby retain the components in the important frequency bands that should be emphasized as speech while removing components in less important frequency bands, such as noise. It can therefore handle audio signals whose pitch changes from moment to moment.

Although the noise spectrum is not constant, the language clarification device 100 changes the frequency to which the filter is applied for each divided audio signal, which makes the speech less likely to be masked by the spectral components of the noise and enables language clarification in noisy conditions.

Because the slave station 400 performs the loudspeaker broadcast using the audio signal that has undergone language clarification processing, the loudspeaker broadcasting system according to the present embodiment can deliver broadcasts that are easy for users to hear even in a poor external environment (noise such as traffic noise and everyday noise, rainfall, rainstorms, snowstorms, typhoons, lightning, and so on). In addition, applying a filter based on pitch (which is substantially the same as applying a filter based on the fundamental tone) imposes a lower load than the processing of Patent Document 1, so the loudspeaker broadcasting system according to the present embodiment can further reduce the processing load required for language clarification.

A preferred embodiment of the present invention has been described in detail above with reference to the accompanying drawings, but the present invention is not limited to this example. It is clear that a person with ordinary knowledge in the technical field to which the present invention belongs could conceive of various changes or modifications within the scope of the technical idea described in the claims, and these are naturally understood to fall within the technical scope of the present invention.

For example, the functional components of each device may be provided in an external device as appropriate. For instance, each of the switching unit 110, the preprocessing unit 121, the fundamental tone extraction unit 122, the filter processing unit 123, and the maximize processing unit 124 of the language clarification device 100 may be provided in an external device as appropriate.

The positional relationship of the devices and the way they cooperate may also be changed as appropriate. For example, the language clarification device 100 may be provided on the slave station 400 side instead of on the side of the source of the broadcast signal (the console 200, the master station 300, or the like). The person carrying out the broadcast can then change the language clarification parameters, the filter type, and so on for each slave station 400, which allows language clarification suited to the noise conditions at the location where each slave station 400 is installed. The functions of the language clarification device 100 may also be provided in any of the console 200, the master station 300, or the slave station 400.

The communication methods described above may also be changed as appropriate. For example, the master station 300 and the slave station 400 may communicate by wire instead of wirelessly, and the language clarification device 100, the console 200, or the master station 300 may communicate wirelessly instead of by wire.

DESCRIPTION OF SYMBOLS
100 Language clarification device
110 Switching unit
120 Processing unit
121 Preprocessing unit
122 Fundamental tone extraction unit
123 Filter processing unit
124 Maximize processing unit
200 Console
300 Master station
310 Antenna
320 Wired communication unit
330 Generation unit
340 Wireless communication unit
350 Control unit
400 Slave station
410 Antenna
420 Loudspeaker
430 Wireless communication unit
440 Control unit
450 Ringing control unit

Claims

1. A language clarification device comprising:
a fundamental tone extraction unit that divides an audio signal at predetermined time intervals and extracts a fundamental tone for each of the divided audio signals;
a filter processing unit that stores in advance a filter generated based on formant characteristics, determines an application frequency of the filter based on a pitch of the fundamental tone and the divided audio signal, and applies the filter to the divided audio signal; and
a maximize processing unit that emphasizes a predetermined frequency band in the audio signal processed by the filter processing unit.

2. The language clarification device according to claim 1, wherein the filter processing unit determines a frequency equal to or higher than the frequency of the fundamental tone as the application frequency.

3. The language clarification device according to claim 1, wherein the filter is a comb filter.

4. A loudspeaker broadcasting system comprising: a master station device that controls a slave station device for loudspeaker broadcasting; the slave station device, which performs the loudspeaker broadcast; and a language clarification device that clarifies the speech of the loudspeaker broadcast, wherein
the language clarification device comprises:
a fundamental tone extraction unit that divides an audio signal at predetermined time intervals and extracts a fundamental tone for each of the divided audio signals;
a filter processing unit that stores in advance a filter generated based on formant characteristics, determines an application frequency of the filter based on a pitch of the fundamental tone and the divided audio signal, and applies the filter to the divided audio signal; and
a maximize processing unit that emphasizes a predetermined frequency band in the audio signal processed by the filter processing unit,
the master station device comprises:
a transmission unit that transmits a signal including the audio signal to the slave station device, and
the slave station device comprises:
a reception unit that receives the signal; and
a ringing control unit that controls the loudspeaker broadcast based on the signal.