JP2012059203A

JP2012059203A - Specific audio recognition apparatus and specific audio recognition method

Info

Publication number: JP2012059203A
Application number: JP2010204440A
Authority: JP
Inventors: Shinya Kubo; 眞也久保
Original assignee: NEC AccessTechnica Ltd
Current assignee: NEC Platforms Ltd
Priority date: 2010-09-13
Filing date: 2010-09-13
Publication date: 2012-03-22
Anticipated expiration: 2030-09-13
Also published as: JP5211124B2

Abstract

PROBLEM TO BE SOLVED: To determine whether an acquired ambient sound matches the area-specific sound of a specific geographical area, even when the area-specific sound data (such as siren sounds) differ from one geographical area (such as a municipality) to another, and, if it matches, to give notification of that fact inside a vehicle.

SOLUTION: A specific sound recognition apparatus includes a sound acquisition unit that acquires input ambient sound and outputs it as digital sound data, and a current position generation unit that calculates and outputs the current position of the apparatus itself. The apparatus also includes an area-specific sound storage unit that stores area-specific sounds, each heard only in a predetermined geographical area, and a current position recognition specifying unit that outputs the specific geographical area to which the current position belongs. The apparatus further includes a sound recognition result notification unit that compares the area-specific sound of the specific geographical area with the digital sound data and, when it determines that the two match, gives notification of that fact.

Description

The present invention relates to a specific sound recognition apparatus and a specific sound recognition method that give notification when a sound unique to a certain geographical area, such as the siren of an emergency vehicle, is detected.

Under the Road Traffic Act in recent years, when an emergency automobile (a fire engine, an ambulance, or another automobile specified by Cabinet Order) approaches while sounding a siren or the like, other vehicles must pull over to the left side of the road and give way. At or near an intersection, vehicles must keep clear of the intersection, pull over to the left side of the road (or to the right side where necessary), and stop temporarily. Here, an emergency automobile means one that is being driven on emergency duty as prescribed by Cabinet Order. Hereinafter, an emergency automobile is also referred to as an emergency vehicle.

However, when audio equipment inside a vehicle such as a private car is playing at high volume, or when noise inside or outside the vehicle is loud, the driver may fail to hear the siren of an emergency vehicle outside and may therefore be unable to respond appropriately.

One technique has therefore been proposed to keep the siren sound of an emergency vehicle from being missed (see, for example, Patent Document 1).

Patent Document 1, "Vehicle Exterior Sound Introducing Device," describes the following.

A plurality of microphones for collecting sounds outside the vehicle are attached to the top, rear, and left and right outer surfaces of the vehicle. The exterior sounds collected by the microphones are screened by a microcomputer and amplified by an amplifier. Each amplified exterior sound is emitted into the cabin from a speaker installed inside the vehicle at a position corresponding to the mounting position of the respective microphone. According to Patent Document 1, this allows the driver to reliably hear exterior sounds that are useful to the driver.

Another proposal informs the driver that an emergency vehicle is approaching even when the windows are closed and a car stereo or the like is playing at high volume (see, for example, Patent Document 2).

Patent Document 2, "Emergency Vehicle Detection Alarm Device and Position Display Device," describes the following.

A microphone signal is input to an emergency vehicle recognition device, which detects the siren sound of an emergency vehicle and outputs a signal to a display controller and a voice controller. When the display controller receives the signal from the emergency vehicle recognition device, it causes its built-in video signal generator to generate a video signal for a warning screen and displays the warning screen on a display. The voice controller causes its built-in voice synthesizer to generate a warning announcement and drives a speaker.

Patent Document 1: Japanese Patent Laid-Open No. 08-002339 (pages 2 to 4, FIGS. 1 to 5)
Patent Document 2: Japanese Patent Laid-Open No. 06-325291 (pages 4 to 6, FIGS. 1 to 5)

In the techniques described in Patent Documents 1 and 2, the exterior sound collected by a microphone outside the vehicle is compared with sounds, such as emergency vehicle siren sounds, registered in advance in storage means in the vehicle. If they match, the system recognizes that an emergency vehicle is approaching and notifies the driver through a screen or speaker in the vehicle. In Patent Document 2, the emergency vehicle sounds registered in the storage means are stored separately for fire engines, ambulances, police vehicles (such as patrol cars), and so on.

However, although sound data such as emergency vehicle siren sounds may differ from one municipality to another, Patent Documents 1 and 2 do not take this into account.

Specifically, for a fire engine or similar vehicle that uses a motor siren, the frequency of the siren sound varies continuously over a predetermined period. As a guideline, the frequencies used are often in the range of about 250 to 850 Hz. The siren sounded when passing through an intersection or the like has a frequency of about 900 Hz. A motor siren is a siren that produces sound by spinning blades with a motor, although in recent years the motor siren sound is sometimes replaced by an electronic sound.

Because only guidelines are given for the extent of the frequency variation, the duration of the sound, and so on, different municipalities may use different siren sounds.
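To make the point about municipal variation concrete, the following minimal sketch models a siren whose pitch sweeps between the guideline limits of about 250 and 850 Hz. Only that frequency range comes from the text; the linear (triangular) sweep shape, the function name `siren_frequency`, and the rise and fall times are assumptions chosen purely for illustration.

```python
def siren_frequency(t: float,
                    f_low: float = 250.0,
                    f_high: float = 850.0,
                    rise_s: float = 3.0,
                    fall_s: float = 3.0) -> float:
    """Instantaneous siren pitch in Hz at time t (seconds).

    Assumption: the pitch rises linearly from f_low to f_high over rise_s
    seconds, then falls linearly back over fall_s seconds, and repeats.
    """
    period = rise_s + fall_s
    phase = t % period
    if phase < rise_s:
        return f_low + (f_high - f_low) * phase / rise_s
    return f_high - (f_high - f_low) * (phase - rise_s) / fall_s
```

Two hypothetical municipalities that stay within the same 250 to 850 Hz guideline but choose different rise times would produce different pitches at the same instant, which is why a single stored reference pattern may not match both.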

The present invention has been made to solve the problem described above. Accordingly, an object of the present invention is to provide a specific sound recognition apparatus and a specific sound recognition method that make it possible to determine whether an acquired ambient sound matches the area-specific sound of a specific geographical area, even when the area-specific sound data (such as siren sounds) used differ from one geographical area (such as a municipality) to another, and, when it matches, to give notification of that fact inside the vehicle.

To achieve the above object, a specific sound recognition apparatus of the present invention includes: a sound acquisition unit that acquires input ambient sound in units of a fixed time length and outputs it as digital sound data; a current position generation unit that calculates and outputs its own current position based on an input predetermined signal; an area-specific sound storage unit that stores area-specific sounds, each heard only in a predetermined geographical area, in units of at least the fixed time length and in association with the geographical area; a current position recognition specifying unit that outputs information indicating a specific geographical area, which is the geographical area to which the current position input from the current position generation unit belongs; and a sound recognition result notification unit that compares the area-specific sound extracted from the area-specific sound storage unit for the input specific geographical area with the digital sound data input from the sound acquisition unit and, when it determines that the two match, outputs a notification signal indicating that fact.

A specific sound recognition method of the present invention includes: acquiring ambient sound as analog sound data in units of a fixed time length; converting the analog sound data into digital sound data by A/D conversion; receiving, at fixed time intervals, radio waves transmitted by a plurality of GPS satellites; calculating a current position based on the received radio wave information; identifying, based on the current position, a plurality of municipalities in the vicinity of the current position; extracting a plurality of items of emergency vehicle sound data used by the identified municipalities; determining whether the digital sound data matches any of those items of emergency vehicle sound data; and, when a match is determined, giving notification that an emergency vehicle is approaching by outputting sound, message data, or image data.

According to the present invention, even when the area-specific sound (such as the siren sounded by emergency vehicles) differs from one geographical area (such as a municipality) to another, it can be determined whether an acquired ambient sound matches the area-specific sound of a specific geographical area. When it matches, notification of that fact can be given inside the vehicle.

A block diagram showing a first embodiment of the specific sound recognition apparatus of the present invention.
A block diagram showing a second embodiment of the specific sound recognition apparatus of the present invention.
A sequence diagram explaining the operation of the second embodiment of the specific sound recognition apparatus.
A block diagram showing a third embodiment of the specific sound recognition apparatus of the present invention.
A diagram showing an example of a waveform of digital sound data.
A diagram showing an example of power spectrum waveform data for each frequency.
A diagram showing an example of the data format of sonagram pattern data.
A diagram showing a display example of sonagram pattern data.
A sequence diagram explaining the operation of the third embodiment of the specific sound recognition apparatus.
A block diagram showing a fourth embodiment of the specific sound recognition apparatus of the present invention.
A sequence diagram explaining the operation of the third embodiment of the specific sound recognition apparatus.

Next, embodiments of the present invention will be described with reference to the drawings.

[First Embodiment]

FIG. 1 is a block diagram showing a first embodiment of the specific sound recognition apparatus of the present invention.

The specific sound recognition apparatus 100 shown in FIG. 1 includes a sound acquisition unit 2, a current position generation unit 4, and an area-specific sound storage unit 6. It also includes a current position recognition specifying unit 3-2 and a sound recognition result notification unit 3-6.

The sound acquisition unit 2 acquires sound around the vehicle in which the specific sound recognition apparatus 100 is mounted, in units of a fixed time length, and outputs it as digital sound data to the sound recognition result notification unit 3-6.

The current position generation unit 4 calculates the current position of the specific sound recognition apparatus 100 based on radio signals received from GPS satellites and outputs it to the current position recognition specifying unit 3-2.

The area-specific sound storage unit 6 stores, as area-specific sounds, sounds heard only in a specific geographical area such as a municipality, for example the sirens sounded by emergency vehicles. Each area-specific sound is stored in association with its geographical area.

The current position recognition specifying unit 3-2 identifies which of the geographical areas contains the current position input from the current position generation unit 4 and outputs the result to the sound recognition result notification unit 3-6 as specific geographical area information.

The sound recognition result notification unit 3-6 searches the area-specific sound storage unit 6 based on the specific geographical area input from the current position recognition specifying unit 3-2 and extracts the area-specific sound corresponding to that area. It then compares the extracted area-specific sound with the digital sound data input from the sound acquisition unit 2. If the two match, it determines that an emergency vehicle is approaching the vehicle in which the specific sound recognition apparatus 100 is mounted, and outputs, inside that vehicle, a notification signal indicating that an emergency vehicle is approaching.
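The data flow of the first embodiment (position, then area, then stored sound, then comparison, then notification) can be sketched roughly as follows. This is an illustrative outline only: the class name `SpecificSoundRecognizer`, its fields, and the message text are hypothetical, and both the area lookup and the comparison rule are passed in as functions because the text does not fix them at this point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

Samples = Sequence[float]


@dataclass
class SpecificSoundRecognizer:
    # Hypothetical stand-ins for unit 3-2 (area lookup), unit 6 (storage),
    # and the comparison performed inside unit 3-6.
    locate_area: Callable[[float, float], Optional[str]]  # position -> area id
    area_sounds: Dict[str, Samples]                        # area id -> stored sound
    matches: Callable[[Samples, Samples], bool]            # (stored, captured) -> bool

    def process(self, lat: float, lon: float, captured: Samples) -> Optional[str]:
        """Return a notification message when the captured sound matches the
        area-specific sound of the current area; otherwise return None."""
        area = self.locate_area(lat, lon)
        if area is None or area not in self.area_sounds:
            return None
        if self.matches(self.area_sounds[area], captured):
            return f"Emergency vehicle approaching (area: {area})"
        return None
```

Keeping the lookup and the comparison as injected functions mirrors the block diagram, where each unit can be replaced independently.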

As described above, the specific sound recognition apparatus 100 of this embodiment includes the sound acquisition unit 2, which acquires input ambient sound in units of a fixed time length and outputs it as digital sound data.

It also includes the current position generation unit 4, which calculates and outputs its own current position based on an input predetermined signal.

It further includes the area-specific sound storage unit 6, which stores area-specific sounds, each heard only in a predetermined geographical area, in units of at least the fixed time length and in association with the geographical area.

It also includes the current position recognition specifying unit 3-2, which outputs information indicating the specific geographical area, that is, the geographical area to which the current position input from the current position generation unit 4 belongs.

It further includes the sound recognition result notification unit 3-6, which compares the area-specific sound extracted from the area-specific sound storage unit 6 for the input specific geographical area with the digital sound data input from the sound acquisition unit 2. When the sound recognition result notification unit 3-6 determines that the area-specific sound and the digital sound data match, it outputs a notification signal indicating that fact.

Therefore, according to this embodiment, even when the area-specific sound (such as the siren sounded by emergency vehicles) differs from one geographical area (such as a municipality) to another, it can be determined whether an acquired ambient sound matches the area-specific sound of a specific geographical area. When it matches, notification of that fact can be given inside the vehicle.

[Second Embodiment]

FIG. 2 is a block diagram showing a second embodiment of the specific sound recognition apparatus of the present invention.

The specific sound recognition apparatus 100 shown in FIG. 2 includes a microphone 10 and a sound data extraction unit 20, which correspond to the sound acquisition unit 2 in FIG. 1. It also includes a control unit 30.

It further includes a GPS (Global Positioning System) receiving unit 40, which corresponds to the current position generation unit 4 in FIG. 1.

It also includes an emergency vehicle sound data storage unit 60, which is connected to the control unit 30 and corresponds to the area-specific sound storage unit 6 in FIG. 1.

The specific sound recognition apparatus 100 further includes a map data storage unit 50 connected to the control unit 30 and a current position specifying unit 32 included in the control unit 30, which together correspond to the current position recognition specifying unit 3-2 in FIG. 1.

It also includes an emergency vehicle sound data selection unit 34 and a warning sound recognition unit 36, which together correspond to the sound recognition result notification unit 3-6 in FIG. 1. The specific sound recognition apparatus 100 further includes an alarm output unit 70 connected to the control unit 30.

The microphone 10 is installed on the exterior of the vehicle in which the specific sound recognition apparatus 100 is mounted, picks up the sound around the microphone 10, and sends it to the sound data extraction unit 20.

The sound data extraction unit 20 converts the analog sound data sent from the microphone 10 into digital sound data and sends it to the control unit 30.

The GPS receiving unit 40 receives radio waves from GPS satellites, calculates current position data for the specific sound recognition apparatus 100, for example as latitude and longitude, and sends the data to the control unit 30.

The map data storage unit 50 stores, for example, map data covering all of Japan together with latitude and longitude data, and also stores information indicating which Japanese municipality (such as □□ Prefecture, ○○ City, or △△ Town) each latitude and longitude point belongs to.
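A lookup from latitude and longitude to a municipality might look like the sketch below. It assumes, purely for illustration, that each municipality is approximated by a rectangular bounding box; real map data would normally use boundary polygons. The names `AreaBox`, `find_municipality`, and `MAP_DATA`, and all coordinates, are made up.

```python
from typing import List, NamedTuple, Optional


class AreaBox(NamedTuple):
    """Hypothetical map record: a municipality approximated by a rectangle."""
    name: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float


def find_municipality(lat: float, lon: float, boxes: List[AreaBox]) -> Optional[str]:
    """Return the first municipality whose rectangle contains the point."""
    for box in boxes:
        if box.lat_min <= lat < box.lat_max and box.lon_min <= lon < box.lon_max:
            return box.name
    return None


# Made-up rectangles, not real municipal boundaries.
MAP_DATA = [
    AreaBox("City A", 35.0, 35.5, 139.0, 139.5),
    AreaBox("Town B", 35.0, 35.5, 139.5, 140.0),
]
```

Returning `None` for a point outside every rectangle leaves it to the caller to decide what to do when the position cannot be assigned to any stored municipality.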

The emergency vehicle sound data storage unit 60 stores, for each municipality, emergency vehicle sound data such as the siren sounds of that municipality's emergency vehicles. Emergency vehicles are classified by type, for example fire engines, ambulances, and patrol cars, and the emergency vehicle sound data storage unit 60 stores emergency vehicle sound data for each type. In this embodiment, the emergency vehicle sound data is assumed to be stored in the form of a sonagram (voiceprint), which represents sound waveform information in three dimensions: time, frequency, and intensity. Sonagrams are described in detail later in the third embodiment.

The current position specifying unit 32 of the control unit 30 receives the current position data of the specific sound recognition apparatus 100 from the GPS receiving unit 40. It then refers to the map data storage unit 50 to identify the municipality in which the received current position lies. The current position specifying unit 32 also identifies two or three other municipalities near the identified municipality. This is because, when the specific sound recognition apparatus 100 is located near a municipal boundary, it may pick up sound not only from emergency vehicles of the municipality indicated by the current position data but also from emergency vehicles of neighboring municipalities. The current position specifying unit 32 sends information on the identified municipalities to the emergency vehicle sound data selection unit 34.
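The text does not say how the two or three nearby municipalities are chosen. One possible approach, sketched here as an assumption, is to rank the other municipalities by great-circle distance from the current position to a representative point (centroid) of each and keep the closest few. The function names and the centroid table are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points (haversine formula)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def candidate_municipalities(lat: float, lon: float, current: str,
                             centroids: Dict[str, Tuple[float, float]],
                             extra: int = 3) -> List[str]:
    """Return the current municipality followed by up to `extra` others,
    ordered by distance from the current position to their centroids."""
    others = sorted(
        (name for name in centroids if name != current),
        key=lambda name: haversine_km(lat, lon, *centroids[name]),
    )
    return [current] + others[:extra]
```

Ranking from the current position rather than from the current municipality's own centroid favors the neighbors on the side of the boundary the vehicle is actually close to.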

Based on the information on the plurality of municipalities received from the current position specifying unit 32, the emergency vehicle sound data selection unit 34 of the control unit 30 extracts the emergency vehicle sound data for those municipalities' emergency vehicles from the emergency vehicle sound data storage unit 60 and sends it to the warning sound recognition unit 36.
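Storage keyed by municipality and vehicle type, together with extraction for several municipalities at once, could be organized along the following lines. The class `EmergencySoundStore`, its method names, and the nested-list representation of a sonagram are illustrative assumptions, not the layout used in the apparatus.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed representation: rows are time frames, columns are frequency bins,
# and each value is an intensity.
Sonagram = List[List[float]]


class EmergencySoundStore:
    """Hypothetical stand-in for the emergency vehicle sound data storage unit."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], Sonagram] = {}

    def register(self, municipality: str, vehicle_type: str, sonagram: Sonagram) -> None:
        """Store one reference pattern per (municipality, vehicle type) pair."""
        self._data[(municipality, vehicle_type)] = sonagram

    def extract(self, municipalities: Iterable[str]) -> Dict[Tuple[str, str], Sonagram]:
        """Return every stored pattern belonging to the given municipalities."""
        wanted = set(municipalities)
        return {key: value for key, value in self._data.items() if key[0] in wanted}
```

Restricting the candidates to a handful of municipalities keeps the later comparison small, since only patterns that could plausibly be heard at the current position are checked.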

The warning sound recognition unit 36 of the control unit 30 first converts the digital sound data received from the sound data extraction unit 20 into three-dimensional data in sonagram form (hereinafter referred to as received sonagram data). It then compares the received sonagram data with the emergency vehicle sound data (also in sonagram form) for the plurality of municipalities received from the emergency vehicle sound data selection unit 34. If any of that emergency vehicle sound data matches the received sonagram data, the warning sound recognition unit 36 determines that the received sonagram data is sound being emitted by an emergency vehicle, and notifies the alarm output unit 70 that an emergency vehicle is approaching.
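The conversion of digital sound data into time, frequency, and intensity data can be illustrated with a short-time discrete Fourier transform. The sketch below is one common way to do this and is not taken from the patent: the Hann window, the frame length, the hop size, and the function name `sonagram` are all assumptions, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math
from typing import List, Sequence


def sonagram(samples: Sequence[float], frame_len: int = 64, hop: int = 32) -> List[List[float]]:
    """Return a list of frames; each frame holds the power in frequency bins
    0 .. frame_len // 2. Bin k corresponds to k * sample_rate / frame_len Hz."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames: List[List[float]] = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        powers = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of one windowed frame at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            powers.append(abs(acc) ** 2)
        frames.append(powers)
    return frames
```

With an assumed sampling rate of 8000 Hz and `frame_len=64`, each bin is 125 Hz wide, so a steady 1000 Hz tone concentrates its power in bin 8 of every frame. A practical implementation would use an FFT library instead of the naive transform.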

The alarm output unit 70 gives notification that an emergency vehicle is approaching the specific sound recognition apparatus 100 by outputting sound, message data, or image data.

Next, the operation of this embodiment will be described with reference to FIG. 3.

FIG. 3 is a sequence diagram explaining the operation of this embodiment.

In FIG. 3, the microphone 10 of the specific sound recognition apparatus 100 acquires the sound around the microphone 10 in units of a fixed time length and sends it to the sound data extraction unit 20 (step S1 in FIG. 3). Step S1 is then repeated to acquire the sound for the next time unit.

The sound data extraction unit 20 converts the analog sound data sent from the microphone 10 into digital sound data (A/D conversion: analogue-to-digital conversion) (step S2). It then sends the converted digital sound data to the control unit 30.
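Steps S1 and S2 can be pictured as quantizing the signal and cutting it into fixed-length units. In the sketch below, the 16-bit sample depth, the normalized input range of -1.0 to 1.0, and the helper names are assumptions made for illustration; the text specifies only that sound is handled in units of a fixed time length and converted to digital form.

```python
from typing import List, Sequence


def quantize_16bit(analog: Sequence[float]) -> List[int]:
    """Map values nominally in [-1.0, 1.0] to integers and clip the result
    to the signed 16-bit range."""
    return [max(-32768, min(32767, round(x * 32767))) for x in analog]


def split_units(samples: Sequence[int], unit_len: int) -> List[List[int]]:
    """Cut the sample stream into consecutive units of unit_len samples,
    dropping any incomplete unit at the end."""
    return [list(samples[i:i + unit_len])
            for i in range(0, len(samples) - unit_len + 1, unit_len)]
```

For example, at an assumed sampling rate of 8000 Hz, a unit length of 8000 samples would correspond to one second of sound per unit.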

Meanwhile, the GPS receiving unit 40 of the specific sound recognition apparatus 100 receives, at fixed time intervals, radio waves transmitted by a plurality of GPS satellites and calculates the current position of the specific sound recognition apparatus 100 based on the received radio wave information (step S3). It then sends the calculated current position information to the control unit 30. Step S3 is then repeated to receive radio waves from the GPS satellites at the next time interval.

On receiving the current position from the GPS receiving unit 40, the control unit 30 refers to the map data storage unit 50 shown in FIG. 2 to identify the municipality in which the received current position lies. The control unit 30 also identifies two or three other municipalities near the identified municipality (step S4).

Next, the control unit 30 extracts the emergency vehicle sound data used by the identified municipalities from the emergency vehicle sound data storage unit 60 shown in FIG. 2 (step S5). The extracted data comprises one item for each emergency vehicle type in each of the municipalities, and each item is three-dimensional data in sonagram form.

Meanwhile, the control unit 30 first converts the digital sound data received from the sound data extraction unit 20 into three-dimensional data in sonagram form (the received sonagram data).

Next, the control unit 30 compares the received sonagram data with the plurality of items of emergency vehicle sound data (also represented as sonagrams) for the plurality of municipalities extracted in step S5 (step S6). The control unit 30 then determines whether the received sonagram data matches any of those items (step S7). The match is preferably determined by pattern matching between the sonagram data.
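The text calls for pattern matching between sonagrams without naming a method. As one possible illustration, the sketch below slides the received pattern along each stored pattern (which may be longer than the received unit) and accepts a match when the normalized correlation reaches a threshold. The similarity measure, the threshold of 0.9, and the function names are assumptions.

```python
import math
from typing import Dict, Hashable, List, Optional

Sonagram = List[List[float]]  # rows are time frames, columns are frequency bins


def _similarity(a: Sonagram, b: Sonagram) -> float:
    """Normalized correlation (cosine similarity) of two equally sized patterns."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(x * x for x in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matches(stored: Sonagram, received: Sonagram, threshold: float = 0.9) -> bool:
    """True if the received pattern resembles some time window of the stored one."""
    n = len(received)
    if n == 0 or n > len(stored):
        return False
    return any(_similarity(stored[i:i + n], received) >= threshold
               for i in range(len(stored) - n + 1))


def find_match(references: Dict[Hashable, Sonagram], received: Sonagram,
               threshold: float = 0.9) -> Optional[Hashable]:
    """Return the key of the first matching reference pattern, or None."""
    for key, stored in references.items():
        if matches(stored, received, threshold):
            return key
    return None
```

Sliding over the stored pattern matters because the captured unit can begin at any point in the siren cycle. A real system would also need to tolerate noise and Doppler shift, which this sketch ignores.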

If the received sonagram data is determined to match any of the items of emergency vehicle sound data extracted in step S5 ("match" in step S7), the control unit 30 determines that an emergency vehicle is approaching and notifies the alarm output unit 70 of the specific sound recognition apparatus 100 accordingly. The alarm output unit 70 gives notification that an emergency vehicle is approaching by outputting sound, message data, or image data (step S8). When the user of the specific sound recognition apparatus 100 notices the notification and performs a stop operation (for example, pressing a notification stop button, not shown), the alarm output unit 70 stops the alarm and the apparatus returns to its initial state.

ステップＳ７で、受信ソナグラムデータが、ステップＳ５で抽出した複数の自治体の複数の緊急車両音声データの何れとも一致しない場合（ステップＳ７で「不一致」）、制御部３０は、何もせず、もとの初期状態に戻る。 In step S7, when the received sonagram data does not match any of the plurality of emergency vehicle audio data of the plurality of local governments extracted in step S5 (“mismatch” in step S7), the control unit 30 does not do anything, Return to the initial state.
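The match and mismatch branches of steps S7 and S8 amount to a two-state loop. A minimal sketch follows, assuming a hypothetical `AlarmOutput` class whose method names are invented for this illustration.

```python
class AlarmOutput:
    """Illustrative stand-in for the alarm output unit 70."""

    def __init__(self):
        self.alerting = False  # initial state: no alert shown

    def on_recognition_result(self, matched):
        # "match" in step S7 starts the alert (step S8); "mismatch" changes nothing.
        if matched:
            self.alerting = True

    def on_stop_button(self):
        # The user's stop operation ends the alert and restores the initial state.
        self.alerting = False


alarm = AlarmOutput()
alarm.on_recognition_result(False)
print(alarm.alerting)  # False
alarm.on_recognition_result(True)
print(alarm.alerting)  # True
alarm.on_stop_button()
print(alarm.alerting)  # False
```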

The operation of the second embodiment of the present invention has been described above.

As described above, the specific speech recognition apparatus 100 of this embodiment includes a microphone 10 that acquires surrounding sound in units of a fixed time length, and an audio data extraction unit 20 that A/D-converts the analog audio data acquired by the microphone 10 and sends it to the control unit 30 as digital audio data.

It also includes a GPS receiving unit 40 that receives radio waves transmitted by a plurality of GPS satellites at fixed time intervals, calculates the current position from the received radio wave information, and sends it to the control unit 30.

It further includes an emergency vehicle audio data storage unit 60 that is connected to the control unit 30 and stores, for each municipality, emergency vehicle audio data such as the siren sounded by that municipality's emergency vehicles.

It also includes a map data storage unit 50 that is connected to the control unit 30 and stores map data.

The control unit 30 includes a current position specifying unit 32 that, based on the current position received from the GPS receiving unit 40, identifies a plurality of municipalities near the current position by referring to the map data storage unit 50.

The control unit 30 also includes an emergency vehicle audio data selection unit 34 that extracts, from the emergency vehicle audio data storage unit 60, the emergency vehicle audio data used by the municipalities identified by the current position specifying unit 32.

The control unit 30 further includes an alarm sound recognition unit 36 that determines whether the digital audio data received from the audio data extraction unit 20 matches any of the emergency vehicle audio data of the plural municipalities extracted by the emergency vehicle audio data selection unit 34. When it determines that there is a match, the alarm sound recognition unit 36 notifies the alarm output unit 70, which is connected to the control unit 30, that an emergency vehicle is approaching.

The alarm output unit 70 then reports that an emergency vehicle is approaching by outputting sound, message data, or image data.

Therefore, according to this embodiment, even when the emergency vehicle audio data (siren sounds and the like) differs from municipality to municipality, it is possible to determine whether audio data from outside the vehicle matches the audio data sounded by an emergency vehicle. When it matches, the occupants of the vehicle can be notified that an emergency vehicle is approaching.
[Third Embodiment]
Next, a third embodiment of the present invention, which further embodies the second embodiment shown in FIG. 2, will be described.

FIG. 4 is a block diagram showing a third embodiment of the specific speech recognition apparatus of the present invention, and describes the apparatus of FIG. 2 in more detail. Components in FIG. 4 that correspond to those in FIG. 2 therefore carry the same reference numerals or symbols, and their description is omitted where possible.

Like the apparatus of FIG. 2, the specific speech recognition apparatus 100 shown in FIG. 4 includes a microphone 10, an audio data extraction unit 20, a control unit 30, and a GPS receiving unit 40. It also includes a map data storage unit 50, an emergency vehicle audio data storage unit 60, and an alarm output unit 70, each connected to the control unit 30.

The microphone 10 acquires the sound around it and sends it to the audio data extraction unit 20. The microphone 10 is assumed to be installed on the exterior of the vehicle carrying the specific speech recognition apparatus 100, at a location where outside sound is easily picked up.

The audio data extraction unit 20 includes an amplifier 22, a band-pass filter 24, and an A/D (analog-to-digital) converter 26.

The amplifier 22 amplifies the analog audio data sent from the microphone 10 and sends it to the band-pass filter 24.

A band-pass filter is, in general, a filter circuit that passes only frequencies in a required range and blocks all others. The band-pass filter 24 of this embodiment passes only audio data within the frequency range of the sirens sounded by emergency vehicles (for example, 250 to 900 Hz) and sends that audio data to the A/D converter 26.
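The band-pass filter 24 is an analog circuit placed ahead of the A/D converter. For readers who want to experiment in software, the sketch below is a digital stand-in rather than the circuit of the source: a second-order band-pass built from the widely published Audio EQ Cookbook biquad formulas. The 8 kHz sample rate and the choice of centre frequency and Q are assumptions.

```python
import math


def bandpass_coefficients(low_hz, high_hz, sample_rate):
    """Biquad band-pass (0 dB peak gain) centred on the geometric mean of the band."""
    centre = math.sqrt(low_hz * high_hz)
    q = centre / (high_hz - low_hz)
    w0 = 2.0 * math.pi * centre / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def bandpass(samples, low_hz=250.0, high_hz=900.0, sample_rate=8000.0):
    """Filter a list of samples with the direct-form I difference equation."""
    (b0, b1, b2), (a1, a2) = bandpass_coefficients(low_hz, high_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, sample_rate=8000.0, count=8000):
    """One second of a unit-amplitude sine wave at the assumed sample rate."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


in_band = rms(bandpass(tone(500.0)))       # inside 250 to 900 Hz
out_of_band = rms(bandpass(tone(3000.0)))  # well above the band
print(in_band > out_of_band)
```

A single second-order section rolls off gently, so a steeper filter would need several sections in cascade.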

The A/D converter 26 converts the analog audio data received from the band-pass filter 24 into digital audio data and sends it to the control unit 30.

The control unit 30 includes a CPU (Central Processing Unit) 37 that performs the arithmetic processing of the specific speech recognition apparatus 100. It also includes a ROM (Read Only Memory) 38 that stores the programs that run on the CPU 37, emergency vehicle approach warning image data, emergency vehicle approach warning message data, and the like. The programs stored in the ROM 38 include those that implement the functions of the current position specifying unit 32, the emergency vehicle audio data selection unit 34, and the alarm sound recognition unit 36 described in the second embodiment.

The control unit 30 further includes a RAM (Random Access Memory) 39 that stores various data as the work area of the CPU 37.

The GPS receiving unit 40 includes a GPS antenna 42 that receives radio waves from GPS satellites, and a GPS receiver 44 that calculates the current position of the specific speech recognition apparatus 100 from the received radio waves, for example as latitude and longitude data. The GPS receiver 44 sends the calculated current position data to the control unit 30.

The map data storage unit 50 includes map data 52 that stores, for example, map data for the whole of Japan together with latitude and longitude data. The map data 52 also records which Japanese municipality (□□ Prefecture, ○○ City, △△ Town, and so on) each latitude/longitude point belongs to.

The emergency vehicle audio data storage unit 60 includes emergency vehicle audio data 62, which holds, for each municipality, emergency vehicle audio data such as the siren sounded by that municipality's emergency vehicles. The emergency vehicle audio data 62 stores separate audio data for each type of emergency vehicle, for example fire engines, ambulances, and patrol cars. In this embodiment, the emergency vehicle audio data is assumed to be stored in the form of a sonagram, which expresses the audio waveform information in the three dimensions of time, frequency, and intensity.

The sonagram used in this embodiment is now described with reference to FIGS. 5, 6, 7, and 8.

FIG. 5 shows an example of the waveform of the digital audio data that the A/D converter 26 of the audio data extraction unit 20 sends to the control unit 30. The CPU 37 of the control unit 30 stores this digital audio data in the RAM 39. The digital audio data shown in FIG. 5 is divided into fixed time intervals, denoted "T1" to "Tn". The CPU 37 then converts this digital audio data into sonagram-form data by the following procedure.

That is, the CPU 37 Fourier-transforms the digital audio data stored in the RAM 39 (FIG. 5) for each fixed time interval "T1" to "Tn" to obtain the power spectrum as a function of frequency. FIG. 6 shows an example of such power spectrum data for the interval "T1". In this embodiment the frequency range of the data shown in FIG. 6 is, for example, 250 to 900 Hz, because the signal has been filtered by the band-pass filter 24. Using an FFT (Fast Fourier Transform) for the Fourier transform is convenient because it allows high-speed processing.
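The framing and per-interval transform can be sketched as follows. To stay free of external libraries, this illustration evaluates the discrete Fourier transform directly instead of using an FFT; the values are the same, only slower to compute. The sample rate, frame length, and function names are assumptions, since the source states none of them.

```python
import cmath
import math


def split_frames(samples, frame_len):
    """Cut the sample stream into consecutive fixed-length frames T1..Tn."""
    return [
        samples[i : i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def band_power_spectrum(frame, sample_rate, low_hz=250.0, high_hz=900.0):
    """Return [(frequency_hz, power), ...] for DFT bins inside the pass band."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            continue
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        spectrum.append((freq, abs(coeff) ** 2))
    return spectrum


SAMPLE_RATE = 8000  # assumed; the source does not state a sampling rate
FRAME_LEN = 160     # 20 ms frames, giving 50 Hz per bin (also assumed)

# A pure 500 Hz test tone stands in for the digitised microphone signal.
signal = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(800)]
frames = split_frames(signal, FRAME_LEN)
spectrum_t1 = band_power_spectrum(frames[0], SAMPLE_RATE)
peak_freq, _ = max(spectrum_t1, key=lambda fp: fp[1])
print(len(frames), peak_freq)  # 5 500.0
```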

Next, from the per-frequency power spectrum data (FIG. 6), the frequencies whose power exceeds the threshold 200 shown in the middle of FIG. 6 are found, and the resulting frequency-versus-time data is stored in the RAM 39 in the form shown in FIG. 7. As noted above, FIG. 6 shows the power spectrum for the interval "T1". The threshold 200 in FIG. 6 is the value used to classify the power, that is, the intensity at each frequency, into two values: above the threshold 200 (strong) or below it (weak). If frequencies above the threshold 200 are represented by "1" and those below it by "0", the result can be expressed as data for the frequencies "f1" to "fm" (columns 641 to 64m in FIG. 7), as in the bottom row of FIG. 7 (row 601, the "T1" row).

For the intervals "T2" to "Tn", the per-frequency power spectrum data is obtained in the same way as in FIG. 6. Representing each value as "1" or "0" according to whether it lies above or below the threshold, as before, gives the data for the frequencies "f1" to "fm" (columns 641 to 64m in FIG. 7) shown in rows 602 (the "T2" row) to 60n (the "Tn" row) of FIG. 7.

In other words, FIG. 7 expresses the digital audio data of FIG. 5 in terms of time ("T1" to "Tn"), frequency ("f1" to "fm"), and intensity ("1" or "0"). FIG. 7 thus shows an example of the data format of sonagram pattern data.
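The thresholding step that produces the time-by-frequency grid can be sketched in a few lines. The power values and the threshold below are made up for illustration, and `binarize` is a hypothetical name.

```python
def binarize(power_rows, threshold):
    """Map each power value to 1 if it exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in power_rows]


# Rows are time slots T1..T3, columns are frequency bins f1..f4 (made-up powers).
power_rows = [
    [0.2, 7.5, 0.9, 0.1],
    [0.3, 1.1, 8.2, 0.4],
    [0.1, 6.9, 1.0, 0.2],
]
pattern = binarize(power_rows, threshold=5.0)
for row in pattern:
    print(row)
# [0, 1, 0, 0]
# [0, 0, 1, 0]
# [0, 1, 0, 0]
```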

The threshold 200 shown in FIG. 6 is configured so that its value can be changed. This allows the value representing the intensity at each frequency to be adjusted to the surrounding conditions (such as the noise level). Although the description above sets a single threshold 200 and classifies intensity into two values, a plurality of thresholds may instead be settable so that intensity can be classified into more levels, such as four or eight. With this configuration, the intensity shown in the sonagram pattern data (first example) 600 of FIG. 7 can take four or eight values rather than only two, yielding more detailed sonagram data.
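As an illustration of the multi-threshold variant, the sketch below maps a power value to the number of thresholds it exceeds, so three thresholds yield four levels. The threshold values and function names are assumptions.

```python
from bisect import bisect_left


def quantize(power, thresholds):
    """Return how many of the ascending thresholds the power strictly exceeds."""
    return bisect_left(thresholds, power)


def quantize_rows(power_rows, thresholds):
    """Apply quantize() to every cell of a time-by-frequency grid."""
    return [[quantize(p, thresholds) for p in row] for row in power_rows]


thresholds = [2.0, 5.0, 8.0]  # three thresholds give four levels, 0 to 3
print(quantize_rows([[0.5, 2.0, 6.0, 9.5]], thresholds))  # [[0, 0, 2, 3]]
```

Raising or lowering the entries of `thresholds` corresponds to adjusting the thresholds to the ambient noise level.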

FIG. 8 shows an example of how sonagram pattern data can be displayed, and is referred to as sonagram pattern data (second example) 700. The "1" cells of FIG. 7 are drawn as hatched areas in FIG. 8, and the "0" cells as blank areas. Either FIG. 7 or FIG. 8 may be used as the sonagram pattern data. However, which of the two is more suitable may depend on the pattern matching method described later, so the apparatus is preferably configured so that pattern matching can be performed with either form.

Returning to FIG. 4, the alarm output unit 70 of the specific speech recognition apparatus 100 includes a display 72 and a speaker 74. When the CPU 37 of the control unit 30 reports that an emergency vehicle is approaching the specific speech recognition apparatus 100, the alarm output unit 70 outputs message data or image data to that effect on the display 72. It also announces by sound from the speaker 74 that an emergency vehicle is approaching.

Next, the operation of the third embodiment will be described with reference to FIG. 9.

The third embodiment of the present invention redraws the functional configuration of the specific speech recognition apparatus 100 shown in FIG. 2 as the hardware configuration shown in FIG. 4. The operation shown in FIG. 9 is therefore almost identical to that shown in FIG. 3; operations in FIG. 9 that appear in FIG. 3 carry the same reference numerals or symbols, and their description is omitted where possible.

FIG. 9 is a sequence diagram illustrating the operation of the third embodiment.

In FIG. 9, the microphone 10 of the specific speech recognition apparatus 100 acquires the sound around it in units of a fixed time length and sends it to the audio data extraction unit 20 (step S1 in FIG. 9). It then repeats step S1 to acquire the sound for the next time unit.

The audio data extraction unit 20 amplifies the analog audio data sent from the microphone 10 with the amplifier 22 and sends it to the band-pass filter 24. The band-pass filter 24 passes only audio data within the frequency range of the sirens sounded by emergency vehicles (for example, 250 to 900 Hz) and sends it to the A/D converter 26. The A/D converter 26 converts the analog audio data received from the band-pass filter 24 into digital audio data (step S2-1) and sends the converted digital audio data to the control unit 30.

Meanwhile, the GPS receiving unit 40 of the specific speech recognition apparatus 100 receives, through the GPS antenna 42, radio waves transmitted by a plurality of GPS satellites at fixed time intervals. Based on the radio wave information received by the GPS antenna 42, the GPS receiver 44 calculates the current position of the specific speech recognition apparatus 100 (step S3-1) and sends the calculated current position information to the control unit 30. Step S3-1 is then repeated to receive the radio waves from the GPS satellites at the next time interval.

On receiving the current position from the GPS receiver 44 of the GPS receiving unit 40, the CPU 37 of the control unit 30 executes the program of the current position specifying unit (the current position specifying unit 32 in FIG. 2). The CPU 37 identifies the municipality in which the received current position lies by referring to the map data 52 of the map data storage unit 50 shown in FIG. 4. The CPU 37 also identifies two or three other municipalities in the vicinity of the identified municipality (step S4-1).
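The source says only that the map data records which municipality each point belongs to; it does not say how nearby municipalities are chosen. The sketch below is one simplified possibility, assuming each municipality is represented by a single centre point and ranking municipalities by great-circle distance from the current position. The place names and coordinates are hypothetical, and a real system would more likely use boundary polygons.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def nearby_municipalities(lat, lon, centres, extra=3):
    """Return the closest municipality followed by up to `extra` neighbours."""
    ranked = sorted(centres, key=lambda name: haversine_km(lat, lon, *centres[name]))
    return ranked[: 1 + extra]


# Hypothetical municipalities and coordinates, for illustration only.
centres = {
    "City A": (35.00, 139.00),
    "City B": (35.05, 139.02),
    "Town C": (35.20, 139.30),
    "Town D": (34.90, 138.95),
    "City E": (36.50, 140.50),
}
print(nearby_municipalities(35.01, 139.00, centres, extra=2))
# ['City A', 'City B', 'Town D']
```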

Next, the CPU 37 of the control unit 30 executes the program of the emergency vehicle audio data selection unit (the emergency vehicle audio data selection unit 34 in FIG. 2). It extracts the emergency vehicle audio data used by the municipalities identified in step S4-1 from the emergency vehicle audio data 62 of the emergency vehicle audio data storage unit 60 shown in FIG. 4 (step S5-1). The extracted data comprises one set of emergency vehicle audio data for each emergency vehicle type in each of the identified municipalities, and each set is three-dimensional data in the form of a sonagram.
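A minimal sketch of the selection in step S5-1, assuming the stored data is organised as a nested mapping from municipality to vehicle type to sonagram. The layout, names, and tiny 2x2 sonagrams are hypothetical.

```python
def select_templates(store, municipalities):
    """Collect (municipality, vehicle_type, sonagram) for the given municipalities."""
    selected = []
    for name in municipalities:
        for vehicle_type, sonagram in store.get(name, {}).items():
            selected.append((name, vehicle_type, sonagram))
    return selected


# Hypothetical store: municipality -> vehicle type -> binary sonagram.
store = {
    "City A": {"fire": [[0, 1], [1, 0]], "ambulance": [[1, 0], [0, 1]]},
    "City B": {"fire": [[1, 1], [0, 0]], "ambulance": [[0, 0], [1, 1]]},
    "City E": {"fire": [[1, 0], [1, 0]], "ambulance": [[0, 1], [0, 1]]},
}
templates = select_templates(store, ["City A", "City B"])
print(len(templates), "of", sum(len(v) for v in store.values()))  # 4 of 6
```

Only the selected templates are passed to the comparison step, so the number of pattern-matching operations scales with the handful of nearby municipalities rather than with the whole store.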

Meanwhile, on receiving digital audio data from the A/D converter 26 of the audio data extraction unit 20, the CPU 37 of the control unit 30 executes the program of the alarm sound recognition unit (the alarm sound recognition unit 36 in FIG. 2). The CPU 37 first converts the digital audio data into three-dimensional data in the form of a sonagram (referred to as received sonagram data) and stores it in the RAM 39. Next, the CPU 37 compares the received sonagram data against the emergency vehicle audio data of the plural municipalities extracted in step S5-1 (step S6-1), and determines whether the received sonagram data matches any of them (step S7-1). The match determination is performed by pattern matching between the data represented as sonagrams.

If the determination in step S7-1 of FIG. 9 is that the received sonagram data matches any of the emergency vehicle audio data of the plural municipalities extracted in step S5-1 ("match" in step S7-1), the CPU 37 of the control unit 30 judges that an emergency vehicle is approaching. It reads the emergency vehicle approach warning image data, emergency vehicle approach warning message data, and the like from the ROM 38 and passes them to the alarm output unit 70 of the specific speech recognition apparatus 100. The alarm output unit 70 reports that an emergency vehicle is approaching by showing the image data or message data on the display 72, or by outputting the message data as sound from the speaker 74 (step S8-1). When the user of the specific speech recognition apparatus 100 notices the alert and performs a stop operation (for example, pressing an alert stop button, not shown), the alarm output unit 70 stops the alert and the apparatus returns to its initial state.

If, in step S7-1, the received sonagram data matches none of the emergency vehicle audio data of the plural municipalities extracted in step S5-1 ("mismatch" in step S7-1), the CPU 37 of the control unit 30 takes no action and returns to the initial state.

The operation of the third embodiment of the present invention has been described above.

As stated earlier, the third embodiment of the present invention details the functional configuration of the specific speech recognition apparatus of the second embodiment shown in FIG. 2 as the hardware configuration shown in FIG. 4. The overall functional description of the third embodiment is therefore omitted below, and only the points specific to the third embodiment are described.

As described above, the audio data extraction unit 20 of the specific speech recognition apparatus 100 of this embodiment includes the amplifier 22, the band-pass filter 24, and the A/D converter 26. The band-pass filter 24 passes only audio data within the frequency range of the sirens sounded by emergency vehicles.

The map data storage unit 50 connected to the control unit 30 of the specific speech recognition apparatus 100 includes the map data 52, which records, together with latitude and longitude data, which municipality each latitude/longitude point belongs to.

Further, the emergency vehicle audio data storage unit 60 connected to the control unit 30 of the specific speech recognition apparatus 100 includes the emergency vehicle audio data 62, which stores, for each municipality, the audio data sounded by that municipality's emergency vehicles in the form of sonagrams.

The CPU 37 included in the control unit 30 executes the current position specifying unit program, the emergency vehicle audio data selection unit program, the alarm sound recognition unit program, and other programs stored in the ROM 38, and performs the operations described below.

That is, on receiving the current position from the GPS receiver 44 of the GPS receiving unit 40, the CPU 37 identifies the municipality in which the received current position lies by referring to the map data 52 of the map data storage unit 50. The CPU 37 also identifies two or three other municipalities in the vicinity of the identified municipality.

Next, the CPU 37 extracts the emergency vehicle audio data used by the identified municipalities from the emergency vehicle audio data 62 of the emergency vehicle audio data storage unit 60. The extracted data comprises one set of emergency vehicle audio data for each emergency vehicle type in each of the identified municipalities, and each set is three-dimensional data in the form of a sonagram.

Meanwhile, on receiving digital audio data from the A/D converter 26 of the audio data extraction unit 20, the CPU 37 first converts the digital audio data into three-dimensional data in the form of a sonagram (referred to as received sonagram data). Next, the CPU 37 determines whether the received sonagram data matches any of the emergency vehicle audio data of the plural municipalities near the current position (these are also represented as sonagrams). The match determination is performed by pattern matching between the data represented as sonagrams.

When the CPU 37 determines that the received sonagram data matches any of the emergency vehicle audio data of the plural municipalities near the current position, it notifies the alarm output unit 70 connected to the control unit 30 that an emergency vehicle is approaching.

On receiving this notification from the CPU 37, the alarm output unit 70 reports that an emergency vehicle is approaching by outputting image data, message data, or sound through the display 72 or the speaker 74.

Therefore, according to this embodiment, even when the emergency vehicle audio data (siren sounds and the like) differs from municipality to municipality, it is possible to determine whether audio data from outside the vehicle matches the audio data sounded by an emergency vehicle. When it matches, the occupants of the vehicle can be notified that an emergency vehicle is approaching.

In addition, the band-pass filter 24 makes it possible to extract only audio data in the frequency range sounded by emergency vehicles, so the received audio data can be converted into a sonagram efficiently.

Furthermore, although the emergency vehicle audio data 62 stores a large amount of audio data sounded by the emergency vehicles of all municipalities, only the emergency vehicle audio data of the few municipalities near the current position of the vehicle carrying the specific speech recognition apparatus 100 is extracted. The number of pattern matching operations on sonagram-form data can therefore be greatly reduced.
[Fourth Embodiment]
Next, a fourth embodiment of the present invention, in which some of the components of the third embodiment shown in FIG. 4 are replaced by a car navigation system, will be described.

図１０は、本発明の特定音声認識装置の第４の実施形態を示すブロック図である。なお、図１０において図４の構成要素に対応するものは同一の参照数字または符号を付し、その説明を極力省略するものとする。 FIG. 10 is a block diagram showing a fourth embodiment of the specific speech recognition apparatus of the present invention. 10 that correspond to the components in FIG. 4 are denoted by the same reference numerals or symbols, and the description thereof is omitted as much as possible.

図１０に示す特定音声認識装置１００−１は、マイク１０と、音声データ抽出部２０と、制御部３０と、制御部３０に接続された緊急車両音声データ記憶部６０を含んでいる。 A specific voice recognition device 100-1 shown in FIG. 10 includes a microphone 10, a voice data extraction unit 20, a control unit 30, and an emergency vehicle voice data storage unit 60 connected to the control unit 30.

マイク１０は、図４と同様であり、マイク１０周辺の音声を取得し、音声データ抽出部２０に送出する。マイク１０は、特定音声認識装置１００−１を搭載する車両の外部で、かつ、外部の音声データを取得しやすい場所に設置されているものとする。 The microphone 10 is the same as in FIG. 4, acquires sound around the microphone 10, and sends it to the sound data extraction unit 20. The microphone 10 is assumed to be installed outside the vehicle on which the specific speech recognition device 100-1 is mounted and at a place where external speech data can be easily obtained.

音声データ抽出部２０は、図４と同様であり、アンプ２２と、バンドパスフィルタ２４と、Ａ／Ｄコンバータ２６とを含んでいる。そして、各構成要素の機能も図４で示したと同様であるため、これ以上の説明は省略する。 The audio data extraction unit 20 is the same as that in FIG. 4, and includes an amplifier 22, a band pass filter 24, and an A / D converter 26. And since the function of each component is the same as that shown in FIG. 4, further explanation is omitted.

制御部３０は、特定音声認識装置１００−１の演算処理を行うＣＰＵ３７−１を含んでいる。また、制御部３０は、ＣＰＵ３７−１を動作させるプログラム等を記憶するＲＯＭ３８−１を含んでいる。ＲＯＭ３８−１に記憶されＣＰＵ３７−１を動作させるプログラムとしては、第２の実施形態で説明した緊急車両音声データ選択部３４、警報音認識部３６の機能を実行するプログラムも含まれている。 The control unit 30 includes a CPU 37-1 that performs arithmetic processing of the specific speech recognition apparatus 100-1. The control unit 30 includes a ROM 38-1 for storing a program for operating the CPU 37-1. The programs that are stored in the ROM 38-1 and operate the CPU 37-1 include programs that execute the functions of the emergency vehicle voice data selection unit 34 and the alarm sound recognition unit 36 described in the second embodiment.

さらに、制御部３０は、ＣＰＵ３７−１のワークエリアとして各種のデータを記憶するＲＡＭ３９−１を含んでいる。 Furthermore, the control unit 30 includes a RAM 39-1 that stores various data as a work area of the CPU 37-1.

また、制御部３０に接続される緊急車両音声データ記憶部６０は、図４と同様であり、自治体毎に、当該自治体の緊急車両が吹鳴するサイレン音などの緊急車両音声データを記憶する緊急車両音声データ６２を含んでいる。そして、緊急車両音声データ６２は、図４と同様に、緊急車両の音声データをソナグラムの形で記憶している。 The emergency vehicle audio data storage unit 60 connected to the control unit 30 is the same as in FIG. 4 and includes emergency vehicle audio data 62, which stores, for each local government, emergency vehicle audio data such as the siren sound emitted by that local government's emergency vehicles. As in FIG. 4, the emergency vehicle audio data 62 stores the audio data of emergency vehicles in the form of sonagrams.

また、図１０に示す特定音声認識装置１００−１は、カーナビゲーションシステム８０を含んでいる。 The specific speech recognition apparatus 100-1 shown in FIG. 10 also includes a car navigation system 80.

カーナビゲーションシステム８０は、ＧＰＳ受信部４０−１と、カーナビゲーションシステム８０の演算処理を行うＣＰＵ８２を含み、ＣＰＵ８２に接続される地図データ記憶部５０−１を含んでいる。 The car navigation system 80 includes a GPS receiving unit 40-1, a CPU 82 that performs the arithmetic processing of the car navigation system 80, and a map data storage unit 50-1 connected to the CPU 82.

ＧＰＳ受信部４０−１は、ＧＰＳ衛星からの電波を受信するＧＰＳアンテナ４２−１を含んでいる。また、ＧＰＳアンテナ４２−１から受信した電波に基づいて、特定音声認識装置１００−１の現在位置データを、例えば、緯度、経度のデータとして算出するＧＰＳレシーバ４４−１を含んでいる。ＧＰＳレシーバ４４−１は、算出した現在位置データをＣＰＵ８２に送出する。 The GPS receiving unit 40-1 includes a GPS antenna 42-1 that receives radio waves from GPS satellites. It also includes a GPS receiver 44-1 that calculates the current position data of the specific speech recognition apparatus 100-1, for example as latitude and longitude data, based on the radio waves received by the GPS antenna 42-1. The GPS receiver 44-1 sends the calculated current position data to the CPU 82.

ＣＰＵ８２には、ＣＰＵ８２を動作させるプログラムや、緊急車両接近警告画像データ、緊急車両接近警告メッセージデータ等を記憶するＲＯＭ８４と、ＣＰＵ８２のワークエリアとして各種のデータを記憶するＲＡＭ８６が接続されている。ＲＯＭ８４に記憶されＣＰＵ８２を動作させるプログラムとしては、カーナビゲーションの機能や、第２の実施形態で説明した現在位置特定部３２の機能を実行するプログラムも含まれている。 The CPU 82 is connected to a ROM 84 that stores a program for operating the CPU 82, emergency vehicle approach warning image data, emergency vehicle approach warning message data, and the like, and a RAM 86 that stores various data as a work area of the CPU 82. The program stored in the ROM 84 and operating the CPU 82 includes a program for executing the function of the car navigation and the function of the current position specifying unit 32 described in the second embodiment.

また、ＣＰＵ８２には、カーナビゲーションを行うためのディスプレイ７２−１や、音声を出力するスピーカ７４−１も接続されている。 The CPU 82 is also connected to a display 72-1 for performing car navigation and a speaker 74-1 for outputting sound.

地図データ記憶部５０−１は、例えば、日本全土の地図データを、緯度、経度のデータと共に記憶する地図データ５２−１を含んでいる。地図データ５２−１は、さらに、当該緯度、経度の地点が日本のどの自治体（□□県、○○市、△△町、など）に属しているかの情報も記憶している。 The map data storage unit 50-1 includes map data 52-1, which stores, for example, map data for the whole of Japan together with latitude and longitude data. The map data 52-1 also stores information on which local government in Japan (□□ Prefecture, ○○ City, △△ Town, and so on) each latitude/longitude point belongs to.

なお、上述したＣＰＵ３７−１と、カーナビゲーションシステム８０のＣＰＵ８２は図示しないバス（ｂｕｓ）で接続されており、各種情報の送受信を相互に行うことが可能となっている。 The CPU 37-1 described above and the CPU 82 of the car navigation system 80 are connected by a bus (not shown) so that various types of information can be transmitted and received mutually.

次に、図１１を参照して、第４の実施形態の動作について説明する。 Next, the operation of the fourth embodiment will be described with reference to FIG.

本発明の第４の実施形態は、図４に示した第３の実施形態の構成要素の一部を、カーナビゲーションシステムに代替させるよう構成したものである。従って、図１１に示す動作は、図９に示した動作とほぼ同一であり、図１１において図９に示す動作には同一の参照数字又は符号を付し、その説明を極力省略するものとする。 In the fourth embodiment of the present invention, some of the components of the third embodiment shown in FIG. 4 are replaced with a car navigation system. The operation shown in FIG. 11 is therefore almost the same as the operation shown in FIG. 9. Operations in FIG. 11 that also appear in FIG. 9 are given the same reference numerals or symbols, and their description is omitted as far as possible.

図１１は、第３実施形態の動作を説明するシーケンス図である。 FIG. 11 is a sequence diagram for explaining the operation of the third embodiment.

図１１において、特定音声認識装置１００−１のマイク１０は、マイク１０周辺の音声を一定時間長の単位で取得し、音声データ抽出部２０に送出する（図１１のステップＳ１）。その後、次の時間長の音声を取得するためステップＳ１を繰り返す。 In FIG. 11, the microphone 10 of the specific speech recognition apparatus 100-1 acquires the sound around the microphone 10 in units of a certain time length and sends it to the audio data extraction unit 20 (step S1 in FIG. 11). Step S1 is then repeated to acquire the sound of the next time unit.
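Acquisition "in units of a certain time length" in step S1 can be pictured as cutting the sample stream into equal, consecutive units. The following is a minimal sketch under that reading; the function name, the sample rate, and the decision to drop a trailing partial unit are assumptions, since the text does not specify them.

```python
def fixed_length_units(samples, sample_rate, unit_seconds):
    """Yield consecutive, non-overlapping units of unit_seconds each.

    A trailing remainder shorter than one unit is dropped (an assumption).
    """
    unit_len = int(sample_rate * unit_seconds)
    if unit_len <= 0:
        raise ValueError("a unit must contain at least one sample")
    for start in range(0, len(samples) - unit_len + 1, unit_len):
        yield samples[start:start + unit_len]


if __name__ == "__main__":
    # Ten samples at 4 samples per second, cut into one-second units.
    for unit in fixed_length_units(list(range(10)), 4, 1.0):
        print(unit)
```

Each yielded unit corresponds to one pass through step S1; the next iteration of the loop plays the role of repeating the step for the next time unit.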

一方、特定音声認識装置１００−１のカーナビゲーションシステム８０のＧＰＳ受信部４０−１は、一定時間間隔で、複数のＧＰＳ衛星が送出する電波をＧＰＳアンテナ４２−１で受信する。そして、ＧＰＳアンテナ４２−１で受信した電波情報に基づき、ＧＰＳレシーバ４４−１は、特定音声認識装置１００−１の現在位置を算出する（ステップＳ３−２）。そして、算出した現在位置の情報をＣＰＵ８２に送出する。その後、次の時間間隔でのＧＰＳ衛星からの電波を受信するため、ステップＳ３−２を繰り返す。 On the other hand, the GPS receiving unit 40-1 of the car navigation system 80 of the specific speech recognition apparatus 100-1 receives the radio waves transmitted from the plurality of GPS satellites by the GPS antenna 42-1 at regular time intervals. Then, based on the radio wave information received by the GPS antenna 42-1, the GPS receiver 44-1 calculates the current position of the specific speech recognition device 100-1 (step S3-2). Then, the calculated current position information is sent to the CPU 82. Thereafter, step S3-2 is repeated to receive radio waves from the GPS satellite at the next time interval.

ＧＰＳ受信部４０−１のＧＰＳレシーバ４４−１から現在位置を受信したカーナビゲーションシステム８０のＣＰＵ８２は、現在位置特定部（図２の現在位置特定部３２）のプログラムを実行させる。そして、ＣＰＵ８２は受信した現在位置が、どの自治体の中の位置に相当するかについて、図１０に示した地図データ記憶部５０−１の地図データ５２−１を参照して特定する。さらに、ＣＰＵ８２は、特定した自治体の近辺の他の２〜３の自治体も特定する（ステップＳ４−２）。そして、ＣＰＵ８２は、特定した複数の自治体の情報を、制御部３０のＣＰＵ３７−１に送出する。 On receiving the current position from the GPS receiver 44-1 of the GPS receiving unit 40-1, the CPU 82 of the car navigation system 80 executes the program of the current position specifying unit (the current position specifying unit 32 in FIG. 2). The CPU 82 then identifies which local government the received current position lies in by referring to the map data 52-1 of the map data storage unit 50-1 shown in FIG. 10. The CPU 82 also identifies two or three other local governments in the vicinity of the identified local government (step S4-2). The CPU 82 then sends the information on the identified local governments to the CPU 37-1 of the control unit 30.
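Step S4-2 can be sketched as a lookup followed by a nearest-neighbor selection. The text does not say how the map data 52-1 represents boundaries or how "vicinity" is measured, so the sketch below assumes rectangular regions and plain distance in latitude/longitude degrees to each region's center; the `Municipality` class, the function name, and the sample regions are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Municipality:
    """Hypothetical map entry: a name and a rectangular lat/lon extent."""
    name: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float

    def contains(self, lat, lon):
        return (self.lat_min <= lat <= self.lat_max
                and self.lon_min <= lon <= self.lon_max)

    def center(self):
        return ((self.lat_min + self.lat_max) / 2.0,
                (self.lon_min + self.lon_max) / 2.0)


def municipalities_near(lat, lon, map_data, neighbor_count=3):
    """Return the municipality containing the position, then its nearest neighbors."""
    current = next((m for m in map_data if m.contains(lat, lon)), None)
    if current is None:
        return []  # position not covered by the map data
    others = [m for m in map_data if m is not current]
    # Assumption: closeness is judged by distance to each region's center.
    others.sort(key=lambda m: math.dist((lat, lon), m.center()))
    return [current] + others[:neighbor_count]


if __name__ == "__main__":
    demo_map = [
        Municipality("A", 0, 1, 0, 1),
        Municipality("B", 0, 1, 1, 2),
        Municipality("C", 1, 2, 0, 1),
        Municipality("D", 5, 6, 5, 6),
    ]
    print([m.name for m in municipalities_near(0.5, 0.9, demo_map, 2)])
```

A production system would more likely use polygon boundaries and geodesic distance; the rectangle and degree-distance shortcuts are only there to keep the sketch short.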

ＣＰＵ８２から、特定した複数の自治体情報を受信したＣＰＵ３７−１は、緊急車両音声データ選択部（図２の緊急車両音声データ選択部３４）のプログラムを実行させる。そして、ステップＳ４−２で特定した複数の自治体が用いている緊急車両音声データを、図１０に示した緊急車両音声データ記憶部６０の緊急車両音声データ６２から抽出する（ステップＳ５−２）。ここで抽出した緊急車両音声データの数は、複数の自治体における複数の緊急車両の種類分有り、かつ、緊急車両音声データはソナグラムの形の３次元データとなっている。 On receiving the information on the identified local governments from the CPU 82, the CPU 37-1 executes the program of the emergency vehicle voice data selection unit (the emergency vehicle voice data selection unit 34 in FIG. 2). It then extracts the emergency vehicle audio data used by the local governments identified in step S4-2 from the emergency vehicle audio data 62 of the emergency vehicle audio data storage unit 60 shown in FIG. 10 (step S5-2). The extracted emergency vehicle audio data comprises one entry for each type of emergency vehicle in each of the identified local governments, and each entry is three-dimensional data in the form of a sonagram.
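The extraction in step S5-2 amounts to filtering a store keyed by local government. The sketch below assumes a nested dictionary (local government name, then vehicle type, then stored pattern); that layout, the function name, and the sample entries are hypothetical and stand in for whatever format the emergency vehicle audio data 62 actually uses.

```python
def select_siren_patterns(store, municipality_names):
    """Collect (municipality, vehicle_type, pattern) for the named municipalities only."""
    selected = []
    for name in municipality_names:
        # Municipalities without stored data simply contribute nothing.
        for vehicle_type, pattern in store.get(name, {}).items():
            selected.append((name, vehicle_type, pattern))
    return selected


if __name__ == "__main__":
    # Hypothetical store; the strings stand in for sonagram-form data.
    demo_store = {
        "A": {"ambulance": "pattern-A1", "fire engine": "pattern-A2"},
        "B": {"ambulance": "pattern-B1"},
        "Z": {"ambulance": "pattern-Z1"},
    }
    for entry in select_siren_patterns(demo_store, ["A", "B"]):
        print(entry)
```

Only the entries for the named local governments are returned, which is what keeps the later matching step from having to visit every stored pattern.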

また、制御部３０のＣＰＵ３７−１は、音声データ抽出部２０のＡ／Ｄコンバータ２６からデジタル音声データを受信すると、警報音認識部（図２の警報音認識部３６）のプログラムを実行させる。そして、ＣＰＵ３７−１は先ず、デジタル音声データをソナグラムの形の３次元データに変換し（受信ソナグラムデータと称している）、ＲＡＭ３９−１に記憶させる。次に、ＣＰＵ３７−１は、受信ソナグラムデータが、ステップＳ５−２で抽出した複数の自治体の複数の緊急車両音声データの何れかと一致するかを比較する（ステップＳ６−２）。そして、ＣＰＵ３７−１は、受信ソナグラムデータが、ステップＳ５−２で抽出した複数の自治体の複数の緊急車両音声データの何れかと一致するかを判定する（ステップＳ７−２）。一致の判定は、ソナグラムで表されたデータ同士のパターンマッチングにより行う。ＣＰＵ３７−１は、判定結果（「一致」、又は、「不一致」）をカーナビゲーションシステム８０のＣＰＵ８２に返送する。 When the CPU 37-1 of the control unit 30 receives digital audio data from the A/D converter 26 of the audio data extraction unit 20, it executes the program of the alarm sound recognition unit (the alarm sound recognition unit 36 in FIG. 2). The CPU 37-1 first converts the digital audio data into three-dimensional data in the form of a sonagram (referred to as received sonagram data) and stores it in the RAM 39-1. Next, the CPU 37-1 compares the received sonagram data with the plurality of emergency vehicle audio data of the plurality of local governments extracted in step S5-2 (step S6-2). The CPU 37-1 then determines whether the received sonagram data matches any of them (step S7-2). The match is determined by pattern matching between the data represented as sonagrams. The CPU 37-1 returns the determination result (“match” or “mismatch”) to the CPU 82 of the car navigation system 80.
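Steps S6-2 and S7-2 can be sketched as a short-time spectrum computation followed by a similarity test. The text says only that matching is done by pattern matching between sonagrams, so the Hann-windowed discrete Fourier transform, the cosine-similarity score, and the 0.8 threshold below are assumptions standing in for the unspecified method; all names are illustrative.

```python
import cmath
import math


def sonagram(samples, frame_len=64, hop=32):
    """Return time frames, each a list of magnitudes per frequency bin."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of one bin; slow but dependency-free.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def similarity(a, b):
    """Cosine similarity of two sonagrams over their overlapping frames."""
    flat_a = [v for frame in a[:len(b)] for v in frame]
    flat_b = [v for frame in b[:len(a)] for v in frame]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = (math.sqrt(sum(x * x for x in flat_a))
            * math.sqrt(sum(y * y for y in flat_b)))
    return dot / norm if norm else 0.0


def matches_any(received, candidates, threshold=0.8):
    """True if the received sonagram is similar enough to any stored candidate."""
    return any(similarity(received, c) >= threshold for c in candidates)


def tone(freq_hz, sample_rate, n):
    """Test signal: a unit-amplitude sine wave (illustrative helper)."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    received = sonagram(tone(1000, 8000, 512))
    stored = [sonagram(tone(3000, 8000, 512)), sonagram(tone(1000, 8000, 512))]
    print(matches_any(received, stored))
```

The three dimensions mentioned in the text correspond here to the frame index (time), the bin index (frequency), and the stored magnitude (intensity). Because `matches_any` is evaluated once per candidate, its cost grows with the number of candidates, which is why limiting the candidates to nearby local governments matters.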

カーナビゲーションシステム８０のＣＰＵ８２は、判定結果が「一致」又は「不一致」を判定する（ステップＳ７１−２）。 The CPU 82 of the car navigation system 80 determines whether the determination result is “match” or “mismatch” (step S71-2).

図１１のステップＳ７１−２における判定結果が、「一致」である場合（ステップＳ７１−２で「一致」）、カーナビゲーションシステム８０のＣＰＵ８２は、緊急車両が近づいているものと判断する。そして、緊急車両が近づいている旨の緊急車両接近警告画像データ、緊急車両接近警告メッセージデータ等を、ＲＯＭ８４から読み出し、ディスプレイ７２−１、スピーカ７４−１に通知する。ディスプレイ７２−１は、緊急車両が近づいている旨を、画像データとして出力し、報知する。同時に／又は、スピーカ７４−１は、緊急車両が近づいている旨を、音声で出力し、報知する（ステップＳ８−２）。そして、特定音声認識装置１００−１の利用者がこの報知に気づき、報知の停止操作を行った場合、ディスプレイ７２−１やスピーカ７４−１は報知を停止し、もとの状態に戻る。 When the determination result in step S71-2 of FIG. 11 is “match” (“match” in step S71-2), the CPU 82 of the car navigation system 80 determines that an emergency vehicle is approaching. It then reads the emergency vehicle approach warning image data, emergency vehicle approach warning message data, and the like from the ROM 84 and passes them to the display 72-1 and the speaker 74-1. The display 72-1 reports that an emergency vehicle is approaching by outputting image data. At the same time, or alternatively, the speaker 74-1 reports that an emergency vehicle is approaching by outputting sound (step S8-2). When the user of the specific speech recognition apparatus 100-1 notices the warning and performs an operation to stop it, the display 72-1 and the speaker 74-1 stop the warning and return to their original state.

ステップＳ７１−２における判定結果が「不一致」（ステップＳ７１−２「不一致」）、の場合、カーナビゲーションシステム８０のＣＰＵ８２は、何もせず、もとの状態に戻る。 When the determination result in step S71-2 is “mismatch” (step S71-2 “mismatch”), the CPU 82 of the car navigation system 80 does nothing and returns to the original state.
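The handling of the determination result in steps S71-2 and S8-2 behaves like a two-state machine: idle until a “match” arrives, warning until the user stops it. The class below is a hypothetical stand-in for the display 72-1 and the speaker 74-1; it assumes both outputs are driven together, although the text also allows using only one of them, and it records events in a list instead of driving real hardware.

```python
class WarningOutput:
    """Hypothetical stand-in for the display 72-1 and the speaker 74-1."""

    def __init__(self):
        self.active = False
        self.events = []

    def on_result(self, result):
        """Start the warning on "match"; do nothing on any other result."""
        if result == "match" and not self.active:
            self.active = True
            self.events.append("display: emergency vehicle approaching")
            self.events.append("speaker: emergency vehicle approaching")

    def stop(self):
        """User stop operation: end the warning and return to the idle state."""
        if self.active:
            self.active = False
            self.events.append("warning stopped")


if __name__ == "__main__":
    output = WarningOutput()
    for result in ["mismatch", "match", "match"]:
        output.on_result(result)
    output.stop()
    print(output.events)
```

Ignoring a second “match” while the warning is already active is a design choice of this sketch; the text does not say whether repeated matches restart the warning.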

以上、本発明の第４の実施形態の動作について説明した。 The operation of the fourth embodiment of the present invention has been described above.

前述したように、本発明の第４の実施形態は、図４に示した第３の実施形態の構成要素の一部を、カーナビゲーションシステム８０に代替させるように構成したものである。 As described above, the fourth embodiment of the present invention is configured so that the car navigation system 80 replaces some of the components of the third embodiment shown in FIG.

従って、本実施形態によれば、第３の実施形態におけると同様の、以下のような効果を奏することができる。 Therefore, according to the present embodiment, the following effects similar to those in the third embodiment can be obtained.

すなわち、自治体毎に、当該自治体が使用している緊急車両音声データ（サイレン音など）が異なっている場合であっても、車両外の音声データが緊急車両の吹鳴する緊急車両音声データに一致するかを判定できる。そして、一致する場合には、緊急車両が近づいている旨を車両内に報知することが可能となる。 That is, even when the emergency vehicle audio data (siren sounds and the like) differs from one local government to another, it can be determined whether the audio data from outside the vehicle matches the emergency vehicle audio data sounded by an emergency vehicle. When it matches, the occupants of the vehicle can be notified that an emergency vehicle is approaching.

さらに、緊急車両音声データ６２には、全自治体の緊急車両が吹鳴する音声データが多数記憶されているが、特定音声認識装置１００−１を搭載する車両の現在位置近傍のいくつかの自治体の緊急車両音声データだけを抽出できる。従って、ソナグラムのデータ形のパターンマッチングの回数を大幅に削減可能となる。 Further, although the emergency vehicle audio data 62 stores a large amount of audio data sounded by the emergency vehicles of all local governments, only the emergency vehicle audio data of the few local governments near the current position of the vehicle carrying the specific speech recognition apparatus 100-1 is extracted. The number of pattern-matching operations on sonagram-form data can therefore be greatly reduced.

加えて、第４の実施形態によれば、既存のカーナビゲーションシステム８０を利用するようにしているため、特定音声認識装置１００−１のコスト低減を図ることが可能となる。 In addition, according to the fourth embodiment, since the existing car navigation system 80 is used, the cost of the specific speech recognition apparatus 100-1 can be reduced.

以上の各実施形態では、領域固有音声として主に緊急車両の吹鳴するサイレン音を例として説明してきたが、これに限られない。上記各実施形態は、地理的領域に固有な音声について適用可能である。また、地理的領域として自治体を代表的な例として説明してきたが、これに限られない。領域固有音声が聞かれる範囲として予め定められていれば、これを地理的領域とみなして各実施形態を適用することができる。
In each of the above embodiments, the siren sound emitted by an emergency vehicle has mainly been described as the example of a region-specific sound, but the invention is not limited to this. Each of the above embodiments can be applied to any sound that is unique to a geographical region. Likewise, the local government has been described as the representative example of a geographical region, but the invention is not limited to this. Any range that is defined in advance as the range within which a region-specific sound is heard can be regarded as a geographical region, and each embodiment can be applied to it.

２音声取得部
３−２現在位置認識特定部
３−６音声認識結果報知部
４現在位置生成部
６領域固有音声記憶部
１０マイク
２０音声データ抽出部
２２アンプ
２４バンドパスフィルタ
２６Ａ／Ｄコンバータ
３０制御部
３２現在位置特定部
３４緊急車両音声データ選択部
３６警報音認識部
３７ＣＰＵ
３７−１ＣＰＵ
３８ＲＯＭ
３８−１ＲＯＭ
３９ＲＡＭ
３９−１ＲＡＭ
４０ＧＰＳ受信部
４０−１ＧＰＳ受信部
４２ＧＰＳアンテナ
４２−１ＧＰＳアンテナ
４４ＧＰＳレシーバ
４４−１ＧＰＳレシーバ
５０地図データ記憶部
５０−１地図データ記憶部
５２地図データ
５２−１地図データ
６０緊急車両音声データ記憶部
６２緊急車両音声データ
７０警報出力部
７２ディスプレイ
７２−１ディスプレイ
７４スピーカ
７４−１スピーカ
８０カーナビゲーションシステム
８２ＣＰＵ
８４ＲＯＭ
８６ＲＡＭ
１００特定音声認識装置
１００−１特定音声認識装置
６００ソナグラムパターンデータ（第一例）
７００ソナグラムパターンデータ（第二例）

2 Voice acquisition unit
3-2 Current position recognition specifying unit
3-6 Voice recognition result notification unit
4 Current position generation unit
6 Area-specific voice storage unit
10 Microphone
20 Audio data extraction unit
22 Amplifier
24 Band-pass filter
26 A/D converter
30 Control unit
32 Current position specifying unit
34 Emergency vehicle voice data selection unit
36 Alarm sound recognition unit
37 CPU
37-1 CPU
38 ROM
38-1 ROM
39 RAM
39-1 RAM
40 GPS receiving unit
40-1 GPS receiving unit
42 GPS antenna
42-1 GPS antenna
44 GPS receiver
44-1 GPS receiver
50 Map data storage unit
50-1 Map data storage unit
52 Map data
52-1 Map data
60 Emergency vehicle audio data storage unit
62 Emergency vehicle audio data
70 Alarm output unit
72 Display
72-1 Display
74 Speaker
74-1 Speaker
80 Car navigation system
82 CPU
84 ROM
86 RAM
100 Specific speech recognition apparatus
100-1 Specific speech recognition apparatus
600 Sonagram pattern data (first example)
700 Sonagram pattern data (second example)

Claims

A specific speech recognition apparatus comprising:
an audio acquisition unit that acquires input peripheral audio in units of a certain time length and outputs it as digital audio data;
a current position generation unit that calculates and outputs its own current position based on a predetermined input signal;
a region-specific speech storage unit that stores region-specific speech, heard only in a predetermined geographical region, in association with that geographical region in units of the certain time length or more;
a current position recognition specifying unit that outputs information indicating the specific geographical region to which the current position input from the current position generation unit belongs; and
a speech recognition result notifying unit that compares the region-specific speech extracted from the region-specific speech storage unit for the input specific geographical region with the digital audio data input from the audio acquisition unit and, when the two are determined to match, outputs a notification signal for notifying the user of the match.

The specific speech recognition apparatus according to claim 1, wherein
the audio acquisition unit includes a microphone that acquires peripheral audio in units of a certain time length, and an audio data extraction unit that A/D-converts the analog audio data acquired by the microphone and sends it to a control unit as digital audio data;
the current position generation unit includes a GPS receiving unit that receives radio waves transmitted by a plurality of GPS satellites at regular time intervals, calculates the current position based on the received radio wave information, and sends the current position to the control unit;
the region-specific speech storage unit includes an emergency vehicle audio data storage unit that is connected to the control unit and stores, for each local government, emergency vehicle audio data such as the siren sound emitted by that local government's emergency vehicles;
the current position recognition specifying unit includes a map data storage unit that is connected to the control unit and stores map data, and a current position specifying unit that, based on the current position received from the GPS receiving unit, specifies a plurality of local governments near the current position by referring to the map data storage unit;
the speech recognition result notifying unit includes an emergency vehicle audio data selection unit that is included in the control unit and extracts, from the emergency vehicle audio data storage unit, the emergency vehicle audio data used by the plurality of local governments specified by the current position specifying unit, and an alarm sound recognition unit that is included in the control unit and, when it determines that the digital audio data received from the audio data extraction unit matches any of the plurality of emergency vehicle audio data of the plurality of local governments extracted by the emergency vehicle audio data selection unit, notifies an alarm output unit connected to the control unit that an emergency vehicle is approaching; and
the alarm output unit, when notified by the alarm sound recognition unit, reports that an emergency vehicle is approaching by outputting sound, message data, or image data.

The voice data extraction unit includes a band-pass filter, and the band-pass filter extracts only voice data in a frequency range of emergency vehicle voice data such as a siren sound generated by a local emergency vehicle.
The specific speech recognition apparatus according to claim 2.

The specific speech recognition apparatus according to claim 2, wherein
the audio data stored in the emergency vehicle audio data storage unit is audio data in the form of a sonagram, and
the alarm sound recognition unit converts the digital audio data received from the audio data extraction unit into audio data in the form of a sonagram and then determines, by pattern matching, whether it matches any of the plurality of emergency vehicle audio data of the plurality of local governments extracted by the emergency vehicle audio data selection unit.

The specific speech recognition apparatus according to any one of claims 2 to 4, wherein the map data is installed in a car navigation system, and the car navigation system executes the functions of the GPS receiving unit, the current position specifying unit, and the alarm output unit.

A specific speech recognition method comprising:
acquiring surrounding audio as analog audio data in units of a certain time length;
converting the analog audio data into digital audio data by A/D conversion;
receiving radio waves transmitted by a plurality of GPS satellites at regular time intervals and calculating the current position based on the received radio wave information;
specifying, based on the current position, a plurality of local governments near the current position;
extracting a plurality of emergency vehicle audio data used by the specified plurality of local governments; and
determining whether the digital audio data matches any of the plurality of emergency vehicle audio data of the plurality of local governments near the current position and, when a match is determined, reporting that an emergency vehicle is approaching by outputting sound, message data, or image data.

The A / D conversion is performed after extracting only the voice data in the frequency range of the emergency vehicle voice data such as sirens sounded by the emergency vehicle of the local government from the analog voice data.
The specific speech recognition method according to claim 6.

The specific speech recognition method according to claim 6, wherein
the emergency vehicle audio data is audio data in the form of a sonagram, and
the digital audio data is converted into audio data in the form of a sonagram and it is then determined, by pattern matching, whether it matches any of the plurality of emergency vehicle audio data of the plurality of local governments near the current position.