JPWO2018061839A1

JPWO2018061839A1 - Transmission apparatus, transmission method and transmission program

Info

Publication number: JPWO2018061839A1
Application number: JP2018542405A
Authority: JP
Inventors: 敬彦山縣; 晋平笹野; 昌治板谷; 健太郎今川; 善成石橋
Original assignee: Murata Manufacturing Co Ltd
Current assignee: Murata Manufacturing Co Ltd
Priority date: 2016-09-29
Filing date: 2017-09-15
Publication date: 2019-06-27
Anticipated expiration: 2037-09-15
Also published as: WO2018061839A1; JP6781961B2

Abstract

To provide information that matches a user's needs. A transmission program causes a computer to implement: a voice collection unit that collects voice and generates voice data representing the voice; a voice keyword extraction unit that extracts, from the voice data, a voice keyword, which is a keyword contained in the voice; a voice feature extraction unit that extracts, from the voice data, a feature of the voice including at least one of the strength, speed, and intonation of the voice; an emotion keyword selection unit that, based on the feature of the voice, selects the emotion keyword corresponding to the extracted feature from a group of emotion keywords stored in advance; and a transmission unit that transmits the extracted voice keyword and the selected emotion keyword.

Description

The present invention relates to a transmission device, a transmission method, and a transmission program.

One conventional information presentation apparatus is described in Japanese Patent Application Laid-Open No. 2012-256183 (Patent Document 1). This conventional apparatus estimates the strength of the desire the user currently has and the situation the user is currently in, and queries a database describing combinations of the two, thereby presenting items that can satisfy both the user's desire and the user's situation.

Patent Document 1: JP 2012-256183 A

However, because the conventional information presentation apparatus merely estimates the strength of the user's desire and the situation the user is in, it could not always present information that matched the user's needs.

The present invention has been made in view of these circumstances, and its object is to provide information that better matches the needs of the user.

A transmission program according to one aspect of the present invention causes a computer to implement: a voice collection unit that collects voice and generates voice data representing the voice; a voice keyword extraction unit that extracts, from the voice data, a voice keyword, which is a keyword contained in the voice; a voice feature extraction unit that extracts, from the voice data, a feature of the voice including at least one of the strength, speed, and intonation of the voice; an emotion keyword selection unit that, based on the feature of the voice, selects the emotion keyword corresponding to the extracted feature from a group of emotion keywords stored in advance; and a transmission unit that transmits the extracted voice keyword and the selected emotion keyword.

A transmission device according to another aspect of the present invention includes: a voice collection unit that collects voice and generates voice data representing the voice; a voice keyword extraction unit that extracts, from the voice data, a voice keyword, which is a keyword contained in the voice; a voice feature extraction unit that extracts, from the voice data, a feature of the voice including at least one of the strength, speed, and intonation of the voice; an emotion keyword selection unit that, based on the feature of the voice, selects the emotion keyword corresponding to the extracted feature from a group of emotion keywords stored in advance; and a transmission unit that transmits the extracted voice keyword and the selected emotion keyword.

A transmission method according to still another aspect of the present invention includes: collecting voice and generating voice data representing the voice; extracting, from the voice data, a voice keyword, which is a keyword contained in the voice; extracting, from the voice data, a feature of the voice including at least one of the strength, speed, and intonation of the voice; selecting, based on the feature of the voice, the emotion keyword corresponding to the extracted feature from a group of emotion keywords stored in advance; and transmitting the extracted voice keyword and the selected emotion keyword.

According to the present invention, it is possible to provide information that better matches the needs of the user.

FIG. 1 is a diagram showing the configuration of a search system 100 that includes a transmission device 110 according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the functional blocks of the transmission device 110.
FIG. 3 is a diagram showing Russell's circumplex model of emotion and the keywords it contains.
FIG. 4 is a diagram showing an example of the functional blocks of the search device 120.
FIG. 5 is a diagram showing an example of the functional blocks of the output device 130.
FIG. 6 is a flowchart showing an example of processing in the search system 100.
FIG. 7 is a flowchart showing the dictionary database update process.

An embodiment of the present invention is described below with reference to the accompanying drawings. FIG. 1 is a diagram showing the configuration of a search system 100 that includes a transmission device 110 according to an embodiment of the present invention. The search system 100 includes the transmission device 110, a search device 120, and an output device 130. The search system 100 according to this embodiment performs a predetermined search based on a keyword extracted from voice collected by the transmission device 110 and a keyword corresponding to the emotion or atmosphere extracted from that voice, and outputs the search result.

The transmission device 110 is a computer that transmits predetermined keywords to the search device 120 based on voice collected at the place where the transmission device 110 is located. The transmission device 110 may be a computer installed in a facility or store, or a computer (such as a smartphone or tablet terminal) owned by a user visiting the facility or store. The transmission device 110 includes a processor, a memory, and a communication interface. The transmission device 110 can communicate with the search device 120 via, for example, a mobile phone network or the Internet.

The search device 120 is a computer (server) that provides information to the user of the output device 130 based on the keywords received from the transmission device 110. The search device 120 includes a processor, a memory, a database, and a communication interface. The search device 120 can communicate with the transmission device 110 and the output device 130 via, for example, the Internet.

The output device 130 is a computer that outputs search results based on data (display data) provided by the search device 120. The output device 130 outputs numerical values, text, video (images), audio, and the like as search results to a display, a speaker, or the like. The output device 130 is, for example, a smartphone, a tablet terminal, or a personal computer. The output device 130 includes a processor, a memory, and a communication interface. The output device 130 can communicate with the search device 120 via, for example, a mobile phone network or the Internet.

The output device 130 may also perform a predetermined operation as its output, based on the search result. For example, based on the search result, the output device 130 may perform predetermined communication or control a motor, an actuator, a sensor, or the like.

FIG. 2 is a diagram showing an example of the functional blocks of the transmission device 110. The transmission device 110 includes a voice collection unit 200, a voice keyword extraction unit 210, a dictionary database 220, a voice feature extraction unit 230, an emotion keyword selection unit 240, an emotion database 250, a transmission unit 260, and a dictionary database update instruction creation unit 270.

A transmission program is stored in the memory of the transmission device 110, and the functions of the transmission device 110 (the voice collection unit 200, the voice keyword extraction unit 210, the dictionary database 220, the voice feature extraction unit 230, the emotion keyword selection unit 240, the emotion database 250, the transmission unit 260, and the dictionary database update instruction creation unit 270) are realized through cooperation between the hardware resources of the transmission device 110 and the transmission program. The transmission program is read from a computer-readable recording medium into the memory of the transmission device 110 and executed by the processor of the transmission device 110.

The voice collection unit 200 collects voice uttered around the transmission device 110 and generates voice data representing the voice. Specifically, the voice collection unit 200 converts voice picked up by a microphone or the like into an electrical signal, and generates voice data by converting the information carried by that electrical signal into digital data. The voice collection unit 200 may also have a storage unit that temporarily stores the generated voice data. The storage unit is, for example, an existing storage device or storage medium capable of magnetic, electrical, or optical storage, such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), a memory card, an optical disk, or a RAM (Random Access Memory). The voice collection unit 200 may be a microphone built into the transmission device 110 or an external microphone (an external wired microphone or a wireless microphone). For example, a tablet terminal to which an external microphone is connected functions as the transmission device 110. The transmission device 110 may include a plurality of voice collection units 200, and may detect the relative positions of the plurality of voice collection units 200.

The voice keyword extraction unit 210 extracts, from the voice data generated by the voice collection unit 200, a keyword contained in the voice (hereinafter also called a "voice keyword"). Specifically, the voice keyword extraction unit 210 first analyzes the voice data and converts the voice into text data. It then compares each word in the text data with the words stored in advance in the dictionary database 220. When a word in the text data matches a word stored in advance in the dictionary database 220, the voice keyword extraction unit 210 extracts that word as a voice keyword.
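The dictionary-matching step described above can be sketched as follows. This is a minimal illustration, not the implementation in the specification: it assumes speech recognition has already produced text, it substitutes whitespace tokenization for real word segmentation, and the name `extract_voice_keywords` and the sample dictionary are hypothetical.

```python
from __future__ import annotations


def extract_voice_keywords(text: str, dictionary: set[str]) -> list[str]:
    """Return the words of `text` that are registered in `dictionary`.

    Assumption: `text` is the output of a speech recognizer, and splitting
    on whitespace is an adequate stand-in for word segmentation.
    """
    keywords: list[str] = []
    for raw in text.split():
        # Normalize: drop surrounding punctuation and ignore case.
        word = raw.strip(".,!?\"'").lower()
        if word in dictionary and word not in keywords:
            keywords.append(word)
    return keywords


# Hypothetical contents of the dictionary database 220.
dictionary_db = {"hot", "cold", "coffee"}
print(extract_voice_keywords("It is hot today.", dictionary_db))  # ['hot']
```

Words that are not registered in the dictionary are simply ignored, which is why the contents of the dictionary determine what can ever be transmitted as a voice keyword.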

The voice feature extraction unit 230 extracts a feature of the voice from the voice data generated by the voice collection unit 200. In this embodiment, the voice feature extraction unit 230 extracts at least one of the strength, speed, and intonation of the voice, and treats the strength, speed, or intonation, or a combination of two or more of them, as the feature of the voice.

The voice feature extraction unit 230 extracts the strength of the voice based on the amplitude of the voice signal represented by the voice data. For example, the voice feature extraction unit 230 extracts the average intensity of the voice over a predetermined unit period as the strength of the voice. Alternatively, the voice feature extraction unit 230 may extract the average intensity of a stretch of voice containing a predetermined number of characters, words, or sentences as the strength of the voice.
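As a rough sketch of the per-period averaging described above, the following computes the mean absolute amplitude of each fixed-length window. The use of mean absolute amplitude as the "intensity" measure is an assumption (the text only says the strength is based on amplitude), and the function names are hypothetical.

```python
from __future__ import annotations


def voice_strength(samples: list[float]) -> float:
    """Mean absolute amplitude of the samples in one unit period."""
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def strength_per_period(samples: list[float], period: int) -> list[float]:
    """Split `samples` into windows of `period` samples and average each."""
    if period <= 0:
        raise ValueError("period must be a positive number of samples")
    return [
        voice_strength(samples[i:i + period])
        for i in range(0, len(samples), period)
    ]


print(strength_per_period([1.0, -1.0, 0.5, -0.5], period=2))  # [1.0, 0.5]
```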

The voice feature extraction unit 230 also extracts the speed of the voice based on the number of speech sounds in the words contained in the voice. For example, the voice feature extraction unit 230 extracts the number of speech sounds contained in a predetermined unit period as the speed of the voice. Alternatively, the voice feature extraction unit 230 may extract the number of characters, words, or sentences of the voice in a predetermined unit period as the speed of the voice.
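One way to read "the number of words in a predetermined unit period" is to bucket word start times into fixed-length periods and count each bucket. The sketch below does that; it assumes a recognizer supplies non-negative word start times in seconds, and `words_per_period` is a hypothetical name.

```python
from __future__ import annotations


def words_per_period(word_start_times: list[float], period: float) -> list[int]:
    """Count how many words start in each consecutive unit period.

    Assumption: times are non-negative and measured in seconds from the
    start of the recording.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if not word_start_times:
        return []
    n_periods = int(max(word_start_times) // period) + 1
    counts = [0] * n_periods
    for t in word_start_times:
        counts[int(t // period)] += 1
    return counts


# Three words in the first second, then one word in each of the next two.
print(words_per_period([0.1, 0.5, 0.9, 1.2, 2.7], period=1.0))  # [3, 1, 1]
```

The same counting applies unchanged if the timestamps mark individual speech sounds, characters, or sentences instead of words.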

The voice feature extraction unit 230 also extracts the intonation of the voice based on changes in the strength of the voice. For example, the voice feature extraction unit 230 divides the voice into a plurality of units and extracts the change in strength within each unit and/or the change in strength between units as the intonation of the voice. A unit of voice is, for example, a word or clause making up a sentence contained in the voice.
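The text does not say how a "change in strength" is quantified, so the following is only one possible reading: the change within a unit is taken as the range of strength values in that unit, and the change between units as the difference between the mean strengths of consecutive units. Both choices, and the name `intonation`, are assumptions made for illustration.

```python
from __future__ import annotations


def intonation(units: list[list[float]]) -> dict[str, list[float]]:
    """Summarize strength changes within and between units.

    `units` holds one list of strength values per unit (for example, per
    word or clause). Empty units are skipped.
    """
    non_empty = [u for u in units if u]
    # Assumed measure of within-unit change: max strength minus min strength.
    within = [max(u) - min(u) for u in non_empty]
    # Assumed measure of between-unit change: difference of consecutive means.
    means = [sum(u) / len(u) for u in non_empty]
    between = [b - a for a, b in zip(means, means[1:])]
    return {"within": within, "between": between}


print(intonation([[1.0, 3.0], [2.0, 6.0]]))
# {'within': [2.0, 4.0], 'between': [2.0]}
```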

The emotion keyword selection unit 240 converts the speaker's emotion or the atmosphere of the place, as indicated by the voice feature extracted by the voice feature extraction unit 230, into a keyword. For example, the emotion keyword selection unit 240 selects, from a group of emotion keywords stored in advance in the emotion database 250, the emotion keyword corresponding to the voice feature extracted by the voice feature extraction unit 230. The emotion database 250 stores a large number of keywords related to emotions and atmospheres in association with voice features. Specifically, the emotion database 250 stores, in association with each keyword, the values (or predetermined ranges the values can take) of the strength, speed, and intonation of the voice, and patterns of combinations of those values or ranges. The emotion keyword selection unit 240 then selects a predetermined keyword from among the many keywords stored in the emotion database 250, based on the voice feature extracted by the voice feature extraction unit 230, that is, the values (or ranges) of the strength, speed, and intonation of the voice and the pattern of their combination. The keywords stored in the emotion database 250 may be keywords included in Russell's circumplex model of emotion, as shown in FIG. 3.
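The range-based lookup described above can be sketched as a table in which each emotion keyword carries a range for each feature, and a keyword is selected when every extracted value falls inside its ranges. The table contents, the numeric ranges, and the names `EmotionEntry` and `select_emotion_keywords` are all hypothetical; the specification does not give concrete values.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionEntry:
    """One row of a hypothetical emotion database 250."""
    keyword: str
    strength: tuple[float, float]    # inclusive (low, high) range
    speed: tuple[float, float]
    intonation: tuple[float, float]


# Illustrative rows only; the ranges are invented for this sketch.
EMOTION_DB = [
    EmotionEntry("excited", (0.6, 1.0), (4.0, 10.0), (0.3, 1.0)),
    EmotionEntry("calm", (0.0, 0.4), (0.0, 3.0), (0.0, 0.2)),
]


def select_emotion_keywords(
    strength: float,
    speed: float,
    intonation: float,
    db: list[EmotionEntry] = EMOTION_DB,
) -> list[str]:
    """Return every keyword whose ranges contain all three feature values."""
    def inside(value: float, bounds: tuple[float, float]) -> bool:
        return bounds[0] <= value <= bounds[1]

    return [
        e.keyword
        for e in db
        if inside(strength, e.strength)
        and inside(speed, e.speed)
        and inside(intonation, e.intonation)
    ]


print(select_emotion_keywords(0.8, 5.0, 0.5))  # ['excited']
```

A feature vector that falls outside every row yields an empty list, so a real table would need either full coverage of the feature space or a fallback keyword.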

In addition to the voice feature extracted by the voice feature extraction unit 230, the emotion keyword selection unit 240 may further base its selection of an emotion keyword on biometric information of the speaker who uttered the collected voice. For example, the emotion keyword selection unit 240 may estimate the speaker's emotion from biometric information such as the speaker's body temperature, heart rate, pulse, brain waves, or skin conductance level, and select a keyword corresponding to the estimated emotion. The speaker's biometric information may be acquired from a sensor connected to the transmission device 110, or from a device external to the transmission device 110, for example over a network.

The voice feature extraction unit 230 may extract, from the waveform of the voice data, an index of how affirmative or negative an utterance is. For example, the voice feature extraction unit 230 can use Empath (registered trademark) to extract such an index from the waveform of the voice data. Empath analyzes the physical characteristics of the waveform of the voice data and calculates a score on a 50-level scale for each of the items calm, anger, joy, sorrow, and energy. Based on the scores calculated in this way, the voice feature extraction unit 230 can extract the index of how affirmative or negative the utterance is. A technique that calculates scores relating to emotion or mood from the waveform of voice data in this way is called voice mood analysis. The items judged using voice mood analysis are not limited to calm, anger, joy, sorrow, and energy, and may include any item related to emotion. The emotion database 250 may store the index of how affirmative or negative an utterance is in association with each emotion keyword. The emotion keyword selection unit 240 may select from the emotion database 250 the emotion keyword corresponding to the index extracted by the voice feature extraction unit 230.

The transmission unit 260 transmits the voice keyword extracted by the voice keyword extraction unit 210 and the emotion keyword selected by the emotion keyword selection unit 240 to the search device 120. The transmission unit 260 transmits the voice keyword and the emotion keyword to the search device 120 in response to, for example, a predetermined period having elapsed since the voice collection unit 200 collected the voice, the strength of the voice extracted by the voice feature extraction unit 230 exceeding a predetermined value, or the voice keyword extraction unit 210 extracting a predetermined word. The predetermined word is, for example, "search".
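The three example triggers above (elapsed time, strength above a threshold, a predetermined word) can be combined in a single predicate. The sketch below is illustrative only: the default threshold values and the name `should_transmit` are assumptions, and only the trigger word "search" comes from the text.

```python
from __future__ import annotations


def should_transmit(
    elapsed_seconds: float,
    strength: float,
    words: list[str],
    *,
    max_wait: float = 30.0,            # hypothetical "predetermined period"
    strength_threshold: float = 0.8,   # hypothetical "predetermined value"
    trigger_words: frozenset[str] = frozenset({"search"}),
) -> bool:
    """Return True when any one of the transmission triggers fires."""
    return (
        elapsed_seconds >= max_wait
        or strength > strength_threshold
        or any(w in trigger_words for w in words)
    )


print(should_transmit(5.0, 0.1, ["please", "search"]))  # True
print(should_transmit(5.0, 0.1, ["hello"]))             # False
```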

In addition to the voice keyword and the emotion keyword, the transmission unit 260 may further transmit environmental information or biometric information to the search device 120. Environmental information is information about the place or space in which the transmission device 110 is located, such as position, temperature, humidity, illuminance, and vibration. Environmental information is acquired by measuring devices such as a GPS receiver, a temperature sensor, a humidity sensor, an illuminance sensor, an acceleration sensor, or an infrared sensor. The transmission unit 260 may acquire environmental information and biometric information directly from the measuring device, or indirectly through a network or the like.

The dictionary database update instruction creation unit 270 creates an instruction to update the dictionary database 220 (a dictionary database update instruction). The dictionary database update instruction is created at a predetermined trigger. The trigger may be the point at which the hit rate of the dictionary database 220 falls below a predetermined threshold, or it may be a predetermined fixed interval. The hit rate is the proportion of all keywords registered in the dictionary database 220 that have at some point been determined to be contained in voice collected by the voice collection unit 200. The transmission unit 260 transmits the dictionary database update instruction created by the dictionary database update instruction creation unit 270 to the search device 120.
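The hit rate defined above is the fraction of registered keywords that have been matched at least once. A minimal sketch of the calculation and of the threshold trigger follows; the function names and the default threshold of 0.5 are hypothetical.

```python
from __future__ import annotations


def hit_rate(dictionary: set[str], hit_keywords: set[str]) -> float:
    """Fraction of registered keywords that have been matched at least once."""
    if not dictionary:
        return 0.0
    return len(dictionary & hit_keywords) / len(dictionary)


def needs_update(
    dictionary: set[str],
    hit_keywords: set[str],
    threshold: float = 0.5,  # hypothetical "predetermined threshold"
) -> bool:
    """True when the hit rate has fallen below the threshold."""
    return hit_rate(dictionary, hit_keywords) < threshold


registered = {"tire", "brake", "apple", "coffee"}
seen = {"tire"}
print(hit_rate(registered, seen))      # 0.25
print(needs_update(registered, seen))  # True
```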

FIG. 4 is a diagram showing an example of the functional blocks of the search device 120. The search device 120 includes a reception unit 300, a search execution unit 310, a database 320, a transmission unit 330, a keyword database 340, an estimation unit 350, and a dictionary database creation unit 360. The search device 120 is a search engine that searches the information stored in the database 320 based on predetermined keywords and transmits the search result to the output device 130. The search device 120 may also receive and accumulate the voice keywords transmitted from the transmission device 110 and infer the topic of conversation from the accumulated voice keywords. For example, if the voice keywords are "steering wheel", "brake", and "tire", it can be inferred that the conversation is about cars. The inferred topic is provided to, for example, an advertiser. The search device 120 also updates the dictionary database 220 in response to a dictionary database update instruction from the transmission device 110.

The memory of the search device 120 stores a search program that causes the search device 120 to execute search processing based on the voice keyword and the emotion keyword, and the functions of the search device 120 (the reception unit 300, the search execution unit 310, the database 320, the transmission unit 330, the keyword database 340, the estimation unit 350, and the dictionary database creation unit 360) are realized through cooperation between the hardware resources of the search device 120 and the search program. The search program is read from a computer-readable recording medium into the memory of the search device 120 and executed by the processor of the search device 120.

The reception unit 300 receives the voice keyword and the emotion keyword transmitted by the transmission device 110. In addition to the voice keyword and the emotion keyword, the reception unit 300 may further receive environmental information and biometric information. The reception unit 300 also receives the dictionary database update instruction transmitted by the transmission device 110.

The search execution unit 310 searches the database 320 based on the voice keyword and emotion keyword received by the reception unit 300, and/or the environmental information and/or the biometric information. In this embodiment, the voice keyword, the emotion keyword, the environmental information, and the biometric information are all text data, and the search execution unit 310 extracts as the search result, for example, the information stored in the database 320 that contains all of the text data used in the search. The information is, for example, text data contained in a website.
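The example search above returns only the stored items that contain every search term, that is, a logical AND over the terms. The sketch below models the database 320 as a plain mapping from a document identifier to its text; the identifiers, sample texts, and the name `search` are hypothetical, and case-insensitive substring matching is an assumption.

```python
from __future__ import annotations


def search(documents: dict[str, str], terms: list[str]) -> list[str]:
    """Return the ids of documents whose text contains every term."""
    if not terms:
        return []
    lowered = [t.lower() for t in terms]
    return [
        doc_id
        for doc_id, text in documents.items()
        if all(t in text.lower() for t in lowered)
    ]


# Hypothetical stand-in for the database 320.
database = {
    "page-1": "Hot day? Stay excited with iced coffee.",
    "page-2": "Hot soup recipes for winter.",
}
# Voice keyword "hot" combined with emotion keyword "excited".
print(search(database, ["hot", "excited"]))  # ['page-1']
```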

The transmission unit 330 transmits the search result extracted by the search execution unit 310 to the output device 130 through a network. For example, the transmission unit 330 transmits the URL of a website extracted by the search execution unit 310, or the text data, image data, and the like contained in that website, to the output device 130 as the search result.

The keyword database 340 stores the voice keywords received through the reception unit 300. In response to a dictionary database update instruction received through the reception unit 300, the estimation unit 350 estimates related words of the voice keywords stored in the keyword database 340. A related word of a voice keyword is a word or phrase frequently used together with the voice keyword; such words are also called "co-occurring words". A co-occurring word is not necessarily a synonym. The estimation unit 350 is, for example, a co-occurring word search tool that applies artificial intelligence. One known tool of this kind performs morphological analysis on the content of web pages that rank highly in the search results of search engines such as Google or Yahoo, and presents words that appear frequently in the same document as co-occurring words. Morphological analysis is the task of breaking natural-language text data that carries no grammatical annotation into a sequence of morphemes (the smallest meaningful units of a language), based on information such as the grammar of the target language and the parts of speech of words held in what is called a dictionary, and of determining the part of speech of each morpheme.
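Co-occurrence counting of the kind described above can be illustrated with a simple document-level count: for each document containing the keyword, every other distinct word in that document is counted once. This sketch assumes the documents have already been split into tokens (morphological analysis is outside its scope) and makes no claim about how any particular tool works; `co_occurring_words` is a hypothetical name.

```python
from __future__ import annotations

from collections import Counter


def co_occurring_words(
    keyword: str,
    documents: list[list[str]],
    top_n: int = 3,
) -> list[str]:
    """Return the words that most often share a document with `keyword`."""
    counts: Counter[str] = Counter()
    for tokens in documents:
        unique = set(tokens)
        if keyword in unique:
            counts.update(unique - {keyword})
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


docs = [
    ["tire", "car", "brake"],
    ["tire", "car", "brake"],
    ["tire", "car", "wheel"],
    ["apple", "car"],
]
print(co_occurring_words("tire", docs, top_n=2))  # ['car', 'brake']
```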

The dictionary database creation unit 360 updates the dictionary database 220 in response to a dictionary database update instruction received through the reception unit 300. The dictionary database creation unit 360 creates the updated dictionary database 220 based on the voice keywords stored in the keyword database 340 and the related words estimated by the estimation unit 350. The updated dictionary database 220 registers as keywords the voice keywords that have been extracted from voice collected by the voice collection unit 200, together with their related words. Keywords registered in the dictionary database 220 before the update that have never been extracted from voice collected by the voice collection unit 200 are deleted from the updated dictionary database 220. Repeating this update process can raise the hit rate of the dictionary database 220.
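The update rule above keeps the keywords that have been matched, adds their related words, and drops everything else. A minimal sketch, with hypothetical names and sample data, follows; the related words are passed in as a precomputed mapping rather than estimated here.

```python
from __future__ import annotations


def rebuild_dictionary(
    old_dictionary: set[str],
    hit_keywords: set[str],
    related_words: dict[str, list[str]],
) -> tuple[set[str], set[str]]:
    """Return (updated dictionary, keywords removed from the old one)."""
    updated = set(hit_keywords)
    for keyword in hit_keywords:
        # Add the related (co-occurring) words estimated for each hit keyword.
        updated.update(related_words.get(keyword, []))
    removed = old_dictionary - updated
    return updated, removed


old = {"tire", "apple"}
hits = {"tire"}
related = {"tire": ["car", "brake"]}
updated, removed = rebuild_dictionary(old, hits, related)
print(sorted(updated))  # ['brake', 'car', 'tire']
print(sorted(removed))  # ['apple']
```

Immediately after such an update the newly added related words have not yet been matched, so whether the hit rate rises depends on how often those related words actually occur in later speech.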

The transmitting unit 330 transmits the updated dictionary database 220 to the transmission device 110. On receiving it, the transmission device 110 replaces the pre-update dictionary database 220 with the updated dictionary database 220.

FIG. 5 is a diagram showing an example of the functional blocks of the output device 130. The output device 130 includes a search result receiving unit 400 and a search result output unit 410. In the output device 130, the search result receiving unit 400 receives the search result transmitted by the search result transmitting unit 330, and the search result output unit 410 outputs the received search result as numerical values, text, video (images), audio, or the like through a display, a speaker, or the like.

An output program is stored in the memory of the output device 130, and the functions of the output device 130 (the search result receiving unit 400 and the search result output unit 410) are realized through cooperation between the hardware resources of the output device 130 and the output program. The output program is read from a computer-readable recording medium into the memory of the output device 130 and executed by the processor of the output device 130.

FIG. 6 is a flowchart showing an example of processing in the search system 100.

First, the voice collection unit 200 collects voice uttered around the transmission device 110 and generates voice data representing that voice (S600). For example, when the voice collection unit 200 collects the utterance "It's hot today, isn't it?" from one or more speakers, it generates voice data for that utterance. Next, the speech keyword extraction unit 210 extracts a speech keyword from the voice data (S601). For example, the speech keyword extraction unit 210 extracts the word "hot" as the speech keyword from the voice data for "It's hot today, isn't it?" Next, the voice feature extraction unit 230 extracts features of the voice from the voice data generated by the voice collection unit 200 (S602). For example, the voice feature extraction unit 230 extracts the strength, speed, and intonation of the voice in the utterance "It's hot today, isn't it?"

Next, the emotion keyword selection unit 240 selects an emotion keyword indicating the speaker's emotion or the atmosphere of the place, based on the voice features extracted by the voice feature extraction unit 230 (S603). For example, the emotion keyword selection unit 240 selects "irritation", stored in the emotion database 250, as the emotion keyword corresponding to the pattern formed by the combination of the strength, speed, and intonation values of the utterance "It's hot today, isn't it?" Next, the transmitting unit 260 transmits the speech keyword extracted by the speech keyword extraction unit 210 and the emotion keyword selected by the emotion keyword selection unit 240 to the search device 120 (S604). The transmitting unit 260 transmits "hot" and "irritation" to the search device 120 as the speech keyword and the emotion keyword, respectively. The transmitting unit 260 also transmits to the search device 120, as environment information, position information indicating "outside Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture", the place where the transmission device 110 is located, and "36°C", the temperature at that place.
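One way to picture the selection in S603 is a table lookup keyed by a pattern of feature levels. The Python sketch below is only an assumption about how such a pattern could be formed: the two-level quantization, the 0.5 threshold, the normalized feature values, the table contents other than "irritation", and all names are hypothetical and do not come from the specification.

```python
# Hypothetical emotion database: pattern of (strength, speed, intonation) levels
# mapped to an emotion keyword. Only "irritation" appears in the specification.
EMOTION_DATABASE = {
    ("high", "high", "high"): "irritation",
    ("low", "low", "low"): "calm",
    ("high", "low", "high"): "surprise",
}


def level(value, threshold=0.5):
    """Quantize a normalized feature value (0.0 to 1.0) into 'high' or 'low'."""
    return "high" if value >= threshold else "low"


def select_emotion_keyword(strength, speed, intonation, default="neutral"):
    """Look up the emotion keyword for the combined feature pattern."""
    pattern = (level(strength), level(speed), level(intonation))
    return EMOTION_DATABASE.get(pattern, default)


# Assumed feature values for the utterance "It's hot today, isn't it?"
emotion_keyword = select_emotion_keyword(strength=0.9, speed=0.8, intonation=0.7)
```

A pattern with no entry falls back to the `default` keyword here; the specification does not say what happens in that case.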

Next, the keyword receiving unit 300 receives the speech keyword, emotion keyword, and environment information transmitted by the transmission device 110 (S605). For example, the keyword receiving unit 300 receives "hot" and "irritation" as the speech keyword and the emotion keyword, respectively, and receives position information indicating "outside Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture" and the temperature "36°C" as environment information. Next, the search execution unit 310 searches the database 320 based on the speech keyword, emotion keyword, and environment information received by the keyword receiving unit 300 (S606). For example, based on these keywords and the environment information, the search execution unit 310 determines that the speaker who uttered "It's hot today, isn't it?" said the word "hot" with the emotion "irritation" outside Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture, at a temperature of 36°C. The search execution unit 310 then searches for "stores offering cold drinks within a 100 m radius of Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture". Next, the search result transmitting unit 330 transmits the search result extracted by the search execution unit 310 to the output device 130 over the network (S607). For example, the search result transmitting unit 330 transmits to the output device 130, as the search result, a list of websites of stores matching "stores offering cold drinks within a 100 m radius of Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture".

Next, the search result receiving unit 400 receives the search result transmitted by the search result transmitting unit 330 (S608). The search result output unit 410 then outputs the received search result through a display, a speaker, or the like (S609). For example, the search result output unit 410 displays, on the display of the output device 130, a list of websites of stores matching "stores offering cold drinks within a 100 m radius of Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture". In this example, if the position information indicates "the building at 1-10-1 Higashi-Kotari, Nagaokakyo City, Kyoto Prefecture" rather than "outside Nagaokakyo Station in Kotari 1-chome, Nagaokakyo City, Kyoto Prefecture", and the temperature is "28°C" rather than "36°C", the search execution unit 310 may determine that the speaker who uttered "It's hot today, isn't it?" said the word "hot" with the emotion "irritation" inside that building at a temperature of 28°C. In that case, the search execution unit 310 may search for "what can control the temperature in the building" and output control of the air conditioning as the search result. The search result output unit 410 then outputs, for example, an instruction to change the set temperature of the air conditioning to "25°C".
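The two outcomes in this example (a store search outdoors, an air-conditioning change indoors) can be sketched as a small rule. This Python sketch is a hypothetical simplification: the specification does not state how the search execution unit 310 weighs location and temperature, so the `indoors` flag, the hard-coded rule, and the result format are all assumptions, and the temperature is deliberately left out of the decision.

```python
def decide_result(speech_keyword, emotion_keyword, indoors):
    """Return a hypothetical search result for the 'hot' + 'irritation' example."""
    if speech_keyword == "hot" and emotion_keyword == "irritation":
        if indoors:
            # Inside a building: propose controlling the air conditioning.
            return {
                "type": "control",
                "target": "air_conditioning",
                "set_temperature_c": 25,
            }
        # Outdoors: search for nearby stores offering cold drinks.
        return {
            "type": "search",
            "query": "stores offering cold drinks",
            "radius_m": 100,
        }
    return {"type": "none"}


outdoor_result = decide_result("hot", "irritation", indoors=False)
indoor_result = decide_result("hot", "irritation", indoors=True)
```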

Next, the flow of the dictionary database update process is described with reference to FIG. 7.
The transmission device 110 creates a dictionary database update instruction at a predetermined trigger (S700) and transmits it to the search device 120 (S701). On receiving the dictionary database update instruction (S702), the search device 120 acquires the speech keywords from the keyword database 340 (S703) and estimates related words for those speech keywords (S704). Next, the search device 120 creates the updated dictionary database 220 based on the speech keywords and their related words (S705) and transmits it to the transmission device 110 (S706). On receiving the updated dictionary database 220 (S707), the transmission device 110 updates the dictionary database 220 by replacing the pre-update dictionary database 220 with the updated one (S708).

Exemplary embodiments of the present invention have been described above. According to the present embodiment, voice is collected and voice data representing the voice is generated; a speech keyword, which is a keyword contained in the voice, is extracted from the voice data; voice features including at least one of the strength, speed, and intonation of the voice are extracted from the voice data; an emotion keyword corresponding to the extracted voice features is selected, based on those features, from a group of emotion keywords stored in advance; and the extracted speech keyword and the selected emotion keyword are transmitted. A search can thus be based on both the words and the emotion extracted from the voice, so that information better suited to the needs of the speaker who uttered the voice can be provided.

In the present embodiment, keywords included in Russell's circumplex model of emotion may be stored as the keyword group. This allows the speaker's emotion to be extracted more accurately.

In the present embodiment, when a word contained in the voice matches one of the words stored in the dictionary database, that word may be extracted as the speech keyword. This reduces the load on the algorithm that extracts speech keywords.
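Dictionary-based extraction of this kind reduces to a membership test per word. The Python sketch below assumes that speech recognition and word segmentation have already produced a list of words; the function name `extract_speech_keywords` and the sample data are hypothetical.

```python
def extract_speech_keywords(words, dictionary):
    """Return the words, in spoken order, that are registered in the dictionary."""
    return [word for word in words if word in dictionary]


# Hypothetical recognized words and dictionary contents.
recognized_words = ["today", "is", "hot"]
dictionary_database = {"hot", "cold", "drink"}

speech_keywords = extract_speech_keywords(recognized_words, dictionary_database)
```

Holding the dictionary as a set makes each lookup a constant-time operation on average, which is one reason a simple match can be lighter than a general keyword-extraction algorithm.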

In the present embodiment, the speech keyword and the selected emotion keyword may be transmitted in response to a predetermined period having elapsed since the voice collection unit collected the voice, the voice strength extracted by the voice feature extraction unit having exceeded a predetermined value, or the speech keyword extraction unit having extracted a predetermined word. This further improves the accuracy of the search.
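The three transmission triggers can be combined into a single predicate, as in the following Python sketch. The concrete values (a 10-second period, a strength threshold of 0.8, the trigger word "hot") and the function name `should_transmit` are hypothetical placeholders for the "predetermined" values that the specification leaves open.

```python
def should_transmit(
    elapsed_s,
    strength,
    extracted_words,
    period_s=10.0,
    strength_threshold=0.8,
    trigger_words=frozenset({"hot"}),
):
    """Return True if any of the three transmission conditions is met."""
    period_elapsed = elapsed_s >= period_s
    strength_exceeded = strength > strength_threshold
    trigger_word_found = any(word in trigger_words for word in extracted_words)
    return period_elapsed or strength_exceeded or trigger_word_found


# Each call below satisfies exactly one condition, except the last, which satisfies none.
by_period = should_transmit(12.0, 0.1, [])
by_strength = should_transmit(1.0, 0.9, [])
by_word = should_transmit(1.0, 0.1, ["hot"])
no_trigger = should_transmit(1.0, 0.1, ["today"])
```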

In the present embodiment, the search result of a predetermined search may be output. This allows the search result to be conveyed to the speaker or reflected in the environment in which the speaker is placed.

Updating the dictionary database 220 based on the speech keywords that were extracted from voice collected by the voice collection unit 200 as matching speech keywords registered in the dictionary database 220, together with their related words, raises the hit rate of the dictionary database 220. The dictionary database 220 can thereby be optimized. Optimizing the dictionary database 220 makes it possible to judge appropriately, based on the voice collected by the voice collection unit 200 (for example, voice actually uttered by customers), whether a product or service is actually popular. Moreover, because the voice collected by the voice collection unit 200 is voice actually collected at the place where the transmission device 110 is located, whether a product or service is actually popular can be judged on the basis of real-world evaluation rather than evaluation on the Internet. In addition, because repeated updates delete unnecessary keywords from the dictionary database 220, the dictionary database 220 needs only a small storage capacity. This allows the transmission device 110 to be made smaller and to consume less power.

The embodiments described above are intended to facilitate understanding of the present invention and are not to be construed as limiting it. The present invention may be modified or improved without departing from its spirit, and the present invention includes equivalents thereof. That is, embodiments to which a person skilled in the art has made appropriate design changes are also included in the scope of the present invention as long as they have the features of the present invention. For example, the elements of each embodiment and their arrangement, materials, conditions, shapes, sizes, and the like are not limited to those illustrated and may be changed as appropriate. Each embodiment is illustrative, and it goes without saying that configurations shown in different embodiments may be partially replaced or combined; these are also included in the scope of the present invention as long as they include its features.

DESCRIPTION OF SYMBOLS 100: search system, 110: transmission device, 120: search device, 130: output device, 200: voice collection unit, 210: speech keyword extraction unit, 220: dictionary database, 230: voice feature extraction unit, 240: emotion keyword selection unit, 250: emotion database, 260: transmitting unit, 300: keyword receiving unit, 310: search execution unit, 320: database, 330: search result transmitting unit, 400: search result receiving unit, 410: search result output unit

Claims

A transmission program for causing a computer to realize:
a voice collection unit that collects voice and generates voice data representing the voice;
a speech keyword extraction unit that extracts, from the voice data, a speech keyword that is a keyword contained in the voice;
a voice feature extraction unit that extracts, from the voice data, features of the voice including at least one of the strength, speed, and intonation of the voice;
an emotion keyword selection unit that selects, based on the features of the voice, an emotion keyword corresponding to the extracted features of the voice from a group of emotion keywords stored in advance; and
a transmitting unit that transmits the extracted speech keyword and the selected emotion keyword.

The transmission program according to claim 1,
further causing the computer to realize an emotion database that stores keywords included in Russell's circumplex model of emotion as the emotion keyword group.

The transmission program according to claim 1 or 2,
further causing the computer to realize a dictionary database in which a plurality of words are stored in advance,
wherein the speech keyword extraction unit extracts a word contained in the voice as the speech keyword when that word matches one of the words stored in the dictionary database.

The transmission program according to any one of claims 1 to 3,
wherein the transmitting unit transmits the speech keyword and the selected emotion keyword in response to a predetermined period having elapsed since the voice collection unit collected the voice.

The transmission program according to any one of claims 1 to 3,
wherein the voice feature extraction unit extracts at least the strength of the voice as a feature of the voice, and
the transmitting unit transmits the speech keyword and the selected emotion keyword in response to the strength of the voice extracted by the voice feature extraction unit having exceeded a predetermined value.

The transmission program according to any one of claims 1 to 3,
wherein the transmitting unit transmits the speech keyword and the selected emotion keyword in response to the speech keyword extraction unit having extracted a predetermined word.

The transmission program according to any one of claims 1 to 6,
further causing the computer to realize an output unit that outputs a search result of the predetermined search.

A transmission device comprising:
a voice collection unit that collects voice and generates voice data representing the voice;
a speech keyword extraction unit that extracts, from the voice data, a speech keyword that is a keyword contained in the voice;
a voice feature extraction unit that extracts, from the voice data, features of the voice including at least one of the strength, speed, and intonation of the voice;
an emotion keyword selection unit that selects, based on the features of the voice, an emotion keyword corresponding to the extracted features of the voice from a group of emotion keywords stored in advance; and
a transmitting unit that transmits the extracted speech keyword and the selected emotion keyword.

A transmission method comprising:
collecting voice and generating voice data representing the voice;
extracting, from the voice data, a speech keyword that is a keyword contained in the voice;
extracting, from the voice data, features of the voice including at least one of the strength, speed, and intonation of the voice;
selecting, based on the features of the voice, an emotion keyword corresponding to the extracted features of the voice from a group of emotion keywords stored in advance; and
transmitting the extracted speech keyword and the selected emotion keyword.

The transmission device according to claim 8,
wherein the transmission device transmits position information of the transmission device.

A search device that executes search processing based on the speech keyword and the emotion keyword according to claim 3, the search device comprising:
a keyword database that stores the speech keyword;
a receiving unit that receives an instruction to update the dictionary database from the computer;
an estimation unit that estimates, in response to the update instruction received through the receiving unit, a related word of the speech keyword stored in the keyword database;
a dictionary database creation unit that creates an updated dictionary database based on the speech keyword stored in the keyword database and the estimated related word; and
a transmitting unit that transmits the updated dictionary database to the computer.