JP2015210502A

JP2015210502A - Voice data output device and voice output system

Info

Publication number: JP2015210502A
Application number: JP2014094114A
Authority: JP
Inventors: 昌幸加納; Masayuki Kano; 俊成縣; Toshinari Agata
Original assignee: Murata Machinery Ltd
Current assignee: Murata Machinery Ltd
Priority date: 2014-04-30
Filing date: 2014-04-30
Publication date: 2015-11-24

Abstract

PROBLEM TO BE SOLVED: To provide a voice data output device that avoids wasting storage capacity on voice data. SOLUTION: A voice data output device includes: an event detection part 10 that detects an event occurring in the device itself or in another device; a text storage part 12 that stores text data corresponding to an event that was detected by the event detection part 10 and is to be notified; a voice conversion part 13 that converts the text data stored in the text storage part 12 into voice data; a transmission part 14 that transmits the voice data converted by the voice conversion part 13 to an event notification destination device 2, which outputs voice using the voice data; and a determination part 15 that determines whether the event notification destination device 2 is in a state capable of acquiring the voice data. When the determination part 15 determines that the event notification destination device 2 is in a state capable of acquiring the voice data, the voice conversion part 13 converts the text data into voice data and the transmission part 14 transmits the voice data.

Description

The present invention relates to a voice data output device and a voice output system that output voice data corresponding to an event to an event notification destination device (such as a personal computer).

In a system having an event notification destination device (such as a personal computer) and a notification device that notifies the event notification destination device of events, there is a known technique in which various events occurring in a device such as an image forming device are notified to the event notification destination device, and the event notification destination device informs the user of the event using voice data (stored in advance) that corresponds to the notified event.

Although they belong to a technical field different from that of the present application, Patent Documents 1 and 2 are cited as reference information. Patent Document 1 describes slave units, each placed in one of a plurality of dwelling units of an apartment building, that detect an event; the event is notified to a master unit, and the master unit generates and outputs voice data. Patent Document 2 describes converting voice into text data by voice recognition in a voice memo function (a function for recording the content of a conversation during a call) provided in a mobile phone.

Patent Document 1: JP 2007-172404 A
Patent Document 2: JP-A-11-168552

However, when the event notification destination device is not activated, the event is notified only after that device has been activated. If a voice data output device is provided that outputs voice data generated by speech synthesis for an event to the event notification destination device, the generated voice data must be held in a storage unit until it can be output to the event notification destination device, and storage capacity may be consumed wastefully.

The present invention has been made in view of this problem, and its object is to provide a voice data output device and a voice output system that avoid storage capacity being wastefully occupied by voice data and thus allow the storage capacity to be used effectively.

To achieve this object, the present invention takes the following measures.

That is, the voice data output device of the present invention includes: an event detection unit that detects an event occurring in the device itself or in another device; a text storage unit that stores text data corresponding to an event that was detected by the event detection unit and is to be notified; a voice conversion unit that converts the text data stored in the text storage unit into voice data; a transmission unit that transmits the voice data converted by the voice conversion unit to an event notification destination device that outputs voice using the voice data; and a determination unit that determines whether the event notification destination device is in a state in which it can acquire the voice data. When the determination unit determines that the event notification destination device is in a state in which it can acquire voice data, the voice conversion unit converts the text data into voice data and the transmission unit transmits the voice data.

According to this configuration, the text data corresponding to an event is converted into voice data and transmitted to the event notification destination device when that device is in a state in which it can acquire voice data. This avoids generating voice data, and thereby wastefully occupying storage capacity, while the event notification destination device cannot acquire voice data, so the storage capacity can be used effectively. The benefit is greater when voice is output in a variety of voice modes, because wasteful occupation of storage capacity is avoided for every mode.
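The deferred conversion described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name, the method names, and the `fake_synthesize` stand-in for a speech synthesizer are all assumptions made for illustration. The point it shows is that only text is held while the destination is unavailable, and voice data is produced at the moment the destination is judged able to acquire it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a real text-to-speech engine."""
    return text.encode("utf-8")


@dataclass
class VoiceDataOutputDevice:
    synthesize: Callable[[str], bytes] = fake_synthesize
    # Text storage: only compact text is held while the destination is unavailable.
    pending_text: Dict[str, List[str]] = field(default_factory=dict)

    def on_event(self, destination: str, text: str) -> None:
        """Record the text for an event to be notified; no voice data is created yet."""
        self.pending_text.setdefault(destination, []).append(text)

    def on_output_request(self, destination: str) -> List[bytes]:
        """An output request indicates the destination can acquire voice data,
        so the stored text is converted now and returned for transmission."""
        texts = self.pending_text.pop(destination, [])
        return [self.synthesize(t) for t in texts]


device = VoiceDataOutputDevice()
device.on_event("pc-1", "Paper jam in tray 2")
voice = device.on_output_request("pc-1")
```

In this sketch the storage cost before the request is the length of the text, not the size of synthesized audio, which is the saving the configuration aims at.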

To determine accurately whether the event notification destination device can acquire voice data, the determination unit preferably determines whether the device is in a state in which it can acquire the voice data by determining whether a voice data output request has been received from the event notification destination device.

As another determination method, the determination unit may determine whether the event notification destination device is in a state in which it can acquire the voice data by determining whether that device is activated.

As yet another determination method, where the event notification destination device can transition between a normal state, which is a predetermined power state, and an energy saving state, which is a power state with lower power consumption than the normal state, the determination unit may determine whether the device is in a state in which it can acquire the voice data by determining whether the device is in the energy saving state.

To secure storage capacity reliably, the device preferably includes a deletion unit that deletes voice data temporarily stored in a storage unit. The voice data converted by the voice conversion unit is temporarily stored in the storage unit until the transmission unit finishes transmitting it to the event notification destination device, and the deletion unit deletes the voice data after the transmission is complete.

To improve convenience, the voice conversion unit is preferably able to generate voice data in a plurality of voice modes, and the device preferably includes a voice mode information storage unit that stores voice mode information associating the event notification destination device, the user, the event, or a combination of these with the voice mode to be used for the voice data. The voice conversion unit then generates voice data in the voice mode corresponding to the voice mode information.

The voice output system of the present invention includes an image processing device in which various events relating to image processing occur, and the above voice data output device, whose event detection unit detects events occurring in the image processing device. The image processing device and the voice data output device are connected to each other via a network.

With the configuration described above, the present invention avoids generating voice data, and thereby wastefully occupying storage capacity, while the event notification destination device cannot acquire voice data, so the storage capacity can be used effectively. The benefit is greater when voice is output in a variety of voice modes.

FIG. 1 is a diagram showing the outline and operation of a system including a voice data output device according to an embodiment of the present invention.
FIG. 2A is a diagram showing an example of the text data on which voice data is based.
FIG. 2B is a diagram of a table showing voice mode information.
FIG. 3 is a sequence diagram showing the operation of each device.
FIG. 4 is a sequence diagram showing the operation of each device in another embodiment of the present invention.

A voice data output device and a voice output system according to an embodiment of the present invention are described below with reference to the drawings.

As schematically shown in FIG. 1, the voice data output device 1 is connected to a network (NW) together with one or more information processing devices 2, such as personal computers, and one or more image processing devices 3, such as multifunction peripherals or printers, in which various events relating to image processing occur. The image processing device 3 and the voice data output device 1 are connected so that they can communicate with each other via the network (NW). Examples of the network (NW) include a wired LAN (Local Area Network) or a wireless LAN using IP (Internet Protocol). The voice output system includes the voice data output device 1 and the image processing device 3.

An outline of the basic operation of the system is as follows. As shown in FIG. 1, the voice data output device 1 monitors the image processing device 3, which is the event source (see (1) in the figure), and detects events occurring in the image processing device 3 (see (2) in the figure). Alternatively, when an event occurs in the image processing device 3, that device actively notifies the voice data output device 1 (see (2) in the figure). The voice data output device 1 notifies the information processing device 2, which is the event notification destination device, that an event exists (see (3) in the figure). The information processing device 2 transmits a voice data output request to the voice data output device 1 (see (4) in the figure). The voice data output device 1 generates text data and converts it into voice data (see (5) in the figure). The voice data output device 1 transmits the voice data to the information processing device 2 (see (6) in the figure). The information processing device 2 outputs voice from a speaker using the voice data (see (7) in the figure). The timing of text generation in (5) can, of course, be changed as appropriate for the implementation.

To realize the above operation, the voice data output device 1 includes an event detection unit 10, an event storage unit 11, a text storage unit 12, a voice conversion unit 13, a transmission unit 14, a determination unit 15, a deletion unit 16, and a voice mode information storage unit 17. These units 10 to 17 are realized through the cooperation of software and hardware: in a computer that includes a CPU, memory (ROM, RAM), and a network interface (none of which are shown), the CPU executes a predetermined program stored in the memory.

The event detection unit 10 detects events occurring in another device (the image processing device 3). Specifically, it may be configured to monitor the image processing device 3 and obtain information on events that have occurred, or the image processing device 3 may be configured to report events on its own initiative. In the present embodiment, the voice data output device 1 and the image processing device 3 that is the event source are separate devices; if both are implemented in a single device, the event detection unit 10 detects events occurring in the device itself.

The event storage unit 11 stores data on the events detected by the event detection unit 10. In the present embodiment, the event storage unit 11 stores data on all detected events, regardless of whether they are to be notified. The event storage unit 11 may, of course, be omitted depending on the implementation.

The text storage unit 12 stores text data corresponding to events that were detected by the event detection unit 10 and are to be notified. In the present embodiment, an event to be notified is an event requested by the event notification destination device 2 (information processing device 2). A notification determination unit (not shown) decides, according to settings, which of the events detected by the event detection unit 10 are treated as "events to be notified". In the present embodiment, the "events to be notified" are decided on the basis of settings made in advance, and the text data corresponding to those events is stored in the text storage unit 12. Alternatively, every event detected by the event detection unit 10 may be treated as an event to be notified, with text data for all events stored in the text storage unit 12.
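The selection of "events to be notified" can be pictured with a small Python sketch. The event codes, message texts, and the `select_texts` helper are hypothetical and are not taken from the source; the sketch only shows a settings-driven filter placed between event detection and text storage.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical event codes and message texts.
MESSAGES: Dict[str, str] = {
    "PAPER_JAM": "A paper jam has occurred.",
    "TONER_LOW": "Toner is running low.",
    "JOB_DONE": "Printing has finished.",
}


def select_texts(detected: Iterable[str], notify: Set[str]) -> List[str]:
    """Return message texts only for detected events that the settings mark for notification."""
    return [MESSAGES[code] for code in detected if code in notify and code in MESSAGES]


# Settings-based selection: only two of the three detected events are kept.
texts = select_texts(
    ["JOB_DONE", "TONER_LOW", "PAPER_JAM"],
    notify={"PAPER_JAM", "JOB_DONE"},
)
```

Passing `notify=set(MESSAGES)` corresponds to the alternative configuration in which every detected event is treated as an event to be notified.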

FIG. 2A shows an example of the text data stored in the text storage unit 12. Here, nine pieces of text data A to H are illustrated. In the example of FIG. 2A, each piece of text data indicates the text to be converted into voice and a voice mode. The voice mode indicates a language and a dialect.

The voice conversion unit 13 converts the text data stored in the text storage unit 12 into voice data. The converted voice data is temporarily stored in a storage unit (not shown). In the present embodiment, the voice conversion unit 13 is configured to generate voice data in a plurality of voice modes. A "voice mode" here means a combination of voice characteristics, for example the language (such as Japanese or English), the dialect (such as standard Japanese or Kansai dialect), a male or female voice, and the speaking speed. If the voice modes differ, the resulting voice differs even when the text or the event is the same.

The transmission unit 14 transmits the voice data converted by the voice conversion unit 13 to the event notification destination device 2 (information processing device 2), which outputs voice using the voice data.

The determination unit 15 determines whether the event notification destination device 2 is in a state in which it can acquire voice data. Specifically, it makes the determination according to whether a voice data output request has been received from the event notification destination device 2. If such an output request has been received, the determination unit 15 determines that the requesting device 2 is in a state in which it can acquire voice data.

The deletion unit 16 deletes the voice data temporarily stored in the storage unit, based on the operation of the transmission unit 14. Specifically, the voice data converted by the voice conversion unit 13 is temporarily stored in the storage unit until the transmission unit 14 finishes transmitting it to the event notification destination device 2. After the transmission by the transmission unit 14 is complete, the deletion unit 16 deletes the voice data.
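The store-until-sent, then-delete behavior can be sketched in Python as follows. The function name `send_and_delete` and the use of a temporary file are assumptions for illustration; the source does not say where the temporary storage resides. Note also that this sketch deletes the file even when sending fails, a choice that goes beyond what the source describes (the text data would still be available for a later conversion).

```python
import os
import tempfile
from typing import Callable


def send_and_delete(voice_data: bytes, send: Callable[[bytes], None]) -> str:
    """Hold voice data in a temporary file until `send` returns, then delete it.

    Returns the temporary path so callers can confirm the file is gone.
    """
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        # Temporary storage of the converted voice data.
        with os.fdopen(fd, "wb") as f:
            f.write(voice_data)
        # Transmission to the event notification destination device.
        with open(path, "rb") as f:
            send(f.read())
    finally:
        # Deletion step: free the storage once sending has finished
        # (assumption: also on failure, so no stale audio is left behind).
        os.remove(path)
    return path


sent = []
temp_path = send_and_delete(b"RIFF....", sent.append)
```

Here `sent.append` stands in for a real network send; any callable that accepts the bytes would do.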

The voice mode information storage unit 17 stores voice mode information that associates the event notification destination device, the user, the event, or a combination of these with the voice mode to be used for the voice data. The voice mode information makes it possible to specify a voice mode for each event notification destination device, each user, each event, or each combination of these. The user can change the voice mode information as appropriate. The voice conversion unit 13 generates (converts) voice data in the voice mode corresponding to the voice mode information.

FIG. 2B shows an example of the voice mode information stored in the voice mode information storage unit 17. In the example of FIG. 2B, a voice mode is associated with each combination of user and event. In the present embodiment, user information is acquired when a voice data output request is received from the information processing device 2. Using the acquired user information and the event stored in the event storage unit 11 as keys, the voice mode information is consulted to identify the voice mode; the text data is then uniquely identified and stored in the text storage unit 12.
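The two-stage lookup described above (user and event select a voice mode, then event and voice mode select the text) might look like the following Python sketch. The table contents, user names, event code, and placeholder texts are all hypothetical; the actual tables of FIG. 2A and FIG. 2B are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VoiceMode:
    language: str
    dialect: str


# Voice mode information: (user, event) -> voice mode. Entries are hypothetical.
VOICE_MODE_INFO: Dict[Tuple[str, str], VoiceMode] = {
    ("user_a", "PAPER_JAM"): VoiceMode("ja", "standard"),
    ("user_b", "PAPER_JAM"): VoiceMode("ja", "kansai"),
    ("user_c", "PAPER_JAM"): VoiceMode("en", "standard"),
}

# Text data: (event, voice mode) -> text to convert. Texts are placeholders.
TEXT_DATA: Dict[Tuple[str, VoiceMode], str] = {
    ("PAPER_JAM", VoiceMode("ja", "standard")): "<paper jam message, standard Japanese>",
    ("PAPER_JAM", VoiceMode("ja", "kansai")): "<paper jam message, Kansai dialect>",
    ("PAPER_JAM", VoiceMode("en", "standard")): "A paper jam has occurred.",
}


def resolve_text(user: str, event: str) -> Tuple[VoiceMode, str]:
    """Identify the voice mode from (user, event), then the unique text for that mode."""
    mode = VOICE_MODE_INFO[(user, event)]
    return mode, TEXT_DATA[(event, mode)]


mode, text = resolve_text("user_b", "PAPER_JAM")
```

A frozen dataclass is used so that a voice mode can serve as a dictionary key; a real voice mode could carry further attributes such as voice gender or speaking speed.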

Next, a simple operation of the system is described with reference to FIG. 3. First, in step (1), the event detection unit 10 of the voice data output device 1 monitors the information processing device 2 and detects an event. When an event is detected, data on the event is stored in the event storage unit 11. In the next step (2), when the voice data output device 1 receives a voice data output request (event confirmation) from the information processing device 2, the determination unit 15 determines that the event notification destination device 2 (information processing device 2) is in a state in which it can acquire voice data. The corresponding voice mode is then identified from the voice mode information, the text data is uniquely identified, and that text data is stored in the text storage unit 12. In the next step (3), the voice conversion unit 13 converts the text data stored in the text storage unit 12 into voice data in the specified voice mode, and the voice data is temporarily stored in the storage unit. In the next step (4), the transmission unit 14 transmits the converted voice data to the information processing device 2 (the event notification destination device). When the transmission is complete, the deletion unit 16 deletes the voice data. In the next step (5), the information processing device 2 outputs voice using the voice data.

As described above, the voice data output device 1 of the present embodiment includes: the event detection unit 10, which detects an event occurring in the device itself or in another device; the text storage unit 12, which stores text data corresponding to an event that was detected by the event detection unit 10 and is to be notified; the voice conversion unit 13, which converts the text data stored in the text storage unit 12 into voice data; the transmission unit 14, which transmits the voice data converted by the voice conversion unit 13 to the event notification destination device 2, which outputs voice using the voice data; and the determination unit 15, which determines whether the event notification destination device 2 is in a state in which it can acquire voice data. When the determination unit 15 determines that the event notification destination device 2 is in a state in which it can acquire voice data, the voice conversion unit 13 converts the text data into voice data and the transmission unit 14 transmits the voice data.

According to this configuration, the text data corresponding to an event is converted into voice data and transmitted to the event notification destination device 2 when that device can acquire voice data. This avoids generating voice data, and thereby wastefully occupying storage capacity, while the event notification destination device 2 cannot acquire voice data, so the storage capacity can be used effectively. The benefit is greater when voice is output in a variety of voice modes.

In the present embodiment, the determination unit 15 determines whether the event notification destination device 2 is in a state in which it can acquire voice data by determining whether a voice data output request has been received from that device. This configuration makes it possible to determine accurately whether voice data can be acquired.

The present embodiment includes the deletion unit 16, which deletes voice data temporarily stored in the storage unit. The voice data converted by the voice conversion unit 13 is temporarily stored in the storage unit until the transmission unit 14 finishes transmitting it to the event notification destination device 2, and the deletion unit 16 deletes the voice data after the transmission is complete.

According to this configuration, the deletion unit 16 deletes the voice data appropriately, so storage capacity can be secured reliably.

In the present embodiment, the voice conversion unit 13 can generate voice data in a plurality of voice modes. The device includes the voice mode information storage unit 17, which stores voice mode information associating the event notification destination device, the user, the event, or a combination of these with the voice mode to be used for the voice data, and the voice conversion unit 13 generates voice data in the voice mode corresponding to the voice mode information.

According to this configuration, events can be notified by voice in a voice mode that matches the user's preference, which improves convenience.

An example of a system using the above voice data output device includes the image processing device 3, in which various events relating to image processing occur, and the voice data output device 1, whose event detection unit 10 detects events occurring in the image processing device 3, with the image processing device 3 and the voice data output device 1 connected to each other via a network.

Although embodiments of the present invention have been described above with reference to the drawings, the specific configuration should not be considered limited to these embodiments. The scope of the present invention is indicated not only by the above description of the embodiments but also by the claims, and includes all modifications within the meaning and scope equivalent to the claims.

For example, the determination unit 15 may be configured to determine whether the event notification destination device 2 is in a state in which it can acquire voice data by determining whether that device is activated. Specifically, as shown in FIG. 4, when the voice data output device 1 transmits a ping packet conforming to ICMP (Internet Control Message Protocol) to the information processing device 2, a device 2 that is activated returns a reply, so whether the event notification destination device 2 is activated can be determined.
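A ping-based activation check of this kind could be sketched in Python by calling the operating system's `ping` command. This is only one possible approach and an assumption of this sketch: the source does not say how the ping packet is sent, the helper names are hypothetical, and both the count flag and the exit-status conventions of `ping` vary between platforms.

```python
import platform
import subprocess
from typing import Callable, List


def ping_command(host: str) -> List[str]:
    """Build a one-packet ping command; the count flag differs by OS."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def is_activated(
    host: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Treat a successful ping (exit status 0) as 'the destination device is activated'."""
    try:
        result = run(
            ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply in time, or the ping command is unavailable.
        return False
    return result.returncode == 0
```

The `run` parameter exists so that the check can be exercised without network access, for example `is_activated("192.0.2.10", run=fake_run)` with a stub that returns a `CompletedProcess`. A host that drops ICMP echo requests would appear inactive under this check even when it is running.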

Further, where the event notification destination device 2 (information processing device 2) can transition between a normal state, which is a predetermined power state, and an energy saving state, which is a power state with lower power consumption than the normal state, the determination unit 15 may be configured to determine whether the device is in a state in which it can acquire voice data by determining whether the event notification destination device 2 is in the energy saving state.

The specific configuration of each unit is not limited to the embodiments described above, and various modifications are possible without departing from the spirit of the present invention.

DESCRIPTION OF SYMBOLS
1 ... Voice data output device
2 ... Event notification destination device (information processing device)
3 ... Image processing device
10 ... Event detection unit
12 ... Text storage unit
13 ... Voice conversion unit
14 ... Transmission unit
15 ... Determination unit
16 ... Deletion unit
17 ... Voice mode information storage unit

Claims

1. A voice data output device comprising:
an event detection unit that detects an event occurring in the device itself or in another device;
a text storage unit that stores text data corresponding to an event that is detected by the event detection unit and is to be notified;
a voice conversion unit that converts the text data stored in the text storage unit into voice data;
a transmission unit that transmits the voice data converted by the voice conversion unit to an event notification destination device that outputs voice using the voice data; and
a determination unit that determines whether the event notification destination device is in a state in which it can acquire the voice data,
wherein, when the determination unit determines that the event notification destination device is in a state in which it can acquire the voice data, the voice conversion unit converts the text data into the voice data and the transmission unit transmits the voice data.

2. The voice data output device according to claim 1, wherein the determination unit determines whether the event notification destination device is in a state in which it can acquire the voice data by determining whether a voice data output request has been received from the event notification destination device.

3. The voice data output device according to claim 1, wherein the determination unit determines whether the event notification destination device is in a state in which it can acquire the voice data by determining whether the event notification destination device is activated.

4. The voice data output device according to claim 1, wherein the event notification destination device can transition between a normal state, which is a predetermined power state, and an energy saving state, which is a power state with lower power consumption than the normal state, and
the determination unit determines whether the event notification destination device is in a state in which it can acquire the voice data by determining whether the event notification destination device is in the energy saving state.

5. The voice data output device according to any one of claims 1 to 4, further comprising a deletion unit that deletes voice data temporarily stored in a storage unit,
wherein the voice data converted by the voice conversion unit is temporarily stored in the storage unit until the transmission unit finishes transmitting it to the event notification destination device, and the deletion unit deletes the voice data after the transmission is complete.

6. The voice data output device according to claim 1, wherein the voice conversion unit can generate voice data in a plurality of voice modes,
the voice data output device further comprises a voice mode information storage unit that stores voice mode information associating the event notification destination device, a user, an event, or a combination thereof with a voice mode to be used for the voice data, and
the voice conversion unit generates voice data in the voice mode corresponding to the voice mode information.

7. A voice output system comprising:
an image processing device in which various events relating to image processing occur; and
the voice data output device according to any one of claims 1 to 6, wherein the event detection unit detects an event occurring in the image processing device,
wherein the image processing device and the voice data output device are connected to each other via a network.