JP2009244814A

JP2009244814A - Portable audio device

Info

Publication number: JP2009244814A
Application number: JP2008094371A
Authority: JP
Inventors: Koichiro Nishino; 幸一郎西野
Original assignee: Aplix Corp
Current assignee: Aplix Corp
Priority date: 2008-03-31
Filing date: 2008-03-31
Publication date: 2009-10-22

Abstract

PROBLEM TO BE SOLVED: To provide a portable audio device capable of reporting information, such as the song name or artist information added to a music data file being reproduced, to a user without requiring the user to view a display device.

SOLUTION: The device comprises a speech synthesis part 107 which generates synthesized speech data from a text character string. When a control part 105 detects a predetermined operation on an operation part 102 by the user during reproduction of a predetermined music data file, it causes the speech synthesis part 107 to generate synthesized speech data from a text character string of the song name and/or artist name added to the predetermined music data file, and causes an audio reproduction part 104 to reproduce the synthesized speech.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a portable audio device.

In recent years, portable audio devices that record music data on a hard disk, flash memory, memory card, or the like and play it back have come into wide use. Such a portable audio device is connected to, for example, a personal computer that has imported music data from a CD or downloaded it via a network, and it records and plays back the music data files transferred from the personal computer. The formats of the music data files played by such portable devices vary by model. Examples include uncompressed audio formats (WAV, AIFF, AU, etc.), formats with lossless compression (FLAC, Monkey's Audio, TTA, Apple Lossless, lossless WMA, etc.), and formats with lossy compression (MP3, Vorbis, lossy WMA, AAC, etc.).

In addition to playing back music data, such a portable audio device usually has a function, like a media player on a personal computer, for reading information such as playing time, song title, and artist name added to a music data file. This function is used to display a list of music data files, or information on the music data being played, on the main body or remote control of the portable audio device.

Japanese Patent Laid-Open No. 2000-305588 discloses a portable audio device of this kind in which a user can add desired data to a music data file and which can display the user-added data in synchronization with playback of the music data file.
JP 2000-305588 A

Conventionally, to learn the song name or artist information of the music data file currently being played on a portable audio device, the user had to look at the display device of the device main body or of the remote control. Consequently, when no remote control with a display device is provided, the user must take out the main body of the portable audio device. Even when such a remote control is provided, the user may be unable to look at its display at a moment's notice, or may find doing so cumbersome.

In view of the above, an object of the present invention is to provide a portable audio device that can inform the user of information such as the song name and artist information added to the music data file being played, without requiring the user to look at a display device provided on the main body or remote control of the portable audio device.

To solve the above problem, a first aspect of the present invention provides a portable audio device comprising:
an operation unit that forms a user interface;
a storage unit that stores music data files;
an audio playback unit that plays back the music data files; and
a control unit that causes the audio playback unit to play back a music data file stored in the storage unit in response to the user operating the operation unit,
the portable audio device further comprising a speech synthesis unit that generates synthesized speech data from a text character string,
wherein, when the control unit detects that the user has performed a predetermined operation on the operation unit during playback of a predetermined music data file, the control unit causes the speech synthesis unit to generate synthesized speech data from the text character string of the song name and/or artist name added to the predetermined music data file, and causes the audio playback unit to play back the synthesized speech.

A second aspect of the present invention provides a portable audio device comprising:
an operation unit that forms a user interface;
a storage unit that stores music data files;
an audio playback unit that plays back the music data files; and
a control unit that causes the audio playback unit to play back a music data file stored in the storage unit in response to the user operating the operation unit,
the portable audio device further comprising a speech synthesis unit that generates synthesized speech data from a text character string,
wherein the control unit causes the speech synthesis unit to generate, in advance, synthesized speech data from the text character strings of the song names and/or artist names added to the music data files stored in the storage unit, and, when the control unit detects that the user has performed a predetermined operation on the operation unit during playback of a predetermined music data file, causes the audio playback unit to play back the synthesized speech data of the song name and/or artist name of that music data file.

In the portable audio device configured as described above, the control unit is preferably configured to play back the synthesized speech data generated from the text character string of the song name and/or artist name simultaneously with the predetermined music data file, without interrupting playback of that file.

The portable audio device according to the first aspect of the present invention has a speech synthesis unit that generates synthesized speech data from a text character string. When it detects that the user has performed a predetermined operation on the operation unit during playback of a predetermined music data file, it causes the speech synthesis unit to generate synthesized speech data from the text character string of the song name and/or artist name added to that music data file, and causes the audio playback unit to play back the synthesized speech. A user of the portable audio device according to the first aspect can therefore hear the synthesized speech of the artist name and/or song name of the music data being played by performing the predetermined operation on the operation unit, and can thus obtain information about the music data being played without looking at a display device.

The portable audio device according to the second aspect of the present invention, like that of the first aspect, has a speech synthesis unit that generates synthesized speech data from a text character string. It causes the speech synthesis unit to generate, in advance, synthesized speech data from the text character strings of the song names and/or artist names added to the music data files stored in the storage unit. When it detects that the user has performed a predetermined operation on the operation unit during playback of a predetermined music data file, it causes the audio playback unit to play back the synthesized speech data of the song name and/or artist name of that music data file. Because the device according to the second aspect generates the synthesized speech data of the song name and/or artist name before the user's operation, it can reduce the system load incurred between the predetermined operation and the playback of the synthesized speech.
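By way of a non-limiting illustration, the pre-generation approach of the second aspect can be sketched as a simple lookup table that is filled before playback. The class name `AnnouncementCache`, the `synthesize` callable, and the file identifiers below are hypothetical and do not appear in this specification; the sketch only shows that the key press triggers a lookup rather than synthesis.

```python
class AnnouncementCache:
    """Pre-generates spoken announcements so a key press only needs a lookup."""

    def __init__(self, synthesize):
        # `synthesize` is a hypothetical callable: text -> list of PCM samples.
        self._synthesize = synthesize
        self._cache = {}

    def pregenerate(self, file_id, title, artist):
        """Run synthesis ahead of time, e.g. when a file is stored on the device."""
        self._cache[file_id] = self._synthesize(f"{title}, {artist}")

    def lookup(self, file_id):
        """Return the stored speech data, or None if none was generated."""
        return self._cache.get(file_id)
```

In such a sketch, `pregenerate` would be called once per stored file, and `lookup` would be called when the predetermined operation is detected during playback.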

Furthermore, in the present invention, if the control unit is configured to play back the synthesized speech data generated from the text character string of the song name and/or artist name simultaneously with the predetermined music data file, without interrupting playback of that file, the synthesized speech of the song name and/or artist name of the music data being played can be provided to the user without interrupting playback of the music data.

The best mode for carrying out the present invention is described below. FIG. 1 is a functional block diagram of a portable audio device 100 according to a first embodiment of the present invention. As shown in FIG. 1, the portable audio device 100 mainly comprises a display unit 101, an operation unit 102, a communication I/F unit 103, an audio playback unit 104, a control unit 105, a storage unit 106, a speech synthesis unit 107, an output I/F unit 108, and an audio output unit 109. Each component is described below.

The display unit 101 is a liquid crystal display device or an organic EL display device provided on the casing of the portable audio device 100. Under the control of the control unit 105 described later, it displays the operating state of the portable audio device 100, a user interface for operating the portable audio device 100, and the like. The user of the portable audio device 100 learns the state of the device and operates it through the user interface shown on the display unit 101. When the portable audio device 100 plays back a video file, the video of that file is displayed on the display unit 101.

The operation unit 102 is an input device such as buttons provided on the casing (not shown) of the portable audio device 100 or on an accessory (for example, on a remote control attached to the earphone cord), or a keypad or the like. When the user operates it, a key event, which is a signal corresponding to the key or button operated, is notified to the control unit 105 described later and is used for various operations and control.

The communication I/F unit 103 is an interface for connecting the portable audio device 100 to a host PC or the like for communication. For example, when the portable audio device 100 and the host PC are connected by cable, the communication I/F unit 103 can be configured according to the USB specification. For a wireless connection, it may be configured to connect via WiFi. Operating under the control of the control unit 105 described later, the communication I/F unit 103 connects communicably to a host device such as a PC and receives music data files from the host device. It also receives operation commands from the host device and changes the contents of the storage unit 106, described later, accordingly. Furthermore, the communication I/F unit 103 may be given a host function so that a slave device connected to the communication I/F unit 103 can be operated from the portable audio device 100.

The audio playback unit 104 decodes the data of music data files created in formats such as WMA, MP3, and AAC, converts the digital signal obtained by decoding into an analog signal, and outputs it to the output I/F unit 108 described later. FIG. 2 shows a specific configuration example of the audio playback unit 104. As shown in FIG. 2, the audio playback unit 104 has a control unit I/F unit 110, a DSP (Digital Signal Processor) 112, a mixer 113, a digital amplifier 114, and a DAC (Digital-to-Analog Converter) 115. The control unit I/F unit 110 serves as the interface with the control unit 105 described later and accepts the commands, parameters, and data that the control unit 105 passes to the audio playback unit 104. The control unit I/F unit 110 may be provided with a buffer that temporarily stores these signals.

The DSP 112 has an arithmetic unit, a control unit, and an address calculation unit suited to audio processing. It receives commands, parameters, and data from the control unit 105 via the control unit I/F unit 110, decodes the data according to the commands and parameters, and outputs the digital signal obtained by decoding. The DSP 112 can decode multiple pieces of sound source data in parallel, and the digital signal obtained from each sound source is output separately. The mixer 113 is provided on the output side of the DSP 112 and mixes the digital signals according to parameters received via the control unit I/F unit 110. The digital amplifier 114 is provided on the output side of the mixer 113 and amplifies the volume of the digital signal from the mixer 113 according to parameters received via the control unit I/F unit 110. The DAC 115 is provided on the output side of the digital amplifier 114. It converts the digital signal output by the digital amplifier 114 into an analog signal, which is output from the audio playback unit 104 to the output I/F unit 108. The audio playback unit 104 is not limited to one that decodes the data of a music data file using the DSP 112 in this way, and may instead be configured to perform decoding in hardware.

The control unit 105 is a functional block configured virtually by executing a predetermined program on a CPU (not shown). It realizes the various functions of the portable audio device 100 by exchanging data and control signals with each functional block of the device. In addition to the ordinary functions of the portable audio device 100, the control unit 105 of this embodiment detects a predetermined operation performed on the operation unit 102 during playback of a music data file, and controls each functional block so as to perform the operation described later and the operations associated with it.

The storage unit 106 is a hard disk drive or flash memory that stores music data files and the like under the control of the control unit 105 and provides the stored information to the control unit 105. A music data file transferred from a host device while the communication I/F unit 103 of the portable audio device 100 is connected to that host device is stored in the storage unit 106, and is later played back by the audio playback unit 104 in response to the user operating the operation unit 102.

The speech synthesis unit 107 performs speech synthesis on a text character string passed from the control unit 105 and returns a digital speech signal corresponding to the text information to the control unit 105. FIG. 3 shows a specific configuration example of the speech synthesis unit 107. As shown in FIG. 3, the speech synthesis unit 107 comprises a control unit I/F unit 121, a language processing unit 122, a word database 124, an acoustic processing unit 123, and an acoustic database 125. The control unit I/F unit 121 serves as the interface with the control unit 105 and accepts the commands, parameters, and text character strings that the control unit 105 passes to the speech synthesis unit 107. It has a buffer that temporarily stores these signals.

The language processing unit 122 and the acoustic processing unit 123 are connected to the control unit I/F unit 121, and the language processing unit 122 can receive commands, parameters, and text character strings from the control unit 105 via the control unit I/F unit 121. The language processing unit 122 converts a received text character string into a phonetic character string, and outputs it, by referring to the word database 124, in which the readings, accents, and the like of typical words and proper nouns are registered. Here, a phonetic character string is a character string that describes the reading, accent, and intonation of a text character string.

The acoustic processing unit 123 is provided on the output side of the language processing unit 122 and receives the phonetic character string output by the language processing unit 122. By referring to the acoustic database 125, in which acoustic data of a person reading out predetermined text character strings is registered, it extracts the acoustic data of the phonemes or phoneme strings corresponding to the received phonetic character string and combines them. It then smooths the combined acoustic data and modifies it so that each phoneme is given pitch and stress according to the accent and intonation information, thereby generating a digital speech signal. The digital speech signal output by the acoustic processing unit 123 is returned to the control unit 105 via the control unit I/F unit 121.
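By way of a non-limiting illustration, the two-stage processing described above (text to phonetic string, then phonetic string to sample data) can be sketched with toy lookup tables. The contents of `WORD_DB` and `ACOUSTIC_DB`, the phoneme labels, and the function names are assumptions made purely for illustration; the smoothing step and per-phoneme pitch control are omitted, and a single amplitude factor stands in for stress.

```python
# Toy stand-ins for the word database 124 and the acoustic database 125.
# All entries are invented for this sketch.
WORD_DB = {"blue": ["b", "l", "uu"], "sky": ["s", "k", "ai"]}
ACOUSTIC_DB = {
    "b": [10, -10], "l": [20, -20], "uu": [30, 30, -30],
    "s": [5, -5], "k": [15, -15], "ai": [25, 25, -25],
}


def to_phonetic(text):
    """Language processing: map each known word to its phoneme sequence."""
    phonemes = []
    for word in text.lower().split():
        # Words missing from the toy database are simply skipped.
        phonemes.extend(WORD_DB.get(word, []))
    return phonemes


def to_pcm(phonemes, emphasis=1.0):
    """Acoustic processing: concatenate stored units, scaling their amplitude."""
    samples = []
    for phoneme in phonemes:
        samples.extend(int(value * emphasis) for value in ACOUSTIC_DB[phoneme])
    return samples
```

A call such as `to_pcm(to_phonetic("Blue Sky"))` chains the two stages in the same order as the language processing unit 122 and the acoustic processing unit 123.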

The speech synthesis unit 107, and hence the language processing unit 122 and the acoustic processing unit 123, can be configured as hardware independent of the control unit 105 or as a combination of hardware and software, but may instead be configured by executing predetermined software on the CPU that realizes the control unit 105. In either case, the word database 124 used by the language processing unit 122 and the acoustic database 125 used by the acoustic processing unit 123 may each be stored in the storage unit 106, or may be stored in a separate storage unit.

The output I/F unit 108 is an interface for outputting the analog audio signal generated by the audio playback unit 104. The output from the audio playback unit 104 is transmitted via the output I/F unit 108 to the audio output unit 109, such as headphones, and is provided to the user from the audio output unit 109 as sound.

FIG. 4 shows an example of the format of a music data file played back by the portable audio device 100 according to this embodiment. Here, an MP3 format music data file with an ID3v1 tag is shown. As shown in FIG. 4, when an ID3v1 tag is added to the MP3 format, the tag data is appended to the end of the music data, which consists of a plurality of frames. In the ID3v1 tag, information such as the title, artist name, and album name is written at predetermined positions. Whereas the ID3v1 tag is fixed-length data of 128 bytes, its successor ID3v2 is variable-length and can carry more information, but the tag is inserted before the music data. Although MP3 format music data is shown here, tag information can be added in substantially the same way in other formats such as WMA, AAC, and ATRAC.
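By way of a non-limiting illustration, reading the fixed-position fields of an ID3v1 tag can be sketched as follows. The field layout used here (a 3-byte "TAG" marker followed by 30-byte title, artist, and album fields, a 4-byte year, a 30-byte comment, and a 1-byte genre) is the commonly documented ID3v1 layout; the function name `parse_id3v1` and the default `latin-1` text encoding are assumptions of this sketch, and files tagged in another encoding would need a different `encoding` argument.

```python
import struct

ID3V1_SIZE = 128  # an ID3v1 tag is a fixed 128 bytes at the end of the file


def parse_id3v1(data: bytes, encoding: str = "latin-1"):
    """Return a dict of ID3v1 text fields, or None if no tag is present."""
    if len(data) < ID3V1_SIZE:
        return None
    tag = data[-ID3V1_SIZE:]
    if tag[:3] != b"TAG":
        return None
    # "TAG", title(30), artist(30), album(30), year(4), comment(30), genre(1)
    _, title, artist, album, year, _comment, _genre = struct.unpack(
        "<3s30s30s30s4s30sB", tag
    )

    def text(field: bytes) -> str:
        # Fields are padded with NUL bytes; some writers pad with spaces.
        return field.split(b"\x00", 1)[0].decode(encoding, errors="replace").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
    }
```

The `title` and `artist` strings returned here correspond to the text character strings that would be passed on for speech synthesis.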

The operation by which the portable audio device 100 according to this embodiment lets the user confirm the title, artist name, and the like of the song being played, without using a display device, is described below with reference to the flowchart of FIG. 5.

First, to play a given song, the user operates the operation unit 102 to instruct playback of a predetermined music data file (ST101). This instruction can take the following form, for example: tag information is read from the music data files stored in the storage unit 106, a list showing the title and artist name of each music data file is displayed on the display unit 101, and the user selects the music data file to be played by operating the operation unit 102. The device can be set so that, when playback of the selected music data file ends, playback of the next song in the list begins.

In response to the user's instruction in ST101, the control unit 105 reads data from the designated music data file and passes it to the audio playback unit 104 (ST102), and the audio playback unit 104 decodes the data, converts it to analog, and starts output (ST103). Here, the control unit 105 can be configured to take successive portions of the music data file, each sized according to the capacity of the buffer of the control unit I/F unit 110 of the audio playback unit 104, and pass them to the audio playback unit 104. The detailed operation of the audio playback unit 104 depends on the implementation, but in the audio playback unit 104 of this embodiment shown in FIG. 2, the data passed from the control unit 105 is first received by the control unit I/F unit 110 and temporarily stored in the buffer. The DSP 112 decodes this data sequentially, generating 16-bit PCM stereo sound data from a format such as WMA, AAC, or ATRAC, and outputs it to the digital amplifier 114 via the mixer 113. The digital amplifier 114 amplifies the data to the volume separately instructed by the control unit 105, and the DAC 115 converts it to analog, so that an analog signal reproducing the music data file is output from the audio playback unit 104.
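By way of a non-limiting illustration, handing the file data over in buffer-sized portions (ST102) can be sketched as a generator. The function name `feed_in_chunks` and the `buffer_size` parameter are hypothetical; an actual implementation would also wait for the buffer to drain before passing the next portion.

```python
def feed_in_chunks(data: bytes, buffer_size: int):
    """Yield successive slices of `data`, each no larger than `buffer_size`."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    for offset in range(0, len(data), buffer_size):
        yield data[offset:offset + buffer_size]
```

Each yielded slice stands for one transfer from the control unit 105 to the buffer of the control unit I/F unit 110.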

After playback of the music data file starts in ST103, the control unit 105 determines whether a predetermined operation has been performed on the operation unit 102 (ST104). If, for example, the control unit 105 accumulates the key events notified from the operation unit 102 in a queue, this determination can be made by referring to the contents of the queue. If it is not determined in ST104 that the predetermined operation has been performed, the control unit 105 waits for a predetermined time (ST110) and then repeats the determination of ST104. If it is determined in ST104 that the predetermined operation has been performed, the operations from ST105 onward are executed. Although the case where the control unit 105 periodically polls for the predetermined operation is shown here, the control unit 105 may instead be configured to detect the predetermined operation by means of an interrupt.
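By way of a non-limiting illustration, the polling loop of ST104 and ST110 can be sketched with a key-event queue. The event name `ANNOUNCE`, the function name, and the `max_polls` bound (added so the sketch terminates) are assumptions of this sketch and are not part of the flowchart.

```python
import queue

ANNOUNCE_KEY = "ANNOUNCE"  # hypothetical name for the predetermined operation


def poll_for_announce(key_events, on_announce, max_polls, wait=lambda: None):
    """Check the key-event queue up to `max_polls` times (ST104/ST110 loop)."""
    for _ in range(max_polls):
        try:
            event = key_events.get_nowait()
        except queue.Empty:
            wait()  # ST110: wait a predetermined time, then check again
            continue
        if event == ANNOUNCE_KEY:
            on_announce()  # ST105 onward
            return True
        # Any other key event is ignored by this sketch.
    return False
```

In an interrupt-driven variant, `on_announce` would instead be invoked directly from the key-event handler, with no polling loop.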

When it is determined in ST104 that the operation has been performed, the control unit 105 first obtains the character strings of the song name and artist name from the tag information of the music data file that the audio playback unit 104 is decoding (ST105). More specifically, the character strings of the song name and artist name may be obtained from the predetermined positions of the ID3 tag provided at the end of the music data file, as shown in FIG. 4, for example. The control unit 105 then passes the obtained character strings of the song name and artist name to the speech synthesis unit 107 (ST106).

On receiving the song name and artist name character strings of the music data file being played by the audio playback unit 104, the speech synthesis unit 107 synthesizes speech data for those character strings and outputs it to the audio playback unit 104 (ST107). The processing in the speech synthesis unit 107 is described in more detail below. In the configuration shown in FIG. 3, for example, the delivered character string is temporarily stored in the buffer of the control unit I/F unit 121 and is first passed to the language processing unit 122. The language processing unit 122 generates a phonetic character string from the text character string of the song name or artist name by referring to dictionary data in which the readings, accents, and the like of typical words and personal names are registered. The generated phonetic character string is then input to the sound processing unit 123. The sound processing unit 123 extracts from an acoustic database the sound data of the phoneme sequences or phonemes corresponding to the phonetic character string, combines them, smooths the combined sound data, and modifies the data so that each phoneme has the pitch and intensity indicated by the accent and intonation information, thereby generating a digital audio signal in 16-bit PCM format. This processing is performed for each of the song name and the artist name, and the resulting digital audio signals are output from the control unit I/F unit 121 to the audio playback unit 104 via the control unit 105.
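
The two stages described above (dictionary lookup to obtain a phonetic string, then unit concatenation with smoothing to obtain 16-bit PCM) can be sketched as a toy pipeline. This is not the synthesis method of the embodiment: the lexicon contents, the sample rate, the unit duration, and the use of sine tones in place of recorded phoneme data are all assumptions chosen to keep the example self-contained.

```python
import math
import struct

SAMPLE_RATE = 8000  # Hz, assumed for this sketch
UNIT_MS = 50        # duration of each phonetic unit, assumed

# Hypothetical dictionary data: word -> list of (unit, relative pitch).
LEXICON = {
    "sakura": [("sa", 1.0), ("ku", 1.2), ("ra", 1.0)],
}


def language_processing(text):
    """Turn text into a list of (unit, pitch) pairs using the lexicon."""
    units = []
    for word in text.lower().split():
        # Unknown words fall back to one neutral unit per character.
        units.extend(LEXICON.get(word, [(ch, 1.0) for ch in word]))
    return units


def unit_waveform(unit, pitch):
    """Stand-in for an acoustic database lookup: a sine tone per unit."""
    base_hz = 150 + sum(map(ord, unit)) % 100
    n = SAMPLE_RATE * UNIT_MS // 1000
    return [math.sin(2 * math.pi * base_hz * pitch * i / SAMPLE_RATE)
            for i in range(n)]


def sound_processing(units):
    """Concatenate unit waveforms, smooth the joins, emit 16-bit PCM bytes."""
    samples = []
    for unit, pitch in units:
        wave = unit_waveform(unit, pitch)
        n = len(wave)
        fade = max(1, n // 8)
        for i, s in enumerate(wave):
            # Linear fade in and out so adjacent units join without clicks.
            gain = min(1.0, (i + 1) / fade, (n - i) / fade)
            samples.append(s * gain)
    pcm = [max(-32768, min(32767, round(s * 0.8 * 32767))) for s in samples]
    return struct.pack("<%dh" % len(pcm), *pcm)  # little-endian int16


def synthesize(text):
    return sound_processing(language_processing(text))
```

Calling `synthesize("Sakura")` returns raw little-endian 16-bit PCM bytes that could be handed to a mixer. A real implementation would replace `unit_waveform` with recorded phoneme data and apply intonation across the whole phrase.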

On receiving this output, the audio playback unit 104 uses the mixer 113 to mix the digital audio signal synthesized by the speech synthesis unit 107 with the digital audio signal that the audio playback unit 104 obtained by decoding the music data file (ST108), then converts the mixed data to analog and outputs it (ST109). The analog-converted signal drives the audio output unit 109 connected via the output I/F unit 108, so that the synthesized speech of the song name and artist name of the music data file is reproduced from the audio output unit 109 at the same time as the music data file is played.
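
The mixing in ST108 can be illustrated with a short sketch that sums two streams of 16-bit samples and clamps the result to the 16-bit range. The function name `mix_pcm16` and the two gain parameters are assumptions for this example; the description does not say whether the mixer 113 attenuates the music while the speech is playing.

```python
import array


def mix_pcm16(music, speech, music_gain=0.5, speech_gain=1.0):
    """Mix two sequences of signed 16-bit samples sample by sample.

    The output has the same length as `music`; once `speech` runs out,
    silence is mixed in. Gains are hypothetical tuning parameters.
    """
    out = array.array("h")  # signed 16-bit samples
    for i, m in enumerate(music):
        s = speech[i] if i < len(speech) else 0
        value = int(m * music_gain + s * speech_gain)
        # Saturate instead of wrapping around on overflow.
        out.append(max(-32768, min(32767, value)))
    return out
```

Saturating arithmetic matters here because a plain 16-bit addition would wrap around on loud passages and produce audible distortion.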

With this operation, when the user wants to know the song name and artist name of the song being played on the portable audio device 100, the user can learn them from the synthesized speech heard from the audio output unit 109 simply by operating the operation unit 102. That is, according to the present invention, the user can be informed of the song name and artist name of the music data file currently being played without having to perform a cumbersome action such as taking out the housing of the portable audio device 100 and looking at the display unit 101. Moreover, because the synthesized speech of the song name and artist name is heard together with the music without stopping playback of the music data file, the synthesized speech can be presented without interrupting the user's enjoyment of the music.

An embodiment of the present invention has been described above. The present invention is not limited to this embodiment, and various improvements and modifications are of course possible without departing from its spirit.

For example, in the above embodiment the synthesized speech of the song name and artist name is superimposed on the playback sound of the music data file, but playback of the music data file may instead be temporarily interrupted while the synthesized speech is provided. Also, in the above embodiment the song name and artist name of the music data file are synthesized after the user performs the predetermined operation on the operation unit 102, but the song names and artist names of the music data files stored in the storage unit 106 may instead be synthesized in advance, without waiting for the user's operation.
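
The pre-synthesis variation can be sketched as a simple cache that is filled before playback and consulted when the user operates the device. The names `presynthesize` and `announcement_for`, the shape of the `library` mapping, and the on-demand fallback for files missing from the cache are all assumptions made for illustration; `synthesize` stands for any function that turns text into speech data.

```python
def presynthesize(library, synthesize):
    """Synthesize speech for every stored file ahead of time.

    `library` maps a file identifier to (song name, artist name).
    Returns a cache mapping the same identifier to the two speech buffers.
    """
    cache = {}
    for track_id, (title, artist) in library.items():
        cache[track_id] = (synthesize(title), synthesize(artist))
    return cache


def announcement_for(track_id, library, cache, synthesize):
    """Return cached speech for a file, synthesizing on demand if missing."""
    if track_id not in cache:
        title, artist = library[track_id]
        cache[track_id] = (synthesize(title), synthesize(artist))
    return cache[track_id]
```

The trade-off is storage space for the cached speech data against lower latency between the user's operation and the start of the announcement.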

FIG. 1 is a functional block diagram of a portable audio device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of the audio playback unit.
FIG. 3 is a functional block diagram of the speech synthesis unit.
FIG. 4 is a diagram showing the structure of a music data file.
FIG. 5 is a flowchart of the operation of the portable audio device.

Explanation of symbols

100 Portable audio device
101 Display unit
102 Operation unit
103 Communication I/F unit
104 Audio playback unit
105 Control unit
106 Storage unit
107 Speech synthesis unit
108 Output I/F unit

Claims

A portable audio device comprising:
an operation unit constituting a user interface;
a storage unit that stores music data files;
an audio playback unit that plays back the music data files; and
a control unit that causes the audio playback unit to play back a music data file stored in the storage unit in response to a user operating the operation unit,
the portable audio device further comprising a speech synthesis unit that generates synthesized speech data from a text character string,
wherein, when the control unit detects that the user has performed a predetermined operation on the operation unit during playback of a predetermined music data file, the control unit causes the speech synthesis unit to generate synthesized speech data from the text character string of the song name and/or artist name added to the predetermined music data file, and causes the audio playback unit to play back the synthesized speech data.

A portable audio device comprising:
an operation unit constituting a user interface;
a storage unit that stores music data files;
an audio playback unit that plays back the music data files; and
a control unit that causes the audio playback unit to play back a music data file stored in the storage unit in response to a user operating the operation unit,
the portable audio device further comprising a speech synthesis unit that generates synthesized speech data from a text character string,
wherein the control unit causes the speech synthesis unit to generate, in advance, synthesized speech data from the text character string of the song name and/or artist name added to a music data file stored in the storage unit, and, when the control unit detects that the user has performed a predetermined operation on the operation unit while a predetermined music data file is being played back, causes the audio playback unit to play back the synthesized speech data of the song name and/or artist name of that music data file.

The portable audio device according to claim 1 or 2, wherein the control unit causes the synthesized speech data generated from the text character string of the song name and/or artist name to be played back simultaneously, without interrupting playback of the predetermined music data file.