JP4158356B2

JP4158356B2 - Information terminal device and program

Info

Publication number: JP4158356B2
Application number: JP2001168435A
Authority: JP
Inventors: 匠山田
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2001-06-04
Filing date: 2001-06-04
Publication date: 2008-10-01
Anticipated expiration: 2021-06-04
Also published as: JP2002366179A

Description

【０００１】
【発明の属する技術分野】
本発明は、音声データを画像データと対応付けて記録する情報端末装置、及びプログラムに関する。
【０００２】
【従来の技術】
近年、音楽のデジタルデータ化が進み、インターネット経由での音楽のダウンロードをはじめとして、ユーザがパソコン上で音声データを取り扱うことが増えてきた。従来、音声データを管理する機能として、音声データに音符やスピーカ等が表示された画像データを対応付けることによってパソコンの画面上で音声データであることを視覚的に区別できるようにし、当該画像データをマウスでクリックする等の簡単な操作で音声データが再生できる音声スタンプが利用されている。
【０００３】
通常、音声スタンプは録音、受信メールからの音声登録時に自動的にデフォルトの画像データが対応づけられており、ユーザがこの画像データを別の画像データに変更する機能を有している。画像データの変更は、ユーザが入力部を介して任意の画像データを選択することにより行い、選択された画像データには音声スタンプであることを示す音符が合成されている。
【０００４】
【発明が解決しようとする課題】
しかしながら、従来の音声スタンプでは、画像データの変更はできるものの、既存の音声データの音質を変化させる等の加工をしたり、加工の種類と度合いに応じて画像データを変化させる機能がなかった。このため、ユーザが思い通りに音声データの加工を行って、加工の種類と度合いを視覚的に識別可能な音声スタンプを作成することはできなかった。
【０００５】
本発明の課題は、音声データの加工の種類と加工度に対応した画像データを作成、表示することにより、ユーザが簡単に音声データを加工できるようにするとともに、音声データの加工内容を容易に知ることができるようにすることである。
【０００６】
【課題を解決するための手段】
本発明は、上記課題を解決するため、以下の様な特徴を備えている。以下に示す手段の説明において、実施の形態に対応する構成を括弧内に例示する。なお、符号は後記の図面参照符号に対応する。
【０００７】
請求項１記載の発明は、
音声データを画像データと対応付けて記録する情報端末装置（例えば、図１の情報端末装置１）において、
録音した音声データ又は録音されていた音声データに対応させる第１の画像データを選択する画像選択手段（例えば、図７のステップＳ１０を実行するための入力部１２）と、
前記音声データを加工する為の加工の種類に対応した第２の画像データを選択することにより、加工の種類を選択する加工種類選択手段（例えば、図７のステップＳ１３を実行するための入力部１２）と、
前記加工種類選択手段により選択された加工の種類に従って、前記録音した音声データ又は録音されていた音声データを加工する音声加工手段（例えば、図７のステップＳ１４を実行するためのＣＰＵ１１）と、
前記音声加工手段により加工された音声データに、前記第１の画像データと前記第２の画像データを合成した画像データを対応付けて記録する合成画像記録手段（例えば、図７のステップＳ１６〜ステップＳ１８を記憶装置１７に実行させるためのＣＰＵ１１）と、
を備えたことを特徴としている。
【０００８】
請求項１記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データを加工する為の加工の種類に対応した第２の画像データを選択することにより、加工の種類を選択する。そして、選択された加工の種類に従って、音声データを加工し、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、ユーザは音声データに対して思い通りの加工をすることができ、また、加工の種類に応じた画像データを表示することにより、音声データの加工内容を容易に知ることができるようになる。
【０００９】
請求項２記載の発明は、
音声データを画像データと対応付けて記録する情報端末装置（例えば、図１の情報端末装置１）において、
録音した音声データ又は録音されていた音声データに対応させる第１の画像データを選択する第１画像選択手段（例えば、図１５のステップＳ３０を実行するための入力部１２）と、
前記音声データの加工の種類を選択する加工種類選択手段（例えば、図１５のステップＳ３２を実行するための入力部１２）と、
前記加工種類選択手段により選択された加工の種類に対応する第２の画像データを選択する第２画像選択手段（例えば、図１５のステップＳ３３を実行するための入力部１２）と、
前記音声データの加工度を指定する加工度指定手段（例えば、図１５のステップＳ３４を実行するための入力部１２）と、
前記加工種類選択手段により選択された加工の種類と、前記加工度指定手段により指定された加工度とに従って、前記録音した音声データ又は録音されていた音声データを加工する音声加工手段（例えば、図１５のステップＳ３５を実行するためのＣＰＵ１１）と、
前記音声加工手段により加工された音声データに、前記第１の画像データと前記第２の画像データを合成した画像データを対応付けて記録する合成画像記録手段（例えば、図１５のステップＳ３７〜ステップＳ３９を記憶装置１７に実行させるためのＣＰＵ１１）と、
を備えたことを特徴としている。
【００１０】
請求項２記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データの加工の種類を選択し、選択された加工の種類に対応する第２の画像データを選択した後、音声データの加工度を指定する。そして、選択された加工の種類と指定された加工度に従って、音声データを加工し、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、ユーザは音声データに対して思い通りの加工をすることができ、また、加工の種類に応じた画像データを表示することにより、音声データの加工内容を容易に知ることができるようになる。
【００１１】
請求項４記載の発明は、
音声データを画像データと対応付けて記録する情報端末装置（例えば、図１の情報端末装置１）において、
録音した音声データ又は録音されていた音声データに対応させる第１の画像データを選択する第１画像選択手段（例えば、図２８のステップＳ７０を実行するための入力部１２）と、
前記録音した音声データ又は録音されていた音声データを解析する音声解析手段（例えば、図２８のステップＳ７１を実行するためのＣＰＵ１１）と、
前記音声解析手段による解析結果に基づいて判定された加工の種類に対応する画像データ又は選択指定された新規の画像データを、第２の画像データとして選択する第２画像選択手段（例えば、図２８のステップＳ７３〜Ｓ７５を実行するための入力部１２）と、
前記録音した音声データ又は録音されていた音声データに、前記第１の画像データと前記第２の画像データを合成した画像データを対応付けて記録する合成画像記録手段（例えば、図２８のステップＳ７６〜ステップＳ７８を記憶装置１７に実行させるためのＣＰＵ１１）と、
を備えたことを特徴としている。
【００１２】
請求項４記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データを解析し、解析した結果に基づいて判定された加工の種類に対応する画像データ、又は選択指定された新規の画像データを第２の画像データとして選択し、前記音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、音声データの内容が視覚的に識別可能となり、その結果、ユーザは音声データの加工内容を一目で把握できるようになる。
【００１３】
【発明の実施の形態】
以下、図を参照して本発明の実施の形態を詳細に説明する。
【００１４】
〔第１の実施の形態〕
まず、構成を説明する。
図１は、本実施の形態における情報端末装置１の全体構成を示す図である。図１に示す様に、情報端末装置１は、ＣＰＵ１１、入力部１２、ＲＡＭ１３、伝送制御部１４、ＶＲＡＭ１５、表示部１６、記録媒体１７ａを有する記憶装置１７、スピーカ１８ａ、マイク１８ｂを備えた音声処理部１８により構成され、記録媒体１７ａを除く各部は、バス１９により接続されているコンピュータである。
【００１５】
ＣＰＵ（Central Processing Unit）１１は、記憶装置１７の有する記録媒体１７ａに記憶されている各種制御プログラムを読み出し、ＲＡＭ１３内に形成されたワークメモリに展開し、該制御プログラムに従って各部の動作を集中制御する。また、ＣＰＵ１１は、ＲＡＭ１３内のワークメモリに展開した制御プログラムに従って、後述する音声スタンプ作成処理Ａ等を実行し、その処理結果をＲＡＭ１３内のワークメモリに格納すると共に表示部１６に表示させる。そして、ワークメモリに格納した処理結果を、記憶装置１７或いは記録媒体１７ａ内の所定の保存先に保存させる。
【００１６】
すなわち、ＣＰＵ１１は、音声スタンプ作成処理Ａの実行に際して、音声の録音指示があると、録音を開始して音声スタンプを作成するための音声データを作成する。録音の指示がなければ、過去に録音した音声データの一覧を表示して、一覧の中から選択された音声データを、音声スタンプを作成する音声データとして認識する。そして、画像データの一覧を表示させて、録音された、あるいは選択された音声データに対応させる画像データを選択する。画像データの選択後、音声データ加工の指示があると、加工の種類と加工度合いを表す合成マークの一覧を表示させ、選択された合成マークに対応した加工処理を音声データに対して行う。そして、選択された画像データと合成マークを合成し、合成した画像に加工済みの音声データを関連付けて音声スタンプを設定し登録する。音声データ加工の指示がなければ、選択された画像データに、録音された、あるいは選択された音声データを関連付けて音声スタンプを設定し登録する。
【００１７】
入力部１２は、文字／英数字入力キー、カーソルキー、及び各種機能キー等を備えたキーボードと、ポインティングデバイスであるマウスと、を備えて構成され、キーボードで押圧操作されたキーの押圧信号とマウスによる操作信号とを、入力信号としてＣＰＵ１１へ出力する。若しくは、入力部１２は、表示部１６の表示画面を覆う透明なシートパネルに、指又は専用のタッチペンで触れることにより入力される位置情報を入力信号としてＣＰＵ１１へ出力する、タッチパネルにより構成される。
【００１８】
ＲＡＭ（Random Access Memory）１３は、ＣＰＵ１１により実行制御される上記各種処理において、記憶装置１７から読み出された情報端末装置１で実行可能なシステムプログラム、制御プログラム、入力若しくは出力データ、及びパラメータ等を一時的に格納する。
【００１９】
伝送制御部１４は、ルータやＴＡ（Terminal Adapter）等によって構成され、専用線、或いはＩＳＤＮ（Integrated Service Digital Network）回線等の通信回線を介してネットワークに接続された他の端末との通信制御を行う。ルータは、情報端末装置１がＬＡＮを構成している場合に、外部のＬＡＮとの間を接続する装置であり、ＴＡは、ＩＳＤＮ回線を介して外部機器との通信を行うために、既存のインタフェースをＩＳＤＮに対応するインタフェースに変換する装置である。
【００２０】
ＶＲＡＭ１５は、ＣＰＵ１１の表示指示に従って、表示部１６に表示するための画像データを一時的に格納する。表示部１６は、ＬＣＤ（Liquid Crystal Display）やＣＲＴ（Cathode Ray Tube）等により構成され、ＣＰＵ１１から入力される表示信号の指示に従って、表示画面上に、後述する音声データファイル選択画面１００１、画像データ選択画面１００２、合成マーク選択画面１００３、音声スタンプ設定画面１００４等の表示を行う。
【００２１】
記憶装置１７は、プログラムやデータ等が予め記憶された記録媒体１７ａを有し、この記録媒体１７ａは磁気的、光学的記録媒体、若しくは半導体等の不揮発性メモリで構成されている。記録媒体１７ａは、記憶装置１７に固定的に設けたもの、若しくは着脱自在に装着するものであり、記録媒体１７ａには情報端末装置１に対応するシステムプログラム、及び該システムプログラム上で実行可能な音声スタンプ作成処理Ａ等の各種処理プログラム、及びこれらのプログラムで処理されたデータ等を記憶する。これらの各処理プログラムは、読み取り可能なプログラムコードの形態で格納され、ＣＰＵ１１は、当該プログラムコードに従った動作を逐次実行する。
【００２２】
本実施の形態において、記憶装置１７は、図２に示す様に、内部に音声データファイル１７１、画像データファイル１７２、音声スタンプ登録情報ファイル（１）１７３、合成マークファイル１７４を有して構成される。以下、これら各ファイル内のデータ構成について図３〜図６を参照して詳細に説明する。
【００２３】
図３は、録音した音声データを格納する音声データファイル１７１内のデータ格納例を示す図である。図３に示す様に、音声データファイル１７１は、音声データを特定する為に一意的に割り当てられた識別コード（例えば、“yama.def”、“kawa.def”、“umi.def”、…）を「音声データ名」として格納する音声データ名領域１７１ａと、録音された、あるいは既存の音声データ（例えば、“音声１”、“音声２”、“音声３”…）を「音声データ」として格納する音声データ領域１７１ｂと、を有する。
【００２４】
図４は、画像データを格納する画像データファイル１７２内のデータ格納例を示す図である。図４に示す様に、画像データファイル１７２は、画像データを特定する為に一意的に割り当てられた識別コード（例えば、“speaker.ghi”、“house.ghi”、“maru.ghi”、…）を「画像データ名」として格納する画像データ名領域１７２ａと、登録された、あるいは既存の画像データ（例えば、“画像１”、“画像２”、“画像３”…）を「画像データ」として格納する画像データ領域１７２ｂと、を有する。
【００２５】
図５は、音声スタンプの登録に必要な情報を格納する音声スタンプ登録情報ファイル（１）１７３内のデータ格納例を示す図である。図５に示す様に、音声スタンプ登録情報ファイル（１）１７３は、音声スタンプを特定するために一意的に割り当てられた識別コード（例えば、“onsei.abc”、“onsei1.abc”、“onsei2.abc”、…）を「ファイル名」として格納するファイル名領域１７３ａと、当該音声スタンプに登録されている音声データを特定するために一意的に割り当てられた識別コード（例えば、“yama.def”、“kawa.def”、“umi.def”、…）を「音声データ名」として格納する音声データ名領域１７３ｂと、該音声スタンプに登録されている画像データを特定する為に一意的に割り当てられた識別コード（例えば、“speaker.ghi”、“house.ghi”、“maru.ghi”、…）を「画像データ名」として格納する画像データ名領域１７３ｃと、該音声データに施された加工の種類を表す文字列データ（例えば、“音量”、“音の高さ”、“音質”、…）を「加工の種類」として格納する加工の種類領域１７３ｄと、該音声データに施された加工度を表す数値データ（例えば、“１”、“２”、“４”、…）を「加工度」として格納する加工度領域１７３ｅと、該音声スタンプが登録された日付を表す日付データ（例えば、“01.03.10”、“01.03.09”、“01.02.10”、…）を「登録日」として格納する登録日領域１７３ｆと、該音声スタンプが登録された時間を表すデータ（例えば、“15：01”、“14：58”、“12：00”、…）を「登録時間」として格納する登録時間領域１７３ｇと、を有する。なお、「加工の種類」と「加工度」は、音声スタンプ作成時に音声データを加工しなかった場合は空欄となる。
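The record layout of FIG. 5 described above can be sketched as one record per audio stamp. This is a minimal illustration, not the patent's implementation: the class name `StampRecord`, the field names, and the use of `None` for the blank columns are assumptions, and the text does not say which sample values share a row in FIG. 5, so the pairing of values below is illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StampRecord:
    """One row of the audio stamp registration information file (1) 173 (hypothetical model)."""
    file_name: str                  # area 173a, e.g. "onsei.abc"
    audio_name: str                 # area 173b, e.g. "yama.def"
    image_name: str                 # area 173c, e.g. "speaker.ghi"
    processing_type: Optional[str]  # area 173d, None when the audio was not processed
    degree: Optional[int]           # area 173e, None when the audio was not processed
    reg_date: str                   # area 173f, e.g. "01.03.10"
    reg_time: str                   # area 173g, e.g. "15:01"

    def is_processed(self) -> bool:
        # "Processing type" and "degree" are left blank for unprocessed audio.
        return self.processing_type is not None and self.degree is not None


# Row pairings are assumed for illustration.
processed = StampRecord("onsei.abc", "yama.def", "speaker.ghi", "音量", 1, "01.03.10", "15:01")
plain = StampRecord("onsei1.abc", "kawa.def", "house.ghi", None, None, "01.03.09", "14:58")
```

Representing the blank columns as `None` keeps the "not processed" case distinguishable from a degree of zero, which is one possible design choice among several.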
【００２６】
図６は、音声データの加工の種類、加工度に対応づけた合成マークを格納する合成マークファイル１７４内のデータ格納例を示す図である。図６に示す様に、合成マークファイル１７４は、加工の種類を表す文字列データ（例えば、“音量”、…）を「加工の種類」として格納する加工の種類領域１７４ａと、加工度を表す数値データ（例えば、“１”、“２”、“３”…）を「加工度」として格納する加工度領域１７４ｂと、合成マークの画像データ（例えば、“音符１”、“音符２”、“音符３”、…）を「合成マークデータ」として格納する合成マークデータ領域１７４ｃと、を有する。
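The composite mark file of FIG. 6 associates a processing type and degree with a mark image, and selecting a mark in turn determines the processing. A minimal sketch of both lookups follows; the dictionary `COMPOSITE_MARKS` and the function names are hypothetical, and the pairing of degrees with marks is assumed from the order of the sample values in the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory form of the composite mark file 174:
# (processing type, degree) -> composite mark data.
COMPOSITE_MARKS: Dict[Tuple[str, int], str] = {
    ("音量", 1): "音符１",
    ("音量", 2): "音符２",
    ("音量", 3): "音符３",
}


def find_mark(processing_type: str, degree: int) -> Optional[str]:
    """Return the composite mark for a processing type and degree, or None if absent."""
    return COMPOSITE_MARKS.get((processing_type, degree))


def find_processing(mark: str) -> Optional[Tuple[str, int]]:
    """Reverse lookup: the processing type and degree that a selected mark stands for."""
    for key, value in COMPOSITE_MARKS.items():
        if value == mark:
            return key
    return None
```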
【００２７】
音声処理部１８は、アナログ／デジタル変換器、増幅器等により構成され、スピーカ１８ａ、マイク１８ｂを備える。音声処理部１８は、記憶装置１７に格納されている音声データを変換器でアナログ信号に変換し、増幅器を介してスピーカ１８ａから外部に出力する。また、音声処理部１８は、マイク１８ｂ等から入力される音声信号を変換器によりデジタル信号に変換し、音声データファイル１７１に音声データとして格納する。
【００２８】
次に、動作を説明する。
情報端末装置１により実行される音声スタンプ作成処理Ａについて図７のフローチャートを参照して説明する。
動作説明の前提として、以下のフローチャートに記述されている各機能を実現するためのプログラムは、読み取り可能なプログラムコードの形態で記録媒体１７ａに格納されており、ＣＰＵ１１は、上記プログラムコードに従った動作を逐次実行する。また、ＣＰＵ１１は、伝送媒体を介して伝送されてきた上述のプログラムコードに従った動作を逐次実行することもできる。すなわち、記録媒体１７ａの他、伝送媒体を介して外部供給されたプログラム或いはデータを利用して本実施の形態特有の動作を実行することもできる。
【００２９】
まず、ＣＰＵ１１は、入力部１２により音声スタンプの作成を指示する信号が入力されると（ステップＳ１）、録音して音声データを作成するか、既存の音声データを利用するか、の選択画面を表示する（ステップＳ２）。録音が選択されると（ステップＳ３；Ｙ）、ＣＰＵ１１は、音声処理部１８を介して録音を開始し（ステップＳ４）、録音が終了すると（ステップＳ５；Ｙ）、音声データファイルを作成し、当該音声データファイルを、音声スタンプを作成するための音声データとして認識する（ステップＳ６）。一方、既存の音声データの利用が選択されると（ステップＳ３；Ｎ）、ＣＰＵ１１は、記憶装置１７に記憶されている音声データファイル１７１の一覧を表示し（ステップＳ７）、音声データが選択されると、当該音声データを、音声スタンプを作成するための音声データとして認識する（ステップＳ８）。
【００３０】
図８は、ステップＳ７で表示部１６に表示される音声データファイル１７１の一覧を示す音声データファイル選択画面１００１の例を示す図である。図８に示す様に、音声データファイル選択画面１００１は、「音声データを選択して下さい」等のユーザへの指示メッセージが表示され、その下に音声データのファイルが一覧表示される。ユーザは、一覧表示されている音声データファイルの中から所望のファイルに対応するファイル名にカーソルを合わせ、選択ボタンを指定することにより、音声データを選択する。
【００３１】
次に、ＣＰＵ１１は、記憶装置１７に記憶されている画像データファイル１７２の画像データを表示する（ステップＳ９）。表示一覧の中から当該音声データに対応させる画像データが選択されると（ステップＳ１０）、ＣＰＵ１１は、当該音声データを加工するか否かの選択画面を表示部１６に表示させる。加工することが選択されると（ステップＳ１１；Ｙ）、記憶装置１７に記憶されている合成マークファイル１７４の合成マークと、合成マークに対応する加工の種類を一覧表示する（ステップＳ１２）。
【００３２】
合成マークが選択されると（ステップＳ１３）、ＣＰＵ１１は、当該音声データに、当該合成マークに対応した加工処理を行う（ステップＳ１４）。音声データの加工処理が終了すると、ＣＰＵ１１は、当該合成マークと当該画像データの合成を行い（ステップＳ１５）、合成した画像に対して当該音声データを関連付け（ステップＳ１６）、音声スタンプを設定登録し、一連の音声スタンプ作成処理Ａを終了する（ステップＳ１８）。
【００３３】
図９は、ステップＳ９で表示部１６に表示される画像データファイル１７２の画像一覧を示す画像データ選択画面１００２の例を図示している。図９に示す様に、画像データ選択画面１００２は、「音声スタンプの画像を選択して下さい」等のユーザへの指示メッセージが表示され、その下に画像データの内容が表示される。ユーザは一覧表示されている画像の中から所望の画像にカーソルを合わせ、選択ボタンを指定することにより、画像データを選択する。画像データ選択画面１００２では、画像データ（ａ）の様なスピーカの画像が選択されたことを示している。
【００３４】
図１０、図１１は、ステップＳ１１で音声データの加工が選択された場合に、ステップＳ１２〜ステップＳ１８において音声データが加工され、音声スタンプが設定登録される過程で表示部１６に表示される画面の例である。
【００３５】
図１０は、ステップＳ１３で表示部１６に表示される合成マークファイル１７４の一覧を示す合成マーク選択画面１００３の例を図示している。図１０に示す様に、合成マーク選択画面１００３は、「音声効果の画像を選択して下さい」等のユーザへの指示メッセージが表示され、その下に音声データの加工の種類に対応した合成マークが表示される。ユーザは、表示されている合成マークの中から希望する音声データの加工の種類に対応している合成マークにカーソルを合わせることにより、音声データの加工の種類を選択する。合成マーク選択画面１００３では、音量を最小にすることを示す加工の種類に対応している合成マークである最も薄色の音符が選択されたことを示している。
【００３６】
図１１は、ステップＳ１８で表示部１６に表示される音声スタンプ設定画面１００４の例を示す図である。音声スタンプ設定画面１００４では、図９で選択された画像（ここでは、スピーカ）と、図１０で選択された合成マーク（ここでは、最も薄色の音符）を合成した音声スタンプが表示されている。画面最下部に指示されている登録ボタンの選択操作を行うと、表示されている音声スタンプが登録（記録）される。
【００３７】
なお、音声データの加工の種類を示す画像は、ここでは音符等の合成マークとしているが、額縁等であってもよい。即ち、図９で選択された画像データに額縁をつけて、その額縁の形状、模様、色等により加工の種類を区別するようにしてもよい。この場合、音符等の合成マークと額縁を併用することも勿論可能である。
【００３８】
一方、ステップＳ１１において、当該音声データを加工しないという指示があると（ステップＳ１１；Ｎ）、ＣＰＵ１１は、画像データに当該音声データを関連付け（ステップＳ１７）、音声スタンプを設定登録し、一連の音声スタンプ作成処理Ａを終了する（ステップＳ１８）。
【００３９】
図１２は、ステップＳ９でスピーカの画像を選択し、ステップＳ１１で音声データの加工をしないことを選択した場合にステップＳ１８で表示部１６に表示される音声スタンプ設定画面１００５の例を示す図である。画面下に指示されている登録ボタンの選択操作を行うと、表示されている音声スタンプが登録される。
【００４０】
以上説明した様に、情報端末装置１によれば、音声データと所望の画像データを対応付けて音声スタンプを作成、登録する。音声データは加工することができ、加工する場合には、音声データの加工の種類に対応した合成マークを選択することによって音声データに加工を行い、合成マークと画像データとを合成した画像を音声スタンプとして設定する。これによって、ユーザが簡単に音声データを加工できるようになり、設定された音声スタンプの画像から、加工の種類を容易に知ることができるようになる。
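The two branches of audio stamp creation process A (steps S11 to S18) summarized above can be sketched as follows. This is an illustration under stated assumptions: audio is modelled as a list of samples, volume change as a simple gain, and image compositing as string concatenation. The mark names and gain values in `MARKS` are invented; the patent says only that each mark corresponds to a type of processing.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical composite marks: mark name -> (processing type, gain factor).
MARKS: Dict[str, Tuple[str, float]] = {
    "note_lightest": ("volume", 0.25),
    "note_darkest": ("volume", 2.0),
}


def create_stamp_a(audio: List[float], image: str, mark: Optional[str] = None) -> dict:
    """Sketch of steps S11 to S18: optionally process the audio, then build a stamp."""
    if mark is None:
        # S11;N -> S17: associate the unprocessed audio with the selected image.
        return {"image": image, "mark": None, "processing": None, "audio": list(audio)}
    processing, gain = MARKS[mark]                    # S13: the chosen mark fixes the processing
    processed = [sample * gain for sample in audio]   # S14: process the audio data
    composite = f"{image}+{mark}"                     # S15: stand-in for image compositing
    # S16: associate the processed audio with the composite image.
    return {"image": composite, "mark": mark, "processing": processing, "audio": processed}
```

Calling `create_stamp_a(samples, "speaker.ghi")` without a mark follows the no-processing branch, so the stamp image is the selected image alone.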
【００４１】
なお、上記第１の実施の形態における記述内容は、本発明に係る情報端末装置１の好適な一例であり、これに限定されるものではない。
例えば、上記実施の形態では、ステップＳ６で録音した音声データファイル１７１を作成した後に、引き続き音声スタンプの作成のステップに入っているが、録音、作成した音声データファイル１７１を保存しておき、後日音声スタンプを作成することもできる。また、予め音声データの加工の種類を決めてから録音し、録音した音声データを自動的に加工して音声スタンプを作成するものとしてもよい。
【００４２】
さらに、上記実施の形態によれば、ステップＳ１４の音声データ加工のあと、すぐに画像データを合成しているが、一旦加工した音声データを出力して、ユーザの嗜好にあった音声が出力されなければ、ステップＳ１２の合成マーク一覧表示のステップまで戻り、嗜好にあった音声加工がなされるまでステップＳ１２〜ステップＳ１４の各処理を繰り返し行うようにしてもよい。
【００４３】
〔第２の実施の形態〕
以下、第１の実施の形態の応用例として、音声データの加工の種類と加工度を指定して、その指定に基づいて音声データを加工し、加工した音声データに対応して、画像データの大きさ、明るさ等を変化させたもの（以下、「画像エフェクト」と記す。）を利用した実施の形態について詳述する。なお、本実施の形態における情報端末装置１の構成は、上述した第１の実施の形態と同様であるので、各構成要素には同一の符号を付し、その構成の図示及び説明は省略する。
【００４４】
但し、情報端末装置１の記憶装置１７は、図１３に示すように、内部に音声データファイル１７１、画像データファイル１７２、音声スタンプ登録情報ファイル（２）１７５を有して構成されており、音声スタンプ登録情報ファイル（２）１７５は、本実施の形態特有の構成要素であるので以下詳細に説明する。
【００４５】
図１４は、後述する音声スタンプ作成処理Ｂにおいて音声スタンプの登録に必要な情報を格納する音声スタンプ登録情報ファイル（２）１７５内のデータ格納例を示す図である。図１４に示す様に、音声スタンプ登録情報ファイル（２）１７５は、ファイル名領域１７５ａと、音声データ名領域１７５ｂと、画像データ名領域１７５ｃと、加工の種類領域１７５ｄと、加工度領域１７５ｅと、画像エフェクト情報１７５ｆと、登録日領域１７５ｇと、登録時間領域１７５ｈと、から構成される。
【００４６】
ファイル名領域１７５ａは、音声スタンプを特定するために一意的に割り当てられた識別コード（例えば、“onsei.abc”、“onsei1.abc”、“onsei2.abc”、…）を「ファイル名」として格納する。音声データ名領域１７５ｂは、当該音声スタンプに登録されている音声データを特定するために一意的に割り当てられた識別コード（例えば、“yama.def”、“kawa.def”、“umi.def”、…）を「音声データ名」として格納する。画像データ名領域１７５ｃは、音声スタンプに登録されている画像データを特定する為に一意的に割り当てられた識別コード（例えば、“speaker.ghi”、“house.ghi”、“maru.ghi”、…）を「画像データ名」として格納する。
【００４７】
加工の種類領域１７５ｄは、該音声データに施された加工の種類を表す文字列データ（例えば、“音量”、“音質”、…）を「加工の種類」として格納する。加工度領域１７５ｅは、該音声データに施された加工度を表す数値データ（例えば、“１”、“４”、…）を「加工度」として格納する。画像エフェクト情報１７５ｆは、音声データの加工の種類と度合いによって画像をどのように変化させるかを関連付けた情報を識別するための記号データ（例えば、“ａ”、“ｂ”…）を「画像エフェクト情報」として格納する。なお、「加工の種類」、「加工度」、「画像エフェクト情報」は、音声スタンプ作成時に音声データを加工しなかった場合は上から２番目のレコードの様に空欄となる。
【００４８】
登録日領域１７５ｇは、該音声スタンプが登録された日付を表す日付データ（例えば、“01.03.10”、“01.03.09”、“01.02.10”、…）を「登録日」として格納する。登録時間領域１７５ｈは、該音声スタンプが登録された時間を表すデータ（例えば、“15：01”、“14：58”、“12：00”、…）を「登録時間」として格納する。
【００４９】
次に動作を説明する。
情報端末装置１により実行される音声スタンプ作成処理Ｂについて図１５のフローチャートを参照して説明する。
まず、ＣＰＵ１１は、入力部１２により音声スタンプの作成を指示する信号が入力されると（ステップＳ２１）、録音して音声データを作成するか、既存の音声データを利用するか、の選択画面を表示させる（ステップＳ２２）。録音が選択されると（ステップＳ２３；Ｙ）、ＣＰＵ１１は、音声処理部１８により録音を開始し（ステップＳ２４）、録音が終了すると（ステップＳ２５；Ｙ）、音声データファイルを作成し、当該音声データファイルを、音声スタンプを作成するための音声データとして認識する（ステップＳ２６）。
【００５０】
一方、既存の音声データの利用が選択されると（ステップＳ２３；Ｎ）、ＣＰＵ１１は、記憶装置１７に記憶されている音声データファイル１７１の一覧を表示し（ステップＳ２７）、音声データが選択されると、当該音声データを、音声スタンプを作成するための音声データとして認識する（ステップＳ２８）。
【００５１】
次に、ＣＰＵ１１は、記憶装置１７に記憶されている画像データファイル１７２の画像データを表示させる（ステップＳ２９）。表示一覧の中から当該音声データに対応させる画像データが選択されると（ステップＳ３０）、ＣＰＵ１１は、当該音声データを加工するか否かの選択画面を表示させる。加工することが選択されると（ステップＳ３１；Ｙ）、ＣＰＵ１１は、加工の種類を一覧にした選択画面を表示させる。
【００５２】
加工の種類が選択されると（ステップＳ３２）、ＣＰＵ１１は、上述した画像エフェクトの一覧画面を表示部１６に表示させ、選択された画像エフェクトと音声データの加工の種類とを関連づける（ステップＳ３３）。
【００５３】
次いで、ＣＰＵ１１は、加工度を入力する画面を表示部１６に表示し、加工度が入力されると（ステップＳ３４）、入力された加工度に従って音声データを加工する（ステップＳ３５）。音声データを加工すると、ＣＰＵ１１は、ステップＳ３３で選択された画像エフェクトに従い、画像データを変更し（ステップＳ３６）、変更した画像データに音声データを関連付け（ステップＳ３７）、音声スタンプを設定登録し、一連の音声スタンプ作成処理Ｂを終了する（ステップＳ３９）。
【００５４】
以下に示す図１６〜図２１は、ステップＳ３１で音声データの加工が選択された場合に、ステップＳ３２〜ステップＳ３９において音声スタンプが設定登録される過程を示す表示画面の例である。
【００５５】
図１６は、ステップＳ３２で表示部１６に表示される音声データに施す加工の種類選択画面１００６の例を示す図である。図１６に示す様に、加工の種類選択画面１００６は、「音声データの加工の種類を選択して下さい」等のユーザへの指示メッセージが表示され、その下に音声データの加工の種類が表示される。ユーザは、一覧表示されている音声データの加工の種類の中から、音声データに施そうとする加工の種類にカーソルを合わせることにより、音声データの加工の種類を選択する。当該加工の種類選択画面１００６では、音量変更が選択された様子を示している。
【００５６】
図１７は、ステップＳ３３で表示部１６に表示される、ステップＳ３２で選択された音声データの加工の種類に対応する画像エフェクト選択画面１００７の例を示す図である。図１７に示す様に、画像エフェクト選択画面１００７は、「音量変化に対応する画像エフェクトを選択して下さい」等のユーザへの指示メッセージが表示され、その下に画像エフェクトによる画像データの変化が表示される。ユーザは、一覧表示された画像エフェクトの中から所望の画像エフェクトにカーソルを合わせることによって、音声データの加工の種類に対応する画像エフェクトを選択する。当該画像エフェクト選択画面１００７では、音量変化と画像のサイズが関連付けられた様子を示している。
【００５７】
図１８〜図２１は、ステップＳ３４で表示部１６に表示される加工度入力画面の一例である。図１８は、ステップＳ３２で加工の種類として音量が選択され、ステップＳ３３で画像エフェクトとして画像データの画像サイズの変更が選択された場合に表示される加工度入力画面１００８の例を示す図である。図１８に示す様に、加工度入力画面１００８は、「加工度を入力して下さい」等のユーザへの指示メッセージが表示される。その下にはステップＳ３０で選択された画像データの画像サイズが音声データの加工度と対応づけて表示され、更にその下に加工度を数値で入力可能な領域１００８ａが設けられている。ユーザは、領域１００８ａに数値を入力することにより、加工度を指定する。画面最下部に指示されている決定ボタンの指定操作を行うと、入力された加工度が設定される。
【００５８】
図１９は、ステップＳ３２で加工の種類として音量が選択され、ステップＳ３３で画像エフェクトとして合成マークのサイズ変更が選択された場合に表示される加工度入力画面１００９の例を示す図である。図１９に示す様に、加工度入力画面１００９は、「加工度を入力して下さい」等のユーザへの指示メッセージが表示される。その下には合成マークである音符のサイズが音声データの加工度と対応づけて表示され、更にその下に加工度を数値で入力可能な領域１００９ａが設けられている。ユーザは、領域１００９ａに数値を入力することにより、加工度を指定する。画面最下部に指示されている決定ボタンの指定操作を行うと、入力された加工度が設定される。
【００５９】
図２０は、ステップＳ３２で加工の種類として音質が選択され、ステップＳ３３で画像エフェクトとして画像データの明るさの変更が選択された場合に表示される加工度入力画面１０１０の例を示す図である。図２０に示す様に、加工度入力画面１０１０は、「加工度を入力して下さい」等のユーザへの指示メッセージが表示される。その下にはステップＳ３０で選択された画像データの画像の明るさが音声データの加工度と対応づけて表示され、更にその下には加工度を数値で入力可能な領域１０１０ａが設けられている。ユーザは、領域１０１０ａに数値を入力することにより、加工度を指定する。画面下に指示されている決定ボタンの指定操作を行うと、入力された加工度が設定される。
【００６０】
図２１は、ステップＳ３２で加工の種類として音の高さが選択され、ステップＳ３３で画像エフェクトとして合成マークの明るさの変更が選択された場合に表示される加工度入力画面１０１１の例を示す図である。図２１に示す様に、加工度入力画面１０１１は、「加工度を入力して下さい」等のユーザへの指示メッセージが表示される。その下には合成マークである音符の明るさが音声データの加工度と対応づけて表示され、更にその下に加工度を数値で入力可能な領域１０１１ａが設けられている。ユーザは、領域１０１１ａに数値を入力することにより、加工度を指定する。画面最下部に指示されている決定ボタンの指定操作を行うと、入力された加工度が設定される。
【００６１】
図２２は、ステップＳ３９で表示部１６に表示される音声スタンプ設定画面１０１２の例を示す図である。音声スタンプ設定画面１０１２では、図９で選択された画像（ここでは、スピーカ）に、図１６で選択された音声データの加工の種類（ここでは、音量）と図１８で入力された加工度に対応した画像エフェクト（ここでは、画像の大きさ）をかけることによって作成した音声スタンプを表示している。画面下に指示されている「登録」操作を行うと、表示されている音声スタンプが登録（記録）される。
【００６２】
一方、ステップＳ３１において、音声データを加工しないという指示があると（ステップＳ３１；Ｎ）、ＣＰＵ１１は、画像データに当該音声データを関連付け（ステップＳ３８）、音声スタンプを設定登録し、一連の音声スタンプ作成処理Ｂを終了する（ステップＳ３９）。
【００６３】
図１２は、ステップＳ２９でスピーカの画像を選択し、ステップＳ３１で音声データの加工をしないことを選択した場合にステップＳ３９で表示部１６に表示される音声スタンプ設定画面１００５を例示する図である。画面下に指示されている「登録」操作を行うと、表示されている“onsei1.abc”のファイル名の音声スタンプが登録（記録）される。
【００６４】
以上説明した様に、情報端末装置１によれば、音声データと所望の画像データを対応付けて音声スタンプを作成、登録する。音声データは加工することができ、加工する場合には、ユーザは加工の種類と加工度を指定し、好みの画像エフェクトを選択するだけで音声データを加工し、音声データの加工の種類と加工度に対応した画像エフェクトを画像データに施し、音声スタンプとして設定する。これによって、ユーザが簡単に音声データを加工できるようになり、設定された音声スタンプの画像から、加工内容を容易に知ることができるようになる。
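In the second embodiment, one degree of processing drives both the audio processing (step S35) and the image effect (step S36). The sketch below pairs a volume change with an image-size effect. The linear rule, the default `max_degree` of 5, and both function names are assumptions; the patent does not specify how a degree maps to a gain or to an image size.

```python
from typing import List, Tuple


def apply_size_effect(size: Tuple[int, int], degree: int, max_degree: int = 5) -> Tuple[int, int]:
    """Scale the image so that a larger degree of processing gives a larger image."""
    if not 1 <= degree <= max_degree:
        raise ValueError("degree out of range")
    width, height = size
    # Hypothetical linear rule: degree == max_degree keeps the original size.
    return (width * degree // max_degree, height * degree // max_degree)


def apply_volume(samples: List[float], degree: int, max_degree: int = 5) -> List[float]:
    """Scale the samples by the same hypothetical linear rule as the image size."""
    if not 1 <= degree <= max_degree:
        raise ValueError("degree out of range")
    return [s * degree / max_degree for s in samples]
```

Using one rule for both functions keeps the displayed image size proportional to the resulting volume, which is the visual correspondence the embodiment aims for.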
【００６５】
なお、上記第２の実施の形態における記述内容は、本発明に係る情報端末装置１の好適な一例であり、これに限定されるものではない。
例えば、上記実施の形態では、ステップＳ２６で録音した音声データファイル１７１を作成した後に、引き続き音声スタンプの作成のステップに入っているが、録音、作成した音声データファイル１７１を保存しておき、後日音声スタンプを作成することもできる。また、予め音声データの加工の種類と度合いを決めてから録音し、録音した音声データを自動的に加工して音声スタンプを作成するものとしてもよい。
【００６６】
さらに、ステップＳ３５の音声データ加工のあと、すぐに画像データを合成しているが、一旦加工した音声データを出力して、ユーザの嗜好にあった音声が出力されなければ、ステップＳ３３の加工に対応する画像エフェクト選択のステップまで戻り、嗜好にあった音声加工がなされるまでステップＳ３３〜ステップＳ３５の各処理を繰り返し行うようにしてもよい。
【００６７】
〔第３の実施の形態〕
以下、上記各実施の形態の応用例として、登録された音声スタンプが表示されている場合に、当該音声スタンプの加工度を切り替えるための実施の形態について詳述する。なお、本実施の形態における情報端末装置１の構成は、上述した第１の実施の形態と同様であるので、各構成要素には同一の符号を付し、その構成の図示及び説明は省略する。
【００６８】
以下、動作を説明する。
情報端末装置１により実行される加工度変更処理について図２３のフローチャートを参照して説明する。なお、ここで使用される音声スタンプは第１の実施の形態で登録されたものであっても、第２の実施の形態で登録されたものであっても、加工度変更処理を行うことができるが、ここでは、第１の実施の形態で登録された音声スタンプに対して加工度の切り替えを行う動作について説明する。
【００６９】
まず、ＣＰＵ１１は、入力部１２により特定の音声スタンプの表示を指示する信号が入力されると、表示部１６に音声スタンプを表示させる（ステップＳ４１）。そして、音声スタンプの加工度を変更するという指示が入力されると（ステップＳ４２；Ｙ）、ＣＰＵ１１は、当該音声スタンプに加工が施されているかどうかを判別し、加工が施されていると判別すると（ステップＳ４３；Ｙ）、記憶装置１７の音声スタンプ登録情報ファイル（１）１７３（図２参照）から音声スタンプの情報を読み出す（ステップＳ４４）。そして、ＣＰＵ１１は、当該音声スタンプに加工度の変更の指示がされる毎に音声を出力する（ステップＳ４５）。加工度の変更の指示とは、例えば、当該音声スタンプにマウスポインタを合わせて１回クリックするという操作によってなされる。
【００７０】
そして、ＣＰＵ１１は、出力した音声データの加工度に応じて音声スタンプの画面を切り替えて表示させる（ステップＳ４６）。加工が終了すると（ステップＳ４７；Ｎ）、当該音声スタンプを登録するかを確認し、登録の指示があると（ステップＳ４８；Ｙ）、当該音声スタンプを新規に登録し、一連の加工度変更処理を終了する（ステップＳ４９）。
【００７１】
図２４、図２５は、ステップＳ４５でマウスによるクリック等の操作により音声データの加工度が変更され、ステップＳ４６で加工度の変更に応じて画像データが切り替わる様子を図示したものである。図２４は、第１の実施の形態で設定登録した画像データと、音声データの加工の種類に対応した合成マークとの合成画像による音声スタンプの加工度切り替えの様子を示している。図２５は、第２の実施の形態で設定登録した画像データに、音声データの加工の種類に対応した画像エフェクトを施した音声スタンプの加工度切り替えの様子を示している。
【００７２】
一方、ステップＳ４２において、音声スタンプの加工度の変更が指示されなければ（ステップＳ４２；Ｎ）、音声データを出力する（ステップＳ５０）。また、音声データが加工されていなければ（ステップＳ４３；Ｎ）、前述した音声スタンプ作成処理Ａ（図７参照）のステップＳ１２〜ステップＳ１８の処理を行い、音声データを加工した音声スタンプを作成する。
【００７３】
以上説明した様に、情報端末装置１によれば、登録された音声スタンプの画像をクリックする等の簡単な操作で音声スタンプに登録された音声データの加工度を切り替えて出力すると共に、その出力に併せて音声スタンプの画像も変化させて表示する。そして、登録指示の有った時点でその音声スタンプを登録する。これによって、ユーザが簡単に音声スタンプに登録された音声データの加工度を切り替えることができるようになる。
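The click-to-switch behaviour of the degree change process (steps S45 and S46) can be sketched as a small state holder. The class name `DegreeSwitcher`, the default maximum of 5, and the wrap-around from the maximum back to 1 are assumptions; the patent states only that the degree is switched each time a change instruction such as a click is given.

```python
class DegreeSwitcher:
    """Cycles the degree of processing of a registered stamp on each click (hypothetical)."""

    def __init__(self, degree: int, max_degree: int = 5):
        self.degree = degree
        self.max_degree = max_degree

    def click(self) -> int:
        # S45/S46: each change instruction moves to the next degree;
        # wrapping back to 1 after the maximum is an assumption.
        self.degree = self.degree % self.max_degree + 1
        return self.degree
```

The returned degree would then be used both to output the reprocessed audio and to pick the image shown for the stamp.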
【００７４】
なお、上記各実施の形態における記述内容は、本発明に係る情報端末装置の好適な一例であり、これに限定されるものではない。
例えば、上記実施の形態では、第１の実施の形態で登録された音声スタンプにおける音声データの加工度を切り替える処理をしているが、第２の実施の形態において登録された音声スタンプにおいては、ＣＰＵ１１がステップＳ４３において該音声スタンプに加工が施されていないと判断すると、音声スタンプ作成処理Ｂ（図１５参照）のステップＳ３２〜ステップＳ３９の処理を行うようにしてもよい。
【００７５】
〔第４の実施の形態〕
以下、上記各実施の形態の応用例として、音声データを解析し、解析結果から加工の種類を判断し、解析結果と加工の種類に対応して画像データを合成する実施の形態について詳述する。なお、本実施の形態における情報端末装置１の構成は、上述した第１の実施の形態と同様であるので、各構成要素には同一の符号を付し、その構成の図示及び説明は省略する。
【００７６】
但し、情報端末装置１の音声処理部１８は、マイク１８ｂ等により入力された音声データを解析する。また、記憶装置１７は、図２６に示す様に、内部に音声データファイル１７１、画像データファイル１７２、音声解析スタンプ登録情報ファイル１７６を有して構成されており、音声解析スタンプ登録情報ファイル１７６は、本実施の形態特有の構成要素であるので以下詳細に説明する。
【００７７】
図２７に示す様に、音声解析スタンプ登録情報ファイル１７６は、ファイル名領域１７６ａと、音声データ名領域１７６ｂと、画像データ名領域１７６ｃと、解析結果領域１７６ｄと、登録日領域１７６ｅと、登録時間領域１７６ｆと、から構成される。
【００７８】
ファイル名領域１７６ａは、音声解析スタンプを特定するために一意的に割り当てられた識別コード（例えば、“onsei.abc”、“onsei1.abc”、“onsei2.abc”、…）を「ファイル名」として格納する。音声データ名領域１７６ｂは、当該音声スタンプに登録されている音声データを特定するために一意的に割り当てられた識別コード（例えば、“yama.def”、“kawa.def”、“umi.def”、…）を「音声データ名」として格納する。画像データ名領域１７６ｃは、該音声スタンプに登録されている画像データを特定する為に一意的に割り当てられた識別コード（例えば、“speaker.ghi”、“house.ghi”、“maru.ghi”、…）を「画像データ名」として格納する。
【００７９】
解析結果領域１７６ｄは、音声データの解析結果を表すための文字列データ（例えば、“人の声”、“楽器”…）を「解析結果」として格納する。登録日領域１７６ｅは、該音声スタンプが登録された日付を表す日付データ（例えば、“01.03.10”、“01.03.09”、“01.02.10”、…）を「登録日」として格納する。登録時間領域１７６ｆは、該音声スタンプが登録された時間を表すデータ（例えば、“15：01”、“14：58”、“12：00”、…）を「登録時間」として格納する。
【００８０】
以下、動作を説明する。
情報端末装置１により実行される音声解析スタンプ作成処理について図２８のフローチャートを参照して説明する。
【００８１】
まず、ＣＰＵ１１は、入力部１２により音声解析スタンプの作成を指示する信号が入力されると（ステップＳ６１）、録音して音声データを作成するか、既存の音声データを利用するか、の選択画面を表示する（ステップＳ６２）。録音が選択されると（ステップＳ６３；Ｙ）、ＣＰＵ１１は、音声処理部１８により録音を開始し（ステップＳ６４）、録音が終了すると（ステップＳ６５；Ｙ）、音声データファイルを作成し、当該音声データファイルを、音声解析スタンプを作成するための音声データとして認識する（ステップＳ６６）。一方、既存の音声データの利用が選択されると（ステップＳ６３；Ｎ）、ＣＰＵ１１は、記憶装置１７に記憶されている音声データファイル１７１の一覧を表示し（ステップＳ６７）、音声データが選択されると、当該音声データを、音声解析スタンプを作成するための音声データとして認識する（ステップＳ６８）。
【００８２】
次に、ＣＰＵ１１は、記憶装置１７に記憶されている画像データファイル１７２の画像データを表示させる（ステップＳ６９）。表示一覧の中から当該音声データに対応させる画像データが選択されると（ステップＳ７０）、ＣＰＵ１１は、当該音声データを解析し（ステップＳ７１）、解析した結果を表示部１６に表示させる（ステップＳ７２）。
【００８３】
図２９は、ステップＳ７２で表示部１６に表示される、音声データの解析結果を表示する音声データ解析結果表示画面１０１３の例である。画面上には、「音声データの解析」と表示され、その下に「音声データを解析した結果、人の声が含まれています」等の文字により、ユーザに対して解析結果を示すと共に、解析結果に応じた解析合成画像を表示する。
【００８４】
図３０は、解析合成画像と解析結果の対応関係を一覧表示した解析合成画像一覧表示画面１０１４の例を示す図である。ユーザは、所定の入力操作により、解析合成画像一覧表示画面１０１４の解析合成画像を自由に切替表示して画像と解析結果の対応関係を確認することができる。
【００８５】
図２８に戻り、解析結果を表示させた後、ＣＰＵ１１は、解析合成画像を選択するか否かの選択画面を表示部１６に表示させる。ユーザが解析合成画像を選択するという指示があると（ステップＳ７３；Ｙ）、ＣＰＵ１１は、図３１に示す解析合成画像変更選択画面１０１５を一覧表示させる（ステップＳ７４）。次に、ユーザは、解析合成画像変更選択画面１０１５上の解析合成画像の中から、解析結果に対応させる所望の解析合成画像を選択する（ステップＳ７５）。そして、ＣＰＵ１１は、ステップＳ７０で選択した画像データと、ステップＳ７５で選択した、解析結果に応じた解析合成画像とを合成し（ステップＳ７６）、合成した画像に音声データファイルを関連付けて（ステップＳ７７）、音声解析スタンプとして設定登録する（ステップＳ７８）。
【００８６】
図３１は、ステップＳ７４で表示部１６に表示される解析合成画像の変更候補を示した解析合成画像変更選択画面１０１５の例を示す図である。図３１に示す様に、解析合成画像変更選択画面１０１５は、「画像を選択して下さい」等のユーザへの指示メッセージが表示され、その下に、解析合成画像の選択候補が一覧表示される。ユーザは一覧表示されている解析合成画像の選択候補の中から所望の画像にカーソルを合わせ、選択ボタンを指定することにより、画像データを選択する。
【００８７】
一方、ステップＳ７３において、ユーザが解析合成画像を選択しないという指示があると（ステップＳ７３；Ｎ）、ＣＰＵ１１は、図３０に示す様な解析合成画像と解析結果との対応関係に基づいて、ステップＳ７１における解析結果に対応する解析合成画像を自動的に選択し、この解析合成画像とステップＳ７０で選択した画像データとを合成する（ステップＳ７６）。そして、ＣＰＵ１１は、合成した画像に音声データファイルを関連付けて（ステップＳ７７）、音声解析スタンプとして設定登録する（ステップＳ７８）。
【００８８】
以上説明した様に、情報端末装置１によれば、音声データの内容を解析し、解析結果に対応した解析合成画像と画像データとを合成して当該音声データに関連付けることによって音声解析スタンプを設定、登録（記録）する。これによって、ユーザは、音声データの内容を簡単に把握することができる。
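The choice between a user-selected and an automatically selected analysis composite image (steps S73 to S78) can be sketched as follows. The analysis itself is not specified in the text, so its result is simply passed in as a string. The table `ANALYSIS_IMAGES`, the image file names, and the function names are invented for illustration, and compositing is again represented by string concatenation.

```python
from typing import Dict, Optional

# Hypothetical correspondence between analysis results and analysis composite images
# (the role played by the list in FIG. 30); the image names are invented.
ANALYSIS_IMAGES: Dict[str, str] = {
    "人の声": "face.img",
    "楽器": "guitar.img",
}


def pick_analysis_image(result: str, user_choice: Optional[str] = None) -> str:
    """S73 to S75: use the user's image if one was chosen, else the default for the result."""
    if user_choice is not None:
        return user_choice
    return ANALYSIS_IMAGES[result]


def make_analysis_stamp(audio_name: str, image_name: str, result: str,
                        user_choice: Optional[str] = None) -> dict:
    """S76 to S78: combine the two images and associate them with the audio data."""
    analysis_image = pick_analysis_image(result, user_choice)
    return {"audio": audio_name, "image": f"{image_name}+{analysis_image}", "result": result}
```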
【００８９】
なお、上記各実施の形態における記述内容は、本発明に係る情報端末装置１の好適な一例であり、これに限定されるものではない。
例えば、上記実施の形態により登録された音声解析スタンプに関しても、第３の実施の形態における加工度の切り替えを行うことができる。
【００９０】
また、上述した第１から第４の実施の形態において、第３の実施の形態以外は独立した機能として実現可能であるが、１つのアプリケーションの中にある機能としてそれぞれの実施の形態をユーザが選択できるようにすることで、加工のバリエーションも増え、ユーザインターフェイスを向上させることができる。
その他、情報端末装置１の細部構成、及び詳細動作に関しても、本発明の趣旨を逸脱することのない範囲で適宜変更可能である。
【００９１】
【発明の効果】
請求項１記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データを加工する為の加工の種類に対応した第２の画像データを選択することにより、加工の種類を選択する。そして、選択された加工の種類に従って、音声データを加工し、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、ユーザは音声データに対して思い通りの加工をすることができ、また、加工の種類に応じた画像データを表示することにより、音声データの加工内容を容易に知ることができるようになる。
【００９２】
請求項２記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データの加工の種類を選択し、選択された加工の種類に対応する第２の画像データを選択した後、音声データの加工度を指定する。そして、選択された加工の種類と指定された加工度に従って、音声データを加工し、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、ユーザは音声データに対して思い通りの加工をすることができ、また、加工の種類のみならず加工度に応じた画像データを変化させて表示することにより、音声データの加工内容をより容易に知ることができるようになる。
【００９３】
請求項３記載の発明によれば、記録された音声データに対応付けられた画像データを表示し、表示された画像データの中から加工度を切り替える画像データを選択し、前記画像データが選択される毎に、画像データに対応する音声データの加工度を切り替えて出力し、変更された加工度に対応させて画像データを切り替える。従って、すでに記録されている音声データに対応づけられた画像データに簡単な操作をすることによって当該音声データの加工を切り替えることができ、ユーザは思い通りの音声データの加工を容易に行うことができる。
【００９４】
請求項４記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択し、音声データを解析し、解析した結果に基づいて判定された加工の種類に対応する画像データ、又は選択指定された新規の画像データを第２の画像データとして選択し、前記音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録する。従って、音声データの内容が視覚的に識別可能となり、その結果、ユーザは音声データの加工内容を一目で把握できるようになる。
【００９５】
請求項５記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択させ、音声データを加工する為の加工の種類に対応した第２の画像データを選択させることにより、加工の種類を選択させて、選択された加工の種類に従って音声データを加工し、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録させるプログラムをコンピュータに読み込ませることで、請求項１に記載する機能を実現できる。従って、システムと独立したソフトウェア製品単体としての販売、配布も容易になる。また、汎用コンピュータ等のハードウェア資源を用いて、当該プログラムを実行することにより、本発明の技術をハードウェア上で容易に実施できる。
【００９６】
請求項６記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択させ、音声データの加工の種類と、選択された加工の種類に対応する第２の画像データを選択させ、音声データの加工度を指定させて、加工の種類と加工度に従って、音声データを加工して、加工された音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録させるプログラムをコンピュータに読み込ませることで、請求項２に記載する機能を実現できる。従って、システムと独立したソフトウェア製品単体としての販売、配布も容易になる。また、汎用コンピュータ等のハードウェア資源を用いて、当該プログラムを実行することにより、本発明の技術をハードウェア上で容易に実施できる。
【００９７】
請求項７記載の発明によれば、録音した、あるいは録音されていた音声データに対応させる画像データを、一覧表示した画像データの中から選択させ、音声データを解析させ、解析した結果に基づいて判定された加工の種類に対応する画像データ、又は選択指定された新規の画像データを第２の画像データとして選択させ、前記音声データを第１の画像データと第２の画像データを合成した画像データと対応付けて記録させるプログラムをコンピュータに読み込ませることで、請求項４に記載する機能を実現できる。従って、システムと独立したソフトウェア製品単体としての販売、配布も容易になる。また、汎用コンピュータ等のハードウェア資源を用いて、当該プログラムを実行することにより、本発明の技術をハードウェア上で容易に実施できる。
【図面の簡単な説明】
【図１】本発明に係る情報端末装置１の機能的構成を示すブロック図である。
【図２】図１の記憶装置１７内部のファイル構成を示す図である。
【図３】図２の音声データファイル１７１内部のデータ格納例を示す図である。
【図４】図２の画像データファイル１７２内部のデータ格納例を示す図である。
【図５】図２の音声スタンプ登録情報ファイル（１）１７３内部のデータ格納例を示す図である。
【図６】図２の合成マークファイル１７４内部のデータ格納例を示す図である。
【図７】図１のＣＰＵ１１により実行される音声スタンプ作成処理Ａの動作を示すフローチャートである。
【図８】図７のステップＳ７で表示される音声データファイル選択画面１００１の一例を示す図である。
【図９】図７のステップＳ９で表示される画像データ選択画面１００２の一例を示す図である。
【図１０】図７のステップＳ１３で表示される合成マーク選択画面１００３の一例を示す図である。
【図１１】図７のステップＳ１８で表示される音声スタンプ設定画面１００４の一例を示す図である。
【図１２】図７のステップＳ１８で表示される音声スタンプ設定画面１００５の一例を示す図である。
【図１３】図１の記憶装置１７内部のファイル構成を示す図である。
【図１４】図１３の音声スタンプ登録情報ファイル（２）１７５内部のデータ格納例を示す図である。
【図１５】図１のＣＰＵ１１により実行される音声スタンプ作成処理Ｂの動作を示すフローチャートである。
【図１６】図１５のステップＳ３２で表示される、音声データに施す加工の種類選択画面１００６の一例を示す図である。
【図１７】図１５のステップＳ３３で表示される画像エフェクト選択画面１００７の一例を示す図である。
【図１８】図１５のステップＳ３４で表示される加工度入力画面１００８の一例を示す図である。
【図１９】図１５のステップＳ３４で表示される加工度入力画面１００９の一例を示す図である。
【図２０】図１５のステップＳ３４で表示される加工度入力画面１０１０の一例を示す図である。
【図２１】図１５のステップＳ３４で表示される加工度入力画面１０１１の一例を示す図である。
【図２２】図１５のステップＳ３９で表示される音声スタンプ設定画面１０１２の一例を示す図である。
【図２３】図１のＣＰＵ１１により実行される加工度変更処理の動作を示すフローチャートである。
【図２４】図２３のステップＳ４６における画面切り替えの一例を示す図である。
【図２５】図２３のステップＳ４６における画面切り替えの一例を示す図である。
【図２６】図１の記憶装置１７内部のファイル構成を示す図である。
【図２７】図２６の音声解析スタンプ登録情報ファイル１７６内部のデータ格納例を示す図である。
【図２８】図１のＣＰＵ１１により実行される音声解析スタンプ作成処理の動作を示すフローチャートである。
【図２９】図２８のステップＳ７２で表示される音声データ解析結果表示画面１０１３の一例を示す図である。
【図３０】図２８のステップＳ７０で表示される解析合成画像と、解析結果との対応関係を表す解析合成画像一覧表示画面１０１４の一例を示す図である。
【図３１】図２８のステップＳ７４で表示される解析合成画像変更選択画面１０１５の一例を示す図である。
【符号の説明】
１　情報端末装置
１１　ＣＰＵ
１２　入力部
１３　ＲＡＭ
１４　伝送制御部
１５　ＶＲＡＭ
１６　表示部
１７　記憶装置
１７ａ　記録媒体
１８　音声処理部
１８ａ　スピーカ
１８ｂ　マイク
[0001]
[Technical field to which the invention belongs]
The present invention relates to an information terminal device that records audio data in association with image data, and a program.
[0002]
[Prior art]
In recent years, the digitization of music has progressed, and users increasingly handle audio data on personal computers, for example by downloading music via the Internet. Conventionally, audio stamps have been used as a function for managing audio data. An audio stamp associates audio data with image data showing a musical note, a speaker, or the like, so that the item can be visually identified as audio data on the personal computer screen and the audio data can be played back by a simple operation such as clicking the image data with a mouse.
[0003]
Normally, default image data is automatically associated with an audio stamp when audio is recorded or registered from a received e-mail, and the user can change this image data to other image data. The image data is changed by the user selecting arbitrary image data via the input unit, and a musical note indicating that the item is an audio stamp is composited onto the selected image data.
[0004]
[Problems to be solved by the invention]
However, although the image data of a conventional audio stamp can be changed, there was no function for processing existing audio data, for example by changing its sound quality, or for changing the image data according to the type and degree of processing. Consequently, the user could not process audio data as desired and create an audio stamp in which the type and degree of processing are visually identifiable.
[0005]
An object of the present invention is to create and display image data corresponding to the type and degree of processing applied to audio data, so that the user can easily process the audio data and can easily tell how the audio data has been processed.
[0006]
[Means for Solving the Problems]
In order to solve the above problems, the present invention has the following features. In the description of the means described below, a configuration corresponding to the embodiment is illustrated in parentheses. Reference numerals correspond to the reference numerals of the drawings described later.
[0007]
The invention according to claim 1 is
an information terminal device (for example, the information terminal device 1 in FIG. 1) that records audio data in association with image data, comprising:
image selection means (for example, the input unit 12 for executing step S10 in FIG. 7) for selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
processing type selection means (for example, the input unit 12 for executing step S13 in FIG. 7) for selecting a type of processing by selecting second image data corresponding to the type of processing to be applied to the audio data;
audio processing means (for example, the CPU 11 for executing step S14 in FIG. 7) for processing the newly recorded or previously recorded audio data according to the type of processing selected by the processing type selection means; and
composite image recording means (for example, the CPU 11 for causing the storage device 17 to execute steps S16 to S18 in FIG. 7) for recording image data obtained by combining the first image data and the second image data in association with the audio data processed by the audio processing means.
[0008]
According to the invention of claim 1, image data to be associated with newly recorded or previously recorded audio data is selected from a displayed list of image data, and the type of processing is selected by selecting second image data corresponding to a type of processing for the audio data. The audio data is then processed according to the selected type, and the processed audio data is recorded in association with image data obtained by combining the first image data and the second image data. The user can therefore process the audio data as desired and, because image data corresponding to the type of processing is displayed, can easily tell how the audio data has been processed.
[0009]
The invention according to claim 2 is
an information terminal device (for example, the information terminal device 1 in FIG. 1) that records audio data in association with image data, comprising:
first image selection means (for example, the input unit 12 for executing step S30 in FIG. 15) for selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
processing type selection means (for example, the input unit 12 for executing step S32 in FIG. 15) for selecting the type of processing to be applied to the audio data;
second image selection means (for example, the input unit 12 for executing step S33 in FIG. 15) for selecting second image data corresponding to the type of processing selected by the processing type selection means;
processing degree specifying means (for example, the input unit 12 for executing step S34 in FIG. 15) for specifying the degree of processing of the audio data;
audio processing means (for example, the CPU 11 for executing step S35 in FIG. 15) for processing the newly recorded or previously recorded audio data according to the type of processing selected by the processing type selection means and the degree of processing specified by the processing degree specifying means; and
composite image recording means (for example, the CPU 11 for causing the storage device 17 to execute steps S37 to S39 in FIG. 15) for recording image data obtained by combining the first image data and the second image data in association with the audio data processed by the audio processing means.
[0010]
According to the invention of claim 2, image data to be associated with newly recorded or previously recorded audio data is selected from a displayed list of image data, the type of processing for the audio data is selected, second image data corresponding to the selected type is selected, and the degree of processing is then specified. The audio data is processed according to the selected type and the specified degree, and the processed audio data is recorded in association with image data obtained by combining the first image data and the second image data. The user can therefore process the audio data as desired and, because image data corresponding to the type of processing is displayed, can easily tell how the audio data has been processed.
[0011]
The invention according to claim 4
In an information terminal device (for example, the information terminal device 1 in FIG. 1) that records audio data in association with image data,
First image selection means (for example, the input unit 12 for executing step S70 in FIG. 28) for selecting first image data to be associated with newly recorded or previously recorded audio data;
Voice analysis means (for example, the CPU 11 for executing step S71 in FIG. 28) for analyzing the newly recorded or previously recorded audio data;
Second image selection means (for example, the input unit 12 for executing steps S73 to S75 in FIG. 28) for selecting, as second image data, image data corresponding to the type of processing determined based on the analysis result of the voice analysis means, or new image data that is selected and designated;
Synthetic image recording means (for example, the CPU 11 for causing the storage device 17 to execute steps S76 to S78 in FIG. 28) for recording the newly recorded or previously recorded audio data in association with image data obtained by synthesizing the first image data and the second image data,
It is characterized by having.
[0012]
According to the fourth aspect of the present invention, first image data to be associated with newly recorded or previously recorded audio data is selected from a list of image data, and the audio data is analyzed. Image data corresponding to the type of processing determined based on the analysis result, or new image data that is selected and designated, is selected as second image data, and the audio data is recorded in association with image data obtained by combining the first image data and the second image data. Therefore, the contents of the audio data can be visually identified, and as a result, the user can grasp the processing contents of the audio data at a glance.
[0013]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0014]
[First Embodiment]
First, the configuration will be described.
FIG. 1 is a diagram showing the overall configuration of an information terminal device 1 according to the present embodiment. As shown in FIG. 1, the information terminal device 1 includes a CPU 11, an input unit 12, a RAM 13, a transmission control unit 14, a VRAM 15, a display unit 16, a storage device 17 having a recording medium 17a, and an audio processing unit 18 having a speaker 18a and a microphone 18b. The information terminal device 1 is a computer in which each unit except the recording medium 17a is connected by a bus 19.
[0015]
A CPU (Central Processing Unit) 11 reads various control programs stored in the recording medium 17a of the storage device 17, loads them into a work memory formed in the RAM 13, and centrally controls the operation of each unit according to those control programs. Further, the CPU 11 executes a voice stamp creation process A and the like, described later, in accordance with the control program loaded into the work memory in the RAM 13, stores the processing result in the work memory in the RAM 13, and causes the display unit 16 to display it. The processing result stored in the work memory is then saved to a predetermined storage destination in the storage device 17 or the recording medium 17a.
[0016]
That is, when the voice stamp creation process A is executed and there is a voice recording instruction, the CPU 11 starts recording and creates the audio data from which the voice stamp will be made. If there is no recording instruction, a list of previously recorded audio data is displayed, and the audio data selected from the list is taken as the audio data for creating the voice stamp. Next, a list of image data is displayed, and image data to be associated with the recorded or selected audio data is selected. If there is an instruction to process the audio data after the image data is selected, a list of composite marks indicating the type and degree of processing is displayed, and the processing corresponding to the selected composite mark is applied to the audio data. The selected image data and the composite mark are then combined, and a voice stamp is set and registered by associating the processed audio data with the combined image. If there is no instruction to process the audio data, a voice stamp is set and registered by associating the recorded or selected audio data with the selected image data.
[0017]
The input unit 12 includes a keyboard having character / alphanumeric input keys, cursor keys, various function keys, and the like, and a mouse that is a pointing device. An operation signal from the mouse is output to the CPU 11 as an input signal. Alternatively, the input unit 12 is configured by a touch panel that outputs position information input by touching a transparent sheet panel covering the display screen of the display unit 16 with a finger or a dedicated touch pen to the CPU 11 as an input signal.
[0018]
A RAM (Random Access Memory) 13 temporarily stores the system program, control programs, input or output data, parameters, and the like that are read from the storage device 17 and can be executed by the information terminal device 1 in the above-described various processes controlled by the CPU 11.
[0019]
The transmission control unit 14 includes a router, a TA (Terminal Adapter), and the like, and performs communication control with other terminals connected to the network via a dedicated line or a communication line such as an ISDN (Integrated Services Digital Network) line. The router is a device that connects to an external LAN when the information terminal device 1 constitutes a LAN. The TA is a device that converts the interface of existing equipment into an ISDN-compatible interface so that the equipment can communicate with external devices via an ISDN line.
[0020]
The VRAM 15 temporarily stores image data to be displayed on the display unit 16 in accordance with a display instruction from the CPU 11. The display unit 16 is configured by an LCD (Liquid Crystal Display), a CRT (Cathode Ray Tube), or the like, and, in accordance with display signals input from the CPU 11, displays on its display screen an audio data file selection screen 1001, an image data selection screen 1002, a composite mark selection screen 1003, a voice stamp setting screen 1004, and the like, which will be described later.
[0021]
The storage device 17 includes a recording medium 17a in which programs, data, and the like are stored in advance, and the recording medium 17a is configured by a magnetic or optical recording medium or a nonvolatile semiconductor memory. The recording medium 17a is either fixed to the storage device 17 or detachably mounted on it. The recording medium 17a stores the system program for the information terminal device 1, various processing programs that can be executed on that system program, such as the voice stamp creation process A, and data processed by these programs. Each of these processing programs is stored in the form of readable program code, and the CPU 11 sequentially executes operations according to the program code.
[0022]
In this embodiment, as shown in FIG. 2, the storage device 17 includes an audio data file 171, an image data file 172, an audio stamp registration information file (1) 173, and a composite mark file 174. Hereinafter, the data structure of each of these files will be described in detail with reference to the figures.
[0023]
FIG. 3 is a diagram showing an example of data storage in the audio data file 171, which stores recorded audio data. As shown in FIG. 3, the audio data file 171 includes a voice data name area that stores, as the “voice data name”, an identification code (for example, “yama.def”, “kawa.def”, “umi.def”, ...) uniquely assigned to specify the audio data, and a voice data area 171b that stores recorded or existing voice data (for example, “voice 1”, “voice 2”, “voice 3”, ...) as the “voice data”.
[0024]
FIG. 4 is a diagram illustrating an example of data storage in the image data file 172, which stores image data. As shown in FIG. 4, the image data file 172 includes an image data name area that stores, as the “image data name”, an identification code (for example, “speaker.ghi”, “house.ghi”, “maru.ghi”, ...) uniquely assigned to specify the image data, and an image data area 172b that stores registered or existing image data (for example, “image 1”, “image 2”, “image 3”, ...) as the “image data”.
[0025]
FIG. 5 is a diagram showing an example of data storage in the voice stamp registration information file (1) 173, which stores information necessary for voice stamp registration. As shown in FIG. 5, the voice stamp registration information file (1) 173 includes: a file name area that stores, as the “file name”, an identification code (for example, “onsei.abc”, “onsei1.abc”, “onsei2.abc”, ...) uniquely assigned to specify the voice stamp; an audio data name area that stores, as the “audio data name”, an identification code (for example, “yama.def”, “kawa.def”, “umi.def”, ...) uniquely assigned to identify the audio data registered in the voice stamp; an image data name area 173c that stores, as the “image data name”, an identification code (for example, “speaker.ghi”, “house.ghi”, “maru.ghi”, ...) uniquely assigned to specify the image data registered in the voice stamp; a processing type area that stores, as the “processing type”, character string data (for example, “volume”, “pitch”, “sound quality”, ...) indicating the type of processing applied to the audio data; a processing degree area 173e that stores, as the “processing degree”, numerical data (for example, “1”, “2”, “4”, ...) representing the degree of processing applied to the audio data; a registration date area that stores, as the “registration date”, date data (for example, “01.03.10”, “01.03.09”, “01.02.10”, ...) indicating the date when the voice stamp was registered; and a registration time area that stores, as the “registration time”, data (for example, “15:01”, “14:58”, “12:00”, ...) indicating the time when the voice stamp was registered. Note that the “processing type” and “processing degree” are blank when the audio data was not processed at the time the voice stamp was created.
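For illustration only, one record of the voice stamp registration information file (1) 173 might be modeled as follows. This is a minimal Python sketch: the class name, the field names, the use of None for a blank field, and the pairing of the example values into rows are all assumptions made for this example, and the description does not specify any storage format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceStampRecord:
    """Hypothetical model of one row of the voice stamp registration information file (1) 173."""
    file_name: str                     # e.g. "onsei.abc"
    audio_data_name: str               # e.g. "yama.def"
    image_data_name: str               # e.g. "speaker.ghi"
    processing_type: Optional[str]     # None stands in for a blank field
    processing_degree: Optional[int]   # None stands in for a blank field
    registration_date: str             # e.g. "01.03.10"
    registration_time: str             # e.g. "15:01"

    @property
    def is_processed(self) -> bool:
        # Blank processing fields mean the audio data was registered unprocessed.
        return self.processing_type is not None and self.processing_degree is not None


# The pairing of values into rows is illustrative and is not taken from FIG. 5.
processed = VoiceStampRecord(
    "onsei.abc", "yama.def", "speaker.ghi", "volume", 1, "01.03.10", "15:01")
unprocessed = VoiceStampRecord(
    "onsei1.abc", "kawa.def", "house.ghi", None, None, "01.03.09", "14:58")
```

In this sketch a reader of the file can tell processed and unprocessed voice stamps apart from the record alone, which mirrors the note that the processing fields are left blank for unprocessed audio data.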
[0026]
FIG. 6 is a diagram showing an example of data storage in the composite mark file 174, which stores composite marks associated with the type and degree of processing of audio data. As shown in FIG. 6, the composite mark file 174 includes a processing type area 174a that stores character string data (for example, “volume”, ...) representing the type of processing as the “processing type”, a processing degree area 174b that stores numerical data (for example, “1”, “2”, “3”, ...) representing the degree of processing as the “processing degree”, and a composite mark data area 174c that stores image data (for example, “note 1”, “note 2”, “note 3”, ...) as the “composite mark data”.
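The association held in the composite mark file 174 can be pictured as a lookup table keyed by processing type and processing degree. The following Python sketch is an assumption-laden illustration: the dictionary layout and the function names are hypothetical, and the row-by-row pairing of “volume”, the degrees 1 to 3, and “note 1” to “note 3” is inferred from the order of the examples rather than stated in the text.

```python
from typing import Dict, Optional, Tuple

# (processing type, processing degree) -> composite mark data.
# The pairing of degrees with notes is an assumption for illustration.
COMPOSITE_MARKS: Dict[Tuple[str, int], str] = {
    ("volume", 1): "note 1",
    ("volume", 2): "note 2",
    ("volume", 3): "note 3",
}


def mark_for(processing_type: str, degree: int) -> Optional[str]:
    """Return the composite mark for a processing type and degree, if one is stored."""
    return COMPOSITE_MARKS.get((processing_type, degree))


def processing_for(mark: str) -> Optional[Tuple[str, int]]:
    """Return the (type, degree) that a selected composite mark stands for."""
    for key, value in COMPOSITE_MARKS.items():
        if value == mark:
            return key
    return None
```

The reverse lookup matters for the first embodiment, where the user picks a composite mark and the device derives the processing to apply from it.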
[0027]
The audio processing unit 18 includes an analog / digital converter, an amplifier, and the like, and includes a speaker 18a and a microphone 18b. The audio processing unit 18 converts the audio data stored in the storage device 17 into an analog signal with a converter, and outputs the analog signal to the outside through the amplifier. In addition, the audio processing unit 18 converts an audio signal input from the microphone 18 b or the like into a digital signal by a converter, and stores the digital signal in the audio data file 171.
[0028]
Next, the operation will be described.
The voice stamp creation process A executed by the information terminal device 1 will be described with reference to the flowchart of FIG.
As a premise of the operation description, a program for realizing each function described in the following flowchart is stored in the recording medium 17a in the form of readable program code, and the CPU 11 sequentially executes operations according to the program code. In addition, the CPU 11 can sequentially execute operations in accordance with the above-described program code transmitted via a transmission medium. In other words, the operations unique to the present embodiment can also be executed using a program or data supplied externally via a transmission medium, in addition to the recording medium 17a.
[0029]
First, when a signal instructing creation of a voice stamp is input from the input unit 12 (step S1), the CPU 11 displays a selection screen for choosing whether to record new audio data or to use existing audio data (step S2). When recording is selected (step S3; Y), the CPU 11 starts recording via the audio processing unit 18 (step S4), and when recording ends (step S5; Y), creates an audio data file and recognizes that audio data file as the audio data for creating the voice stamp (step S6). On the other hand, when the use of existing audio data is selected (step S3; N), the CPU 11 displays a list of the audio data files 171 stored in the storage device 17 (step S7), and when audio data is selected, recognizes that audio data as the audio data for creating the voice stamp (step S8).
[0030]
FIG. 8 is a diagram showing an example of an audio data file selection screen 1001 showing a list of audio data files 171 displayed on the display unit 16 in step S7. As shown in FIG. 8, the voice data file selection screen 1001 displays an instruction message to the user such as “Please select voice data”, and a list of voice data files is displayed below. The user selects audio data by moving the cursor to a file name corresponding to a desired file from the audio data files displayed in a list and designating a selection button.
[0031]
Next, the CPU 11 displays the image data of the image data file 172 stored in the storage device 17 (step S9). When image data corresponding to the audio data is selected from the display list (step S10), the CPU 11 causes the display unit 16 to display a selection screen as to whether or not to process the audio data. When processing is selected (step S11; Y), the composite mark of the composite mark file 174 stored in the storage device 17 and the type of processing corresponding to the composite mark are displayed in a list (step S12).
[0032]
When the composite mark is selected (step S13), the CPU 11 performs processing corresponding to the composite mark on the audio data (step S14). When the processing of the audio data is completed, the CPU 11 combines the composite mark and the image data (step S15), associates the audio data with the combined image (step S16), and sets and registers the audio stamp. Then, the series of voice stamp creation processing A is completed (step S18).
[0033]
FIG. 9 shows an example of an image data selection screen 1002 showing an image list of the image data file 172 displayed on the display unit 16 in step S9. As shown in FIG. 9, the image data selection screen 1002 displays an instruction message to the user such as “Please select an audio stamp image”, and the content of the image data is displayed below. The user selects image data by moving the cursor to a desired image from the images displayed in a list and designating a selection button. The image data selection screen 1002 indicates that a speaker image such as image data (a) has been selected.
[0034]
FIG. 10 and FIG. 11 show examples of screens displayed on the display unit 16 while the voice data is processed and the voice stamp is set and registered in steps S12 to S18, when voice data processing is selected in step S11.
[0035]
FIG. 10 shows an example of a composite mark selection screen 1003 showing a list of the composite mark file 174 displayed on the display unit 16 in step S13. As shown in FIG. 10, the composite mark selection screen 1003 displays an instruction message to the user such as “Please select a sound effect image”, and below it, the composite marks corresponding to the types of processing of the audio data are displayed. The user selects the type of audio data processing by moving the cursor to the desired composite mark among the displayed composite marks. The composite mark selection screen 1003 shows a state in which the lightest-colored note, which is the composite mark corresponding to the type of processing that minimizes the volume, is selected.
[0036]
FIG. 11 is a diagram showing an example of the voice stamp setting screen 1004 displayed on the display unit 16 in step S18. The voice stamp setting screen 1004 displays a voice stamp obtained by combining the image selected in FIG. 9 (here, the speaker) and the composite mark selected in FIG. 10 (here, the lightest-colored note). When the registration button shown at the bottom of the screen is selected, the displayed voice stamp is registered (recorded).
[0037]
Note that the image indicating the type of processing of the audio data is a synthetic mark such as a note here, but may be a frame or the like. That is, a frame may be attached to the image data selected in FIG. 9, and the type of processing may be distinguished by the shape, pattern, color, etc. of the frame. In this case, it is of course possible to use a composite mark such as a note and a frame together.
[0038]
On the other hand, if there is an instruction not to process the audio data in step S11 (step S11; N), the CPU 11 associates the audio data with the image data (step S17), sets and registers the voice stamp, and ends the series of voice stamp creation process A (step S18).
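The two branches of the voice stamp creation process A after image selection (processing selected, or not) can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the function names, the dictionary used to represent a voice stamp, the tuple used to represent a combined image, and the `quieter` processing function with its 0.5 gain are all hypothetical and do not come from the description.

```python
from typing import Callable, Dict, List, Optional

Audio = List[float]  # audio samples, assumed here to be plain floats


def create_voice_stamp_a(
    audio: Audio,
    image_name: str,
    mark: Optional[str],
    process: Callable[[Audio, str], Audio],
) -> Dict[str, object]:
    """Sketch of steps S11 to S18: register a voice stamp with or without processing."""
    if mark is None:
        # Step S11; N -> step S17: associate the unprocessed audio with the image as is.
        return {"audio": audio, "image": (image_name,)}
    # Steps S13 and S14: the selected composite mark determines the processing.
    processed = process(audio, mark)
    # Steps S15 and S16: combine the image with the mark, then associate the audio with it.
    return {"audio": processed, "image": (image_name, mark)}


def quieter(audio: Audio, mark: str) -> Audio:
    # Hypothetical processing: every composite mark halves the volume.
    return [sample * 0.5 for sample in audio]


with_processing = create_voice_stamp_a([0.5, -1.0], "speaker.ghi", "note 1", quieter)
without_processing = create_voice_stamp_a([0.5, -1.0], "speaker.ghi", None, quieter)
```

Passing the processing function in as an argument keeps the branching logic separate from any particular audio effect, which the description leaves open.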
[0039]
FIG. 12 is a diagram showing an example of a voice stamp setting screen 1005 displayed on the display unit 16 in step S18 when an image of a speaker is selected in step S9 and no processing of voice data is selected in step S11. When the registration button shown at the bottom of the screen is selected, the displayed voice stamp is registered.
[0040]
As described above, according to the information terminal device 1, the audio data can be processed when a voice stamp is created and registered by associating audio data with desired image data. When processing is performed, the audio data is processed by selecting the composite mark corresponding to the type of audio data processing, and an image obtained by combining the composite mark and the image data is set as the voice stamp. As a result, the user can easily process the audio data, and the type of processing can be easily known from the image of the set voice stamp.
[0041]
The description of the first embodiment above is a preferred example of the information terminal device 1 according to the present invention, and the invention is not limited to it.
For example, in the above embodiment, the voice stamp creation steps continue directly after the audio data file 171 is created by recording in step S6, but the recorded audio data file 171 may instead be saved and the voice stamp created at a later date. It is also possible to determine the type of processing of the audio data in advance, record the audio data, and then automatically process the recorded audio data to create a voice stamp.
[0042]
Furthermore, in the above embodiment, the image data is combined immediately after the audio data is processed in step S14. However, the processed audio data may first be output, and if the output audio does not suit the user's preference, the process may return to the display of the composite mark list in step S12, and steps S12 to S14 may be repeated until audio processing that suits the user's preference is obtained.
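The repeat-until-satisfied variation described above could look like the following loop. This is only a sketch in Python: the function names, the gain table, and the acceptance callback are hypothetical, the user's successive choices are modeled as a finite list, and each attempt is assumed to be applied to the original audio data rather than to an earlier result, which the description does not specify.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Audio = List[float]

# Hypothetical gains per composite mark; not taken from the description.
GAINS: Dict[str, float] = {"note 1": 0.25, "note 2": 0.5}


def apply_gain(audio: Audio, mark: str) -> Audio:
    """Hypothetical processing: scale every sample by the gain tied to the mark."""
    return [sample * GAINS[mark] for sample in audio]


def choose_processing(
    audio: Audio,
    selections: Iterable[str],
    process: Callable[[Audio, str], Audio],
    is_acceptable: Callable[[Audio], bool],
) -> Tuple[Optional[str], Audio]:
    """Repeat steps S12 to S14 until the user accepts the processed audio."""
    for mark in selections:                # steps S12 and S13: list shown, mark selected
        candidate = process(audio, mark)   # step S14: processing applied to the original
        if is_acceptable(candidate):       # processed audio is output and judged
            return mark, candidate
    return None, audio                     # nothing accepted: audio stays unprocessed


chosen_mark, chosen_audio = choose_processing(
    [1.0, -1.0],
    ["note 2", "note 1"],
    apply_gain,
    lambda candidate: max(abs(s) for s in candidate) <= 0.3,
)
```

Only after a candidate is accepted would the image combination and registration steps run, so a rejected attempt leaves no trace in the registered voice stamp.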
[0043]
[Second Embodiment]
Hereinafter, as an application example of the first embodiment, an embodiment will be described in detail in which the type and degree of processing of audio data are specified, the audio data is processed based on that specification, and a change in the size, brightness, or the like of the image data (hereinafter referred to as an “image effect”) is applied in accordance with the processing of the audio data. Note that the configuration of the information terminal device 1 in the present embodiment is the same as that of the first embodiment described above; therefore, the same reference numerals are given to the respective components, and the illustration and description of the configuration are omitted.
[0044]
However, as shown in FIG. 13, the storage device 17 of the information terminal device 1 in this embodiment includes an audio data file 171, an image data file 172, and an audio stamp registration information file (2) 175. Of these, the audio stamp registration information file (2) 175 is a component unique to this embodiment and will be described in detail below.
[0045]
FIG. 14 is a diagram showing an example of data storage in the voice stamp registration information file (2) 175, which stores information necessary for voice stamp registration in the voice stamp creation process B described later. As shown in FIG. 14, the voice stamp registration information file (2) 175 includes a file name area 175a, an audio data name area 175b, an image data name area 175c, a processing type area 175d, a processing degree area 175e, image effect information 175f, a registration date area 175g, and a registration time area 175h.
[0046]
In the file name area 175a, an identification code (for example, “onsei.abc”, “onsei1.abc”, “onsei2.abc”,...) Uniquely assigned to specify the voice stamp is set as “file name”. Store. The voice data name area 175b has an identification code (for example, “yama.def”, “kawa.def”, “umi.def”) uniquely assigned to specify voice data registered in the voice stamp. ,...) Are stored as “voice data name”. The image data name area 175c has an identification code (for example, “speaker.ghi”, “house.ghi”, “maru.ghi”, uniquely assigned to identify the image data registered in the audio stamp). ...) are stored as “image data name”.
[0047]
The processing type area 175d stores character string data (for example, “volume”, “sound quality”, ...) representing the type of processing applied to the audio data as the “processing type”. The processing degree area 175e stores numerical data (for example, “1”, “4”, ...) representing the degree of processing applied to the audio data as the “processing degree”. The image effect information 175f stores, as the “image effect information”, symbol data (for example, “a”, “b”, ...) identifying how the image is to be changed according to the type and degree of processing of the audio data. Note that the “processing type”, “processing degree”, and “image effect information” are blank, as in the second record from the top, if the audio data was not processed when the voice stamp was created.
[0048]
The registration date area 175g stores date data (for example, “01.03.10”, “01.03.09”, “01.02.10”,...) Representing the date when the voice stamp was registered as “registration date”. The registration time area 175h stores data (eg, “15:01”, “14:58”, “12:00”,...) Indicating the time when the voice stamp is registered as “registration time”.
[0049]
Next, the operation will be described.
The voice stamp creation process B executed by the information terminal device 1 will be described with reference to the flowchart of FIG.
First, when a signal instructing creation of a voice stamp is input from the input unit 12 (step S21), the CPU 11 displays a selection screen for choosing whether to record new audio data or to use existing audio data (step S22). When recording is selected (step S23; Y), the CPU 11 starts recording via the audio processing unit 18 (step S24), and when the recording is finished (step S25; Y), creates an audio data file and recognizes that audio data file as the audio data for creating the voice stamp (step S26).
[0050]
On the other hand, when the use of the existing audio data is selected (step S23; N), the CPU 11 displays a list of audio data files 171 stored in the storage device 17 (step S27), and the audio data is selected. Then, the voice data is recognized as voice data for creating a voice stamp (step S28).
[0051]
Next, the CPU 11 displays the image data of the image data file 172 stored in the storage device 17 (step S29). When image data corresponding to the audio data is selected from the display list (step S30), the CPU 11 displays a selection screen as to whether or not to process the audio data. When processing is selected (step S31; Y), the CPU 11 displays a selection screen listing the types of processing.
[0052]
When the type of processing is selected (step S32), the CPU 11 displays the above-described image effect list screen on the display unit 16, and associates the selected image effect with the type of processing of the audio data (step S33).
[0053]
Next, the CPU 11 displays a screen for inputting the degree of processing on the display unit 16, and when the degree of processing is input (step S34), processes the audio data according to the input processing degree (step S35). When the audio data has been processed, the CPU 11 changes the image data according to the image effect selected in step S33 (step S36), associates the audio data with the changed image data (step S37), sets and registers the voice stamp, and ends the series of voice stamp creation process B (step S39).
[0054]
FIGS. 16 to 21 shown below are examples of display screens showing a process in which voice stamps are set and registered in steps S32 to S39 when voice data processing is selected in step S31.
[0055]
FIG. 16 is a diagram showing an example of a processing type selection screen 1006, displayed on the display unit 16 in step S32, for selecting the type of processing to be applied to the audio data. As shown in FIG. 16, the processing type selection screen 1006 displays an instruction message to the user such as “Please select the processing type of the audio data”, and the types of processing of the audio data are displayed below it. The user selects the type of audio data processing by placing the cursor on the type of processing to be applied to the audio data from among the types displayed in the list. The processing type selection screen 1006 shows a state in which volume change is selected.
[0056]
FIG. 17 is a diagram illustrating an example of an image effect selection screen 1007, displayed on the display unit 16 in step S33, for the type of audio data processing selected in step S32. As shown in FIG. 17, the image effect selection screen 1007 displays an instruction message to the user such as “Please select an image effect corresponding to the volume change”, and the changes in the image data produced by each image effect are displayed below it. The user selects an image effect corresponding to the type of processing of audio data by moving the cursor to a desired image effect from among the image effects displayed in the list. The image effect selection screen 1007 shows a state in which the volume change is associated with the image size.
[0057]
FIGS. 18 to 21 are examples of the processing degree input screen displayed on the display unit 16 in step S34. FIG. 18 is a diagram showing an example of the processing degree input screen 1008 displayed when the volume is selected as the type of processing in step S32 and a change in the image size of the image data is selected as the image effect in step S33. As shown in FIG. 18, the processing degree input screen 1008 displays an instruction message to the user such as “Please input the processing degree”. Below that, the image size of the image data selected in step S30 is displayed in association with the processing degree of the audio data, and below that, an area 1008a is provided in which the processing degree can be entered numerically. The user designates the degree of processing by inputting a numerical value in the area 1008a. When the enter button shown at the bottom of the screen is operated, the input processing degree is set.
[0058]
FIG. 19 is a diagram showing an example of the processing level input screen 1009 displayed when the volume is selected as the type of processing in step S32 and the size change of the composite mark is selected as the image effect in step S33. As shown in FIG. 19, the processing degree input screen 1009 displays an instruction message to the user such as “Please input the processing degree”. Below that, the size of the note, which is a composite mark, is displayed in association with the processing level of the voice data, and further below that, an area 1009a is provided in which the processing level can be entered numerically. The user designates the processing degree by inputting a numerical value in the area 1009a. When the operation of specifying the enter button designated at the bottom of the screen is performed, the input processing degree is set.
[0059]
FIG. 20 is a diagram illustrating an example of the processing degree input screen 1010 displayed when sound quality is selected as the type of processing in step S32 and a change in the brightness of the image data is selected as the image effect in step S33. As shown in FIG. 20, the processing degree input screen 1010 displays an instruction message to the user such as “Please input the processing degree”. Below that, the brightness of the image data selected in step S30 is displayed in association with the processing degree of the audio data, and below that, an area 1010a is provided in which the processing degree can be entered numerically. The user designates the degree of processing by inputting a numerical value in the area 1010a. When the enter button shown at the bottom of the screen is operated, the input processing degree is set.
[0060]
FIG. 21 is a diagram showing an example of the processing degree input screen 1011 displayed when the pitch of the sound is selected as the type of processing in step S32 and a change in the brightness of the composite mark is selected as the image effect in step S33. As shown in FIG. 21, the processing degree input screen 1011 displays an instruction message to the user such as “Please input the processing degree”. Below that, the brightness of the note, which is a composite mark, is displayed in association with the processing degree of the audio data, and below that, an area 1011a is provided in which the processing degree can be entered numerically. The user designates the degree of processing by inputting a numerical value in the area 1011a. When the enter button shown at the bottom of the screen is operated, the input processing degree is set.
[0061]
FIG. 22 is a diagram showing an example of the voice stamp setting screen 1012 displayed on the display unit 16 in step S39. The voice stamp setting screen 1012 displays a voice stamp created by applying, to the selected image (here, the speaker), the image effect (here, the image size) corresponding to the type of processing selected in FIG. 16 (here, volume) and the degree of processing input in FIG. 18. When the “Register” operation shown at the bottom of the screen is performed, the displayed voice stamp is registered (recorded).
[0062]
On the other hand, when there is an instruction not to process the audio data in step S31 (step S31; N), the CPU 11 associates the audio data with the image data (step S38), sets and registers the audio stamp, and a series of audio stamps. The creation process B is terminated (step S39).
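As a rough illustration of how the voice stamp creation process B ties the processing degree to both the audio data and the image effect, consider the Python sketch below. Everything concrete in it is an assumption: the `Image` class, the effect names “size” and “brightness”, and in particular the rule that a degree maps to a factor of 0.5 times the degree are invented for this example, since the description gives no numeric mapping.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for image data from the image data file 172."""
    name: str
    width: int
    height: int
    brightness: float = 1.0


def factor_for(degree: int) -> float:
    # Assumed mapping from processing degree to a scale factor (degree 2 = unchanged).
    return 0.5 * degree


def process_audio(samples: List[float], processing_type: str, degree: int) -> List[float]:
    """Step S35 (sketch): only volume change is modeled here."""
    if processing_type == "volume":
        return [sample * factor_for(degree) for sample in samples]
    raise ValueError(f"unsupported processing type: {processing_type}")


def apply_image_effect(image: Image, effect: str, degree: int) -> Image:
    """Step S36 (sketch): change the image according to the selected image effect."""
    factor = factor_for(degree)
    if effect == "size":
        return replace(image, width=round(image.width * factor),
                       height=round(image.height * factor))
    if effect == "brightness":
        return replace(image, brightness=image.brightness * factor)
    raise ValueError(f"unsupported image effect: {effect}")


speaker = Image("speaker.ghi", 64, 64)
louder = process_audio([0.25, -0.5], "volume", 4)
bigger = apply_image_effect(speaker, "size", 4)
```

Driving both the audio change and the image change from the same degree value is what lets the registered image reflect how strongly the audio data was processed.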
[0063]
FIG. 12 is a diagram illustrating the voice stamp setting screen 1005 displayed on the display unit 16 in step S39 when an image of a speaker is selected in step S29 and no processing of the audio data is selected in step S31. When the “Register” operation shown at the bottom of the screen is performed, the displayed voice stamp, with the file name “onsei1.abc”, is registered (recorded).
[0064]
As described above, according to the information terminal device 1, the audio data can be processed when a voice stamp is created and registered by associating audio data with desired image data. When processing is performed, the user specifies the type and degree of processing and selects a desired image effect; the audio data is then processed, and the image effect corresponding to the type and degree of processing is applied to the image data, which is set as the voice stamp. As a result, the user can easily process the audio data, and the processing contents can be easily known from the image of the set voice stamp.
[0065]
The description of the second embodiment above is a preferred example of the information terminal device 1 according to the present invention, and the invention is not limited to it.
For example, in the above embodiment, the voice stamp file is created immediately after the audio data file 171 recorded in step S26 is created. However, the recorded audio data file 171 may instead be saved and the voice stamp created at a later date. It is also possible to determine the type and degree of processing of the audio data in advance, then record the audio data, and automatically process the recorded audio data to create a voice stamp.
[0066]
Further, in the above embodiment the image data is synthesized immediately after the audio data is processed in step S35. However, the processed audio data may first be output, and if the output sound does not suit the user's preference, the process may return to the image effect selection of step S33 and repeat steps S33 to S35 until the audio is processed to the user's liking.
[0067]
[Third Embodiment]
Hereinafter, as an application example of each of the above embodiments, an embodiment that switches the processing degree of a voice stamp while a registered voice stamp is displayed will be described in detail. The configuration of the information terminal device 1 in the present embodiment is the same as that of the first embodiment described above; the same reference numerals are therefore given to the respective components, and illustration and description of the configuration are omitted.
[0068]
The operation will be described below.
The processing degree changing process executed by the information terminal device 1 will be described with reference to the flowchart of FIG. The processing degree changing process can be applied to a voice stamp registered in either the first embodiment or the second embodiment. Here, the operation of switching the processing degree for a voice stamp registered in the first embodiment will be described.
[0069]
First, when a signal instructing display of a specific voice stamp is input from the input unit 12, the CPU 11 displays the voice stamp on the display unit 16 (step S41). When an instruction to change the processing degree of the voice stamp is input (step S42; Y), the CPU 11 determines whether or not the voice stamp has been processed. If it determines that the voice stamp has been processed (step S43; Y), it reads the information of the voice stamp from the voice stamp registration information file (1) 173 (see FIG. 2) of the storage device 17 (step S44). Then, each time an instruction to change the processing degree is given to the voice stamp, the CPU 11 outputs the sound (step S45). The instruction to change the processing degree is given, for example, by placing the mouse pointer on the voice stamp and clicking once.
[0070]
Then, the CPU 11 switches the displayed voice stamp image according to the processing degree of the output audio data (step S46). When the changing of the processing degree is finished (step S47; N), the CPU 11 confirms whether or not the voice stamp is to be registered. When there is a registration instruction (step S48; Y), the voice stamp is newly registered, and the series of processing degree changing process ends (step S49).
[0071]
FIGS. 24 and 25 illustrate how the processing degree of the audio data is changed by an operation such as a mouse click in step S45, and how the image data is switched in accordance with the change of the processing degree in step S46. FIG. 24 shows the switching of the processing degree for a voice stamp set and registered in the first embodiment, in which the image data is combined with the composite mark corresponding to the type of processing of the audio data. FIG. 25 shows the switching of the processing degree for a voice stamp set and registered in the second embodiment, in which the image effect corresponding to the type of processing of the audio data is applied to the image data.
[0072]
On the other hand, if no instruction to change the processing degree of the voice stamp is given in step S42 (step S42; N), the audio data is output (step S50). If the audio data has not been processed (step S43; N), steps S12 to S18 of the voice stamp creation process A (see FIG. 7) described above are performed to create a voice stamp in which the audio data is processed.
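The click-driven switching of steps S43 to S46 can be sketched as follows. This is only an illustration under stated assumptions: the `StampState` type, the `on_click` function, the three available degrees, the wrap-around from the last degree back to the first, and the returned image label format are all hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, replace

DEGREES = (1, 2, 3)  # assumption: three selectable processing degrees


@dataclass(frozen=True)
class StampState:
    name: str        # file name of the registered voice stamp
    processed: bool  # result of the check in step S43
    degree: int      # current processing degree


def on_click(stamp: StampState, degrees=DEGREES):
    """Advance to the next processing degree and pick the image to display.

    Returns the updated stamp and a label standing in for the switched
    image of step S46. Wrapping around after the last degree is an assumption.
    """
    if not stamp.processed:
        # Step S43; N: unprocessed stamps go through creation process A instead.
        raise ValueError("voice stamp has not been processed")
    index = degrees.index(stamp.degree)
    next_degree = degrees[(index + 1) % len(degrees)]
    return replace(stamp, degree=next_degree), f"{stamp.name}@degree{next_degree}"
```

Under these assumptions, each click on a processed stamp moves it one degree forward, and a caller would register the final state only after a registration instruction (step S48; Y).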
[0073]
As described above, according to the information terminal device 1, the processing degree of the audio data registered in a voice stamp is switched and the audio is output by a simple operation such as clicking the image of the registered voice stamp, and the image of the voice stamp is changed and displayed in accordance with the output. When a registration instruction is given, the voice stamp is registered. As a result, the user can easily switch the processing degree of the audio data registered in the voice stamp.
[0074]
The description of each of the above embodiments is a preferred example of the information terminal device according to the present invention, and the invention is not limited to it.
For example, in the above embodiment, the processing degree is switched for the audio data of a voice stamp registered in the first embodiment. For a voice stamp registered in the second embodiment, if the CPU 11 determines in step S44 that the voice stamp has not been processed, steps S32 to S39 of the voice stamp creation process B (see FIG. 15) may be performed.
[0075]
[Fourth Embodiment]
As an application example of each of the embodiments described above, an embodiment in which audio data is analyzed, the type of processing is determined from the analysis result, and image data is synthesized in accordance with the analysis result and the type of processing will be described in detail. The configuration of the information terminal device 1 in the present embodiment is the same as that of the first embodiment described above; the same reference numerals are therefore given to the respective components, and illustration and description of the configuration are omitted.
[0076]
However, in the present embodiment the voice processing unit 18 of the information terminal device 1 also analyzes the audio data input through the microphone 18b or the like. In addition, as shown in FIG. 26, the audio data file 171, the image data file 172, and the voice analysis stamp registration information file 176 are formed in the storage device 17. Since the voice analysis stamp registration information file 176 is a component peculiar to the present embodiment, it is described in detail below.
[0077]
As shown in FIG. 27, the voice analysis stamp registration information file 176 includes a file name area 176a, a voice data name area 176b, an image data name area 176c, an analysis result area 176d, a registration date area 176e, and a registration time area 176f.
[0078]
The file name area 176a stores, as the “file name”, an identification code (for example, “onsei.abc”, “onsei1.abc”, “onsei2.abc”, ...) uniquely assigned to specify the voice analysis stamp. The voice data name area 176b stores, as the “voice data name”, an identification code (for example, “yama.def”, “kawa.def”, “umi.def”, ...) uniquely assigned to specify the audio data registered in the voice stamp. The image data name area 176c stores, as the “image data name”, an identification code (for example, “speaker.ghi”, “house.ghi”, “maru.ghi”, ...) uniquely assigned to specify the image data registered in the voice stamp.
[0079]
The analysis result area 176d stores, as the “analysis result”, character string data (for example, “human voice”, “instrument”, ...) representing the result of analyzing the audio data. The registration date area 176e stores, as the “registration date”, data (for example, “01.03.10”, “01.03.09”, “01.02.10”, ...) indicating the date on which the voice stamp was registered. The registration time area 176f stores, as the “registration time”, data (for example, “15:01”, “14:58”, “12:00”, ...) indicating the time at which the voice stamp was registered.
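The six areas of the voice analysis stamp registration information file 176 can be pictured as one record per registered stamp. The following Python sketch is an illustration only: the class and function names are hypothetical, reading the date examples as year.month.day is an assumption, and combining the first example value of each area into a single record is likewise assumed rather than stated in the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class VoiceAnalysisStampRecord:
    file_name: str          # area 176a, e.g. "onsei.abc"
    voice_data_name: str    # area 176b, e.g. "yama.def"
    image_data_name: str    # area 176c, e.g. "speaker.ghi"
    analysis_result: str    # area 176d, e.g. "human voice"
    registration_date: str  # area 176e, assumed to be YY.MM.DD
    registration_time: str  # area 176f, HH:MM


def make_record(file_name: str, voice_data_name: str, image_data_name: str,
                analysis_result: str, registered_at: datetime):
    """Format the registration moment into the date and time areas."""
    return VoiceAnalysisStampRecord(
        file_name,
        voice_data_name,
        image_data_name,
        analysis_result,
        registered_at.strftime("%y.%m.%d"),
        registered_at.strftime("%H:%M"),
    )
```

Keeping the date and time as preformatted strings mirrors the example values in the text; an implementation could equally store a single timestamp and format it only for display.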
[0080]
The operation will be described below.
The voice analysis stamp creation process executed by the information terminal device 1 will be described with reference to the flowchart of FIG.
[0081]
First, when a signal instructing creation of a voice analysis stamp is input from the input unit 12 (step S61), the CPU 11 displays a screen for selecting whether to record new audio data or to use existing audio data (step S62). When recording is selected (step S63; Y), the CPU 11 starts recording by the voice processing unit 18 (step S64), and when recording ends (step S65; Y), creates an audio data file and recognizes that audio data file as the audio data for creating the voice analysis stamp (step S66). On the other hand, when the use of existing audio data is selected (step S63; N), the CPU 11 displays a list of the audio data files 171 stored in the storage device 17 (step S67), and when audio data is selected, recognizes that audio data as the audio data for creating the voice analysis stamp (step S68).
[0082]
Next, the CPU 11 displays a list of the image data of the image data file 172 stored in the storage device 17 (step S69). When image data to be associated with the audio data is selected from the displayed list (step S70), the CPU 11 analyzes the audio data (step S71) and displays the analysis result on the display unit 16 (step S72).
[0083]
FIG. 29 is an example of the voice data analysis result display screen 1013, which shows the analysis result of the audio data on the display unit 16 in step S72. The screen displays the heading “Analysis of voice data”, and below it the analysis result is presented to the user in text such as “The voice data was analyzed, and it includes a human voice”, together with the analysis composite image corresponding to the analysis result.
[0084]
FIG. 30 is a diagram illustrating an example of an analysis / synthesis image list display screen 1014 displaying a list of correspondence relationships between analysis / synthesis images and analysis results. The user can freely switch and display the analysis / synthesis image on the analysis / synthesis image list display screen 1014 by a predetermined input operation, and can confirm the correspondence between the image and the analysis result.
[0085]
Returning to FIG. 28, after the analysis result is displayed, the CPU 11 causes the display unit 16 to display a screen for selecting whether or not to choose an analysis / synthesis image. When the user gives an instruction to choose an analysis / synthesis image (step S73; Y), the CPU 11 displays the analysis / synthesis image change selection screen 1015 shown in FIG. 31 (step S74). Next, the user selects a desired analysis / synthesis image corresponding to the analysis result from the analysis / synthesis images on the analysis / synthesis image change selection screen 1015 (step S75). Then, the CPU 11 combines the image data selected in step S70 with the analysis / synthesis image selected in step S75 according to the analysis result (step S76), associates the audio data file with the combined image (step S77), and sets and registers the result as a voice analysis stamp (step S78).
[0086]
FIG. 31 is a diagram showing an example of the analysis / synthesis image change selection screen 1015, which shows the candidate analysis / synthesis images displayed on the display unit 16 in step S74. As shown in FIG. 31, the analysis / synthesis image change selection screen 1015 displays an instruction message to the user such as “Please select an image”, and a list of candidate analysis / synthesis images is displayed below it. The user selects image data by moving the cursor to a desired image among the listed candidates and operating the selection button.
[0087]
On the other hand, when the user gives an instruction not to choose an analysis / synthesis image in step S73 (step S73; N), the CPU 11 automatically selects the analysis / synthesis image corresponding to the analysis result of step S71, based on the correspondence between analysis / synthesis images and analysis results, and combines that analysis / synthesis image with the image data selected in step S70 (step S76). Then, the CPU 11 associates the audio data file with the combined image (step S77) and sets and registers the result as a voice analysis stamp (step S78).
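The branch at step S73 and the steps that follow it can be sketched in Python. This is a simplified illustration, not the patented implementation: the correspondence table `ANALYSIS_IMAGES`, its image file names, and both function names are hypothetical, and the image combination of step S76 is represented by a plain tuple rather than real image compositing.

```python
# Hypothetical correspondence between analysis results and analysis images.
ANALYSIS_IMAGES = {
    "human voice": "face.ghi",
    "instrument": "guitar.ghi",
}


def choose_analysis_image(analysis_result, user_choice=None, table=ANALYSIS_IMAGES):
    """Return the user's pick (step S73; Y) or the table default (step S73; N)."""
    if user_choice is not None:
        return user_choice         # steps S74 and S75
    return table[analysis_result]  # automatic selection


def create_voice_analysis_stamp(voice_data, image_data, analysis_result,
                                user_choice=None):
    """Combine the images (S76) and associate them with the audio (S77)."""
    analysis_image = choose_analysis_image(analysis_result, user_choice)
    composite = (image_data, analysis_image)  # stand-in for image compositing
    return {
        "voice_data": voice_data,
        "composite_image": composite,
        "analysis_result": analysis_result,
    }
```

With these assumptions, calling `create_voice_analysis_stamp("yama.def", "speaker.ghi", "human voice")` takes the automatic path, while passing `user_choice` models the manual selection on screen 1015.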
[0088]
As described above, the information terminal device 1 analyzes the content of the audio data, combines the analysis composite image corresponding to the analysis result with the image data, associates the combined image with the audio data, and sets and registers (records) the result as a voice analysis stamp. As a result, the user can easily grasp the contents of the audio data.
[0089]
The description of each of the above embodiments is a preferred example of the information terminal device 1 according to the present invention, and the invention is not limited to it.
For example, the switching of the processing degree described in the third embodiment can also be applied to a voice analysis stamp registered according to the above embodiment.
[0090]
Of the first to fourth embodiments described above, each embodiment other than the third can be realized as an independent function. However, by providing the embodiments as functions within a single application and letting the user select among them, the variety of processing increases and the user interface can be improved.
In addition, the detailed configuration and detailed operation of the information terminal device 1 can be changed as appropriate without departing from the spirit of the present invention.
[0091]
[Effects of the Invention]
According to the first aspect of the present invention, image data to be associated with newly recorded or previously recorded audio data is selected from the displayed image data, and the type of processing is selected by selecting second image data corresponding to a type of processing for the audio data. The audio data is then processed according to the selected type of processing, and the processed audio data is recorded in association with image data obtained by combining the first image data and the second image data. Therefore, the user can process the audio data as desired, and because image data corresponding to the type of processing is displayed, the processing contents of the audio data can be easily known.
[0092]
According to the second aspect of the present invention, image data to be associated with newly recorded or previously recorded audio data is selected from the displayed image data, the type of processing of the audio data is selected, second image data corresponding to the selected type of processing is selected, and the processing degree of the audio data is then designated. The audio data is processed according to the selected type of processing and the designated processing degree, and the processed audio data is recorded in association with image data obtained by combining the first image data and the second image data. Therefore, the user can process the audio data as desired, and because the image data is changed and displayed according to not only the type of processing but also the processing degree, the processing contents of the audio data can be known even more easily.
[0093]
According to the third aspect of the present invention, the image data associated with the recorded audio data is displayed, and image data whose processing degree is to be switched is selected from the displayed image data. Each time the image data is selected, the processing degree of the audio data corresponding to the image data is switched and the audio is output, and the image data is switched in accordance with the changed processing degree. Therefore, the processing degree of the audio data can be switched by a simple operation on the image data associated with already recorded audio data, and the user can easily process the audio data as desired.
[0094]
According to the fourth aspect of the present invention, image data to be associated with newly recorded or previously recorded audio data is selected from the listed image data, the audio data is analyzed, and either image data corresponding to the type of processing determined from the analysis result or newly selected and designated image data is selected as second image data. The audio data is then recorded in association with image data obtained by combining the first image data and the second image data. Therefore, the contents of the audio data can be visually identified, and as a result, the user can grasp the processing contents of the audio data at a glance.
[0095]
According to the fifth aspect of the present invention, the function described in claim 1 can be realized by causing a computer to read a program that selects, from image data displayed in a list, image data to be associated with newly recorded or previously recorded audio data, selects the type of processing by selecting second image data corresponding to a type of processing for the audio data, processes the audio data according to the selected type of processing, and records the processed audio data in association with image data obtained by combining the first image data and the second image data. Therefore, it becomes easy to sell and distribute the program as a standalone software product independent of the system. In addition, by executing the program using hardware resources such as a general-purpose computer, the technique of the present invention can be easily implemented on that hardware.
[0096]
According to the sixth aspect of the present invention, the function described in claim 2 can be realized by causing a computer to read a program that selects, from the displayed image data, image data to be associated with newly recorded or previously recorded audio data, selects the type of processing of the audio data and second image data corresponding to the selected type of processing, designates the processing degree of the audio data, processes the audio data according to the type of processing and the processing degree, and records the processed audio data in association with image data obtained by combining the first image data and the second image data. Therefore, it becomes easy to sell and distribute the program as a standalone software product independent of the system. In addition, by executing the program using hardware resources such as a general-purpose computer, the technique of the present invention can be easily implemented on that hardware.
[0097]
According to the seventh aspect of the present invention, the function described in claim 2 can be realized by causing a computer to read a program that selects, from the displayed image data, image data to be associated with newly recorded or previously recorded audio data, analyzes the audio data, selects as second image data either image data corresponding to the type of processing determined from the analysis result or newly selected and designated image data, and records the audio data in association with image data obtained by combining the first image data and the second image data. Therefore, it becomes easy to sell and distribute the program as a standalone software product independent of the system. In addition, by executing the program using hardware resources such as a general-purpose computer, the technique of the present invention can be easily implemented on that hardware.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration of an information terminal device 1 according to the present invention.
FIG. 2 is a diagram showing a file structure inside the storage device 17 of FIG. 1.
FIG. 3 is a diagram showing an example of data storage inside the audio data file 171 of FIG. 2.
FIG. 4 is a diagram showing an example of data storage inside the image data file 172 of FIG. 2.
FIG. 5 is a diagram showing an example of data storage in the voice stamp registration information file (1) 173 in FIG. 2.
FIG. 6 is a diagram showing an example of data storage inside a composite mark file 174 in FIG. 2.
FIG. 7 is a flowchart showing an operation of voice stamp creation processing A executed by CPU 11 of FIG.
FIG. 8 is a diagram showing an example of an audio data file selection screen 1001 displayed in step S7 of FIG.
FIG. 9 is a diagram showing an example of an image data selection screen 1002 displayed in step S9 of FIG.
FIG. 10 is a diagram showing an example of a composite mark selection screen 1003 displayed in step S13 of FIG.
FIG. 11 is a diagram showing an example of an audio stamp setting screen 1004 displayed in step S18 of FIG.
FIG. 12 is a diagram showing an example of an audio stamp setting screen 1005 displayed in step S39 of FIG.
FIG. 13 is a diagram showing a file configuration inside the storage device 17 of FIG. 1.
FIG. 14 is a diagram showing an example of data storage in the voice stamp registration information file (2) 175 of FIG.
FIG. 15 is a flowchart showing the operation of a voice stamp creation process B executed by the CPU 11 of FIG.
FIG. 16 is a diagram showing an example of a processing type selection screen 1006 to be applied to audio data, which is displayed in step S32 in FIG.
FIG. 17 is a diagram showing an example of an image effect selection screen 1007 displayed in step S33 of FIG.
FIG. 18 is a diagram showing an example of a processing degree input screen 1008 displayed in step S34 of FIG.
FIG. 19 is a diagram showing an example of a processing degree input screen 1009 displayed in step S34 of FIG.
FIG. 20 is a diagram showing an example of a processing degree input screen 1010 displayed in step S34 of FIG.
FIG. 21 is a diagram showing an example of a processing degree input screen 1011 displayed in step S34 of FIG.
FIG. 22 is a diagram showing an example of a voice stamp setting screen 1012 displayed in step S39 of FIG.
FIG. 23 is a flowchart showing the processing degree change process executed by the CPU 11 of FIG. 1.
FIG. 24 is a diagram showing an example of screen switching in step S46 of FIG.
FIG. 25 is a diagram showing an example of screen switching in step S46 of FIG.
FIG. 26 is a diagram showing a file structure inside the storage device 17 of FIG. 1.
FIG. 27 is a diagram showing an example of data storage inside the voice analysis stamp registration information file 176 of FIG. 26.
FIG. 28 is a flowchart showing an operation of a voice analysis stamp creation process executed by the CPU 11 of FIG.
FIG. 29 is a diagram showing an example of a voice data analysis result display screen 1013 displayed in step S70 of FIG.
FIG. 30 is a diagram showing an example of an analysis / synthesis image list display screen 1014 representing a correspondence relationship between an analysis / synthesis image displayed in step S70 of FIG. 28 and an analysis result.
FIG. 31 is a diagram showing an example of the analysis / synthesis image change selection screen 1015 displayed in step S74 of FIG.
[Explanation of symbols]
1 Information terminal device
11 CPU
12 Input unit
13 RAM
14 Transmission control unit
15 VRAM
16 Display unit
17 Storage device
17a Recording medium
18 Voice processing unit
18a Speaker
18b Microphone

Claims

An information terminal device that records audio data in association with image data, comprising:
image selection means for selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
processing type selection means for selecting a type of processing by selecting second image data corresponding to a type of processing for processing the audio data;
voice processing means for processing the newly recorded audio data or the previously recorded audio data according to the type of processing selected by the processing type selection means; and
composite image recording means for recording the audio data processed by the voice processing means in association with image data obtained by combining the first image data and the second image data.

An information terminal device that records audio data in association with image data, comprising:
first image selection means for selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
processing type selection means for selecting a type of processing of the audio data;
second image selection means for selecting second image data corresponding to the type of processing selected by the processing type selection means;
processing degree specifying means for specifying a processing degree of the audio data;
voice processing means for processing the newly recorded audio data or the previously recorded audio data according to the type of processing selected by the processing type selection means and the processing degree specified by the processing degree specifying means; and
composite image recording means for recording the audio data processed by the voice processing means in association with image data obtained by combining the first image data and the second image data.

The information terminal device according to claim 1, further comprising:
composite image display means for displaying the image data recorded by the composite image recording means;
composite image selection means for selecting the image data displayed by the composite image display means;
processing degree change output means for changing the processing degree of the audio data processed by the voice processing means and outputting the audio data each time the image data is selected by the composite image selection means; and
image switching means for switching the image data in accordance with the processing degree changed by the processing degree change output means.

An information terminal device that records audio data in association with image data, comprising:
first image selection means for selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
voice analysis means for analyzing the newly recorded audio data or the previously recorded audio data;
second image selection means for selecting, as second image data, image data corresponding to the type of processing determined based on the analysis result of the voice analysis means, or newly selected and designated image data; and
composite image recording means for recording the newly recorded audio data or the previously recorded audio data in association with image data obtained by combining the first image data and the second image data.

A program for causing a computer to realize:
a function of selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
a function of selecting a type of processing by selecting second image data corresponding to a type of processing for processing the audio data;
a function of processing the newly recorded audio data or the previously recorded audio data according to the selected type of processing; and
a function of recording the processed audio data in association with image data obtained by combining the first image data and the second image data.

A program for causing a computer to realize:
a function of selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
a function of selecting a type of processing of the audio data;
a function of selecting second image data corresponding to the selected type of processing;
a function of designating a processing degree of the audio data;
a function of processing the newly recorded audio data or the previously recorded audio data according to the selected type of processing and the designated processing degree; and
a function of recording the processed audio data in association with image data obtained by combining the first image data and the second image data.

A program for causing a computer to realize:
a function of selecting first image data to be associated with newly recorded audio data or previously recorded audio data;
a function of analyzing the newly recorded audio data or the previously recorded audio data;
a function of selecting, as second image data, image data corresponding to the type of processing determined based on the analysis result, or newly selected and designated image data; and
a function of recording the newly recorded audio data or the previously recorded audio data in association with image data obtained by combining the first image data and the second image data.