JP6146703B2

JP6146703B2 - Voice memo storage method related to schedule

Info

Publication number: JP6146703B2
Application number: JP2016132263A
Authority: JP
Inventors: 盛輝山崎; 齋藤　隆; 隆齋藤
Original assignee: 株式会社ナカヨ
Priority date: 2016-07-04
Filing date: 2016-07-04
Publication date: 2017-06-14
Anticipated expiration: 2033-01-08
Also published as: JP2016195428A

Description

The present invention relates to a voice recording method, and a method for registering recorded voice, in devices having a voice recording function, such as information devices with a voice recorder function and telephone devices with a call recording function.

There is a known technique that records a specific part of a conversation by detecting specific keywords uttered during the conversation (see, for example, Patent Document 1). In a call center, for instance, when the voice recognition unit detects a start keyword for a specific product uttered by a salesperson during a conversation, it causes a partial-recording recorder to start recording. Once the customer has understood the explanation, the salesperson utters the end keyword for that product, which stops the recorder. In this way only the essential part of the conversation, out of everything from its start to its end, is efficiently recorded and saved.

However, the purpose of this technique is to keep a conversation history in preparation for later disputes arising from explanations given to customers, or questions and answers exchanged with them, in over-the-counter sales at financial institutions and the like or at call centers that take calls from customers. It does not make effective use of the recorded data itself in ordinary work.

Patent Document 1: JP 2006-107044 A

An object of the present invention is to provide a schedule-related voice memo registration method for a device that has a voice recording function and a function for managing action schedules and treatment schedules, such as a schedule or memorandum. Rather than merely preserving a specific part of a conversation, the method detects specific terms relating to action schedules and treatment schedules in the conversation, so that a partial recording of the conversation concerning an action schedule or treatment schedule promised during the conversation can easily be registered as a voice memo.

To solve the above problem, the present invention is a method, in a device comprising voice recording means for recording voice information input via a microphone or a network, of storing a part of the input voice information as a voice memo. The method comprises: an activation keyword detection step of analyzing the input voice information and detecting any of one or more predetermined activation keywords relating to the start of the voice memo; a date/time and location information extraction step of analyzing the input voice information and extracting information relating to a date and time and/or a location; an end keyword detection step of analyzing the input voice information and detecting any of one or more predetermined end keywords relating to a conclusion or the end of a matter; and a voice memo storage step of storing the voice memo in information storage means that the device comprises or that is connected to the device. In a state in which a mode for automatically storing the voice memo is set, recording of the voice information is started when the activation keyword detection step detects any of the activation keywords, and recording of the voice information is stopped when the end keyword detection step detects any of the end keywords or when silence lasting a certain time or longer is detected. If the date/time and location information extraction step has extracted both information relating to a date and time and information relating to a location, the recorded voice information is stored as a voice memo relating to an action schedule, in association with the latest extracted date/time information and the latest extracted location information. If the date/time and location information extraction step has extracted information relating to a date and time but no information relating to a location, the recorded voice information is stored as a voice memo relating to a treatment schedule, in association with the latest extracted date/time information.

According to the present invention, a specific part of a conversation relating to an action schedule or treatment schedule promised during the conversation is automatically extracted as a voice memo, and the voice memo is easily classified and stored as either an action schedule or a treatment schedule. This has the advantage of improving the convenience of devices having a voice recorder function or a call recording function.

FIG. 1 is a block diagram of a mobile telephone terminal 1 according to an embodiment of the present invention.
FIG. 2 is a diagram schematically showing the data managed by an information management unit 19 according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the operation of a voice recording device 1 according to an embodiment of the present invention.

An embodiment of the present invention is described below, taking as an example a mobile telephone having a call recording function or a voice recorder function (hereinafter, "the apparatus").

FIG. 1 is a block diagram of the apparatus 1. In FIG. 1, the apparatus 1 comprises a terminal control unit 10, a speaker 11, a microphone 12, an operation unit 13, a display unit 14, a recording/playback control unit 15, a recording data storage unit 16, a voice recognition processing unit 17, a keyword registration unit 18, and an information management unit 19, together with a memo management unit 190 that includes a character memo storage unit 191 and a voice memo storage unit 192.

In this embodiment, the character memo storage unit 191 and the voice memo storage unit 192 are described as being built into the apparatus 1, but they may instead be built into an external device connected to the apparatus 1.

The terminal control unit 10 controls the apparatus 1 as a whole in its role as a mobile telephone terminal. The speaker 11 converts voice information output from the terminal control unit 10 into audible sound and outputs it. The microphone 12 converts the voice of the user of the apparatus 1 into voice information and outputs it to the terminal control unit 10. The operation unit 13 inputs the user's various key operations to the terminal control unit 10. The display unit 14 presents various kinds of information to the user operating the apparatus 1.

The recording/playback control unit 15 performs recording and playback control for the voice recorder function or call recording function of the apparatus 1. The recording data storage unit 16 stores the voice information recorded by the recording/playback control unit 15.

The voice recognition processing unit 17 analyzes voice information from the microphone 12 or the network 2, input via the terminal control unit 10, detects any activation keyword or end keyword registered in the keyword registration unit 18, and notifies the terminal control unit 10 of the detected activation keyword or end keyword.

The voice recognition processing unit 17 also analyzes the voice information from the microphone 12 or the network 2, input via the terminal control unit 10, extracts date/time information or location information, and notifies the terminal control unit 10 of the extracted date/time information or location information. The voice recognition processing unit 17 has built-in dictionaries for date/time information and location information (not shown) and extracts this information using known voice recognition techniques. The date/time information and location information that the voice recognition processing unit 17 extracts may be restricted to a limited range.

The voice recognition processing unit 17 may use a speaker-independent recognition method, which does not depend on who is speaking, or a speaker-dependent recognition method, which recognizes only the voice of a specific person who uses the apparatus 1.

With the speaker-dependent method in particular, only the words that the specific user of the apparatus 1 deliberately utters need to be registered, so recognition is efficient and more accurate than with the speaker-independent method. In addition, because only that person can use the function described in this application, no unintended voice memo is left behind even if someone else temporarily uses the apparatus 1. However, because only words spoken by the specific speaker are recognized, the user must repeat aloud any keyword spoken by the other party. For example, if the other party says “Well then, ... let's do that,” the user can reply “Understood. Well then, ... I'll do that,” thereby uttering the keywords that the user has registered (here “well then” and “I'll do that”). This also serves to confirm the agreement and does not disrupt the conversation.

The keyword registration unit 18 is where stock phrases used at the beginning of an appointment or a request (for example, “well then” or “so”) and stock phrases used at the end of an appointment or a request (for example, “I'll be waiting” or “please do ...”) are registered in advance as activation keywords and end keywords, respectively.
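As a rough illustration of the keyword registration unit 18, the sketch below keeps two sets of stock phrases and reports which kind of keyword, if any, occurs in a piece of recognized text. The class name `KeywordRegistry`, its `find` method, the English example phrases, and the use of plain substring matching on already-transcribed text are all assumptions made for illustration; the patent does not specify how matching is implemented.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class KeywordRegistry:
    """Hypothetical stand-in for the keyword registration unit 18."""
    activation: Set[str] = field(default_factory=lambda: {"well then", "so then"})
    end: Set[str] = field(default_factory=lambda: {"i'll be waiting", "please do"})

    def find(self, text: str) -> Optional[Tuple[str, str]]:
        """Return (kind, keyword) for the first registered keyword found in text."""
        lowered = text.lower()
        for kind, words in (("activation", self.activation), ("end", self.end)):
            for word in sorted(words):  # sorted so the result is deterministic
                if word in lowered:
                    return kind, word
        return None  # no registered keyword in this piece of text
```

Activation keywords are checked before end keywords here, which is an arbitrary choice for the sketch; a real recognizer would more likely report keywords in the order they were spoken.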

The information management unit 19 comprises the memo management unit 190, the character memo storage unit 191, and the voice memo storage unit 192, and registers and manages the recording data recorded by the recording/playback control unit 15 in accordance with commands input from the terminal control unit 10. Various schedule memos and miscellaneous memos can be registered in the memo management unit 190; at a minimum, action schedules and treatment schedules that include date/time information can be registered automatically. The automatic registration of action schedules and treatment schedules is described later.

FIG. 2 is a diagram schematically showing the structure of the data managed by the information management unit 19 according to an embodiment of the present invention.

Column 201 stores the date and time of the conversation and information on the other party. The date and time are entered automatically by a clock function (not shown) built into the terminal control unit 10. The information on the other party is entered automatically or manually. When the conversation is a telephone call over the network 2, the other party's name and similar details are entered automatically from the other party's telephone number by a phone book function (not shown) built into the terminal control unit 10. Alternatively, the voice recognition processing unit 17 may be given a speaker identification function so that the other party is determined and entered automatically. For manual entry, the user operates the operation unit 13 to select the other party's name or key it in.
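The automatic entry of the other party's name for column 201 can be pictured as a simple lookup keyed by telephone number. The sketch below is only an illustration: the `resolve_partner` function, the dictionary-based phone book, the digit-only normalization, and the fallbacks (the raw number when no entry exists, `"unknown"` when there is no number) are assumptions, not details given in the patent.

```python
from typing import Dict, Optional


def resolve_partner(caller_number: Optional[str], phone_book: Dict[str, str]) -> str:
    """Return a display name for column 201 from the caller's number.

    Falls back to the number itself when the phone book has no entry, and to
    "unknown" when no number is available (for example, a face-to-face
    conversation picked up by the microphone).
    """
    if not caller_number:
        return "unknown"
    # Compare digits only, so "03-1234-5678" and "0312345678" match.
    digits = "".join(ch for ch in caller_number if ch.isdigit())
    normalized = {
        "".join(ch for ch in number if ch.isdigit()): name
        for number, name in phone_book.items()
    }
    return normalized.get(digits, caller_number)
```

When the lookup fails, the user would still be able to enter the name manually, as the text describes.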

Column 202 stores the date/time information that the voice recognition processing unit 17 extracted by analyzing the voice data of the conversation. When date/time information is extracted more than once during a conversation, the most recent date/time information is treated as valid here.

Column 203 stores the location information that the voice recognition processing unit 17 extracted by analyzing the voice data of the conversation. When location information is extracted more than once during a conversation, the most recent location information is treated as valid here.

Column 204 holds the keyword that the voice recognition processing unit 17 extracted by analyzing the voice data of the conversation: an activation keyword that starts recording or an end keyword that stops recording, as registered in the keyword registration unit 18. These keywords may be particular or unusual words that the user of the apparatus 1 utters deliberately. Column 204 is included for convenience in explaining this embodiment and need not be stored in association with the recorded voice memo. Storing it, however, makes it possible to check which keyword caused a given voice memo to be recorded, which is useful when reviewing the registered keywords.

Column 205 stores the content of the voice memo registered in the voice memo storage unit 192, that is, the recording data whose recording was started by an activation keyword and stopped by an end keyword.

Column 206 holds the type of the voice memo, recording whether each voice memo in column 205 is an action schedule, a treatment schedule, or a miscellaneous memo. Action schedules, treatment schedules, and miscellaneous memos are registered in the memo management unit 190. The method for automatically determining which of these types a recorded voice memo belongs to is described later.

Columns 201 to 203 are character data and are registered in the character memo storage unit 191 as character memos. Column 205 is registered in the voice memo storage unit 192 as a voice memo. The voice memo itself (the recording data) is stored in the recording data storage unit 16. These character memos and voice memos are associated with each other by the memo management unit 190.
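One way to picture a row of the table in FIG. 2 is as a single record whose fields mirror columns 201 to 206. The sketch below is a hypothetical model only: the names `MemoType` and `MemoRecord`, the field names, and the choice to hold the audio as a string identifier pointing into the recording data storage unit 16 are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class MemoType(Enum):
    ACTION = "action schedule"
    TREATMENT = "treatment schedule"
    MISC = "miscellaneous memo"


@dataclass
class MemoRecord:
    conversation_at: datetime           # column 201: when the conversation took place
    partner: str                        # column 201: the other party
    extracted_datetime: Optional[str]   # column 202: latest date/time heard, if any
    extracted_place: Optional[str]      # column 203: latest location heard, if any
    recording_id: str                   # column 205: reference to the stored audio
    memo_type: MemoType                 # column 206: type of the voice memo
    keywords: List[str] = field(default_factory=list)  # column 204 (optional)
```

Columns 201 to 203 correspond to the character memo and column 205 to the voice memo; keeping them in one record plays the role that the text assigns to the memo management unit 190, which associates the two.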

The information management unit 19 also manages various kinds of information other than schedule memos (for example, maps relating to action schedules and reference materials relating to treatment schedules), and each entry registered in the memo management unit 190 can be associated with this information.

FIG. 3 is an operation flowchart of the apparatus 1 according to an embodiment of the present invention. The operation of the apparatus 1 is described below with reference also to FIGS. 1 and 2. The flow starts (S300) with the automatic voice memo mode, which automatically extracts voice memos from the speech in a conversation, activated.

In the standby state, the terminal control unit 10 cyclically monitors for voice input (S301), activation keyword detection (S310), date/time information extraction (S320), location information extraction (S330), end keyword detection (S340), and detection of prolonged silence (S341).

While the automatic voice memo mode of the apparatus 1 is on, the voice recognition processing unit 17 monitors, via the terminal control unit 10, for input of voice information from the microphone 12 or the network 2, or of voice information played back by the recording/playback control unit 15. When voice information is input (S301, YES), it analyzes the incoming voice information and monitors it for voice information corresponding to an activation keyword (S310).

The voice information input to the voice recognition processing unit 17 is, for example, voice packet data. The unit of input is arbitrary: each packet may carry voice data for a fixed length of time (with a variable amount of data) or a fixed amount of voice data (covering a variable length of recording time).

When an activation keyword is detected (S310, YES), the voice recognition processing unit 17 notifies the terminal control unit 10 that an activation keyword has been detected and which keyword it was. The terminal control unit 10 then controls the recording/playback control unit 15 to start recording as a voice memo (S311), and the flow proceeds to S320.

Although not shown in the flowchart, when the result at S310 is YES, the terminal control unit 10 controls the information management unit 19 to store the activation keyword detected by the voice recognition processing unit 17 in the memory of the memo management unit 190 corresponding to the voice memo recorded by the recording/playback control unit 15 (corresponding to column 204 in FIG. 2).

At S320, the voice recognition processing unit 17 analyzes the incoming voice information and monitors it for voice information corresponding to date/time information. When it extracts date/time information (S320, YES), it notifies the terminal control unit 10 accordingly. The terminal control unit 10 controls the information management unit 19 to store the extracted date/time information in the memory of the memo management unit 190 corresponding to the voice memo recorded by the recording/playback control unit 15 (corresponding to column 202 in FIG. 2) (S321), and the flow proceeds to S330.

At S330, the voice recognition processing unit 17 analyzes the incoming voice information and monitors it for voice information corresponding to location information. When it extracts location information (S330, YES), it notifies the terminal control unit 10 accordingly. The terminal control unit 10 controls the information management unit 19 to store the extracted location information in the memory of the memo management unit 190 corresponding to the voice memo recorded by the recording/playback control unit 15 (corresponding to column 203 in FIG. 2) (S331), and the flow proceeds to S340.

The date/time information and location information stored at S321 and S331 are stored by overwriting, so any previously stored content is erased automatically. Because the latest date/time and location information is therefore always what is stored, tentative information extracted in the middle of the conversation is discarded, and the date/time and location uttered last, as settled in the eventual conclusion, are the ones that take effect.
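The overwrite behaviour at S321 and S331 amounts to keeping a single slot per kind of information. The following sketch shows the idea with a hypothetical `ExtractionSlot` class; the class, its method names, and the plain strings used for the date/time and location values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionSlot:
    """Hypothetical holder for the values written at S321 and S331."""
    datetime_info: Optional[str] = None
    place_info: Optional[str] = None

    def store_datetime(self, value: str) -> None:
        self.datetime_info = value  # S321: overwrite, the earlier value is lost

    def store_place(self, value: str) -> None:
        self.place_info = value     # S331: overwrite, the earlier value is lost


slot = ExtractionSlot()
# A date is proposed and then revised later in the same conversation.
for proposal in ["Thursday 3 pm", "Friday 10 am"]:
    slot.store_datetime(proposal)
slot.store_place("the station")
```

After the loop only the revised proposal remains in `slot.datetime_info`, which matches the stated intent that the value uttered last is the one that takes effect.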

At S340, the voice recognition processing unit 17 analyzes the incoming voice information and monitors it for voice information corresponding to an end keyword. When it detects an end keyword (S340, YES), it notifies the terminal control unit 10 that an end keyword has been detected and which keyword it was. The terminal control unit 10 then controls the recording/playback control unit 15 to stop recording (S342), and the flow proceeds to S350.

Although not shown in the flowchart, when the result at S340 is YES, the terminal control unit 10 controls the information management unit 19 to store the end keyword detected by the voice recognition processing unit 17 in the memory of the memo management unit 190 corresponding to the voice memo recorded by the recording/playback control unit 15 (corresponding to column 204 in FIG. 2).

When the voice recognition processing unit 17 does not detect an end keyword at S340 (S340, NO), the terminal control unit 10 checks for silence lasting a predetermined long period (for example, one minute). When prolonged silence is detected (S341, YES), the terminal control unit 10 stops recording the voice memo (S342), and the flow proceeds to S350. When prolonged silence is not detected (S341, NO), the flow returns to S301.
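The start and stop conditions of S310 to S342 can be sketched as a small state machine. The version below is a simplified illustration under several assumptions: the input is a list of already-recognized text segments with start and end times in seconds, keywords are matched as substrings, silence is approximated as the gap between consecutive segments, and the 60-second default follows the one-minute example in the text. The function name `extract_memos` and the `Segment` tuple layout are hypothetical, and date/time and location extraction are left out.

```python
from typing import Iterable, List, Set, Tuple

Segment = Tuple[float, float, str]  # (start_sec, end_sec, recognized text)


def extract_memos(
    segments: Iterable[Segment],
    activation: Set[str],
    end: Set[str],
    max_silence: float = 60.0,
) -> List[List[str]]:
    """Return the text of each recorded span, one list of segments per memo."""
    memos: List[List[str]] = []
    current = None   # segments of the memo being recorded, or None when idle
    last_end = None  # end time of the previous segment
    for start, stop, text in segments:
        # S341: prolonged silence while recording stops the recording (S342).
        if current is not None and last_end is not None and start - last_end >= max_silence:
            memos.append(current)
            current = None
        lowered = text.lower()
        if current is None:
            # S310: an activation keyword starts recording (S311).
            if any(k in lowered for k in activation):
                current = [text]
        else:
            current.append(text)
        # S340: an end keyword stops recording (S342).
        if current is not None and any(k in lowered for k in end):
            memos.append(current)
            current = None
        last_end = stop
    return memos
```

In this sketch a recording that is still open when the input runs out is simply dropped; the patent's flow instead keeps running until the automatic voice memo mode is stopped, so that case would need a separate decision in a real implementation.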

At S350, the terminal control unit 10 checks whether date/time information was stored at S321 and whether location information was stored at S331. If no date/time information is stored, it registers the voice memo as a miscellaneous memo (S353). If date/time information is stored but location information is not, it registers the recorded voice memo as a treatment schedule (S351). If both date/time information and location information are stored, it registers the recorded voice memo as an action schedule (S352). The flow then proceeds to S360.

Here, a memo for which both date/time and location information were extracted is judged to be an action schedule (going somewhere), a memo for which only date/time information was extracted is judged to be a treatment schedule (something that needs to be dealt with), and a memo for which no date/time information was extracted is judged to be a miscellaneous memo. Because this automatic classification is so simple, misclassification is possible; however, this registration is provisional and can be corrected before it is finalized (described later).
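The decision at S350 reduces to two presence checks. A minimal sketch follows; the function name `classify_memo` and the use of `None` to mean "nothing was stored" are assumptions for illustration.

```python
from typing import Optional


def classify_memo(datetime_info: Optional[str], place_info: Optional[str]) -> str:
    """Provisional memo type per S350, based only on what was extracted."""
    if datetime_info is None:
        return "miscellaneous memo"   # S353: no date/time, whatever the location
    if place_info is None:
        return "treatment schedule"   # S351: date/time only
    return "action schedule"          # S352: both date/time and location
```

Note that a memo with a location but no date/time falls into the miscellaneous category, since the flow tests for date/time information first.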

Whether each type of voice memo (action schedule, treatment schedule, miscellaneous memo) is stored automatically can be made configurable by changing the flow so that S351, S352, or S353 is bypassed for a type that is not to be stored (not shown).

At S360, the terminal control unit 10 displays information about the voice memo registered at S351 or S352 on the display unit 14 (for example, “The following voice memo has been extracted. Register / edit / delete?”), and the flow proceeds to S361.

At S361, the terminal control unit 10 registers, edits, or deletes the information about the voice memo registered at S351 or S352 in accordance with the operation information input from the operation unit 13, clears the stored date/time information and location information (S362), and returns to S301. Registering the voice memo information finalizes the content that was provisionally registered automatically. Editing it covers, for example, editing the character memo associated with the voice memo (columns 201 to 203 in FIG. 2) and associating the voice memo with other information (a schedule, a memorandum, and so on).

The stored date/time information and location information are cleared so that the date/time and location belonging to a schedule memo that has already been registered or deleted are not linked to a voice memo extracted later.
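The confirmation step (S361) followed by the clearing step (S362) might be sketched as below. Everything here is hypothetical: the `confirm_memo` function, the dictionary representation of a provisional memo, the `saved` list standing in for the memo store, and the three choice strings are assumptions made only to show the order of operations.

```python
from typing import Dict, List, Optional


def confirm_memo(
    memo: Dict[str, object],
    choice: str,
    saved: List[Dict[str, object]],
    edits: Optional[Dict[str, object]] = None,
) -> Dict[str, None]:
    """Apply the user's choice (S361) and return a cleared extraction slot (S362)."""
    if choice not in ("register", "edit", "delete"):
        raise ValueError(f"unknown choice: {choice}")
    if choice == "edit":
        memo = {**memo, **(edits or {})}  # apply the user's corrections
    if choice in ("register", "edit"):
        saved.append(memo)                # finalize the provisional registration
    # S362: clear in every case, so stale values cannot attach to the next memo.
    return {"datetime": None, "place": None}
```

Returning a fresh, empty slot regardless of the choice reflects the stated reason for clearing: the date/time and location of a memo that has been registered or deleted should not carry over to a later one.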

Extraction of voice memos by the above operation flow continues until the automatic voice memo mode is stopped.

An embodiment of the present invention has been described above. According to the present invention, a specific part of a conversation relating to an action schedule or treatment schedule promised during the conversation can be automatically extracted as a voice memo and easily registered, and the voice memo can easily be associated with a schedule or memorandum. This has the advantage of improving the convenience of devices having a voice recorder function or a call recording function.

Although a mobile telephone terminal has been used as the example in describing the embodiment, the present invention is not limited to this. The invention is applicable to telephone apparatus in general, such as fixed-line telephones installed in ordinary offices and homes and main telephone units that form an extension telephone system accommodating multiple extensions. Beyond telephone apparatus, it is also applicable to various information devices that have a voice recorder function, such as personal computers, personal digital assistants (PDAs), voice recorders, and smartphones.

DESCRIPTION OF SYMBOLS
1 ... Mobile telephone terminal
2 ... Network
10 ... Terminal control unit
11 ... Speaker
12 ... Microphone
13 ... Operation unit
14 ... Display unit
15 ... Recording/playback control unit
16 ... Recording data storage unit
17 ... Voice recognition processing unit
18 ... Keyword registration unit
19 ... Information management unit
190 ... Memo management unit
191 ... Character memo storage unit
192 ... Voice memo storage unit

Claims

A schedule-related voice memo storage method, being a method, in a device comprising voice recording means for recording voice information input via a microphone or a network, of storing a part of the input voice information as a voice memo, the method comprising:
an activation keyword detection step of analyzing the input voice information and detecting any of one or more predetermined activation keywords relating to the start of the voice memo;
a date/time and location information extraction step of analyzing the input voice information and extracting information relating to a date and time and/or a location;
an end keyword detection step of analyzing the input voice information and detecting any of one or more predetermined end keywords relating to a conclusion or the end of a matter; and
a voice memo storage step of storing the voice memo in information storage means that the device comprises or that is connected to the device,
wherein, in a state in which a mode for automatically storing the voice memo is set, recording of the voice information is started when the activation keyword detection step detects any of the activation keywords, and recording of the voice information is stopped when the end keyword detection step detects any of the end keywords or when silence lasting a certain time or longer is detected, and
wherein, if the date/time and location information extraction step has extracted both information relating to a date and time and information relating to a location, the recorded voice information is stored as a voice memo relating to an action schedule in association with the latest extracted date/time information and the latest extracted location information, and if the date/time and location information extraction step has extracted information relating to a date and time but no information relating to a location, the recorded voice information is stored as a voice memo relating to a treatment schedule in association with the latest extracted date/time information.