JP2016218522A

JP2016218522A - Schedule preparation device, schedule preparation method, and program

Info

Publication number: JP2016218522A
Application number: JP2015099323A
Authority: JP
Inventors: 孝基土橋; Koki Dobashi; 浩良小川; Hiroyoshi Ogawa; 英明松田; Hideaki Matsuda; 亮奥村; Ryo Okumura
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2015-05-14
Filing date: 2015-05-14
Publication date: 2016-12-22
Anticipated expiration: 2035-05-14
Also published as: JP6596913B2

Abstract

PROBLEM TO BE SOLVED: To prepare an exact shared schedule. SOLUTION: A schedule preparation device 200 comprises: an extraction unit 22 for extracting a keyword belonging to a category related to a schedule from conversation voice of a plurality of speakers; an estimation unit 23 for estimating the presence or absence of a schedule change in the conversation voice on the basis of whether the extraction unit 22 extracts a plurality of different keywords belonging to the same category; a selection unit 24 for selecting, in the case where the estimation unit 23 estimates the presence of a schedule change, a keyword to be used for a shared schedule shared among the plurality of speakers out of the plurality of different keywords belonging to the same category; and a schedule preparation unit 25 for preparing a shared schedule on the basis of the keyword selected by the selection unit 24. SELECTED DRAWING: Figure 3

Description

The present invention relates to a schedule creation device, a schedule creation method, and a program.

Schedule management on a PC (Personal Computer), a mobile phone, or the like has long been widespread. More recently, techniques for creating a schedule from voice input by a user have become known.
For example, Patent Literature 1 discloses a technique for creating a schedule using schedule-related words extracted from input speech together with additional information on those words.
Patent Literature 2 discloses a technique for sharing the content of speech data that has been converted to text among users.

Patent Literature 1: JP 2010-198262 A
Patent Literature 2: JP 2013-118488 A

A schedule is sometimes shared among several parties. One could combine the techniques of Patent Literature 1 and Patent Literature 2 to automatically create a schedule shared by those parties.
However, in a conversation among several parties, the schedule itself may be negotiated on the spot. As a result, a shared schedule is less accurate than a single schedule created from one speaker's voice. It is therefore desirable to create an accurate shared schedule.

The present invention has been made in view of the above circumstances, and an object thereof is to provide a schedule creation device and the like that create an accurate shared schedule.

To achieve the above object, a schedule creation device according to a first aspect of the present invention comprises:
extraction means for extracting keywords belonging to a schedule-related category from the conversation voice of a plurality of speakers;
estimation means for estimating the presence or absence of a schedule change in the conversation voice, based on whether the extraction means has extracted a plurality of different keywords belonging to the same category;
selection means for selecting, when the estimation means estimates that a schedule change is present, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
schedule creation means for creating the shared schedule based on the keyword selected by the selection means.

To achieve the above object, a schedule creation device according to a second aspect of the present invention comprises:
speaker identification means for identifying a speaker;
extraction means for extracting, when the speaker identification means identifies a plurality of speakers, keywords belonging to a schedule-related category from the conversation voice of the plurality of speakers;
schedule creation means for creating a shared schedule that is shared only among the speakers who uttered the keywords extracted by the extraction means; and
schedule correction means for correcting the shared schedule created by the schedule creation means based on the conversation voice of the plurality of speakers.

According to the present invention, an accurate shared schedule can be created.

A diagram for explaining the schedule creation device according to the first embodiment of the present invention.
A block diagram showing the configuration of the user terminal according to the first embodiment of the present invention.
A block diagram showing the configuration of the schedule creation device according to the first embodiment of the present invention.
A diagram showing an example of the speaker identification template.
A diagram showing an example of the keyword table.
A diagram showing an example of the schedule format.
A diagram showing an example of a flowchart of the schedule creation process according to the first embodiment of the present invention.
A diagram showing an example of a shared schedule.
A block diagram showing the configuration of the schedule creation device according to the second embodiment of the present invention.
A diagram showing an example of a flowchart of the schedule creation process according to the second embodiment of the present invention.
A diagram showing an example of a single schedule before an update.
A diagram showing an example of a shared schedule after an update.
A diagram for explaining the schedule creation device according to the third embodiment of the present invention.
A block diagram showing the configuration of the schedule creation device according to the third embodiment of the present invention.
A block diagram showing the configuration of the schedule creation device according to the fourth embodiment of the present invention.

(First embodiment)
An outline of a schedule creation device according to an embodiment of the present invention is described below with reference to FIG. 1. This embodiment assumes the scene shown in FIG. 1, in which speakers A and B are discussing a schedule at a meeting or the like. As an example, the schedule creation device 200 is a server and the user terminal 100 is a smartphone. In the following, when speaker A or B in FIG. 1 need not be distinguished, each is simply called a speaker.

The schedule creation device 200 creates a schedule based on a speaker's voice. In the scene shown in FIG. 1, it creates the schedule based on the conversation voice of a plurality of speakers. The device obtains this voice because the user terminal 100 transmits the sound it collects to the schedule creation device 200 in real time.

The specific configurations of the user terminal 100 and the schedule creation device 200 are described below in order. First, the configuration of the user terminal 100 is described with reference to FIG. 2.

As shown in FIG. 2, the user terminal 100 includes a control unit 11, an input unit 12, a microphone 13, a camera 14, a storage unit 15, a communication unit 16, and a display unit 17.

The control unit 11 includes a CPU (Central Processing Unit) and controls the entire user terminal 100 by executing a control program stored in the storage unit 15.

The input unit 12 consists of various operation buttons for entering user instructions, a touch panel overlaid on the display of the display unit 17, a software keyboard shown on that display, and the like. Instruction inputs in this embodiment include, for example, a start instruction that starts the schedule creation process and an end instruction that ends it.

The microphone 13 collects the speaker's voice. The camera 14 captures images of the speaker. The collected voice and the captured images are transmitted to the schedule creation device 200 in real time via the communication unit 16.

The storage unit 15 includes a RAM (Random Access Memory), a ROM (Read Only Memory), and a nonvolatile memory. The RAM temporarily stores data and programs and serves as the work memory of the CPU in the control unit 11. The ROM stores the control program needed to control the entire user terminal 100. The nonvolatile memory is, for example, a hard disk and stores various data, including the terminal ID (terminal identification information) of the user terminal 100.

The communication unit 16 exchanges data with the schedule creation device 200 over an arbitrary communication network (for example, a mobile phone network). Specifically, the communication unit 16 transmits the voice collected by the microphone 13, the images captured by the camera 14, and the terminal ID of its own terminal to the schedule creation device 200.

The display unit 17 includes a display and shows various images, for example a schedule created by the schedule creation device 200.

Next, the configuration of the schedule creation device 200 is described with reference to FIG. 3.
As shown in FIG. 3, the schedule creation device 200 includes a control unit 20, a storage unit 30, and a communication unit 40.

The control unit 20 includes a CPU and controls the entire schedule creation device 200 by executing a control program stored in the storage unit 30.

The storage unit 30 includes a RAM, a ROM, and a nonvolatile memory. The RAM temporarily stores data and programs and serves as the work memory of the CPU in the control unit 20. The ROM stores the control program needed to control the entire schedule creation device 200. The nonvolatile memory is, for example, a hard disk and stores a schedule creation program and various data. The various data include a speaker identification template 31, a keyword table 32, and a schedule format 33, which are described later. The storage unit 30 also functions as a schedule storage unit 34 that stores the schedules created by the control unit 20.

The communication unit 40 exchanges data with the user terminal 100 over an arbitrary communication network (for example, a mobile phone network). Specifically, the communication unit 40 receives from the user terminal 100 the voice collected by the microphone 13, the images captured by the camera 14, and the terminal ID of the user terminal 100.

Next, the functions of the control unit 20 are described.
By executing the schedule creation program stored in the storage unit 30, the control unit 20 functions as a speaker identification unit 21, an extraction unit 22, an estimation unit 23, a selection unit 24, and a schedule creation unit 25.

The speaker identification unit 21 identifies a speaker based on the speaker's voice, the speaker's image, and the terminal ID received from the user terminal 100 via the communication unit 40.
The following describes, in order, (1) identifying a speaker by voice recognition, (2) identifying a speaker by image recognition, and (3) identifying a speaker by terminal ID.

First, (1) when identifying a speaker by voice recognition, the speaker identification unit 21 uses any known technique to obtain, from the received voice, a feature quantity that characterizes the speaker's voice. In this embodiment, as an example, the speaker's voiceprint (sound spectrogram) is used as the voice feature quantity. The speaker identification unit 21 identifies the speaker by checking whether the obtained voiceprint matches any of the voiceprints of a plurality of registered users learned in advance. A registered user is a user whose account has been registered in advance as a user of the schedule creation function of the schedule creation device 200.

Next, (2) when identifying a speaker by image recognition, the speaker identification unit 21 uses any known technique to obtain, from the received image, an image that characterizes the speaker. In this embodiment, as an example, the speaker's face image is used. The speaker identification unit 21 identifies the speaker by checking whether the obtained face image matches any of the face images of a plurality of registered users learned in advance.

Next, (3) when identifying a speaker by terminal ID, the speaker identification unit 21 identifies the speaker by checking whether the received terminal ID matches a terminal ID stored in advance in association with a user ID (user identification information).

Specifically, the speaker identification in (1) to (3) above is performed using the speaker identification template 31.

As shown in FIG. 4, the speaker identification template 31 is a table that associates a registered user name, which serves as the user ID, with that registered user's voiceprint, that registered user's face image, and the terminal ID (terminal identification information) of the user terminal that registered user uses (the user terminal 100 in this embodiment). For example, for the registered user name "A", the table shows that registered user A's voiceprint is "A1", face image is "A2", and terminal ID is "A3".

The speaker identification unit 21 compares the speaker's voiceprint, the speaker's face image, and the terminal ID received from the user terminal with the voiceprints, face images, and terminal IDs in the speaker identification template 31.
When the speaker identification unit 21 determines that at least one of the voiceprint, face image, and terminal ID matches an entry in the speaker identification template 31, it identifies the speaker as the registered user associated with the matching voiceprint, face image, or terminal ID. The speaker identification unit 21 then supplies the registered user name of the identified speaker (for example, "A") to the schedule creation unit 25.
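
The template lookup described above can be sketched as follows. This is a minimal illustration, not the implementation in the patent: the names `TemplateRow`, `TEMPLATE`, and `identify_speaker` are hypothetical, the values "B2" and "B3" are assumed by analogy with the row for user A, and plain string equality stands in for whatever voiceprint and face matching technique is actually used.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TemplateRow:
    """One row of a speaker identification template (hypothetical layout)."""
    user_name: str
    voiceprint: str
    face_image: str
    terminal_id: str


# Row A follows the FIG. 4 example; "B2" and "B3" are assumed for illustration.
TEMPLATE = [
    TemplateRow("A", "A1", "A2", "A3"),
    TemplateRow("B", "B1", "B2", "B3"),
]


def identify_speaker(
    template: Sequence[TemplateRow],
    voiceprint: Optional[str] = None,
    face_image: Optional[str] = None,
    terminal_id: Optional[str] = None,
) -> Optional[str]:
    """Return the registered user name if at least one feature matches a row."""
    for row in template:
        # Equality is a stand-in for real similarity matching (assumption).
        if (
            (voiceprint is not None and voiceprint == row.voiceprint)
            or (face_image is not None and face_image == row.face_image)
            or (terminal_id is not None and terminal_id == row.terminal_id)
        ):
            return row.user_name
    return None  # no registered user matched


print(identify_speaker(TEMPLATE, terminal_id="A3"))
print(identify_speaker(TEMPLATE, voiceprint="B1"))
```

Returning on the first matching row mirrors the "at least one of the three matches" rule; a real system would also have to decide what to do when different features point to different users, which the text does not cover.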

Returning to FIG. 3, the extraction unit 22 extracts keywords belonging to a schedule-related category from the speaker's voice. When there are a plurality of speakers, the extraction unit 22 extracts the keywords from their conversation voice. Specifically, the extraction unit 22 extracts the keywords using the keyword table 32, which has been learned in advance.

As shown in FIG. 5, the keyword table 32 is a table that associates each schedule-related category with the text of the keywords belonging to that category. In this embodiment, the keyword table 32 contains the categories "month", "day", "time", "location", and "planned contents". For example, the category "month" is associated with the keywords "January", "February", "March", "April", and so on.

The extraction unit 22 converts the speaker's voice received from the user terminal 100 into text and compares the converted text with the keyword text in the keyword table 32. When there are a plurality of speakers, the extraction unit 22 converts their conversation voice into text, and when part of the converted text matches the text of a keyword belonging to any category in the keyword table 32, it extracts the matching text as a keyword.
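
A minimal sketch of this table-based extraction, assuming the voice has already been converted to text. `KEYWORD_TABLE` and `extract_keywords` are hypothetical names, the table holds only a few illustrative English keywords, and the naive substring match is an assumption made for brevity.

```python
# Hypothetical keyword table: category -> keyword texts (illustrative subset).
KEYWORD_TABLE = {
    "month": ["April"],
    "day": ["1st", "2nd"],
    "time": ["3 o'clock"],
    "location": ["conference room"],
    "planned contents": ["meeting"],
}


def extract_keywords(text, table=KEYWORD_TABLE):
    """Return (category, keyword) pairs found in text, in order of appearance."""
    lowered = text.lower()
    hits = []
    for category, keywords in table.items():
        for keyword in keywords:
            position = lowered.find(keyword.lower())
            if position >= 0:
                hits.append((position, category, keyword))
    hits.sort()  # order by position in the text
    return [(category, keyword) for _, category, keyword in hits]


utterance = "Please tell me the time and place of the meeting on April 2nd."
print(extract_keywords(utterance))
```

Substring matching can produce false hits on longer words, so a practical system would more likely match on tokens produced by the speech recognizer.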

Returning to FIG. 3, the estimation unit 23 estimates whether the voice (the conversation voice, when there are a plurality of speakers) contains a schedule change, based on whether the extraction unit 22 has extracted a plurality of different keywords belonging to the same category. The estimation unit 23 makes this estimate while excluding any schedule other than the one the change concerns. In other words, it estimates the presence or absence of a schedule change within a single schedule.

For example, suppose speaker A in FIG. 1 asks, in order to coordinate the schedule, "I would like to have a meeting in the conference room from 3 o'clock on April 1st. Is that convenient for you?", and speaker B answers, "3 o'clock on April 2nd would be more convenient for the meeting." Because the keyword "meeting" in the category "planned contents" matches, the estimation unit 23 estimates that speakers A and B are talking about the same schedule. Then, because the different keywords "1st" and "2nd" belonging to the category "day" were extracted within that same schedule, the estimation unit 23 estimates that a schedule change for the category "day" occurred in the conversation between speakers A and B.

In this way, the estimation unit 23 first estimates whether the speakers are talking about the same schedule, based on whether the keywords of a predetermined category (in this embodiment, "planned contents" as an example) match between the speakers, and then estimates the presence or absence of a schedule change within that same schedule. This criterion for judging whether the schedule is the same is only an example, and another criterion may be used.
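
The two-stage estimate can be sketched as below, under the assumption that the extracted keywords are already grouped per speaker and per category. The function names and the input layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def same_schedule(utterances, key_category="planned contents"):
    """True if all speakers share at least one keyword in key_category."""
    keyword_sets = [set(kws.get(key_category, [])) for kws in utterances.values()]
    return bool(keyword_sets) and bool(set.intersection(*keyword_sets))


def changed_categories(utterances, key_category="planned contents"):
    """Return {category: [different keywords]} for categories with a change."""
    if not same_schedule(utterances, key_category):
        return {}  # different schedules are excluded from the estimate
    merged = defaultdict(list)
    for keywords in utterances.values():
        for category, values in keywords.items():
            for value in values:
                if value not in merged[category]:
                    merged[category].append(value)
    # Several different keywords in one category suggest a schedule change.
    return {c: vs for c, vs in merged.items() if len(vs) > 1}


conversation = {
    "A": {"month": ["April"], "day": ["1st"], "time": ["3 o'clock"],
          "location": ["conference room"], "planned contents": ["meeting"]},
    "B": {"month": ["April"], "day": ["2nd"], "time": ["3 o'clock"],
          "planned contents": ["meeting"]},
}
print(changed_categories(conversation))
```

An empty result corresponds to "no schedule change"; a non-empty result lists the categories from which the selection unit would then have to pick one keyword.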

Next, when the estimation unit 23 estimates that a schedule change is present, the selection unit 24 selects the keyword to be used for schedule creation from among the plurality of different keywords belonging to the same category. In the example above, the selection unit 24 selects the keyword to be used for the shared schedule from the different keywords "1st" and "2nd" belonging to the category "day". The selection criteria are described later.

On the other hand, when the estimation unit 23 estimates that no schedule change is present, that is, when only one keyword per category has been extracted from the conversation voice of the plurality of speakers, the selection unit 24 selects that one keyword.
In this way, in this embodiment, the estimation unit 23 and the selection unit 24 improve the accuracy of the schedule (in particular, the shared schedule).

Next, the schedule creation unit 25 creates a schedule based on the keywords selected by the selection unit 24. It creates a shared schedule when there are a plurality of speakers and a single schedule when there is only one speaker.

Here, the schedule creation unit 25 creates the schedule by entering the keywords selected by the selection unit 24 into the schedule format 33.
As shown in FIG. 6, the schedule format 33 is a table that associates each registered user name with data spaces that store a keyword for each schedule-related category.

The schedule creation unit 25 creates the schedule by entering the keywords selected by the selection unit 24 into the data spaces associated with the registered user name of each identified speaker. A single schedule is a schedule in which keywords are entered in the data spaces of one registered user, and a shared schedule is a schedule in which the same keywords are entered in the data spaces of a plurality of registered users.
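
As a rough sketch of filling in the schedule format, the snippet below models the format as a mapping from registered user name to per-category data spaces. `CATEGORIES`, `create_schedule`, and `is_shared` are hypothetical names, and the dictionary layout is an assumption rather than the format shown in FIG. 6.

```python
CATEGORIES = ("month", "day", "time", "location", "planned contents")


def create_schedule(speakers, selected):
    """Enter the selected keywords into each identified speaker's data spaces."""
    return {
        name: {category: selected.get(category) for category in CATEGORIES}
        for name in speakers
    }


def is_shared(schedule):
    """A schedule with more than one registered user is a shared schedule."""
    return len(schedule) > 1


selected = {
    "month": "April",
    "day": "2nd",
    "time": "3 o'clock",
    "location": "conference room",
    "planned contents": "meeting",
}
shared = create_schedule(["A", "B"], selected)
print(is_shared(shared), shared["A"] == shared["B"])
```

Categories for which no keyword was selected are left as `None`, which keeps every user's row the same shape.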

The schedule creation unit 25 supplies the created schedule to the schedule storage unit 34 of the storage unit 30. Each registered user can view the schedules stored in the schedule storage unit 34 by accessing it from the user terminal that user uses.

The specific configurations of the user terminal 100 and the schedule creation device 200 have been described above. The schedule creation process executed by the schedule creation device 200 is described below with reference to FIG. 7, using the scene of FIG. 1 as an example where appropriate. Naturally, before speaker identification, the schedule creation device 200 does not know who the speakers are.

First, speaker A starts the schedule creation program from the input unit 12 of the user terminal 100 and then inputs a start instruction for starting schedule creation. On accepting this start instruction, the user terminal 100 starts collecting sound and capturing images, and transmits its own terminal ID together with the start instruction to the schedule creation device 200.
On receiving the start instruction, the schedule creation device 200 enters a standby state for voice and images. The schedule creation process starts when the device receives voice and images transmitted from the user terminal 100.

First, the extraction unit 22 of the schedule creation device 200 extracts keywords (step S11). Specifically, the extraction unit 22 applies the extraction method using the keyword table 32 described above to the voice received in real time.

Next, the speaker identification unit 21 identifies the speakers (step S12). Specifically, using the speaker identification template 31, the speaker identification unit 21 performs the speaker identification processes (1) and (2) described above on the voice and images received in real time. At the same time, it performs the speaker identification process (3) described above using the speaker identification template 31 and the terminal ID transmitted together with the start instruction.

Next, the extraction unit 22 determines whether keyword extraction has ended (step S13). For example, it decides that extraction has ended on a timeout, when the voice has been interrupted for a predetermined time or longer, or on receiving an end instruction transmitted from the user terminal 100. Until keyword extraction ends (while step S13 is No), that is, until the conversation ends when there are a plurality of speakers or the utterance ends when there is a single speaker, the processing of steps S11 and S12 is repeated, so that keyword extraction and speaker identification continue (the loop of steps S11, S12, and S13).

Steps S11 and S12 are described in sequence for convenience, but extraction and speaker identification are actually performed concurrently on the voice and images received in real time. In other words, until keyword extraction ends, the schedule creation device 200 keeps extracting keywords from the received voice while identifying the speakers from the voice, the images, and the terminal ID.
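
The loop of steps S11 to S13 can be sketched as a simple event loop. This is only an illustration under stated assumptions: `run_extraction_loop` is a hypothetical name, the 5-second timeout is an arbitrary stand-in for the "predetermined time", and received data is simulated as timestamped events instead of a live audio stream.

```python
def run_extraction_loop(events, timeout_sec=5.0):
    """Process voice events until a timeout or an end instruction (S13: Yes).

    events: iterable of (time_sec, kind, payload), where kind is "voice" or "end".
    Returns the payloads that were processed before the loop ended.
    """
    processed = []
    last_voice_time = None
    for time_sec, kind, payload in events:
        timed_out = (
            last_voice_time is not None
            and time_sec - last_voice_time >= timeout_sec
        )
        if timed_out or kind == "end":
            break  # step S13: Yes, keyword extraction has ended
        # Steps S11 (keyword extraction) and S12 (speaker identification)
        # would run here on the received chunk.
        processed.append(payload)
        last_voice_time = time_sec
    return processed


events = [
    (0.0, "voice", "chunk from speaker A"),
    (2.0, "voice", "chunk from speaker B"),
    (3.0, "end", None),
]
print(run_extraction_loop(events))
```

In a live system the timeout would be driven by a timer and not by the arrival of the next event; the timestamp comparison here only keeps the sketch deterministic.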

As Example 1, assume a scene in which speaker A in FIG. 1, in order to confirm the schedule, asks, "Please tell me the time and place of the meeting on April 2nd.", and speaker B replies, "The meeting will be held in the conference room from 3 o'clock."

In this case, in step S11, the extraction unit 22 extracts, from the text converted from the voice received from the user terminal 100, the texts "April", "2nd", "meeting", "3 o'clock", and "conference room", which match keyword text in the keyword table 32, as schedule-related keywords.

Meanwhile, in step S12, the speaker identification unit 21 uses the speaker identification template 31 to find the registered user name "A" associated with the terminal ID "A3" transmitted together with the start instruction. It then identifies that speaker as the registered user "A".
In addition, using the speaker identification template 31, the speaker identification unit 21 determines that the voiceprint obtained from the received voice "The meeting will be held in the conference room from 3 o'clock." matches the voiceprint B1, and identifies the other speaker as the registered user "B".

Returning to the schedule creation process of FIG. 7, when keyword extraction has ended (step S13; Yes), the speaker identification unit 21 determines whether a plurality of speakers have been identified (step S14). If a plurality of speakers have been identified (step S14; Yes), the estimation unit 23 and the selection unit 24 cooperate to select keywords based on whether there is a schedule change in the conversation voice, in order to improve the accuracy of the shared schedule (step S15).

In Example 1 above, the "meeting" uttered by speaker A matches the "meeting" uttered by speaker B, and the extracted keywords ("April", "2nd", "meeting", "3 o'clock", "conference room") do not include a plurality of different keywords belonging to the same category, so the estimation unit 23 estimates that there is no schedule change. Because no schedule change has been estimated, the selection unit 24 then selects the single keyword in each category ("April", "2nd", "meeting", "3 o'clock", "conference room").

Returning to FIG. 7, the schedule creation unit 25 creates a shared schedule based on the selected keywords (step S16). In Example 1 above, the schedule creation unit 25 creates the shared schedule of speakers A and B shown in FIG. 8 based on "April", "2nd", "meeting", "3 o'clock", and "conference room".
As Example 2, assume instead that speaker A in FIG. 1, wishing to arrange a schedule, asks "I would like to have a meeting in the conference room from 3 o'clock on April 1st. Does that suit you?", that speaker B answers "3 o'clock on April 2nd would suit me better for the meeting.", and that speaker A then replies "I would prefer April 1st, but April 2nd is fine."

In this case, in step S15, the estimation unit 23 first estimates that the speakers are talking about the same schedule, because the "meeting" uttered by speaker A matches the "meeting" uttered by speaker B. The estimation unit 23 then estimates that there is a schedule change, because the extracted keywords ("April", "1st", "2nd", "meeting", "3 o'clock", "conference room") include a plurality of different keywords belonging to the same category, namely "1st" and "2nd".
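The estimation rule used in Examples 1 and 2 amounts to checking whether any category holds two or more distinct keywords. The sketch below illustrates that rule under assumed names: `find_conflicts` and the category labels are hypothetical and do not appear in the specification.

```python
from collections import defaultdict


def find_conflicts(extracted):
    """Group (category, keyword) pairs by category and return the categories
    that contain two or more different keywords.

    An empty result corresponds to "no schedule change"; a non-empty result
    corresponds to "schedule change present".
    """
    by_category = defaultdict(set)
    for category, keyword in extracted:
        by_category[category].add(keyword)
    return {c: kws for c, kws in by_category.items() if len(kws) > 1}


# Example 1: every category has exactly one keyword.
example1 = [("month", "April"), ("day", "2nd"), ("content", "meeting"),
            ("time", "3 o'clock"), ("place", "conference room")]

# Example 2: the "day" category contains both "1st" and "2nd".
example2 = [("month", "April"), ("day", "1st"), ("time", "3 o'clock"),
            ("place", "conference room"), ("content", "meeting"),
            ("day", "2nd"), ("day", "1st"), ("day", "2nd")]

print(bool(find_conflicts(example1)))  # schedule change estimated?
print(find_conflicts(example2))
```

Repeated utterances of the same keyword collapse into one set entry, so only genuinely different keywords in a category are treated as a change.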

Next, from "1st" and "2nd", the selection unit 24 selects the keyword uttered by a plurality of speakers in preference to the keyword uttered by only one speaker. Specifically, "1st" was uttered twice in total, both times by speaker A alone, whereas "2nd" was also uttered twice in total, once each by speakers A and B, so the selection unit 24 selects "2nd" as the keyword to be used for the shared schedule. In this way, keyword selection is weighted toward cases where the intentions of a plurality of speakers agree, rather than toward a single speaker. At the same time, the selection unit 24 selects the single keyword in each of the other categories ("April", "meeting", "3 o'clock", "conference room"). The schedule creation unit 25 then creates the shared schedule shown in FIG. 8 based on these keywords ("April", "2nd", "meeting", "3 o'clock", "conference room").

In Example 2, if speaker A had uttered "1st" once and speaker B had uttered "2nd" once, the selection unit 24 may select the keyword uttered later in the time series, "2nd", in preference to the earlier keyword, "1st". This is because a keyword uttered later is likely to be one that was updated during the conversation.
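The two selection rules just described (prefer the keyword spoken by more speakers, and otherwise prefer the later one) can be combined into a single ranking. The following is a sketch under that assumption; combining them this way, and the name `select_keyword`, are illustrative choices rather than something the specification prescribes.

```python
def select_keyword(utterances):
    """Pick one keyword from conflicting keywords in a single category.

    utterances: list of (speaker, keyword) in chronological order; must be
    non-empty. Keywords are ranked first by the number of distinct speakers
    who uttered them, then by how late they were last uttered.
    """
    speakers = {}    # keyword -> set of speakers who uttered it
    last_index = {}  # keyword -> index of its most recent utterance
    for index, (speaker, keyword) in enumerate(utterances):
        speakers.setdefault(keyword, set()).add(speaker)
        last_index[keyword] = index
    return max(speakers, key=lambda kw: (len(speakers[kw]), last_index[kw]))


# Example 2: A says "1st", B says "2nd", then A says "1st" and "2nd".
example2_days = [("A", "1st"), ("B", "2nd"), ("A", "1st"), ("A", "2nd")]
print(select_keyword(example2_days))

# Variant: each keyword is uttered once by one speaker, so the later one wins.
print(select_keyword([("A", "1st"), ("B", "2nd")]))
```

Note that the ranking deliberately ignores the raw utterance count, since in Example 2 both keywords are uttered twice and only the number of distinct speakers separates them.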

Returning to FIG. 7, if it is determined that a plurality of speakers have not been identified (step S14; No), that is, if there is a single speaker, keywords are selected based on whether there is a schedule change in that speaker's utterances, in the same manner as described above for the estimation unit 23 and the selection unit 24 (step S17). When a single speaker changes the schedule within the same schedule, the selection unit 24 selects a keyword by, for example, choosing the keyword the speaker uttered most often among the plurality of keywords, or, when the utterance counts are equal, choosing the keyword uttered later in the time series.
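For the single-speaker case, the rule given as an example (most frequent keyword, later keyword on a tie) might look like the sketch below. The function name `select_for_single_speaker` is hypothetical.

```python
from collections import Counter


def select_for_single_speaker(keywords):
    """Pick one keyword from a single speaker's conflicting keywords.

    keywords: non-empty list of keywords from one category, in the order
    they were uttered. The most frequently uttered keyword wins; when counts
    are equal, the keyword uttered later wins.
    """
    counts = Counter(keywords)
    last_index = {kw: i for i, kw in enumerate(keywords)}
    return max(counts, key=lambda kw: (counts[kw], last_index[kw]))


print(select_for_single_speaker(["3 o'clock", "4 o'clock", "3 o'clock"]))
print(select_for_single_speaker(["1st", "2nd"]))
```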

Next, the schedule creation unit 25 creates a single-user schedule based on the selected keywords (step S18). After the shared schedule or the single-user schedule has been created (after step S16 or S18), the schedule creation process ends.

As described above, the schedule creation device 200 according to this embodiment includes the estimation unit 23 and the selection unit 24. It therefore does more than create a shared schedule from the extracted keywords: when the schedule changes during the conversation, it creates the shared schedule using a keyword selected from the two or more keywords in the same category. An accurate shared schedule that reflects schedule changes made during the conversation can thus be created.

In addition, from two or more keywords in the same category, the selection unit 24 selects a keyword uttered by a plurality of speakers in preference to a keyword uttered by a single speaker. A more accurate shared schedule that takes into account agreement among the speakers can thus be created. The selection method of the selection unit 24 is not limited to this; for example, a keyword uttered later in the time series may be selected in preference to an earlier one. This makes it possible to create an accurate shared schedule that reflects keywords updated during the conversation, such as corrections of slips of the tongue.

The estimation unit 23 also estimates whether there is a schedule change in the conversation voice after excluding any schedule other than the one to which the schedule change relates. This avoids mistaking talk about a different schedule for a schedule change, so an accurate shared schedule can be created. The schedule creation unit 25 may also treat the excluded schedule as a shared schedule and create two shared schedules at the same time.

In this embodiment, the criterion for determining whether two schedules are the same (the criterion for excluding a schedule as a different one) is whether the predetermined category "schedule content" matches. Instead, the determination may be based on, for example, the number of different categories that match. For example, the schedules may be determined to be the same when the three categories "time", "place", and "schedule content" all match.
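The alternative criterion based on the number of matching categories could be sketched as follows. The dictionary representation, the category names, and the default threshold of three are assumptions for illustration.

```python
def matching_categories(a, b):
    """Return the categories whose keyword is identical in both schedules.

    a, b: dicts mapping a category name to the keyword chosen for it.
    """
    return [c for c in a if c in b and a[c] == b[c]]


def is_same_schedule(a, b, min_matches=3):
    """Treat two keyword sets as the same schedule when at least
    min_matches categories agree (3 mirrors the time/place/content example)."""
    return len(matching_categories(a, b)) >= min_matches


first = {"day": "1st", "time": "3 o'clock", "place": "conference room", "content": "meeting"}
second = {"day": "2nd", "time": "3 o'clock", "place": "conference room", "content": "meeting"}
other = {"day": "3rd", "time": "5 o'clock", "place": "conference room", "content": "party"}

print(is_same_schedule(first, second))  # time, place, and content agree
print(is_same_schedule(first, other))   # only place agrees
```

Under this criterion the differing "day" values in `first` and `second` would be treated as a change to one schedule rather than as two separate schedules.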

The extraction unit 22 of the schedule creation device 200 extracts keywords related to the schedule from a speaker's voice by text matching. If keyword extraction were performed by matching audio, that is, if each keyword in the keyword table 32 were a phoneme waveform made up of a set of phonemes rather than text, the same keyword would not necessarily produce the same phoneme waveform when uttered by different speakers.
With keyword extraction by text matching, speaker-specific differences in sound are removed, so the accuracy of keyword extraction can be improved. This does not, however, preclude keyword extraction by comparing audio with audio.

In the schedule creation device 200 of this embodiment, the speaker identification unit 21 identifies a speaker by (1) voice recognition, (2) image recognition, and (3) terminal ID. Because it is sufficient for any one of these methods to identify the speaker, identification is more reliable than when only one method is used. This is particularly suitable when, for example, the speaker's face is outside the angle of view of the camera 14 of the user terminal 100. Identification is not limited to (1) through (3); the speaker may also be identified by matching the speaker's fingerprint data, iris pattern, or the like.

In the schedule creation process of FIG. 7 described above, when a plurality of speakers are identified (step S14; Yes), a shared schedule is created for all of those speakers, but the process is not limited to this. For example, a step of determining which of the plurality of speakers satisfy a predetermined condition may be added after step S14, after which the schedule creation unit 25 creates a shared schedule shared by the speakers who satisfy the predetermined condition.

For example, volume or tone can be used as the predetermined condition. When volume is used and the speaker identification unit 21 has identified a plurality of speakers, the control unit 20 specifies which of those speakers share the shared schedule, based on the volume of their conversation voices. The schedule creation unit 25 then creates a shared schedule shared by the speakers specified by the control unit 20. In this case, the control unit 20 functions as specifying means.

Specifically, among the plurality of speakers identified by the speaker identification unit 21, the control unit 20 specifies the speakers whose conversation voice volume is at or below a predetermined threshold as the speakers who share the shared schedule.
According to this aspect, registered users who are taking part in a meeting or briefing (for example, speakers A and B in FIG. 1) can be distinguished by volume from a passer-by who is a registered user but is not taking part (for example, the registered user with the registered user name C, who is not a party to the conversation), and a shared schedule shared only by the former can be created. This is because speakers who discuss a schedule in an environment where non-participants pass by are likely to speak quietly so that only they can hear the conversation.

Alternatively, when the conversation voice volume of at least one of the plurality of speakers identified by the speaker identification unit 21 is at or above a predetermined threshold, the control unit 20 may specify all of the identified speakers as the speakers who share the shared schedule.
According to this aspect, a schedule that is meant to be announced to everyone at a meeting or briefing can be set as a schedule shared by all participants. This is because a speaker who talks about a schedule at or above a certain volume is likely to want everyone taking part in the meeting or briefing to share it.
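The two volume-based aspects can be illustrated together as below. The specification presents them as alternatives; checking the loud threshold first and falling back to the quiet threshold is an assumption of this sketch, as are the function name `sharing_speakers`, the decibel units, and the threshold values.

```python
def sharing_speakers(volumes, quiet_max, loud_min):
    """Decide which identified speakers share the schedule.

    volumes: dict mapping speaker name to a representative voice volume
    (hypothetically in dB). If any speaker is at or above loud_min, everyone
    shares the schedule; otherwise only speakers at or below quiet_max do.
    """
    if any(volume >= loud_min for volume in volumes.values()):
        return set(volumes)
    return {speaker for speaker, volume in volumes.items() if volume <= quiet_max}


# A and B confer quietly while passer-by C talks at normal volume nearby.
print(sharing_speakers({"A": 40, "B": 42, "C": 60}, quiet_max=45, loud_min=70))

# A announces the schedule loudly to the whole room.
print(sharing_speakers({"A": 75, "B": 50, "C": 60}, quiet_max=45, loud_min=70))
```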

This concludes the description of the embodiment. The embodiment above is an example, and the configuration of the schedule creation device 200, the details of the schedule creation process, and so on are of course not limited to those described in it.

(Second Embodiment)
In the first embodiment described above, the schedule creation device 200 registers the schedule created by the schedule creation unit 25 without determining whether it has already been registered. The schedule creation device 200' of the second embodiment differs from the schedule creation device 200 in that it adds new functions for avoiding duplicate registration. The description below focuses on this difference.

As shown in FIG. 9, the schedule creation device 200' includes a determination unit 26 and an update unit 27 as new functions.

The determination unit 26 determines whether the shared schedule created by the schedule creation unit 25 is already shared by all of the plurality of speakers. The determination unit 26 also determines whether the single-user schedule created by the schedule creation unit 25 has already been registered.

When the determination unit 26 determines that the shared schedule is not yet shared by all of the plurality of speakers, the update unit 27 updates the schedules of the speakers who do not yet share it. When the determination unit 26 determines that the single-user schedule has not yet been registered, the update unit 27 registers the single-user schedule created by the schedule creation unit 25.

The schedule creation process of the schedule creation device 200' is described below with reference to FIG. 10. This process is the same as the schedule creation process of the first embodiment except for the addition of steps S19 to S22 for the new functions, so the description focuses on the steps that differ.

After the schedule creation unit 25 creates the shared schedule in step S16, the determination unit 26 determines whether the shared schedule is already shared (step S19). Specifically, the determination unit 26 determines whether the shared schedule is already shared by all of the plurality of speakers.

If it is determined that the shared schedule is not shared by all of the plurality of speakers (step S19; No), the update unit 27 updates the schedules of the speakers who do not yet share the shared schedule (step S20), and the process ends.
For example, when the identified speakers are A and B and the existing schedule is registered only for A, the single-user schedule before the update, shown in FIG. 11, is updated to the shared schedule shown in FIG. 12.
If, on the other hand, it is determined that the shared schedule is already shared (step S19; Yes), for example when speakers A and B already share the schedule, the process ends.
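Steps S19 and S20 can be pictured with a small in-memory store. This is only a sketch: representing a schedule as a tuple, keeping a set of sharing speakers per schedule, and the name `share_schedule` are assumptions, since the specification does not describe the storage layout.

```python
def share_schedule(store, schedule, speakers):
    """Add the schedule for any speaker who does not yet share it.

    store: dict mapping a schedule (a hashable tuple) to the set of speakers
    who already hold it. Returns the set of speakers that had to be updated;
    an empty set means the schedule was already shared by everyone
    (the "step S19; Yes" branch), so nothing is written twice.
    """
    holders = store.setdefault(schedule, set())
    missing = set(speakers) - holders
    holders |= missing
    return missing


meeting = ("April", "2nd", "3 o'clock", "conference room", "meeting")
store = {meeting: {"A"}}  # so far registered only for speaker A

print(share_schedule(store, meeting, ["A", "B"]))  # B is added
print(share_schedule(store, meeting, ["A", "B"]))  # nothing left to update
```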

After the schedule creation unit 25 creates a single-user schedule in step S18, the determination unit 26 determines whether the single-user schedule has already been registered (step S21).
If it is determined that the single-user schedule has not been registered (step S21; No), the update unit 27 newly registers the single-user schedule (step S22), and the process ends. If it is determined that the single-user schedule has already been registered (step S21; Yes), the process ends.

As described above, the schedule creation device 200' according to the second embodiment includes the determination unit 26 and the update unit 27, and therefore performs an update or a new registration only when the shared schedule is not yet shared or the single-user schedule is not yet registered. The schedule creation device 200' can thus avoid duplicate registration while updating or newly registering schedules only when necessary.

(Third Embodiment)
The first and second embodiments were described on the premise that the schedule creation device 200 identifies speakers using the voice, image, and terminal ID received from the user terminal 100, which is a separate device. The configuration is not limited to this. For example, as shown in FIG. 13, a stand-alone configuration may be adopted in which a schedule creation device 300, which is a notebook PC, itself picks up the voice and captures the images.

In this case, as shown in FIG. 14, the schedule creation device 300 includes a microphone 50 for picking up voice and a camera 60 for capturing images. The schedule creation device 300 creates a schedule by executing the schedule creation process of FIG. 7 on the voice it has picked up and the images it has captured. The schedule creation process may be triggered when speaker A or B starts the schedule creation program and inputs a start instruction. By uploading the created shared schedule to a server or the like, people other than the owner of the schedule creation device 300 can also refer to it.

Because the schedule creation device 300 according to the third embodiment is a stand-alone device, it can identify speakers and extract keywords quickly, without the delay that accompanies transmission of voice and images, and create a schedule. A schedule can therefore be created immediately regardless of the state of the communication network, which improves usability.

(Fourth Embodiment)
The first embodiment was described on the premise that the schedule creation device 200 creates a shared schedule shared by all of the plurality of speakers. In that case, a speaker who has not uttered any keyword related to the schedule is still included in the shared schedule once identified from the conversation voice. For example, even when a speaker C, other than speakers A and B, has not uttered any keyword, a shared schedule shared by A, B, and C is created.

In the schedule creation device 400 according to the fourth embodiment, shown in FIG. 15, the schedule creation unit 25 therefore creates a shared schedule shared only by the speakers who uttered the keywords extracted by the extraction unit 22. In this case, for example, when the speaker identification unit 21 identifies speakers using the speaker identification template, it identifies them from the voiceprint of the portion of the utterance that contains a keyword extracted by the extraction unit 22. The schedule creation unit 25 then creates a shared schedule shared only by the speakers who uttered keywords (in the example above, a shared schedule shared by A and B, excluding C).
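Restricting the shared schedule to speakers who actually uttered a keyword can be sketched as a simple filter. The function `keyword_speakers`, the utterance format, and the substring test are assumptions for illustration; in the embodiment the speaker is identified from the voiceprint of the keyword portion of the audio rather than from text.

```python
def keyword_speakers(utterances, keywords):
    """Return the speakers whose utterances contain at least one keyword.

    utterances: list of (speaker, text) pairs for the whole conversation.
    keywords: keyword texts extracted for the schedule.
    Matching is a naive case-insensitive substring test.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return {
        speaker
        for speaker, text in utterances
        if any(keyword in text.lower() for keyword in lowered)
    }


conversation = [
    ("A", "Please tell me the time and place of the April 2nd meeting."),
    ("B", "The meeting will be held in the conference room from 3 o'clock."),
    ("C", "Has anyone seen my umbrella?"),
]
keywords = ["April", "2nd", "meeting", "3 o'clock", "conference room"]
print(sorted(keyword_speakers(conversation, keywords)))
```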

The schedule correction unit 28 corrects a shared schedule that was created by the schedule creation unit 25 and stored in the schedule storage unit 34. The schedule creation device 200 of the first embodiment uses the estimation unit 23, the selection unit 24, and so on to create, from conversation voice received in real time, a shared schedule in which corrections are already reflected. The schedule correction unit 28, by contrast, corrects a shared schedule that has already been stored. The correction method is arbitrary. For example, when functions similar to those of the estimation unit 23 and the selection unit 24 find, from recorded conversation voice, that a plurality of keywords were extracted in the same category, the correction may be made by selecting one of those keywords. The correction is of course not limited to recorded voice; a stored shared schedule may also be corrected step by step in real time.

With its correction function, the schedule creation device 400 according to the fourth embodiment can improve the accuracy of the shared schedule while creating a shared schedule among only those speakers who need to share it.
A plurality of schedules may be shared. For example, when speakers A, B, C, D, and E are talking, the schedule "April 2nd, planning meeting, 3 o'clock, conference room" may be shared by speakers A, B, and C, and the schedule "April 3rd, sales meeting, 3 o'clock, conference room" may be shared by speakers A, C, and D.

In the first and second embodiments described above, the schedule creation process of FIG. 7 starts creating a shared schedule whenever a plurality of speakers are identified (step S14; Yes), but the process is not limited to this. For example, a step of determining whether a speaker has given consent may be added after step S14, and creation of the shared schedule may be triggered by that consent.
Specifically, keywords related to consent (for example, "Understood" or "OK") may be stored in advance, and the shared schedule may be created when any speaker utters one of them. This avoids creating a shared schedule against a speaker's wishes, for example when the speaker does not want to share the schedule.
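The consent trigger could be checked along the following lines. The English consent words, the function name `has_consent`, and the word-level tokenization are assumptions of this sketch; the specification only says that consent keywords are stored in advance.

```python
import re

# Hypothetical English stand-ins for the stored consent keywords.
CONSENT_KEYWORDS = frozenset({"understood", "ok", "agreed"})


def has_consent(utterances, consent_keywords=CONSENT_KEYWORDS):
    """Return True if any speaker uttered a consent keyword.

    utterances: list of (speaker, text) pairs. Each text is split into
    lowercase words so that "ok" does not match inside words like "look".
    """
    for _speaker, text in utterances:
        words = set(re.findall(r"[a-z']+", text.lower()))
        if words & consent_keywords:
            return True
    return False


print(has_consent([("A", "Shall we meet on April 2nd?"), ("B", "Understood.")]))
print(has_consent([("A", "Shall we look at April 2nd?")]))
```

A caller would run this check after step S14 and proceed to create the shared schedule only when it returns True.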

The first and second embodiments were described on the premise that the user terminal 100 is a smartphone, but it is not limited to this. Any portable device may be used, such as a notebook PC or a tablet terminal. Likewise, the schedule creation devices 200 and 200' are not limited to servers; any device that can bear the load of the schedule creation process may be used, for example a PC.
The form of the schedule format 33 common to the embodiments is also only an example, and another form (for example, a calendar format) may of course be adopted.

The functions of the schedule creation devices 200, 200', 300, and 400 of the present invention can also be implemented by a computer such as an ordinary PC.
Specifically, the embodiments above were described on the premise that the program for the schedule creation process performed by the schedule creation devices 200, 200', 300, and 400 is stored in advance in the ROM of the storage unit 30. However, the program may instead be stored and distributed on a computer-readable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), a DVD (Digital Versatile Disc), or an MO (Magneto-Optical Disc), and installed on a computer to configure a computer capable of realizing the functions described above.

Preferred embodiments of the present invention have been described above, but the present invention is not limited to these specific embodiments and includes the inventions described in the claims and their equivalents. The inventions described in the claims as originally filed in the present application are appended below.

(Supplementary Note 1)
A schedule creation device comprising:
extraction means for extracting, from conversation voice of a plurality of speakers, keywords belonging to categories related to a schedule;
estimation means for estimating whether there is a schedule change in the conversation voice, based on whether the extraction means has extracted a plurality of different keywords belonging to the same category;
selection means for selecting, when the estimation means estimates that there is a schedule change, a keyword to be used for a shared schedule shared by the plurality of speakers from among the plurality of different keywords belonging to the same category; and
schedule creation means for creating the shared schedule based on the keyword selected by the selection means.

(Supplementary Note 2)
The schedule creation device according to Supplementary Note 1, wherein
the estimation means estimates whether there is a schedule change in the conversation voice while excluding any schedule different from the schedule to which the schedule change relates.

(Appendix 3)
The schedule creation device according to Appendix 1 or 2, wherein
the estimation means estimates that a schedule change is present in the conversation voice when the extraction means has extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that a schedule change is present, selects, from among the plurality of different keywords belonging to the same category, a keyword uttered by the plurality of speakers in preference to a keyword uttered by a single speaker alone.
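
The priority rule of Appendix 3 might be sketched as follows, purely as an illustration. The function name `select_by_agreement`, the `(speaker, keyword)` input shape, and the tie-breaking behaviour (the keyword spoken first wins a tie) are assumptions of this sketch; the appendix does not state how ties are resolved.

```python
from collections import defaultdict


def select_by_agreement(hits):
    """Select one keyword from `hits`, a list of (speaker, keyword) pairs
    for a single category, given in spoken order.

    The keyword uttered by the most distinct speakers is returned, so a
    keyword spoken by several speakers outranks one spoken by one speaker.
    """
    speakers = defaultdict(set)
    for speaker, keyword in hits:
        speakers[keyword].add(speaker)
    # max() returns the first maximal item, and dicts keep insertion order,
    # so a tie goes to the keyword that was spoken first (assumed behaviour).
    return max(speakers, key=lambda kw: len(speakers[kw]))


print(select_by_agreement([("A", "Saturday"), ("B", "Sunday"), ("C", "Sunday")]))
```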

(Appendix 4)
The schedule creation device according to Appendix 1 or 2, wherein
the estimation means estimates that a schedule change is present in the conversation voice when the extraction means has extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that a schedule change is present, selects, from among the plurality of different keywords belonging to the same category, a keyword appearing later in the time series in preference to a keyword appearing earlier.

(Appendix 5)
The schedule creation device according to any one of Appendices 1 to 4, wherein
the estimation means estimates that no schedule change is present in the conversation voice when the extraction means has not extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that no schedule change is present, selects the one keyword belonging to the same category that the extraction means has extracted.

(Appendix 6)
The schedule creation device according to any one of Appendices 1 to 5, further comprising
specifying means for specifying, based on the volume or tone of the conversation voice of the plurality of speakers, the speakers among the plurality of speakers who share the shared schedule, wherein
the schedule creation means creates, based on the keyword selected by the selection means, a shared schedule shared among the speakers specified by the specifying means.

(Appendix 7)
The schedule creation device according to any one of Appendices 1 to 6, further comprising:
determination means for determining whether the shared schedule created by the schedule creation means has been shared with all of the plurality of speakers; and
update means for updating, when the determination means determines that the shared schedule has not been shared with all of the plurality of speakers, the schedule of each speaker with whom it has not been shared.
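
The determination and update of Appendix 7 might look like the following sketch. The in-memory `calendars` mapping, the dictionary representation of a schedule, and the function name `sync_shared_schedule` are hypothetical; the specification does not prescribe how schedules are stored or compared.

```python
def sync_shared_schedule(shared, calendars, speakers):
    """Copy `shared` into the calendar of every speaker who lacks it.

    calendars: {speaker: list of schedule dicts} (hypothetical in-memory store).
    Returns the list of speakers whose calendars were updated.
    """
    updated = []
    for speaker in speakers:
        entries = calendars.setdefault(speaker, [])
        if shared not in entries:    # determination: already shared with this speaker?
            entries.append(shared)   # update: add the missing shared schedule
            updated.append(speaker)
    return updated


shared = {"date": "Sunday", "place": "Shibuya"}
calendars = {"A": [shared]}
print(sync_shared_schedule(shared, calendars, ["A", "B"]))
```

Calling the function a second time with the same arguments updates nothing, because every speaker then already holds the shared schedule.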

(Appendix 8)
The schedule creation device according to any one of Appendices 1 to 7, wherein the extraction means converts the conversation voice into text and, when the converted text matches the text of a keyword belonging to any one of a plurality of categories learned in advance, extracts the matched text as a keyword.

(Appendix 9)
The schedule creation device according to any one of Appendices 1 to 8, further comprising identification means for identifying the plurality of speakers by voice recognition or image recognition.

(Appendix 10)
A schedule creation device comprising:
speaker identification means for identifying speakers;
extraction means for extracting, when the speaker identification means identifies a plurality of speakers, keywords belonging to a category related to a schedule from the conversation voice of the plurality of speakers;
schedule creation means for creating a shared schedule shared only among the speakers who uttered the keywords extracted by the extraction means; and
schedule correction means for correcting the shared schedule created by the schedule creation means based on the conversation voice of the plurality of speakers.

(Appendix 11)
The schedule creation device according to Appendix 10, wherein the schedule creation means creates a shared schedule shared only among the speakers who uttered a common keyword among the keywords extracted by the extraction means.
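
One way to picture the grouping of Appendix 11 is the sketch below, which finds, for each keyword, the speakers who uttered it and keeps only keywords common to at least two speakers. The function name `sharing_groups`, the `(speaker, keyword)` input shape, and the two-speaker threshold are assumptions made for illustration.

```python
from collections import defaultdict


def sharing_groups(hits):
    """Group speakers by the keyword they uttered.

    hits: iterable of (speaker, keyword) pairs.
    Returns {keyword: sorted list of speakers} for keywords uttered by at
    least two distinct speakers (assumed meaning of a "common" keyword).
    """
    by_keyword = defaultdict(set)
    for speaker, keyword in hits:
        by_keyword[keyword].add(speaker)
    return {kw: sorted(s) for kw, s in by_keyword.items() if len(s) >= 2}


print(sharing_groups([("A", "Sunday"), ("B", "Sunday"), ("C", "Monday"), ("A", "Sunday")]))
```

Here speaker C, who mentioned only a keyword nobody else uttered, is left out of the group that shares the schedule.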

(Appendix 12)
A schedule creation method comprising:
an extraction step of extracting keywords belonging to a category related to a schedule from the conversation voice of a plurality of speakers;
an estimation step of estimating whether a schedule change is present in the conversation voice, based on whether a plurality of different keywords belonging to the same category have been extracted in the extraction step;
a selection step of selecting, when a schedule change is estimated to be present in the estimation step, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
a schedule creation step of creating the shared schedule based on the keyword selected in the selection step.

(Appendix 13)
A program for causing a computer to function as:
extraction means for extracting keywords belonging to a category related to a schedule from the conversation voice of a plurality of speakers;
estimation means for estimating whether a schedule change is present in the conversation voice, based on whether the extraction means has extracted a plurality of different keywords belonging to the same category;
selection means for selecting, when the estimation means estimates that a schedule change is present, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
schedule creation means for creating the shared schedule based on the keyword selected by the selection means.

Description of reference numerals: 100 ... user terminal, 200, 200', 300, 400 ... schedule creation device, 11 ... control unit, 12 ... input unit, 13, 50 ... microphone, 14, 60 ... camera, 15 ... storage unit, 16 ... communication unit, 17 ... display unit, 20 ... control unit, 21 ... speaker identification unit, 22 ... extraction unit, 23 ... estimation unit, 24 ... selection unit, 25 ... schedule creation unit, 26 ... determination unit, 27 ... update unit, 28 ... schedule correction unit, 30 ... storage unit, 31 ... speaker identification template, 32 ... keyword table, 33 ... schedule format, 34 ... schedule storage unit, 40 ... communication unit

Claims

A schedule creation device comprising:
extraction means for extracting keywords belonging to a category related to a schedule from the conversation voice of a plurality of speakers;
estimation means for estimating whether a schedule change is present in the conversation voice, based on whether the extraction means has extracted a plurality of different keywords belonging to the same category;
selection means for selecting, when the estimation means estimates that a schedule change is present, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
schedule creation means for creating the shared schedule based on the keyword selected by the selection means.

The schedule creation device according to claim 1, wherein the estimation means estimates whether a schedule change is present in the conversation voice while excluding any schedule other than the schedule to which the schedule change relates.

The schedule creation device according to claim 1 or 2, wherein
the estimation means estimates that a schedule change is present in the conversation voice when the extraction means has extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that a schedule change is present, selects, from among the plurality of different keywords belonging to the same category, a keyword uttered by the plurality of speakers in preference to a keyword uttered by a single speaker alone.

The schedule creation device according to claim 1 or 2, wherein
the estimation means estimates that a schedule change is present in the conversation voice when the extraction means has extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that a schedule change is present, selects, from among the plurality of different keywords belonging to the same category, a keyword appearing later in the time series in preference to a keyword appearing earlier.

The schedule creation device according to any one of claims 1 to 4, wherein
the estimation means estimates that no schedule change is present in the conversation voice when the extraction means has not extracted a plurality of different keywords belonging to the same category, and
the selection means, when the estimation means estimates that no schedule change is present, selects the one keyword belonging to the same category that the extraction means has extracted.

The schedule creation device according to any one of claims 1 to 5, further comprising
specifying means for specifying, based on the volume or tone of the conversation voice of the plurality of speakers, the speakers among the plurality of speakers who share the shared schedule, wherein
the schedule creation means creates, based on the keyword selected by the selection means, a shared schedule shared among the speakers specified by the specifying means.

The schedule creation device according to any one of claims 1 to 6, further comprising:
determination means for determining whether the shared schedule created by the schedule creation means has been shared with all of the plurality of speakers; and
update means for updating, when the determination means determines that the shared schedule has not been shared with all of the plurality of speakers, the schedule of each speaker with whom it has not been shared.

The schedule creation device according to any one of claims 1 to 7, wherein the extraction means converts the conversation voice into text and, when the converted text matches the text of a keyword belonging to any one of a plurality of categories learned in advance, extracts the matched text as a keyword.

The schedule creation device according to any one of claims 1 to 8, further comprising identification means for identifying the plurality of speakers by voice recognition or image recognition.

A schedule creation device comprising:
speaker identification means for identifying speakers;
extraction means for extracting, when the speaker identification means identifies a plurality of speakers, keywords belonging to a category related to a schedule from the conversation voice of the plurality of speakers;
schedule creation means for creating a shared schedule shared only among the speakers who uttered the keywords extracted by the extraction means; and
schedule correction means for correcting the shared schedule created by the schedule creation means based on the conversation voice of the plurality of speakers.

The schedule creation device according to claim 10, wherein the schedule creation means creates a shared schedule shared only among the speakers who uttered a common keyword among the keywords extracted by the extraction means.

A schedule creation method comprising:
an extraction step of extracting keywords belonging to a category related to a schedule from the conversation voice of a plurality of speakers;
an estimation step of estimating whether a schedule change is present in the conversation voice, based on whether a plurality of different keywords belonging to the same category have been extracted in the extraction step;
a selection step of selecting, when a schedule change is estimated to be present in the estimation step, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
a schedule creation step of creating the shared schedule based on the keyword selected in the selection step.

A program for causing a computer to function as:
extraction means for extracting keywords belonging to a category related to a schedule from the conversation voice of a plurality of speakers;
estimation means for estimating whether a schedule change is present in the conversation voice, based on whether the extraction means has extracted a plurality of different keywords belonging to the same category;
selection means for selecting, when the estimation means estimates that a schedule change is present, a keyword to be used for a shared schedule shared among the plurality of speakers from among the plurality of different keywords belonging to the same category; and
schedule creation means for creating the shared schedule based on the keyword selected by the selection means.