JP2007067972A

JP2007067972A - Conference system and control method for conference system

Info

Publication number: JP2007067972A
Application number: JP2005252817A
Authority: JP
Inventors: Yoshihiko Iwase; 好彦岩瀬; Toshinobu Tokita; 俊伸時田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-08-31
Filing date: 2005-08-31
Publication date: 2007-03-15

Abstract

PROBLEM TO BE SOLVED: To provide effective images in a conference or the like by, for example, preferentially photographing a person closely related to the conference or a person making remarks closely related to it.

SOLUTION: A participant recognition section 3 identifies each participant; a processing section 5 acquires the identified person's information from a database 11 that holds person information for each person; and an inference section 10a determines a priority for each person based on the acquired person information. A camera 1 is controlled based on the priority determined for each person.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a conference system applicable to, for example, a video conference system, and to a control method for such a conference system.

Conventionally, video conference systems have been realized that allow conference participants at remote locations to hold a meeting with one another. Such a system comprises a camera, a display, a microphone, and the like. It may shoot each entire conference room with a fixed camera, let a participant remotely operate the camera installed at the other site to display a desired view, or identify the speaker by voice and shoot the speaker. The video is shown on a display, and the participants at the remote locations hold the meeting through the displayed video.

For example, Patent Document 1 discloses a technique intended to let a user easily and instantly specify a desired camera attitude or the like during a video conference or remote monitoring. The video captured by a camera mounted on a pan head and the video captured by a wide-angle camera are both shown on a display unit, so that, when controlling the attitude of the remote pan-head camera while watching the video, the user can control the pan head while viewing the entire area that the camera can cover by moving. The user thus feels as if controlling the shooting range while holding the camera on site like a camera operator, and can operate it easily.

Patent Document 2 discloses a technique intended to provide a conference photographing apparatus that is easy to install and places no limit on the number of participants. An omnidirectional microphone collects the surrounding sound, a participant's voiceprint is extracted from the sound, the speaker is identified from the voiceprint data, and a rotating camera located at the center of the participants is turned toward the participant who spoke, so that the speaker can be photographed.

Patent Document 1: JP 2000-032319 A
Patent Document 2: JP H10-304329 A

Accordingly, an object of the present invention is to improve on the above techniques and provide effective images in a conference or the like by, for example, preferentially photographing a person closely related to the conference or a person who makes remarks closely related to it.

A first aspect of the conference system of the present invention comprises: recognition means for performing personal recognition of persons; person information acquisition means for acquiring, from a first database that holds person information for each person, the person information of each person recognized by the recognition means; priority determination means for determining a priority for each person based on the person information of each recognized person; and control means for controlling a plurality of imaging devices based on the priority determined for each person.

A second aspect of the conference system of the present invention comprises: audio data acquisition means for acquiring audio data; recognition means for recognizing a person based on the audio data; keyword extraction means for extracting predetermined keyword information from the audio data; priority determination means for determining, based on the keyword information, the priority of the person recognized by the recognition means; and control means for controlling a plurality of imaging devices based on the priority determined for each recognized person.

A first aspect of the conference system control method of the present invention is a method of controlling a conference system having a plurality of imaging devices, comprising: a recognition step of performing personal recognition of persons; a person information acquisition step of acquiring, from a database that holds person information for each person, the person information of each person recognized in the recognition step; a priority determination step of determining a priority for each person based on the person information of each recognized person; and a control step of controlling the imaging devices based on the priority determined for each person.

A second aspect of the conference system control method of the present invention is a method of controlling a conference system having a plurality of imaging devices, comprising: an audio data acquisition step of acquiring audio data; a recognition step of recognizing a person based on the audio data; a keyword extraction step of extracting predetermined keyword information from the audio data; a priority determination step of determining, based on the keyword information, the priority of the person recognized in the recognition step; and a control step of controlling at least one imaging device that captures images, based on the priority determined for each recognized person.

The program of the present invention causes a computer to execute the first or second aspect of the conference system control method.

The computer-readable recording medium of the present invention has the program recorded on it.

In the present invention, a priority is determined for each person based on that person's person information or on keyword information contained in the audio data, and the imaging devices are controlled based on that priority. According to the present invention, therefore, a high priority can be given to, for example, a person whose person information is closely related to the conference or a person who makes remarks closely related to it, and effective images can be provided for the conference by, for example, preferentially photographing a particular person according to the assigned priority.

Preferred embodiments to which the present invention is applied are described in detail below with reference to the accompanying drawings.

In the video conference system according to an embodiment of the present invention, participants are prioritized according to participant information, such as attributes and career history, and according to the content of the conference, and camera work is centered on the persons most closely related to the agenda.

FIG. 1 is a configuration diagram of a video conference system according to an embodiment of the present invention. The system includes a plurality of cameras 1 (imaging devices) for photographing participants, a display 2 for displaying participants at an external site, and a microphone 4. It further includes a participant recognition unit 3, which is built into or externally connected to the display 2; a processing unit 5 that associates participants with participant information; a recording unit 6; an inference unit 10a that calculates participant priorities; a database 11 holding participant information; and a keyword database 12. The processing unit 5, the recording unit 6, the inference unit 10a, the database 11, and the keyword database 12 are implemented by an information processing apparatus such as a personal computer.

Next, how the participant recognition unit 3 acquires participant information is described with reference to FIG. 2, which shows two embodiments of the acquisition method. In the example of FIG. 2A, the participant recognition unit 3 is implemented by a reader 8. The reader 8 is installed at the entrance of the conference room and recognizes a participant by reading the personal information in a tag 7 carried by the participant. The tag 7 communicates with the reader 8 wirelessly without contact; as another embodiment, a contact-type communication device may be used, from which the reader 8 reads the internally recorded personal information through contact-type communication. The processing unit 5 uses the participant ID that was read to look up the database 11 and acquire information about the participant (participant information). The inference unit 10a calculates the participant's priority based on the participant information acquired by the processing unit 5, and records the participant information and the priority in association with each other in the recording unit 6.

In the example of FIG. 2B, the participant recognition unit 3 is implemented by the camera 1 and an image processing unit 9a, and participants are recognized by image recognition. Specifically, a participant entering the conference room is photographed by the camera 1. The image processing unit 9a extracts the participant's feature information from the image data captured by the camera 1 and recognizes the participant by comparing the extracted feature information with the feature information of persons registered in a feature database 13a. The processing unit 5 then acquires the attributes and career history (participant information) from the database 11 based on the identification information of the recognized participant. The inference unit 10a calculates the priority based on the participant information acquired by the processing unit 5 and records the participant information and the priority in the recording unit 6. Although not illustrated, the participant recognition unit 3 may instead recognize a participant's fingerprint, iris, veins, or the like at entry and perform biometric authentication of the participant, followed by the same subsequent processing.

The database 11 used to acquire participant information is now described in detail. FIG. 3 schematically shows an example of the structure of the data stored in the database 11. The attributes record items such as job title, affiliation, specialty, and name, and the career history records past affiliations, duties, and the like. Keywords for related fields and the individual's papers and patents are also recorded. Papers and patents may be collected from an external database reachable over a network rather than from a database on the in-house network. This information can be used to judge how closely a participant is related to the agenda of the conference and whether the participant has decision-making authority over that agenda.
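As an illustration only, a record in the database 11 might be modeled as follows. This is a minimal sketch: the class names, field names, and sample values are assumptions made for the example, since the text describes the categories of stored data (attributes, career history, keywords, papers and patents) but not a concrete schema.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class CareerEntry:
    """One career-history item (hypothetical layout)."""
    affiliation: str
    duties: str
    current: bool = False  # True if this is the participant's present assignment

@dataclass
class ParticipantRecord:
    """Participant information: attributes, career history, keywords, publications."""
    participant_id: str
    name: str
    title: str
    affiliation: str
    specialty: str
    career: list[CareerEntry] = field(default_factory=list)
    keywords: set[str] = field(default_factory=set)
    publications: list[str] = field(default_factory=list)  # papers and patents

# Hypothetical in-memory stand-in for database 11, keyed by participant ID.
DATABASE = {
    "P001": ParticipantRecord(
        participant_id="P001", name="A. Example", title="general manager",
        affiliation="Imaging Lab", specialty="image processing",
        career=[CareerEntry("Imaging Lab", "template matching", current=True)],
        keywords={"template matching", "image processing"},
    ),
}

def lookup(participant_id: str) -> ParticipantRecord | None:
    """Return the record for a recognized participant ID, or None if unknown."""
    return DATABASE.get(participant_id)
```

The processing unit 5 would call something like `lookup` with the ID obtained from the reader 8 or from image recognition.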

To photograph a specific participant in the video conference system, the system must know who is where, that is, the position of each participant. How participant positions are acquired is described with reference to FIG. 4, which shows two embodiments of the acquisition method.

FIG. 4A shows a method of acquiring participant position information by linking flow-line analysis with participant information. First, before a participant enters, personal recognition is performed by the participant recognition unit 3 at the entrance of the conference room, and the order in which participants are recognized is stored in the recording unit 6. Each recognized participant is assigned a number in order of entry. When a participant enters the room, the participant is photographed by the camera 1 installed in the room, and a second image processing unit 9b performs flow-line analysis. Specifically, each entering participant is tracked by the camera 1, and the participant's movement history and final position are accumulated as the flow-line analysis result, so that the participant's flow line can be determined. By associating the order in which the participant recognition unit 3 recognized the participants with the flow-line analysis results ordered by entry, the system can determine who sat where. The result is recorded in the recording unit 6 as the participant's position information, together with the participant information.
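The association step of FIG. 4A can be sketched as pairing two sequences by entry order. This is a minimal sketch under the assumption that the flow-line analysis yields one final (x, y) position per tracked person, in the order the persons entered; the function name and data shapes are hypothetical.

```python
def assign_positions(recognized_ids, final_positions):
    """Pair the i-th recognized participant with the i-th tracked final position.

    recognized_ids: participant IDs in the order they were recognized at the entrance.
    final_positions: final (x, y) positions from flow-line analysis, in entry order.
    """
    if len(recognized_ids) != len(final_positions):
        # A count mismatch means the two orderings cannot be aligned safely.
        raise ValueError("recognition count and tracked-person count differ")
    return dict(zip(recognized_ids, final_positions))
```

For example, `assign_positions(["P001", "P002"], [(1.0, 2.0), (3.0, 0.5)])` maps each ID to the seat where the corresponding tracked person ended up. The pairing is only as reliable as the assumption that recognition order equals entry order.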

FIG. 4B shows a method of acquiring both participant information and position information using the participant recognition scheme of FIG. 2B. In this case, after the participants are seated, the image processing unit 9a performs image recognition on the image data captured by each camera 1 to recognize each participant, and calculates the direction and distance from the camera 1 to that participant. Recognition and position acquisition are thus performed together. After a participant is recognized, the processing unit 5 looks up participant information such as attributes and career history in the database 11, and the inference unit 10a calculates the priority based on the participant information and records it in the recording unit 6 together with the position information.
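The text does not say how direction and distance are turned into a stored position. One simple possibility, shown here purely as an assumption, is to convert them to 2D floor-plan coordinates relative to a known camera location, with the direction expressed as a bearing in radians measured from the x axis.

```python
import math

def participant_position(camera_xy, bearing_rad, distance):
    """Convert a (bearing, distance) observation from a camera into floor coordinates.

    camera_xy: (x, y) of the camera on the floor plan (assumed known).
    bearing_rad: direction to the participant, in radians from the +x axis.
    distance: distance from the camera to the participant, same unit as camera_xy.
    """
    cx, cy = camera_xy
    return (cx + distance * math.cos(bearing_rad),
            cy + distance * math.sin(bearing_rad))
```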

In the case of FIG. 4A, the method of FIG. 4B may be used in combination to prevent a mismatch between participant information and position information caused by a discrepancy between the order in which participants are recognized at the entrance and the order in which they actually enter. In that case, because the participant information has already been obtained from the database 11 before entry, there is no need to consult the database 11 again after entry, nor to recalculate the priority. Moreover, because each participant has already been identified at entry, the image recognition of FIG. 4B only needs to verify whether the person at a given position is that participant. If a mismatch is found, the other participants in the conference are recognized and the correct participants are associated with the position information.

Next, the processing performed by the information processing apparatus to prioritize participants when a participant is recognized is described with reference to the flowchart of FIG. 5. When recognition information for a participant is received from the participant recognition unit 3 in step S1, the processing unit 5 queries the database 11 in step S2 and acquires the corresponding participant information. The participant information acquired here comprises the participant's attribute information, career information, and points for the number and content of the participant's patents. In step S3, the processing unit 5 counts the number of participants from the number of participant information records acquired. In step S4, the inference unit 10a calculates a priority from the keywords related to the conference and the participant information, and records it in the recording unit 6 together with the participant information. The conference-related keywords and conference content are entered in advance by the chairperson or a participant, using for example a conference reservation system on the in-house intranet to which the information processing apparatus is connected, and are stored in the recording unit 6. In step S5, if more than one participant is currently recognized, the inference unit 10a orders the participants by the priorities calculated in step S4.

Performing this processing for every participant yields a priority ranking of all participants.
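The flow of steps S1 to S5 can be sketched as a single handler that runs each time a participant is recognized. This is an illustrative sketch only: the function name, the dictionary-based stand-ins for the database 11 and the recording unit 6, and the pluggable `score` function are assumptions, not details given in the text.

```python
def on_participant_recognized(pid, database, score, records):
    """Handle one recognition event (S1) and return the current ranking.

    database: mapping of participant ID to participant information (database 11).
    score: function computing a priority from participant information (S4).
    records: mutable mapping standing in for recording unit 6.
    """
    info = database[pid]                      # S2: fetch participant information
    records[pid] = {"info": info}
    count = len(records)                      # S3: count recognized participants
    records[pid]["priority"] = score(info)    # S4: compute and record the priority
    if count <= 1:                            # S5: ranking only needed for 2+ people
        return [pid]
    return sorted(records, key=lambda p: records[p]["priority"], reverse=True)
```

Because the ranking is recomputed on every recognition event, a participant who arrives unannounced is ranked like any other.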

The priority calculation in step S4 is now described. The priority is calculated from the participant's decision-making authority and degree of relevance. The decision-making authority is a value indicating how much authority the participant has over the decisions of the conference, and the degree of relevance is a value indicating how closely the participant is related to the agenda. The decision-making authority is assigned a numerical value according to the job title contained in the attribute information; for example, a general manager receives a higher value than a rank-and-file employee. In addition, a measure of social influence, based on the number of hits returned when the participant's name is searched with a search system connected to a network, may be added to the decision-making authority.
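The text states that the priority is calculated from decision-making authority and relevance but does not give the combining formula. The sketch below assumes a weighted sum and a hypothetical title-to-score table; both are illustrative choices, not the method of the embodiment.

```python
# Hypothetical scores per job title; the text only says higher titles score higher.
TITLE_SCORES = {"general manager": 1.0, "manager": 0.7, "staff": 0.3}

def decision_authority(title, default=0.1):
    """Map a job title to a decision-authority score (unknown titles get a default)."""
    return TITLE_SCORES.get(title, default)

def priority(title, relevance, w_authority=0.5, w_relevance=0.5):
    """Combine authority and relevance; the weighted sum is an assumption."""
    return w_authority * decision_authority(title) + w_relevance * relevance
```

With equal relevance, a general manager outranks a staff member; with equal titles, the more relevant participant outranks the less relevant one, which matches the behavior described in the text.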

The degree of relevance is described next. It is calculated as the sum of two values: one obtained from the match degree of each keyword and its depth of association, and one obtained from the number of keyword matches and the career history. FIG. 6 shows an example of a tree-structured keyword table registered in the keyword database 12.

First, the keyword match degree and the depth of association are described. Suppose "template matching" is registered as a keyword related to the agenda. If the participant information contains the keyword "template matching", the keyword match degree is 1. If it does not contain "template matching" but contains the similar keyword "matching", the keyword match degree is calculated as, for example, 0.9. The depth of association is also taken into account: if the matched keyword lies at the end of a branch of the tree structure, the depth of association is 1, whereas the closer a keyword is to the trunk of the tree, the broader its meaning and the closer its depth of association is to 0. For example, even if the match degree for a keyword such as "image processing" is high, the meaning covered by "image processing" is broad (it is close to the trunk of the tree). Consequently, when both the match degree and the depth of association are taken into account, for example by multiplying the two values, the result is a value smaller than 1.
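The product of match degree and depth of association can be sketched as follows. The tree contents, the substring rule that yields the 0.9 match degree, and the formula mapping tree position to a depth value between 0 and 1 are all assumptions made for the example; the text fixes only the end points (1 at a leaf, near 0 toward the trunk) and the example values 1 and 0.9.

```python
# Hypothetical keyword tree (parent -> children), standing in for FIG. 6.
CHILDREN = {
    "image processing": ["matching", "background subtraction"],
    "matching": ["template matching"],
}
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}

def depth(keyword):
    """Number of edges from the root down to this keyword."""
    d = 0
    while keyword in PARENT:
        keyword = PARENT[keyword]
        d += 1
    return d

def height(keyword):
    """Number of edges from this keyword down to its deepest descendant leaf."""
    kids = CHILDREN.get(keyword, [])
    return 0 if not kids else 1 + max(height(k) for k in kids)

def depth_of_association(keyword):
    """1.0 at a leaf, 0.0 at the root, in between for inner nodes (assumed formula)."""
    d, h = depth(keyword), height(keyword)
    return 1.0 if h == 0 else d / (d + h)

def match_degree(agenda_kw, participant_kw):
    """1.0 for an exact match, 0.9 if one contains the other (assumed rule), else 0."""
    if agenda_kw == participant_kw:
        return 1.0
    if agenda_kw in participant_kw or participant_kw in agenda_kw:
        return 0.9
    return 0.0

def keyword_score(agenda_kw, participant_kw):
    """Match degree multiplied by the depth of association of the matched keyword."""
    return match_degree(agenda_kw, participant_kw) * depth_of_association(participant_kw)
```

Under these assumptions an exact match on the leaf "template matching" scores 1.0, while an exact match on the broad root keyword "image processing" scores 0.0.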

Next, the number of keyword matches and the career history (years) are described. This value is calculated by referring to the database 11, based on the number of matched keywords and the career entries in which those keywords were found. For example, if several participants have the same total number of keyword types matching the agenda-related keywords, a participant currently engaged in the relevant work is judged to be more relevant than one who was engaged in it in the past. As for the number of keywords, it is difficult to decide relevance by drawing a line at any single count, so fuzzy inference is used to allow for ambiguity. FIG. 7 shows an example of the fuzzy variables and fuzzy rules for this case: FIG. 7A shows the consequent fuzzy variable and FIG. 7B shows the fuzzy rules. In the Min-Max method of composing the solution, the centroid operation can be defined as Equation 1, where y₀ is the centroid to be found, y is the horizontal axis, and μ(y) is the composed fuzzy set.
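Equation 1 itself is not reproduced in this text. Given the definitions above (centroid y₀, horizontal axis y, composed fuzzy set μ(y)), the standard centroid defuzzification formula is the natural reading, and is offered here on that assumption:

```latex
\[
  y_0 = \frac{\int y \, \mu(y) \, dy}{\int \mu(y) \, dy}
  \tag{1}
\]
```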

The degree of relevance is thus calculated as the sum of the value obtained from the keyword match degree and depth of association and the value obtained from the number of keyword matches and the career history. Using this degree of relevance in the priority calculation gives a higher priority to the participant more closely related to the agenda, even among participants with the same job title.

For simplicity, the degree of relevance may instead be only the value obtained from the keyword match degree and depth of association, or only the value obtained from the number of keyword matches and the years of career history.
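The fuzzy evaluation of keyword-match count and career history can be sketched with Min-Max inference and a discretized centroid. Everything specific here is assumed for illustration: the membership functions, the input ranges (0 to 5 matches, 0 to 10 years since the participant last worked on the topic), and the four rules. FIG. 7 defines the actual variables and rules, which this text does not reproduce.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi, clamped outside."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def ramp_down(x, lo, hi):
    """Membership falling linearly from 1 at lo to 0 at hi, clamped outside."""
    return 1.0 - ramp_up(x, lo, hi)

# Consequent fuzzy sets over the output axis y in [0, 1] (assumed shapes).
def low(y):
    return ramp_down(y, 0.0, 0.5)

def medium(y):
    return max(0.0, 1.0 - abs(y - 0.5) / 0.5)

def high(y):
    return ramp_up(y, 0.5, 1.0)

def fuzzy_relevance(n_matches, years_since, steps=101):
    """Min-Max inference followed by a discretized centroid."""
    many, few = ramp_up(n_matches, 0, 5), ramp_down(n_matches, 0, 5)
    old, recent = ramp_up(years_since, 0, 10), ramp_down(years_since, 0, 10)
    rules = [  # (firing strength via min, consequent set)
        (min(many, recent), high),
        (min(many, old), medium),
        (min(few, recent), medium),
        (min(few, old), low),
    ]
    num = den = 0.0
    for i in range(steps):
        y = i / (steps - 1)
        # Clip each consequent at its rule strength (min), then combine by max.
        mu = max(min(w, f(y)) for w, f in rules)
        num += y * mu
        den += mu
    return num / den if den else 0.0
```

With the same number of matches, a participant currently working on the topic (`years_since=0`) scores higher than one who worked on it long ago, which is the behavior the text describes.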

As described above, participants are prioritized automatically at the time of the conference from the agenda, the related keywords, and the participants' attributes and career information, so unscheduled participants can be accommodated flexibly. Furthermore, even if the chairperson does not know the participants' past careers or fields of expertise, participants can be prioritized each time according to the content of the conference.

FIG. 8 shows examples of camera work based on the participant priorities described so far. As described above, each participant's position information is recorded in the recording unit 6 together with the participant information, and the cameras 1 photograph the participants based on this position information. When the number of participants exceeds the number of cameras 1, the cameras 1 other than the one photographing the speaker are assigned to participants in order of priority, choosing cameras installed at suitable positions. For example, the participant with the highest priority may be shot in close-up as shown in FIG. 8A, the participants with the next highest priorities may be shot zoomed in as a group of several people as shown in FIG. 8B, and the remaining participants may be shot at wide angle as shown in FIG. 8C. By watching video shot in this way, the speaker and the other participants can see the reactions of the participants most relevant to the agenda. The camera work is not limited to the examples shown here.
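The shot allocation of FIG. 8 can be sketched as below. This is a simplified illustration under stated assumptions: the function name, the group size of three, and the rule of dropping the lowest-priority shots when cameras run out are hypothetical, and camera placement ("installed at suitable positions") is not modeled.

```python
def plan_shots(ranked, n_cameras, speaker=None, group_size=3):
    """Assign shot types to cameras in priority order.

    ranked: participant IDs sorted from highest to lowest priority.
    Returns a list of (camera_index, (shot_type, participants)).
    """
    shots = []
    remaining = [p for p in ranked if p != speaker]
    if speaker is not None:
        shots.append(("close-up", [speaker]))       # one camera stays on the speaker
    if remaining:
        shots.append(("close-up", remaining[:1]))   # highest priority: close-up
    group = remaining[1:1 + group_size]
    if group:
        shots.append(("group", group))              # next few: zoomed group shot
    rest = remaining[1 + group_size:]
    if rest:
        shots.append(("wide", rest))                # everyone else: wide angle
    return list(enumerate(shots[:n_cameras]))       # drop shots beyond camera count
```

For six ranked participants, three cameras, and speaker "C", the plan is a close-up of "C", a close-up of the top-ranked participant, and one group shot; the wide shot is dropped for lack of a fourth camera.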

As described above, performing camera work according to the priority ranking makes it possible to preferentially photograph decision makers and participants closely related to the agenda, so that their reactions to the opinions of a speaker, whether on the speaking side or the listening side, can be seen.

Next, as another embodiment of the present invention, a video conference system is described that detects and recognizes keywords in the participants' speech, prioritizes the participants, and performs camera work that follows the agenda and the flow of the discussion.

FIG. 9 is a configuration diagram of a video conference system according to another embodiment of the present invention. The information processing apparatus of this embodiment includes the recording unit 6, a second inference unit 10b, the keyword database 12, a second feature database 13b, and a voice processing unit 14. The voice processing unit 14 detects and recognizes keywords in the participants' speech and associates them with the participants. The second inference unit 10b calculates priorities based on speaking time and spoken keywords. Apart from the second feature database 13b, which records feature information of audio data for each person, the basic configuration (for example, the arrangement for detecting participant positions and controlling the drive of the cameras 1 based on the detection results) is the same as in the first embodiment, and its description is omitted.

First, the flow of processing for prioritizing participants when their speech is recognized will be described with reference to the flowchart of FIG. 10. In step S11, the voice processing unit 14 acquires the speaker's voice data from the microphone 4 and recognizes the speaker from the feature information of the acquired voice data and the feature information of persons registered in the feature database 13a. Next, in step S12, the voice processing unit 14 records the correspondence between the speaker and the participant in the recording unit 6.

Next, the voice processing unit 14 segments the voice data in step S13 and extracts keywords from the participant's speech in step S14. In step S15, it registers the speech time in the recording unit 6, further associated with the speaker and the participant.

Next, in step S16, the second inference unit 10b calculates a priority from each of the extracted keywords and registers the priority in the recording unit 6, further associated with the speaker, the participant, and the speech time.
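The way the recording unit 6 accumulates associations over steps S12 to S16 can be pictured with a small sketch. This is only an illustration: the class names `UtteranceRecord` and `RecordingUnit`, the field names, and the use of a dictionary keyed by a speaker identifier are assumptions and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class UtteranceRecord:
    """One entry in the recording unit (hypothetical layout)."""
    speaker_id: str            # identity recognized from the voice features (S11)
    participant: str           # participant associated with the speaker (S12)
    speech_time: float = 0.0   # accumulated speech time in seconds (S15)
    keywords: list = field(default_factory=list)  # extracted keywords (S14)
    priority: float = 0.0      # priority from the second inference unit (S16)


class RecordingUnit:
    """Minimal stand-in for recording unit 6."""

    def __init__(self):
        self.records = {}

    def register_speaker(self, speaker_id, participant):
        # S12: record the speaker/participant correspondence once.
        return self.records.setdefault(
            speaker_id, UtteranceRecord(speaker_id, participant))

    def add_utterance(self, speaker_id, seconds, keywords):
        # S14-S15: attach keywords and add to the total speech time.
        record = self.records[speaker_id]
        record.speech_time += seconds
        record.keywords.extend(keywords)
        return record

    def set_priority(self, speaker_id, priority):
        # S16: store the priority next to the other associations.
        self.records[speaker_id].priority = priority

    def ranking(self):
        # Participants ordered from highest to lowest priority.
        return sorted(self.records.values(),
                      key=lambda r: r.priority, reverse=True)
```

Keeping every association in a single record per speaker means the later ordering step only has to sort the stored records.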

In the keyword database 12, in addition to keywords registered in advance, keywords that occur frequently during the conference are also judged to be important and are registered in the keyword database 12 as keywords for calculating the priority. For example, even if a keyword such as "background difference" was not registered in advance, if it is used frequently in the conference, the priority of conference participants who have related attributes or backgrounds rises. Thus, even a participant whose priority was not high at the start of the conference may gain priority depending on the flow of the discussion. In this way, participant priorities can be calculated in real time according to the flow of the discussion. Therefore, even if the agenda changes midway from the content planned in advance, the participants can be prioritized (step S17) each time according to the content of the conference.
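The registration of frequently used terms could be sketched as a counter with a promotion threshold. The specification does not state how "frequent" is decided, so the threshold `promote_after`, the class name `KeywordDatabase`, and the method `observe` are all assumptions made for illustration.

```python
from collections import Counter


class KeywordDatabase:
    """Toy stand-in for keyword database 12 (names are hypothetical)."""

    def __init__(self, preregistered, promote_after=3):
        self.keywords = set(preregistered)   # keywords registered in advance
        self.promote_after = promote_after   # assumed frequency threshold
        self._counts = Counter()             # how often each term was heard

    def observe(self, term):
        """Count one occurrence of a spoken term.

        A term heard at least `promote_after` times is added to the
        keyword set. Returns True if the term is currently a keyword
        used for priority calculation.
        """
        self._counts[term] += 1
        if self._counts[term] >= self.promote_after:
            self.keywords.add(term)
        return term in self.keywords
```

With a threshold of 3, a term that was not registered in advance starts to influence priorities from its third occurrence, while a pre-registered term counts from the first.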

Here, the method of calculating the priority in step S16 will be described. The calculation in step S16 is the priority calculation of step S4 in FIG. 5 with two further factors taken into account: whether the person is a speaker, and the total speech time. While a person is speaking, the speaker degree is set to 1. If the speaker degree were set to 0 the moment the person fell silent, the person would be judged not to be a speaker after merely pausing for breath; the person is therefore treated as a speaker for a fixed period after speaking. This is shown in FIG. 11. For the speech time, a value converted from the total speech time is used. In this embodiment, the priority is obtained by adding these values to, or multiplying them by, the values of the elements used in the priority calculation of step S4 in FIG. 5. However, since participant information is not used in the priority calculation in this embodiment, the years of the career history in which a keyword is found by referring to the database 11 are not taken into account.
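A minimal sketch of the speaker degree with a hold period, and of one way the extra factors might be combined with a keyword-based priority, is given below. The hold length, the linear saturating conversion of total speech time, and the choice of addition rather than multiplication are assumptions; the text only says that the speaker status persists for a fixed time and that the values are added or multiplied, and the exact curve of FIG. 11 is not reproduced here.

```python
def speaker_degree(now, speaking, last_speech_end, hold_seconds=3.0):
    """Return 1.0 while speaking or within the hold period, else 0.0.

    `hold_seconds` is an assumed value for the fixed period during which
    a person who has just stopped talking still counts as the speaker.
    """
    if speaking:
        return 1.0
    if last_speech_end is not None and 0 <= now - last_speech_end <= hold_seconds:
        return 1.0
    return 0.0


def speech_time_score(total_seconds, saturation_seconds=300.0):
    """Convert total speech time to a value in [0, 1] (assumed mapping)."""
    return min(total_seconds / saturation_seconds, 1.0)


def combined_priority(keyword_priority, degree, time_score):
    """Additive combination; multiplication is the other option mentioned."""
    return keyword_priority + degree + time_score
```

For example, a participant who stopped speaking two seconds ago still has a speaker degree of 1.0 under the assumed three-second hold, so a short pause does not hand the camera to someone else.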

FIG. 12 shows examples of the screen display layout used when the video selected by the display switching unit is shown on the listener-side display. FIG. 12(a) shows the case in which the speaker and a decision maker on the speaker's side are displayed. FIG. 12(b) shows the case in which video of a participant closely related to the agenda is displayed at the same time. The display layout is not limited to the types shown here. The screen configuration may be a single screen rather than a multi-screen, and the screen settings can be changed automatically or manually.

Screen display switching has modes such as person-of-interest priority, speaker priority, and hybrid. In the person-of-interest priority mode, the participant with the highest priority according to the priority order calculated by the inference unit 10a is displayed on the main screen, and video of the participant with the next highest priority is displayed on the sub-screen. In the speaker priority mode, the speaker with the highest priority among the speakers, as calculated by the second inference unit 10b, is displayed on the main screen, and video of the speaker with the next highest priority is displayed on the sub-screen. In the hybrid mode, the speaker is displayed on the main screen and a person of interest such as a decision maker is displayed on the sub-screen; this arrangement may be reversed. The mode may be switched automatically according to the speech time or set manually by a participant. For example, the listener-side display may use the speaker priority mode or the hybrid mode while the speaker-side display uses the person-of-interest priority mode.
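The three display modes can be summarized in a short sketch that picks the main-screen and sub-screen subjects from two priority tables. The function name `select_screens`, the dictionary inputs, and the rule that the hybrid mode skips the main-screen speaker when choosing the person of interest are assumptions for illustration only.

```python
from enum import Enum


class Mode(Enum):
    PERSON_OF_INTEREST = "person_of_interest"
    SPEAKER = "speaker"
    HYBRID = "hybrid"


def select_screens(mode, interest_priority, speaker_priority):
    """Return (main, sub) participant names for the given mode.

    interest_priority: name -> priority from inference unit 10a
    speaker_priority:  name -> priority from second inference unit 10b
                       (only participants who have spoken)
    """
    by_interest = sorted(interest_priority, key=interest_priority.get, reverse=True)
    by_speaker = sorted(speaker_priority, key=speaker_priority.get, reverse=True)

    def top_two(ranked):
        main = ranked[0] if ranked else None
        sub = ranked[1] if len(ranked) > 1 else None
        return main, sub

    if mode is Mode.PERSON_OF_INTEREST:
        return top_two(by_interest)
    if mode is Mode.SPEAKER:
        return top_two(by_speaker)
    # Hybrid: top speaker on the main screen, top person of interest
    # (other than that speaker) on the sub-screen.
    main = by_speaker[0] if by_speaker else None
    sub = next((name for name in by_interest if name != main), None)
    return main, sub
```

Swapping the two returned values gives the reversed hybrid arrangement mentioned in the text.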

According to the embodiment described above, the camera does not merely focus on and capture the speaker. Participants are prioritized according to the agenda and the flow of the discussion on the basis of their attribute information and backgrounds, so that decision makers and participants closely related to the agenda can be captured preferentially as they react to a statement. Consequently, the speaker and the other conference participants can see how decision makers and closely related participants, whether on the speaker's side or at another site, react to an opinion, and a video conference that displays effective video can be held.

Furthermore, in a conference, important opinions are assumed to come mostly from participants who have knowledge related to the agenda. Therefore, by mainly capturing the persons relevant to the agenda, frequent switching of the camera video and oversensitive camera reactions to sound can be reduced.

Needless to say, the object of the present invention is also achieved by supplying the above-described information processing apparatus with a storage medium on which the program code of software that realizes the functions of the above-described embodiments is recorded, and by having the computer (or CPU or MPU) of the information processing apparatus read and execute the program code stored in the storage medium.

In this case, the program code read from the storage medium itself realizes the functions of the above-described embodiments, and the program code itself and the storage medium storing the program code constitute the present invention.

As the storage medium for supplying the program code, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile memory card, or a ROM can be used.

Needless to say, the invention covers not only the case in which the functions of the above-described embodiments are realized by the computer executing the program code it has read, but also the case in which an OS (basic system or operating system) running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.

Needless to say, the invention also covers the case in which the program code read from the storage medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or in the function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.

FIG. 1 is a configuration diagram of a video conference system according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a method by which the participant authentication unit acquires participant information.
FIG. 3 is a diagram schematically showing a configuration example of the data stored in the database.
FIG. 4 is a diagram for explaining a method of acquiring the positions of participants.
FIG. 5 is a flowchart showing the flow of processing for prioritizing participants when a participant is authenticated.
FIG. 6 is a diagram showing an example of a tree-structured keyword table registered in the keyword database.
FIG. 7 is a diagram showing an example of fuzzy conversion and fuzzy rules.
FIG. 8 is a diagram showing an example of camera work according to participant priority.
FIG. 9 is a configuration diagram of a video conference system according to another embodiment of the present invention.
FIG. 10 is a flowchart showing the flow of processing for prioritizing participants when a participant's speech is recognized.
FIG. 11 is a diagram showing the relationship between the speaker degree and the speech time.
FIG. 12 is a diagram showing examples of the screen display layout used when the video selected by the display switching unit is shown on a display.

Explanation of symbols

1 Camera
2 Display
3 Participant recognition unit
4 Microphone
5 Processing unit
6 Recording unit
7 Tag
8 Reader
9 Image processing unit
10a Inference unit
10b Second inference unit
11 Database
12 Keyword database
13a Feature database
13b Second feature database
14 Voice processing unit

Claims

A conference system comprising:
recognition means for performing personal recognition of a person;
person information acquisition means for acquiring person information of the person recognized by the recognition means from a first database that holds, for each person, person information related to that person;
priority determination means for determining a priority for each person based on the person information of each person recognized by the recognition means; and
control means for controlling a plurality of imaging devices based on the priority determined for each person.

The conference system according to claim 1, wherein the priority determination means collates keyword information held in a second database with the person information of each person acquired by the person information acquisition means, and determines the priority for each person based on the collation result.

The conference system according to claim 2, wherein the priority determination means changes the priority to be determined according to the degree of coincidence between the keyword information held in the second database and the corresponding keyword information included in the person information.

The conference system according to claim 2 or 3, wherein the priority determination means changes the priority to be determined according to the content of the keyword information, included in the person information, that matches the keyword information held in the second database.

3. The priority determination unit changes the priority to be determined according to the number of keyword information included in the person information that matches the keyword information held in the second database. The conference system described in 1.

The conference system according to claim 2, wherein the priority determination means refers, in the first database, to time information corresponding to the keyword information, included in the person information, that matches the keyword information held in the second database, and changes the priority to be determined according to the time information.

The conference system according to claim 2, wherein the keyword information held in the second database is keyword information related to the content of the conference.

The conference system according to claim 1, wherein the priority determination means changes the priority to be determined according to information, included in the person information of each person, that indicates the position of the person.

The conference system according to claim 1, wherein the recognition means performs personal recognition of a person based on image data captured by the imaging device.

The conference system according to any one of claims 1 to 8, wherein the recognition means performs personal recognition of a person based on personal information of the person received from a communication device that performs contact or non-contact communication.

A conference system comprising:
audio data acquisition means for acquiring audio data;
recognition means for recognizing a person based on the audio data;
keyword extraction means for extracting predetermined keyword information from the audio data;
priority determination means for determining the priority of the person recognized by the recognition means based on the keyword information; and
control means for controlling a plurality of imaging devices based on the priority determined for each person recognized by the recognition means.

The conference system according to claim 11, wherein the priority determination means collates keyword information held in a third database with the keyword information extracted by the keyword extraction means, and determines the priority for each person based on the collation result.

The conference system according to claim 12, wherein the priority determination means changes the priority to be determined according to the degree of coincidence between the keyword information held in the third database and the corresponding keyword information extracted by the keyword extraction means.

The conference system according to claim 12 or 13, wherein the priority determination means changes the priority to be determined according to the content of the keyword information, extracted by the keyword extraction means, that matches the keyword information held in the third database.

The conference system according to claim 12, wherein the priority determination means changes the priority to be determined according to the number of items of keyword information, extracted by the keyword extraction means, that match the keyword information held in the third database.

The conference system according to any one of the above, wherein the priority determination means detects the speech time of the corresponding person based on the audio data and changes the priority to be determined according to the length of the speech time.

The conference system according to any one of claims 1 to 16, further comprising:
flow line analysis means for analyzing a flow line of each person using image data captured by the imaging device; and
position specifying means for specifying the position of each person based on the order in which the persons are recognized by the recognition means and the order in which their flow lines are analyzed by the flow line analysis means,
wherein the control means controls the imaging device based on the position of each person specified by the position specifying means.

The conference system according to any one of claims 1 to 16, further comprising position specifying means for specifying the position of each person using image data captured by the imaging device,
wherein the control means controls the imaging device based on the position of each person specified by the position specifying means.

A control method for a conference system having a plurality of imaging devices, the method comprising:
a recognition step of performing personal recognition of a person;
a person information acquisition step of acquiring person information of the person recognized in the recognition step from a database that holds, for each person, person information related to that person;
a priority determination step of determining a priority for each person based on the person information of each person recognized in the recognition step; and
a control step of controlling the imaging devices based on the priority determined for each person.

A control method for a conference system having a plurality of imaging devices, the method comprising:
an audio data acquisition step of acquiring audio data;
a recognition step of recognizing a person based on the audio data;
a keyword extraction step of extracting predetermined keyword information from the audio data;
a priority determination step of determining the priority of the person recognized in the recognition step based on the keyword information; and
a control step of controlling at least one imaging device that captures an image, based on the priority determined for each person recognized in the recognition step.

A program for causing a computer to execute the conference system control method according to claim 19 or 20.

A computer-readable recording medium on which the program according to claim 21 is recorded.