JP2002537734A

JP2002537734A - Apparatus, system and method for video communication

Info

Publication number: JP2002537734A
Application number: JP2000600429A
Authority: JP
Inventors: トビアスドルフネル
Original assignee: フォクサーアーゲー
Priority date: 1999-02-16
Filing date: 2000-02-16
Publication date: 2002-11-05
Also published as: EP1192807A1; AU3654700A; WO2000049806A1; IL144878A0

Abstract

(57) [Abstract] The present invention relates to a video communication device comprising a user image data input device for inputting real-time user image data, an image data editing device for generating edited user image data from the real-time user image data, and an image data output device for outputting user image data to at least one further user. An identification device for identifying at least one communication user and an edit selection controller connected to the identification device are provided. Based on the identification result of the identification device, the edit selection controller causes the image data output device to output either the unedited real-time user image data or, with the image data editing device additionally switched in, the edited user image data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a video communication device according to the preamble of claim 1, and in particular to a video communication system that includes such video communication devices and to a video communication method in which such video communication devices are used.

[0002]

[Prior art]

Video communication devices, video communication systems and video communication methods that provide channels for transmitting sound and images, that is, in the visual medium as well as the auditory one, are known, but they have not yet found wide acceptance among the general public. An essential disadvantage of the prior art is evidently that transmitting image information to at least one further communication participant often entails an unwanted intrusion into the user's privacy. Depending on who is initiating the communication contact, the user and/or his communication partner may or may not wish to transmit certain visual information. Most preferably, a communication user wishes to transmit an optimal "desired appearance image" tailored to the respective communication partner. Such an image includes not merely a suitable background but also suitable clothing and a further advantageous appearance.

[0003] So far, image editing algorithms have been used in video conferencing systems, for example as disclosed in WO 96/09722. DE-4102895 C1 discloses a method and a device for correcting the gaze angle in connection with a workplace video system. There, the user's eye region is edited to give the impression that the person is looking directly into the camera rather than at the screen. This editing function operates without the sender or the recipient of the transmission being notified.

[0004] Furthermore, methods are known in the television field for modifying the information in real time, or almost in real time, before it is passed to the transmitter unit. This has the advantage that the edited information corresponds to, or approximates, the appearance desired by the producer. These methods include, for example, "blue box", "masking" and "automatic masking".

[0005] As is well known, the aforementioned audiovisual communication services and systems transmit more information than a telephone does. This information includes, among other things, the appearance of the sending participant, his body language, facial expressions and gestures, as well as the surroundings captured by the video camera. However, the transmission of this visual information gives rise to a psychological inhibition threshold when such audiovisual communication services and systems are used. This inhibition threshold is considered to largely explain why, for example, video telephony has not yet been accepted by the market.

[0006]

[Object of the invention]

Accordingly, it is an object of the present invention to provide a video communication device, a video communication system and a video communication method that permit unproblematic use in both private and business contexts, without undesired effects with respect to the communication partner receiving the transmission.

[0007]

Summary of the Invention

According to the invention, this object is achieved by a video communication device according to claim 1, by a video communication system having such video communication devices according to claim U, and likewise by a video communication method using such video communication devices according to claim X. By means of the invention, the psychological inhibition threshold associated with using video communication devices and services is reduced or avoided, so that the corresponding video communication technology can advantageously achieve widespread use.

[0008] According to the invention, a video communication device is therefore provided that has user image data input means for inputting current user image data, image data editing means for generating edited user image data from the current user image data, and image data output means for outputting user image data to at least one further communication participant. According to the invention, identification means for identifying at least one participant and an edit selection controller connected to the identification means are further provided. Based on the identification result of the identification means, the edit selection controller triggers the output of unedited current user image data or of edited user image data by the image data output means, where applicable with the image data editing means connected upstream.
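The routing performed by the edit selection controller can be sketched roughly as follows. This is only an illustrative sketch, not the claimed implementation: the class name `EditSelectionController`, the policy table `edit_for`, the placeholder editor `tag_as_edited` and the choice to edit frames for unidentified participants are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Frame = bytes  # stand-in for one frame of current user image data


@dataclass
class EditSelectionController:
    """Routes a frame straight to the output or through the editor first."""
    # Hypothetical policy table: identified participant -> edit or not.
    edit_for: Dict[Optional[str], bool]
    editor: Callable[[Frame], Frame]
    edit_unknown: bool = True  # assumption: unidentified participants get edited data

    def select(self, participant_id: Optional[str], frame: Frame) -> Frame:
        # The identification result decides whether the editor is switched in.
        must_edit = self.edit_for.get(participant_id, self.edit_unknown)
        return self.editor(frame) if must_edit else frame


def tag_as_edited(frame: Frame) -> Frame:
    # Placeholder "editing": mark the frame instead of real image processing.
    return b"edited:" + frame


controller = EditSelectionController(
    edit_for={"friend": False, "boss": True}, editor=tag_as_edited
)
print(controller.select("friend", b"raw"))  # unedited path
print(controller.select("boss", b"raw"))    # editor connected upstream
print(controller.select(None, b"raw"))      # unidentified participant
```

Keeping the policy in a lookup table means no manual choice is needed per call, which matches the stated aim of automatic selection.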

[0009] If the video communication device is intended for only one person as its user, identifying the caller is sufficient. If, on the other hand, the same edited user image data is assigned to each of several users of the video communication device, identifying the respective user is sufficient.

By means of the invention, it is therefore automatically ensured that the intended visual impression is created for the communication partner, through the transmission either of unedited image data or of image data specifically edited by the user in a presettable manner. The video communication device can thus be used without any practical disadvantages. In particular, current information such as the user's present appearance and mood can be used without transmitting information about the present surroundings, and without sending it to communication partners for whom this is not desired. The user therefore need not fear that the communication partner will receive information about, for example, his home, his physical appearance or the clothes he is wearing during the call. The video communication device according to the invention ensures that, if desired, the corresponding edited or unedited image data is transmitted automatically depending on who the particular communication partner is, for example a superior, a friend or a relative, without time-consuming manual settings or selections having to be made.

[0010] The video communication device thus allows users of the corresponding audiovisual communication services and systems to control the exchange of information about themselves in a simple and reliable manner.

The invention gives the user, as a sending or receiving participant in audiovisual communication, the opportunity to influence the transmitted content so that it approaches or matches the way he wishes to present himself publicly. This overcomes the disadvantages of present audiovisual communication. Users of audiovisual messages can communicate independently and without hindrance, for example regardless of how they look after getting up in the morning, when ill, with a blemish on the face, or when their appearance is not to their liking for any other reason.

[0011] By using the invention in connection with, for example, video telephony, video conferencing, workplace conferencing, Internet conferencing and the like, each participant in audiovisual communication ensures that his appearance and surroundings as recorded by the camera correspond, at the receiving participant's location, to his own sense of self and dignity. The use of the invention thus protects the personal sphere and privacy.

[0012] Preferably, in the video communication device according to the invention, the identification means is configured to identify the user and at least one further communication participant with whom communication is being or has been established. In addition, such a video communication device can be used individually by several users to communicate with their respective partners, each user appearing to each of his communication partners in an individually desired presentation.

[0013] Furthermore, within the scope of the invention, it is preferred that the identification means is assigned participant selection data storage means for storing participant selection data, and participant identification input means for inputting a communication participant identification that identifies the user and/or at least one further participant who is communicating or has been contacted. It is also preferred that the identification means is configured to obtain the identification result for the user and/or the at least one further participant by comparing the stored participant selection data with the current communication participant identification.
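The comparison of stored participant selection data with a current participant identification might look like the following sketch. The use of telephone numbers as identifiers, the table `PARTICIPANT_SELECTION_DATA` and the helper `normalize` are assumptions made for illustration; the text itself leaves the form of the identification open (manual, electrical, optical or acoustic input).

```python
from typing import Optional

# Hypothetical participant selection data: identifier -> participant label.
PARTICIPANT_SELECTION_DATA = {
    "+49301234567": "boss",
    "+49307654321": "friend",
}


def normalize(identification: str) -> str:
    """Drop spaces and separators so that equal numbers compare equal."""
    return "".join(ch for ch in identification if ch.isdigit() or ch == "+")


def identify(current_identification: str) -> Optional[str]:
    """Compare the current identification with the stored selection data.

    Returns the stored label, or None if the participant is not known.
    """
    return PARTICIPANT_SELECTION_DATA.get(normalize(current_identification))


print(identify("+49 30 1234-567"))
print(identify("+49 30 0000"))
```

Returning `None` for an unknown participant leaves it to the edit selection controller to decide what an unidentified caller should receive.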

[0014] The foregoing embodiment can be further developed in that the participant selection data storage means includes user selection data storage means for storing user selection data of at least one possible user, and/or communication participant selection data storage means for storing communication participant selection data of at least one possible communication participant. As a variant of, or an addition to, this embodiment, the participant identification input means for inputting the current communication participant identification, in particular for the user selection data and/or the communication partner selection data, preferably includes manual selection means, electrical signal input means, optical signal input means and/or acoustic signal input means, where applicable.

[0015]
- The manual selection means includes a keyboard, menu control keys, a menu control lever or menu control pointing device, and/or touch-sensitive input means for input of the communication participant identification by the user; and/or
- the electrical signal input means is configured to receive electrical signals for input of the communication participant identification by the user or a communication participant; and/or
- the optical signal input means is configured to receive optical signals for input of the communication participant identification by the user or a communication participant; and/or
- the acoustic signal input means is configured to receive acoustic signals for input of the communication participant identification by the user or a communication participant.

[0016] In another preferred refinement of the invention, the edit selection controller is configured, based on the identification result of the identification means, to suppress or to trigger editing of the current user image data by the image data editing means in accordance with a predefined or preset mode, or with one of several predefined or preset modes.

[0017] Furthermore, in the video communication device according to the invention, it is advantageously provided that given image data storage means for storing given image data is present, and that the image data editing means is configured to edit the current user image data by means of and/or on the basis of the given image data in order to create the edited user image data.

[0018] This embodiment can be further developed in that, in such a video communication device, the given image data storage means includes given background image data storage means and/or given human image data storage means, and the image data editing means is configured to separate the current user image data into at least background image data and human image data. The editing means then either replaces the background image data and/or the human image data, wholly or partly, with the corresponding given background image data or given human image data, or creates edited background image data and/or human image data on the basis of the corresponding given background image data or given human image data from the corresponding storage means.

In addition, the given image data storage means or the given human image data storage means can further include torso image data storage means and head image data storage means. The image data editing means can then be configured to separate the human image data into torso image data and head image data, and either to replace the torso image data and/or the head image data, wholly or partly, with the corresponding given torso image data or given head image data, or to create edited torso image data and/or head image data on the basis of the corresponding given torso image data or given head image data from the corresponding storage means.
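The replacement of background image data by stored given background data can be illustrated with a small sketch. It assumes that the separation into background and human image data has already produced a per-pixel mask; how that separation is done is not specified here, and the function name `replace_background` and the toy grayscale images are hypothetical.

```python
from typing import List

Image = List[List[int]]   # grayscale pixels, row-major
Mask = List[List[bool]]   # True where a pixel belongs to the human image data


def replace_background(current: Image, human_mask: Mask,
                       given_background: Image) -> Image:
    """Keep human pixels from the current frame; take all others from the
    stored given background image."""
    return [
        [cur if is_human else bg
         for cur, is_human, bg in zip(cur_row, mask_row, bg_row)]
        for cur_row, mask_row, bg_row in zip(current, human_mask, given_background)
    ]


current = [[10, 200, 12], [11, 210, 13]]
mask = [[False, True, False], [False, True, False]]
given = [[0, 0, 0], [0, 0, 0]]
print(replace_background(current, mask, given))  # [[0, 200, 0], [0, 210, 0]]
```

The same pattern would apply one level down, with a second mask splitting the human image data into torso and head regions.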

[0019] Preferably, in the foregoing embodiment, a plurality of given background image data and/or given human image data, and where applicable given torso image data and/or given head image data, or a plurality of corresponding sets, subsets or elements thereof, can be or are stored in the given image data storage means, the given background image data storage means and/or the given human image data storage means, and where applicable in the torso image data storage means and/or the head image data storage means. These can be assigned to different editing modes of the image data editing means.

[0020] According to another preferred further development of the invention, the image data editing means is configured to edit the separated parts of the current image data separately from one another and at least essentially simultaneously, and to recombine them afterwards to generate the edited user image data.

In another preferred further development of the invention, a cosmetic and/or technical optimization of the user image data can be carried out in the image data editing means, wholly or partly, on the basis of editing algorithms. This optimization can depend on and/or be independent of the identification result, and where applicable can take place before and/or after the replacement of the background image data and/or human image data, and where applicable of the torso image data and/or head image data, with the corresponding given image data from the respective given image data storage means.

[0021] Furthermore, it is advantageous in the video communication device according to the invention if the user image data input means generates a plurality of current user images in chronological succession, each of which is edited by the image data editing means individually and/or according to preset rules.

Preferably, the image data editing means in the video communication device according to the invention is configured to carry out, continuously and dynamically for each single user image, the separation into background image data and/or human image data, and where applicable into torso image data and head image data, and where applicable the replacement with the corresponding given image data from the respective given image data storage means.

[0022] Furthermore, it is preferred that the user image data input means includes at least one camera and that the image data output means includes at least one interface to a telephone network.

In the video communication device provided by the invention, it can further be provided that user sound data input means for inputting current user sound data, sound data editing means for creating edited user sound data from the current user sound data, and sound data output means for outputting user sound data to at least one further communication partner are present.

[0023] This embodiment can be further developed in that the edit selection controller is configured to cause the sound data output means to output unedited current or edited user sound data based on the identification result of the identification means. As a variant of, or an addition to, this, given sound data storage means for storing given sound data can be provided, and the sound data editing means can be configured to edit the current user sound data by means of and/or on the basis of the given sound data in order to create the edited user sound data.

As an additional or alternative refinement of the variants described here, an aesthetic and/or technical optimization of the user sound data can be carried out by the sound data editing means on the basis of editing algorithms. This optimization can depend on and/or be independent of the identification result, and where applicable can take place before and/or after the current user sound data is edited by means of the given sound data from the given sound data storage means.

[0024] For sound data too, it can preferably be provided that editing modes of the sound data editing means, as predefined by the user, are assigned individually or in groups to identification results of the identification means.

Furthermore, it is preferred that the user sound data input means includes at least one microphone, that the sound data output means includes at least one interface to a telecommunications network, and/or that the identification means includes at least one interface to a telecommunications network.

[0025] Preferably, the video communication device according to the invention permits presentation-optimized transmission of video and/or audio data, which is advantageous for carrying out the video communication method described later. To this end, a corresponding video communication device includes:
- video input and output means;
- audio input and output means;
- transmitting and receiving means;
- an interface to at least one transmission channel;
- an input device for entering control and command signals; and
- storage means for storing user and system programs as well as given image data and given sound data.

These means, devices and components are functionally connected to a processor that is configured to interact with them in order to carry out the method steps.

[0026] A further development of the foregoing embodiment of the video communication device according to the invention is that it additionally has an interface, connected to the processor unit, for connection to a higher-level management unit and/or a higher-level storage medium, for example a "personal computer".

The invention further provides a video communication system that likewise achieves the above object. According to the invention, such a video communication system includes at least two video communication devices according to claims 1 to 22, which are connected or connectable via a telephone network.

[0027] The object of the invention is also achieved by a video communication method. In this method, a communication participant is identified by the identification means, and current user image data is input into the user image data input means. Based on the identification result of the identification means, the edit selection controller either routes the current user image data to the image data editing means or does not. Upon receiving current user image data, the image data editing means generates edited user image data from it. Finally, the unedited current user image data or, if present, the edited user image data is output by the image data output means.

[0028] As regards the achievement of the object and the advantages, reference is made in full, to avoid repetition, to the statements made in connection with the video communication device according to the invention.

Preferred further developments of the video communication method according to the invention result from analogous and appropriate implementation of the above-described embodiments of the video communication device according to the invention and of combinations of those embodiments.

[0029] Furthermore, the video communication method according to the invention can be further developed in that image data, or in particular image and sound data in the case of video telephony, is transmitted in a presentation-optimized manner. Here, the image data derived from a video source, and where applicable the sound data derived from an audio source, is modified before transmission to the communication partner, on the basis of given image data or, where applicable, given sound data, and in accordance with at least one predetermined or predeterminable criterion, by the following steps.

[0030] Before communication begins:
a) Given image data is created and stored.
b) Parameters relating to image data are defined, stored and assigned to the given image data stored in step a).

During communication:
c) User image data relating to one or several selected or selectable image data parameters is extracted from the user image data derived from the user image data input means, in particular a video source, by means of the parameters defined and stored in step b).

[0031]
d) The user image data of step c) that relates to the selected image data parameters is edited on the basis of the assigned given image data.
e) The user image data edited in step d) is transmitted to one or several communication partners.

In a variant of this further development of the method, steps d) and e) are carried out at a central location remote from the user, the user image data, the assigned given image data and the image data parameters being transmitted from the user's location to the central location. Alternatively or additionally, steps a) to e) are carried out at the user's location.
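Steps a) to e) can be sketched end to end as follows. This is a toy model under stated assumptions: a frame is represented as a dictionary of named regions, the "parameters" of step b) are simply the region names to be edited, and the names `Preset`, `extract`, `edit` and `transmit` are hypothetical. Real image data and a real transmission channel are replaced by stand-ins.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Frame = Dict[str, str]  # toy frame: region name -> content


@dataclass
class Preset:
    given_image_data: Dict[str, str]                     # step a): stored given image data
    parameters: List[str] = field(default_factory=list)  # step b): regions to edit


def extract(frame: Frame, parameters: List[str]) -> Frame:
    """Step c): pick out the parts of the frame named by the parameters."""
    return {name: frame[name] for name in parameters if name in frame}


def edit(frame: Frame, preset: Preset) -> Frame:
    """Step d): replace the extracted parts using the assigned given image data."""
    edited = dict(frame)  # leave the camera frame itself untouched
    for name in extract(frame, preset.parameters):
        edited[name] = preset.given_image_data.get(name, frame[name])
    return edited


def transmit(frame: Frame, partners: List[str]) -> Dict[str, Frame]:
    """Step e): stand-in for sending; returns what each partner would receive."""
    return {partner: frame for partner in partners}


# Before the call: steps a) and b).
preset = Preset(given_image_data={"background": "office"}, parameters=["background"])
# During the call: steps c) to e).
camera_frame = {"background": "kitchen", "person": "user"}
sent = transmit(edit(camera_frame, preset), ["partner-1"])
print(sent)
```

Because `edit` and `transmit` only need the frame, the preset data and the parameters, they could equally run at a remote central location, as the variant of the method describes.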

[0032] Furthermore, the variant described here can be developed further by providing an additional step f), in which a previously stored voice sample of an authorized user is compared with a code phrase spoken by the current user and, if the comparison is positive, edited communication is released for this user. This preferably ensures in a simple manner that only authorized users can carry out the user-specific optimization and start a transmission with edited images and/or to a modified degree. In this respect, it is further provided that, in step f), the stored voice sample is additionally assigned to the given image data belonging to the user, or selected as belonging to him, and that the user is identified on the basis of the stored voice sample and the associated given image data. Alternatively or additionally, a voice analysis of the spoken code phrase and an image analysis of the user image data derived from the user image data input means, in particular a video source, can be carried out in step f). In the last-mentioned embodiment, facial morphology features obtained from the image analysis of the user are compared with the associated or given image data.
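The gating in step f) can be sketched as a comparison followed by a release decision. The text does not say how voice samples are compared; the sketch below assumes that both the stored sample and the spoken code phrase have already been reduced to numeric feature vectors and uses cosine similarity with an arbitrary threshold. The function names and the threshold of 0.95 are illustrative assumptions only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity of two feature vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def release_edited_communication(stored_sample: Sequence[float],
                                 spoken_phrase: Sequence[float],
                                 threshold: float = 0.95) -> bool:
    """Step f): release edited transmission only on a positive comparison."""
    return cosine_similarity(stored_sample, spoken_phrase) >= threshold


print(release_edited_communication([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))
print(release_edited_communication([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

A negative result would leave the device in whatever state applies to unauthorized users, for example no access to the stored given image data.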

【００３３】直前に説明されたビデオ通信方法の発明の基本的な変形の他の更なる発展とし
ては、ステップｃ）の実行の前に、ユーザが、保存された所与の画像データの音
響視覚通信のための使用が認可される者として識別される。更に、本発明によるビデオ通信方法においては、処理される又は編集され解析
されるユーザ画像データと、所与の画像データのみならず編集済みユーザ画像デ
ータとがモーション画像、２次元及び３次元画像情報を含むことが好ましい。In another further development of the basic variant of the video communication method just described, the user is identified, before step c) is carried out, as a person authorized to use the stored given image data for audiovisual communication. Furthermore, in the video communication method according to the invention, the user image data to be processed or edited and analyzed, the edited user image data and the given image data preferably comprise moving images and two-dimensional and three-dimensional image information.

【００３４】本発明によるビデオ通信の他の好ましい実施例では、伝送されるユーザ画像デ
ータが伝送前にユーザに示されることを含む。好ましくは、これは、ディスプレ
イが、選択可能な及び／又は選択された所与の画像データのみならず選択可能な
又は選択された画像データパラメータも表示するインタラクティブなユーザ面（
surface）を含むことにおいて更に発展され得る。Another preferred embodiment of the video communication according to the invention provides that the user image data to be transmitted is shown to the user before transmission. Preferably, this can be further developed in that the display includes an interactive user interface (surface) which displays not only the selectable and/or selected given image data but also the selectable or selected image data parameters.

【００３５】本発明によるビデオ通信装置、ビデオ通信システム及びビデオ通信方法は、画
像電話、ビデオ会議又はコンピュータネットワークを介したケースにおいて、ビ
デオ及び／又は音響データに画像特化した最適化伝送に好ましくは適合され得る
。本発明の更なる好ましく且つ有利な実施例は、従属請求項及びこれらの組み合
わせから各々得られるのみならず、各々の形態と装置、システム及び方法の請求
項の組み合わせとの類似且つ適合する変換から得られる。The video communication device, the video communication system and the video communication method according to the invention can preferably be adapted for image-specific optimized transmission of video and/or audio data in video telephony, video conferencing or communication via computer networks. Further preferred and advantageous embodiments of the invention result not only from the dependent claims and their combinations, but also from an analogous and appropriate transfer of the respective features between the device, system and method claims.

【００３６】引き続いて、本発明が図面において示される実施例を基礎としてより詳細に例
示として説明される。図面の個々の図形及び実例における同一の参照符号は、同一又は類似の又は等
しい又は同様の効果を与える要素を参照している。図面における実例を基礎とし
て、参照符号を有しないときも、引き続いて説明されているかいないかの事実に
かかわりなくかかる形態が明らかとなる。他方、本記述の形態は、図面に見られ
ず即ち図示されていないとしても当業者により明らかである。The invention will now be described in more detail by way of example on the basis of an embodiment shown in the drawings. The same reference numbers in individual figures and examples in the drawings refer to elements having the same or similar or equal or similar effect. On the basis of the illustrations in the drawings, even without the reference signs, such forms will become apparent irrespective of the fact that they are subsequently described or not. On the other hand, embodiments of the present description will be apparent to those skilled in the art even if not visible in the drawings, ie, not shown.

[Example]

【００３７】本発明は、音響視覚通信媒体の存在を基礎としている。音響視覚通信媒体の通
常の形態は、マイクロホン及びラウドスピーカと、ビデオカメラ及びモニタと、
制御ユニット、音響及びビデオ信号を処理する送信処理ユニットと、音響及びビ
デオ信号を処理する受信処理ユニットと、与えられたライン帯域幅、例えばアナ
ログ又はデジタル電話ネットワーク、インターネットを介したパッケージ制御の
通信、内部コンピュータネットワーク，その他の最適な使用法のための圧縮ユニ
ットとである。The present invention is based on the existence of an audiovisual communication medium. The usual components of an audiovisual communication medium are a microphone and loudspeaker, a video camera and monitor, a control unit, a transmission processing unit for processing audio and video signals, a reception processing unit for processing audio and video signals, and a compression unit for optimal use of the given line bandwidth, e.g. of an analog or digital telephone network, packet-switched communication via the Internet, an internal computer network, etc.

【００３８】本発明は、他の中でも音響視覚通信媒体を、引き続く説明の「最適化画像処理
（ＯＩＰ）」と呼ばれる機能により拡張する。本発明によるＯＩＰ機能は、例え
ば、ビデオ画像電話、ビデオ会議システム又はインターネットプロトコルに基づ
くシステムにおいて用いられ得る。説明を単純にする目的で、本記述は主にユー
ザ制御されたＯＩＰを参照して音声最適化についてはより詳細には説明しない。The present invention extends an audiovisual communication medium by, among other things, a function referred to in the following description as “optimized image processing (OIP)”. The OIP function according to the invention can be used, for example, in video telephony, video conferencing systems or systems based on Internet protocols. For the sake of simplicity, this description refers mainly to user-controlled OIP and does not describe voice optimization in more detail.

【００３９】本発明によるＯＩＰは、１つ又は幾つかのメモリ媒体における画像情報、即ち
所与の画像データとして、参加者に彼の所望の外観の最適化された画像の１乃至
ｎ個を保存する機会を与える。ここでは、異なる物理的種類の記憶媒体も用いら
れ得る。The OIP according to the invention gives a participant the opportunity to store 1 to n optimized images of his desired appearance as image information, i.e. given image data, in one or several memory media. Storage media of different physical types can also be used here.

【００４０】ここで、図１を参照すると、図１は、本発明による方法を実行する、即ち、最
適化ユーザ制御により音響視覚情報を送信／受信するのに適合した装置のブロッ
ク図を示している。図１によれば、中央プロセッサユニット１０には、例えばビデオカメラである
画像入力ユニット１１の形式の画像データ入力手段と、例えばＰＣモニタ又は液
晶ディスプレイである表示装置１２と、画像データ及び音データ出力手段を含む
音響出力／入力手段１３と、送信ユニット１４、受信ユニット１５及び電話ネッ
トワーク、無線ネットワーク、移動ネットワーク即ちデータネットワーク１６、
即ち一般的には電気通信ネットワークとのインタフェースを含む音響視覚通信ユ
ニットと、例えばパーソナルコンピュータの如き上位の記憶媒体又はコンピュー
タとのインタフェース１７と、例えば電話キーボード又は分離されたキーボード
の如き参加者の識別手段としての入力装置１８と、少なくともユーザ独自のプロ
グラム２１及び基準画像又は所与の画像データ２２、所与の場合にあっては音響
基準データ又は所与の音データが保存され、所与の画像データ及び所与の音デー
タの記憶手段として表されるメモリ２０と、が機能的に接続される。Referring now to FIG. 1, FIG. 1 shows a block diagram of a device suitable for carrying out the method according to the invention, i.e. for transmitting/receiving audiovisual information with optimized user control. According to FIG. 1, the following are functionally connected to a central processor unit 10: image data input means in the form of an image input unit 11, for example a video camera; a display device 12, for example a PC monitor or a liquid crystal display; audio output/input means 13, comprising image data and sound data output means; an audiovisual communication unit comprising a transmitting unit 14, a receiving unit 15 and an interface to a telephone network, radio network, mobile network or data network 16, i.e. generally a telecommunications network; an interface 17 to a higher-level storage medium or computer, for example a personal computer; an input device 18, for example a telephone keypad or a separate keyboard, as means for identifying a participant; and a memory 20, which serves as storage means for given image data and given sound data and in which at least user-specific programs 21 and reference images or given image data 22 and, where applicable, audio reference data or given sound data are stored.

【００４１】図２は、メモリ２０が幾つかの物理的にも異なるメモリ媒体を含むこともでき
ることを示している。最適化即ち編集された基準情報、即ち，基準画像情報又は
編集済みユーザ画像データ及び基準音響情報又は編集済み音データが、ビデオカ
メラ，レコーダ、その他となし得る画像入力ユニット１１及び音響入力ユニット
１３を介して、すなわちインタフェース１７を介した上位のユニットからメモリ
２０へ一方向で伝送される。これにより、アプリケーション及び画像素材が、例
えば、ＲＯＭメモリ、ＲＡＭメモリ又は例えばハードディスク、フラッシュカー
ド又は同様のメディア（図２）で提供されるかは無関係である。既述のように、
メモリ２０は、又、ユーザ制御された彼の音声の最適化のために用いられ得る基
準音響情報又は所与の音データを含む。ユーザは、例えば、入力装置１８の手段
により最適化画像処理又は画像データ編集を制御し、通信パートナーに彼により
送られるべき画像及び／又は音声を彼の選択に従って最適化する。入力装置１８
は、例えば、電話キーボード、分離して接続されるキーボード、コンピュータマ
ウス、ライトペン、グラフィックトレイ、その他となし得る。FIG. 2 illustrates that the memory 20 can also comprise several physically different memory media. The optimized or edited reference information, i.e. the reference image information or edited user image data and the reference audio information or edited sound data, is transferred in one direction to the memory 20 via the image input unit 11 and the audio input unit 13, which can be a video camera, a recorder, etc., or from a higher-level unit via the interface 17. It is irrelevant whether the application and the image material are provided, for example, in ROM, in RAM or on, for example, a hard disk, a flash card or similar media (FIG. 2). As already mentioned, the memory 20 also contains reference audio information or given sound data that can be used for user-controlled optimization of the user's voice. The user controls the optimized image processing or image data editing, for example by means of the input device 18, and optimizes the image and/or voice that he sends to the communication partner according to his choice. The input device 18 can be, for example, a telephone keypad, a separately connected keyboard, a computer mouse, a light pen, a graphics tablet, etc.

【００４２】図２において、メモリ２０の詳細のみならず、ここに保存されているユーザ独
自情報コンテンツ２１及び基準画像又は所与の画像データ２２が示されている。
示されるように、本発明に用いられ得るメモリ２０は、ＲＯＭメモリ、ＲＡＭメ
モリ、ハードディスク、リムーバブルディスク、フロッピー（登録商標）ディスク、フラッシュカード及び他の適合するメモリ媒体を有する又は含むことができる。メモリ２０は、又、かかる記憶媒体の組み合わせを含むことができる。コンテンツブロック２１において、ここでは、ユーザ独自プログラム及び情報又はデータが保存され、ユーザ認識又は識別のためのブロック２１０と、異なるアルゴリズム１、２．．．Ｎの手段による技術的画像最適化即ち編集のためのブロック２１１と、重ねて異なる画像編集アルゴリズム１、２．．．Ｎの手段による審美的外観最適化即ち画像最適化のためのブロック２１２と、背景処理のためのブロック２１３と、頭部処理のためのブロック２１４と、胴体処理のためのブロック２１５とが配置される。ここで、ユーザ独自ブロック２１及び上記に含まれる個々のブロック２１１乃至２１５は与えられた列挙に限定されないということと、音声認識のためのメモリコンテンツ及び更なるコンテンツが、繰り返しを避けるために述べてはいないが、画像と音声最適化との間の同様且つ類似の状況において存在しても良いということとが述べられなければならない。FIG. 2 shows details of the memory 20 as well as the user-specific information content 21 and the reference images or given image data 22 stored in it. As shown, the memory 20 that can be used in the invention can comprise ROM, RAM, a hard disk, a removable disk, a floppy disk, a flash card and other suitable memory media. The memory 20 can also comprise a combination of such storage media. The content block 21 stores user-specific programs and information or data and contains a block 210 for user recognition or identification, a block 211 for technical image optimization or editing by means of different algorithms 1, 2, ..., N, a block 212 for aesthetic appearance optimization or image optimization, again by means of different image editing algorithms 1, 2, ..., N, a block 213 for background processing, a block 214 for head processing and a block 215 for torso processing. It should be noted that the user-specific block 21 and the individual blocks 211 to 215 contained in it are not limited to the enumeration given, and that memory content for voice recognition and further content, although not described here in order to avoid repetition, may be present in the same or an analogous manner for voice optimization as for image optimization.

【００４３】図２に示され、所与の画像データの画像素材を参照するブロック２２は、所与
の背景画像データあるいは基準背景画像データ１、．．．Ｎのブロック２２１と
、ユーザの頭部（所与の頭部画像データ）の基準画像１、２．．．Ｎのブロック
２２２と、ユーザの胴体（所与の胴体画像データ）の基準画像１、２．．．Ｎの
ブロック２２３と、を含む。この用語法は、用語「胴体」が頭部を除外した体の
部分を含むことを基礎としている。The block 22 shown in FIG. 2, which relates to the image material of the given image data, comprises a block 221 of given background image data or reference background image data 1, ..., N, a block 222 of reference images 1, 2, ..., N of the user's head (given head image data) and a block 223 of reference images 1, 2, ..., N of the user's torso (given torso image data). In this terminology, the term “torso” covers the parts of the body other than the head.

【００４４】本発明により、図に示される音響視覚通信装置の形式のビデオ通信装置の場合
、これは多くの数の参加者に対して認可されるものであって、ブロック２２１乃
至２２３は対応して何数倍かのものが存在し、これらは個々の物理メモリ装置内
に、おそらくソフトウェアに従って、即ち各ユーザに対して別々の記憶手段の形
式で可能なそれらの適合した分割の手段により実現され得る。According to the invention, where a video communication device in the form of the audiovisual communication device shown in the figures is authorized for a larger number of participants, the blocks 221 to 223 are present in correspondingly multiple form. They can be implemented in individual physical memory devices, possibly by means of suitable partitioning in software, i.e. in the form of separate storage means for each user.
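The memory organization described for blocks 22 and 221 to 223, with one partition per authorized user, can be sketched as follows. This is only an illustrative model: the class names `GivenImageData` and `Memory20`, the methods `store` and `fetch`, and the file names are assumptions made for the example and do not appear in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class GivenImageData:
    """Block 22: stored reference material, one list per layer type."""
    backgrounds: list = field(default_factory=list)  # block 221
    heads: list = field(default_factory=list)        # block 222
    torsos: list = field(default_factory=list)       # block 223


@dataclass
class Memory20:
    """Memory 20, partitioned per user (hypothetical layout)."""
    partitions: dict = field(default_factory=dict)   # user_id -> GivenImageData

    def store(self, user_id, kind, image):
        """Append a reference image and return its 1-based number."""
        block = self.partitions.setdefault(user_id, GivenImageData())
        images = getattr(block, kind)
        images.append(image)
        return len(images)

    def fetch(self, user_id, kind, index):
        """Return reference image number `index` (1-based) of one user."""
        return getattr(self.partitions[user_id], kind)[index - 1]


memory = Memory20()
memory.store("alice", "backgrounds", "office_wall.png")
n = memory.store("alice", "backgrounds", "bookshelf.png")
memory.store("bob", "backgrounds", "garden.png")
```

Keeping a separate `GivenImageData` object per user mirrors the statement that blocks 221 to 223 exist once for each authorized participant.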

【００４５】図３において、機能ブロック図の形式で、送信者側における画像編集最適化無
し、即ち本発明によるＯＩＰ無しの機能の手法が示されている。機能ブロック４
０に示される上述のＯＩＰは非活性化され、ビデオ信号ソースのビデオ信号は直
接に符号化ブロック４１に行く。この機能の手法は、例えばユーザ及び通信パー
トナー識別の如き識別結果を基礎として識別手段の手段によりビデオ通信装置に
より自動的に選択される。これは後に参照されるが、具体的な通信パートナーに
伝送されるものとユーザが先に判定した場合には、通信パートナーは現在のユー
ザ画像データを見ることができる。FIG. 3 shows, in the form of a functional block diagram, the mode of operation without image editing or optimization on the sender side, i.e. without OIP according to the invention. The above-mentioned OIP, shown as function block 40, is deactivated and the video signal of the video signal source goes directly to the encoding block 41. This mode of operation is selected automatically by the video communication device by means of the identification means on the basis of an identification result, for example the identification of the user and of the communication partner. As discussed later, the communication partner can see the current user image data if the user has previously specified that it is to be transmitted to that specific communication partner.

【００４６】一方、図４は、機能ブロック図の形式で、現在ユーザ画像データの最適化画像
処理即ち編集が、ユーザ画像データを電気通信ネットワークを経て通信パートナ
ーに出力する前に実行されることを示している。この機能の手法は，ビデオ通信
装置により、例えばユーザ及び通信パートナー識別の如き識別結果を基礎として
識別手段の手段によりビデオ通信装置により自動的に選択される。具体的な通信
パートナーに伝送されないとユーザが先に判定した場合には、通信パートナーは
現在のユーザ画像データを見ることができないが、該ユーザは、特別の背景の前
に、着ている衣服及び／又は、髪形、髭，顔色及び洗練された顔，その他の如き
適当に準備された残りの外観によりこの具体的な通信パートナーによって見られ
ることを望む。例えば、もし、具体的な通信パートナーが、例えばビジネスパー
トナー、上司、部下、スポーツ又はレジャーの友人又は未知の人物である場合に
は、各々に適する手法において後者の手法が望まれるはずである。FIG. 4, on the other hand, shows in the form of a functional block diagram that optimized image processing, i.e. editing, of the current user image data is performed before the user image data is output to the communication partner via the telecommunications network. This mode of operation is selected automatically by the video communication device by means of the identification means on the basis of an identification result, for example the identification of the user and of the communication partner. If the user has previously specified that the current user image data is not to be transmitted to a specific communication partner, that partner cannot see it; instead, the user wishes to be seen by this communication partner in front of a particular background, in particular clothing and/or with an otherwise suitably prepared appearance such as hairstyle, beard, complexion and a well-groomed face. The latter mode, in a form suited to each case, is likely to be desired if the specific communication partner is, for example, a business partner, a superior, a subordinate, a sports or leisure friend or an unknown person.

【００４７】電気通信の開始、例えば送信の前に、例えばビデオ信号ソースである画像入力
ユニット１１により得られる例えばビデオ信号の形式である現在のユーザ画像デ
ータは、本発明により意図される編集即ち最適化のために、個々の領域、即ちレ
イヤ３１乃至３３（英語Layer）において、コンテンツ認識ブロック３０により
分離且つ復号される。より良い理解のみのために、レイヤー３１乃至３３は、こ
こでレイヤー１（背景）、レイヤー２（胴体）及びレイヤー３（頭部）と名前付
けされる。これらのレイヤーは、メモリ領域２２に保存されるレイヤー「背景」
即ち所与の背景画像２２１と、「頭部」即ち所与の頭部画像２２２と、「胴体」
即ち所与の胴体画像２２３とに対応する。これらの用語、背景、頭部、胴体は、
本発明の態様の表示を提供するにすぎない。備えられるメモリのサイズに依存し
て、更なる領域即ちレイヤーが更なる詳細化のために定義され得る。又、本方法
の表示の目的で、破線で描かれる更なる特別レイヤー３４が、例えば、テキスト
挿入のために定義され、望まれる場合に付加され得る。Before telecommunication, e.g. transmission, begins, the current user image data, for example in the form of a video signal obtained by the image input unit 11, e.g. a video signal source, is separated and decoded by the content recognition block 30 into individual regions, i.e. layers 31 to 33, for the editing or optimization intended by the invention. Purely for better understanding, the layers 31 to 33 are named here layer 1 (background), layer 2 (torso) and layer 3 (head). These layers correspond to the layers stored in the memory area 22: “background”, i.e. the given background images 221, “head”, i.e. the given head images 222, and “torso”, i.e. the given torso images 223. The terms background, head and torso merely serve to illustrate aspects of the invention. Depending on the size of the memory provided, further regions or layers can be defined for further refinement. Also for the purpose of illustrating the method, a further special layer 34, drawn in dashed lines, can be defined, for example for text insertion, and added if desired.

【００４８】既述のレイヤー３１乃至３４はビデオ信号ソースの形式による画像入力手段１
１、例えばビデオカメラから流れるビデオ信号から、コンテンツ認識ブロック３
０内の画像処理アルゴリズムの手段により分離され、論理メモリ領域において別
々に認定される。もし、音響視覚通信が発生した場合には、ユーザの調整に依存
して、全て又は個々のレイヤー又はこれらの部分は、メモリ２２からの画像情報
即ち所与の画像データにより置換される。図４の例において、背景及び基準頭部
を参照するレイヤー３１及び３２は、メモリ領域２２１及び２２２からの基準背
景画像２及び頭部画像３により置換される。使用される所与の画像データ即ち画
像情報の制御は、基準マーカにより保証される。The layers 31 to 34 mentioned above are separated from the video signal delivered by the image input means 11 in the form of a video signal source, for example a video camera, by means of image processing algorithms in the content recognition block 30 and are identified separately in logical memory areas. When audiovisual communication takes place, all or individual layers, or parts of them, are replaced, depending on the user's settings, by image information from the memory 22, i.e. by given image data. In the example of FIG. 4, the layers 31 and 32, which relate to the background and the reference head, are replaced by the reference background image 2 and the head image 3 from the memory areas 221 and 222. Control of the given image data or image information used is ensured by reference markers.
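The layer replacement described here can be illustrated with a minimal sketch. Images are represented by placeholder strings, and the function name `replace_layers` as well as the dictionary layout are assumptions for the example; the segmentation performed by the content recognition block 30 is not modeled.

```python
def replace_layers(live_layers, given_images, selection):
    """Swap selected live layers for stored reference images.

    live_layers:  dict layer name -> image segmented from the camera frame
    given_images: dict layer name -> list of stored reference images
    selection:    dict layer name -> 1-based number of the reference image
    Layers not named in `selection` keep their live content.
    """
    result = dict(live_layers)  # copy, so the live frame stays untouched
    for name, index in selection.items():
        result[name] = given_images[name][index - 1]
    return result


live = {"background": "live_bg", "torso": "live_torso", "head": "live_head"}
stored = {
    "background": ["bg_1", "bg_2"],
    "head": ["head_1", "head_2", "head_3"],
    "torso": ["torso_1"],
}
# Mirrors the FIG. 4 example: reference background image 2 and head image 3.
edited = replace_layers(live, stored, {"background": 2, "head": 3})
```

In this sketch the torso layer is not selected, so it passes through with its live content while background and head are taken from memory.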

【００４９】顔が更なる説明のために例を提供する。認識された基準マーカが、挿入される
べきレイヤー「頭部」の画像情報の制御のために使用される。もしユーザが頭を
動かした場合、例えば肯定的にうなずいた場合、最適化即ち編集された画像は同
じ動きを演じる。もしレイヤー「胴体」が活性化された場合には、現在のレイヤ
ー「胴体」は、領域２２３に所与の胴体画像データの形式にて保存された胴体基
準画像の１つにより置換される。全てのレイヤーは、送信されるべき最適化ビデ
オ画像として編集済みユーザ画像データの形に寄せ集められる。The face provides an example for further explanation. The recognized fiducial markers are used for controlling the image information of the layer “head” to be inserted. If the user moves his head, for example, nods affirmatively, the optimized or edited image will perform the same movement. If the layer "torso" is activated, the current layer "torso" is replaced by one of the torso reference images stored in the area 223 in the form of the given torso image data. All layers are put together in edited user image data as optimized video images to be transmitted.
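How recognized reference markers might steer an inserted head image can be sketched under a strong simplification: only the translation of the marker centroid is tracked, whereas a real system would also have to handle rotation and scale. The function names and the coordinate convention are assumptions for the example.

```python
def centroid(points):
    """Mean position of a list of (x, y) marker coordinates."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def follow_markers(reference_markers, live_markers, anchor):
    """Shift the anchor of the inserted head image by the marker motion.

    reference_markers: marker positions in the stored reference pose
    live_markers:      the same markers as found in the current frame
    anchor:            (x, y) where the stored head is drawn in the reference pose
    """
    rx, ry = centroid(reference_markers)
    lx, ly = centroid(live_markers)
    return anchor[0] + (lx - rx), anchor[1] + (ly - ry)


# The user's head moved one pixel right and one pixel down.
position = follow_markers([(0, 0), (2, 0)], [(1, 1), (3, 1)], (10, 10))
```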

【００５０】「レイヤー」の用語は、画像情報、即ち本発明による方法による処理又は編集
されたユーザ画像データが２次元としかなし得ないこと意味するものではないこ
とを考慮するべきである。これに替えて、３次元の画像情報即ちユーザ画像デー
タも処理され得る。新たに寄せ集められたビデオ画像は、ここで、事前選択された又は選択し得る
パラメータ（図４におけるブロック４２を参照）に従って、クロミナンス（chro
minance）、コントラスト、輝度に関して技術的に最適化される。It should be noted that the term “layer” does not mean that the image information, i.e. the user image data processed or edited by the method according to the invention, can only be two-dimensional. Instead, three-dimensional image information or user image data can also be processed. The newly assembled video image is then technically optimized with respect to chrominance, contrast and brightness in accordance with preselected or selectable parameters (see block 42 in FIG. 4).
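A technical optimization step of the kind assigned to block 42 can be sketched for contrast and brightness on an 8-bit grayscale image. The linear formula around mid-gray 128 and the parameter names are assumptions chosen for the example; the patent does not specify an algorithm, and chrominance is left out here.

```python
def adjust(image, contrast=1.0, brightness=0):
    """Apply a linear contrast/brightness change to an 8-bit grayscale image.

    image is a list of rows of pixel values in 0..255. Contrast scales the
    distance from mid-gray (128); brightness is added afterwards.
    """
    def clamp(value):
        return max(0, min(255, int(round(value))))

    return [
        [clamp((p - 128) * contrast + 128 + brightness) for p in row]
        for row in image
    ]


frame = [[0, 100], [128, 250]]
result = adjust(frame, contrast=1.5, brightness=10)
# result == [[0, 96], [138, 255]]
```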

【００５１】審美的画像最適化と呼ばれる更なる機能ユニット４３において、好ましくは顔
面領域における審美的改良が実行される。これは、例えば、目をより明るくする
こと、視線角度を修正すること、歯の領域を明るくすること、影を明るくするこ
と（例えば、あごの部分及び皮膚の広い着色変更）、例えば望ましくないいぼの
如き疾患に僅かの着色修正を加えることを含む。In a further functional unit 43, called aesthetic image optimization, aesthetic improvements are performed, preferably in the facial region. These include, for example, brightening the eyes, correcting the gaze angle, brightening the tooth region, lightening shadows (e.g. in the chin area and broad discolorations of the skin), and slight color correction of blemishes such as unwanted warts.

【００５２】かかる方法により最適化されたビデオ信号は、符号化ユニット４１へと渡され
、最後に通信パートナー即ち受信者（複数）としての通信パートナーに送られる
。図５及び図６に関して、本発明による方法を用いた通信ステップが次に説明さ
れる。The video signal optimized in this way is passed to the coding unit 41 and finally sent to the communication partner, ie the communication partner as the receiver (s). With reference to FIGS. 5 and 6, the communication steps using the method according to the invention will now be described.

【００５３】最初に、外部的にセットアップされた通信接続が説明される。即ち、ユーザが
交信状態に入る。音響視覚通信装置（図１）は、他の側から信号を接続のセット
アップのために受信する。従来技術に対応する通信システムは、おそらく、通信
パートナーとしてサービスされるいわゆる呼者（caller）識別、即ち通常の通信
参加者識別を送信する。入信呼（ステップ５１）のケースでは、呼者識別は識別
手段によりチェックされる（ステップ５３）。これは、伝送された呼者識別即ち
通信参加者識別を参加者ディレクトリ５２（参加者のアドレスを含む電話帳）に
ある参加者選択データと比較することにより実行され、この呼者識別は参加者選
択記憶手段のメモリに保存される。あるＯＩＰ構成、即ち編集モードは、参加者
ディレクトリ５２に割り当てられ得る又は割り当てられる。First, a communication connection set up from outside is described, i.e. the user is contacted. The audiovisual communication device (FIG. 1) receives a signal from the other side for setting up the connection. Communication systems corresponding to the prior art presumably transmit a so-called caller identification of the communication partner, i.e. an ordinary communication participant identification. In the case of an incoming call (step 51), the caller identification is checked by the identification means (step 53). This is done by comparing the transmitted caller identification or communication participant identification with the participant selection data in the participant directory 52 (a telephone book containing the participants' addresses), the latter being stored in the memory of the participant selection storage means. A particular OIP configuration, i.e. an editing mode, can be or is assigned to the entries of the participant directory 52.

【００５４】提示された例において、参加者ディレクトリ５２に呼者に関するエントリが無
いか、或いはＯＩＰ構成「業務（office）」が該エントリに割り当てられている
かの何れかの場合が想定される。これらのケースにおいて、ビデオカメラの形式
における画像入力ユニット１１の信号は、コンテンツ認識機能ブロック３０に進
められる。ステップ５７は、レイヤー３１乃至３４（図４参照）を計算し、これ
は、基準情報、即ちステップ５８におけるＯＩＰ構成「業務」の所与の画像デー
タと共にリアルタイムに寄せ集められる。この後、既に述べた審美的画像最適化
４３及び技術的画像最適化４２が実行される。技術的画像最適化４２の後、該最
適化された画像は、事前選択されたパラメータ値、又はディレクトリ５２内の参
加者エントリの割り当てにおける編集モードに対応してブロック４０を与える。
該信号、即ち最適化された画像及び音響信号は、次いで、符号化され、機能ブロ
ック４１において通信プロトコルに基づいて伝送される。In the example presented, it is assumed either that the participant directory 52 contains no entry for the caller or that the OIP configuration “office” is assigned to the entry. In these cases, the signal of the image input unit 11, in the form of a video camera, is forwarded to the content recognition function block 30. Step 57 computes the layers 31 to 34 (see FIG. 4), which in step 58 are assembled in real time with the reference information, i.e. the given image data, of the OIP configuration “office”. The aesthetic image optimization 43 and the technical image optimization 42 already described are then carried out. After the technical image optimization 42, the optimized image leaves block 40 in a form corresponding to the preselected parameter values or to the editing mode assigned to the participant entry in the directory 52. The signals, i.e. the optimized image and audio signals, are then encoded and transmitted in accordance with the communication protocol in function block 41.

【００５５】ここで図５を参照すると、該ケースとしては、ａ）所望の参加者が参加者ディ
レクトリ５２で既知であること（ステップ５３における比較判定における出口「
ＹＥＳ」）と、ｂ）ＯＩＰ構成「プライベート」が参加者に割り当てられ（ステ
ップ５５）、即ち、ユーザ画像データの編集は無く、従って何ら編集モードは実
行されないことが考えられる。そして、未編集のビデオ信号、即ちＯＩＰを伴わ
ない信号は符号化ユニット４１に伝送される。Referring again to FIG. 5, consider the case in which a) the desired participant is known in the participant directory 52 (exit “YES” of the comparison in step 53) and b) the OIP configuration “private” is assigned to the participant (step 55), i.e. the user image data is not edited and no editing mode is executed. The unedited video signal, i.e. the signal without OIP, is then passed to the encoding unit 41.
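The selection logic for an incoming call described for FIG. 5 can be sketched as a directory lookup. The function names, the dictionary layout of the participant directory and the sample caller numbers are assumptions for the example; only the configuration names “office” and “private” and their effect come from the description.

```python
def select_oip_config(caller_id, directory, default="office"):
    """Return the OIP configuration for an incoming call (step 53).

    A caller without a directory entry, or an entry without an assigned
    configuration, falls back to the fully edited "office" mode.
    """
    entry = directory.get(caller_id)
    if entry is None:
        return default
    return entry.get("oip", default)


def editing_enabled(config):
    """Only the "private" configuration sends the unedited video signal."""
    return config != "private"


# Hypothetical participant directory 52.
directory_52 = {
    "+49-30-1111": {"name": "Anna", "oip": "private"},
    "+49-30-2222": {"name": "Dr. Weber", "oip": "office"},
}

known_private = select_oip_config("+49-30-1111", directory_52)
unknown = select_oip_config("+49-30-9999", directory_52)
```

Defaulting unknown callers to the edited mode reflects the example in the text, where a missing entry is treated like an entry with the “office” configuration.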

【００５６】ここで図６が参照されると、これは、現在（present）のユーザから発する接
続のセットアップを示している。もし現在のユーザが通信をセットアップしたい
のならば、彼は参加者ディレクトリから１つの参加者を選択する（ステップ６２
）か、或いは、対応する参加者コードを手動で入力する（ステップ６１）ことが
可能とされる。所望の参加者を参加者ディレクトリ６２から選択するケースにお
いては、ＩＤコードが既知であるか否かのステップ６４におけるチェックが出口
「ＹＥＳ」へ導く。そして、参加者ディレクトリ６２におけるＯＩＰパラメータ
に従い、該ユーザによりなされた事前設定選択に対応して、例えばＯＩＰ構成「
余暇」が活性化される（ステップ６６）。機能ブロック４３における審美的最適
化及びブロック４２における技術的最適化だけが、この編集モードに例えばユー
ザにより割り当てられる。これにより変更されたビデオ信号は、次いで、機能ブ
ロック４１における通信プロトコルの要求仕様に従って符号化される。Reference is now made to FIG. 6, which shows a connection set up by the present user. If the present user wishes to set up a communication, he can either select a participant from the participant directory (step 62) or enter the corresponding participant code manually (step 61). When the desired participant is selected from the participant directory 62, the check in step 64 as to whether the ID code is known leads to the exit “YES”. Then, in accordance with the OIP parameters in the participant directory 62 and the presetting made by the user, the OIP configuration “leisure”, for example, is activated (step 66). Only the aesthetic optimization in function block 43 and the technical optimization in block 42 are assigned to this editing mode, for example by the user. The video signal modified in this way is then encoded in function block 41 in accordance with the requirements of the communication protocol.

【００５７】ここで例が説明される。ここでは、ユーザがステップ６１において要求される
接続コードを手動で、入力装置即ちユニット１８（図１）の手段により入力する
。問い合わせステップ６４において実行される手動入力された参加者番号又はア
ドレスは、この例では、問い合わせ６４において否定応答を与える。これに引き
続いて、編集選択制御は、ＯＩＰに対して構成「業務」及び全画像処理を活性化
する。即ち、画像情報即ちユーザ画像データ、所与の場合にあっては音響情報即
ちユーザ音データの最適化即ち編集が実行される。音響情報の最適化の実行は、
図５及び図６において単純化及び明瞭性の点から示されていないが、画像データ
編集と同様のやり方で実行され得る。編集信号は、次いで、符号化ユニット４１
に進められる。An example is now described in which the user enters the required connection code manually in step 61 by means of the input device or unit 18 (FIG. 1). In this example, the query in step 64 for the manually entered participant number or address gives a negative result. The edit selection control then activates the configuration “office” and full image processing for the OIP, i.e. the image information or user image data and, where applicable, the audio information or user sound data are optimized or edited. The optimization of the audio information is not shown in FIGS. 5 and 6 for reasons of simplicity and clarity, but can be carried out in the same way as the image data editing. The edited signal is then forwarded to the encoding unit 41.
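The outgoing-call flow of FIG. 6 maps each OIP configuration to a set of processing stages. The sketch below assumes a simple table for that mapping; the stage labels, the table name `STAGES`, the function `plan_outgoing_call` and the directory contents are illustrative, and the order of the aesthetic and technical stages is an assumption, since the description is not explicit about it.

```python
# Processing stages per OIP configuration (labels are illustrative).
STAGES = {
    "office": ("layer_replacement", "aesthetic_43", "technical_42"),  # full editing
    "leisure": ("aesthetic_43", "technical_42"),                      # no layer swap
    "private": (),                                                    # unedited
}


def plan_outgoing_call(participant_code, directory):
    """Return (configuration, stages) for a call set up by the user.

    A code selected from the directory uses its stored configuration
    (step 64 exit "YES"); a manually entered, unknown code leads to the
    fully edited "office" configuration (step 64 exit "NO").
    """
    config = directory.get(participant_code, "office")
    return config, STAGES[config]


# Hypothetical participant directory 62: code -> OIP configuration.
directory_62 = {"tennis-club": "leisure", "sister": "private"}

known = plan_outgoing_call("tennis-club", directory_62)
manual = plan_outgoing_call("0170-555-0100", directory_62)
```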

【００５８】例えば完全なＯＩＰ機能即ち完全編集と、審美的画像最適化４３及び技術的画
像最適化４２だけによる画像情報即ち現在ユーザ画像データの「発育不全（rudi
mental）編集」との間の自動選択とは離れて、ユーザは、例えば、通信の間にも
入力手段１８により何時でも、ＯＩＰ構成、即ち編集モードを活性化又は非活性
化とすることが許容される。Apart from the automatic selection between, for example, the full OIP function, i.e. full editing, and “rudimentary editing” of the image information or current user image data by the aesthetic image optimization 43 and the technical image optimization 42 alone, the user is allowed to activate or deactivate the OIP configuration, i.e. the editing mode, at any time by way of the input means 18, for example also during a communication.

【００５９】上記の本発明による方法は、又、例えば、他の人の顔、又は全く人工的にデザ
インされたキャラクタを送信するのに用いられ得る。ここでは、ユーザは「アニ
メータ」として行動する。本発明の手段により可能となるかかる「アニメーショ
ン」の実施は、商業使用において本質的に利点を有し、そこでは、例えば画像保
持者としての商業的に知られた人物の自己同一性が重要である。例えば、あるキ
ャラクタがブランドとして確立することが企業の要求であると想像される。例示
の理由で、保険会社のキャラクタ「ミスターカイサー」とか、同様にウォルト・
ディズニー社の「ミッキーマウス」とかが言及される。商業上の使用は、ここで
は、電話代理店にとってはＯＩＰ即ち本発明に従った対応する編集モードの準備
である。その点で、該編集モードを使用しようとする各ユーザは、対応する編集
モードの使用法にとって彼自身を識別しなければならない。これにより、非認定
の人物は彼自身を異なる外観で提示する、おそらく何らかの損害の原因となるこ
とはできないことが保証される。The method according to the invention described above can also be used, for example, to transmit the face of another person or an entirely artificially designed character. Here the user acts as an “animator”. Implementing such “animation”, which the invention makes possible, has substantial advantages in commercial use where, for example, the identity of a commercially known person as an image bearer is important. Suppose, for example, that a company requires a certain character to be established as a brand. By way of illustration, mention may be made of an insurance company's character “Mr. Kaiser” or, likewise, the Walt Disney Company's “Mickey Mouse”. The commercial use here consists in providing telephone agencies with OIP, i.e. the corresponding editing mode according to the invention. Each user who wishes to use the editing mode must identify himself in order to use it. This ensures that an unauthorized person cannot present himself with a different appearance and possibly cause damage.

【００６０】企業は、例えば、対応するビデオ通信手段が備えられた電話代理店に、顧客の
問い合わせに常に返答し取得することを指示し，所望のキャラクタを用いること
を要求する。このやり方で、例としての上記の保険会社は、ＯＩＰを介した確立
された企業の親近感ある姿によりこの手法で音響視覚的に通信する。子供及び若
者には、特別な催し物及び彼らの余暇時間の活動にとってのヒントが、親近感の
あるコミックキャラクタ、例えば、ミッキーマウスの如き手段により該企業から
通知され得る。所望のキャラクタ人生を与える人物に対する通信参加者識別とし
て、例えば、顔認識のためのアルゴリズムが用いられ得て上記の方法の誤使用を
防ぎ得る。該顔認識は、上記のレイヤー「頭部」の制御の役割とすることができ
る。もし、ユーザが保存された頭部画像と同じキャラクタ顔面形状を持っていな
い場合には、該ビデオ信号が伝送のための符号化ユニットに未編集のやり方で進
められ、ビデオ通信装置の使用は防がれる。A company instructs, for example, a telephone agency equipped with corresponding video communication means to always answer and handle customer inquiries using a desired character. In this way the insurance company mentioned above as an example communicates audiovisually, via OIP, through the company's established and familiar figure. Children and young people can be informed by the company about special events and tips for their leisure activities by means of a familiar comic character such as Mickey Mouse. As communication participant identification for the person who brings the desired character to life, an algorithm for face recognition, for example, can be used to prevent misuse of the above method. The face recognition can also serve to control the “head” layer described above. If the user does not have the same characteristic facial shape as the stored head image, the video signal is forwarded unedited to the encoding unit for transmission, or use of the video communication device is prevented.

【００６１】更に、認定参加者の選択のための上記の方法は、主に商業応用の分野で有効利
用される音声識別アルゴリズムを含むことができる。現在の参加者は符号フレー
ズを発声し、これは保存された音声サンプルと比較される。これは付加的な適合
処理に帰するべきであり、更に、上記の画像識別との相関処理が保存された頭部
画像を基礎としてなされ得る。従って、人の取り違い、又は他の人物の偽装を通
した本発明の誤使用が防がれる。顧客の通信参加者識別は、本発明によるビデオ
通信装置の応用においては要求されない。しかし、もし例えば企業内部の通信が
対応する装置を介して、例えば通信パートナーに関する通信参加者識別により、
発生することが望まれるならば、顧客との編集済み通信接続、或いは同僚又は上
司との未編集の通信接続が開始されることが見いだされる。更に、上記の応用に
より、所与の画像データが適合するハードウェア上に中枢的に備えられるなら、
全ての認定ユーザと常に有効なデータを使用することが保証されることが明らか
である。対応するシステムのケースにおいても、処理の実施が中枢的に備えられ
得る。Furthermore, the above method for selecting authorized participants can include a voice identification algorithm, which is useful mainly in commercial applications. The current participant speaks a code phrase, which is compared with a stored voice sample. This serves as an additional matching step, and it can further be correlated with the image identification described above on the basis of the stored head image. Misuse of the invention through mistaken identity or impersonation of another person is thus prevented. Identification of the customer as a communication participant is not required in this application of the video communication device according to the invention. However, if, for example, company-internal communication is also to take place via the corresponding device, a communication participant identification of the communication partner can be used to determine whether an edited communication connection with a customer or an unedited communication connection with a colleague or superior is initiated. It is also clear from the above application that, if the given image data is provided centrally on suitable hardware, all authorized users are guaranteed always to use valid data. In a corresponding system, the processing can also be carried out centrally.
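The combination of face recognition and the optional voice check that releases the edited transmission can be sketched as a simple gate. The similarity scores, the threshold of 0.8 and the function names are assumptions for the example; how the scores are computed from the stored head image and voice sample is outside the scope of this sketch.

```python
def is_authorized(face_score, voice_score=None, threshold=0.8):
    """Decide whether the current user may use the edited mode.

    face_score:  similarity between the live face and the stored head image
    voice_score: similarity between the spoken code phrase and the stored
                 voice sample, or None if no voice check is configured
    Both scores are assumed to lie in 0..1.
    """
    if face_score < threshold:
        return False
    if voice_score is not None and voice_score < threshold:
        return False
    return True


def select_output(live_frame, edited_frame, authorized):
    """Forward the edited frame only for an authorized user."""
    return edited_frame if authorized else live_frame


ok = is_authorized(face_score=0.93, voice_score=0.88)
rejected = is_authorized(face_score=0.93, voice_score=0.41)
frame_sent = select_output("live", "character", rejected)
```

Falling back to the live frame corresponds to the variant in which the video signal is forwarded unedited when identification fails; blocking the call altogether would be the alternative the description also mentions.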

【００６２】調整可能な呼の迂回と同様にして、電気通信ネットワーク提供者との間でその
システム又はハードウェアにおいて今日既に可能であって、参加者識別及び／又
は画像データ編集が、電気通信ネットワーク提供者のシステム又はハードウェア
内において即ちビデオ通信装置とは外部的に実行され得る。これにより、ユーザ
側におけるビデオ通信装置の技術的要求仕様は小さく保たれ得るし、それにもか
かわらず十分な処理及びメモリ資源が提供される。Similar to configurable call diversion, which is already possible today at telecommunications network providers in their systems or hardware, the participant identification and/or the image data editing can be carried out within the system or hardware of the telecommunications network provider, i.e. externally to the video communication device. This allows the technical requirements of the video communication device on the user side to be kept low while sufficient processing and memory resources are nevertheless available.

【００６３】本発明の更なる応用としては、画像伝送を伴うドアインタホーンの分野、イン
タラクティブなビデオ監視又は同様の分野である。本発明は、上記且つ図面に示された実施例の特徴及びかかる特徴の組み合わせ
に限定されない。個々の状況において、本発明の特徴及びかかる特徴の組み合わ
せは実現され得るものであり、個別及び組み合わせの両方において保護されるべ
き価値を有する。本発明の理解のための本書に含まれる一般的且つ具体的記述に
かかわらず、本書から当業者が自身で及び／又はその専門知識を用いて容易に認
識できる改変、修正、代替及び組み合わせもまた本書の範囲の部分である。Further applications of the invention lie in the field of door intercoms with image transmission, interactive video surveillance or similar fields. The invention is not limited to the features and combinations of features of the embodiments described above and shown in the drawings. In individual cases, the features of the invention and combinations of such features can be realized and merit protection both individually and in combination. Notwithstanding the general and specific descriptions contained in this document for understanding the invention, alterations, modifications, substitutions and combinations that a person skilled in the art can readily recognize from this document, alone and/or using his specialist knowledge, also fall within its scope.

[Brief description of the drawings]

FIG. 1 is a block diagram of an apparatus according to the present invention for transmitting and receiving audiovisual information with user-optimized content.

FIG. 2 is a block diagram showing details of the "memory" function group of FIG. 1.

FIG. 3 is a block diagram of the functions when the apparatus of FIGS. 1 and 2 is used without user-controlled optimization.

FIG. 4 is a block diagram of the functions with user-controlled optimization.

FIG. 5 is a functional flow diagram of the method steps when no user-controlled optimization is performed.

FIG. 6 is a functional flow diagram of the method steps when user-controlled optimization is performed.

[Brief description of reference numerals]

10 central processor unit
11 image input unit
12 display
13 sound input unit
14 transmission unit
15 reception unit
16 data network
17 interface
18 input device
20 memory
21 content block
22 predetermined image data
30 content recognition block
210 user identification block
211 technical image optimization block
212 aesthetic optimization block
213 background processing block
214 head processing block
215 torso processing block
221 given background image data
222 given head image data
223 given torso image data
31 layer 1 (background)
32 layer 2 (torso)
33 layer 3 (head)
34 special layer
40 function block
41 encoding processing block
42 optimization block
43 optimization block

Continuation of front page
(81) Designated states: EP (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OA (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG), AP (GH, GM, KE, LS, MW, SD, SL, SZ, TZ, UG, ZW), EA (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), AE, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CR, CU, CZ, DK, DM, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW

Claims

[Claims]

1. A video communication device comprising user image data input means for inputting current user image data, image data editing means for generating edited user image data from the current user image data, and image data output means for outputting user image data to at least one further communication participant, characterized in that identification means for identifying at least one communication participant and an edit selection controller connected to the identification means are provided, wherein the edit selection controller, depending on the identification result of the identification means, causes the image data output means to output either the unedited current user image data or, with the image data editing means connected upstream, the edited user image data.

2. The video communication device according to claim 1, wherein the identification means is configured to identify the user and at least one further communication participant who is to be connected or is connected.

3. The video communication device according to claim 1, wherein participant selection data storage means for storing participant selection data, and participant identification input means for inputting an identification of the user and/or of at least one further communication participant who is to be connected or is connected, are assigned to the identification means, and the identification means is configured to compare the stored participant selection data with the identification of the current communication participant in order to provide an identification result for the user and/or the at least one further communication participant.

4. The video communication device according to claim 3, wherein the participant selection data storage means stores user selection data of at least one communicable user, and/or partner selection data storage means is provided for storing communication partner selection data of at least one communicable partner.

5. The video communication device according to claim 3, wherein the participant identification input means for inputting the current communication participant identification, in particular the user identification data and/or the communication partner selection data, comprises manual selection means, electrical signal input means, optical signal input means and/or audio signal input means, wherein the manual selection means includes a keyboard, a menu control key, a menu control lever, or a menu-controlled pointing and/or touch-sensitive input device for input of a communication participant identification by the user, and/or the electrical signal input means is configured to receive an electrical signal for a communication participant identification input by the user or a communication participant, and/or the optical signal input means is configured to receive an optical signal for a communication participant identification input by the user or a communication participant, and/or the audio signal input means is configured to receive an audio signal for a communication participant identification input by the user or a communication participant.

6. The video communication device according to claim 1, wherein the edit selection controller is configured to stop or start, depending on the identification result of the identification means, editing of the current user image data by the image data editing means in a predetermined or preset edit mode, or in one of a plurality of predetermined or preset edit modes.

7. The video communication device according to claim 1, wherein given image data storage means is provided for storing given image data, and the image data editing means is configured to edit the current user image data using and/or based on the given image data in order to generate edited user image data.

8. The video communication device according to claim 7, wherein the given image data storage means includes given background image data storage means and/or given person image data storage means, and the image data editing means is configured to separate at least the current user image data into background image data and person image data, and either to replace the background image data and/or the person image data completely or partially by corresponding given background image data or given person image data from the respective storage means, or to generate edited background image data and/or person image data based on the corresponding given background image data or given person image data from the respective storage means.

9. The video communication device according to claim 8, wherein the given image data storage means or the given person image data storage means includes torso image data storage means and/or head image data storage means, and the image data editing means is configured to separate the person image data into torso image data and head image data, and either to replace the torso image data and/or the head image data completely or partially by given torso image data and/or given head image data, or to generate edited torso image data or head image data based on the corresponding given torso image data or given head image data from the respective storage means.

10. The video communication device according to claim 7, wherein a plurality of given background image data and/or given person image data, where applicable given torso image data and/or given head image data, or corresponding sets, subsets or constituent elements thereof, can be stored in the given image data storage means, the given background image data storage means and/or the given person image data storage means, where applicable the torso image data storage means and/or the head image data storage means, or can be stored there assigned to different editing modes of the image data editing means.

11. The video communication device according to claim 7, wherein the image data editing means is configured to edit the separated current image data independently of each other and at least essentially simultaneously, and to combine them to create edited user image data.
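By way of a hedged illustration, editing the separated image parts independently and then recombining them might look like the following sketch. The layer names follow the reference numerals (background, torso, head); representing a layer as rows of pixel values with `None` for transparent pixels, and using worker threads to approximate "essentially simultaneously", are assumptions of this sketch rather than details given in the source.

```python
from concurrent.futures import ThreadPoolExecutor


def edit_layers(layers, editors):
    """Edit each separated layer independently in its own worker thread.

    `layers` maps a layer name to rows of pixel values; `editors` maps a
    layer name to a function that returns the edited layer. Layers without
    an editor pass through unchanged.
    """
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(editors.get(name, lambda layer: layer), layer)
            for name, layer in layers.items()
        }
        return {name: future.result() for name, future in futures.items()}


def combine(layers, order=("background", "torso", "head")):
    """Stack layers bottom to top; opaque pixels cover the layers below."""
    combined = [row[:] for row in layers[order[0]]]
    for name in order[1:]:
        for y, row in enumerate(layers[name]):
            for x, pixel in enumerate(row):
                if pixel is not None:  # None marks a transparent pixel
                    combined[y][x] = pixel
    return combined
```

A call such as `combine(edit_layers(layers, {"background": replace_background}))`, where `replace_background` is a hypothetical editor function, would swap only the background while leaving the torso and head layers as captured.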

12. The video communication device according to claim 1, wherein an aesthetic and/or technical image optimization of the user image data can be performed by the image data editing means on the basis of an editing algorithm, depending on or independently of the identification result, before and/or after the complete or partial replacement, where applicable, of the background image data and/or the person image data, and where applicable of the torso image data and/or the head image data, by corresponding given image data from the respective given image data storage means.

13. The video communication device according to claim 1, wherein the user image data input means creates a plurality of current user images continuously in chronological order, and the images are edited individually and/or in sequence by the image data editing means according to presettable rules.

14. The video communication device according to claim 1, wherein the image data editing means is configured to carry out the separation of background image data and/or person image data, where applicable of torso image data and head image data, and their replacement by the corresponding given image data, dynamically and continuously.

15. The video communication device according to claim 1, wherein the user image data input means includes at least one camera, and the image data output means includes at least one interface to a data communication network.

16. The video communication device according to claim 1, wherein user sound data input means for inputting current user sound data, sound data editing means for generating edited user sound data from the current user sound data, and sound data output means for outputting user sound data to at least one further communication partner are provided.

17. The video communication device according to claim 16, wherein the edit selection controller causes the sound data output means to output unedited current sound data or edited user sound data depending on the identification result of the identification means.

18. The video communication device according to claim 16 or 17, wherein given sound data storage means is provided for storing given sound data, and the sound data editing means is configured to edit the current user sound data using and/or based on the given sound data in order to create edited user sound data.

19. The video communication device according to claim 16, wherein an aesthetic and/or technical sound optimization of the user sound data can be performed by the sound data editing means on the basis of an editing algorithm, depending on or independently of the identification result, where applicable before and/or after the editing of the current user sound data with the given sound data from the given sound data storage means.

20. The video communication device according to claim 16, wherein editing modes of the sound data editing means, preset by the user individually or in groups, are assigned to the identification results of the identification means.

21. The video communication device according to claim 16, wherein the user sound data input means includes at least one microphone, and the sound data output means includes at least one interface to a data communication network.

22. The video communication device according to claim 1, wherein said identification means includes an interface to at least one data communication network.

23. A video communication system comprising at least one video communication device according to one of claims 1 to 22 and connected or connectable to a data communication network.

24. A video communication method, wherein at least one communication participant is identified by identification means, current user image data is input via user image data input means, an edit selection controller, depending on the identification result of the identification means, routes or does not route the current user image data to image data editing means, the image data editing means, immediately after receiving the current user image data, edits it or generates edited user image data from it, and finally the unedited current user image data or, where applicable, the edited user image data is output by image data output means.
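The sequence of the method in claim 24 might be sketched, purely for illustration, as a generator that routes each incoming frame either through an editor or straight to the output. The function name `communicate`, the set of participant identifiers that trigger editing, and the representation of frames as arbitrary Python objects are assumptions of this sketch and are not part of the source.

```python
def communicate(frames, participant_id, edited_participants, edit):
    """Yield the frame to be output for each current input frame.

    frames              -- iterable of current user image data (any objects)
    participant_id      -- identifier supplied by the identification step
    edited_participants -- hypothetical set of identifiers that must only
                           ever receive edited image data
    edit                -- callable that turns a frame into an edited frame
    """
    # Identification result decides once per connection whether the
    # editing step is switched into the path.
    use_editor = participant_id in edited_participants
    for frame in frames:
        # Edit immediately on receipt, or pass the frame through unedited.
        yield edit(frame) if use_editor else frame
```

Because the routing decision is taken before any frame is output, an identified participant in `edited_participants` never receives an unedited frame in this sketch.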

25. A video communication method according to claim 24, wherein the video communication device according to claim 2 is used and / or operated.

26. The video communication method according to claim 24, wherein image data, or image and sound data, is transmitted in a manner optimized for image presentation, in particular for telephone communication with image transmission, and wherein image data obtained from a video source, and where applicable sound data from a sound source, is edited before transmission to the communication partner on the basis of given image data or given sound data according to at least one predetermined or predeterminable criterion, the method comprising the steps of: a) generating and storing the given image data before the communication begins; b) defining and storing parameters for querying the image data and assigning them to the given image data stored in step a); c) examining user image data obtained from the user image data input means, in particular from one or several video sources, for selected or selectable image data parameters using the parameters defined and stored in step b); d) editing the user image data on the basis of the given image data assigned to the image data parameters selected in step c); and e) transmitting the user image data edited in step d) to one or several communication partners.

27. The video communication method according to claim 26, wherein steps d) and e) are performed at a central location remote from the user, and the user image data, the assigned given image data and the image data parameters are transmitted from the user location to the central location.

28. The video communication method according to claim 26, wherein the steps a) to e) are performed at a user location.

29. The video communication method according to one of claims 26 to 28, further comprising an additional step f) in which a pre-stored speech sample of an authorized user is compared with a code phrase uttered by the current user and, if the comparison is positive, the edited communication is released.

30. The video communication method according to claim 29, wherein in step f) the stored speech sample is additionally assigned to given image data belonging to the user or selected as belonging to the user, and the user is identified on the basis of the stored speech sample and the associated image data.

31. The video communication method according to claim 29 or 30, wherein in step f) a speech analysis of the uttered code phrase and an image analysis of the user image data obtained from the user image data input means, in particular the video source, are performed.

32. The method of video communication according to claim 31, wherein in the image analysis, user-specific features are compared with relevant or selected given image data.

33. The video communication method according to any one of claims 30 to 32, wherein before step c) is performed, it is confirmed that the user is authorized to use the given image data stored for audiovisual communication.

34. The video communication method according to any one of claims 24 to 33, wherein the user image data to be processed or edited and to be analyzed, the given image data, and the edited image data comprise two-dimensional and three-dimensional moving image information.

35. The video communication method according to claim 24, wherein the user image data to be transmitted is displayed to the user before transmission.

36. The video communication method according to claim 35, wherein the display includes an interactive user interface that shows selectable and/or selected given image data as well as selectable or selected image data parameters.

37. A video communication device according to claims 1 to 22 for the presentation-optimized transmission of video and/or audio data, in particular, but not exclusively, for carrying out the video communication method according to claims 24 to 26, comprising: video input and output means (11, 12); audio input and output means (13); transmission and reception means (14, 15); an interface (16) to at least one transmission channel; an input device (18) for input of control and command signals; and storage means for storing user and system programs as well as given image data and given sound data, wherein the means, devices and components mentioned above are functionally connected to a processor unit (10) and are configured to carry out the method steps described.

38. The video communication device according to claim 37, further comprising an interface (17) with a higher-level management device and/or a processing device (10) for connection to a higher-level storage medium, for example a "personal computer".

39. Use of a video communication device according to claim 1, a video communication system according to claim 23, and a video communication method according to one of claims 24 to 36, characterized in that they are intended in particular for telephone communication with image transmission, video conferencing, or image-optimized transmission of video and/or audio data in a computer network.