JP2004274349A

JP2004274349A - Business support system

Info

Publication number: JP2004274349A
Application number: JP2003061714A
Authority: JP
Inventors: Mitsuhiko Seki (関 光彦); Kazuji Kotani (小谷 和司); Noboru Kobayashi (小林 昇)
Original assignee: KOBASHOU KK
Current assignee: KOBASHOU KK
Priority date: 2003-03-07
Filing date: 2003-03-07
Publication date: 2004-09-30

Abstract

PROBLEM TO BE SOLVED: To provide a system capable of communicating images of a seller or purchase decision makers in real time.

SOLUTION: This business support system uses a video conference system for holding a video conference over the Internet among at least three users (at least one seller and at least two purchase decision makers). It is provided with a means for acquiring, from each terminal, image frame data and voice data compressed in that terminal; a means for compressing the acquired pieces of image frame data, ordering and arranging them while synchronizing them, allocating them to a plurality of sections obtained by dividing one image frame, and combining them into one piece of image frame data; a means for combining the acquired pieces of voice data; and a means for multicasting the combined image data and voice data to the respective user terminals.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、営業支援システムに関連し、より詳細には、複数の音声及び画像を通信するインターネットＴＶ会議システムを利用する営業支援システムに関連する。
【０００２】
【発明の背景】
一般に、企業が、ある商品ないしサービス（以下、単に「商品」という）を開発したとき、その商品を販売する人（以下、単に「販売者」という）は、その商品を購入するか否かを決定する人（以下、単に「購入決定者」という）に対して、商品説明をする場合もある。この場合、販売者が購入決定者の居る場所に出向いて、商品説明することは、珍しいことではない。
【０００３】
従って、購入決定者の数が多くなる程、販売者は、それぞれの場所に出向く回数も増え、その結果、販売者の負担が大きいという問題がある。特に、同一企業内ないし同一グループ内に、複数の購入決定者がいる場合、販売者は、それぞれの場所に出向く必要があり、上記問題は、より深刻な問題である。このような状況としては、例えば、ある製薬会社が、新しいスギ花粉対策商品（例えば、▲１▼シソ濃縮エキス及び甜茶エキスを含む栄養補助食品、▲２▼刺激が少なく、かつ爽快感のあるミント系の香りを有する洗浄液を利用する鼻洗浄器など）を開発した場合、その会社の販売者が、同一企業内の複数の薬局店舗に出向いて、同じ商品説明を個別に行わなければならない状況である。
【０００４】
このような状況において、従来のインターネットＴＶ会議システムを利用することも考えられる。即ち、１人の販売者と複数の購入決定者との間に、全員の音声及び画像を通信させることが可能であれば、販売者は、複数の場所に出向く必要がなくなると考えられる。
【０００５】
【発明が解決しようとする課題】
図１は、従来のインターネットＴＶ会議システムの構成例の概略を示した図である。図１に示すように、従来のインターネットＴＶ会議システムを利用すると、販売者と複数の購入決定者との総数がｎ人である場合、１台のパソコン（例えば、パソコン３１）は、そのパソコンの所有者の１組の音声及び画像（例えば、図中の符号「Ｆ１」）をインターネット１を介してサーバー２に送信する一方、その所有者を含めた全員のｎ組の音声及び画像（「Ｆ１〜Ｆｎ」）をサーバー２から受信する必要がある。即ち、パソコン端末３１、３２、３３、・・・、３ｎのそれぞれが、ｎ組の音声及び画像（「Ｆ１〜Ｆｎ」）をサーバー２から受信する必要があるので、販売者及び購入決定者のそれぞれが、大容量の通信インフラ（例えば、ＦＴＴＨ）を備える必要があった。このような大容量の通信インフラを備えるには、膨大なコストを伴うため、結果として、従来のインターネットＴＶ会議システムを利用して商品説明を行うことは、非現実的であった。
【０００６】
一方で、現在の普及型通信インフラ（例えば、ＡＤＳＬ１２Ｍなど）を利用したまま、商品説明を行うには、１組の音声及び画像のサイズを小さくすることも考えられる。しなしながら、各パソコン３は、ｎ組の音声及び画像（「Ｆ１〜Ｆｎ」）を受信しなければならないため、リアルタイムで販売者ないし購入決定者の画像を通信することができず、実質的に、商品説明を行うことができなかった。
【０００７】
そこで、本発明の目的は、現在の普及型通信インフラを利用したまま、リアルタイムで販売者ないし購入決定者の画像を通信することができるシステムを提供することである。
【０００８】
一方、本発明を利用すれば、リアルタイムで販売者ないし購入決定者の画像を通信することができ、その結果、インターネットＴＶ会議システムを利用して商品説明をすることが可能となる。従って、その商品説明後に、商談が成立し得り、インターネットＴＶ会議システムを利用して契約行為を行うことが可能となる。
【０００９】
そこで、本発明の別の目的は、インターネットＴＶ会議システムを利用する契約行為を促進し、又は証明することができるシステムを提供することである。
なお、本発明の他の目的は、以下に説明する発明の実施の形態を参照することによって、明らかになるであろう。
【００１０】
【課題を解決するための手段】
上記目的を達成するために、本発明の営業支援システムは、インターネットに接続された少なくとも３台以上のＴＶ会議用の端末と、インターネットに接続されたサーバとからなり、インターネットを介して少なくとも３人以上のユーザ間でＴＶ会議を行うためのＴＶ会議システムを利用する営業支援システムおいて、
サーバは、
各端末から、各端末において圧縮された画像フレームデータ及び音声データを取得するデータ取得手段と、
取得された複数の画像フレームデータを圧縮しかつ同期をとりつつ整理調整することにより、取得された複数の画像フレームデータを、１つの画像フレームを分割することによって得られる複数の区分に割り当てて、１つの画像フレームデータに合成する画像データ圧縮合成手段と、
取得された複数の音声データを合成する手段と、
合成された画像フレームデータ及び音声データをユーザ端末のそれぞれにマルチキャストするデータ配信手段と
を備え、
少なくとも１台の端末が、販売者用の端末であり、
少なくとも２台の端末が、購入決定者用の端末である。
【００１１】
購入決定者用の複数の端末の内の少なくとも１つの端末が、ＴＶ会議用のカメラを備えいない場合、好ましくは、
ＴＶ会議システムサーバはさらに、静止画像を記憶した記憶手段を備え、
画像データ圧縮合成手段は、１つの画像フレーム中のカメラを具備していないユーザ端末に対応する区分に、記憶手段から静止画像を読み出して割り当てるよう構成される。
【００１２】
ＴＶ会議に参加する購入決定者用の端末の数が、１つの画像フレームの複数の区分の数に満たない場合、好ましくは、
画像データ圧縮合成手段はさらに、余剰の区分に、記憶部から静止画像を読み出して割り当てるよう構成される。
【００１３】
さらに好ましくは、データ取得手段はさらに、取得された音声データの内の最も音量が大きい音声データ、又は、端末から発言者を表すキー操作信号を検出する手段を備え、
画像データ圧縮合成手段はさらに、検出された音量が最も大きい音声データに対応する端末、又は、検出された発言者を表すキー操作信号の送信元の端末からの画像フレームデータを処理して、該データの画像が他の端末からの画像に対比して拡大して１つの区分に割り当てるよう構成される。
【００１４】
加えて、サーバはさらに、
ＴＶ会議に参加を許可された販売者及び購入決定者の音声分析結果データを予め記憶しており、取得された音声データを分析し、予め記憶された音声分析結果データに一致するか否かを判定する手段を備え、
複数の音声データを合成する手段は、音声分析結果データに一致する音声データのみ合成し、
画像データ圧縮合成手段は、一致する音声データに対応する画像フレームデータのみ圧縮し、１つの画像フレームデータに合成するよう構成されていることが好ましい。
【００１５】
また、サーバはさらに、合成された画像フレームデータ及び音声データを記憶する手段を備えることが好ましい。
さらに、上記目的を達成するために、本発明の営業支援システムは、インターネットに接続された少なくとも１台以上の映像監視用の端末と、インターネットに接続された少なくとも２台以上のカメラと、インターネットに接続されたサーバとからなり、インターネットを介して少なくとも２台以上のカメラからの映像監視を行うための映像監視システムを利用する営業支援システムおいて、
サーバは、
各カメラから画像フレームデータを取得するデータ取得手段と、
取得された複数の画像フレームデータを圧縮し、圧縮された複数の画像フレームデータを、１つの画像フレームを分割することによって得られる複数の区分に割り当てることにより、１つの画像フレームデータに合成する画像データ圧縮合成手段と、
合成された画像フレームデータを端末にマルチキャストするデータ配信手段と
を備え、
カメラが、商品を備える店舗内に配置されている。
【００１６】
【発明の実施の形態】
（第１の実施形態）
以下に、本発明の第１の実施形態について、図面を参照して説明する。
【００１７】
図２は、本発明の営業支援システム１０の構成例の概略を示した図であり、図３は、図２中のサーバ４の機能ブロック図を示し、図４は、図２中の端末５の概略的な機能ブロック図を示す。
【００１８】
本発明の営業支援システム１０は、図２に示すように、インターネット３に接続可能なサーバ４及び少なくとも３台の端末５１、５２、５３、・・・、５ｎ（以下、代表して表す場合には「端末５」で表す）を備える。
【００１９】
図２中のサーバ４は、図３に示すように、データ取得部４１と、画像データ圧縮合成部４２と、音声データ合成部４３と、データ送信部４４と、資料管理部４５と、静止画像記憶部４６と、資料記憶部４７と、を備えている。
【００２０】
また、図２中の各端末５（例えば、パソコン）は、図４に示すように、様々なモードを実行するためのプログラムモジュール（ＰＭ）４８を予め記憶している。それらのモードは、例えば、ＴＶ会議参加者（少なくとも１人の営業者と少なくとも２人の購入決定者）の画像を表示する「ユーザ画像表示モード」と、会議資料を作成編集するための「資料作成編集モード」と、会議資料を表示するための「プレゼンテーションモード」と、である。また、各端末５は、サーバ４のデータ取得部４１及びデータ送信部２４との間で、適宜のプロトコルでデータ通信を行うが、そのためのプログラム（Ｐ）４８’も、各端末５内に予め記憶されている。さらに、各端末５は、画像データ及び音声データを、適宜の圧縮方法でデータ圧縮を行うが、そのためのプログラム（Ｐ）４８’も、各端末５内に予め記憶されている。なお、各端末５は、カメラ、マイク、スピーカ及びディスプレイ４９、並びに、ビデオコーディック及び音声コーディック４９’を備えている。
【００２１】
以下に、端末５１〜５ｎにおいて実行される「ユーザ画像表示モード」、「資料作成編集モード」、及び「プレゼンテーションモード」での動作について説明する。
【００２２】
［ユーザ画像表示モード］
各端末５においてユーザ画面モードが選択されると、サーバ４のデータ取得部４１は、インターネットＴＶ会議に参加する端末５１〜５ｎから、例えば端末５１〜５ｎの順番で、各端末５において圧縮された画像フレームデータ（図２中の「Ｆ１」〜「Ｆｎ」）を取り込む。
【００２３】
すなわち、データ取得部４１は、まず端末５１（例えば、この端末を営業者が所有する）に対して送信要求（画像データ及びそれに付随する音声データを送信するよう指示するための要求）を出力する。すると、それに応答して、端末５１が、カメラ４９で現在撮影されかつ圧縮された画像フレームデータＦ１（少なくとも営業者の顔を含む）と、マイク４９で取得されかつ圧縮された音声データ（営業者の音声）とを、データ取得部４１に返送する。同様にして、データ取得部４１が、端末５２〜５ｎ（例えば、これらの端末を購入決定者が所有する）に対して送信要求を順次送信すると、端末５２〜５ｎは、現在の圧縮された画像フレームデータＦ２〜Ｆｎ（少なくとも購入決定者の顔を含む）及び音声データ（購入決定者の音声）を、データ取得部４１に返送する。データ取得部４１は、ＴＶ会議中、このような画像フレームデータの取得動作を反復実行する。
【００２４】
データ取得部４１により取得された画像フレームデータＦ１〜Ｆｎは、同期をとって画像データ圧縮合成部４２に供給され、また、データ取得部４１により取得された音声データは、同期をとって音声データ合成部４３に供給される。
【００２５】
サーバ４の画像データ圧縮合成部４２は、画像フレームデータＦ１〜Ｆｎが入力されると、該データをそれぞれ圧縮する。画像データの圧縮は、適宜の既知の圧縮方法を採用可能であるが、各画像フレームデータを１／ｎ以下に圧縮する必要がある。画像データ圧縮合成部４２には、圧縮された画像フレームデータＦ１〜Ｆｎを、それぞれサブフレームデータＦｓ１〜Ｆｓｎとして、これらサブフレームデータを「１つ」の画像フレームデータＦのどこに組み込むかが、予め設定されている。
【００２６】
図５は、各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）の構成例を表す。図５に示すように、例えば、サブフレームデータＦｓ１〜Ｆｓｎによるサブ画像Ｍ１〜Ｍｎのそれぞれが、画面Ｍをｎ個に分割した区分のそれぞれに割り当てられるように、サブフレームデータ列（Ｆｓ１〜Ｆｓｎ）が、組み替えられる。これにより、取得された画像フレームデータＦ１〜Ｆｎは、１つの画像フレームデータＦに合成される。合成された画像フレームデータＦは、データ送信部４４に供給される。
【００２７】
音声データ合成部４３は、端末５１〜５ｎから画像フレームデータＦ１〜Ｆｎに付随して送られてくる音声データを、データ取得部４１から受け取り、これを合成して（必要に応じて圧縮して）データ送信部４４に供給する。
【００２８】
データ送信部４４は、画像フレームデータＦ及び音声データを画像データ圧縮合成部４２及び音声データ合成部４３から受け取ると、これらをＴＶ会議の参加者の端末５１〜５ｎすべてにマルチキャストする。これにより、各端末５には、図５に示すように、１つの画面Ｍ上に、サブフレームデータＦｓ１〜Ｆｓｎに対応するサブ画像Ｍ１〜Ｍｎが表示される。
【００２９】
端末５１〜５ｎにマルチキャストされる画像フレームデータＦは、サーバ側での整理・調整処理により、１枚の画像フレームデータであるので、各端末に複数の画像フレームデータをマルチキャストする従来例のインターネットＴＶ会議システムに対比して、送信される画像データの量が低減されており、したがって、インターネット１、サーバ４及び各端末５の負荷が低減される。また、参加端末がいくら増大しても、一定の画像データ量を超えることがない方式としているため、インターネットの負荷が低減される上、全ての端末側のデータ量を合計した画像データフレームＦ及び音声データのデータ通信速度が、例えば３００ｋｂｐｓ以下となる方式にしている。この画像データ量の低減は、端末５の数が大きくなるほど、大きくなる。したがって、何百人という規模の同時会議においても音声及び画像品質が安定したＴＶ会議システムを実現することができる。
【００３０】
図６は、各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）のもう１つの構成例を表す。
上記においては、ＴＶ会議の参加者総てが、画像データ及び音声データをシステムサーバ２のデータ取得部４１に返送する例について説明したが、カメラを保有していない購入決定者であっても、音声のみでＴＶ会議に参加することが可能である。例えば、端末５３にカメラ４９が備えられていない場合、合成された画像フレームデータＦのサブフレームデータＦｓ３に対応するサブフレームデータＦｓＮＣには、静止画像記憶部４６に予め記憶されている適宜の静止画像（例えば、端末４６の所有者の静止画像、カメラがない旨の表示等）がサブ画像Ｍ３として割り当てられる。カメラがない端末が、ＴＶ会議に参加する場合、画像データ圧縮合成部４２は、静止画像記憶部４６からの静止画像を最後尾の区分（サブ画像Ｍｎの位置）に配置するよう自動的に割り当てることもできる。
【００３１】
また、例えば、最大ｎ人がＴＶ会議に参加できるように、サブ画像をｎ個に設定していても、実際には参加者がｎよりも少ない人数、例えば、（ｎ−２）人である場合がある。このような場合、画像データ圧縮合成部４２は、残りの２個のサブ画像Ｍｎ−１及びＭｎに対応する区分位置に、静止画像記憶部４６からの適宜の静止画像（例えば、会議参加者不在であることを表す画像）を組み込む。このようにする代わりに、画像データ圧縮合成部４２を、参加者数に応じて、各端末５に表示される１枚の画像Ｍ中のサブ画像の数（すなわち区分数）及び圧縮率を変更可能に構成してもよい。
【００３２】
図７は、各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）の他の構成例を表す。
さらに、画像データ圧縮合成部４２において、各端末５に表示される画像Ｍ中に、発言者サブ画像ＭＳＰを組み込むようにすることもできる。例えば、各端末５上の特定のキーを操作した参加者（販売者又は購入決定者）のみが発言権を得るように構成することができ、この場合、データ取得部２１では、発言権を得るためのキー操作がされた端末を識別することにより、発言者を特定することができる。このようにする替わりに、データ取得部２１において、音声検出を行うことにより、どの端末の参加者が発言者であるかを判定することもできる。複数の端末から音声データが送信された場合には、音声が最も大きいものを発言者として識別する。
【００３３】
例えば、端末５１の販売者が、発言している場合、好ましくは、発言者サブ画像ＭＳＰは、サブフレームデータＦＳ１を拡大したサブフレームデータＦＳＰである。なお、サブフレームデータＦＳＰは、他のサブクレームデータＦＳ１〜ＦＳｎとともに、画像圧縮合成部４２内で合成される。好ましくは、発言者を特定するためのサブ画面Ｍ１の外縁にある画像ＳＰに対応するサブフレームデータも、一緒に合成される。
【００３４】
図７に示すように、各端末５の各ディスプレイ４９は、画像Ｍに加えて、テキストチャット画像ＭＣＨを表示することもできる。テキストチャット画像ＭＣＨを表示する場合、サーバ２は更に、チャットサービスサーバとしての機能を備える必要があり、各端末５は更に、チャットプログラム４８’を備える必要がある。
【００３５】
［資料作成編集モード］
各端末５において、このモードが選択されると、該端末に商談用資料（例えば、▲１▼栄養補助食品に関する市場調査結果資料、▲２▼鼻洗浄器の詳細な構造説明資料、▲３▼購入決定者の店舗内でのテスト販売結果資料など））の作成編集画面（図示せず）が表示され、これにより資料の新規作成及び既存資料の改変が可能となる。この資料作成編集機能は、汎用の文書作成編集ソフトを用いることによって実現できる。作成された資料は、システムサーバ２内の資料記憶部２７に記憶され、会議参加者が共通にアクセス可能となる。なお、販売者の端末５１にのみ、この資料作成編集機能を備え、購入決定者の端末５２〜５ｎは、この機能を省略することもできる。
【００３６】
［プレゼンテーションモード］
各端末において、このモードが選択されると、資料記憶部２７に記憶されている資料にアクセスして該資料を表示可能となる。資料を用いてＴＶ会議でプレゼンテーションを行う場合、ＴＶ会議に参加しているすべての端末５１〜５ｎに、サーバ２を介して該資料を提供し表示させる。
【００３７】
この場合、プレセンタである端末（例えば、販売者の端末５１）から資料データをサーバ２に送信し、サーバ２は、該プレセンタからの指示に従い、全てのユ端末５に対してプレゼンテーションがなされることを通知する。この通知は、各端末５のモニタ画面に表示され、該通知が表示された時点でプレゼンテーションモードに切り換え操作を行うことができる。
【００３８】
なお、プレゼンテーション用の資料データとともに画像フレームデータＦ及び音声データも、データ送信部４４から全ての端末５１〜５ｎにマルチキャストされており、これにより、各参加者は、自分の端末をプレゼンテーションモードにするか又はユーザ画像表示モードにするかを、個別に選択することができる。
【００３９】
本発明は、以上のように構成され、ＴＶ会議システムの端末５側のソフトで１次圧縮した画像データをシステムサーバで受信した後、該データを同期をとりつつ整理・調整する過程で２次圧縮をすることにより、複数の画像データを１つの画像に合成する方式を採用しているので、システムサーバ４から各端末５に向かうデータ量を低減することができる。よって、会議参加者が何百台と増加しても、データ量を比較的小さく（例えば、３００ｋｂｐｓ以下）安定化することができる。
【００４０】
したがって、本発明によれば、高品質の画像・音声・プレゼンテーションのデータからのネットワーク負荷を大幅に低減させ、なおかつ一定負荷量に安定させて維持することができるＴＶ会議システムを利用する画期的な営業支援システムを実現することができる。
【００４１】
（第２の実施形態）
以下に、本発明の第２の実施形態について、図面を参照して説明する。なお、第２の実施形態は、インターネットＴＶ会議システムを利用する契約行為を促進し、又は証明するための手段を、第１の実施形態に追加したものである。従って、追加した手段について、以下、詳細に述べることとする。
【００４２】
［音声分析部７１］
図８は、第２の実施形態に係るサーバ４の機能ブロック図を示す。図８に示すように、サーバ４は更に、音声分析部７１を備える。音声分析部７１は、ＴＶ会議に参加を許可された販売者及び購入決定者の音声分析結果データ（例えば、声紋データ）を予め記憶している。音声分析部７１は、データ取得部４１により取得された音声データを分析し、予め記憶された音声分析結果データに一致する音声データのみ、音声データ合成部４３に供給する。このとき、音声分析部７１は、予め記憶された音声分析結果データに一致する音声データに対応する画像フレームデータのみ、画像データ圧縮合成部４２に供給するとともに、予め記憶された音声分析結果データに一致する音声データに対応する端末にのみ、合成された画像フレームデータ及び音声データを送信するように、データ送信部４４に命令する。
【００４３】
データ取得部４１により取得された音声データを音声分析部７１が分析するタイミングは、データ取得部４１から各端末５への送信要求に応答して、各音声データが、最初に取得された時である。また、音声分析部７１は、ＴＶ会議中、所定の間隔（例えば、３０分毎）で、音声データを分析することもできる。或いは、音声分析部７１は、ＴＶ会議中、常時、音声データを分析することもできる。
【００４４】
このように、予め許可された参加者のみＴＶ会議に参加することができるので、各参加者は、安心してインターネットＴＶ会議システムを利用することが可能となる。
【００４５】
［合成データ記憶部７２］
サーバ４は更に、合成データ記憶部７２を備える。合成データ記憶部７２は、その後にデータ送信部４４から送信される、画像データ圧縮合成部４２及び音声データ合成部４３で合成された画像フレームデータ及び音声データを記憶する。
【００４６】
合成された画像フレームデータ及び音声データを合成データ記憶部７２が記憶し始めるタイミングは、データ取得部４１から各端末５への送信要求に応答して、何れか１の画像フレームデータが、最初に取得された時である。その後、合成データ記憶部７２は、ＴＶ会議中、常時、合成された画像フレームデータ及び音声データを記憶する。ＴＶ会議の終了後、各端末５のユーザ画面モードが、すべて解除された時、合成データ記憶部７２は、データの記憶を停止する。
【００４７】
このように、ＴＶ会議における商談の内容を記憶することができるので、商談中に行われた契約行為を容易に証明することが可能となる。
また、端末５の中で、販売者の１台の端末（例えば、端末５１）及び購入決定者の１台の端末（例えば、端末５２）は、それぞれ記憶開始ボタン及び記憶停止ボタン（図示せず）を備えることもできる。この場合、販売者の端末（例えば、端末５１）において、その端末５１の第１記憶開始ボタンが、その端末５１のユーザ（販売者）によって押されると、その端末５１は、第１記憶開始信号を、インターネット１を介してサーバ４に送信する。同様に、購入決定者の端末（例えば、端末５２）において、その端末５２の第２記憶開始ボタンが、その端末５２のユーザ（購入決定者の代表者）によって押されると、その端末５２は、第２記憶開始信号を、インターネット１を介してサーバ４に送信する。これに対し、サーバ２の合成データ記憶部７２は、第１又は第２記憶開始信号を受信した後、合成された画像フレームデータ及び音声データを記憶し始める。
【００４８】
また、ＴＶ会議の途中又は終了後、販売者又は購入決定者の端末の何れか１の端末（例えば、端末５２）において、その端末５２の第２記憶停止ボタンが、その端末５２のユーザ（購入決定者の代表者）によって押されると、その端末５２は、第２記憶停止信号を、インターネット１を介してサーバ４に送信する。これに対し、サーバ２の合成データ記憶部７２は、第１又は第２記憶停止信号（例えば、第２記憶停止信号）を受信した後、データの記憶し終了する。
【００４９】
このように、記憶開始ボタンが押された時から、記憶停止ボタンが押された時まで、ＴＶ会議における商談の内容を記憶することができるので、データを記憶する容量を少なくすることができる。
【００５０】
（第３の実施形態）
以下に、本発明の第３の実施形態について、図面を参照して説明する。なお、第３の実施形態は、第１の実施形態の利用方法を変更したものである。従って、変更した利用方法（変更した構成）について、以下、詳細に述べることとする。
【００５１】
図９は、第３の実施形態に係る営業支援システム１０の構成例の概略を示した図である。
第３の実施実施形態に係る営業支援システム１０は、図９に示すように、インターネット３に接続可能なサーバ４、少なくとも１台の端末６１、６２、６３、・・・、６ｍ（以下、代表して表す場合には「端末６」で表す）、少なくとも２台以上の店舗内に取り付けられたカメラ７１、７２、７３、・・・、７ｎ（以下、代表して表す場合には「カメラ７」で表す）を備える。
【００５２】
図９中のサーバ４は、データ取得部４１と、画像データ圧縮合成部４２と、データ送信部４４と、を備え、図３に示すような、音声データ合成部４３と、資料管理部４５と、静止画像記憶部４６と、資料記憶部４７と、を備える必要がない。
【００５３】
また、図９中の各端末６（例えば、パソコン）は、店舗内の画像を表示する「店舗内画像表示モード」を実行するためのプログラムモジュール（ＰＭ）４８を予め記憶しており、上述の「ユーザ画像表示モード」、「資料作成編集モード」及び「プレゼンテーションモード」を実行するためのプログラムモジュール（ＰＭ）４８を予め記憶する必要がない。また、各端末６は、サーバ４のデータ取得部４１及びデータ送信部２４との間で、適宜のプロトコルでデータ通信を行うが、そのためのプログラム（Ｐ）４８’も、各端末５内に予め記憶されている。なお、各端末６は、ディスプレイ４９及びビデオコーディック４９’を備え、カメラ、マイク及びスピーカ４９、並びに、音声コーディック４９’を備える必要がない。
【００５４】
また、図９中の店舗内の各カメラ７は、ビデオコーディック４９’を備えている。好ましくは、各カメラ７は、店舗内にある商品が並んだ棚、店舗内に入店しようとする顧客などを、捕らえる様に、配置される。なお、同一店舗内に、複数のカメラを配置することもできる。
【００５５】
以下に、端末６１〜６ｍにおいて実行される「店舗内画像表示モード」の動作について説明する。
［店舗内画像表示モード］
各端末６において店舗内画面モードが選択されると、サーバ４のデータ取得部４１は、店舗内に設置されるカメラ７１〜７ｎから、例えばカメラ７１〜７ｎの順番で、画像フレームデータを取り込む。
【００５６】
すなわち、データ取得部４１は、まずカメラ７１に対して送信要求（画像データを送信するよう指示するための要求）を出力する。すると、それに応答して、カメラ７１が、現在撮影している画像フレームデータＦ１（店舗内の画像：商品棚及びその周辺の顧客、店舗内に入店しようとする入り口付近の顧客）を、データ取得部４１に返送する。同様にして、データ取得部４１が、カメラ７２〜７ｎに対して送信要求を順次送信すると、カメラ７２〜７ｎは、現在の画像フレームデータＦ２〜Ｆｎを、データ取得部４１に返送する。データ取得部４１は、このモードの動作中、このような画像フレームデータの取得動作を反復実行する。
【００５７】
データ取得部４１により取得された画像フレームデータＦ１〜Ｆｎは、取得順に画像データ圧縮合成部４２に供給される。
サーバ４の画像データ圧縮合成部４２は、画像フレームデータＦ１〜Ｆｎが入力されると、該データをそれぞれ圧縮する。画像データの圧縮は、適宜の既知の圧縮方法を採用可能であるが、各画像フレームデータを１／ｎ以下に圧縮する必要がある。画像データ圧縮合成部４２には、圧縮された画像フレームデータＦ１〜Ｆｎを、それぞれサブフレームデータＦｓ１〜Ｆｓｎとして、これらサブフレームデータを「１つ」の画像フレームデータＦのどこに組み込むかが、予め設定されている。合成された画像フレームデータＦは、データ送信部４４に供給される。
【００５８】
データ送信部４４は、画像フレームデータＦを画像データ圧縮合成部４２４３から受け取ると、これらを映像監視の参加者の端末６１〜６ｍすべてにマルチキャストする。
【００５９】
端末６１〜６ｎにマルチキャストされる画像フレームデータＦは、１枚の画像フレームデータであるので、各端末に複数の画像フレームデータをマルチキャストする従来例のインターネットＴＶ会議システムを変形したインターネット映像監視システムに対比して、送信される画像データの量が低減されており、したがって、インターネット１、サーバ４及び各端末７の負荷が低減される。
【００６０】
なお、第３の実施形態に係るサーバ４は、実施の形態２に係る合成データ記憶部７２のように、画像フレームデータＦを記憶することもできる。
本発明は、以上のように構成され、各端末６のユーザは、店舗内における顧客の行動（導線、購入実績）データをリアルタイムに把握し、分析することが可能となり、店舗内の商品の最適購入数、モデル棚割などを提案することができる。
【００６１】
また、入店した顧客を認識することが可能となり、顧客に適する広告を作成することができる。
【図面の簡単な説明】
【図１】従来のインターネットＴＶ会議システムの構成例の概略を示した図である。
【図２】本発明の営業支援システム１０の構成例の概略を示した図である。
【図３】図２中のサーバ４の機能ブロック図を示す。
【図４】図２中の端末５の概略的な機能ブロック図を示す。
【図５】各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）の構成例を表す。
【図６】各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）のもう１つの構成例を表す。
【図７】各端末５の各ディスプレイ４９に表示される画像Ｍ（フレームデータＦ）の他の構成例を表す。
【図８】第２の実施形態に係るサーバ４の機能ブロック図を示す。
【図９】第３の実施形態に係る営業支援システム１０の構成例の概略を示した図である。

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a sales support system, and more particularly, to a sales support system that utilizes an Internet TV conference system that communicates a plurality of voices and images.
[0002]
BACKGROUND OF THE INVENTION
Generally, when a company develops a certain product or service (hereinafter simply referred to as "product"), the person who sells the product (hereinafter simply referred to as "seller") may explain the product to the person who decides whether or not to purchase it (hereinafter simply referred to as "purchase decision maker"). In this case, it is not uncommon for the seller to travel to the purchase decision maker's location to explain the product.
[0003]
Therefore, as the number of purchase decision makers increases, the number of trips the seller must make also increases, which places a heavy burden on the seller. In particular, when there are a plurality of purchase decision makers in the same company or the same group, the seller needs to visit each location, and the problem becomes more serious. One such situation arises, for example, when a pharmaceutical company has developed new cedar pollen countermeasure products (for example, (1) a dietary supplement containing concentrated perilla extract and sweet tea (tencha) extract, and (2) a nasal irrigator that uses a low-irritation cleaning fluid with a refreshing mint scent): the company's seller must visit multiple pharmacy stores within the same company and give the same product explanation separately at each one.
[0004]
In such a situation, using a conventional Internet TV conference system is conceivable. That is, if the voices and images of all participants could be communicated between one seller and a plurality of purchase decision makers, the seller would no longer need to travel to a plurality of locations.
[0005]
[Problems to be solved by the invention]
FIG. 1 is a diagram schematically showing a configuration example of a conventional Internet TV conference system. As shown in FIG. 1, when a conventional Internet TV conference system is used and the total number of sellers and purchase decision makers is n, one personal computer (for example, personal computer 31) transmits one set of voice and image of its owner (for example, "F1" in the figure) to the server 2 via the Internet 1, while it must receive from the server 2 the n sets of voices and images ("F1 to Fn") of all participants, including that owner. That is, since each of the personal computer terminals 31, 32, 33, ..., 3n needs to receive n sets of voices and images ("F1 to Fn") from the server 2, the seller and each purchase decision maker had to have a large-capacity communication infrastructure (for example, FTTH). Providing such a large-capacity communication infrastructure involves enormous costs, and as a result, explaining a product using a conventional Internet TV conference system was impractical.
[0006]
On the other hand, in order to explain a product while still using the currently widespread communication infrastructure (for example, 12 Mbps ADSL), reducing the size of each set of voice and image is conceivable. However, since each personal computer 3 must still receive n sets of voices and images ("F1 to Fn"), the images of the seller or the purchase decision makers could not be communicated in real time, and in practice the product could not be explained.
[0007]
Therefore, an object of the present invention is to provide a system capable of communicating images of a seller or a purchase decision maker in real time while using a current popular communication infrastructure.
[0008]
On the other hand, if the present invention is used, images of the seller or the purchase decision makers can be communicated in real time, and as a result, a product can be explained using the Internet TV conference system. Consequently, a deal may be concluded after the product explanation, and a contract can be made using the Internet TV conference system.
[0009]
Therefore, another object of the present invention is to provide a system capable of promoting or proving a contract using the Internet TV conference system.
Other objects of the present invention will become apparent by referring to embodiments of the invention described below.
[0010]
[Means for Solving the Problems]
In order to achieve the above object, a sales support system of the present invention uses a TV conference system that comprises at least three TV conference terminals connected to the Internet and a server connected to the Internet, and that is for holding a TV conference among at least three users via the Internet, wherein
the server comprises:
data acquisition means for acquiring, from each terminal, image frame data and audio data compressed in that terminal;
image data compression/synthesis means for compressing the acquired plurality of image frame data and arranging and adjusting them while synchronizing them, thereby assigning the acquired plurality of image frame data to a plurality of sections obtained by dividing one image frame and synthesizing them into one image frame data;
means for synthesizing the acquired plurality of audio data; and
data distribution means for multicasting the synthesized image frame data and audio data to each of the user terminals,
at least one terminal is a terminal for the seller, and
at least two terminals are terminals for purchase decision makers.
[0011]
In a case where at least one of the plurality of terminals for purchase decision makers does not have a camera for the TV conference, preferably,
the TV conference system server further includes storage means storing a still image, and
the image data compression/synthesis means is configured to read the still image from the storage means and assign it to the section, in the one image frame, corresponding to the user terminal without a camera.
[0012]
When the number of terminals for purchase decision makers participating in the TV conference is less than the number of sections of the one image frame, preferably,
the image data compression/synthesis means is further configured to read a still image from the storage unit and assign it to each surplus section.
[0013]
More preferably, the data acquisition means further comprises means for detecting, among the acquired audio data, the audio data with the highest volume, or a key operation signal from a terminal indicating the speaker, and
the image data compression/synthesis means is further configured to process the image frame data from the terminal corresponding to the detected highest-volume audio data, or from the terminal that transmitted the detected key operation signal indicating the speaker, so that the image of that data is enlarged relative to the images from the other terminals and assigned to one section.
[0014]
In addition, preferably, the server further comprises
means that stores in advance voice analysis result data of the seller and the purchase decision makers permitted to participate in the TV conference, analyzes the acquired voice data, and determines whether or not it matches the voice analysis result data stored in advance,
the means for synthesizing the plurality of voice data synthesizes only voice data that matches the voice analysis result data, and
the image data compression/synthesis means is configured to compress only the image frame data corresponding to the matching voice data and synthesize it into one image frame data.
[0015]
Preferably, the server further includes means for storing the synthesized image frame data and audio data.
Further, in order to achieve the above object, a sales support system of the present invention uses a video monitoring system that comprises at least one video monitoring terminal connected to the Internet, at least two cameras connected to the Internet, and a server connected to the Internet, and that is for monitoring video from the at least two cameras via the Internet, wherein
the server comprises:
data acquisition means for acquiring image frame data from each camera;
image data compression/synthesis means for compressing the acquired plurality of image frame data and assigning the compressed plurality of image frame data to a plurality of sections obtained by dividing one image frame, thereby synthesizing them into one image frame data; and
data distribution means for multicasting the synthesized image frame data to the terminal, and
the cameras are placed in stores that stock products.
[0016]
BEST MODE FOR CARRYING OUT THE INVENTION
(First embodiment)
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings.
[0017]
FIG. 2 is a diagram schematically showing a configuration example of the sales support system 10 of the present invention, FIG. 3 is a functional block diagram of the server 4 in FIG. 2, and FIG. 4 is a schematic functional block diagram of the terminal 5 in FIG. 2.
[0018]
As shown in FIG. 2, the sales support system 10 of the present invention includes a server 4 connectable to the Internet 3 and at least three terminals 51, 52, 53, ..., 5n (hereinafter collectively represented as "terminal 5").
[0019]
As shown in FIG. 3, the server 4 in FIG. 2 includes a data acquisition unit 41, an image data compression/synthesis unit 42, an audio data synthesis unit 43, a data transmission unit 44, a material management unit 45, a still image storage unit 46, and a material storage unit 47.
[0020]
Also, as shown in FIG. 4, each terminal 5 (for example, a personal computer) in FIG. 2 stores in advance a program module (PM) 48 for executing various modes. The modes are, for example, a "user image display mode" for displaying images of the TV conference participants (at least one salesperson and at least two purchase decision makers), a "material creation/editing mode" for creating and editing conference materials, and a "presentation mode" for displaying conference materials. In addition, each terminal 5 performs data communication with the data acquisition unit 41 and the data transmission unit 44 of the server 4 using an appropriate protocol, and a program (P) 48' for this purpose is also stored in each terminal 5 in advance. Further, each terminal 5 compresses image data and audio data by an appropriate compression method, and a program (P) 48' for that purpose is likewise stored in each terminal 5 in advance. Each terminal 5 includes a camera, a microphone, a speaker, and a display 49, as well as a video codec and an audio codec 49'.
[0021]
Hereinafter, operations in the “user image display mode”, the “material creation / editing mode”, and the “presentation mode” executed in the terminals 51 to 5n will be described.
[0022]
[User image display mode]
When the user image display mode is selected on each terminal 5, the data acquisition unit 41 of the server 4 takes in, from the terminals 51 to 5n participating in the Internet TV conference, for example in the order of the terminals 51 to 5n, the image frame data ("F1" to "Fn" in FIG. 2) compressed in each terminal 5.
[0023]
That is, the data acquisition unit 41 first outputs a transmission request (a request instructing the terminal to transmit image data and the accompanying audio data) to the terminal 51 (owned, for example, by the salesperson). In response, the terminal 51 returns to the data acquisition unit 41 the image frame data F1 (including at least the salesperson's face) currently captured by the camera 49 and compressed, and the audio data (the salesperson's voice) acquired by the microphone 49 and compressed. Similarly, when the data acquisition unit 41 sequentially transmits transmission requests to the terminals 52 to 5n (owned, for example, by the purchase decision makers), the terminals 52 to 5n return their current compressed image frame data F2 to Fn (including at least the purchase decision makers' faces) and audio data (the purchase decision makers' voices) to the data acquisition unit 41. The data acquisition unit 41 repeats this image frame data acquisition operation throughout the TV conference.
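One round of the request-and-reply acquisition described above might be sketched as follows. This is only an illustration: the specification leaves the transport protocol open ("an appropriate protocol"), so the `Terminal` class, its `handle_send_request` method, and `acquire_round` are hypothetical names, and the network exchange is replaced by a direct method call.

```python
from dataclasses import dataclass


@dataclass
class Terminal:
    """Hypothetical stand-in for a conference terminal 5 (not from the specification)."""
    name: str
    frame: bytes  # latest compressed image frame data
    audio: bytes  # latest compressed audio data

    def handle_send_request(self):
        # A real terminal would capture and compress camera and microphone data here.
        return self.frame, self.audio


def acquire_round(terminals):
    """Poll the terminals in order (51, 52, ..., 5n) and collect one round of data."""
    frames, audio = [], []
    for terminal in terminals:
        frame, sound = terminal.handle_send_request()
        frames.append(frame)
        audio.append(sound)
    return frames, audio
```

During a conference the server would call something like `acquire_round` repeatedly and pass each round on for compositing and audio synthesis.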
[0024]
The image frame data F1 to Fn acquired by the data acquisition unit 41 are synchronized and supplied to the image data compression/synthesis unit 42, and the audio data acquired by the data acquisition unit 41 are synchronized and supplied to the audio data synthesis unit 43.
[0025]
Upon receiving the image frame data F1 to Fn, the image data compression/synthesis unit 42 of the server 4 compresses each of them. Any appropriate known compression method can be adopted for compressing the image data, but each image frame data must be compressed to 1/n or less. The image data compression/synthesis unit 42 treats the compressed image frame data F1 to Fn as sub-frame data Fs1 to Fsn, respectively, and where each of these sub-frame data is to be incorporated into the "one" image frame data F is set in the unit in advance.
[0026]
FIG. 5 shows a configuration example of an image M (frame data F) displayed on each display 49 of each terminal 5. As shown in FIG. 5, for example, the sub-frame data sequence (Fs1 to Fsn) is rearranged so that each of the sub-images M1 to Mn based on the sub-frame data Fs1 to Fsn is assigned to one of the sections obtained by dividing the screen M into n pieces. The acquired image frame data F1 to Fn are thereby synthesized into one image frame data F. The synthesized image frame data F is supplied to the data transmission unit 44.
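The compression and tiling described in paragraphs [0025] and [0026] can be illustrated with a minimal sketch. It assumes that frames are plain 2D lists of pixel values, that the "compression" is a simple nearest-neighbour downscale (the specification only requires some known method that reduces each frame to 1/n or less), and that the sections form a near-square grid. The function names `grid_shape`, `downscale`, and `compose` are hypothetical.

```python
import math


def grid_shape(n):
    """Return (rows, cols) of a near-square grid with at least n sections."""
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    return rows, cols


def downscale(frame, out_h, out_w):
    """Nearest-neighbour downscale of a 2D list of pixels to out_h x out_w."""
    in_h, in_w = len(frame), len(frame[0])
    return [[frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def compose(frames, height, width, fill=0):
    """Pack frames F1..Fn into one height x width frame F, one section each."""
    rows, cols = grid_shape(len(frames))
    tile_h, tile_w = height // rows, width // cols
    canvas = [[fill] * width for _ in range(height)]
    for i, frame in enumerate(frames):
        sub = downscale(frame, tile_h, tile_w)  # plays the role of sub-frame Fs(i+1)
        top, left = (i // cols) * tile_h, (i % cols) * tile_w
        for y in range(tile_h):
            canvas[top + y][left:left + tile_w] = sub[y]
    return canvas
```

Because the grid always has at least n sections, each tile covers at most 1/n of the composite frame, so the composite keeps a fixed size however many frames are packed into it.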
[0027]
The audio data synthesis unit 43 receives from the data acquisition unit 41 the audio data sent from the terminals 51 to 5n along with the image frame data F1 to Fn, synthesizes them (compressing them if necessary), and supplies the result to the data transmission unit 44.
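The specification does not say how the audio data are synthesized. One common approach, shown here purely as an assumption, is to decode each stream to time-aligned 16-bit PCM samples, sum them sample by sample, and clip the result. The function name `mix_audio` is hypothetical.

```python
def mix_audio(streams, lo=-32768, hi=32767):
    """Sum time-aligned PCM sample lists and clip to the signed 16-bit range."""
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        # Streams shorter than the longest one simply contribute nothing past their end.
        total = sum(s[i] for s in streams if i < len(s))
        mixed.append(max(lo, min(hi, total)))
    return mixed
```

Clipping is the simplest guard against overflow; a production mixer would more likely scale or limit the signal to avoid audible distortion.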
[0028]
When receiving the image frame data F and the audio data from the image data compression / synthesis unit 42 and the audio data synthesis unit 43, the data transmission unit 44 multicasts them to all of the terminals 51 to 5n of the TV conference participants. As a result, the sub-images M1 to Mn corresponding to the sub-frame data Fs1 to Fsn are displayed on one screen M on each terminal 5 as shown in FIG.
[0029]
Since the image frame data F multicast to the terminals 51 to 5n has been made into a single image frame data by the arranging and adjusting process on the server side, the amount of image data transmitted is reduced compared with the conventional Internet TV conference system, which multicasts a plurality of image frame data to each terminal, and the load on the Internet 1, the server 4, and each terminal 5 is therefore reduced. Moreover, because the method keeps the amount of image data from exceeding a fixed amount no matter how many terminals participate, the load on the Internet is reduced, and the data communication speed of the image frame data F, which combines the data from all terminals, together with the audio data is kept to, for example, 300 kbps or less. This reduction in the amount of image data grows as the number of terminals 5 grows. Therefore, a TV conference system with stable audio and image quality can be realized even for a simultaneous conference of hundreds of people.
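The scaling argument in this paragraph can be checked with simple arithmetic. In the sketch below, the 300 kbps ceiling for the composite stream is the example figure given in the text, while the per-stream rate of the conventional system is an assumed input, since the specification gives no value for it. All function names are hypothetical.

```python
def conventional_downstream_kbps(n_terminals, per_stream_kbps):
    """Conventional system: each terminal receives all n separate streams."""
    return n_terminals * per_stream_kbps


def composite_downstream_kbps(n_terminals, composite_kbps=300):
    """Described system: each terminal receives one composite stream whose
    rate does not depend on the number of terminals."""
    return composite_kbps


def reduction_factor(n_terminals, per_stream_kbps, composite_kbps=300):
    """How many times less data each terminal receives with the composite stream."""
    conventional = conventional_downstream_kbps(n_terminals, per_stream_kbps)
    return conventional / composite_downstream_kbps(n_terminals, composite_kbps)
```

For example, if each conventional stream were assumed to need 300 kbps, ten participants would require 3000 kbps downstream per terminal in the conventional system against 300 kbps here, and the ratio grows linearly with the number of terminals.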
[0030]
FIG. 6 shows another configuration example of the image M (frame data F) displayed on each display 49 of each terminal 5.
The above describes an example in which all participants of the TV conference return image data and audio data to the data acquisition unit 41 of the server 4, but even a purchase decision maker who does not have a camera can participate in the TV conference by voice only. For example, when the terminal 53 is not provided with the camera 49, an appropriate still image stored in advance in the still image storage unit 46 (for example, a still image of the owner of the terminal 53, or a notice that there is no camera) is assigned as the sub-image M3 to the sub-frame data FsNC, which takes the place of the sub-frame data Fs3 in the synthesized image frame data F. When a terminal without a camera participates in the TV conference, the image data compression/synthesis unit 42 can also automatically assign the still image from the still image storage unit 46 to the last section (the position of the sub-image Mn).
[0031]
Also, even if the number of sub-images is set to n so that a maximum of n people can participate in the TV conference, the actual number of participants may be smaller than n, for example (n-2). In such a case, the image data compression/synthesis unit 42 incorporates an appropriate still image from the still image storage unit 46 (for example, an image indicating that no conference participant is present) into the section positions corresponding to the remaining two sub-images Mn-1 and Mn. Alternatively, the image data compression/synthesis unit 42 may be configured so that the number of sub-images (that is, the number of sections) in the one image M displayed on each terminal 5, and the compression ratio, can be changed according to the number of participants.
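The placeholder handling in paragraphs [0030] and [0031] amounts to filling every section of the composite frame with either a live frame or a stored still image. The sketch below assumes that a participating terminal with no entry in the frame map has no camera, and it represents the still images with the hypothetical string markers `"NO_CAMERA"` and `"ABSENT"`. The function name `fill_sections` is likewise an illustration, not a name from the specification.

```python
def fill_sections(frames_by_terminal, participants, n_sections,
                  no_camera_image="NO_CAMERA", absent_image="ABSENT"):
    """Return the n_sections images to tile into one composite frame.

    frames_by_terminal maps a terminal id to its latest frame. A terminal
    listed in participants but missing from the map is treated as having no
    camera. Surplus sections receive the 'absent' still image.
    """
    if len(participants) > n_sections:
        raise ValueError("more participants than sections")
    sections = [frames_by_terminal.get(t, no_camera_image) for t in participants]
    sections += [absent_image] * (n_sections - len(sections))
    return sections
```

This keeps each camera-less terminal in its own position, as in the terminal 53 example. Moving such terminals to the last section, which the text mentions as another option, would only require reordering `participants` before the call.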
[0032]
FIG. 7 shows another configuration example of the image M (frame data F) displayed on each display 49 of each terminal 5.
Further, the image data compression/synthesis unit 42 may incorporate a speaker sub-image MSP into the image M displayed on each terminal 5. For example, the system can be configured so that only a participant (seller or purchase decision maker) who operates a specific key on his or her terminal 5 gets the floor; in this case, the data acquisition unit 41 can identify the speaker by identifying the terminal on which the key operation for getting the floor was performed. Alternatively, the data acquisition unit 41 can determine which terminal's participant is the speaker by performing voice detection. When voice data is transmitted from a plurality of terminals, the one with the loudest voice is identified as the speaker.
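The two ways of identifying the speaker could be sketched as below. The specification presents the key operation and loudness detection as alternatives and does not define how loudness is measured. Using the RMS level of decoded PCM samples, and letting a key signal take priority when both are available, are assumptions made for this illustration. The names `rms` and `pick_speaker` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of PCM samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pick_speaker(audio_by_terminal, key_pressed=None):
    """Return the terminal id treated as the current speaker.

    A floor-request key signal, if present, takes priority (an assumption);
    otherwise the terminal with the loudest audio is chosen.
    Returns None when there is no input at all.
    """
    if key_pressed is not None:
        return key_pressed
    if not audio_by_terminal:
        return None
    return max(audio_by_terminal, key=lambda t: rms(audio_by_terminal[t]))
```

The returned terminal id would then tell the compositing step which frame to enlarge into the speaker sub-image.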
[0033]
For example, when the seller at the terminal 51 is speaking, the speaker sub-image MSP is preferably the sub-frame data FSP obtained by enlarging the sub-frame data Fs1. The sub-frame data FSP is synthesized in the image data compression / synthesis unit 42 together with the other sub-frame data Fs1 to Fsn. Preferably, sub-frame data corresponding to an image SP, placed at the outer edge of the sub-image M1 to identify the speaker, is also synthesized together.
[0034]
As shown in FIG. 7, each display 49 of each terminal 5 can also display a text chat image MCH in addition to the image M. When displaying the text chat image MCH, the server 2 needs to further have a function as a chat service server, and each terminal 5 needs to further have a chat program 48 '.
[0035]
[Material editing mode]
When this mode is selected at a terminal 5, the terminal displays a creation / editing screen (not shown) for business negotiation materials (for example, (1) market research results on dietary supplements, (2) a detailed explanation of the structure of the nasal irrigator, and (3) in-store test sales results at the purchase decision-maker's store), thereby enabling new materials to be created and existing materials to be modified. This material creation and editing function can be realized by using general-purpose document creation and editing software. The created material is stored in the material storage unit 27 in the system server 2 and is accessible in common by the conference participants. Note that this material creation / editing function may be provided only on the seller's terminal 51 and omitted from the purchase decision-makers' terminals 52 to 5n.
[0036]
[Presentation mode]
When this mode is selected in each terminal, it is possible to access the material stored in the material storage unit 27 and display the material. When a presentation is made in a TV conference using materials, the materials are provided and displayed via the server 2 to all the terminals 51 to 5n participating in the TV conference.
[0037]
In this case, the material data is transmitted to the server 2 from the terminal of the presenter (for example, the terminal 51 of the seller), and the server 2 notifies all the terminals 5 that a presentation is to be made, in accordance with the instruction from the presenter. This notification is displayed on the monitor screen of each terminal 5, and once the notification is displayed, the terminal can be switched to the presentation mode.
[0038]
In addition, the image frame data F and the audio data are also multicast, together with the material data for the presentation, from the data transmission unit 44 to all the terminals 51 to 5n, so that each participant can individually select whether to put his or her own terminal into the presentation mode or into the user image display mode.
[0039]
The present invention is configured as described above. Image data that has undergone primary compression by software on the terminal 5 side of the TV conference system is received at the system server, where it is synchronized, ordered, and arranged, and then subjected to secondary compression so that a plurality of image data are combined into one image. Because this method is adopted, the amount of data traveling from the system server 4 to each terminal 5 can be reduced. Therefore, even if the number of conference participants increases to several hundred, the data amount can be kept stable at a relatively small value (for example, 300 kbps or less).
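The reasoning behind the stable data amount can be illustrated with simple arithmetic. The sketch below is an idealized model, not a measurement: it assumes each incoming stream would cost `per_stream_kbps` if forwarded unchanged, and that each sub-frame is compressed to exactly 1/n of that rate before the n sub-frames are combined. The function names are hypothetical, and real codecs will not scale this neatly.

```python
def downstream_kbps_separate(n_participants: int, per_stream_kbps: float) -> float:
    """Per-terminal downstream rate if all n streams were forwarded as they are."""
    return n_participants * per_stream_kbps


def downstream_kbps_composite(n_participants: int, per_stream_kbps: float) -> float:
    """Per-terminal downstream rate for one composite frame of n sub-frames.

    Idealized assumption: each sub-frame is compressed to 1/n of the
    single-stream rate, so the n sub-frames together cost one stream.
    """
    sub_frame_kbps = per_stream_kbps / n_participants
    return n_participants * sub_frame_kbps
```

Under these assumptions, with a 300 kbps single-stream budget and four participants, forwarding separate streams would need 1200 kbps per terminal, while the composite frame stays at 300 kbps; the composite figure does not grow with the number of participants.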
[0040]
Therefore, according to the present invention, it is possible to realize an epoch-making sales support system using a TV conference system that greatly reduces the network load of high-quality image, audio, and presentation data and stably maintains a constant load.
[0041]
(Second embodiment)
Hereinafter, a second embodiment of the present invention will be described with reference to the drawings. In the second embodiment, means for promoting or proving a contract using the Internet TV conference system is added to the first embodiment. Therefore, the added means will be described in detail below.
[0042]
[Voice analysis unit 71]
FIG. 8 shows a functional block diagram of the server 4 according to the second embodiment. As shown in FIG. 8, the server 4 further includes a voice analysis unit 71. The voice analysis unit 71 stores in advance voice analysis result data (for example, voiceprint data) of the sellers and purchase decision-makers who are permitted to participate in the TV conference. The voice analysis unit 71 analyzes the voice data acquired by the data acquisition unit 41 and supplies only the voice data that matches the previously stored voice analysis result data to the voice data synthesis unit 43. At this time, the voice analysis unit 71 also supplies only the image frame data corresponding to the matching voice data to the image data compression / synthesis unit 42, and instructs the data transmission unit 44 to transmit the synthesized image frame data and voice data only to the terminals corresponding to the matching voice data.
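The gating performed by the voice analysis unit 71 can be sketched as follows. The description does not specify how voiceprint data is represented or matched, so this example assumes, purely for illustration, that each voiceprint is a numeric feature vector and that a match means cosine similarity at or above a threshold. The names `cosine_similarity`, `admit_terminals`, and the threshold value are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def admit_terminals(
    features_by_terminal: Dict[str, Sequence[float]],
    enrolled_voiceprints: List[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not from the description
) -> List[str]:
    """Return the terminals whose voice matches a voiceprint stored in advance.

    Only these terminals' audio and image data would be passed on for
    synthesis, and only they would receive the synthesized data.
    """
    return [
        terminal_id
        for terminal_id, features in features_by_terminal.items()
        if any(cosine_similarity(features, ref) >= threshold
               for ref in enrolled_voiceprints)
    ]
```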
[0043]
The voice analysis unit 71 analyzes the voice data acquired by the data acquisition unit 41 at the time when each voice data is first acquired in response to a transmission request from the data acquisition unit 41 to each terminal 5. Further, the voice analysis unit 71 can also analyze voice data at predetermined intervals (for example, every 30 minutes) during the TV conference. Alternatively, the voice analysis unit 71 can analyze voice data continuously during the TV conference.
[0044]
In this way, only the participants who have been permitted in advance can participate in the TV conference, so that each participant can use the Internet TV conference system with confidence.
[0045]
[Combined data storage unit 72]
The server 4 further includes a combined data storage unit 72. The combined data storage unit 72 stores the image frame data and the audio data that have been combined by the image data compression / synthesis unit 42 and the audio data synthesis unit 43 and are subsequently transmitted from the data transmission unit 44.
[0046]
The combined data storage unit 72 starts to store the combined image frame data and audio data at the time when any one of the image frame data is first acquired in response to a transmission request from the data acquisition unit 41 to each terminal 5. Thereafter, the combined data storage unit 72 continuously stores the combined image frame data and audio data during the TV conference. After the end of the TV conference, when the user image display mode has been released on all of the terminals 5, the combined data storage unit 72 stops storing data.
[0047]
As described above, since the contents of the negotiation in the TV conference can be stored, it is possible to easily prove a contract act performed during the negotiation.
Further, among the terminals 5, one terminal of the seller (for example, the terminal 51) and one terminal of a purchase decision-maker (for example, the terminal 52) can each be provided with a storage start button and a storage stop button (not shown). In this case, when the first storage start button of the terminal 51 is pressed by the user of the terminal 51 (the seller), the terminal 51 transmits a first storage start signal to the server 4 via the Internet 1. Similarly, when the second storage start button of the terminal 52 is pressed by the user of the terminal 52 (a representative of the purchase decision-makers), the terminal 52 transmits a second storage start signal to the server 4 via the Internet 1. After receiving the first or second storage start signal, the combined data storage unit 72 of the server 4 starts storing the combined image frame data and audio data.
[0048]
Further, during or after the TV conference, when the storage stop button is pressed at one of the terminals of the seller or the purchase decision-maker (for example, when the second storage stop button of the terminal 52 is pressed by the user of the terminal 52, a representative of the purchase decision-makers), that terminal transmits a storage stop signal (in this example, the terminal 52 transmits a second storage stop signal) to the server 4 via the Internet 1. After receiving the first or second storage stop signal (for example, the second storage stop signal), the combined data storage unit 72 of the server 4 stops storing the data.
[0049]
As described above, since only the contents of the negotiation in the TV conference from the time the storage start button is pressed to the time the storage stop button is pressed are stored, the capacity required for storing data can be reduced.
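The button-driven storage behavior of paragraphs [0047] to [0049] amounts to a small two-state machine, sketched below. The class name `CombinedDataRecorder`, the signal strings, and the in-memory list are hypothetical stand-ins chosen for this example; the description does not specify signal formats or the storage medium.

```python
from typing import List, Tuple

# Hypothetical signal names for the four button events in the text.
START_SIGNALS = {"first_storage_start", "second_storage_start"}
STOP_SIGNALS = {"first_storage_stop", "second_storage_stop"}


class CombinedDataRecorder:
    """Stores combined frame/audio pairs only between a start and a stop signal."""

    def __init__(self) -> None:
        self.recording = False
        self.stored: List[Tuple[bytes, bytes]] = []

    def on_signal(self, signal: str) -> None:
        # Either party's start button begins storage; either stop button ends it.
        if signal in START_SIGNALS:
            self.recording = True
        elif signal in STOP_SIGNALS:
            self.recording = False

    def on_combined_data(self, image_frame: bytes, audio: bytes) -> None:
        # Called for every combined frame that is multicast to the terminals.
        if self.recording:
            self.stored.append((image_frame, audio))
```

Because data arriving outside the start/stop window is discarded, the stored volume is limited to the part of the negotiation the parties chose to record.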
[0050]
(Third embodiment)
Hereinafter, a third embodiment of the present invention will be described with reference to the drawings. Note that the third embodiment is a modification of the method of using the first embodiment. Therefore, the changed use method (changed configuration) will be described in detail below.
[0051]
FIG. 9 is a diagram schematically illustrating a configuration example of a sales support system 10 according to the third embodiment.
As shown in FIG. 9, the sales support system 10 according to the third embodiment includes a server 4 connectable to the Internet 3, at least one terminal 61, 62, 63, ..., 6m (hereinafter referred to as "terminal 6"), and at least two cameras 71, 72, ..., 7n installed in stores (hereinafter referred to as "camera 7").
[0052]
The server 4 in FIG. 9 includes a data acquisition unit 41, an image data compression / synthesis unit 42, and a data transmission unit 44. It is not necessary to provide the still image storage unit 46 and the material storage unit 47.
[0053]
Further, each terminal 6 (for example, a personal computer) in FIG. 9 stores in advance a program module (PM) 48 for executing an “in-store image display mode” for displaying an image of the inside of a store, and need not store in advance the program modules (PM) 48 for executing the “user image display mode”, the “material creation / editing mode”, and the “presentation mode”. In addition, each terminal 6 performs data communication with the data acquisition unit 41 and the data transmission unit 44 of the server 4 using an appropriate protocol, and a program (P) 48 ′ for that purpose is also stored in each terminal 6 in advance. Each terminal 6 includes a display 49 and a video codec 49 ', and does not need to include a camera, a microphone and a speaker 49, or an audio codec 49'.
[0054]
Each camera 7 in the store in FIG. 9 includes a video codec 49 ′. Preferably, each camera 7 is arranged so as to capture a shelf in the store where products are arranged, a customer trying to enter the store, and the like. Note that a plurality of cameras can be arranged in the same store.
[0055]
Hereinafter, the operation of the “in-store image display mode” executed in the terminals 61 to 6m will be described.
[In-store image display mode]
When the in-store image display mode is selected in each terminal 6, the data acquisition unit 41 of the server 4 acquires image frame data from the cameras 71 to 7n installed in the stores, for example, in the order of the cameras 71 to 7n.
[0056]
That is, the data acquisition unit 41 first outputs a transmission request (a request instructing transmission of image data) to the camera 71. In response, the camera 71 returns the image frame data F1 (an image of the inside of the store: for example, a customer near the merchandise shelf and the surroundings, or a customer near the entrance who is about to enter the store) to the data acquisition unit 41. Similarly, when the data acquisition unit 41 sequentially transmits transmission requests to the cameras 72 to 7n, the cameras 72 to 7n return the current image frame data F2 to Fn to the data acquisition unit 41. The data acquisition unit 41 repeatedly performs this operation of acquiring image frame data while operating in this mode.
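The request-and-return cycle described above can be sketched as one polling pass over the cameras in a fixed order. In this illustration each camera is modeled as a plain callable that returns its current frame when asked; this model and the function name `poll_cameras_once` are assumptions made for the example, since the description does not define the transport or the request format.

```python
from typing import Callable, Dict, List


def poll_cameras_once(
    cameras: Dict[str, Callable[[], bytes]],
    order: List[str],
) -> List[bytes]:
    """Send one transmission request to each camera in order and collect the frames.

    The returned list keeps the order of acquisition, which is also the order
    in which frames would be handed to the compression / synthesis step.
    """
    frames = []
    for camera_id in order:
        request_frame = cameras[camera_id]
        frames.append(request_frame())  # request -> current frame returned
    return frames
```

Repeating this pass for as long as the mode is active corresponds to the repeated acquisition described in the text.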
[0057]
The image frame data F1 to Fn acquired by the data acquisition unit 41 are supplied to the image data compression / combination unit 42 in the order of acquisition.
Upon receiving the image frame data F1 to Fn, the image data compression / synthesis unit 42 of the server 4 compresses the data. For the compression of the image data, an appropriate known compression method can be adopted, but each image frame data must be compressed to 1 / n or less. The image data compression / synthesis unit 42 treats the compressed image frame data F1 to Fn as sub-frame data Fs1 to Fsn, respectively, and combines them into "one" image frame data F; the position within the image frame data F at which each sub-frame data is to be incorporated is set in advance. The synthesized image frame data F is supplied to the data transmission unit 44.
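The compress-then-place step can be sketched with frames modeled as 2-D lists of pixel values. This is a simplified stand-in, not the method of the description: spatial subsampling is used here in place of the "appropriate known compression method", a square grid filled in row-major order is assumed as the preset position rule, and the names `downsample` and `compose` are hypothetical. With a grid of k by k cells where k is the smallest integer with k*k >= n, each sub-frame occupies at most 1/n of the frame area.

```python
import math
from typing import List

Frame = List[List[int]]  # rows of pixel values (assumed grayscale for simplicity)


def downsample(frame: Frame, k: int) -> Frame:
    """Keep every k-th pixel in both directions (crude stand-in for compression)."""
    return [row[::k] for row in frame[::k]]


def compose(frames: List[Frame]) -> Frame:
    """Combine n equally sized frames into one frame of the same size."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    k = math.isqrt(n - 1) + 1            # smallest k with k * k >= n
    sub_h, sub_w = height // k, width // k

    combined = [[0] * width for _ in range(height)]
    for index, frame in enumerate(frames):
        sub = downsample(frame, k)
        # Preset position: row-major cell of the k x k grid.
        top, left = (index // k) * sub_h, (index % k) * sub_w
        for r in range(sub_h):
            for c in range(sub_w):
                combined[top + r][left + c] = sub[r][c]
    return combined
```

For example, four 4x4 frames filled with the constants 1 to 4 combine into a single 4x4 frame whose quadrants hold 1, 2, 3, and 4.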
[0058]
Upon receiving the image frame data F from the image data compression / synthesis unit 42, the data transmission unit 44 multicasts it to all of the terminals 61 to 6m of the video monitoring participants.
[0059]
Since the image frame data F multicast to the terminals 61 to 6m is a single image frame data, the amount of image data to be transmitted is reduced in comparison with an Internet video surveillance system adapted from the conventional Internet TV conference system, which multicasts a plurality of image frame data to each terminal. Thus the loads on the Internet 1, the server 4, and each terminal 6 are reduced.
[0060]
Note that the server 4 according to the third embodiment can also store the image frame data F like the combined data storage unit 72 according to the second embodiment.
The present invention is configured as described above, and the user of each terminal 6 can grasp and analyze in real time data on customer behavior in the store (flow lines, purchase results), and can propose optimum purchase quantities of products for the store, model shelf allocations, and the like.
[0061]
Further, it is possible to recognize a customer who has entered the store, and it is possible to create an advertisement suitable for the customer.
[Brief description of the drawings]
FIG. 1 is a diagram schematically illustrating a configuration example of a conventional Internet TV conference system.
FIG. 2 is a diagram schematically illustrating a configuration example of a sales support system 10 of the present invention.
FIG. 3 shows a functional block diagram of a server 4 in FIG.
FIG. 4 is a schematic functional block diagram of a terminal 5 in FIG. 2.
FIG. 5 shows a configuration example of an image M (frame data F) displayed on each display 49 of each terminal 5.
FIG. 6 shows another configuration example of an image M (frame data F) displayed on each display 49 of each terminal 5.
FIG. 7 shows another configuration example of an image M (frame data F) displayed on each display 49 of each terminal 5.
FIG. 8 is a functional block diagram of a server 4 according to the second embodiment.
FIG. 9 is a diagram schematically illustrating a configuration example of a sales support system 10 according to a third embodiment.

Claims

1. A sales support system using a TV conference system that comprises at least three TV conference terminals connected to the Internet and a server connected to the Internet, and that performs a TV conference among at least three users via the Internet, wherein
the server comprises:
data acquisition means for acquiring, from each terminal, image frame data and audio data compressed in that terminal;
image data compression / synthesis means for compressing the acquired plurality of image frame data and ordering and arranging them while synchronizing them, thereby assigning the acquired plurality of image frame data to a plurality of sections obtained by dividing one image frame and synthesizing them into one image frame data;
means for synthesizing the acquired plurality of audio data; and
data distribution means for multicasting the synthesized image frame data and audio data to each of the user terminals,
at least one terminal is a terminal for a seller, and
at least two terminals are terminals for purchase decision-makers.

2. The sales support system according to claim 1, wherein
at least one of the plurality of terminals for purchase decision-makers does not have a camera for the TV conference,
the server of the TV conference system further includes storage means for storing a still image, and
the image data compression / synthesis means is configured to read out a still image from the storage means and assign it to the section, within the one image frame, that corresponds to the user terminal having no camera.

3. The sales support system according to claim 2, wherein, when the number of terminals for purchase decision-makers participating in the TV conference is less than the number of sections of the one image frame, the image data compression / synthesis means further reads out a still image from the storage means and assigns it to each surplus section.

4. The sales support system according to any one of claims 1 to 3, wherein
the data acquisition means further comprises means for detecting the audio data having the largest volume among the acquired audio data, and
the image data compression / synthesis means is further configured to process the image frame data from the terminal corresponding to the detected audio data having the largest volume so that its image is enlarged in comparison with the images from the other terminals, and to assign it to one of the sections.

5. The sales support system according to any one of claims 1 to 3, wherein
the data acquisition means further comprises means for detecting a key operation signal, sent from a terminal, that indicates the speaker, and
the image data compression / synthesis means is further configured to process the image frame data from the terminal that transmitted the detected key operation signal indicating the speaker so that its image is enlarged in comparison with the images from the other terminals, and to assign it to one of the sections.

6. The sales support system according to any one of claims 1 to 5, wherein
the server further comprises means for storing in advance voice analysis result data of the seller and the purchase decision-makers who are permitted to participate in the TV conference, analyzing the acquired voice data, and determining whether or not it matches the previously stored voice analysis result data,
the means for synthesizing a plurality of voice data synthesizes only the voice data that matches the voice analysis result data, and
the image data compression / synthesis means is configured to compress only the image frame data corresponding to the matching voice data and synthesize it into one image frame data.

7. The sales support system according to any one of claims 1 to 6, wherein the server further comprises means for storing the combined image frame data and audio data.

8. A sales support system using a video surveillance system that comprises at least one video monitoring terminal connected to the Internet, at least two cameras connected to the Internet, and a server connected to the Internet, and that monitors video from the at least two cameras via the Internet, wherein
the server comprises:
data acquisition means for acquiring image frame data from each camera;
image data compression / synthesis means for compressing the acquired plurality of image frame data and assigning the plurality of compressed image frame data to a plurality of sections obtained by dividing one image frame, thereby synthesizing them into one image frame data; and
data distribution means for multicasting the synthesized image frame data to the terminal, and
the cameras are arranged in a store having products.