JP2018101965A - System, method for distributing video, and program for use therein

Info

Publication number: JP2018101965A
Application number: JP2016248525A
Authority: JP
Inventors: 正史吉田; Masashi Yoshida
Original assignee: DeNA Co Ltd
Current assignee: DeNA Co Ltd
Priority date: 2016-12-21
Filing date: 2016-12-21
Publication date: 2018-06-28
Anticipated expiration: 2036-12-21
Also published as: JP6426136B2

Abstract

PROBLEM TO BE SOLVED: To suppress deterioration in the voice quality of a call between a distributor terminal that distributes a video and a viewer terminal for viewing that video.

SOLUTION: A video distribution system 1 includes a distribution server 10 and multiple user terminals 30 communicably connected to the distribution server 10 via a network 20. The distribution server 10 provides a video distribution service that distributes a distribution video transmitted from a user terminal 30 to multiple user terminals 30. The video distribution system 1 is configured so that, in response to the start of a prescribed communication between a participant terminal 30 participating in the distribution video and a distributor terminal 30, the participant terminal 30 outputs audio received in the prescribed communication in place of the audio contained in the distribution video received from the distribution server 10.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a system and a method for distributing video, and to a program used therein.

Conventionally, video distribution services that distribute video transmitted from a distributor terminal to a plurality of viewer terminals have enabled communication between the distributor and viewers. For example, Patent Document 1 below discloses that a content server, which distributes live video transmitted from a distributor terminal, distributes the content of a voice call between the distributor and a specific viewer together with the live video. Such a system allows a viewer to participate in the distributor's live video via a voice call.

Patent Document 1: JP 2011-172200 A

However, in the conventional system described above, problems can arise because the content of the voice call between the distributor and a participant in the live video is also transmitted, together with the live video, to the participant's own terminal via the content server. Specifically, for example, when the participant's voice output from a speaker at the distributor terminal is picked up by the microphone, that voice is output at the participant terminal together with the live video after a certain delay. The participant's own voice output with such a delay (an echo) degrades voice quality, for example by causing howling, and hinders smooth communication between the participant and the distributor.

One object of embodiments of the present invention is to suppress deterioration in the voice quality of a call between a distributor terminal that distributes video and a viewer terminal for viewing that video. Other objects of embodiments of the present invention will become apparent from the specification as a whole.

A system according to an embodiment of the present invention is a system for distributing video that includes a distributor terminal, a distribution server, and a plurality of viewer terminals, and executes: a step in which the distributor terminal transmits, to the distribution server, a distribution video including at least real-time audio input via a microphone; a step in which the distribution server transmits the distribution video received from the distributor terminal to each of the plurality of viewer terminals; a step in which each of the plurality of viewer terminals outputs the image and audio included in the distribution video received from the distribution server; a step of starting, between a participant terminal included in the plurality of viewer terminals and the distributor terminal, a predetermined communication in which at least a call can be made; and a step in which the participant terminal, in response to the start of the predetermined communication, outputs audio received in the predetermined communication in place of the audio included in the distribution video received from the distribution server.

A method according to an embodiment of the present invention is a method by which a system including a distributor terminal, a distribution server, and a plurality of viewer terminals distributes video, and includes: a step in which the distributor terminal transmits, to the distribution server, a distribution video including at least real-time audio input via a microphone; a step in which the distribution server transmits the distribution video received from the distributor terminal to each of the plurality of viewer terminals; a step in which each of the plurality of viewer terminals outputs the image and audio included in the distribution video received from the distribution server; a step of starting, between a participant terminal included in the plurality of viewer terminals and the distributor terminal, a predetermined communication in which at least a call can be made; and a step in which the participant terminal, in response to the start of the predetermined communication, outputs audio received in the predetermined communication in place of the audio included in the distribution video received from the distribution server.

A first program according to an embodiment of the present invention is a program executed on the distributor terminal in a system for distributing video that includes a distributor terminal, a distribution server, and a plurality of viewer terminals. The program causes the distributor terminal to execute: a process of transmitting, to the distribution server, a distribution video including at least real-time audio input via a microphone; a process of starting a predetermined communication, in which at least a call can be made, with a participant terminal included in the plurality of viewer terminals that output the image and audio included in the distribution video received from the distribution server; and a process of adding, in response to the start of the predetermined communication, audio received in the predetermined communication to the distribution video transmitted to the distribution server, in addition to the real-time audio input via the microphone. The process of adding the received audio includes executing a process of removing, from the audio input via the microphone, an echo component of the audio received in the predetermined communication.

A second program according to an embodiment of the present invention is a program executed on the viewer terminal in a system for distributing video that includes a distributor terminal, a distribution server, and a plurality of viewer terminals. The program causes the viewer terminal to execute: a process of outputting the image and audio included in a distribution video received from the distribution server, the distribution video being received by the distribution server from the distributor terminal and including at least real-time audio input via a microphone of the distributor terminal; a process of starting a predetermined communication, in which at least a call can be made, with the distributor terminal; and a process of outputting, in response to the start of the predetermined communication, audio received in the predetermined communication in place of the audio included in the distribution video received from the distribution server.

Various embodiments of the present invention suppress deterioration in the voice quality of a call between a distributor terminal that distributes video and a viewer terminal for viewing that video.

FIG. 1 is a configuration diagram schematically showing the network configuration of a video distribution system 1 according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically showing the functions of the video distribution system 1.
FIG. 3 is a diagram showing an example of a main screen 60.
FIG. 4 is a diagram showing an example of a distribution preparation screen 70.
FIG. 5 is a diagram showing an example of a distribution screen 80.
FIG. 6 is a diagram showing an example of a viewing screen 90.
FIG. 7 is a sequence diagram showing an example of the processing executed between the distributor terminal 30 and a viewer terminal 30 when a viewer participates in a screen video.
FIG. 8 is a diagram showing an example of the distribution screen 80 when a participation request from a viewer terminal 30 has been received.
FIG. 9 is a diagram for explaining audio input and output at the distributor terminal 30 and a viewer terminal 30 during normal operation.
FIG. 10 is a diagram for explaining audio input and output at the distributor terminal 30, the participant terminal 30, and the other viewer terminals 30 while a participant is participating in a screen video.
FIG. 11 is a diagram for explaining the details of audio input and output in the communication control unit 57 of the distributor terminal 30.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a configuration diagram schematically showing the network configuration of a video distribution system 1 according to an embodiment of the present invention. As illustrated, the video distribution system 1 includes a distribution server 10 and a user terminal 30 communicably connected to the distribution server 10 via a network 20 such as the Internet. Although only one user terminal 30 is illustrated in FIG. 1, the video distribution system 1 includes a plurality of user terminals 30. The distribution server 10 provides a video distribution service that distributes a distribution video transmitted by a distributor's user terminal 30 (hereinafter sometimes referred to as the “distributor terminal 30”) to viewers' user terminals 30 (hereinafter sometimes referred to as “viewer terminals 30”). In the present embodiment, a user of the video distribution service can distribute video as a distributor and can also view, as a viewer, video distributed by other users.

The distribution server 10 is configured as a general-purpose computer and, as shown in FIG. 1, includes a CPU (computer processor) 11, a main memory 12, a user I/F 13, a communication I/F 14, and a storage (storage device) 15. These components are electrically connected via a bus or the like (not shown).

The CPU 11 loads various programs stored in the storage 15 or the like into the main memory 12 and executes the instructions included in those programs. The main memory 12 is configured by, for example, a DRAM.

The user I/F 13 includes various input/output devices for exchanging information with a user. The user I/F 13 includes, for example, information input devices such as a keyboard and a pointing device (for example, a mouse or a touch panel), an audio input device such as a microphone, and an image input device such as a camera. The user I/F 13 also includes an image output device such as a display and an audio output device such as a speaker.

The communication I/F 14 is implemented as hardware such as a network adapter, various types of communication software, or a combination of these, and is configured to enable wired or wireless communication via the network 20 or the like.

The storage 15 is configured by, for example, a magnetic disk or a flash memory. The storage 15 stores various programs, including an operating system, as well as various data.

In the present embodiment, the distribution server 10 can be configured using a plurality of computers, each having the hardware configuration described above. For example, the distribution server 10 can be configured by one or more server devices.

The distribution server 10 configured in this way functions as a web server and an application server. In response to requests from a web browser or other applications installed on the user terminal 30 (for example, a video distribution service application), it executes various processes and transmits screen data (for example, HTML data), control data, and the like corresponding to the results of those processes to the user terminal 30. The user terminal 30 displays a web page or other screen based on the received data.

The user terminal 30 is configured as a general-purpose computer and, as shown in FIG. 1, includes a CPU (computer processor) 31, a main memory 32, a user I/F 33, a communication I/F 34, and a storage (storage device) 35. These components are electrically connected via a bus or the like (not shown).

The CPU 31 loads various programs stored in the storage 35 or the like into the main memory 32 and executes the instructions included in those programs. The main memory 32 is configured by, for example, a DRAM.

The user I/F 33 comprises various input/output devices for exchanging information with a user. The user I/F 33 includes, for example, information input devices such as a keyboard and a pointing device (for example, a mouse or a touch panel), an audio input device such as a microphone, and an image input device such as a camera. The user I/F 33 also includes an image output device such as a display and an audio output device such as a speaker.

The communication I/F 34 is implemented as hardware such as a network adapter, various types of communication software, or a combination of these, and is configured to enable wired or wireless communication via the network 20 or the like.

The storage 35 is configured by, for example, a magnetic disk or a flash memory. The storage 35 stores various programs, including an operating system, as well as various data. The programs stored in the storage 35 can be downloaded from an application market or the like and installed.

In the present embodiment, the user terminal 30 can be configured as a smartphone, a tablet terminal, a wearable device, a personal computer, a dedicated game terminal, a VR (Virtual Reality) device (such as a head-mounted display), or the like.

A user of the user terminal 30 configured in this way can use the video distribution service provided by the distribution server 10 by communicating with the distribution server 10 via a web browser or a video distribution service application installed in the storage 35 or the like. The video distribution service application can be an example of a program that implements part or all of the program of the present invention.

Next, the functions of the video distribution system 1 of the present embodiment will be described. FIG. 2 is a block diagram schematically showing the functions of the distribution server 10 and the user terminal 30. As illustrated, the distribution server 10 has an information storage management unit 41 that stores and manages various information, a basic function control unit 43 that controls the basic functions of the video distribution service, and a video distribution control unit 45 that controls the distribution of video. These functions are realized by hardware such as the CPU 11 and the main memory 12 operating in cooperation with the various programs and data stored in the storage 15 or the like; for example, they are realized by the CPU 11 executing instructions included in a program loaded into the main memory 12. Some or all of the functions of the distribution server 10 shown in FIG. 2 may also be realized by the distribution server 10 and the user terminal 30 operating in cooperation, or by the user terminal 30.

As shown in FIG. 2, the user terminal 30 has an information storage management unit 51 that stores and manages various information, a distribution function control unit 53 that controls the distribution function of the video distribution service, a viewing function control unit 55 that controls the viewing function of the video distribution service, and a communication control unit 57 that controls a predetermined communication between a distributor and a viewer. These functions are realized by hardware such as the CPU 31 and the main memory 32 operating in cooperation with the various programs (for example, the video distribution service application) and data stored in the storage 35 or the like; for example, they are realized by the CPU 31 executing instructions included in a program loaded into the main memory 32. Some or all of the functions of the user terminal 30 shown in FIG. 2 may also be realized by the distribution server 10 and the user terminal 30 operating in cooperation, or by the distribution server 10.

The information storage management unit 41 of the distribution server 10 stores and manages various information in the storage 15 or the like. The basic function control unit 43 of the distribution server 10 executes various processes related to controlling the basic functions of the video distribution service. For example, the basic function control unit 43 transmits HTML data or control data for various screens related to the basic functions to the user terminal 30, executes various processes in response to user operations entered via those screens as displayed on the user terminal 30, and transmits HTML data or control data corresponding to the results of those processes to the user terminal 30. The basic functions controlled by the basic function control unit 43 include, for example, login authentication (user authentication), billing control, management of information about users, and management of information about individual video distributions. Information about users and individual video distributions can be managed in the information storage management unit 41 (the storage 15 or the like).

The video distribution control unit 45 of the distribution server 10 executes various processes related to controlling the distribution of video. For example, the video distribution control unit 45 transmits the distribution video received from the distributor terminal 30 to each of the plurality of viewer terminals 30. Specifically, the video distribution control unit 45 receives, for example, the video data of the distribution video transmitted from the distributor terminal 30 and transmits the received video data to the viewer terminals 30. Such video distribution is performed by streaming using a protocol such as RTMP (Real Time Messaging Protocol).
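The relay behavior described above (receive video data from the distributor terminal, then forward it to every viewer terminal) can be sketched as a simple fan-out. This is only an illustrative sketch: the class name `DistributionRelay`, its methods, and the callback-based transport are assumptions made for the example and do not appear in the specification, and no actual RTMP handling is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DistributionRelay:
    """Hypothetical stand-in for the role of control unit 45 (names are assumptions)."""

    # Maps a viewer identifier to a callable that delivers bytes to that viewer.
    viewers: Dict[str, Callable[[bytes], None]] = field(default_factory=dict)

    def add_viewer(self, viewer_id: str, send: Callable[[bytes], None]) -> None:
        self.viewers[viewer_id] = send

    def remove_viewer(self, viewer_id: str) -> None:
        self.viewers.pop(viewer_id, None)

    def on_chunk_from_distributor(self, chunk: bytes) -> int:
        """Forward one chunk of video data to every registered viewer.

        Returns the number of viewers the chunk was forwarded to.
        """
        for send in list(self.viewers.values()):
            send(chunk)
        return len(self.viewers)
```

In a real deployment the `send` callables would wrap streaming connections; here they can be any function that accepts bytes, such as `list.append`.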

A distribution video in the video distribution service of the present embodiment can include real-time audio input via the microphone of the distributor terminal 30. The distribution video also includes, for example, an image corresponding to the real-time display screen shown on the display or the like of the distributor terminal 30. Hereinafter, a video including an image corresponding to this display screen is sometimes referred to as a “screen video”. The distribution video may also include, for example, an image captured (input) in real time via the camera of the distributor terminal 30. Furthermore, the distribution video can also be composed of real-time audio input via the microphone together with non-real-time video (for example, video stored in the storage 35 of the distributor terminal 30).

The information storage management unit 51 of the user terminal 30 stores and manages various information in the storage 35 or the like. The distribution function control unit 53 of the user terminal 30 executes various processes related to controlling the distribution function of the video distribution service. The distribution function allows the user of the user terminal 30 to distribute video as a distributor. For example, the distribution function control unit 53 transmits a distribution video to the distribution server 10. Specifically, for example, the distribution function control unit 53 generates the video data of the distribution video based at least on real-time audio input via the microphone, and transmits the generated video data to the distribution server 10. For example, when the distribution video is the screen video described above, the distribution function control unit 53 generates, in real time, the video data of a screen video composed of an image corresponding to the display screen shown on the display or the like of the user terminal 30 and the audio input via the microphone, and transmits the generated video data to the distribution server 10.

The viewing function control unit 55 of the user terminal 30 executes various processes related to controlling the viewing function of the video distribution service. The viewing function allows the user of the user terminal 30 to view, as a viewer, video distributed by other users. For example, the viewing function control unit 55 outputs the image and audio included in the distribution video received from the distribution server 10. For example, the viewing function control unit 55 receives the video data of the distribution video transmitted from the distribution server 10 and, based on the received video data, displays the image included in the video on a display or the like and outputs the audio included in the video through a speaker or the like.

The communication control unit 57 of the user terminal 30 executes various processes related to controlling a predetermined communication between the distributor terminal 30 and the user terminal 30 of a participant in the distribution video (hereinafter sometimes referred to as the “participant terminal 30”). In the present embodiment, the predetermined communication is a communication in which at least a call can be made. For example, the communication control unit 57 executes processing for establishing communication (a session) with the user terminal 30 of the other party to the predetermined communication. After the session for the communication has been established, it transmits the audio data of audio input via the microphone over the session, and outputs audio corresponding to the audio data received over the session through a speaker or the like.

In the present embodiment, the participant terminal 30 is configured to output, in response to the start of the predetermined communication, the audio received in the predetermined communication in place of the audio included in the distribution video received from the distribution server 10. For example, the viewing function control unit 55 is configured to mute the audio included in the distribution video received from the distribution server 10 in response to the start of the predetermined communication. As a result, the participant terminal 30 outputs the audio received in the predetermined communication in place of the audio included in the distribution video.
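The switching behavior at the participant terminal can be sketched as a small selector that decides which audio reaches the speaker. This is a sketch under stated assumptions: the class `ViewerAudioRouter` and its method names are hypothetical, audio is represented as opaque bytes, and the `end_communication` method (restoring the stream audio when the call ends) is an assumption that this passage of the specification does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewerAudioRouter:
    """Hypothetical selector for the audio output at a viewer or participant terminal."""

    in_call: bool = False

    def start_communication(self) -> None:
        # From this point the stream audio is muted and call audio is used instead.
        self.in_call = True

    def end_communication(self) -> None:
        # Assumption: stream audio is restored when the call ends.
        self.in_call = False

    def select_output(self, stream_audio: bytes, call_audio: Optional[bytes]) -> bytes:
        """Return the audio to send to the speaker for the current frame."""
        if not self.in_call:
            return stream_audio
        # During the call the stream audio is never output; if no call audio
        # has arrived for this frame, output nothing rather than the stream.
        return call_audio if call_audio is not None else b""
```

The key design point is that the stream audio is dropped rather than mixed during the call, so the participant's own delayed voice carried in the distribution video never reaches the speaker.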

Thus, in the video distribution system 1 of the present embodiment, the participant terminal 30 outputs, in response to the start of the predetermined communication, the audio received in the predetermined communication in place of the audio included in the distribution video received from the distribution server 10. Therefore, compared with the case where the content of the voice call between the participant terminal 30 and the distributor terminal 30 is transmitted to the participant terminal 30 via the distribution server 10 as audio included in the distribution video and is output at that participant terminal 30, delayed output of the participant's own voice at the participant terminal 30 (the occurrence of an echo) is suppressed. As a result, deterioration in the voice quality of the call is suppressed.

In the present embodiment, the distributor terminal 30 may be configured to add, in response to the start of the predetermined communication, the audio received in the predetermined communication to the distributed video transmitted to the distribution server 10, in addition to the real-time audio input via the microphone. For example, the communication control unit 57 is configured to mix the audio data of the audio input via the microphone with the audio data of the audio received in the predetermined communication, and the distribution function control unit 53 is configured to generate the video data of the distributed video based on the mixed audio data. With this configuration, the audio received in the predetermined communication is added to the distributed video without passing through the microphone of the distributor terminal 30, which improves the quality of the audio included in the distributed video (this audio is output at the other viewer terminals 30). For the same reason, even when headphones or the like are used at the distributor terminal 30, a distributed video that includes the content of the voice call between the participant terminal 30 and the distributor terminal 30 is distributed to the other viewer terminals 30.
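The mixing of microphone audio with received call audio described above can be illustrated with a minimal sketch. It assumes both streams are sequences of 16-bit PCM samples at the same sample rate; the function name `mix_pcm16` and the choice of hard clipping are illustrative assumptions and are not taken from the specification.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(mic: Sequence[int], call: Sequence[int]) -> List[int]:
    """Sum two 16-bit PCM streams sample by sample, clipping to the int16 range.

    The shorter stream is treated as silence once it runs out.
    """
    length = max(len(mic), len(call))
    mixed = []
    for i in range(length):
        mic_sample = mic[i] if i < len(mic) else 0
        call_sample = call[i] if i < len(call) else 0
        total = mic_sample + call_sample
        # Hard-clip so the result remains a valid 16-bit sample.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

Because the call audio enters the mix as data rather than through the loudspeaker and microphone, the mixed stream contains it even when the distributor is listening on headphones.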

The distributor terminal 30 may also be configured to execute processing that removes, from the real-time audio input via the microphone, the echo component of the audio received in the predetermined communication. For example, the communication control unit 57 is configured to apply a general echo cancellation technique and, based on the audio data of the audio received in the predetermined communication, remove the echo component of that received audio from the audio data of the audio input via the microphone. In this case, the distribution function control unit 53 generates and transmits the video data of the distributed video based on the audio data that has undergone the echo removal processing, and the communication control unit 57 transmits the audio data that has undergone the echo removal processing in the predetermined communication. This configuration suppresses quality deterioration, caused by echo components, of both the audio included in the distributed video and the call audio in the predetermined communication.
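The specification refers only to "a general echo cancellation technique" and does not name one. As one possible illustration, the sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, which is a common choice and is an assumption here, not the method of the embodiment. The function name, the number of taps, and the step size `mu` are all hypothetical.

```python
def nlms_echo_cancel(mic, far_end, taps=4, mu=0.5, eps=1e-8):
    """Remove an adaptively estimated far-end echo from the microphone signal.

    mic     -- microphone samples (near-end speech plus echo of far_end)
    far_end -- samples received in the call, used as the reference signal
    Returns the microphone signal with the estimated echo subtracted.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    cleaned = []
    for mic_sample, far_sample in zip(mic, far_end):
        history = [far_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate          # residual after echo removal
        norm = sum(x * x for x in history) + eps    # normalization term
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned
```

In this sketch the residual serves both as the output signal and as the adaptation error, so the filter converges most cleanly while only the far-end party is speaking; practical echo cancellers add double-talk detection, which is omitted here.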

In the present embodiment, the predetermined communication is performed without going through the distribution server 10 and may be configured to use, for example, P2P (Peer to Peer) communication. In this case, the communication control units 57 of the distributor terminal 30 and the participant terminal 30 start the predetermined communication over P2P after, for example, performing signaling via a signaling server (not shown). Communication using P2P can be realized, for example, by applying WebRTC (Web Real-Time Communication). The predetermined communication in the present embodiment is not limited to communication performed over P2P and also includes, for example, communication performed using client-server communication.

In the present embodiment, the predetermined communication includes communication in which other information can be transmitted and received in addition to a call (transmission and reception of audio). For example, the predetermined communication may allow images to be transmitted and received. In this case, the distributor terminal 30 may be configured to add, in response to the start of the predetermined communication, an image received in the predetermined communication to the distributed video transmitted to the distribution server 10. For example, the distribution function control unit 53 is configured to acquire the image received in the predetermined communication via the communication control unit 57 and to add the acquired image to a partial region of the display area of the distributed video. The image received in the predetermined communication may include, for example, an image corresponding to the real-time screen shown on the display or the like of the participant terminal 30, an image captured (input) in real time via the camera or the like of the participant terminal 30, and various other images. This configuration enables participation in the distributed video by image in addition to participation by voice.
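Adding a received image to a partial region of the distributed video can be pictured as a picture-in-picture paste. The following sketch assumes a frame is a list of rows of pixel values; the function name `overlay_image` and the cropping behavior at the frame edges are illustrative assumptions.

```python
def overlay_image(frame, inset, top, left):
    """Return a copy of `frame` with `inset` pasted at row `top`, column `left`.

    Pixels of the inset that fall outside the frame are dropped.
    """
    result = [row[:] for row in frame]  # leave the original frame untouched
    for r, inset_row in enumerate(inset):
        for c, pixel in enumerate(inset_row):
            y, x = top + r, left + c
            if 0 <= y < len(result) and 0 <= x < len(result[y]):
                result[y][x] = pixel
    return result
```

For example, the distributor's screen capture would be `frame` and the participant's camera image would be `inset`, pasted at a fixed corner before the frame is encoded.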

In the present embodiment, the participant terminal 30 may be configured to output, in response to the end of the predetermined communication, the audio included in the distributed video received from the distribution server 10 in place of the audio received in the predetermined communication. For example, the viewing function control unit 55 is configured to unmute the audio included in the distributed video received from the distribution server 10 in response to the end of the predetermined communication.
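The switching of the participant terminal's audio output at the start and end of the predetermined communication can be sketched as a small state holder. The class and method names below are hypothetical; the sketch only models which audio source is selected, not actual playback.

```python
class ParticipantAudioRouter:
    """Tracks whether a call is active and selects the audio to be output."""

    def __init__(self):
        self.in_call = False

    def on_call_started(self):
        # Corresponds to muting the audio of the distributed video.
        self.in_call = True

    def on_call_ended(self):
        # Corresponds to unmuting the audio of the distributed video.
        self.in_call = False

    def select_output(self, stream_audio, call_audio):
        """Return call audio during a call and stream audio otherwise."""
        return call_audio if self.in_call else stream_audio
```

The video frames of the distributed video are unaffected by this switch; only the audio source changes.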

In the present embodiment, the predetermined communication may be configured to start between a specific viewer terminal 30 among the plurality of viewer terminals 30 (the participant terminal 30) and the distributor terminal 30 when a participation request from that specific viewer terminal 30 is accepted at the distributor terminal 30. For example, the viewing function control unit 55 of the viewer terminal 30 receives a participation request from the viewer via a screen displayed on the viewer terminal 30 and transmits the participation request to the distributor terminal 30 via the distribution server 10. The distribution function control unit 53 of the distributor terminal 30 receives the distributor's acceptance of the participation request via a screen displayed on the distributor terminal 30. In response to that acceptance, the communication control units 57 of the distributor terminal 30 and the participant terminal 30 start the predetermined communication.

Next, a specific example of the video distribution system 1 of the present embodiment having these functions will be described. In the video distribution service of this example, a real-time screen video that includes an image corresponding to the display screen of the distributor terminal 30 is distributed to a plurality of viewer terminals 30 as the distributed video. Distribution of such a screen video is sometimes called "live screen distribution".

FIG. 3 shows an example of the main screen 60, which is the starting point when a user of the video distribution service uses the service in this example. The main screen 60 is displayed, for example, when an application for the video distribution service is launched on the user terminal 30, or when the user terminal 30 accesses, via a web browser, the website for the video distribution service provided by the distribution server 10.

As shown in FIG. 3, the main screen 60 has a recommended area 61 that displays recommended videos and a video list area 62 that lists the videos currently being distributed, with the basic menu area 100 arranged at the bottom. The information about each video displayed in the recommended area 61 and the video list area 62 includes the title of the video (the distribution title, shown as "YYY", "XXX", and so on in the example of FIG. 3) and distributor information. By selecting any of the videos displayed in the recommended area 61 and the video list area 62, the user can start viewing that video as a viewer.

The basic menu area 100 consists of the basic menus for using the video distribution service and is arranged in the same way on the main screens other than the main screen 60. Specifically, the basic menu area 100 consists of a main menu 102 for displaying the main screen 60, a search menu 104 for searching for users and videos, a distribution menu 106 for starting distribution of a video, a notification menu 108 for displaying notifications to the user, and a My Page menu 109 for displaying the user's own user page (My Page).

When the user selects the search menu 104, a search screen is displayed for searching for other users or videos using, for example, keywords. When the user selects the My Page menu 109, a My Page screen showing the user's own user page (My Page) is displayed. Through this screen the user can, for example, view and edit his or her basic information and browse the distribution history and viewing history.

When the user selects the distribution menu 106, the distribution preparation screen 70 illustrated in FIG. 4 is displayed on the user terminal 30. As shown in the figure, the distribution preparation screen 70 has a title input area 72 for entering a distribution title and a distribution start button 74 labeled "Start distribution!", with the basic menu area 100 arranged at the bottom. The distribution preparation screen 70 is the screen through which the user instructs the start of video distribution.

When the user enters a desired distribution title (for example, "Live distribution of game X!") in the title input area 72 and then selects the distribution start button 74, the distribution preparation screen 70 is closed and distribution of the screen video starts. Specifically, the distributor's user terminal 30 (the distributor terminal 30) starts generating video data of a screen video composed of an image corresponding to its display screen and audio input via the microphone, and starts transmitting that video data to the distribution server 10. Once distribution of the screen video has started, the distribution server 10 transmits the video data of the screen video to a viewer's user terminal 30 (the viewer terminal 30) in response to a viewing request from that viewer.

FIG. 5 shows an example of the distribution screen 80 displayed on the distributor terminal 30 while a video is being distributed. The distribution screen 80 is displayed when the distribution start button 74 is selected and the distribution preparation screen 70 is closed. In this example, a program such as the application for the video distribution service runs in the background during distribution, and the display area 210, which corresponds to the entire distribution screen 80, shows, for example, the home screen of the OS or the screen of another running application.

As shown in FIG. 5, the distribution screen 80 has a comment input area 82 at the top of the screen, a camera image display area 84 below and to the left of it, and a setting button 86 and an end button 88 in the lower right corner of the screen, each superimposed on the display area 210. The comment input area 82 is configured so that the distributor can enter a comment such as a character string. The camera image display area 84 displays an image input via the front camera or the like of the distributor terminal 30 (specifically, for example, video of the distributor).

When the distributor selects the setting button 86, various distribution settings can be made via a setting screen (not shown). For example, the distributor can disable the microphone (mute), disable the camera (including hiding the camera image display area 84), view comments, and hide the comment field (hide the comment input area 82). Here, "view comments" is a setting for displaying the comments entered by the distributor and the comments entered by viewers.

The comment input area 82, the camera image display area 84, the setting button 86, and the end button 88 are widget-type objects controlled by the application or the like for the video distribution service running in the background. By tapping or otherwise operating a region of the display area 210 of the distribution screen 80 other than the objects 82, 84, 86, and 88, the distributor can perform operations on the screen shown in the display area 210 (that is, operations on the OS or on another running application). The distributor can also change the display positions of the comment input area 82 and the camera image display area 84 (move these objects) by a slide operation or the like.

FIG. 6 shows an example of the viewing screen 90 displayed on the user terminal 30 of a viewer who views the screen video (the viewer terminal 30). The viewer can select a video to view by, for example, selecting any of the videos displayed in the recommended area 61 and the video list area 62 of the main screen 60, or selecting any of the videos found through the search menu 104. In response to this selection, the video data of the corresponding screen video is transmitted from the distribution server 10 to the viewer terminal 30, and the viewing screen 90 showing the video corresponding to that video data is displayed on the viewer terminal 30.

As shown in FIG. 6, the viewing screen 90 has a video display area 92 that displays the screen video, a comment input area 93 located in the lower left corner of the screen, and a setting button 94 and a participation request button 95 located in the lower right corner of the screen. As shown in the figure, the video display area 92 displays the screen video, which includes an image corresponding to the display screen of the distributor terminal 30 (the distribution screen 80).

The comment input area 93 is configured so that the viewer can enter a comment such as a character string. When the viewer selects the setting button 94, various viewing settings, such as whether to view comments, can be made via a setting screen (not shown).

The participation request button 95 on the viewing screen 90 is a button with which the viewer requests participation in the screen video through a call with the distributor. The operation performed when a viewer participates in the screen video is described below. FIG. 7 is a sequence diagram showing an example of the processing executed between the distributor terminal 30 and the viewer terminal 30 when a viewer participates in the screen video. First, as shown in the figure, in response to the viewer selecting the participation request button 95, the viewer terminal 30 transmits a participation request addressed to the distributor terminal 30 (step S100). The participation request includes information about the viewer who sent it and is transmitted to the distributor terminal 30 via the distribution server 10.

The distributor terminal 30 that has received the participation request then accepts the distributor's acceptance of the request (step S110). FIG. 8 illustrates the distribution screen 80 when a participation request has been received. As shown in the figure, the distribution screen 80 is configured to display a notification object 89 in response to receiving a participation request. The notification object 89 is located to the right of the camera image display area 84, displays information about the viewer who made the participation request, and has an accept button 891. The distributor can accept the participation request from the viewer by selecting the accept button 891. In this example, if the distributor does not select the accept button 891, the notification object 89 disappears after being displayed for a predetermined period (for example, 10 seconds).

When acceptance of the participation request is received through selection of the accept button 891 of the notification object 89, a communication session for the call is established between the distributor terminal 30 and the viewer terminal 30 (the participant terminal 30) (step S120). Specifically, in this example, the distributor terminal 30 and the participant terminal 30 perform signaling via a signaling server (not shown), after which a call session using P2P communication is established between the distributor terminal 30 and the participant terminal 30. Once the session is established, a call (transmission and reception of audio) between the distributor terminal 30 and the participant terminal 30 becomes possible.
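The handling of a participation request in steps S100 to S120 can be modeled as a small state transition with a timeout. In the sketch below, the class `JoinRequest`, its state names, and the use of plain timestamps in seconds are hypothetical; the 10-second value is the example period given in the text, and treating a request as expired exactly at the 10-second mark is an assumption.

```python
from dataclasses import dataclass

NOTIFICATION_TIMEOUT_S = 10.0  # example period from the text


@dataclass
class JoinRequest:
    viewer_id: str
    received_at: float          # time the distributor terminal received the request
    state: str = "pending"      # "pending" -> "accepted" or "expired"

    def expire_if_due(self, now: float) -> None:
        """Hide the notification once the display period has elapsed."""
        if self.state == "pending" and now - self.received_at >= NOTIFICATION_TIMEOUT_S:
            self.state = "expired"

    def accept(self, now: float) -> bool:
        """Record the distributor selecting the accept button at time `now`.

        Returns True if the request was still pending, in which case the
        caller would go on to establish the call session (step S120).
        """
        self.expire_if_due(now)
        if self.state == "pending":
            self.state = "accepted"
        return self.state == "accepted"
```

Session establishment itself (signaling followed by the P2P connection) is outside the scope of this sketch.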

The input and output of audio at the distributor terminal 30, the viewer terminal 30, and the participant terminal 30 will now be described. First, audio input and output during normal operation (a period in which no participant is taking part in the screen video) is described. FIG. 9 is a diagram for explaining audio input and output at the distributor terminal 30 and the viewer terminal 30 during normal operation. As shown in the figure, at the distributor terminal 30, the distribution function control unit 53 generates video data of a screen video composed of the audio input via the microphone (microphone audio) and an image corresponding to the display screen, and transmits it to the distribution server 10. At the viewer terminal 30, the viewing function control unit 55 outputs, as is, the screen video (image and audio) corresponding to the video data received via the distribution server 10.

FIG. 10 is a diagram for explaining audio input and output at the distributor terminal 30, the participant terminal 30, and the other viewer terminals 30 during a period in which a participant is taking part in the screen video (a period in which the call session is established). As shown in the figure, during this period the communication control units 57 of the distributor terminal 30 and the participant terminal 30 each transmit the audio data of the microphone audio input via the microphone over the call session, and output the call audio corresponding to the audio data received over the session via a speaker or the like.

FIG. 11 is a diagram for explaining the details of audio input and output in the communication control unit 57 of the distributor terminal 30 in FIG. 10. As shown in the figure, in this example the communication control unit 57 has an echo canceller 572. The echo canceller 572 has a function of removing, based on the audio data received over the call session, the echo component of that received audio data from the audio data of the microphone audio input via the microphone. The audio data that the communication control unit 57 transmits over the call session is the audio data on which the echo canceller 572 has performed this echo removal processing. The communication control unit 57 of the participant terminal 30 likewise performs echo removal processing with an echo canceller 572.

As also shown in FIG. 11, in this example the communication control unit 57 has a mixer 574. The mixer 574 has a function of mixing the audio data of the microphone audio on which the echo canceller 572 has performed the echo removal processing with the audio data received over the call session. The distribution function control unit 53 acquires the audio data of the mixed audio from the communication control unit 57, and generates and transmits video data that includes the audio data of the mixed audio.

Returning to FIG. 10, the distribution server 10 transmits the video data of the screen video sent from the distributor terminal 30 to the plurality of viewer terminals 30, including the participant terminal 30. At viewer terminals 30 other than the participant terminal 30 (viewer terminals 30 not participating in the screen video), the image and audio of the screen video corresponding to the received video data are output as is. At the participant terminal 30, on the other hand, the audio included in the screen video is muted during the period of participation, and only the image included in the screen video is output (displayed). As a result, the participant terminal 30 displays the image included in the screen video while outputting the call audio received over the call session.

Thus, during the period in which a participant is taking part in the screen video, the participant terminal 30 does not output the audio included in the screen video distributed via the distribution server 10 and instead outputs the call audio received over the call session. This call audio has less delay than the audio included in the screen video distributed via the distribution server 10. In addition, because the echo cancellers 572 of the communication control units 57 of the distributor terminal 30 and the participant terminal 30 remove the echo component contained in the microphone audio, deterioration of call audio quality due to echo is suppressed. Furthermore, because the distributor terminal 30 generates and transmits video data of a screen video containing a mix of the echo-removed microphone audio and the call audio received over the call session, deterioration of the quality of the audio output at viewer terminals 30 not participating in the screen video is also suppressed.

The distributor and the participant can instruct the end of the call (release of the session) via the distribution screen 80 or the viewing screen 90. When the call ends, the participant terminal 30 returns to the state of not participating in the screen video. Specifically, the mute on the audio included in the screen video is released.

The example above describes a case in which only one viewer participates in the screen video, but a plurality of viewers can also participate. In this case, a call session using a mesh-type P2P communication network is established between the distributor terminal 30 and the plurality of participant terminals 30, and a call (group call) among the distributor terminal 30 and the plurality of participant terminals 30 is conducted over that session.
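In a mesh-type P2P network, each terminal connects directly to every other terminal, so the number of connections grows as n(n-1)/2 for n terminals. The sketch below enumerates those connections; the function name `mesh_links` and the terminal labels are hypothetical.

```python
from itertools import combinations


def mesh_links(terminals):
    """Return every unordered pair of terminals; a full mesh has one link per pair."""
    return list(combinations(terminals, 2))
```

With one distributor and three participants, for instance, `mesh_links` yields six pairs, which is why full-mesh group calls are generally practical only for a small number of participants.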

In the example above, a call (transmission and reception of audio) is conducted between the distributor terminal 30 and the participant terminal 30 over the call session, but images may be transmitted and received in addition to the call. For example, the WebRTC mentioned above can be applied to communication involving the transmission and reception of various kinds of information other than audio. In this case, the distributor terminal 30 may add an image received over the established session to a partial region of the display area of the screen video. Furthermore, the participant terminal 30 may receive the screen video itself from the distributor terminal 30 over the established session. In that case, the participant terminal 30 outputs the screen video received over the session without outputting the screen video distributed by the distribution server 10.

In the video distribution system 1 of the present embodiment described above, in response to the start of predetermined communication (for example, establishment of a call session) conducted between the viewer terminal 30 and the participant terminal 30, the participant terminal 30 outputs the audio received in the predetermined communication instead of the audio included in the distribution video received from the distribution server 10. Therefore, compared with the case in which the content of the voice call between the participant terminal 30 and the distributor terminal 30 is transmitted to the participant terminal 30 via the distribution server 10 as audio included in the distribution video and is output at that participant terminal 30, the delayed output of the participant's own voice at the participant terminal 30 (the occurrence of echo) is suppressed, and as a result, deterioration of the voice quality of the call is suppressed. In other words, the embodiment of the present invention suppresses deterioration of the voice quality of a call conducted between a distributor terminal that distributes a video and a viewer terminal for viewing that video.
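The audio switching summarized above can be modeled as a small state holder on the participant terminal. The sketch below is an assumption-laden illustration rather than the embodiment's implementation: the class `ParticipantAudio` and its method names are hypothetical, and actual audio playback is omitted.

```typescript
type AudioSource = "distribution" | "call";

// Hypothetical model of the participant terminal's audio routing.
class ParticipantAudio {
  private callActive = false;

  // Invoked when the predetermined communication (call session) starts.
  startCall(): void {
    this.callActive = true;
  }

  // Invoked when the call ends (the session is released).
  endCall(): void {
    this.callActive = false;
  }

  // The distribution video's audio is muted only while a call is active.
  distributionMuted(): boolean {
    return this.callActive;
  }

  // Audio that the terminal outputs in the current state.
  activeSource(): AudioSource {
    return this.callActive ? "call" : "distribution";
  }
}

const terminal = new ParticipantAudio();
terminal.startCall();
console.log(terminal.activeSource(), terminal.distributionMuted());
terminal.endCall();
console.log(terminal.activeSource(), terminal.distributionMuted());
```

Because the participant never plays back the server-relayed copy of the call while the call is active, the participant's own voice does not return with the distribution delay.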

In the embodiment described above, the user terminal 30 is configured to function as both the distributor terminal 30 and the viewer terminal 30, but in embodiments of the present invention each terminal may instead be configured as a dedicated terminal. For example, a terminal dedicated to distributors may be configured without the viewing function control unit 55, and a terminal dedicated to viewers may be configured without the distribution function control unit 53.

The processes and procedures described in this specification may be realized, in addition to the manner explicitly described, by software, hardware, or any combination thereof. For example, the processes and procedures described in this specification are realized by implementing logic corresponding to those processes and procedures in a medium such as an integrated circuit, a volatile memory, a nonvolatile memory, or a magnetic disk. The processes and procedures described in this specification can also be implemented as a computer program corresponding to those processes and procedures and executed by various kinds of computers.

Even where the processes and procedures described in this specification are described as being executed by a single device, piece of software, component, or module, such processes or procedures may be executed by a plurality of devices, a plurality of pieces of software, a plurality of components, and/or a plurality of modules. The software and hardware elements described in this specification can also be realized by integrating them into fewer components or by dividing them into more components.

In this specification, even where a constituent element of the invention is described as either singular or plural, or is described without being limited to either singular or plural, that constituent element may be either singular or plural, unless the context requires otherwise.

1 Video distribution system
10 Distribution server
20 Network
30 User terminal (distributor terminal, viewer terminal, participant terminal)
41 Information storage management unit
43 Basic function control unit
45 Video distribution control unit
51 Information storage management unit
53 Distribution function control unit
55 Viewing function control unit
57 Communication control unit
60 Main screen
70 Distribution preparation screen
80 Distribution screen
90 Viewing screen

Claims

1. A system for distributing video, comprising a distributor terminal, a distribution server, and a plurality of viewer terminals, the system executing:
a step in which the distributor terminal transmits, to the distribution server, a distribution video including at least real-time audio input via a microphone;
a step in which the distribution server transmits the distribution video received from the distributor terminal to each of the plurality of viewer terminals;
a step in which each of the plurality of viewer terminals outputs the image and audio included in the distribution video received from the distribution server;
a step of starting, between a participant terminal included in the plurality of viewer terminals and the distributor terminal, predetermined communication in which at least a call can be conducted; and
a step in which the participant terminal, in response to the start of the predetermined communication, outputs the audio received in the predetermined communication instead of the audio included in the distribution video received from the distribution server.

2. The system according to claim 1, wherein the distributor terminal further executes a step of, in response to the start of the predetermined communication, adding the audio received in the predetermined communication, in addition to the real-time audio input via the microphone, to the distribution video to be transmitted to the distribution server.

3. The system according to claim 1 or 2, wherein the distributor terminal further executes a step of, in response to the start of the predetermined communication, executing a process of removing an echo component of the audio received in the predetermined communication from the audio input via the microphone.

4. The system according to claim 1, wherein the predetermined communication is performed using P2P communication.

5. The system according to any one of claims 1 to 4, wherein the participant terminal further executes a step of, in response to the end of the predetermined communication, outputting the audio included in the distribution video received from the distribution server instead of the audio received in the predetermined communication.

6. The system according to any one of claims 1 to 5, wherein
the predetermined communication is capable of transmitting and receiving images, and
the distributor terminal further executes a step of, in response to the start of the predetermined communication, adding an image received in the predetermined communication to the distribution video to be transmitted to the distribution server.

7. The system according to claim 1, wherein the step of starting the predetermined communication includes, in response to approval at the distributor terminal of a participation request from a first viewer terminal included in the plurality of viewer terminals, starting the predetermined communication between the first viewer terminal, as the participant terminal, and the distributor terminal.

8. The system according to any one of claims 1 to 7, wherein the distribution video includes an image corresponding to a display screen of the distributor terminal and/or an image input via a camera of the distributor terminal.

9. A method for distributing video, executed by a system comprising a distributor terminal, a distribution server, and a plurality of viewer terminals, the method comprising:
a step in which the distributor terminal transmits, to the distribution server, a distribution video including at least real-time audio input via a microphone;
a step in which the distribution server transmits the distribution video received from the distributor terminal to each of the plurality of viewer terminals;
a step in which each of the plurality of viewer terminals outputs the image and audio included in the distribution video received from the distribution server;
a step of starting, between a participant terminal included in the plurality of viewer terminals and the distributor terminal, predetermined communication in which at least a call can be conducted; and
a step in which the participant terminal, in response to the start of the predetermined communication, outputs the audio received in the predetermined communication instead of the audio included in the distribution video received from the distribution server.

10. A program executed on the distributor terminal in a system for distributing video, the system comprising a distributor terminal, a distribution server, and a plurality of viewer terminals, the program causing the distributor terminal to execute:
a process of transmitting, to the distribution server, a distribution video including at least real-time audio input via a microphone;
a process of starting predetermined communication, in which at least a call can be conducted, with a participant terminal included in the plurality of viewer terminals that output the image and audio included in the distribution video received from the distribution server; and
a process of, in response to the start of the predetermined communication, adding the audio received in the predetermined communication, in addition to the real-time audio input via the microphone, to the distribution video to be transmitted to the distribution server,
wherein the process of adding the received audio includes executing a process of removing an echo component of the audio received in the predetermined communication from the audio input via the microphone.

11. A program executed on a viewer terminal in a system for distributing video, the system comprising a distributor terminal, a distribution server, and a plurality of viewer terminals, the program causing the viewer terminal to execute:
a process of outputting the image and audio included in a distribution video received from the distribution server, the distribution video being received by the distribution server from the distributor terminal and including at least real-time audio input via a microphone of the distributor terminal;
a process of starting predetermined communication, in which at least a call can be conducted, with the distributor terminal; and
a process of, in response to the start of the predetermined communication, outputting the audio received in the predetermined communication instead of the audio included in the distribution video received from the distribution server.