JP7307295B1 - CONTENT PROVIDING SYSTEM, CONTENT PROVIDING METHOD, AND CONTENT PROVIDING PROGRAM

Info

Publication number: JP7307295B1
Application number: JP2023075537A
Authority: JP
Inventors: 那雄友永
Original assignee: Individual
Current assignee: Individual
Priority date: 2023-05-01
Filing date: 2023-05-01
Publication date: 2023-07-11
Anticipated expiration: 2043-05-01

Abstract

【課題】本発明は、コンテンツ提供システム、コンテンツ提供方法、コンテンツ提供プログラムに関する。【解決手段】コンテンツ提供システム１は、表示処理部、視線検出部、決定処理部、音声提供部、記憶部、を備える。記憶部は、文書情報と、文書に対応する音声に関する音声情報を格納する。表示処理部は、文書を含んだコンテンツを表示処理し、その表示処理結果をユーザが使用する端末へ送信する。視線検出部は、ユーザから文書へ向けられる視線の位置、及びその位置における視線の滞在時間、を含んだ視線情報を検出し、決定処理部は、経時的に変化する前記視線情報に基づいて、視線が検出されてから音声が再生されるまでの待機時間を含んだ優先条件を文書ごとに決定する。音声提供部は、視線情報と、優先条件と、音声情報と、に基づいて文書に対応する音声を端末へ送信する。【選択図】図１

Kind Code: A1

PROBLEM: The present invention relates to a content providing system, a content providing method, and a content providing program. SOLUTION: A content providing system 1 includes a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit. The storage unit stores document information and audio information about the audio corresponding to each document. The display processing unit performs display processing on content including the document and transmits the display processing result to the terminal used by the user. The line-of-sight detection unit detects line-of-sight information including the position of the user's line of sight directed at the document and the stay time of the line of sight at that position. The determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected until the audio is played, based on the line-of-sight information that changes over time. The audio providing unit transmits the audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information. [Selected drawing] FIG. 1

Description

本発明は、コンテンツ提供システム、コンテンツ提供方法、及びコンテンツ提供プログラムに関する。 The present invention relates to a content providing system, a content providing method, and a content providing program.

従来から電子文書などのコンテンツを提供するためのシステムが開発されている。そして昨今では、コンテンツを提供するための方法を工夫することに加え、当該コンテンツの演出効果を高める様々な工夫がされている。 Systems for providing content such as electronic documents have long been developed. In recent years, in addition to refinements in how content is provided, various measures have been taken to enhance the presentation effect of the content.

例えば特許文献１には、使用者の視線位置を検出するためのカメラモジュールを利用し、視線位置や検出領域における視線の滞在時間などを求め、視線位置等が所定条件に該当するときに漫画や電子文書における検出領域が使用者に読まれていると判断する電子文書端末が開示されている。 For example, Patent Document 1 discloses an electronic document terminal that uses a camera module for detecting the user's line-of-sight position to obtain the line-of-sight position, the stay time of the line of sight in a detection area, and the like, and that determines that the detection area in a comic or electronic document is being read by the user when the line-of-sight position and the like satisfy a predetermined condition.

また特許文献２には、表示手段と、音声再生手段と、音声記憶手段と、音声データ選択手段と、を備える漫画再生システムが開示されている。本漫画再生システムによれば、利用者が漫画作品に対する音声を選択できるので、利用者の興味を継続させることができる。 Further, Patent Document 2 discloses a comic reproduction system including display means, audio reproduction means, audio storage means, and audio data selection means. According to this comic reproduction system, the user can select the voice for the comic work, so that the interest of the user can be maintained.

特開２０１５－２１２９８１号公報 JP 2015-212981 A
特開２０２３－１１０６９号公報 JP 2023-11069 A

ところで、特許文献１の技術と特許文献２の技術を組み合わせることで、ユーザの利便性に優れ、且つ、演出効果の高いコンテンツ提供システムを実現することが考えられる。 Combining the technology of Patent Document 1 with that of Patent Document 2 could conceivably yield a content providing system that is convenient for the user and has a strong presentation effect.

すなわち、漫画などのコンテンツにおいて特許文献１の技術を利用してユーザの視線を検出（ユーザに読まれている文書を検出）することに加え、特許文献２の技術を利用することで、検出された文書（例えば漫画における登場人物の台詞など）に対応する音声を付したコンテンツを提供することができる。 That is, for content such as comics, the technology of Patent Document 1 can be used to detect the user's line of sight (that is, to detect the document being read by the user), and the technology of Patent Document 2 can then be used to provide content accompanied by audio corresponding to the detected document (for example, a line spoken by a character in a comic).

しかしながら、漫画などのコンテンツでは台詞が読まれる順番をある程度想定していることが多い一方、実際にはユーザの視線が想定された順番に移動するとは限らない。また、ユーザは漫画などの台詞を読み飛ばしたり、読み直しをしたりすることもある。この場合、視線の移動速度が速いと前の音声の再生中に次の音声も再生されてしまい（音声が重なってしまい）、適切に台詞の音声を再生できない恐れがある。 However, while in content such as comics, the order in which lines are read is often assumed to some extent, the user's line of sight does not always move in the assumed order. In addition, the user sometimes skips or re-reads lines in comics and the like. In this case, if the line-of-sight movement speed is fast, the next sound will be reproduced while the previous sound is being reproduced (the sounds will overlap), and there is a risk that the dialogue sound cannot be reproduced properly.

本発明は、上記従来技術の課題に鑑みて行われたものであって、その目的は、従来よりも適切且つ効果的に音声が対応付けられたコンテンツを提供できるコンテンツ提供システムを提供することにある。 The present invention has been made in view of the above problems of the prior art, and its object is to provide a content providing system capable of providing content with associated audio more appropriately and effectively than before.

上記課題を解決するために、本発明は、コンテンツを提供するためのコンテンツ提供システムであって、
当該コンテンツ提供システムは、表示処理部と、視線検出部と、決定処理部と、音声提供部と、記憶部と、を備え、
前記記憶部は、文書に関する文書情報と、該文書に対応する音声に関する音声情報と、を格納し、
前記表示処理部は、前記文書を含んだ前記コンテンツを表示処理し、その表示処理結果をユーザが使用する端末へ送信し、
前記視線検出部は、前記ユーザから前記文書へ向けられる視線の位置、及びその位置における視線の滞在時間、を含んだ視線情報を検出し、
前記決定処理部は、経時的に変化する前記視線情報に基づいて、前記視線が検出されてから前記音声が再生されるまでの待機時間を含んだ優先条件を前記文書ごとに決定し、
前記音声提供部は、前記視線情報と、前記優先条件と、前記音声情報と、に基づいて前記文書に対応する音声を前記端末へ送信する。
In order to solve the above problems, the present invention provides a content providing system for providing content,
the content providing system including a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, wherein
the storage unit stores document information about a document and audio information about audio corresponding to the document,
the display processing unit performs display processing on the content including the document and transmits the display processing result to a terminal used by a user,
the line-of-sight detection unit detects line-of-sight information including the position of the line of sight directed from the user toward the document and the stay time of the line of sight at that position,
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected until the audio is played, based on the line-of-sight information that changes over time, and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.

また、本発明は、コンテンツを提供するためのコンテンツ提供システムが実行するコンテンツ提供方法であって、
当該コンテンツ提供システムは、表示処理部と、視線検出部と、決定処理部と、音声提供部と、記憶部と、を備え、
前記記憶部が、文書に関する文書情報と、該文書に対応する音声に関する音声情報と、を格納し、
前記表示処理部が、前記文書を含んだ前記コンテンツを表示処理し、その表示処理結果をユーザが使用する端末へ送信し、
前記視線検出部が、前記ユーザから前記文書へ向けられる視線の位置、及びその位置における視線の滞在時間、を含んだ視線情報を検出し、
前記決定処理部が、経時的に変化する前記視線情報に基づいて、前記視線が検出されてから前記音声が再生されるまでの待機時間を含んだ優先条件を前記文書ごとに決定し、
前記音声提供部が、前記視線情報と、前記優先条件と、前記音声情報と、に基づいて前記文書に対応する音声を前記端末へ送信するステップと、を含む。
The present invention also provides a content providing method executed by a content providing system for providing content,
the content providing system including a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, the method including steps in which
the storage unit stores document information about a document and audio information about audio corresponding to the document,
the display processing unit performs display processing on the content including the document and transmits the display processing result to a terminal used by a user,
the line-of-sight detection unit detects line-of-sight information including the position of the line of sight directed from the user toward the document and the stay time of the line of sight at that position,
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected until the audio is played, based on the line-of-sight information that changes over time, and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.

また、本発明は、コンテンツを提供するためのコンテンツ提供プログラムであって、
コンピュータを、表示処理部と、視線検出部と、決定処理部と、音声提供部と、記憶部と、として機能させ、
前記記憶部は、文書に関する文書情報と、該文書に対応する音声に関する音声情報と、を格納し、
前記表示処理部は、前記文書を含んだ前記コンテンツを表示処理し、その表示処理結果をユーザが使用する端末へ送信し、
前記視線検出部は、前記ユーザから前記文書へ向けられる視線の位置、及びその位置における視線の滞在時間、を含んだ視線情報を検出し、
前記決定処理部は、経時的に変化する前記視線情報に基づいて、前記視線が検出されてから前記音声が再生されるまでの待機時間を含んだ優先条件を前記文書ごとに決定し、
前記音声提供部は、前記視線情報と、前記優先条件と、前記音声情報と、に基づいて前記文書に対応する音声を前記端末へ送信する。
The present invention also provides a content providing program for providing content,
the program causing a computer to function as a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, wherein
the storage unit stores document information about a document and audio information about audio corresponding to the document,
the display processing unit performs display processing on the content including the document and transmits the display processing result to a terminal used by a user,
the line-of-sight detection unit detects line-of-sight information including the position of the line of sight directed from the user toward the document and the stay time of the line of sight at that position,
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected until the audio is played, based on the line-of-sight information that changes over time, and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.

このような構成にすることで、視線情報による所定の優先条件を利用して、従来よりも適切且つ効果的に、音声が対応付けられたコンテンツを提供することができる。 By adopting such a configuration, it is possible to provide content associated with audio more appropriately and effectively than conventionally, by using a predetermined priority condition based on line-of-sight information.

本発明の好ましい形態では、前記文書情報は、前記ユーザが前記文書を読み進める順番を示す進行番号を含み、
前記決定処理部は、前記視線が位置する対象文書の進行番号に対して該進行番号が次の候補文書における優先条件を第１優先条件、前記対象文書の進行番号に対して該進行番号が前の既読文書における優先条件を第２優先条件、それ以外の進行番号が付される未読文書における優先条件を第３優先条件、として決定する。
In a preferred form of the present invention, the document information includes a progress number indicating the order in which the user reads through the documents,
and the determination processing unit determines the priority condition for a candidate document, whose progress number follows that of the target document on which the line of sight is positioned, as a first priority condition; the priority condition for a read document, whose progress number precedes that of the target document, as a second priority condition; and the priority condition for unread documents with any other progress number as a third priority condition.

このような構成にすることで、ユーザから次に閲覧される文書等を予測しながらコンテンツを提供することができる。 With such a configuration, it is possible to provide content while predicting a document or the like that will be browsed next by the user.
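
The three-way classification above can be sketched as follows. This is a minimal illustration, assuming that "next" means a progress number one larger than the target's and "previous" means one smaller; the names `Priority` and `classify` are hypothetical and do not appear in the publication.

```python
from enum import Enum


class Priority(Enum):
    FIRST = 1   # candidate document: progress number is target + 1
    SECOND = 2  # read document: progress number is target - 1
    THIRD = 3   # unread document: any other progress number


def classify(target_no: int, progress_numbers: list[int]) -> dict[int, Priority]:
    """Assign a priority condition to every document except the target itself."""
    result = {}
    for no in progress_numbers:
        if no == target_no:
            continue  # the document currently under the line of sight
        if no == target_no + 1:
            result[no] = Priority.FIRST
        elif no == target_no - 1:
            result[no] = Priority.SECOND
        else:
            result[no] = Priority.THIRD
    return result
```

For a page with progress numbers 1 to 5 and the line of sight on document 3, `classify(3, [1, 2, 3, 4, 5])` marks document 4 as first priority, document 2 as second, and documents 1 and 5 as third.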

本発明の好ましい形態では、前記候補文書における第１優先条件には、前記既読文書における第２優先条件よりも短い待機時間が設定され、
前記未読文書における第３優先条件には、前記既読文書における第２優先条件よりも長い待機時間が設定される。
In a preferred form of the present invention, the first priority condition for the candidate document is set to a waiting time shorter than that of the second priority condition for the read document,
and the third priority condition for the unread document is set to a waiting time longer than that of the second priority condition for the read document.

このような構成にすることで、より適切に音声が対応付けられたコンテンツを提供することができる。 With such a configuration, it is possible to provide content in which audio is associated more appropriately.
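
One way to read the ordering of waiting times is as a dwell threshold per priority condition. The sketch below assumes that audio starts once the line of sight has stayed on a document for at least that document's waiting time. The concrete values in `WAIT_SECONDS` are invented for illustration; the publication fixes only the ordering first < second < third.

```python
# Hypothetical waiting times in seconds. Only the ordering
# first < second < third comes from the text; the values are assumptions.
WAIT_SECONDS = {"first": 0.2, "second": 0.6, "third": 1.2}


def should_play(priority: str, dwell_seconds: float) -> bool:
    """Return True once the gaze has stayed for at least the waiting time."""
    return dwell_seconds >= WAIT_SECONDS[priority]
```

With these values, a 0.3 second glance triggers the expected next line but not a line far ahead in the reading order, which is one way to keep a fast-moving gaze from starting overlapping audio.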

本発明の好ましい形態では、前記文書は、前記進行番号が対応付けられるメイン文書と、それ以外のサブ文書と、を有し、
前記記憶部は、前記進行番号に基づく前記サブ文書のサブ音声再生条件を格納し、
前記音声提供部は、前記視線情報と、前記サブ音声再生条件と、に基づいて、前記サブ文書の音声を前記端末へ送信する。
In a preferred form of the present invention, the document includes a main document associated with the progress number and other sub-documents,
the storage unit stores a sub-audio playback condition for the sub-document based on the progress number,
and the audio providing unit transmits the audio of the sub-document to the terminal based on the line-of-sight information and the sub-audio playback condition.

このような構成にすることで、コンテンツの提供において、より効果的な演出を実現することができる。 By adopting such a configuration, it is possible to realize a more effective effect in providing content.
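
The text states only that the sub-audio playback condition is based on the progress number, without giving its form. As a purely illustrative sketch, the code below assumes one possible rule: a sub-document (for example a sound effect or background remark) becomes playable once the main document with a given progress number has been read. `SubAudioCondition` and `sub_audio_allowed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SubAudioCondition:
    # Assumed rule: the sub-document is playable once the main document
    # with this progress number has been read.
    required_progress_no: int


def sub_audio_allowed(cond: SubAudioCondition,
                      last_read_progress_no: int,
                      gaze_on_sub: bool) -> bool:
    """Allow sub-document audio only when it is looked at and reading has progressed far enough."""
    return gaze_on_sub and last_read_progress_no >= cond.required_progress_no
```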

本発明の好ましい形態では、前記文書情報には、前記コンテンツの表示範囲における文書の位置を示す位置情報と、前記文書の表示範囲を示す領域情報と、が対応付けられ、
前記視線検出部は、前記位置情報及び／又は領域情報を利用して、前記視線情報を検出する。 In a preferred embodiment of the present invention, the document information is associated with position information indicating the position of the document in the display range of the content and area information indicating the display range of the document,
The line-of-sight detection unit detects the line-of-sight information using the position information and/or the area information.

このような構成にすることで、精度良くユーザの視線を検出することができる。 With such a configuration, it is possible to detect the line of sight of the user with high accuracy.
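
Using position information and area information to resolve a line-of-sight position to a document amounts to a hit test. The sketch below assumes axis-aligned rectangular regions in the coordinate system of the content display range; the class `DocumentRegion` and the function `document_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentRegion:
    doc_id: str
    x: float       # position information: top-left corner in the content display range
    y: float
    width: float   # area information: extent of the document's display range
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


def document_at(gaze_x: float, gaze_y: float,
                regions: list[DocumentRegion]) -> Optional[str]:
    """Return the id of the first document whose region contains the gaze point."""
    for region in regions:
        if region.contains(gaze_x, gaze_y):
            return region.doc_id
    return None
```

Speech balloons in comics are rarely rectangular, so a real implementation might use polygons or padded regions; rectangles are used here only to keep the example short.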

本発明の好ましい形態では、前記コンテンツは電子漫画コンテンツであり、
前記文書は、電子漫画の登場者における台詞を含み、
前記音声情報は、前記電子漫画の登場者における台詞音声を含む。
In a preferred form of the present invention, the content is electronic comic content,
the document includes lines spoken by characters appearing in the electronic comic,
and the audio information includes dialogue audio for the characters appearing in the electronic comic.

このような構成にすることで、電子漫画コンテンツにおいて適切に台詞の音声を提供することができる。 By adopting such a configuration, it is possible to appropriately provide the voice of the lines in the electronic comic content.

本発明の好ましい形態では、前記コンテンツ提供システムは、更に音声選択部を備え、
前記記憶部は、複数の前記台詞音声を格納し、
前記音声選択部は、前記ユーザからの入力に基づいて、同一の台詞に対し、複数の台詞音声の中から１の台詞音声を選択可能である。
In a preferred form of the present invention, the content providing system further includes an audio selection unit,
the storage unit stores a plurality of the dialogue audio items,
and the audio selection unit can select, based on input from the user, one dialogue audio item from among a plurality of dialogue audio items for the same line.

このような構成にすることで、ユーザの要望に応じた台詞の音声を提供することができる。 By adopting such a configuration, it is possible to provide the voice of the dialogue according to the user's request.
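
Selecting one recording among several for the same line can be sketched as a lookup keyed by the user's choice. The mapping from a voice name to an audio file and the fallback to a default voice are assumptions made for this example, as is the function name `select_voice`.

```python
def select_voice(voices: dict[str, str], requested: str, default: str) -> str:
    """Return the audio for the requested voice, or the default voice if it is unavailable."""
    if requested in voices:
        return voices[requested]
    return voices[default]
```

For example, with `voices = {"actor_a": "line12_a.ogg", "actor_b": "line12_b.ogg"}`, a request for `"actor_b"` yields that actor's recording, and a request for an unknown voice falls back to the default.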

本発明によれば、視線情報による所定の優先条件を利用することで、コンテンツ提供システムに係る新規な技術を提供することができる。 According to the present invention, it is possible to provide a novel technology related to a content providing system by using a predetermined priority condition based on line-of-sight information.

本発明の一実施形態に係るシステム構成のブロック図を示す。 FIG. 1 shows a block diagram of a system configuration according to an embodiment of the present invention.
本発明の一実施形態に係る情報処理装置及び端末のハードウェア構成の一例の概略図を示す。 FIG. 2 shows a schematic diagram of an example of the hardware configuration of an information processing device and a terminal according to an embodiment of the present invention.
本発明の一実施形態に係るコンテンツ提供システムの概略イメージ図を示す。 FIG. 3 shows a schematic image diagram of a content providing system according to an embodiment of the present invention.
本発明の一実施形態に係るコンテンツ提供システムの処理手順のフローチャートを示す。 FIG. 4 shows a flowchart of the processing procedure of the content providing system according to an embodiment of the present invention.
本発明の一実施形態に係るコンテンツ提供システムで利用される各情報の一例を示す。 FIG. 5 shows an example of the information used in the content providing system according to an embodiment of the present invention.
本発明の一実施形態に係る表示処理結果の概略イメージ図を示す。 FIG. 6 shows a schematic image diagram of a display processing result according to an embodiment of the present invention.
本発明の一実施形態に係るコンテンツ提供システムで利用される各情報の一例を示す。 FIG. 7 shows an example of the information used in the content providing system according to an embodiment of the present invention.
本発明の一実施形態に係る優先条件の概略イメージ図を示す。 FIG. 8 shows a schematic image diagram of a priority condition according to an embodiment of the present invention.
本発明の一実施形態に係る優先条件の概略イメージ図を示す。 FIG. 9 shows a schematic image diagram of a priority condition according to an embodiment of the present invention.
本発明の一実施形態に係る優先条件の概略イメージ図を示す。 FIG. 10 shows a schematic image diagram of a priority condition according to an embodiment of the present invention.
本実施形態に係るコンテンツ提供システムの詳細な処理フローを示す。 FIG. 11 shows a detailed processing flow of the content providing system according to the present embodiment.
本発明の一実施形態に係るサブ文書を説明するための概略イメージ図を示す。 FIG. 12 shows a schematic image diagram for explaining a sub-document according to an embodiment of the present invention.

以下、添付図面を参照して、更に詳細に説明する。図面には好ましい実施形態が示されている。しかし、多くの異なる形態で実施されることが可能であり、本明細書に記載される実施形態に限定されない。 A more detailed description is given below with reference to the accompanying drawings. Preferred embodiments are shown in the drawings. It may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.

例えば、本実施形態ではコンテンツ提供システムの構成、動作等について説明するが、実行される方法（ステップ）、装置、コンピュータプログラム等によっても、同様の作用効果を奏することができる。本実施形態におけるプログラムは、コンピュータが読み取り可能な非一過性の記録媒体として提供されてもよいし、外部のサーバからダウンロード可能に提供されてもよい。 For example, although the configuration, operation, etc. of the content providing system will be described in the present embodiment, similar effects can be achieved by methods (steps), devices, computer programs, etc. that are executed. The program in this embodiment may be provided as a computer-readable non-transitory recording medium, or may be provided downloadably from an external server.

また、本実施形態において「部」とは、例えば、広義の回路によって実施されるハードウェア資源と、これらハードウェア資源によって具体的に実現され得るソフトウェアの情報処理とを合わせたものも含み得る。 Further, in the present embodiment, the term "unit" may include, for example, a combination of hardware resources implemented by circuits in a broad sense and software information processing that can be specifically realized by these hardware resources.

本実施形態において「情報」とは、例えば電圧・電流を表す信号値の物理的な値、０又は１で構成される２進数のビット集合体としての信号値の高低、又は量子的な重ね合わせ（いわゆる量子ビット）によって表され、広義の回路上で通信・演算が実行され得る。 In this embodiment, "information" is represented by, for example, the physical value of a signal representing voltage or current, the high or low level of a signal value as a collection of binary bits consisting of 0s and 1s, or quantum superposition (so-called qubits), and can be communicated and computed on a circuit in the broad sense.

広義の回路とは、回路（Circuit）、回路類（Circuitry）、プロセッサ（Processor）及びメモリ（Memory）等を適宜組み合わせることによって実現される回路である。即ち、ＣＰＵ（Central Processing Unit）、ＧＰＵ（Graphics Processing Unit）、ＬＳＩ（Large Scale Integration）、ＡＳＩＣ（Application Specific Integrated Circuit）、ＦＰＧＡ（Field-Programmable Gate Array）等を含むものである。 A circuit in the broad sense is a circuit realized by appropriately combining circuits, circuitry, processors, memories, and the like. That is, it includes a CPU (Central Processing Unit), GPU (Graphics Processing Unit), LSI (Large Scale Integration), ASIC (Application Specific Integrated Circuit), FPGA (Field-Programmable Gate Array), and the like.

＜システム構成＞ <System configuration>
図１は、本発明の一実施形態に係るシステム構成を示すブロック図である。図１に示すようにコンテンツ提供システム１は、情報処理装置１０及びデータベースＤＢを備える。コンテンツ提供システム１は、ネットワークＮＷを介して複数の端末２（図１では符号２（ａ）～２（ｃ））、及び複数の音声提供サーバ３（図１では符号３（ａ）～３（ｃ））と通信可能に構成されている。 FIG. 1 is a block diagram showing the system configuration according to one embodiment of the present invention. As shown in FIG. 1, the content providing system 1 includes an information processing device 10 and a database DB. The content providing system 1 is configured to communicate, via a network NW, with a plurality of terminals 2 (reference numerals 2(a) to 2(c) in FIG. 1) and a plurality of audio providing servers 3 (reference numerals 3(a) to 3(c) in FIG. 1).

情報処理装置１０は、サーバとして動作し、端末２はユーザが漫画などのコンテンツを閲覧等するための端末であり、音声提供サーバ３はコンテンツに含まれる文書に対応する音声を提供するためのサーバである。なお、音声提供サーバ３は本発明を実現するために必ずしも必要な構成ではなく、音声提供サーバ３の機能を情報処理装置１０に含める構成にすることもできる。 The information processing device 10 operates as a server, the terminal 2 is a terminal with which a user browses content such as comics, and the audio providing server 3 is a server that provides audio corresponding to the documents included in the content. Note that the audio providing server 3 is not essential for realizing the present invention, and its functions may instead be included in the information processing device 10.

ネットワークＮＷは、本実施形態では、ＩＰ（Internet Protocol）ネットワークであるが、通信プロトコルの種類に制限はなく、更に、ネットワークの種類、規模にも制限はない。 Although the network NW is an IP (Internet Protocol) network in this embodiment, there is no limit to the type of communication protocol, and there is no limit to the type and scale of the network.

なお、情報処理装置１０として、汎用のサーバ向けのコンピュータやパーソナルコンピュータ等を利用することが可能である。また、後述の機能構成要素を複数のコンピュータに実現させ、コンテンツ提供システム１を構成することも可能である。 As the information processing apparatus 10, it is possible to use a general-purpose server computer, a personal computer, or the like. Moreover, it is also possible to configure the content providing system 1 by causing a plurality of computers to implement the functional components described later.

端末２として、スマートフォンやタブレット端末、パーソナルコンピュータ、ウェアラブルデバイス等を利用することができる。端末２は、ユーザ用のコンテンツ提供アプリプログラムを記憶し、このアプリプログラムはユーザにおいて当該コンテンツ提供システム１から提供されるコンテンツを閲覧し、あるいはユーザがコンテンツを閲覧するために必要な情報を情報処理装置１０に送信するための機能を有して構成される。 A smartphone, a tablet terminal, a personal computer, a wearable device, or the like can be used as the terminal 2. The terminal 2 stores a content providing application program for the user. This application program has functions that let the user browse the content provided by the content providing system 1 and that transmit to the information processing device 10 the information necessary for the user to browse the content.

なお、端末２はユーザ用のコンテンツ提供アプリプログラムを有していない構成にすることもできる。この場合、端末２はウェブブラウザ等を利用して各情報を閲覧等することができる。 It should be noted that the terminal 2 may be configured so as not to have a content providing application program for the user. In this case, the terminal 2 can browse each information using a web browser or the like.

音声提供サーバ３は、コンテンツに含まれる文書に対応する音声（音声データ）を格納する。図１に示すように、複数の音声提供サーバ３がネットワークＮＷを介して情報処理装置１０と通信接続されている。例えば漫画などのコンテンツにおいて登場人物ごとの声優が所属する会社が異なる場合、情報処理装置１０は複数の音声提供サーバ３からそれぞれの音声データの提供を受けることができる。 The audio providing server 3 stores audio (audio data) corresponding to the document included in the content. As shown in FIG. 1, a plurality of sound providing servers 3 are communicatively connected to an information processing device 10 via a network NW. For example, in a content such as comics, when voice actors of different characters belong to different companies, the information processing apparatus 10 can receive voice data from a plurality of voice providing servers 3 .

＜ハードウェア構成＞ <Hardware configuration>
図２（ａ）は、情報処理装置１０のハードウェア構成の一例を示す図である。情報処理装置１０は、ハードウェア構成として、制御部１１と、記憶部１２と、通信部１３と、を備える。 FIG. 2(a) is a diagram showing an example of the hardware configuration of the information processing device 10. The information processing device 10 includes a control unit 11, a storage unit 12, and a communication unit 13 as its hardware configuration.

制御部１１は、ＣＰＵ等の１又は２以上のプロセッサを含み、本発明に係るコンテンツ提供プログラム、ＯＳ、その他のアプリケーションを実行することで、情報処理装置１０の動作処理全体を制御する。 The control unit 11 includes one or more processors such as a CPU, and controls the overall operation processing of the information processing apparatus 10 by executing the content providing program, OS, and other applications according to the present invention.

記憶部１２は、ＨＤＤ、ＳＳＤ、ＲＯＭ、ＲＡＭ等であって、本発明に係るコンテンツ提供プログラム及び、制御部１１がプログラムに基づき処理を実行する際に利用するデータ等を記憶する。制御部１１が、記憶部１２に記憶されているコンテンツ提供プログラムに基づき、処理を実行することによって、後述する機能構成が実現される。 The storage unit 12 is an HDD, SSD, ROM, RAM, or the like, and stores a content providing program according to the present invention, data used when the control unit 11 executes processing based on the program, and the like. The control unit 11 executes processing based on the content providing program stored in the storage unit 12, thereby realizing the functional configuration described later.

通信部１３は、ネットワークＮＷとの通信制御を実行して、情報処理装置１０を動作させるために必要な入力や、動作結果に係る出力を行う。 The communication unit 13 controls communication with the network NW, handling the input needed to operate the information processing device 10 and the output of its operation results.

図２（ｂ）は端末９０（図１における端末２）のハードウェア構成の一例を示す図である。端末９０は、ハードウェア構成として、制御部９１と、記憶部９２と、通信部９３と、入力部９４と、出力部９５と、を備える。 FIG. 2B is a diagram showing an example of the hardware configuration of the terminal 90 (terminal 2 in FIG. 1). The terminal 90 includes a control unit 91, a storage unit 92, a communication unit 93, an input unit 94, and an output unit 95 as a hardware configuration.

端末９０の制御部９１は、ＣＰＵ等の１以上のプロセッサを含み、端末９０の動作処理全体を制御する。端末９０の記憶部９２は、ＨＤＤ、ＳＳＤ、ＲＯＭ、ＲＡＭ等であって、上述のコンテンツ提供アプリケーション並びに、制御部９１がプログラムに基づき処理を実行する際に利用するデータ等を記憶する。 The control unit 91 of the terminal 90 includes one or more processors such as a CPU, and controls the overall operation processing of the terminal 90. The storage unit 92 of the terminal 90 is an HDD, SSD, ROM, RAM, or the like, and stores the above-described content providing application and the data used when the control unit 91 executes processing based on a program.

端末９０の通信部９３は、ネットワークとの通信を制御する。端末９０の入力部９４は、タッチパネル、マウス及びキーボード等であって、ユーザによる操作要求を制御部９１に入力する。端末９０の出力部９５は、ディスプレイ等であって、制御部９１の処理の結果等を表示する。 The communication unit 93 of the terminal 90 controls communication with the network. The input unit 94 of the terminal 90 is a touch panel, a mouse, a keyboard, or the like, and inputs operation requests from the user to the control unit 91. The output unit 95 of the terminal 90 is a display or the like, and displays the results of processing by the control unit 91.

＜機能構成＞ <Functional configuration>
図２（ａ）に示すように、情報処理装置１０は、機能構成として、登録部１０１、表示処理部１０２、視線検出部１０３、決定処理部１０４、音声提供部１０５、音声選択部１０６、を備える。これらは、ソフトウェア（記憶部１２に記憶されている）による情報処理が、ハードウェア（制御部１１等）によって具体的に実現されたものである。 As shown in FIG. 2(a), the information processing device 10 includes, as its functional configuration, a registration unit 101, a display processing unit 102, a line-of-sight detection unit 103, a determination processing unit 104, an audio providing unit 105, and an audio selection unit 106. These are realized concretely when the hardware (the control unit 11 and the like) carries out information processing defined by software stored in the storage unit 12.

登録部１０１は、ユーザにコンテンツを提供するために必要な各種情報を登録する。本実施形態における登録部１０１は、文書情報やコンテンツ情報を登録する。例えば登録部１０１は、音声提供サーバ３から音声情報を取得し、記憶部１２へ登録（格納）することができる。 The registration unit 101 registers the various types of information necessary to provide content to users. The registration unit 101 in this embodiment registers document information and content information. For example, the registration unit 101 can acquire audio information from the audio providing server 3 and register (store) it in the storage unit 12.

表示処理部１０２は、ユーザからの要求に基づいてコンテンツを表示処理する。本実施形態では、文書を含んだコンテンツを表示処理し、その表示処理結果をユーザが使用する端末２へ送信する。 The display processing unit 102 displays content based on a request from the user. In this embodiment, the content including the document is displayed, and the display processing result is transmitted to the terminal 2 used by the user.

例えば表示処理部１０２は、後述する視線情報に基づいて（視線が対象領域に位置する場合に）対象領域のコンテンツ画像を動作させて表示処理することができる。具体的に表示処理部１０２は、ユーザの視線が対象領域に位置する場合、電子漫画などにおける登場人物の顔の表情、目や口などを動かすアニメーション表示をし、対象領域から視線が外れた後は静止画像として表示することができる。 For example, based on the line-of-sight information described later, the display processing unit 102 can animate the content image in a target area when the line of sight is positioned in that area. Specifically, while the user's line of sight is positioned in the target area, the display processing unit 102 can display an animation that moves a character's facial expression, eyes, mouth, and the like in an electronic comic, and after the line of sight leaves the target area it can display the image as a still image.

同様に表示処理部１０２は、視線情報に基づいて対象領域のコンテンツ画像の色を変更することができる。例えば表示処理部１０２は、ユーザの視線が対象領域に位置する場合、電子漫画などにおける対象となるコマについてカラー表示を行い、対象領域から視線が外れた後は白黒表示を行うことができる。その他、ＶＲ機器を利用して仮想空間上にコンテンツを表示処理することもできる。これらの表示処理を行うことで本コンテンツ提供システムは、音声と共にさらなる演出効果を提供することができる。 Similarly, the display processing unit 102 can change the color of the content image of the target area based on the line-of-sight information. For example, when the user's line of sight is positioned in the target area, the display processing unit 102 can display a target frame in an electronic comic in color, and after the user's line of sight moves away from the target area, the display processing unit 102 can perform black-and-white display. In addition, content can be displayed in a virtual space using a VR device. By performing these display processes, the content providing system can provide further effects along with the sound.

視線検出部１０３は、ユーザから端末２へ向けられる視線を検出する。本実施形態における視線検出部１０３は、ユーザからコンテンツに含まれる文書へ向けられる視線の位置、及びその位置における視線の滞在時間、を含んだ視線情報を検出する。 The line-of-sight detection unit 103 detects the line of sight directed toward the terminal 2 from the user. The line-of-sight detection unit 103 in this embodiment detects the line-of-sight information including the position of the line of sight directed from the user to the document included in the content and the stay time of the line of sight at that position.

また、視線検出部１０３は、コンテンツの表示範囲における文書の位置を示す位置情報及び／又は文書の表示範囲を示す領域情報を利用して、視線情報を検出することができる。 Further, the line-of-sight detection unit 103 can detect line-of-sight information using position information indicating the position of the document in the display range of the content and/or area information indicating the display range of the document.

この視線の検出は、端末２に設けられるカメラ（撮像部）によってユーザの動作（顔の動きや目線の動き）を取得し、当該ユーザの動作から視線情報を検出することができる。また、例えば情報処理装置１０（表示処理部１０２）は、コンテンツに含まれる文書の閲覧に影響しない透明レイヤーを当該コンテンツの上に重ねて表示処理し、その透明レイヤーにユーザの視線が位置する場合に視線を検出することもできる。この透明レイヤーはユーザの視線に反応するボタンとして機能する。このように本発明を実現するための視線の検出には、一般的なアイトラッキング技術（視線計測の技術）を利用することができる。 For line-of-sight detection, a camera (imaging unit) provided in the terminal 2 can capture the user's motion (movement of the face and of the eyes), and the line-of-sight information can be detected from that motion. Alternatively, for example, the information processing device 10 (display processing unit 102) can superimpose on the content a transparent layer that does not interfere with reading the documents in the content, and detect the line of sight when it is positioned on that transparent layer. The transparent layer functions as a button that responds to the user's gaze. In this way, general eye tracking technology (line-of-sight measurement technology) can be used for the line-of-sight detection that realizes the present invention.
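
Whichever eye tracking method supplies the raw samples, the line-of-sight information described here pairs a position with a stay time. The sketch below shows one way to accumulate stay time from a stream of timestamped samples, assuming each sample has already been resolved to a document id (or `None` when the gaze is on no document). `DwellTracker` is a hypothetical helper, not part of the publication or of any eye tracking library.

```python
class DwellTracker:
    """Track how long the gaze has stayed on the current document."""

    def __init__(self):
        self.current_doc = None
        self.entered_at = None

    def update(self, doc_id, timestamp):
        """Feed one gaze sample; return (doc_id, stay time in seconds)."""
        if doc_id != self.current_doc:
            # The gaze moved to a different document: restart the timer.
            self.current_doc = doc_id
            self.entered_at = timestamp
        return self.current_doc, timestamp - self.entered_at
```

Real gaze data is noisy, so a practical version would likely tolerate brief excursions before resetting the timer; that smoothing is omitted here.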

決定処理部１０４は、文書に対応する音声を再生するための優先条件を決定する。本実施形態における決定処理部１０４は、経時的に変化する視線情報に基づいて、視線が検出されてから音声が再生されるまでの待機時間を含んだ優先条件を文書ごとに決定する。決定処理部１０４は、視線情報に基づいて、文書ごとに優先条件を判定処理し、その判定処理結果により文書ごとの優先条件を決定することができる。 The determination processing unit 104 determines the priority conditions for playing the audio corresponding to each document. The determination processing unit 104 in this embodiment determines, for each document, a priority condition including a waiting time from when the line of sight is detected until the audio is played, based on the line-of-sight information that changes over time. The determination processing unit 104 can evaluate the priority condition for each document based on the line-of-sight information and determine the priority condition for each document from the result of that evaluation.

また、決定処理部１０４は、コンテンツに含まれる文書において、現時点においてユーザの視線が位置する対象文書の進行番号（ユーザが文書を読み進める順番を示す進行番号）に対して該進行番号が次の（１つ大きい値の）候補文書（次に読まれることが想定される文書）における優先条件を第１優先条件、対象文書の進行番号に対して該進行番号が前の（１つ小さい値）の既読文書（１つ前に読まれた文書）における優先条件を第２優先条件、それ以外の進行番号が付される未読文書における優先条件を第３優先条件、として決定することもできる。 In addition, among the documents included in the content, the determination processing unit 104 can take the progress number of the target document on which the user's line of sight is currently positioned (the progress number indicating the order in which the user reads through the documents) and determine the priority condition for the candidate document whose progress number is the next one (one larger), that is, the document expected to be read next, as the first priority condition; the priority condition for the read document whose progress number is the previous one (one smaller), that is, the document read immediately before, as the second priority condition; and the priority condition for unread documents with any other progress number as the third priority condition.

The voice providing unit 105 transmits the audio corresponding to a document to the terminal 2 or the like. In this embodiment, the voice providing unit 105 transmits the audio corresponding to a document to the terminal 2 based on the line-of-sight information, the priority condition, and the audio information.

The voice providing unit 105 can also transmit the audio of a sub-document, that is, a document to which no progress number is assigned, to the terminal 2 based on the line-of-sight information and the sub-audio reproduction condition for that sub-document.

The voice selection unit 106 selects the audio corresponding to a document. In this embodiment, the voice selection unit 106 can be configured so that, based on input from the user, one dialogue recording can be selected from among multiple dialogue recordings for the same document (for example, a line of dialogue in a comic).

For example, when the storage unit 12 stores audio data recorded by multiple voice actors for the same line of dialogue, the voice selection unit 106 can select one of those recordings based on the user's input (request).
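The selection among recordings by different voice actors might look like the following minimal Python sketch. The `RECORDINGS` table, the ID strings, and the fallback to the first registered recording are illustrative assumptions, not details given in the specification.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the storage unit 12:
# line ID -> {voice actor name -> audio ID}
RECORDINGS = {
    "line-001": {"Actor A": "audio-101", "Actor B": "audio-102"},
}


def select_audio(line_id: str, requested_actor: Optional[str] = None) -> str:
    """Return the audio ID for a line, honoring the user's requested actor."""
    by_actor = RECORDINGS[line_id]
    if requested_actor in by_actor:
        return by_actor[requested_actor]
    # Assumed fallback: use the first registered recording when the user
    # made no request or requested an actor with no recording for this line.
    return next(iter(by_actor.values()))
```

Keying the table by line and then by actor keeps the user's choice independent of how many recordings exist for any one line.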

<Database DB>
The database DB in FIG. 1 stores document information on the documents included in content, audio information on the audio corresponding to those documents, content information, priority condition information, and other information needed to view the content. Some or all of this information may be stored in the storage unit 12 or the like, and part of it may be stored in a separate database or the like.

The content providing system 1 and the processing performed by each functional component are described below with reference to FIGS. 3 to 11.

<Outline of the content providing system>
The content providing system 1 according to this embodiment provides content with audio in response to a request from the user. Specifically, it detects the user's line of sight directed at the terminal 2 on which the content is displayed and, based on that line of sight, provides the audio corresponding to the lines of dialogue together with the content, such as a comic.

FIG. 3 is a schematic diagram of a content providing system according to one embodiment of the present invention. As shown in the figure, the information processing apparatus 10 detects, as line-of-sight information, the position and other attributes of the user's line of sight directed at the content displayed on the terminal 2 (for example, documents such as lines of dialogue when the content is a comic) (FIG. 3(a)).

The information processing apparatus 10 then uses the line-of-sight information to transmit to the terminal 2 audio data based on characteristic predetermined conditions (FIG. 3(b)). The terminal 2 reproduces (outputs) the audio corresponding to the document in the content at which the user is looking (FIG. 3(c)).

By repeating the series of processes (a) to (c) in FIG. 3, the user can enjoy ordinary electronic comic content with an added presentation effect, that is, electronic comic content with audio.

The content providing system 1 according to this embodiment performs its processing broadly as outlined above. The specific processing procedure of the content providing system 1 is described in detail below.

<Registration of various information>
FIG. 4 is a flowchart showing the processing procedure of the content providing system according to one embodiment. In S201, the registration unit 101 of the information processing apparatus 10 registers content information. Document information and audio information are associated with the content information.

As shown in FIG. 5(a), the document information includes the text of each document in the content, position information indicating the document's position within the display range of the content, area information indicating the document's display range, the document type (such as main document or sub-document), the audio ID of the audio corresponding to the document, and a progress number indicating the order in which the documents are read, and is managed by a user ID. In FIG. 5(a), for example, the documents are read in the order of progress numbers (1), (2), (3), and (4).
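As an illustration only, the document information of FIG. 5(a) could be modeled as a small record type. The field names, the tuple layouts, and the use of `None` as the progress number of a sub-document are assumptions made for this sketch; the specification lists the items but not a concrete schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DocumentInfo:
    text: str
    position: Tuple[int, int]        # (x, y) within the content's display range
    area: Tuple[int, int]            # (width, height) of the document's display range
    doc_type: str                    # "main" or "sub"
    audio_id: str                    # ID of the audio corresponding to the document
    progress_number: Optional[int]   # reading order; None for sub-documents (assumed)


def reading_order(documents: List[DocumentInfo]) -> List[DocumentInfo]:
    """Return the documents that carry a progress number, in reading order."""
    mains = [d for d in documents if d.progress_number is not None]
    return sorted(mains, key=lambda d: d.progress_number)
```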

As shown in FIG. 5(b), the audio information includes the audio data and the name of the voice actor responsible for each piece of audio data, and is managed by an audio ID. As shown in FIG. 5(c), the content information includes image data, document IDs, audio IDs, and the like, and is managed by a content ID.

The registration unit 101 can also acquire audio data and the like from the voice providing server 3 in FIG. 1 and store (register) it in the storage unit 12 as audio information. The registration unit 101 can likewise acquire audio data from the user's terminal 2 as audio information and store (register) it in the storage unit 12.

For example, the registration unit 101 can acquire audio data recorded by multiple voice actors from the voice providing servers 3(a) to 3(c) with which the respective voice actors are affiliated, and register audio data by different voice actors for the same line of dialogue. In this case, as described in detail later, audio data by multiple voice actors can be selectively associated with the same line or character.

In addition, as shown in FIG. 5(d), the registration unit 101 registers priority condition information indicating the priorities and conditions for reproducing audio. The priority condition information includes the document type (main document, sub-document, and so on, described later), items such as the first priority condition and the second priority condition, the waiting time from when the user's line of sight is detected to when the audio is reproduced, and other audio reproduction conditions, and is managed by a priority condition ID.

In this embodiment the image data and the documents are registered separately, but the image data can also be registered with the characters and lines of dialogue of the comic included in it. In that case, the document information contains no text but does contain information such as the position and extent of each line of dialogue within the image data, together with information on the button (the transparent layer described above) used to detect line-of-sight information when the user looks at that document.

Users of the content providing system can also be required to register in advance. In that case, only registered users can use the content providing system.

<Display processing and line-of-sight detection>
In S202, the display processing unit 102 of the information processing apparatus 10 performs display processing on content such as a comic based on a request from the user, and transmits the display processing result to the terminal 2. FIG. 6 shows an example of a display processing result according to one embodiment of the present invention. As shown in the figure, the content image W10 displayed on the terminal 2 in this embodiment is an electronic comic or the like.

In the content image W10 in FIG. 6, person 1 and person 2 are displayed as characters, together with the lines of dialogue for each character. Here, the lines of dialogue are the documents (document information) of FIG. 5(a), and person 1 and person 2 (and the background image and the like, not shown) are the image data of FIG. 5(c). As noted above, the documents and the characters can also be combined and displayed as image data. The characters are not limited to people.

In FIG. 6, for example, the lines of dialogue are given progress numbers (1) to (7). A progress number indicates the order in which the user reads the documents. FIG. 6 shows (1) to (7) in the content image W10 to illustrate the progress numbers, but they need not be displayed in actual display processing.

In S203, the line-of-sight detection unit 103 of the information processing apparatus 10 detects the user's line of sight. Specifically, the line-of-sight detection unit 103 detects line-of-sight information including the position of the user's line of sight directed at a document in FIG. 6 and the dwell time of the line of sight at that position.

As described above, the line-of-sight detection unit 103 can detect the line of sight by capturing the user's motion (movement of the face and eyes) with the camera provided in the terminal 2 and deriving line-of-sight information from that motion. Alternatively, the display processing unit 102 can superimpose on the content a transparent layer that does not interfere with reading the documents in the content, and the line-of-sight detection unit 103 can detect the line of sight when the user's gaze falls on that layer.

As shown in FIG. 7(a), the line-of-sight information includes position information on the position of the user's line of sight directed at a document, the dwell time at that position, the detection time at which the line of sight was detected, and the like, and is managed by a line-of-sight information ID.

For the position information in the line-of-sight information, coordinates indicating the position of the document relative to the overall display of the content can be used, for example. For the area information, the bounding box technique (a rectangular boundary enclosing a character string or similar) can be used. The priority conditions for audio reproduction described later are determined based on this line-of-sight information.
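A bounding-box test of this kind can be sketched as follows. The `BoundingBox` type, the coordinate convention (origin at the top left of the content, units in pixels), and the `document_at` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BoundingBox:
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        """True when the gaze point (gx, gy) lies inside this rectangle."""
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


def document_at(gx: float, gy: float,
                boxes: Dict[int, BoundingBox]) -> Optional[int]:
    """Return the progress number of the document under the gaze point, if any."""
    for number, box in boxes.items():
        if box.contains(gx, gy):
            return number
    return None
```

A gaze point that falls between two balloons maps to no document, so no waiting timer would start for it.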

<Priority conditions>
In S204, the determination processing unit 104 of the information processing apparatus 10 determines the priority condition for each document. The determination processing unit 104 determines, for each document, a priority condition that includes a waiting time from when the line of sight is detected to when the audio is reproduced, based on line-of-sight information that changes over time. In this embodiment, the audio corresponding to a document is reproduced based on this priority condition.

The determination processing unit 104 classifies documents relative to the target document, that is, the document on which the user's line of sight currently rests, using progress numbers (numbers indicating the order in which the user reads the documents). The candidate document, whose progress number is the next one (one greater) and which is expected to be read next, receives the first priority condition. The read document, whose progress number is the previous one (one smaller) and which was read immediately before, receives the second priority condition. The remaining unread documents that carry progress numbers receive the third priority condition.

The ordering of the progress numbers is not particularly limited. For example, the progress numbers can be assigned in reverse, so that instead of a higher-numbered document being read after a lower-numbered one, a lower-numbered document is read after a higher-numbered one.

The priority condition is determined for each document based on line-of-sight information that changes over time, and is managed as audio reproduction information. As shown in FIG. 7(b), for example, the audio reproduction information includes a line-of-sight ID, an audio ID, a content ID, a progress number, a priority condition ID, an item such as first priority, a waiting time, other audio reproduction conditions, and the like. The audio reproduction information in FIG. 7(b) corresponds to the content image W10 in FIG. 8 described later (the entry with line-of-sight ID 40002 is currently being reproduced).

Because the user's line of sight changes over time and the priority condition for each document changes within a short period, the audio reproduction information does not necessarily have to be managed in the storage unit 12. For example, the priority conditions can be determined or changed each time the user's line of sight changes, without being kept as stored data. Such a configuration can be expected to reduce the amount of data held in the storage unit 12.

FIG. 8 is a schematic diagram of priority conditions according to one embodiment of the present invention. To make the explanation easier to follow, FIG. 8 shows the progress numbers (1) to (7) of the documents, which are drawn as speech balloons, and colors or patterns each balloon according to its priority condition. In the following, the document (line of dialogue) with progress number (1) in FIG. 8 is called document (1), the document with progress number (2) is called document (2), and likewise for (3) to (7).

As shown in FIG. 8, for example, when the user looks at the line of dialogue of document (2), that is, when the line-of-sight detection unit 103 detects the user's line of sight, the terminal 2 reproduces the audio for that line (details are described later).

At this time, the determination processing unit 104 determines the priority condition of document (2) to be the first priority condition, which has the highest priority for audio reproduction. It is also possible not to set a priority condition for a document whose audio is being reproduced, such as document (2). For example, a document whose audio is being reproduced can also be set to change gradually from the first priority condition to the second priority condition.

As shown in FIG. 8, when the audio of document (2) is being reproduced based on the line-of-sight information, the determination processing unit 104 determines the priority condition of document (3), which comes next after document (2) (that is, which is read next in story order), to be the first priority condition, the highest priority. Because the user is normally likely to read document (3) after document (2), the determination processing unit 104 in this embodiment assigns the first priority condition to document (3).

The determination processing unit 104 further determines the priority condition of document (1), which comes immediately before document (2) and has already been read, to be the second priority condition, the second-highest priority. This allows for the possibility that the user rereads the preceding document (1).

While the audio of document (2) is being reproduced, the determination processing unit 104 determines the priority condition of the documents other than document (1) and document (3), namely documents (4), (5), (6), and (7), to be the third priority condition. This reflects the fact that documents such as document (5) are less likely to draw the user's gaze than document (1) or document (3).

In this way, the determination processing unit 104 determines, based on the line-of-sight information, a priority condition for each document according to how likely the user is to look at it. In this embodiment, each priority condition sets the waiting time from when the user's line of sight is obtained to when the audio is reproduced. In FIG. 7(b), for example, the waiting time is set to 1.0 second for the first priority condition, 3.0 seconds for the second, and 6.0 seconds for the third. These waiting times can be changed as desired based on a setting change input from the user (from the terminal 2).

For example, the waiting time for the first priority condition is preferably about 0.1 to 2.0 seconds, and more preferably about 0.5 to 1.0 seconds. This setting is used because a document with the first priority condition is likely to be read next by the user, so its audio needs to be reproduced almost immediately.

The waiting time for the second priority condition is preferably about 2.0 to 5.0 seconds, and more preferably about 3.0 to 4.0 seconds. The user may go back to reread a line, but even then the user may only glance momentarily at the document with the second priority condition without wanting its audio reproduced. The waiting time is set in this range so that audio is reproduced appropriately in such cases.

The waiting time for the third priority condition is preferably about 5.0 to 10.0 seconds, and more preferably about 6.0 to 8.0 seconds. A document with the third priority condition is basically unlikely to be read next, but the user may, for example, move the line of sight rapidly to survey the whole content, and the setting ensures that audio is handled appropriately even then. In other words, it prevents audio from being reproduced when the user moves the line of sight rapidly to survey the whole content.
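The waiting times and preferred ranges above can be summarized in a short Python sketch. The defaults are the FIG. 7(b) values and the ranges are the broader "preferable" ranges from the text; the function names and the rule that audio plays once the dwell time reaches the waiting time are assumptions made for illustration.

```python
# Waiting time in seconds per priority condition (values from FIG. 7(b)).
DEFAULT_WAIT = {1: 1.0, 2: 3.0, 3: 6.0}

# Preferred ranges in seconds per priority condition, as stated in the text.
PREFERRED_RANGE = {1: (0.1, 2.0), 2: (2.0, 5.0), 3: (5.0, 10.0)}


def in_preferred_range(priority: int, seconds: float) -> bool:
    """Check a user-supplied waiting time against the preferred range."""
    low, high = PREFERRED_RANGE[priority]
    return low <= seconds <= high


def should_play(priority: int, dwell_seconds: float, wait=DEFAULT_WAIT) -> bool:
    """Assumed rule: play once the gaze has dwelled for the waiting time."""
    return dwell_seconds >= wait[priority]
```

With these defaults, a two-second glance at a third-priority document does not trigger audio, while the same dwell on a first-priority document does.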

As shown in FIG. 9, when the user's line of sight causes the audio of document (3) to be reproduced, the determination processing unit 104 determines (or changes) document (4), whose progress number is one greater than that of document (3), to the first priority condition; document (2), whose progress number is one smaller, to the second priority condition; and the remaining documents (1), (5), (6), and (7) to the third priority condition.

As shown in FIG. 10, when the user has not yet looked at any document in the content, that is, when the page of the content has just been displayed for the first time, the determination processing unit 104 assigns the first priority condition to document (1) and the third priority condition to the remaining documents (2) to (7).
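The assignments illustrated in FIGS. 8 to 10 can be expressed as one small function. This is a sketch under stated assumptions: `assign_priorities` is a hypothetical name, the target document itself is left without a priority condition (one of the options the description allows for a document being reproduced), and the lowest progress number is treated as the first document.

```python
from typing import Dict, Iterable, Optional


def assign_priorities(progress_numbers: Iterable[int],
                      target: Optional[int]) -> Dict[int, int]:
    """Map each progress number to priority condition 1, 2, or 3.

    `target` is the progress number of the document currently being
    reproduced, or None when the user has not looked at any document yet.
    """
    numbers = sorted(progress_numbers)
    if target is None:
        # FIG. 10: only the first document gets the first priority condition.
        return {n: (1 if n == numbers[0] else 3) for n in numbers}
    priorities = {}
    for n in numbers:
        if n == target:
            continue  # assumption: no condition for the document being played
        if n == target + 1:
            priorities[n] = 1   # candidate document, expected to be read next
        elif n == target - 1:
            priorities[n] = 2   # document read immediately before
        else:
            priorities[n] = 3   # all other documents
    return priorities
```

Calling `assign_priorities(range(1, 8), 2)` corresponds to the FIG. 8 state, and passing `3` or `None` as the target corresponds to FIG. 9 and FIG. 10 respectively.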

In this embodiment, as the user looks at document (1) in FIG. 10 and reads on, the determination processing unit 104 determines or changes the priority conditions to the states shown in FIG. 8 and FIG. 9, based on the user's line-of-sight information as it changes over time.

In S205, audio based on the line-of-sight information and the priority condition is transmitted to the terminal 2 used by the user. The terminal 2 receives the audio (audio data) based on the predetermined priority condition.

In S206, the terminal 2 that received the audio data outputs the audio corresponding to the document. Specifically, because a priority condition has been determined for each document shown in FIGS. 8 to 10 as described above, the terminal 2 reproduces the audio once the waiting time for the relevant priority condition has elapsed, based on the user's line-of-sight information (position information and area information).

Setting waiting times according to the priority conditions in this way means that no audio is reproduced when, for example, the user surveys all the dialogue in the content (switching gaze position rapidly to take in the whole), while the audio of a document in the content is reproduced at the appropriate moment, when the user wants to read that line.

FIG. 11 shows the detailed processing flow of the content providing system according to this embodiment. As shown in the figure, when the user's line of sight enters a target area on the terminal 2 (S301), the information processing apparatus 10 refers to the progress log (S302) and determines (and evaluates) the priority (S303). It then determines whether the previously triggered audio is still being reproduced (S304) and, if so, stops (skips) the audio track being reproduced (S305). If not, it waits for the waiting time of the priority condition before reproducing the audio (S306).

It then determines whether the line of sight has left the target area (S307). If it has, the pending processing is canceled (S308). If it has not, the audio associated with the document is reproduced from the terminal 2 (S309), the progress log is updated (S310), and it is determined whether the line of sight has left the area (S311). This series of processes is then repeated for the next run of the audio algorithm.
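The flow of FIG. 11 might be sketched as a single handler that is called once per gaze event. Several points are assumptions for illustration: the handler receives the total dwell time rather than running a real timer, processing continues to the wait step after a playing track is stopped, priorities are derived from the last entry of the progress log, and the first document has progress number 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WAIT = {1: 1.0, 2: 3.0, 3: 6.0}  # seconds per priority condition (FIG. 7(b))


@dataclass
class PlayerState:
    progress_log: List[int] = field(default_factory=list)  # documents already played
    playing: Optional[int] = None                           # document now playing


def priority_for(doc_no: int, state: PlayerState) -> int:
    """S302-S303: consult the progress log and decide the priority."""
    if not state.progress_log:
        return 1 if doc_no == 1 else 3
    last = state.progress_log[-1]
    if doc_no == last + 1:
        return 1
    if doc_no == last - 1:
        return 2
    return 3


def handle_gaze(doc_no: int, dwell_seconds: float, state: PlayerState) -> List[str]:
    """Process one gaze event (S301) and return the actions taken."""
    actions = []
    priority = priority_for(doc_no, state)
    if state.playing is not None:                # S304: earlier audio still playing?
        actions.append(f"stop {state.playing}")  # S305: stop (skip) that track
        state.playing = None
    if dwell_seconds < WAIT[priority]:           # S306-S307: gaze left during the wait
        actions.append("cancel")                 # S308
        return actions
    actions.append(f"play {doc_no}")             # S309
    state.playing = doc_no
    state.progress_log.append(doc_no)            # S310
    return actions
```

For example, after document (1) has played, a two-second glance at document (5) is canceled because its third-priority waiting time of 6.0 seconds has not elapsed, whereas a one-second dwell on document (2) triggers playback.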

When these processes are performed in this embodiment, the line-of-sight detection unit 103 executes the processing related to S301, S302, S307, S310, and S311; the determination processing unit 104 executes the processing related to S303, S304, and S305; the voice providing unit 105 (or the terminal 2 that received the audio) executes the processing related to S306, S308, and S309; and the storage unit 12 manages the progress log and the like.

For example, the progress log and the line-of-sight information can be used to quantify which documents the user paid attention to and viewed multiple times, and the result of this analysis can be used to provide irregular content whose substance or structure changes.
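One simple way to quantify repeated viewing is to count how often each progress number appears in the progress log. The log format (a flat list of progress numbers) and the `min_views` threshold are assumptions for this sketch; the specification does not define how the analysis is performed.

```python
from collections import Counter
from typing import Dict, List


def revisited_documents(progress_log: List[int], min_views: int = 2) -> Dict[int, int]:
    """Return {progress number: view count} for documents viewed repeatedly."""
    counts = Counter(progress_log)
    return {number: count for number, count in counts.items() if count >= min_views}
```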

<Audio reproduction for sub-documents>
The documents in this embodiment are divided into two types, and audio reproduction can be managed according to these document types. Specifically, the documents in this embodiment comprise main documents, which are associated with progress numbers, and the remaining documents, called sub-documents (FIG. 5(a)). The number of document types is not limited to two; documents can also be classified into three or more types for managing audio reproduction.

FIG. 12 is a schematic diagram illustrating sub-documents according to one embodiment of the present invention. FIG. 12 shows the content image W10 of FIG. 8 with sub-documents and the like added. A sub-document is a document with little relevance to the main story of the content; in FIG. 12 it is the text "potsu-potsu" (pitter-patter), which represents the sound of rain.

The registration unit 101 of the information processing apparatus 10 can register in advance sub-audio reproduction conditions for sub-documents according to progress numbers. For example, as shown in FIG. 5(d), the sub-audio reproduction conditions can be registered as part of the priority condition information.

FIG. 12(b) shows an example of how sub-audio reproduction information is set. As shown in the figure, when the user's line of sight is within the range of progress numbers (2) to (4), that is, while the user is reading documents (2) to (4), the voice providing unit 105 can transmit to the terminal 2 the audio data for "potsu-potsu", which represents the sound of rain.

In this way, this embodiment can also reproduce audio, based on predetermined conditions, for sub-documents and the like that have little relevance to the story of the content but enhance the presentation. The sub-audio reproduction conditions can be changed as appropriate according to input from the user.

In addition, as shown in FIG. 12(b), when background music (BGM) is desired as a further presentation effect, the BGM can be set to play on condition that the line of sight is on progress numbers (1) to (7). In this case, the audio of a line of dialogue, such as that of document (2), is reproduced over the BGM.
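The range-based conditions of FIG. 12(b) can be sketched as a list of records, each tying a sub-audio track to a span of progress numbers. The `SubAudioCondition` type and the audio IDs `"rain"` and `"bgm"` are hypothetical; the ranges (2) to (4) and (1) to (7) are the ones given in the description.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SubAudioCondition:
    audio_id: str
    first: int   # lowest progress number for which the audio plays
    last: int    # highest progress number for which the audio plays


CONDITIONS = [
    SubAudioCondition("rain", 2, 4),  # rain sound while documents (2)-(4) are read
    SubAudioCondition("bgm", 1, 7),   # BGM while documents (1)-(7) are read
]


def active_sub_audio(progress_number: int,
                     conditions: Sequence[SubAudioCondition] = CONDITIONS) -> List[str]:
    """Return the IDs of the sub-audio tracks to play for the gazed document."""
    return [c.audio_id for c in conditions
            if c.first <= progress_number <= c.last]
```

Because the function returns every matching track, the rain sound and the BGM overlap while documents (2) to (4) are read, and the dialogue audio is layered on top of both.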

A sound effect can also be output (reproduced) in the scene of document (5). In this case, the sound effect and the audio of document (5) are reproduced over the BGM. For example, the sound effect (or, more generally, the audio of a sub-document) can be set to play after or before the line of dialogue of document (5) is reproduced.

The number of times the sound effect is played and its playback duration can also be set. For example, the sound effect can be set to repeat for as long as the dialogue audio of document (5) in FIG. 12 is being reproduced. The information processing apparatus 10 can receive a setting change signal from the terminal 2 based on input from the user, change these settings according to the signal, and store them in the storage unit 12.

Thus, in this embodiment, priority conditions with different waiting times are set for the main documents related to the main story of the content, while for sub-documents with low relevance to the story, flexible presentation effects according to the progression number can be realized without impairing the audio of the story.
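The following is a hedged sketch of how priority conditions with different waiting times might be assigned to main documents. It follows only the ordering stated in the claims (the candidate document waits less than a read document, and an unread document waits more). The concrete waiting times, the names `WAIT_SECONDS`, `priority_condition`, and `should_play`, and the rule that playback starts once the dwell time reaches the waiting time are assumptions made for illustration.

```python
# Hypothetical waiting times in seconds. The source fixes only their ordering
# (first < second < third), not concrete values.
WAIT_SECONDS = {"first": 0.2, "second": 1.0, "third": 2.0}


def priority_condition(progress: int, target_progress: int) -> str:
    """Classify a main document relative to the target document, i.e. the
    document on which the line of sight is currently positioned."""
    if progress == target_progress + 1:
        return "first"   # candidate document: next in the progression order
    if progress < target_progress:
        return "second"  # read document: precedes the target document
    if progress > target_progress + 1:
        return "third"   # unread document further ahead
    raise ValueError("the target document itself is not covered by this sketch")


def should_play(progress: int, target_progress: int, dwell_seconds: float) -> bool:
    """Assumed rule: play once the gaze has dwelt for the waiting time."""
    condition = priority_condition(progress, target_progress)
    return dwell_seconds >= WAIT_SECONDS[condition]
```

With the target at progression number (3), a 0.3 second dwell on document (4) would trigger playback in this sketch, while the same dwell on document (6) would not, which reduces accidental playback when the gaze skips ahead.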

As described above, the content providing system 1 according to the present invention uses line-of-sight information and predetermined priority conditions based on that line-of-sight information, and can thereby provide content with audio added more appropriately and effectively than before.

Although this embodiment has been described using electronic comic content, the same effects as those of the present invention can also be obtained when the content providing system 1 is applied to electronic books such as novels, newspapers, magazines, and learning materials, to games such as TV games, video games, and computer games, and to other content.

1 content providing system
2 terminal
3 audio providing server
10 information processing device
11 control unit
12 storage unit
13 communication unit
90 terminal (terminal 2)
91 control unit
92 storage unit
93 communication unit
94 input unit
95 output unit
101 registration unit
102 display processing unit
103 detection processing unit
104 determination processing unit
105 audio reproduction unit
106 audio selection unit
NW network
W10 content image

Claims

A content providing system for providing content,
the content providing system including a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, wherein
the storage unit stores document information about a document and audio information about audio corresponding to the document;
the display processing unit performs display processing on the content including the document, and transmits the display processing result to a terminal used by a user;
the line-of-sight detection unit detects line-of-sight information including a position of the line of sight directed by the user toward the document and a length of time the line of sight stays at that position;
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected to when the audio is played back, based on the line-of-sight information that changes over time; and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.

The content providing system according to claim 1, wherein
the document information includes a progress number indicating the order in which the user progresses through the documents; and
the determination processing unit determines, as a first priority condition, the priority condition for a candidate document whose progress number immediately follows the progress number of the target document on which the line of sight is positioned; as a second priority condition, the priority condition for a read document whose progress number precedes the progress number of the target document; and, as a third priority condition, the priority condition for an unread document to which a progress number is attached.

The content providing system according to claim 2, wherein
a waiting time shorter than that of the second priority condition for the read document is set for the first priority condition for the candidate document; and
a waiting time longer than that of the second priority condition for the read document is set for the third priority condition for the unread document.

The content providing system according to claim 2, wherein
the documents include a main document with which the progress number is associated and other sub-documents;
the storage unit stores a sub-audio playback condition for the sub-document based on the progress number; and
the audio providing unit transmits the audio of the sub-document to the terminal based on the line-of-sight information and the sub-audio playback condition.

The content providing system according to claim 1, wherein
the document information is associated with position information indicating the position of the document within the display range of the content and area information indicating the display range of the document; and
the line-of-sight detection unit detects the line-of-sight information using the position information and/or the area information.

The content providing system according to any one of claims 1 to 5, wherein
the content is electronic comic content;
the document includes dialogue of characters in the electronic comic; and
the audio information includes dialogue voices of the characters in the electronic comic.

The content providing system according to claim 6, wherein
the content providing system further includes a voice selection unit;
the storage unit stores a plurality of the dialogue voices; and
the voice selection unit is capable of selecting, based on an input from the user, one dialogue voice from among the plurality of dialogue voices for the same line of dialogue.

A content providing method executed by a content providing system for providing content,
the content providing system including a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, wherein
the storage unit stores document information about a document and audio information about audio corresponding to the document;
the display processing unit performs display processing on the content including the document, and transmits the display processing result to a terminal used by a user;
the line-of-sight detection unit detects line-of-sight information including a position of the line of sight directed by the user toward the document and a length of time the line of sight stays at that position;
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected to when the audio is played back, based on the line-of-sight information that changes over time; and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.

A content providing program for providing content,
the program causing a computer to function as a display processing unit, a line-of-sight detection unit, a determination processing unit, an audio providing unit, and a storage unit, wherein
the storage unit stores document information about a document and audio information about audio corresponding to the document;
the display processing unit performs display processing on the content including the document, and transmits the display processing result to a terminal used by a user;
the line-of-sight detection unit detects line-of-sight information including a position of the line of sight directed by the user toward the document and a length of time the line of sight stays at that position;
the determination processing unit determines, for each document, a priority condition including a waiting time from when the line of sight is detected to when the audio is played back, based on the line-of-sight information that changes over time; and
the audio providing unit transmits audio corresponding to the document to the terminal based on the line-of-sight information, the priority condition, and the audio information.