JP7353130B2

JP7353130B2 - Audio playback systems and programs

Info

Publication number: JP7353130B2
Application number: JP2019193760A
Authority: JP
Inventors: 奈津子榎本; 昌美長谷川; 彩乃田中
Original assignee: Tokyo Gas Co Ltd
Current assignee: Tokyo Gas Co Ltd
Priority date: 2019-10-24
Filing date: 2019-10-24
Publication date: 2023-09-29
Anticipated expiration: 2039-10-24
Also published as: JP2021067845A

Description

The present invention relates to an audio playback system and a program.

Conventionally, there has been an after-recording (so-called dubbing) technique in which dialogue, music, and the like are recorded to match the pictures of a story such as a movie.

Patent Document 1 describes an example of a device for a pure voice-acting experience while enjoying a sense of unity with the story. The device has a function for playing back the voices of the roles other than the one selected by the player, and a function for displaying only the subtitles corresponding to the lines of the selected role at the timing when that role speaks. The player can therefore experience dubbing simply by speaking along with the subtitles. Because moving images such as anime and dramas are shown on the display screen, the player can also enjoy a sense of unity with the story.

Japanese Unexamined Patent Application Publication No. 2006-346284

Dubbing technology is also applied to audiobooks, which are recordings of folk tales and the like. As a child grows, articulation, intonation, and the way emotion is put into the lines are expected to change with age, so recordings of a child's voice become a record of the child's growth. However, sorting recorded voices and listening to them side by side takes effort, such as identifying and extracting the audio files by speaker and by age.

An object of the present invention is to classify and present recorded audio files according to predetermined conditions, thereby providing a new way of enjoying audiobooks: listening to and comparing past recordings.

The invention according to claim 1 is:
acquisition means for acquiring an audio information file that a user can edit by taking part in replacement;
playback means for playing back the audio contained in the audio information file;
editing means for editing the audio information file;
storage means for storing an audio information file that has been edited by audio replacement, in association with editing information of that audio information file and separately from the audio information file acquired by the acquisition means;
presentation means for creating and presenting, based on the editing information, a list of the edited audio information files; and
playback control means for accepting a designation of an audio information file to be played from among the presented audio information files and causing the playback means to play it,
wherein the editing means is capable of dividing the content of the audio information file into a plurality of parts and, for the audio contained in the content, replacing the audio with other audio part by part,
wherein the playback control means,
when playing the audio information file, is capable of playing each part with the replaced audio from an edited audio information file, and,
when, for the audio information file of one piece of content, there are a plurality of audio information files in which the audio of the parts of that content has been replaced, assigns the audio of the edited audio information files to the respective parts of the content in the order in which the files were generated and plays it; the invention is an audio playback system characterized by the above.
The invention according to claim 2 is:
the audio playback system according to claim 1, wherein the editing means generates a new audio information file by combining the audio of a plurality of the edited audio information files.
The invention according to claim 3 is:
the audio playback system according to claim 1, wherein the editing information includes information on the age of the user who uttered the audio substituted in the editing, and
wherein, when, for the audio information file of one piece of content, there are a plurality of audio information files in which the audio of the parts of that content has been replaced, the playback control means assigns the audio of the edited audio information files to the respective parts of the content in order of the user's age indicated in the editing information associated with those audio information files, and plays it.
The invention according to claim 4 is:
a program that controls a computer to function as:
acquisition means for acquiring an audio information file that a user can edit by taking part in replacement;
playback means for playing back the audio contained in the audio information file;
editing means for editing the audio information file;
storage means for storing an audio information file that has been edited by audio replacement, in association with editing information of that audio information file and separately from the audio information file acquired by the acquisition means;
presentation means for creating and presenting, based on the editing information, a list of the edited audio information files; and
playback control means for accepting a designation of an audio information file to be played from among the presented audio information files and causing the playback means to play it,
wherein, as a function of the editing means, the content of the audio information file can be divided into a plurality of parts and, for the audio contained in the content, the audio can be replaced with other audio part by part,
wherein, as a function of the playback control means,
when the audio information file is played, each part can be played with the replaced audio from an edited audio information file, and,
when, for the audio information file of one piece of content, there are a plurality of audio information files in which the audio of the parts of that content has been replaced, the audio of the edited audio information files is assigned to the respective parts of the content in the order in which the files were generated and is played.

According to the invention of claim 1, the recorded audio information files are played back in the order in which they were generated within a single piece of content that includes audio information, which provides the enjoyment of comparing voices that change gradually with the time of recording.
According to the invention of claim 2, the audio of different audio information files can be compared within a single audio information file.
According to the invention of claim 3, the user's voice, which changes gradually with the user's age, can be compared in age order.
According to the invention of claim 4, a computer executing the program of the present invention classifies and presents recorded audio information files according to predetermined conditions, providing a new way of enjoying audiobooks: listening to and comparing past recordings.

FIG. 1 is a diagram illustrating an overview of the network system assumed in the embodiment.
FIG. 2 is a diagram showing a configuration example of the terminal used in the embodiment.
FIG. 3 is a diagram illustrating the functional configuration of the control unit that constitutes the terminal.
FIG. 4 is a diagram showing a configuration example of the audio file management server.
FIG. 5 is a diagram showing the functional configuration of the audio file management server.
FIG. 6 is a flowchart showing the playback operation of an edited audiobook by the terminal.
FIG. 7 is a diagram showing an example of the presentation screen.
FIG. 8 is a diagram showing an example of the selection screen.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
<System configuration>
FIG. 1 is a diagram illustrating an overview of a network system 1 assumed in the embodiment. The network system 1 shown in FIG. 1 consists of an audio file management server 20 connected to the Internet 10 and a terminal 30 operated by a user.

The audio file management server 20 in this embodiment is a server that manages, for distribution, files in which voices reading a book or the like aloud, background sounds, and so on are recorded as data (hereinafter "audio files"). The audio file management server 20 is basically a computer. FIG. 1 shows a single audio file management server 20, but a plurality of devices may cooperate to operate as the audio file management server 20.

In this embodiment, the audio files that form the unit of a transaction with the user are collectively called an audiobook. An audiobook may consist of one audio file or of several. In this embodiment, the kind of audiobook in which the story unfolds through the characters' lines and the like is called a sound story. Sound stories include, for example, folk tales and fairy tales.

The audiobook distributed by the audio file management server 20 is accompanied by information indicating the regions into which the user may freely insert sound and the regions whose original sound may be replaced. The audiobook here is an example of an audio information file. This embodiment distinguishes regions where sound can be inserted from regions where sound can be replaced, but inserting sound amounts to replacing a silent region with a sounded one. In a broad sense, therefore, a region where sound can be inserted is also a region where sound can be replaced.

An audiobook in which insertable or replaceable regions are defined is an example of an editable sound story file. The distributor of the audiobook decides in advance which regions are insertable and which are replaceable. Most of these regions are lines of dialogue. However, the regions where the user can insert or replace sound need not cover every line in the audiobook, nor be limited to the lines of a particular character. They may also be part of the narration.
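The region information that accompanies an audiobook can be pictured as a small table of time spans. The following is only a minimal sketch under assumptions: the specification does not define a data format, so the names `RegionKind`, `EditableRegion`, `editable_region_at`, the use of seconds as the unit, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class RegionKind(Enum):
    INSERT = "insert"    # silent span where the user may add sound
    REPLACE = "replace"  # span whose original sound may be swapped out


@dataclass(frozen=True)
class EditableRegion:
    region_id: str
    kind: RegionKind
    start_s: float  # start offset in seconds (assumed unit)
    end_s: float    # end offset in seconds, exclusive
    label: str      # e.g. a character name or "narration"


def editable_region_at(regions: Sequence[EditableRegion], t: float) -> Optional[EditableRegion]:
    """Return the distributor-defined region containing time t, or None."""
    for region in regions:
        if region.start_s <= t < region.end_s:
            return region
    return None


# Hypothetical region list shipped with one audiobook.
regions = [
    EditableRegion("r1", RegionKind.REPLACE, 12.0, 15.5, "grandfather"),
    EditableRegion("r2", RegionKind.INSERT, 30.0, 33.0, "narration"),
]
```

Treating insertion as a special case of replacement, as the text suggests, would let both kinds share the same editing path; the `kind` field is kept here only because the embodiment describes the two separately.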

The terminal 30 in this embodiment is an information device operated mainly by a user who downloads audiobooks. Examples of the terminal 30 include a smartphone, a tablet terminal, an audio player, and a notebook computer. The terminal 30 has a function for accessing the audio file management server 20 via the Internet 10 and downloading the audiobooks described above. The connection to the audio file management server 20 may, however, be made through another device.

<Hardware configuration of terminal 30>
FIG. 2 is a diagram showing a configuration example of the terminal 30 used in the embodiment. The terminal 30 in this embodiment has a control unit 301 that controls the operation of the entire device, a nonvolatile storage unit 302 that records data, a display unit 303 used to display user interface screens and the like, an operation reception unit 304 that accepts user operations, a speaker 305 that reproduces electrical signals as sound, a microphone 306 that converts sound into electrical signals, and a communication interface (communication IF) 307.

The control unit 301 in this embodiment has a CPU (Central Processing Unit) 311, a ROM (Read Only Memory) 312 in which firmware and the like are recorded, and a RAM (Random Access Memory) 313 used as a work area. The control unit 301 functions as a so-called computer. The ROM 312 is a nonvolatile, rewritable semiconductor memory.

The storage unit 302 consists of a magnetic disk device, a nonvolatile rewritable semiconductor memory such as an SSD (Solid State Drive), or the like. The storage unit 302 stores, for example, audiobook data and data of sound recorded with the microphone 306.

The display unit 303 is, for example, a liquid crystal display or an organic EL display. The display unit 303 displays operation screens that assist the user's operations, information screens that present various information to the user, and the like.

The operation reception unit 304 consists of operation devices that accept user operations, for example a touch sensor placed on the surface of the display unit 303 and switches and buttons placed on the housing.

The communication interface 307 is an interface for communicating with the audio file management server 20 via the Internet 10. As the communication interface 307, for example, a wireless device compliant with wireless LAN (Local Area Network), Bluetooth (registered trademark), or a mobile communication standard is used.

The control unit 301 and the other units are connected through a bus 308 and signal lines (not shown). Although not shown, the terminal 30 is also equipped with a GPS (Global Positioning System) sensor that acquires position information, a geomagnetic sensor, an acceleration sensor, a camera that captures moving and still images, and the like. The position information is also used to record the place where a sound was recorded.

<Functions of terminal 30>
FIG. 3 is a diagram illustrating the functional configuration of the control unit 301 that constitutes the terminal 30. The functional modules shown in FIG. 3 are realized by the CPU 311 (see FIG. 2) executing a program, and are an example of the program executed by the control unit 301. As shown in FIG. 3, the functional modules realized by this program include an audiobook acquisition module 321, an audiobook playback module 322, an audiobook editing module 323, an edited audiobook storage module 324, a presentation screen display control module 325, and a playback control module 326.

The audiobook acquisition module 321 is a functional module that acquires audiobooks from the audio file management server 20 (see FIG. 1). The user specifies the audiobook to be acquired through the operation screen of the terminal 30 (see FIG. 1). Besides acquiring an audiobook from the audio file management server 20, the module may also acquire it from the storage unit 302 (see FIG. 2). The audiobook acquisition module 321 is an example of acquisition means.

The audiobook playback module 322 is a functional module that plays audiobooks. Playback by the audiobook playback module 322 covers both the original (unedited) audio and edited audiobooks. The audiobook playback module 322 is an example of playback means.

The audiobook editing module 323 is a module that edits audiobooks. An audiobook is edited by replacing the voices of the characters' lines or the narration with voices recorded by the user. The user's voice is recorded, for example, with the microphone 306 of the terminal 30 (see FIG. 2). The sounds subject to editing may also include sounds other than lines and narration. For example, editing that replaces BGM (background music) or sound effects with other sounds may be allowed. The audiobook editing module 323 can replace audio not only across an entire audiobook but also for each part into which the audiobook is divided. For example, only the lines of a specific character, or only the audio of a specific scene, may be replaced. The audiobook editing module 323 is an example of editing means.
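Part-by-part replacement, limited to a character or a scene, might look like the sketch below. It is an illustration under assumptions rather than the module's actual design: `Part`, `replace_clips`, and the file names are hypothetical, and each clip is represented by a file name instead of real audio data.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Part:
    part_id: str
    scene: int
    speaker: str
    clip: str  # stand-in for the audio of this part (hypothetical file name)


def replace_clips(
    parts: List[Part],
    new_clips: Dict[str, str],
    speaker: Optional[str] = None,
    scene: Optional[int] = None,
) -> List[Part]:
    """Return a new part list with recorded clips swapped in.

    Only parts matching the optional speaker/scene filters are replaced;
    the input list is left unchanged.
    """
    edited = []
    for part in parts:
        selected = (speaker is None or part.speaker == speaker) and (
            scene is None or part.scene == scene
        )
        if selected and part.part_id in new_clips:
            part = replace(part, clip=new_clips[part.part_id])
        edited.append(part)
    return edited


story = [
    Part("p1", 1, "narrator", "narr_1.wav"),
    Part("p2", 1, "grandmother", "gm_1.wav"),
    Part("p3", 2, "grandmother", "gm_2.wav"),
]
takes = {"p2": "child_p2.wav", "p3": "child_p3.wav"}

# Replace only the grandmother's lines in scene 2.
edited = replace_clips(story, takes, speaker="grandmother", scene=2)
```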

Editing is performed not on the audiobook file selected for editing but on a copy of that file. When an audiobook is edited, an edited audiobook file is therefore created separately from the original (unedited) audiobook file. Hereinafter, an edited audiobook file is called an edited audiobook. An edited audiobook is an example of an audio information file that has been edited. Initially, the audiobook to be edited is the audiobook file acquired by the audiobook acquisition module 321. Once an audiobook has been edited, the edited audiobook can itself be taken as the target of further editing.
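The copy-then-edit behaviour can be sketched with ordinary file operations. This is a minimal sketch under assumptions: `start_edit`, the naming scheme, and the byte strings standing in for audio are hypothetical, and the specification does not say how the copy is made.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def start_edit(source: Path, work_dir: Path, suffix: str) -> Path:
    """Copy the source audiobook and return the copy's path.

    All edits are then written to the copy, so the source file stays intact.
    """
    target = work_dir / f"{source.stem}_{suffix}{source.suffix}"
    shutil.copy2(source, target)
    return target


def demo() -> Tuple[bytes, bytes, bytes]:
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        original = work / "story.bin"
        original.write_bytes(b"original audio")

        # First edit works on a copy of the original.
        first = start_edit(original, work, "edit1")
        first.write_bytes(b"audio with the child's voice at age 5")

        # An edited audiobook can itself be the target of a further edit.
        second = start_edit(first, work, "edit2")
        second.write_bytes(b"audio with the child's voice at age 7")

        return original.read_bytes(), first.read_bytes(), second.read_bytes()
```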

The edited audiobook storage module 324 is a module that stores edited audiobooks in storage means such as the storage unit 302 of the terminal 30 (see FIG. 2). As described above, an edited audiobook is generated as a file separate from the original audiobook, so the edited audiobook storage module 324 stores it separately from the original. The editing date and time (or the saving date and time) is recorded in the stored file as meta information, in the file name, the header, or the like. The edited audiobook storage module 324 is an example of storage means.

The edited audiobook storage module 324 also creates editing information for each stored edited audiobook. The editing information includes, for example, the user who uttered the substituted voice (hereinafter, the speaker), the age of that user, and the editing date and time (or the saving date and time). The editing information is generated automatically when the edited audiobook storage module 324 stores an edited audiobook, but it may also be made editable afterwards. The editing information is associated with the corresponding edited audiobook and managed, for example, by registering it in a management table. Edited audiobooks associated with editing information can be kept, for example, as a history of a child's growth.
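One way to picture the editing information and its management table is a keyed collection of records. The sketch below rests on assumptions: the specification names the fields (speaker, age, editing date and time) but not a schema, so `EditInfo`, `ManagementTable`, and their methods are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class EditInfo:
    file_name: str       # the edited audiobook this record belongs to
    speaker: str         # user who uttered the substituted voice
    speaker_age: int     # age of that user at recording time
    edited_at: datetime  # editing (or saving) date and time


class ManagementTable:
    """In-memory table mapping edited audiobook files to their edit info."""

    def __init__(self) -> None:
        self._rows: Dict[str, EditInfo] = {}

    def register(self, info: EditInfo) -> None:
        # Called when an edited audiobook is saved.
        self._rows[info.file_name] = info

    def update(self, file_name: str, **changes) -> None:
        # Supports correcting the record after the fact.
        self._rows[file_name] = replace(self._rows[file_name], **changes)

    def lookup(self, file_name: str) -> EditInfo:
        return self._rows[file_name]


table = ManagementTable()
table.register(EditInfo("story_edit1.bin", "Hana", 4, datetime(2019, 11, 3, 10, 0)))
table.update("story_edit1.bin", speaker_age=5)  # fix a wrongly entered age
```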

The presentation screen display control module 325 is a module that generates a presentation screen for edited audiobooks and displays it on display means such as the display unit 303 of the terminal 30 (see FIG. 2). The presentation screen is, for example, a list of the editing information of the edited audiobooks, arranged in the order in which the edited audiobooks were generated. The presentation screen may also serve as a user interface screen that accepts user operations. For example, when the user designates a desired edited audiobook on the presentation screen displayed on the display unit 303, the designated edited audiobook may be played. A specific example of a user interface using the presentation screen is described later. The presentation screen display control module 325 is an example of presentation means.
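Building the list shown on the presentation screen amounts to sorting the editing information by generation time. The following is a minimal sketch under assumptions: `build_presentation_rows`, the dictionary keys, and the row layout are hypothetical, and an actual screen would render widgets rather than strings.

```python
from datetime import date
from typing import Dict, List


def build_presentation_rows(edit_infos: List[Dict]) -> List[str]:
    """Return one display row per edited audiobook, oldest first."""
    ordered = sorted(edit_infos, key=lambda e: e["edited_on"])
    return [
        f"{e['edited_on'].isoformat()}  {e['speaker']} (age {e['age']})  {e['file']}"
        for e in ordered
    ]


sample = [
    {"file": "story_edit2.bin", "speaker": "Hana", "age": 6, "edited_on": date(2021, 4, 18)},
    {"file": "story_edit1.bin", "speaker": "Hana", "age": 4, "edited_on": date(2019, 11, 3)},
]
rows = build_presentation_rows(sample)
```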

The playback control module 326 is a module that accepts a playback instruction from the user and controls playback of the selected edited audiobook. In response to the instruction, the playback control module 326 can play the audiobook content scene by scene or line by line. This makes it possible, for example, to play only the lines of a specific character in a specific scene. The playback control module 326 is an example of playback control means.

When there are several edited audiobooks in which audio has been replaced for the same audiobook or the same scene, the audio of the corresponding passages from different edited audiobooks may be combined and played scene by scene or line by line. In that case, the edited audiobook used for each part, such as a scene or a line, may be set by the user or set automatically according to a suitable rule. In the latter case, for example, the audio for each part may be assigned in generation order, starting from the file with the oldest editing or saving time. If the substituted voices were all uttered by the same person, each part of the edited audiobook is then played with that person's voice from the youngest recording onward (that is, in age order). When audio is extracted from several edited audiobooks edited at different times and played, the edited audiobook storage module 324 may save the result as a new edited audiobook file in which the audio of those edited audiobooks is combined.
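The automatic rule described above, assigning audio to each part in file-generation order starting with the oldest file, can be sketched as follows. This is one possible reading and not the patented implementation: `assign_by_generation_order` is a hypothetical name, and wrapping around to the oldest file when there are more parts than files is an assumption the text does not state.

```python
from itertools import cycle
from typing import Dict, List, Tuple


def assign_by_generation_order(
    part_ids: List[str],
    edited_files: List[Tuple[int, str]],
) -> Dict[str, str]:
    """Map each part to the edited file whose audio is used for it.

    edited_files holds (generation_time, file_name) pairs. Files are taken
    oldest first; when parts outnumber files, assignment wraps around
    (an assumption made for this sketch).
    """
    ordered = [name for _, name in sorted(edited_files)]
    if not ordered:
        return {}
    return dict(zip(part_ids, cycle(ordered)))


parts = ["scene1", "scene2", "scene3"]
files = [(2021, "age6.bin"), (2019, "age4.bin")]
plan = assign_by_generation_order(parts, files)
```

With one recording per year from the same child, a plan like this lets a single playback move through the story while the voice ages along with it. Sorting on the speaker's age from the editing information instead of the generation time would give the age-order variant.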

<Hardware configuration of audio file management server 20>
FIG. 4 is a diagram showing a configuration example of the audio file management server 20. The audio file management server 20 is realized by a computer and includes a CPU 201 as computing means, and a ROM 203, a RAM 202, and a storage device 204 as storage means. The RAM 202 is the main storage device (main memory) and is used as working memory when the CPU 201 performs arithmetic processing. The ROM 203 holds programs and data such as preset values, and the CPU 201 can read programs and data directly from the ROM 203 and execute processing. The storage device 204 is a means of storing programs and data. Programs are stored in the storage device 204, and the CPU 201 loads them into the main storage device and executes them. The results of processing by the CPU 201 are also stored in the storage device 204. The storage device 204 is, for example, a magnetic disk device or an SSD.

When the audio file management server 20 is configured by the computer described above, the functions of the server described below are realized, for example, by the CPU 201 executing a program. The audio file management server 20 is realized, for example, as a server built on a network. The server is not limited to a configuration with a single piece of hardware (such as a server machine) and may be distributed across multiple pieces of hardware or virtual machines.

<Functions of audio file management server 20>
FIG. 5 is a diagram showing the functional configuration of the audio file management server 20. The audio file management server 20 includes an audio file management unit 210, an audio file storage unit 220, and a presentation screen generation unit 230.

The audio file management unit 210 acquires and manages, together with the original audiobooks, the edited audiobooks edited on the terminal 30 (by replacing line audio and the like) and their editing information. Edited audiobooks acquired from the terminal 30 are stored in the audio file storage unit 220.

The audio file storage unit 220 stores audiobooks, both original and edited. As described above, the terminal 30 has the edited audiobook storage module 324 and can store the edited audiobooks it generates, but when the storage unit 302 runs short of capacity, they can be stored in the audio file storage unit 220 as external storage means. In addition, backing up the edited audiobooks generated on the terminal 30 to the audio file storage unit 220 of the audio file management server 20 allows the edited audiobooks generated so far to be carried over when the device (hardware) of the terminal 30 is changed.

提示画面生成部２３０は、端末３０からの要求に応じて、編集済みオーディオブックの提示画面を生成し、返送する。端末３０において編集済みオーディオブックを再生する場合、提示画面表示制御モジュール３２５により、編集済みオーディオブックを提示して再生対象の選択を受け付けるための提示画面が生成される。ここで、編集済みオーディオブックおよび編集情報がオーディオファイル管理部２１０に管理されている場合、提示画面表示制御モジュール３２５に代わって、オーディオファイル管理サーバ２０の提示画面生成部２３０が提示画面を生成しても良い。このようにすれば、編集済みオーディオブックおよび編集情報をバックアップした後、これらのデータを端末３０において削除した場合であっても、端末３０において、オーディオファイル管理サーバ２０から提示画面を取得して表示することができる。 The presentation screen generation unit 230 generates a presentation screen for edited audiobooks in response to a request from the terminal 30 and returns it. When an edited audiobook is played back on the terminal 30, the presentation screen display control module 325 generates a presentation screen that presents the edited audiobooks and accepts the selection of a playback target. If the edited audiobooks and editing information are managed by the audio file management unit 210, the presentation screen generation unit 230 of the audio file management server 20 may generate the presentation screen in place of the presentation screen display control module 325. In this way, even if the edited audiobooks and editing information are deleted from the terminal 30 after being backed up, the terminal 30 can still acquire the presentation screen from the audio file management server 20 and display it.

＜編集済みオーディオブックの再生動作＞ <Playback operation for edited audiobooks>
次に、編集済みオーディオブックの再生動作について説明する。なお、ここでは、端末３０が、編集済みオーディオブック保存モジュール３２４により自装置に編集済みオーディオブックを保持し、提示画面表示制御モジュール３２５により提示画面を生成するものとする。 Next, the playback operation for an edited audiobook will be described. Here, it is assumed that the terminal 30 holds the edited audiobooks in its own device using the edited audiobook storage module 324 and generates the presentation screen using the presentation screen display control module 325.

図６は、端末３０による編集済みオーディオブックの再生動作を示すフローチャートである。初期状態として、端末３０の表示ユニット３０３に機能選択のメニュー画面が表示されているものとする。特に図示しないが、メニュー画面は、オーディオブックのコンテンツの選択や再生、属性の設定等の機能がメニュー表示され、ユーザが画面上の操作により選択するように構成されている。ユーザは、このメニュー画面が表示された状態で操作受付ユニット３０４に対する操作を行い、所望の機能を選択する（Ｓ６０１）。 FIG. 6 is a flowchart showing the operation of playing an edited audiobook by the terminal 30. Assume that in the initial state, a function selection menu screen is displayed on the display unit 303 of the terminal 30. Although not particularly illustrated, the menu screen is configured to display functions such as selecting and playing audio book content, setting attributes, etc., and the user can select them by operating on the screen. With this menu screen displayed, the user operates the operation reception unit 304 to select a desired function (S601).

メニュー画面において、オーディオブックの再生が選択されると、提示画面表示制御モジュール３２５によりオーディオブックの提示画面が生成され、端末３０の表示ユニット３０３に表示される。提示画面３３０には、オーディオブックのコンテンツごとに、元のオーディオブックと編集済みオーディオブックとが表示される。 When audiobook playback is selected on the menu screen, the presentation screen display control module 325 generates an audiobook presentation screen, which is displayed on the display unit 303 of the terminal 30. The presentation screen 330 displays the original audiobook and the edited audiobooks for each audiobook content item.

図７は、提示画面の例を示す図である。図７に示す例において、提示画面３３０には、ファイル情報表示領域３３１と、ボタンオブジェクト３３２とが設けられている。ファイル情報表示領域３３１には、元のオーディオブックおよび編集済みオーディオブック保存モジュール３２４により保存されている編集済みオーディオブックのファイルの一覧が表示される。図７に示す例では、オーディオブックのコンテンツ「昔話１」および「昔話２」に関して、それぞれ複数のオーディオブックのファイルが示されている。なお、図７では、元のオーディオブックと編集済みオーディオブックとを明示的に区別して表示していないが、各コンテンツの最上位には元のオーディオブックを表示したり、元のオーディオブックであることを示す表示を付したりすることにより区別しても良い。 FIG. 7 is a diagram showing an example of the presentation screen. In the example shown in FIG. 7, the presentation screen 330 is provided with a file information display area 331 and a button object 332. The file information display area 331 displays a list of the files of the original audiobooks and of the edited audiobooks saved by the edited audiobook storage module 324. In the example shown in FIG. 7, a plurality of audiobook files are shown for each of the audiobook content items "Folktale 1" and "Folktale 2". Although FIG. 7 does not explicitly distinguish the original audiobook from the edited audiobooks, they may be distinguished by, for example, displaying the original audiobook at the top of each content item or attaching an indication that marks it as the original audiobook.

各ファイルについては、「タイトル」、「名前」、「年齢」、「録音年月日」の各項目の情報が示されている。「タイトル」は、オーディオブックのコンテンツのタイトルを示す。「名前」は、編集済みオーディオブックにおいて、音声の置き換えに係る発声主体であるユーザの名前を示す。「年齢」は、発声主体であるユーザの年齢を示す。「録音年月日」は、編集済みオーディオブックの編集が行われた時期を示す。これらの情報は、各ファイルに対応付けられた編集情報から得られる。これらの情報により、ユーザは、編集済みオーディオブックに関して、誰が、いつ（何歳の時に）編集したファイルかを特定することができる。提示画面３３０はユーザインタフェース画面を兼ねており、ユーザは、ファイル情報表示領域３３１の表示上で所望のファイルが表示された欄を指定する操作（マウスクリック等）を行い、再生対象を特定する。「決定」と記されたボタンオブジェクト３３２は、再生対象のファイルを確定するためのオブジェクトである。「場面を指定」と記されたボタンオブジェクト３３３は、編集済みオーディオブックのコンテンツの部分である場面のうち、再生する場面を選択する選択画面に移行するためのオブジェクトである。 For each file, the items "Title", "Name", "Age", and "Recording date" are shown. "Title" indicates the title of the audiobook content. "Name" indicates the name of the user who spoke the replacement audio in the edited audiobook. "Age" indicates the age of that user. "Recording date" indicates when the edited audiobook was edited. This information is obtained from the editing information associated with each file. With it, the user can identify who edited each edited audiobook file and when (at what age). The presentation screen 330 also serves as a user interface screen: the user specifies the playback target by an operation (such as a mouse click) that designates the row of the file information display area 331 in which the desired file is displayed. The button object 332 labeled "Confirm" is an object for confirming the file to be played. The button object 333 labeled "Specify scene" is an object for moving to a selection screen on which the scene to be played is selected from among the scenes that make up the content of the edited audiobook.
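
As a rough sketch of how the rows of the file information display area 331 could be derived from the editing information, consider the following. The `EditInfo` and `AudiobookFile` types and the `build_rows` function are hypothetical names introduced for illustration. Placing the original audiobook first follows the option mentioned for FIG. 7, while ordering the edited files by recording date is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EditInfo:
    name: str          # user who spoke the replacement audio
    age: int           # that user's age at the time of editing
    recorded_on: date  # when the editing took place


@dataclass
class AudiobookFile:
    title: str
    edit_info: Optional[EditInfo] = None  # None marks the original audiobook


def build_rows(files):
    """Group files by title; original first, then edits by recording date."""
    def sort_key(f):
        is_edited = f.edit_info is not None
        recorded = f.edit_info.recorded_on if is_edited else date.min
        return (f.title, is_edited, recorded)

    rows = []
    for f in sorted(files, key=sort_key):
        info = f.edit_info
        rows.append({
            "title": f.title,
            "name": info.name if info else "",
            "age": info.age if info else "",
            "recorded_on": info.recorded_on.isoformat() if info else "",
        })
    return rows
```

Each returned dictionary corresponds to one row with the four items "Title", "Name", "Age", and "Recording date"; the original audiobook has no editing information, so its last three items are left empty.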

図８は、選択画面の例を示す図である。図８に示す例において、選択画面３４０には、場面特定表示３４１と、編集情報表示領域３４２と、ボタンオブジェクト３４３とが設けられている。選択画面３４０は、ユーザインタフェース画面を兼ねている。場面特定表示３４１は、物語を複数の場面に分け、各場面を時系列にしたがって並べた帯領域である。ユーザがこの場面特定表示３４１上の所望の場面を指定すると、その場面に対する編集が行われていれば、編集情報表示領域３４２が表示され、該当箇所の編集情報が表示される。図８に示す例では、場面特定表示３４１の左端（物語の最初の場面）から右方向へ向かって３番目の場面が指定されており、指定された場面の位置からプルダウン表示により編集情報表示領域３４２が表示されている。 FIG. 8 is a diagram showing an example of the selection screen. In the example shown in FIG. 8, the selection screen 340 is provided with a scene specifying display 341, an editing information display area 342, and a button object 343. The selection screen 340 also serves as a user interface screen. The scene specifying display 341 is a band-shaped area in which the story is divided into a plurality of scenes and the scenes are arranged in chronological order. When the user designates a desired scene on the scene specifying display 341 and that scene has been edited, the editing information display area 342 appears and shows the editing information for that scene. In the example shown in FIG. 8, the third scene counting rightward from the left end of the scene specifying display 341 (the first scene of the story) is designated, and the editing information display area 342 is shown as a pull-down from the position of the designated scene.

編集情報表示領域３４２には、指定された場面に関して、音声の置き換えに係る発声主体であるユーザの「名前」、発声主体であるユーザの「年齢」、編集済みオーディオブックの編集が行われた時期を示す「録音年月日」の各情報が示されている。ユーザは、編集情報表示領域３４２の表示上で所望の編集情報が表示された欄を指定する操作（マウスクリック等）を行い、指定された場面における再生対象の音声（データ）を特定する。「決定」と記されたボタンオブジェクト３４３は、場面特定表示３４１への操作で特定された場面に対する再生対象の音声を確定するためのオブジェクトである。図８に示す例では、一か所の場面のみが指定されているが、各場面に対して、再生対象の音声をそれぞれ指定することができる。 For the designated scene, the editing information display area 342 shows the "Name" of the user who spoke the replacement audio, the "Age" of that user, and the "Recording date" indicating when the edited audiobook was edited. The user specifies the audio (data) to be played for the designated scene by an operation (such as a mouse click) that designates the row of the editing information display area 342 in which the desired editing information is displayed. The button object 343 labeled "Confirm" is an object for confirming the audio to be played for the scene specified on the scene specifying display 341. Although only one scene is designated in the example shown in FIG. 8, the audio to be played can be designated separately for each scene.
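
The per-scene selection described above can be illustrated with a small sketch. All names here (`SceneEdit`, `edits_for_scene`, `resolve_playlist`, and the `audio_id` handle) are hypothetical and serve only as illustration. The sketch assumes that scenes are indexed from 0 along the timeline and that a scene without a selected edit plays the original audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneEdit:
    scene: int        # 0-based scene index along the story timeline
    name: str         # user who spoke the replacement audio
    age: int          # that user's age
    recorded_on: str  # ISO date string of the edit
    audio_id: str     # hypothetical handle to the replacement audio


def edits_for_scene(edits, scene):
    """Entries to list in the editing information display area for one scene."""
    return [e for e in edits if e.scene == scene]


def resolve_playlist(scene_count, original_ids, selection):
    """Per scene, use the selected edit's audio, else the original audio.

    `selection` maps a scene index to the SceneEdit chosen for that scene.
    """
    return [
        selection[i].audio_id if i in selection else original_ids[i]
        for i in range(scene_count)
    ]
```

An empty result from `edits_for_scene` corresponds to a scene that has not been edited, for which no editing information display area would be shown.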

図６に戻り、提示画面３３０（図７参照）において一の編集済みオーディオブック（図では「編集済みファイル」と記載）が指定され（Ｓ６０２でＹＥＳ）、選択画面３４０（図８参照）において再生箇所（場面）および音声が指定されて（Ｓ６０３でＹＥＳ）、「決定」のボタンオブジェクト３４３が操作されると、再生制御モジュール３２６が、編集済みオーディオブックにおける再生箇所での再生対象の音声の指定を受け付け（Ｓ６０４）、該当箇所の音声の再生指示をオーディオブック再生モジュール３２２に対して送信する。 Returning to FIG. 6, when one edited audiobook (labeled "edited file" in the figure) is specified on the presentation screen 330 (see FIG. 7) (YES in S602), a playback location (scene) and audio are specified on the selection screen 340 (see FIG. 8) (YES in S603), and the "Confirm" button object 343 is operated, the playback control module 326 accepts the specification of the audio to be played at the playback location in the edited audiobook (S604) and sends an instruction to play the audio of that location to the audiobook playback module 322.

一方、提示画面３３０（図７参照）において一の編集済みオーディオブックが指定された後（Ｓ６０２でＹＥＳ）、選択画面３４０での再生箇所の指定が行われずに（Ｓ６０３でＮＯ）、「決定」のボタンオブジェクト３３２が操作されると、再生制御モジュール３２６が、編集済みオーディオブックのファイルの指定を受け付け（Ｓ６０５）、該当する編集済みオーディオブックの再生指示をオーディオブック再生モジュール３２２に対して送信する。 On the other hand, when one edited audiobook has been specified on the presentation screen 330 (see FIG. 7) (YES in S602) and the "Confirm" button object 332 is operated without a playback location being specified on the selection screen 340 (NO in S603), the playback control module 326 accepts the specification of the edited audiobook file (S605) and sends an instruction to play that edited audiobook to the audiobook playback module 322.

また、提示画面３３０（図７参照）において編集済みオーディオブックが指定されずに元のオーディオブックが指定されて（Ｓ６０２でＮＯ）、「決定」のボタンオブジェクト３３２が操作されると、再生制御モジュール３２６が、元のオーディオブックの再生指示をオーディオブック再生モジュール３２２に対して送信する。 Further, when the original audiobook rather than an edited audiobook is specified on the presentation screen 330 (see FIG. 7) (NO in S602) and the "Confirm" button object 332 is operated, the playback control module 326 sends an instruction to play the original audiobook to the audiobook playback module 322.

オーディオブック再生モジュール３２２は、再生制御モジュール３２６から送信されたこれらの再生指示を受け付けると（Ｓ６０６）、受け付けた再生指示にしたがってオーディオブックの再生を開始する（Ｓ６０７）。 When the audiobook playback module 322 receives these playback instructions transmitted from the playback control module 326 (S606), it starts playing the audiobook according to the received playback instructions (S607).
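
The branching in steps S602 to S605 can be summarized in a short sketch. The `PlaybackInstruction` type and the `build_instruction` function are hypothetical and only model which playback instruction the playback control module 326 would send; the actual data exchanged between the modules is not specified in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PlaybackInstruction:
    file_id: str
    # Maps a scene index to the audio chosen for it; empty means "play the file as is".
    scene_audio: Dict[int, str] = field(default_factory=dict)


def build_instruction(original_id: str,
                      edited_id: Optional[str] = None,
                      scene_audio: Optional[Dict[int, str]] = None) -> PlaybackInstruction:
    """Decide what to play, following the branches of the flowchart in FIG. 6."""
    if edited_id is None:
        # S602: NO. No edited audiobook was specified, so play the original.
        return PlaybackInstruction(original_id)
    if scene_audio:
        # S603: YES, then S604. Play the specified audio at the specified scenes.
        return PlaybackInstruction(edited_id, dict(scene_audio))
    # S603: NO, then S605. Play the edited audiobook file as a whole.
    return PlaybackInstruction(edited_id)
```

The returned object stands in for the playback instruction that the audiobook playback module 322 receives in S606 before starting playback in S607.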

なお、上記の動作例では、編集済みオーディオブックの部分として場面に着目し、場面ごとに再生対象の音声を選択し得るとしたが、さらに、登場人物やナレーション等のコンテンツ中の発話者別に再生音声を選択可能としても良い。上記の動作や画面の構成は例示に過ぎず、再生しようとするオーディオブックのコンテンツに編集済みオーディオブックが存在する場合に、特定のファイルやかかるファイルの特定の部分を再生対象として指定し、再生可能とするものであれば良い。 In the above operation example, scenes are treated as the parts of an edited audiobook, and the audio to be played can be selected for each scene; in addition, the playback audio may be made selectable for each speaker in the content, such as a character or the narrator. The above operations and screen configurations are merely examples; any configuration is acceptable as long as, when edited audiobooks exist for the audiobook content to be played, a specific file or a specific part of such a file can be specified as the playback target and played.

以上、本発明の実施形態について説明したが、本発明の技術的範囲は上記実施形態には限定されない。例えば、上記の実施形態では、オーディオファイル管理サーバ２０が、オーディオファイル管理部２１０、オーディオファイル格納部２２０および提示画面生成部２３０とを備える構成としたが、元のオーディオブックを管理するオーディオファイル管理サーバ２０とは別の外部サーバとして、編集済みオーディオブックを管理するサーバを設けても良い。また、上記の実施形態では、編集済みオーディオブックおよび編集情報を共に、オーディオファイル管理サーバ２０に保存したが、データサイズの大きい編集済みオーディオブックのみをサーバに保存させ、編集情報は端末３０のみで保持しても良い。その他、本発明の技術思想の範囲から逸脱しない様々な変更や構成の代替は、本発明に含まれる。 Although an embodiment of the present invention has been described above, the technical scope of the present invention is not limited to that embodiment. For example, in the above embodiment, the audio file management server 20 includes the audio file management unit 210, the audio file storage unit 220, and the presentation screen generation unit 230; however, a server that manages edited audiobooks may be provided as an external server separate from the audio file management server 20 that manages the original audiobooks. Also, in the above embodiment, both the edited audiobooks and the editing information are stored in the audio file management server 20; however, only the edited audiobooks, whose data size is large, may be stored on the server, with the editing information held only on the terminal 30. Other modifications and substitutions of configuration that do not depart from the scope of the technical idea of the present invention are also included in the present invention.

１…ネットワークシステム、１０…インターネット、２０…オーディオファイル管理サーバ、３０…端末、３０１…制御ユニット、３２１…オーディオブック取得モジュール、３２２…オーディオブック再生モジュール、３２３…オーディオブック編集モジュール、３２４…編集済みオーディオブック保存モジュール、３２５…提示画面表示制御モジュール、３２６…再生制御モジュール DESCRIPTION OF SYMBOLS 1...Network system, 10...Internet, 20...Audio file management server, 30...Terminal, 301...Control unit, 321...Audiobook acquisition module, 322...Audiobook playback module, 323...Audiobook editing module, 324...Edited Audiobook storage module, 325...Presentation screen display control module, 326...Playback control module

Claims

1. An audio reproduction system comprising:
an acquisition means for acquiring an audio information file that can be edited through replacement of audio by a user;
a reproduction means for reproducing the audio included in the audio information file;
an editing means for editing the audio information file;
a storage means for storing an audio information file edited by audio replacement, in association with editing information of that audio information file, separately from the audio information file acquired by the acquisition means;
a presentation means for creating and presenting, based on the editing information, a list of the edited audio information files; and
a reproduction control means that accepts a designation of an audio information file to be reproduced from among the presented audio information files and causes the reproduction means to reproduce it,
wherein the editing means is capable of dividing the content of the audio information file into a plurality of parts and editing the audio included in the content by replacing it with other audio for each part, and
wherein the reproduction control means
is capable, when reproducing the audio information file, of reproducing each part with the replaced audio of an edited audio information file, and,
when, for the audio information files relating to one content, there are a plurality of audio information files in which parts of the content have been edited by audio replacement, assigns to each part of the content the audio of the edited audio information files in the order in which the files were created, and reproduces it.

2. The audio reproduction system according to claim 1, wherein the editing means generates a new audio information file by combining sounds of a plurality of edited audio information files.

3. The audio reproduction system, wherein the editing information includes information on the age of the user who spoke the audio replaced in the editing, and
the reproduction control means, when there are a plurality of audio information files in which parts of the content have been edited by audio replacement, assigns to each part of the content the audio of the edited audio information files in order of the age of the user indicated in the editing information associated with the plurality of audio information files, and reproduces it.

4. A program for causing a computer to function as:
an acquisition means for acquiring an audio information file that can be edited through replacement of audio by a user;
a reproduction means for reproducing the audio included in the audio information file;
an editing means for editing the audio information file;
a storage means for storing an audio information file edited by audio replacement, in association with editing information of that audio information file, separately from the audio information file acquired by the acquisition means;
a presentation means for creating and presenting, based on the editing information, a list of the edited audio information files; and
a reproduction control means that accepts a designation of an audio information file to be reproduced from among the presented audio information files and causes the reproduction means to reproduce it,
wherein, as a function of the editing means, the program makes it possible to divide the content of the audio information file into a plurality of parts and to edit the audio included in the content by replacing it with other audio for each part, and
wherein, as functions of the reproduction control means, the program
makes it possible, when reproducing the audio information file, to reproduce each part with the replaced audio of an edited audio information file, and,
when, for the audio information files relating to one content, there are a plurality of audio information files in which parts of the content have been edited by audio replacement, assigns to each part of the content the audio of the edited audio information files in the order in which the files were created, and reproduces it.