JP5814879B2 - Posted audio playback control system, posted audio playback control method, posted audio playback control program

Info

Publication number: JP5814879B2
Application number: JP2012168740A
Authority: JP
Inventors: 佳織大畑; 孝司相澤; 功一郎成合; 伸一仲根
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2012-07-30
Filing date: 2012-07-30
Publication date: 2015-11-17
Anticipated expiration: 2032-07-30
Also published as: JP2014027615A

Description

The present invention relates to a posted audio playback control system, a posted audio playback control method, and a posted audio playback control program. More specifically, it relates to a technology that selects and plays back similar posted audio in a message service such as a voice SNS efficiently and without effort, so that the posters' intentions are conveyed well.

In recent years, in addition to conventional text-based SNSs (Social Network Services), voice SNSs, in which voice messages are posted and published, have appeared and are spreading. In a voice SNS, a user's voice post, that is, the user's actual voice, is heard by other users connected to that user within the SNS, which enables emotionally rich and realistic communication between users. Because the posts are audio, various new services are expected to develop on such voice SNSs. For example, as an existing technology for simultaneously playing back sound sources of the same kind, a voice mail voice message superimposing method has been proposed (see Patent Document 1). It is characterized by having a sound storage unit consisting of multiple sound source boxes that individually store sound sources such as announcement audio or background music, and by superimposing and outputting the audio content from multiple sound sources.

Patent Document 1: Japanese Patent Laid-Open No. 7-212475

In a message service such as the voice SNS described above, many other users connected to a given user may post messages with similar content on the occasion of events such as the user's birthday or ceremonial occasions (weddings, funerals, and the like). In that case, the user ends up listening repeatedly, for a long time, to posted audio with similar content. At the same time, the speaking pace and the timing at which a specific keyword appears in the message differ subtly from poster to poster. Consequently, if posted audio with similar content is simply played back all at once, there is a concern that the user cannot make out the content and the posters' intentions are not conveyed well.

Moreover, the posted audio from other users that a user plays back includes not only posts made for a specific event but also many everyday posts unrelated to any event. The user therefore has to perform the very tedious work of picking out, from a huge number of posts, only those related to a specific event, which can become a psychological burden when playing back posted audio. The more connections a user has with other users on the voice SNS, the greater this burden becomes, and usability may suffer.

Accordingly, an object of the present invention is to provide a technology that selects and plays back similar posted audio in a message service such as a voice SNS efficiently and without effort, so that the posters' intentions are conveyed well.

The posted audio playback control system of the present invention that solves the above problem comprises: a communication unit that communicates via a network with user terminals used by users of a social network service; a storage unit that stores a determination table associating event identification information with selected keywords chosen in advance for each event; and an arithmetic unit. The arithmetic unit executes: a process of identifying, among the posted audio data received from the user terminals via the communication unit, the data to which the same publication-destination designation information is attached, executing speech recognition on each identified piece of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination; a process of checking each piece of text data sharing a publication destination against the selected keywords of the determination table, identifying the text data that contain the same selected keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group for simultaneous playback; and a process of searching the posted audio data included in the group for simultaneous playback for the start point or end point of the audio signal of the same selected keyword, deleting the audio signal of the unnecessary section from the beginning of the data to the start point, or of the unnecessary section from the end point to the end of the data, and transmitting to the user terminal data, outputtable by an audio output means, obtained when the pieces of posted audio data after the deletion are played back simultaneously.

In the posted audio playback control method of the present invention, a computer comprising a communication unit that communicates via a network with user terminals used by users of a social network service, and a storage unit that stores a determination table associating event identification information with selected keywords chosen in advance for each event, executes: a process of identifying, among the posted audio data received from the user terminals via the communication unit, the data to which the same publication-destination designation information is attached, executing speech recognition on each identified piece of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination; a process of checking each piece of text data sharing a publication destination against the selected keywords of the determination table, identifying the text data that contain the same selected keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group for simultaneous playback; and a process of searching for the start point or end point of the audio signal of the same selected keyword included in the group for simultaneous playback, deleting the audio signal of the unnecessary section from the beginning of the data to the start point, or of the unnecessary section from the end point to the end of the data, and transmitting to the user terminal data, outputtable by an audio output means, obtained when the pieces of posted audio data after the deletion are played back simultaneously.

The posted audio playback control program of the present invention causes a computer comprising a communication unit that communicates via a network with user terminals used by users of a social network service, and a storage unit that stores a determination table associating event identification information with selected keywords chosen in advance for each event, to execute: a process of identifying, among the posted audio data received from the user terminals via the communication unit, the data to which the same publication-destination designation information is attached, executing speech recognition on each identified piece of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination; a process of checking each piece of text data sharing a publication destination against the selected keywords of the determination table, identifying the text data that contain the same selected keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group for simultaneous playback; and a process of searching for the start point or end point of the audio signal of the same selected keyword included in the group for simultaneous playback, deleting the audio signal of the unnecessary section from the beginning of the data to the start point, or of the unnecessary section from the end point to the end of the data, and transmitting to the user terminal data, outputtable by an audio output means, obtained when the pieces of posted audio data after the deletion are played back simultaneously.

According to the present invention, similar posted audio in a message service such as a voice SNS can be selected and played back efficiently and without effort, and the posters' intentions can be conveyed well.

A network configuration diagram including the posted audio playback control system of the first embodiment.
A diagram showing a hardware configuration example of the SNS server of the first embodiment.
A diagram showing a hardware configuration example of the public Web server of the first embodiment.
A diagram showing a hardware configuration example of the posting recording device of the first embodiment.
A diagram showing an example of the determination table of the first embodiment.
A flowchart showing an example processing procedure of the posted audio playback control method in the first embodiment.
A diagram showing Example 1 of the posted audio data of the first embodiment.
A diagram showing Example 1 of the composite playback file of the first embodiment.
A diagram showing Example 2 of the composite playback file of the first embodiment.
A diagram showing Example 2 of the posted audio data of the first embodiment.
A diagram showing Example 3 of the composite playback file of the first embodiment.
A diagram showing an example of processing results in the first embodiment.
A diagram showing a hardware configuration example of the SNS server of the second embodiment.
A diagram showing an example of the evaluation table of the second embodiment.
A diagram showing an example of the user table of the second embodiment.
A diagram showing an example of the posted audio data information table of the second embodiment.
A flowchart showing an example processing procedure of the music selection support method of the second embodiment.
A diagram showing evaluation result example 1 in the second embodiment.
A diagram showing evaluation result example 2 in the second embodiment.
A diagram showing a hardware configuration example of the SNS server of the third embodiment.
A diagram showing an example of the grouping table of the third embodiment.
A diagram showing an example of the priority evaluation table of the third embodiment.
A diagram showing an example of the user table of the third embodiment.
A flowchart showing an example processing procedure of the posted audio playback control method in the third embodiment.
A diagram showing an example of evaluation results in the third embodiment.

--- System configuration in the first embodiment ---
Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a network configuration diagram including the posted audio playback control system 1002 of the first embodiment. The posted audio playback control system 1002 shown in FIG. 1 (hereinafter, system 1002) is a computer system that selects and plays back similar posted audio in a message service such as a voice SNS efficiently and without effort, so that the posters' intentions are conveyed well.

A voice SNS, unlike a conventional text-based SNS, is an SNS in which voice messages are posted and published. In a voice SNS, a user's voice post, that is, the user's actual voice, is heard by other users connected to that user within the SNS, which enables emotionally rich and realistic communication between users.

The system 1002 illustrated in FIG. 1 consists of an SNS server 150, a public Web server 160, and a posting recording device 170. The SNS server 150 is the main executor of the various processes of the voice SNS: it manages authentication information on voice SNS users, information on each user's posted audio data, and so on, and executes processes such as user authentication and the publication and playback of posted audio. The public Web server 160 sits between the user terminals 200, which access it via the Internet 120, and the SNS server 150. It mediates user authentication processing by the SNS server 150 and the exchange of various data output by the SNS server 150, such as playback data of posted audio data. The posting recording device 170 is an information processing device that stores the posted audio data sent from the user terminals 200. In response to a request from the SNS server 150, it reads out the posted audio data specified by the request and sends it to the SNS server 150.

As is natural for a social network service, the SNS server 150 manages, for each user, information on the other users with whom a connection on the voice SNS is defined (for example, friends, acquaintances, colleagues, family, one-way followers, and groups based on hobbies or preferences). The posted audio that a user can play back and listen to is therefore that of other users with such a connection. For this reason, the posting recording device 170 manages, for each user, the posted audio data of other users posted to that user or to that user's group, in association with that user.

When accessed from a user terminal 200 (poster terminal 220 or browsing and playback terminal 240), the public Web server 160 and the SNS server 150 identify the accessing user through predetermined user authentication processing. They then set up the page for that user (the so-called My Page of the SNS site) so that it lists, for example in order of posting date and time, the information on posted audio data from other users that the posting recording device 170 manages in association with that user (for example, posting user name, posting date and time, title, and recording length), and transmit this page data to the user terminal 200.

When the public Web server 160 receives, on the user page, a playback instruction for a given post from the user terminal 200, it sends the playback instruction information for that post to the SNS server 150. On receiving this information, the SNS server 150 reads the posted audio data for that post from the posting recording device 170, plays it back, and transmits the playback data to the user terminal 200 via the public Web server 160. The user terminal 200 receives this playback data via the Internet 120 and outputs it through a speaker.

The terminals used by voice SNS users are the user terminals 200. As shown in FIG. 1, for example, they can be classified into poster terminals 220 and 230 and a browsing and playback terminal 240. The poster terminal 220 accesses the public Web server 160 via the Internet 120, acquires and displays the data of the voice SNS Web site published by the public Web server 160, accepts the user's own voice input through a microphone, and uploads the corresponding posted audio data to the public Web server 160. The browsing and playback terminal 240 accesses the public Web server 160 via the Internet 120 and acquires and displays the data of the voice SNS Web site published by that server.

When the poster terminal 220 or the browsing and playback terminal 240 accesses the public Web server 160, user authentication is naturally required in order to log in to the user's own page on the voice SNS. The poster terminal 220 and the browsing and playback terminal 240 therefore accept authentication information entered by the current user and send it, together with an authentication request, to the SNS server 150 via the public Web server 160. The SNS server 150 performs user authentication by checking the authentication information against a database or the like that stores the authentication data and, according to the result, controls whether the poster terminal 220 or the browsing and playback terminal 240 may access that user's page on the public Web server 160. The SNS server 150 also sends the authentication result to the poster terminal 220 or the browsing and playback terminal 240 via the public Web server 160.

Besides the case described above, in which posted audio data and its playback data are exchanged via the Internet 120, a form in which they are exchanged using the telephone function of the user terminal 200 and the public line network 122 is also conceivable. In this case, as shown in FIG. 1, the network configuration includes a telephone answering system 300 connected to the SNS server 150 and the posting recording device 170 via the LAN line 121.

The telephone answering system 300 consists of an exchange 310, an automatic voice response device 320, and a CTI (Computer Telephony Integration) device 330. The exchange 310 interconnects telephone lines to form a telephone network; the automatic voice response device 320 uses a computer to answer calls and to handle voice-based information input and output and dialogue; and the CTI device 330 integrates telephone and fax into a computer system.

In this configuration, when a call is placed from the poster terminal 230, which is a user terminal, to a predetermined telephone number for accepting posts, the poster terminal 230 is connected to the exchange 310 via the public line network 122. In response to this connection, the CTI device 330 acquires the calling number and the automatic voice response device 320 plays automatic voice guidance. When the poster then speaks a message at the poster terminal 230, the spoken message passes through the automatic voice response device 320 and is recorded, that is, registered, in the posting recording device 170 as posted audio data. Along with this registration, the automatic voice response device 320 notifies the SNS server 150 of the information on the posted audio data registered in the posting recording device 170 (poster, posting date and time, identification information of the posted audio data, and so on). On receiving this notification, the SNS server 150 stores that information in its storage unit.

Next, the hardware configuration of the system 1002 is described. As stated above, the system 1002 of the first embodiment consists of the SNS server 150, the public Web server 160, and the posting recording device 170. The description begins with the SNS server 150, which is the main executor of processing in the system 1002.

As illustrated in FIG. 2, the SNS server 150 of the system 1002 comprises: a storage unit 101 composed of a suitable nonvolatile storage device such as a hard disk drive; a memory 103 composed of a volatile storage device such as RAM; an arithmetic unit 104 such as a CPU, which executes the program 102 held in the storage unit 101, for example by reading it into the memory 103, to perform overall control of the device itself and various determination, calculation, and control processes; and a communication unit 105 that connects to the LAN line 121 and the like and handles communication with other devices. The storage unit 101 stores at least the program 102 for implementing the functions required of the posted audio playback control system of the first embodiment, and a determination table 130 that associates event identification information with selected keywords chosen in advance for each event.

Similarly, as illustrated in FIG. 3, the public Web server 160 comprises: a storage unit 111 composed of a suitable nonvolatile storage device such as a hard disk drive; a memory 113 composed of a volatile storage device such as RAM; an arithmetic unit 114 such as a CPU, which executes the program 112 held in the storage unit 111, for example by reading it into the memory 113, to perform overall control of the device itself and various determination, calculation, and control processes; and a communication unit 115 that connects to the Internet 120 and the LAN line 121 and handles communication with other devices. The storage unit 111 stores at least the program 112 for implementing the required functions in cooperation with the SNS server 150 as the posted audio playback control system of the first embodiment, and the various Web page data 116 of the voice SNS site.

Similarly, as illustrated in FIG. 4, the posting recording device 170 comprises: a storage unit 11 composed of a suitable nonvolatile storage device such as a hard disk drive; a memory 13 composed of a volatile storage device such as RAM; an arithmetic unit 14 such as a CPU, which executes the program 12 held in the storage unit 11, for example by reading it into the memory 13, to perform overall control of the device itself and various determination, calculation, and control processes; and a communication unit 15 that connects to the LAN 121 and handles communication with other devices. The storage unit 11 stores at least the program 12 for implementing the required functions in cooperation with the SNS server 150 as the posted audio playback control system of the first embodiment, and the posted audio data 16 received from the user terminals 200 (poster terminals 220 and 230). The posted audio data 16 holds the file of each piece of posted audio data (the file name also serves as identification information) in association with data such as the poster, the posting date and time, and the publication destination of that posted audio (FIG. 4).
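
The association held in the posted audio data 16 can be pictured with a small in-memory sketch. Everything named here (`PostedAudioRecord`, `PostStore`, the file names and user names) is hypothetical and chosen for illustration; only the four fields (file name as identifier, poster, posting date and time, publication destination) come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PostedAudioRecord:
    """Hypothetical row of the posted audio data store (fields follow the text)."""
    file_name: str        # also serves as the identifier
    poster: str
    posted_at: datetime
    destination: str      # publication destination


class PostStore:
    """Toy in-memory stand-in for the posting recording device."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        """Store a record under its file name."""
        self._records[record.file_name] = record

    def fetch(self, file_name):
        """Return the record for `file_name`, or None if it is not registered."""
        return self._records.get(file_name)

    def for_destination(self, destination):
        """Return all records published to `destination`, oldest first."""
        hits = [r for r in self._records.values() if r.destination == destination]
        return sorted(hits, key=lambda r: r.posted_at)


store = PostStore()
store.register(PostedAudioRecord("a001.wav", "userB", datetime(2012, 7, 30, 9, 0), "userA"))
store.register(PostedAudioRecord("a002.wav", "userC", datetime(2012, 7, 30, 8, 0), "userA"))
store.register(PostedAudioRecord("a003.wav", "userB", datetime(2012, 7, 30, 7, 0), "userD"))
```

Keying the store by file name mirrors the statement that the file name doubles as identification information; a real device would of course persist the audio files themselves as well.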

Next, the functions of the system 1002 of the first embodiment are described. Although the system 1002 consists of the SNS server 150, the public Web server 160, and the posting recording device 170 as stated above, for simplicity the following description assumes that the SNS server 150 also provides the functions of the public Web server 160 and the posting recording device 170 and serves as a single integrated system 1002. In this system 1002, data exchange with the user terminals 200 is performed via the public Web server 160, and management of posted audio data is performed via the posting recording device 170.

The system 1002 has a function of accessing the posting recording device 170 via the communication unit 105 and identifying, among the posted audio data received from the user terminals 200 and stored in the posting recording device 170, the data to which the same publication-destination designation information is attached; executing speech recognition on each identified piece of posted audio data to generate text data; and storing the generated text data in the storage unit 101 for each publication destination.
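
A minimal sketch of this step follows, assuming posts arrive as `(file_name, destination, audio)` tuples. The function name `transcribe_by_destination` and the `recognize` callable are hypothetical; the description does not name a speech recognition engine, so the recognizer is passed in and replaced here by a dummy that treats the "audio" as a string.

```python
from collections import defaultdict


def transcribe_by_destination(posts, recognize):
    """Group transcripts by publication destination.

    posts: iterable of (file_name, destination, audio) tuples.
    recognize: callable mapping audio to text; a stand-in for a real
               speech recognition engine, which the text does not name.
    Returns {destination: {file_name: text}}.
    """
    transcripts = defaultdict(dict)
    for file_name, destination, audio in posts:
        transcripts[destination][file_name] = recognize(audio)
    return dict(transcripts)


def fake_recognize(audio):
    """Dummy recognizer for illustration: the 'audio' is already a string."""
    return audio.lower()


posts = [
    ("a001.wav", "userA", "Happy birthday Taro"),
    ("a002.wav", "userA", "Taro, HAPPY BIRTHDAY"),
    ("a003.wav", "userD", "See you tomorrow"),
]
texts = transcribe_by_destination(posts, fake_recognize)
```

Injecting the recognizer keeps the grouping logic testable without audio hardware or an external engine.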

The system 1002 also has a function of collating each text data item obtained above that shares a publication destination against the selection keywords in the determination table 130, identifying the text data items that contain a common selection keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group to be played back simultaneously.

The system 1002 also has a function of searching the posted audio data in the simultaneous-playback group described above for the start point or end point of the audio signal having a predetermined attribute (the selection keyword), deleting the audio signal in the unneeded section from the head of the data to the start point, or in the unneeded section from the end point to the tail of the data, and transmitting to the user terminal 200 the data obtained by simultaneously playing back the posted audio data after the deletion.

The system 1002 may transmit to the user terminal 200 the data obtained by playing back the post-deletion posted audio data simultaneously from the head of the data. Alternatively, the system 1002 may transmit to the user terminal 200 the data obtained by playing back the post-deletion posted audio data with the tails of the data aligned so that all items end at the same time.

The system 1002 may also calculate the average playback duration across the post-deletion posted audio data, play back the items whose duration is below the average at a speed lower than the reference speed and the items whose duration exceeds the average at a speed higher than the reference speed so as to equalize the playback durations, and transmit to the user terminal 200 the data obtained by simultaneously playing back the posted audio data after this processing.

--- Example of data structure in the first embodiment ---
Next, an example of the data structure of the table used by the system 1002 of the first embodiment is described. FIG. 5 shows an example of the determination table 130 in the first embodiment. The determination table 130 is a collection of records that associate, keyed by the identification information of an event such as "marriage", "birthday", or "passing an exam", the selection keywords chosen in advance for that event, such as 「けっこんおめでとう」 ("congratulations on your wedding"), 「たんじょうびおめでとう」 ("happy birthday"), and 「ハッピーウェデング」 ("happy wedding").
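As a rough illustration of this table, the sketch below holds the records in a plain dictionary and looks up the event whose selection keyword appears in a recognized text. The name `DETERMINATION_TABLE`, the function `find_event`, and the assignment of 「ハッピーウェデング」 to the "marriage" event are assumptions made for the example; the text does not specify how the table is stored.

```python
# Hypothetical in-memory form of determination table 130 (FIG. 5):
# event identifier -> selection keywords chosen in advance for that event.
DETERMINATION_TABLE = {
    "marriage": ["けっこんおめでとう", "ハッピーウェデング"],  # grouping assumed
    "birthday": ["たんじょうびおめでとう"],
    "pass": [],  # keywords for this event are not given in the text
}

def find_event(text, table=DETERMINATION_TABLE):
    """Return (event, keyword) for the first selection keyword found in text, or None."""
    for event, keywords in table.items():
        for keyword in keywords:
            if keyword in text:
                return event, keyword
    return None
```

For example, a recognized text containing 「けっこんおめでとう」 maps to the "marriage" record, while a text with no selection keyword yields `None`.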

--- Example of processing procedure in the first embodiment ---
The procedure of the posted audio playback control method in the first embodiment is described below with reference to the drawings. The operations corresponding to this method are realized by programs that the devices constituting the system 1002 each read into memory and execute. These programs consist of code for performing the operations described below.

FIG. 6 is a flowchart showing an example of the processing procedure of the posted audio playback control method in the first embodiment. First, the SNS server 150 accesses the post recording device 170 via the communication unit 105 and identifies, among the posted audio data received from the user terminal 200 and stored in the post recording device 170, the items that carry the same publication-destination designation (s100).

Next, for each posted audio data item identified above, the SNS server 150 invokes the speech recognition program included in the program 102 to execute speech recognition (s101), generates text data, and stores the generated text data in the storage unit 101 for each publication destination (s102).

The SNS server 150 then collates the text data generated by the speech recognition against the selection keywords in the determination table 130, identifies the text data items that contain a common selection keyword as relating to the same publication destination and the same event, and stores them in the storage unit 101 as a composite playback group (s103). The SNS server 150 executes step s103 for all of the posted audio data obtained in step s102 (s104). Suppose, for example, that this processing identifies posted audio data items that have a certain user "U00001" as their publication destination and that all contain the selection keyword 「けっこんおめでとう」. With the determination table 130 of FIG. 5, the theme of these items is the event "wedding celebration". FIG. 12 shows an example of the processing result in the first embodiment. In that example, "Post 1", "Post 4", and "Post 7" are identified as posted audio data sharing the publication destination "Taro" and the event "birthday celebration".
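The grouping in steps s103 and s104 can be sketched as follows. This is a minimal illustration, not the patented implementation: the record fields (`file`, `destination`, `text`), the table contents, and the choice to drop groups with a single member are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stand-in for determination table 130: event -> selection keywords.
TABLE = {
    "wedding": ["けっこんおめでとう"],
    "birthday": ["たんじょうびおめでとう"],
}

def group_for_playback(posts, table=TABLE):
    """Group recognized posts by (publication destination, event, shared keyword).

    posts: list of dicts with the assumed keys 'file', 'destination', 'text'.
    Returns only groups with two or more posts, since simultaneous playback
    needs at least two items (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for post in posts:
        for event, keywords in table.items():
            for keyword in keywords:
                if keyword in post["text"]:
                    key = (post["destination"], event, keyword)
                    groups[key].append(post["file"])
    return {key: files for key, files in groups.items() if len(files) > 1}
```

Fed with posts 1, 4, and 7 addressed to "Taro" and one unrelated post, the function returns a single group for ("Taro", "birthday", 「たんじょうびおめでとう」), mirroring the FIG. 12 example.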

Next, in each posted audio data item identified in step s103 as sharing a publication destination and an event, the SNS server 150 searches for the start point or end point of the audio signal having the predetermined attribute (the selection keyword) (s105). In the example above, the SNS server 150 identifies, by audio analysis, the time at which the utterance of the selection keyword 「けっこんおめでとう」 starts or ends in each item. This audio analysis is performed by executing the audio analysis program included in the program 102 of the SNS server 150. An existing audio analysis program may be used.

Having identified the utterance start point or end point of the selection keyword 「けっこんおめでとう」 by audio analysis, the SNS server 150 deletes, as illustrated in FIG. 7, the audio signal in either the unneeded section Δt1, from the head of the data to the utterance start point, or the unneeded section Δt2, from the utterance end point to the tail of the data (s106). The SNS server 150 executes step s106 for every posted audio data item whose start point or end point was searched in step s105 (s107).
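The deletion in step s106 amounts to slicing the sample sequence at the keyword boundary. The sketch below assumes the audio is available as a list of samples at a known sample rate and that the keyword's start and end times (in seconds) have already been found by the audio analysis; the function name and parameters are hypothetical.

```python
def trim_unneeded(samples, rate, kw_start, kw_end, drop="head"):
    """Delete one unneeded section around the selection keyword.

    drop="head" removes Δt1 (data head up to the keyword start);
    drop="tail" removes Δt2 (keyword end up to the data tail).
    """
    if drop == "head":
        return samples[round(kw_start * rate):]
    if drop == "tail":
        return samples[:round(kw_end * rate)]
    raise ValueError("drop must be 'head' or 'tail'")
```

With `drop="head"` the result begins exactly at the keyword, which suits head-aligned mixing; with `drop="tail"` it ends exactly at the keyword, which suits tail-aligned mixing.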

Next, after deleting the unneeded sections in step s106, the SNS server 150 combines the posted audio data into one file by superimposing the items from the head of the data (s108), and transmits the playback data obtained by playing that file to the user terminal 200 (s109). Because the items are superimposed from the head of the data and combined into one file, playing that file plays all of the posted audio data simultaneously. FIG. 8 shows example 1 of the composite playback file of the first embodiment. As FIG. 8 shows, the file obtained in step s108, that is, the composite playback file, multiplexes each poster's audio data with the 「けっこんおめでとう」 portion at the head.

Alternatively, after deleting the unneeded sections, the SNS server 150 may in step s108 combine the posted audio data into one file by superimposing the items with their tails aligned so that playback of all items ends at the same time, and in step s109 transmit the playback data obtained by playing that file to the user terminal 200. FIG. 9 shows example 2 of the composite playback file of the first embodiment. As FIG. 9 shows, the composite playback file in this case multiplexes each poster's audio data with the 「けっこんおめでとう」 portion aligned at the tail.
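The two superimposition variants of step s108 (head-aligned as in FIG. 8, tail-aligned as in FIG. 9) differ only in the offset at which each track is added. The following sketch treats each trimmed post as a list of numeric samples and sums them; the function name is hypothetical, and gain normalization or clipping control, which a real mixer would need, is deliberately left out.

```python
def mix(tracks, align="head"):
    """Superimpose tracks into one sample list.

    align="head": every track starts at offset 0 (FIG. 8 style).
    align="tail": every track ends at the final sample (FIG. 9 style).
    """
    length = max(len(track) for track in tracks)
    mixed = [0.0] * length
    for track in tracks:
        offset = 0 if align == "head" else length - len(track)
        for i, sample in enumerate(track):
            mixed[offset + i] += sample
    return mixed
```

Shorter tracks simply contribute silence outside their own span, so the keyword portions coincide at the chosen edge.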

In the examples above, the section deleted from the posted audio data as unneeded runs either from the head of the data to the utterance start point of the selection keyword or from the utterance end point of the selection keyword to the tail of the data. In addition, as shown in FIG. 10, an unneeded section Δt3 running from the utterance end point of a first selection keyword to the utterance start point of a second selection keyword may also be assumed. In this case, the SNS server 150 may, for example, delete from each posted audio data item to be processed the audio signal in the unneeded section Δt1 (from the head of the data to the utterance start point) and in the unneeded section Δt3, and then combine the items into one file by superimposing them from the head of the data (FIG. 11). Alternatively, the SNS server 150 may delete from each item the audio signal in the unneeded section Δt2 (from the utterance end point to the tail of the data) and in the unneeded section Δt3, and then combine the items into one file by superimposing them with the tails of the data aligned.

If the playback durations of the posted audio data obtained by deleting the unneeded sections in step s106 differ greatly from one item to another, a user who listens to the composite playback file may hear a message in which the utterances are badly out of step with one another.

Therefore, when superimposing the posted audio data in step s108 after the unneeded sections have been deleted, the SNS server 150 calculates the average playback duration across those items (s108A), sets a playback speed lower than the reference speed for items whose duration is below the average and a playback speed higher than the reference speed for items whose duration exceeds the average, and thereby equalizes the playback durations of the items (s108B). The SNS server 150 then combines the processed items into one file.
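One way to read steps s108A and s108B is that each item receives a speed factor equal to its own duration divided by the average, so that every item plays for exactly the average duration. The text does not give a formula, so this ratio is an assumption of the sketch, as is the function name.

```python
def playback_rates(durations):
    """Return (average duration, per-item speed factors relative to the reference speed).

    A factor below 1.0 slows a short item down; a factor above 1.0 speeds a
    long item up. Playing an item of duration d at factor d / avg takes avg seconds.
    """
    avg = sum(durations) / len(durations)
    return avg, [d / avg for d in durations]
```

For durations of 2, 4, and 6 seconds the average is 4 seconds, and the factors are 0.5, 1.0, and 1.5: the shortest item is slowed and the longest is sped up, matching the direction described in the text.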

The best mode for carrying out the present invention has been described above in concrete terms, but the present invention is not limited to it and can be modified in various ways without departing from its gist.

According to this embodiment, similar posted audio in a voice SNS can be selected and played back efficiently and without effort, and the posters' intentions can be conveyed well.

The description in this specification makes at least the following clear. In the posted audio playback control system of the first embodiment, the arithmetic unit may transmit to the user terminal the data obtained by playing back the post-deletion posted audio data simultaneously from the head of the data.

In the posted audio playback control system of the first embodiment, the arithmetic unit may also transmit to the user terminal the data obtained by playing back the post-deletion posted audio data with the tails of the data aligned so that all items end at the same time.

In the posted audio playback control system of the first embodiment, the arithmetic unit may also calculate the average playback duration across the post-deletion posted audio data, set a playback speed lower than the reference speed for items whose duration is below the average and a playback speed higher than the reference speed for items whose duration exceeds the average so as to equalize the playback durations, and transmit to the user terminal the data obtained by simultaneously playing back the posted audio data after this processing.

--- System configuration in the second embodiment ---
A second embodiment of the present invention is described in detail below with reference to the drawings. The network configuration that includes the music selection support system 1001 of the second embodiment is the same as the network configuration of the first embodiment (FIG. 1), so only the parts that differ from the first embodiment are described. The music selection support system 1001 of the second embodiment (hereinafter, system 1001) is a computer system that selects music for posted audio in a voice SNS (Social Network Service) efficiently and without effort, and thereby improves the usability of the voice SNS.

Next, the hardware configuration of the system 1001 is described. As in the first embodiment, the system 1001 of the second embodiment consists of the SNS server 150, the public Web server 160, and the post recording device 170. The description here covers the SNS server 150, which is the entity that executes the processing in the system 1001.

As illustrated in FIG. 13, the SNS server 150 of the system 1001 includes: a storage unit 101 composed of a suitable non-volatile storage device such as a hard disk drive; a memory 103 composed of a volatile storage device such as RAM; an arithmetic unit 104, such as a CPU, that executes the program 102 held in the storage unit 101 (for example, by reading it into the memory 103) to perform overall control of the device itself as well as various determination, computation, and control processes; and a communication unit 105 that connects to the LAN line 121 or the like and handles communication with other devices. The storage unit 101 stores at least the program 102, which implements the functions required of the music selection support system of the second embodiment, an evaluation table 125, a user table 128, and a posted audio data information table 129. The evaluation table 125 consists of a primary evaluation table 126 and a secondary evaluation table 127.

Next, the functions of the system 1001 of the second embodiment are described. As noted above, the system 1001 of the second embodiment consists of the SNS server 150, the public Web server 160, and the post recording device 170. To simplify the explanation, however, the following assumes that the SNS server 150 also provides the functions of the public Web server 160 and the post recording device 170 and operates as a single integrated system 1001. In the system 1001, data exchange with the user terminal 200 takes place via the public Web server 160, and posted audio data is managed via the post recording device 170.

In this case, the system 1001 has a function of starting the speech recognition program included in the program 102, executing speech recognition on the posted audio data received from the user terminal 200 (the data stored in the post recording device 170), and generating text data. For this purpose, the SNS server 150 is assumed to include a speech recognition program as part of the program 102.

The system 1001 also has a function of collating the text data generated by the speech recognition against each selection keyword group in the evaluation table 125, identifying the selection keyword groups whose degree of match with the keywords contained in the text data is at or above a predetermined value, and transmitting to the user terminal 200 the identification information of the music data corresponding to the identified selection keyword group as a recommendation of music to be played together with the posted audio data. The user terminal 200 receives this recommendation information and shows it on its display for the user to view. When identifying the selection keyword group, the system 1001 may instead identify the group with the highest degree of match with the keywords contained in the text data.
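The text does not define how the degree of match is computed. As one hedged reading, the sketch below scores each selection keyword group by the fraction of its keywords that occur in the recognized text and returns the BGM names at or above a threshold, best first. The BGM names, the second keyword group, the threshold value, and the function names are all hypothetical.

```python
def match_degree(text, keywords):
    """Fraction of the group's keywords found in text (an assumed definition)."""
    if not keywords:
        return 0.0
    hits = sum(1 for keyword in keywords if keyword in text)
    return hits / len(keywords)

def recommend(text, primary_table, threshold=0.5):
    """Return BGM names whose keyword group matches text at or above threshold, best first."""
    scores = {bgm: match_degree(text, kws) for bgm, kws in primary_table.items()}
    qualified = [bgm for bgm, score in scores.items() if score >= threshold]
    return sorted(qualified, key=lambda bgm: scores[bgm], reverse=True)

# Hypothetical contents for primary evaluation table 126: BGM name -> keyword group.
PRIMARY_TABLE = {
    "BGM_A": ["たんじょうび", "ばーすでい", "はっぴー"],
    "BGM_B": ["けっこん", "うぇでぃんぐ"],
}
```

Taking only the first element of the returned list corresponds to the variant in which the group with the highest degree of match is chosen.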

When the user accepts the music indicated by the recommendation information, the user terminal 200 notifies the SNS server 150 of the acceptance via the public Web server 160. The SNS server 150 plays back the data of the accepted music (held in the post recording device 170 or in its own storage unit 101) together with the corresponding posted audio data, and transmits the resulting playback data to the user terminal 200 via the public Web server 160. Alternatively, instead of transmitting the recommendation information to the user terminal 200 as described above, the system 1001 may have a function of playing back the music data corresponding to the identified selection keyword group together with the posted audio data and transmitting the resulting playback data to the user terminal 200.

When collating the text data against each selection keyword group in the evaluation table 125 yields more than one selection keyword group whose degree of match with the keywords contained in the text data is at or above the predetermined value (for example, when several groups tie for the highest degree of match), the system 1001 may have a function of reading the attribute information of the user of the user terminal 200 from the user table 128, collating that attribute information against each user attribute in the evaluation table 125, identifying the user attribute with the highest degree of match with the user's attribute information, and transmitting to the user terminal 200 the identification information of the music data corresponding to the identified user attribute as a recommendation of music to be played together with the posted audio data. Here too, instead of transmitting the recommendation information to the user terminal 200, the system 1001 may have a function of playing back the music data corresponding to the identified user attribute together with the posted audio data and transmitting the resulting playback data to the user terminal 200.

When it identifies more than one selection keyword group with the highest degree of match, the system 1001 may also have a function of determining whether the posted audio data carries a designation of a publication destination (a user, a group, or the like) in the social network service.

If this determination finds that the posted audio data carries a publication-destination designation, the system 1001 may have a function of reading from the user table 128 the attribute information of the publication-destination user indicated by the designation, collating that attribute information against each user attribute in the evaluation table 125, identifying the user attribute with the highest degree of match with the publication-destination user's attribute information, and transmitting to the user terminal 200 the identification information of the music data corresponding to the identified user attribute as a recommendation of music to be played together with the posted audio data. Here too, instead of transmitting the recommendation information to the user terminal 200, the system 1001 may have a function of playing back the music data corresponding to the identified user attribute together with the posted audio data and transmitting the resulting playback data to the user terminal 200.

If, on the other hand, the determination finds that the posted audio data carries no publication-destination designation, the system 1001 may read from the user table 128 the attribute information of the user who posted the audio data, collate that attribute information against each user attribute in the evaluation table 125, identify the user attribute with the highest degree of match with the posting user's attribute information, and transmit to the user terminal 200 the identification information of the music data corresponding to the identified user attribute as a recommendation of music to be played together with the posted audio data. Here too, instead of transmitting the recommendation information to the user terminal 200, the system 1001 may have a function of playing back the music data corresponding to the identified user attribute together with the posted audio data and transmitting the resulting playback data to the user terminal 200.
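The tie-break described in the last three paragraphs can be sketched as below: use the publication-destination user's attributes when a destination is designated, otherwise the poster's, and pick the candidate BGM whose user attributes overlap most. Measuring the degree of match as the size of the attribute overlap, and all field, table, and function names, are assumptions of this example.

```python
def pick_by_attributes(candidates, post, user_table, secondary_table):
    """Choose among tied BGM candidates using user attributes.

    post: dict with the assumed keys 'poster' and, optionally, 'destination'.
    user_table: user ID -> set of attribute values (stand-in for user table 128).
    secondary_table: BGM name -> set of user attributes (stand-in for table 127).
    """
    # Prefer the publication-destination user; fall back to the posting user.
    user_id = post.get("destination") or post["poster"]
    attributes = user_table[user_id]
    return max(candidates, key=lambda bgm: len(attributes & secondary_table[bgm]))
```

If two candidates still tie on overlap, `max` returns the first one in `candidates`; the text does not say how such a residual tie should be resolved.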

When collating the text data against each selection keyword group in the evaluation table 125 yields more than one selection keyword group with the highest degree of match with the keywords contained in the text data, the system 1001 may also have a function of reading the most recent posted audio data of the user of the user terminal 200 from the post recording device 170 or the storage unit 101, executing the same speech recognition on that data to generate text data, collating that text data against each selection keyword group in the evaluation table 125, identifying the selection keyword group with the highest degree of match with the keywords contained in the text data, and transmitting to the user terminal 200 the identification information of the music data corresponding to the identified selection keyword group as a recommendation of music to be played together with the posted audio data. Here too, instead of transmitting the recommendation information to the user terminal 200, the system 1001 may have a function of playing back the music data corresponding to the identified selection keyword group together with the posted audio data and transmitting the resulting playback data to the user terminal 200.

When it identifies more than one selection keyword group with the highest degree of match with the keywords contained in the text data, the system 1001 may also have a function of reading from the post recording device 170 or the storage unit 101 the posted audio data of the user of the user terminal 200 for a predetermined period back from the most recent post, executing the same speech recognition on each item to generate text data, extracting the keywords whose frequency of appearance across those text data items is at or above a certain level, collating those keywords against each selection keyword group in the evaluation table 125, identifying the selection keyword group with the highest degree of match with those frequent keywords, and transmitting to the user terminal 200 the identification information of the music data corresponding to the identified selection keyword group as a recommendation of music to be played together with the posted audio data. Here too, instead of transmitting the recommendation information to the user terminal 200, the system 1001 may have a function of playing back the music data corresponding to the identified selection keyword group together with the posted audio data and transmitting the resulting playback data to the user terminal 200.
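The frequent-keyword extraction in this variant can be sketched with a counter over the recognized texts of the user's recent posts. Counting each keyword once per post and using a fixed candidate vocabulary are assumptions made for this example; the text only says that keywords with a frequency of appearance at or above a certain level are extracted.

```python
from collections import Counter

def frequent_keywords(texts, vocabulary, min_count=2):
    """Return the keywords from vocabulary that appear in at least min_count texts.

    texts: recognized text of each recent post (assumed already generated).
    vocabulary: candidate keywords to look for (hypothetical input).
    """
    counts = Counter()
    for text in texts:
        for keyword in vocabulary:
            if keyword in text:
                counts[keyword] += 1  # counted once per post
    return {keyword for keyword, count in counts.items() if count >= min_count}
```

The resulting set would then be collated against the selection keyword groups in the same way as the keywords of a single post.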

--- Example of data structure in the second embodiment ---

Next, examples of the data structures of the tables used by the system 1001 of the second embodiment will be described. FIG. 14 shows an example of the evaluation table 125 of the second embodiment. The evaluation table 125 associates the identification information of music data with a selected keyword group chosen in advance for that music data; in the second embodiment it consists of a primary evaluation table 126 and a secondary evaluation table 127. In the illustrated example, the primary evaluation table 126 is a collection of records that, keyed by the BGM name serving as the identification information of the music data, associate each piece of music data (BGM) with a selected keyword group chosen in advance for it, such as "tanjoubi" (birthday), "baasudei" (birthday), and "happii" (happy). The secondary evaluation table 127 is a collection of records that, keyed by the BGM name, associate each BGM with user attributes chosen in advance for it (e.g., birthday is today, likes rock music, in their 40s) and with keywords contained in recent posts (e.g., marriage, Christmas).
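The two tables can be pictured as keyed records. The following is a minimal sketch only: the class names, field names, romanized keywords, and every score value are assumptions made for illustration, since the actual layout of FIG. 14 is not reproduced in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PrimaryRecord:
    """One row of the primary evaluation table 126, keyed by BGM name (hypothetical layout)."""
    bgm_name: str
    # selected keyword -> score awarded when the keyword matches (values are made up)
    keyword_scores: Dict[str, int] = field(default_factory=dict)


@dataclass
class SecondaryRecord:
    """One row of the secondary evaluation table 127, keyed by BGM name (hypothetical layout)."""
    bgm_name: str
    # (attribute name, attribute value) -> score, e.g. ("music_preference", "rock")
    attribute_scores: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # keyword found in the user's recent posts -> score
    recent_keyword_scores: Dict[str, int] = field(default_factory=dict)


primary = PrimaryRecord(
    "birthday song", {"tanjoubi": 10, "baasudei": 10, "happii": 6}
)
secondary = SecondaryRecord(
    "birthday song",
    {("birthday", "today"): 10, ("music_preference", "rock"): 10, ("age_group", "40s"): 10},
    {"kekkon": 6, "kurisumasu": 6},
)
```

Keying both records by BGM name keeps the two tables joinable, which matters later when only the BGMs that tie in the primary evaluation are re-scored with the secondary table.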

FIG. 15 shows an example of the user table 128 of the second embodiment. The user table 128 describes the attribute information of each user of the voice SNS. In the illustrated example, it is a collection of records that, keyed by user ID, associate each user with attribute values such as birthday, music preference, hobbies, and age.

FIG. 16 shows an example of the posted audio data information table 129 of the second embodiment. The posted audio data information table 129 stores information about the posted audio data held in the posting recording device 170, organized by the disclosure destination of each piece of posted audio data. In the illustrated example, for each user ID such as "U00001", it is a collection of records that, keyed by posted audio data ID, associate values such as the poster, the posting date and time, and the title (identification information) of the posted audio data.

--- Example of processing procedure in the second embodiment ---

The actual procedure of the music selection support method of the second embodiment is described below with reference to the drawings. The operations corresponding to the music selection support method described below are implemented by programs that the devices constituting the system 1001 each read into memory and execute. These programs consist of code for performing the operations described below.

FIG. 17 is a flowchart showing an example of the processing procedure of the music selection support method of the second embodiment. First, the SNS server 150 acquires, from the posting recording device 170, the posted audio data that was received from the user terminal 200 and stored in the posting recording device 170 (s200).

Next, the SNS server 150 launches the speech recognition program included in the program 102, executes speech recognition processing on the posted audio data acquired above, and generates text data (s201). The SNS server 150 then collates the text data generated by the speech recognition processing against the selected keyword group of each BGM in the primary evaluation table 126 and calculates the degree of match between the keywords in the text data and each BGM (s202).

The process of calculating the degree of match between the keywords in the text data and each BGM proceeds, for example, as follows. If the keywords in the text data are "kekkon" (marriage), "omedetou" (congratulations), and "shiawase" (happiness), the SNS server 150 collates these keywords against the primary evaluation table 126 and identifies a match score of 10 points for "kekkon", 6 points for "omedetou", and 6 points for "shiawase". It performs this identification of match scores for the three keywords against the record of each BGM in the primary evaluation table 126 (s203) and calculates a total score for each BGM, as in evaluation result example 1 shown in FIG. 18. In the example of FIG. 18, the highest total, 19 points, was calculated for "BGM2: wedding song".
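The per-BGM scoring of steps s202 and s203 can be sketched as a lookup and a sum. This is an illustrative sketch under assumptions: the function name `score_bgms`, the table contents, and the rule that a repeated keyword counts only once are all hypothetical, and the numbers are not intended to reproduce FIG. 18.

```python
def score_bgms(keywords, primary_table):
    """Return {bgm_name: total score} for the keywords recognized in one post.

    Each distinct keyword contributes its match score once per BGM (an assumption;
    the source does not say how repeated keywords are treated).
    """
    distinct = set(keywords)
    totals = {}
    for bgm_name, keyword_scores in primary_table.items():
        totals[bgm_name] = sum(keyword_scores.get(k, 0) for k in distinct)
    return totals


# Hypothetical primary evaluation table: BGM name -> {selected keyword: match score}
PRIMARY_TABLE = {
    "BGM1: birthday song": {"tanjoubi": 10, "omedetou": 6, "happii": 6},
    "BGM2: wedding song": {"kekkon": 10, "omedetou": 6, "shiawase": 6},
    "BGM3: cheering song": {"ganbarou": 10, "shiai": 6},
}

totals = score_bgms(["kekkon", "omedetou", "shiawase"], PRIMARY_TABLE)
best_bgm = max(totals, key=totals.get)
```

With these made-up values the wedding song collects the score of all three keywords, the birthday song only the score for "omedetou", and the cheering song nothing.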

When this collation of the text data keywords against the primary evaluation table 126 and the score calculation identify only one keyword group (that is, one BGM) with the highest degree of match, i.e., the highest score (s204: No), the SNS server 150 transmits the identification information of the identified BGM (music data), as recommendation information for music to be played together with the posted audio data, to the user terminal 200 of the poster of that posted audio data or of another user designated as its disclosure destination (s205). It is assumed that the poster designates the disclosure destination when posting the audio data, and that the poster information and the information on the user or group designated as the disclosure destination are attached to the posted audio data and stored in the posting recording device 170 and the posted audio data information table 129.

The user terminal 200 that receives the recommendation information displays it on its display for the user to view. If the user accepts the BGM indicated by the recommendation information, the user terminal 200 notifies the SNS server 150 of the acceptance via the public Web server 160.

The SNS server 150 plays the data of the BGM for which it received the acceptance notification from the user terminal 200 (held in the posting recording device 170 or in its own storage unit 101) together with the corresponding posted audio data, and transmits the resulting playback data to the user terminal 200 via the public Web server 160 (s212). Alternatively, the SNS server 150 may skip transmitting the recommendation information to the user terminal 200, play the BGM data corresponding to the identified selected keyword group together with the posted audio data, and transmit the resulting playback data to the user terminal 200.
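The decision at step s204 depends on whether exactly one BGM holds the highest total. A minimal sketch of that check follows; the function names and the returned labels are hypothetical, and the handling of an empty score table is an added assumption that the source does not discuss.

```python
def top_scoring(totals):
    """Return the names of all BGMs that share the highest total score."""
    if not totals:
        return []
    best = max(totals.values())
    return sorted(name for name, score in totals.items() if score == best)


def decide(totals):
    """Mirror the s204 branch: a single winner is recommended, a tie needs more evaluation."""
    winners = top_scoring(totals)
    if not winners:
        return ("no_candidates", winners)       # not covered by the source
    if len(winners) == 1:
        return ("recommend", winners)           # s204: No, proceed to s205
    return ("further_evaluation", winners)      # s204: Yes, several BGMs tie
```

Returning the full list of tied names, rather than an arbitrary first one, keeps the candidates available for whatever tie-breaking follows.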

On the other hand, when the collation of the text data keywords against the primary evaluation table 126 and the score calculation identify a plurality of keyword groups (BGMs) that tie for the highest degree of match, i.e., the highest score (s204: Yes), the SNS server 150 determines whether designation information of a disclosure destination in the voice SNS (a user, a group, etc.) is attached to the posted audio data (s206).

If the SNS server 150 determines in step s206 that disclosure destination designation information is attached to the posted audio data (s206: Yes), it reads from the user table 128 the attribute information of the user indicated by that designation information as the disclosure destination (s207). In step s207, the SNS server 150 also reads the most recent posted audio data of the disclosure destination user from the posting recording device 170 and executes speech recognition processing on it to generate text data. When reading the most recent posted audio data from the posting recording device 170, the SNS server 150 may instead read the posted audio data from a predetermined period extending back from the most recent post. In that case, the SNS server 150 executes the same speech recognition processing on each piece of posted audio data in that period to generate text data and extracts keywords whose frequency of appearance across the text data is at or above a certain level.

If, on the other hand, the SNS server 150 determines in step s206 that no disclosure destination designation information is attached to the posted audio data (s206: No), it reads from the user table 128 the attribute information of the user who posted that audio data (s208). In step s208, the SNS server 150 also reads the most recent posted audio data of the posting user from the posting recording device 170 and executes speech recognition processing on it to generate text data. When reading the most recent posted audio data from the posting recording device 170, the SNS server 150 may instead read the posted audio data from a predetermined period extending back from the most recent post. In that case, the SNS server 150 executes the same speech recognition processing on each piece of posted audio data in that period to generate text data and extracts keywords whose frequency of appearance across the text data is at or above a certain level.
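Steps s206 to s208 choose whose profile to consult and, optionally, reduce several recent posts to their recurring keywords. The sketch below is hedged throughout: the dictionary keys `disclosure_to` and `poster`, the function names, and the threshold `min_count` are hypothetical, and a group designated as disclosure destination is not modeled.

```python
from collections import Counter


def pick_target_user(post):
    """s206: use the disclosure destination if one is designated, otherwise the poster."""
    return post.get("disclosure_to") or post["poster"]


def frequent_keywords(keywords_per_post, min_count=2):
    """Return keywords that occur at least min_count times across the recent posts.

    keywords_per_post is a list with one keyword list per recognized post;
    min_count stands in for the 'certain level' of appearance frequency.
    """
    counts = Counter(k for keywords in keywords_per_post for k in keywords)
    return sorted(k for k, n in counts.items() if n >= min_count)


recent = [["futtosaru", "ganbarou"], ["futtosaru", "shiai"], ["ganbarou", "futtosaru"]]
recurring = frequent_keywords(recent, min_count=2)
```

A frequency threshold filters out one-off words, so that the tie-break reflects topics the user keeps returning to rather than a single stray mention.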

The SNS server 150 then collates the user attribute information and the keywords from recent posts obtained in step s207 or s208 against the secondary evaluation table 127 and calculates the degree of match with the user attribute information and keywords defined for each BGM (s209).

The process of calculating the match between the user attribute information and keywords and each BGM proceeds, for example, as follows. If the user attribute information is "age: 40" and "music preference: rock" and the keywords are "futsal" and "ganbarou" (let's do our best), the SNS server 150 collates these values against the secondary evaluation table 127, identifies a match score of 10 points each for "age: 40" and "music preference: rock", and identifies no match score for the keywords "futsal" and "ganbarou". It performs this identification of match scores for the user attribute information and keywords against the record of each BGM in the secondary evaluation table 127 (s210) and calculates a total score for each BGM, as in evaluation result example 2 shown in FIG. 19.

In the example of FIG. 19, among "BGM4", "BGM5", and "BGM6", which tied for the highest score in the evaluation using the primary evaluation table 126, "BGM6" received a total of 20 points in the evaluation using the secondary evaluation table 127. In this way, collating the user attribute information and keywords against the secondary evaluation table 127 and calculating scores identifies the BGM with the highest degree of match, i.e., the highest score.
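The secondary evaluation can be sketched as re-scoring only the tied candidates. Everything below is an assumption for illustration: the function name, the table layout, the attribute encoding (an age of 40 is represented as the bucket "40s"), and the table contents, which are chosen only so that the outcome resembles the FIG. 19 example.

```python
def secondary_scores(candidates, attributes, keywords, secondary_table):
    """Score each tied BGM by matching user attributes and recent-post keywords."""
    totals = {}
    for bgm in candidates:
        record = secondary_table.get(bgm, {})
        attr_scores = record.get("attributes", {})
        kw_scores = record.get("keywords", {})
        totals[bgm] = (sum(attr_scores.get(a, 0) for a in attributes)
                       + sum(kw_scores.get(k, 0) for k in keywords))
    return totals


# Hypothetical secondary evaluation table 127
SECONDARY_TABLE = {
    "BGM4": {"attributes": {("age_group", "20s"): 10}, "keywords": {"kekkon": 6}},
    "BGM5": {"attributes": {("music_preference", "pop"): 10}, "keywords": {}},
    "BGM6": {"attributes": {("age_group", "40s"): 10, ("music_preference", "rock"): 10},
             "keywords": {"kurisumasu": 6}},
}

result = secondary_scores(
    ["BGM4", "BGM5", "BGM6"],
    [("age_group", "40s"), ("music_preference", "rock")],
    ["futtosaru", "ganbarou"],
    SECONDARY_TABLE,
)
winner = max(result, key=result.get)
```

Here the two attributes contribute 10 points each to BGM6 and the keywords contribute nothing, which corresponds to the 20-point total described for "BGM6".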

Then, as in step s205, the SNS server 150 transmits the identification information of the BGM (music data) identified in step s209, as recommendation information for music to be played together with the posted audio data, to the user terminal 200 of the poster of the posted audio data (s211). The user terminal 200 that receives the recommendation information displays it on its display for the user to view. If the user accepts the BGM indicated by the recommendation information, the user terminal 200 notifies the SNS server 150 of the acceptance via the public Web server 160.

The SNS server 150 plays the data of the BGM for which it received the acceptance notification from the user terminal 200 (held in the posting recording device 170 or in its own storage unit 101) together with the corresponding posted audio data, and transmits the resulting playback data to the user terminal 200 via the public Web server 160 (s212).

According to the second embodiment, music can be selected efficiently and without effort for audio posted on a voice SNS, which in turn improves the usability of the voice SNS.

--- System configuration in the third embodiment ---

A third embodiment of the present invention is described in detail below with reference to the drawings. The network configuration including the posted audio playback control system 1003 of the third embodiment is the same as that of the first and second embodiments (FIG. 1). Therefore, only the configuration that differs from the first embodiment is described below. The posted audio playback control system 1003 of the third embodiment (hereinafter, system 1003) is a computer system that plays a plurality of audio posts on a voice SNS continuously without any sense of incongruity and thereby conveys the posters' intentions well.

Next, the hardware configuration of the system 1003 is described. As in the first embodiment, the system 1003 of the third embodiment consists of the SNS server 150, the public Web server 160, and the posting recording device 170. The description here covers the SNS server 150, which is the main entity executing the processing in the system 1003.

In this case, as illustrated in FIG. 20, the SNS server 150 constituting the system 1003 includes: a storage unit 101 composed of an appropriate nonvolatile storage device such as a hard disk drive; a memory 103 composed of a volatile storage device such as RAM; a computation unit 104 such as a CPU that executes the program 102 held in the storage unit 101, for example by reading it into the memory 103, to perform overall control of the device itself as well as various determination, computation, and control processes; and a communication unit 105 that connects to the LAN line 121 and the like and handles communication with other devices. The storage unit 101 stores at least the program 102 for implementing the functions required of the posted audio playback control system of the third embodiment, the grouping table 131, the priority evaluation table 132, and the user table 133. Details of these tables are described later.

Although not specifically illustrated, the posted audio data 16 that the posting recording device 170 stores in its storage unit 11 is associated with the identification information of the BGM to be played together with that posted audio data.

Next, the functions of the system 1003 of the third embodiment are described. As noted above, the system 1003 of the third embodiment consists of the SNS server 150, the public Web server 160, and the posting recording device 170; to simplify the description, however, the following treats the SNS server 150 as also having the functions of the public Web server 160 and the posting recording device 170 and as operating as a single integrated system 1003. In the system 1003, data exchange with the user terminal 200 takes place via the public Web server 160, and posted audio data is managed via the posting recording device 170.

In this case, the system 1003 has a function of accessing the posting recording device 170 via the communication unit 105 to identify, among the posted audio data that the posting recording device 170 obtained from user terminals 200 and stores, the pieces to which the same disclosure destination designation information (e.g., identification information of the user or group serving as the disclosure destination) is attached. For each identified piece of posted audio data to which music for simultaneous playback is attached, the system collates the identification information of that music against the grouping table 131; for each piece to which no such music is attached, it executes speech recognition processing to generate text data and collates the generated text data against the grouping table 131. In either case it identifies the theme of each piece of posted audio data (e.g., birthday celebration, wedding celebration) and stores the pieces of posted audio data that share a theme in the storage unit 101 as a group for continuous playback.

The system 1003 also has a function of playing each piece of posted audio data in a continuous playback group in sequence and transmitting the resulting playback data to the user terminal 200.

The system 1003 may also have a function of reading from the user table 133 the attribute information (e.g., birthday, music preference, hobbies, age) of the user who is the disclosure destination of the continuous playback groups, collating this attribute information against the priority evaluation table 132 to identify a playback priority level for each group according to the user's attributes, determining the playback order among the groups according to those priority levels, reading the continuous playback groups from the storage unit 101 in that order, playing each piece of posted audio data in each group in sequence, and transmitting the resulting playback data to the user terminal 200.

The system 1003 may also have a function of reading the most recent posted audio data 16 of the user who is the disclosure destination of the continuous playback groups from the posting recording device 170 (or from the storage unit 101, to which the posted audio data 16 has been copied in advance from the posting recording device 170), executing speech recognition processing on that posted audio data to generate text data, collating the text data against the priority evaluation table 132 to identify a playback priority level for each group according to the content of the user's posts, determining the playback order among the groups according to those priority levels, reading the continuous playback groups from the storage unit 101 in that order, playing each piece of posted audio data in each group in sequence, and transmitting the resulting playback data to the user terminal 200.

The system 1003 may also have a function of setting, in the storage unit 101, a flag that excludes the theme of a continuous playback group from playback for a certain period when, during playback of the posted audio data, it receives a playback stop instruction via the communication unit 105 from the user terminal 200 receiving the playback data. In this case, when playing the posted audio data in the continuous playback groups in sequence, the system 1003 does not play the groups whose theme has the flag set and gives priority to playing the other continuous playback groups.
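The stop flag with a limited lifetime can be sketched as a per-theme expiry time. This is only one possible reading: the class name `ThemeSuppressor`, the use of seconds as the unit, and storing an expiry timestamp instead of a boolean flag are assumptions, not details given in the source.

```python
import time


class ThemeSuppressor:
    """Remember themes the listener stopped and skip them for a fixed period."""

    def __init__(self, period_seconds):
        self.period = period_seconds
        self._excluded_until = {}  # theme -> time at which the exclusion ends

    def on_stop(self, theme, now=None):
        """Called when a playback stop instruction arrives for a group with this theme."""
        now = time.time() if now is None else now
        self._excluded_until[theme] = now + self.period

    def playable(self, themes, now=None):
        """Return the themes whose exclusion period is not currently active, in order."""
        now = time.time() if now is None else now
        return [t for t in themes if self._excluded_until.get(t, 0) <= now]


suppressor = ThemeSuppressor(period_seconds=3600)
suppressor.on_stop("farewell", now=1000)
during = suppressor.playable(["celebratory", "farewell", "energetic"], now=2000)
after = suppressor.playable(["celebratory", "farewell", "energetic"], now=5000)
```

Storing the expiry time means the exclusion lapses on its own, so no separate job is needed to clear the flag once the period has passed.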

--- Example of data structure in the third embodiment ---

Next, examples of the data structures of the tables used by the system 1003 of the third embodiment will be described. FIG. 21 shows an example of the grouping table 131 of the third embodiment. The grouping table 131 is a collection of records that, keyed by a BGM name such as "birthday song" or "wedding song" (that is, the identification information of music data), associate text keywords generated by speech recognition of posted audio, the theme of the music or posted audio such as "celebratory" or "energetic", and the tone (mood of the music) such as "fun" or "bright".

FIG. 22 shows an example of the priority evaluation table 132 of the third embodiment. The priority evaluation table 132 is a collection of records that associate user attribute information (e.g., birthday) or post content (e.g., marriage, match) with the playback priority level of a continuous playback target. In the example of FIG. 22, the table defines the evaluation score assigned to a continuous playback group when the attribute information (e.g., birthday) or the most recent post content (e.g., marriage, match) of the user who is the disclosure destination of that group matches an item value for user attributes or post content in the table.
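One way to turn such a table into a playback order is to add up the evaluation scores per group and sort. The sketch below rests on several assumptions: the row layout (`kind`, `value`, `theme`, `score`) is hypothetical because FIG. 22 is not reproduced in the text, groups are identified by their theme, and groups with equal scores keep their original order.

```python
def order_groups(groups, user_attributes, recent_keywords, priority_table):
    """Return group themes sorted from highest to lowest evaluation score."""
    scores = {theme: 0 for theme in groups}
    for row in priority_table:
        if row["kind"] == "attribute":
            matched = row["value"] in user_attributes
        else:  # row["kind"] == "content"
            matched = row["value"] in recent_keywords
        if matched and row["theme"] in scores:
            scores[row["theme"]] += row["score"]
    # sorted() is stable, so groups with equal scores keep their input order
    return sorted(groups, key=lambda theme: scores[theme], reverse=True)


# Hypothetical priority evaluation table 132
PRIORITY_TABLE = [
    {"kind": "attribute", "value": ("birthday", "today"), "theme": "celebratory", "score": 10},
    {"kind": "content", "value": "kekkon", "theme": "celebratory", "score": 8},
    {"kind": "content", "value": "shiai", "theme": "energetic", "score": 6},
]

groups = {"farewell": [3, 5], "celebratory": [1, 4, 6, 7], "energetic": [2]}
order = order_groups(groups, {("birthday", "today")}, {"shiai"}, PRIORITY_TABLE)
```

In this made-up case a listener whose birthday is today and who recently posted about a match hears the celebratory group first, then the energetic group, then the farewell group.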

FIG. 23 shows an example of the user table 133 of the third embodiment. The user table 133 describes the attribute information of each user of the voice SNS. In the illustrated example, it is a collection of records that, keyed by user ID, associate each user with attribute values such as birthday, music preference, hobbies, and age.

--- Example of processing procedure in the third embodiment ---

The actual procedure of the posted audio playback control method of the third embodiment is described below with reference to the drawings. The operations corresponding to the posted audio playback control method described below are implemented by programs that the devices constituting the system 1003 read into memory or the like and execute. These programs consist of code for performing the operations described below.

FIG. 24 is a flowchart showing an example of the processing procedure of the posted audio playback control method of the third embodiment. First, the SNS server 150 accesses the posting recording device 170 via the communication unit 105 and identifies, among the posted audio data 16 that the posting recording device 170 obtained from user terminals 200 and stores, the pieces to which the same disclosure destination designation information (e.g., identification information of the user or group serving as the disclosure destination) is attached (s300).

Next, the SNS server 150 determines whether music for simultaneous playback is attached to the posted audio data identified in step s300 (s301). If it is, the SNS server 150 collates the identification information of that music, attached to each identified piece of posted audio data, against the grouping table 131 and identifies the theme (e.g., birthday celebration, wedding celebration) and tone of the music to be played with each piece (s302). Here the SNS server 150 reads the BGM identification information, such as "birthday song", attached to the posted audio data 16 in the posting recording device 170, collates it against the grouping table 131, and identifies the theme of the music as, for example, "celebratory" and its tone as "fun, bright". The music identification information attached to each piece of posted audio data is assumed to have been identified by the music selection support system 1001 of the second embodiment and attached to the posted audio data in the posting recording device 170.

If no music to be played simultaneously is attached to the posted audio data identified in step s300, the SNS server 150 performs speech recognition on each identified item of posted audio data to generate text data, collates the generated text data against the grouping table 131, and identifies the theme (e.g., birthday celebration, wedding celebration) of each item (s303).

Next, the SNS server 150 stores in the storage unit 101, as a group for continuous playback, the items of posted audio data that share the theme identified in step s302 or s303 and, preferably, also share the same tone (s304). The SNS server 150 repeats this grouping until all of the posted audio data identified in step s300 have been processed (s305). In the evaluation result example shown in FIG. 25, seven posts, "Post 1" through "Post 7", were identified as posted audio data for the same publication destination. The BGM identification information linked to these posts was "BGM1: Birthday Song", "BGM2: Cheer Song", "BGM3: Graduation Song", "BGM4: Birthday Song", "BGM5: Graduation Song", "BGM6: Valentine Song", and "BGM7: no BGM", and the theme-based groups were classified as "Post 1: Celebration", "Post 2: Energetic", "Post 3: Farewell", "Post 4: Celebration", "Post 5: Farewell", "Post 6: Celebration", and "Post 7: Celebration".
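The theme identification and grouping of steps s301 through s305 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the dictionaries `BGM_THEMES` and `KEYWORD_THEMES` stand in for the grouping table 131, the `text` field stands in for the output of speech recognition, the English post texts paraphrase the FIG. 25 example, and matching on tone is omitted.

```python
from collections import defaultdict

# Hypothetical stand-ins for grouping table 131 (contents are illustrative).
BGM_THEMES = {
    "Birthday Song": "Celebration",
    "Cheer Song": "Energetic",
    "Graduation Song": "Farewell",
    "Valentine Song": "Celebration",
}
KEYWORD_THEMES = {"birthday": "Celebration", "game": "Energetic",
                  "graduation": "Farewell"}

def identify_theme(post):
    """Return the theme of one post, or None if it cannot be identified."""
    bgm = post.get("bgm")
    if bgm is not None:
        # s301: music attached -> s302: look the BGM up in the table.
        return BGM_THEMES.get(bgm)
    # s303: no music attached -> match recognized text against keywords.
    text = post["text"].lower()
    for keyword, theme in KEYWORD_THEMES.items():
        if keyword in text:
            return theme
    return None

def group_posts(posts):
    """s304/s305: collect post ids that share a theme into one group."""
    groups = defaultdict(list)
    for post in posts:
        theme = identify_theme(post)
        if theme is not None:
            groups[theme].append(post["id"])
    return dict(groups)

POSTS = [
    {"id": 1, "bgm": "Birthday Song", "text": "Happy 7th birthday"},
    {"id": 2, "bgm": "Cheer Song", "text": "Good luck in tomorrow's game"},
    {"id": 3, "bgm": "Graduation Song", "text": "Graduation is coming soon"},
    {"id": 4, "bgm": "Birthday Song", "text": "Happy birthday, already 7"},
    {"id": 5, "bgm": "Graduation Song", "text": "See you after graduation"},
    {"id": 6, "bgm": "Valentine Song", "text": "Busy making chocolates"},
    {"id": 7, "bgm": None, "text": "Today's birthday party was fun"},
]

GROUPS = group_posts(POSTS)
```

With these illustrative tables, Post 7, which has no BGM, falls into the Celebration group through the keyword branch, matching the classification described for FIG. 25.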

Next, the SNS server 150 reads from the user table 133 the attribute information (e.g., birthday, music preferences, hobbies, age) of the publication-destination user of the groups for continuous playback (s306). The SNS server 150 also reads the most recent posted audio data 16 of that publication-destination user (which may be several items going back a fixed period from the most recent one) from the post recording device 170 (or from the storage unit 101, to which the posted audio data 16 has been copied in advance from the post recording device 170), starts the speech recognition program included in the program 102, performs speech recognition on that posted audio data, and generates text data (s307). The SNS server 150 is assumed to include a speech recognition program as part of the program 102 for this purpose.

The SNS server 150 collates the user's attribute information and the text data (that is, the content of the user's most recent posts) obtained in steps s306 and s307 against the priority evaluation table 132 and identifies the playback priority level of each group according to that attribute information and recent post content (s308).

In the evaluation result example shown in FIG. 25, the attribute information of the publication-destination user is "Birthday: February", and the user's most recent posts were "Tomorrow, after the soccer game, it's my birthday party!", "Valentine's Day is this weekend", and "Graduating next month. I'll miss everyone." Accordingly, among Posts 1 through 7, Post 1, whose content is "Happy 7th birthday", earns 10 points for the keyword "birthday" (誕生日), and these 10 points are added to its group, "Celebration". Post 2, whose content is "Let's do our best in tomorrow's game", earns 10 points for the keyword "game", and these 10 points are added to its group, "Energetic". Post 3, whose content is "Graduation is coming soon; it's sad that we'll be at different high schools", earns 10 points for the keyword "graduation", and these 10 points are added to its group, "Farewell". Post 4, whose content is "Happy birthday. Already 7 years old. You've grown so much", earns 10 points for the keyword "birthday" (バースデイ), and these 10 points are added to its group, "Celebration". Post 5, whose content is "Let's all play together again even after graduation", earns 10 points for the keyword "graduation", and these 10 points are added to its group, "Farewell". Post 6, whose content is "I'm busy making Valentine's chocolates", earns 10 points for the keyword "Valentine", and these 10 points are added to its group, "Celebration". Post 7, whose content is "Today's birthday party was fun, with everyone celebrating me", earns 10 points for the keyword "birthday" (誕生日), and these 10 points are added to its group, "Celebration".

As for the group scores, the Celebration group received 10 points each from Posts 1, 4, 6, and 7, for a total of 40 points; the Energetic group received 10 points from Post 2 only, for a total of 10 points; and the Farewell group received 10 points each from Posts 3 and 5, for a total of 20 points. The SNS server 150 judges that a group with a higher total score has a higher playback priority level and accordingly determines the playback order among the groups as first: Celebration, second: Farewell, third: Energetic (s309).
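The scoring and ordering of steps s308 and s309 can be reproduced with a short Python sketch. The dictionary `KEYWORD_POINTS` is a hypothetical stand-in for the priority evaluation table 132 after it has been matched against the user's attributes and recent posts, and the English post texts paraphrase the FIG. 25 example; neither is taken from the specification itself.

```python
# Hypothetical stand-in for priority evaluation table 132: keywords drawn
# from the user's attributes and recent posts, each worth 10 points.
KEYWORD_POINTS = {"birthday": 10, "game": 10, "graduation": 10, "valentine": 10}

# (group theme, recognized text) for Posts 1 to 7, paraphrased in English.
SCORED_POSTS = [
    ("Celebration", "Happy 7th birthday"),
    ("Energetic", "Let's do our best in tomorrow's game"),
    ("Farewell", "Graduation is coming soon"),
    ("Celebration", "Happy birthday. Already 7 years old"),
    ("Farewell", "Let's play together again even after graduation"),
    ("Celebration", "I'm busy making Valentine's chocolates"),
    ("Celebration", "Today's birthday party was fun"),
]

def rank_groups(posts, keyword_points):
    """Sum keyword points per group (s308) and sort groups by total (s309)."""
    scores = {}
    for theme, text in posts:
        lowered = text.lower()
        points = sum(p for kw, p in keyword_points.items() if kw in lowered)
        scores[theme] = scores.get(theme, 0) + points
    # Higher total score means higher playback priority.
    order = sorted(scores, key=lambda theme: scores[theme], reverse=True)
    return scores, order

SCORES, ORDER = rank_groups(SCORED_POSTS, KEYWORD_POINTS)
```

Under these assumptions the totals come out as 40, 20, and 10 points for Celebration, Farewell, and Energetic, giving the same playback order as in the text.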

Next, the SNS server 150 reads the groups for continuous playback, in the playback order determined in step s309, from the post recording device 170 (or from the storage unit 101, to which the posted audio data has been copied from the post recording device 170), plays back the posted audio data included in each group in sequence, and transmits the played-back data to the user terminal 200 (s310).

If, during playback of the posted audio data in step s310, the SNS server 150 receives a playback stop instruction via the communication unit 105 from the user terminal 200 that is receiving the playback data (s311: Yes), it may set in the storage unit 101 a flag that excludes the theme of the group being played from playback for a fixed period (s312). When the SNS server 150 later executes step s310 again after step s312, it does not play the group corresponding to a flagged theme while the validity period indicated by the flag lasts, and instead plays the remaining groups for continuous playback, starting with the one earliest in the playback order. Such processing reliably takes into account the fact that the user did not like the playback order determined by the system 1003 and enables continuous playback of posted audio data that better matches the user's wishes and mood.
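The exclusion flag of steps s311 and s312 can be sketched as a small Python class. The class name `ExclusionFlags`, its method names, and the seven-day period in the usage lines are hypothetical; the specification only says that the flag excludes a theme from playback for a fixed period.

```python
from datetime import datetime, timedelta

class ExclusionFlags:
    """Tracks, per theme, until when that theme is excluded from playback."""

    def __init__(self, period):
        self.period = period      # validity period of a flag (assumed fixed)
        self._expires = {}        # theme -> expiry time of its flag

    def on_stop(self, theme, now):
        """s312: a stop instruction arrived while `theme` was playing."""
        self._expires[theme] = now + self.period

    def is_excluded(self, theme, now):
        expiry = self._expires.get(theme)
        return expiry is not None and now < expiry

    def playable(self, ordered_themes, now):
        """Keep the determined playback order, skipping flagged themes."""
        return [t for t in ordered_themes if not self.is_excluded(t, now)]

# Usage with illustrative values.
flags = ExclusionFlags(timedelta(days=7))
t0 = datetime(2012, 7, 30, 12, 0)
flags.on_stop("Celebration", t0)
order = ["Celebration", "Farewell", "Energetic"]
next_day = flags.playable(order, t0 + timedelta(days=1))
after_expiry = flags.playable(order, t0 + timedelta(days=8))
```

Storing an expiry time rather than a boolean lets the flag lapse on its own, so no separate cleanup step is needed in this sketch.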

The SNS server 150 may also keep in the storage unit 101, for each user who issued a playback stop instruction that gave rise to such a flag, a history of flag settings associated with the corresponding group. In this case, the SNS server 150 calculates, for that user, the frequency of flag setting in each group over each fixed period, and when a flag is subsequently set again for the group with the highest flag-setting frequency, it extends the validity period of the flag, that is, the period of exclusion from playback, by a fixed amount beyond the normal period. Such processing takes the user's dislikes into account even more reliably and enables continuous playback of posted audio data that better matches the user's wishes and mood.
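The extension rule can be sketched in Python as follows. The function name `exclusion_period`, the list-based history, and the handling of ties (every group tied for the highest frequency gets the extension) are assumptions made for illustration; the specification does not say how ties are resolved.

```python
from collections import Counter
from datetime import timedelta

def exclusion_period(history, theme, base, extension):
    """Return the flag validity period to apply when `theme` is flagged again.

    history: themes flagged by this user within the evaluation window.
    base: normal exclusion period; extension: extra time for the theme
    that this user has flagged most often.
    """
    counts = Counter(history)
    if not counts:
        return base
    highest = max(counts.values())
    # Counter returns 0 for themes never flagged, so they never match.
    if counts[theme] == highest:
        return base + extension
    return base

# Usage with illustrative values.
HISTORY = ["Farewell", "Celebration", "Farewell"]
BASE, EXTRA = timedelta(days=7), timedelta(days=3)
farewell_period = exclusion_period(HISTORY, "Farewell", BASE, EXTRA)
celebration_period = exclusion_period(HISTORY, "Celebration", BASE, EXTRA)
```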

According to the third embodiment, multiple posted audio items in a voice SNS can be played back continuously without a sense of incongruity, and the posters' intentions can be conveyed well.

The description in this specification makes at least the following clear. In the posted audio playback control system of the third embodiment, the storage unit may further include a user table describing the attribute information of each user of the social network service and a priority evaluation table associating user attribute information with playback priority levels of continuous playback targets, and the calculation unit may read the attribute information of the publication-destination user of the groups for continuous playback from the user table, collate that attribute information against the priority evaluation table to identify the playback priority level of each group according to the user's attribute information, determine the playback order among the groups according to the playback priority levels, read the groups for continuous playback from the storage unit in that playback order, play back the posted audio data included in each group in sequence, and transmit the played-back data to the user terminal.

In the posted audio playback control system of the third embodiment, the storage unit may also accumulate the posted audio data received from each user terminal and further include a user table describing the attribute information of each user of the social network service and a priority evaluation table associating users' post content with playback priority levels of continuous playback targets, and the calculation unit may read the most recent posted audio data of the publication-destination user of the groups for continuous playback from the storage unit, perform speech recognition on that posted audio data to generate text data, collate the text data against the priority evaluation table to identify the playback priority level of each group according to the user's post content, determine the playback order among the groups according to the playback priority levels, read the groups for continuous playback from the storage unit in that playback order, play back the posted audio data included in each group in sequence, and transmit the played-back data to the user terminal.

1001 Music selection support system (second embodiment)
1002 Posted audio playback control system (first embodiment)
1003 Posted audio playback control system (third embodiment)
11, 101, 111 Storage unit
12, 102, 112 Program
13, 103, 113 Memory
14, 104, 114 Calculation unit
15, 105, 115 Communication unit
16 Posted audio data
120 Internet (network)
121 LAN line (network)
122 Public line network (network)
125 Evaluation table
126 Primary evaluation table
127 Secondary evaluation table
128 User table
129 Posted audio data information table
130 Determination table
131 Grouping table
132 Priority evaluation table
133 User table
150 SNS server
160 Public Web server
170 Post recording device
200 User terminal
220, 230 Poster terminal (user terminal)
240 Viewing and playback terminal (user terminal)
300 Telephone response system
310 Exchange
320 Automatic voice response device
330 CTI device

Claims

A posted audio playback control system comprising:
a communication unit that communicates via a network with a user terminal used by a user of a social network service;
a storage unit that stores a determination table in which identification information of an event is associated with selection keywords selected in advance for that event; and
a calculation unit that executes: processing of identifying, among the posted audio data received from the user terminal via the communication unit, the items to which predetermined publication-destination designation information is attached, performing speech recognition on each identified item of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination;
processing of collating each item of text data having a common publication destination against the selection keywords in the determination table, identifying the items whose text data contain the same selection keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group to be played simultaneously; and
processing of searching each item of posted audio data included in the group to be played simultaneously for the start time or end time of the audio signal of the same selection keyword, deleting the audio signal of the unnecessary section from the head of the data to the start time or of the unnecessary section from the end time to the end of the data, and transmitting to the user terminal data that can be output by audio output means when the posted audio data after the deletion are played back simultaneously.

The posted audio playback control system according to claim 1, wherein the calculation unit transmits to the user terminal data that can be output by the audio output means when the posted audio data after the deletion are played back simultaneously from the head of the data.

The posted audio playback control system according to claim 1, wherein the calculation unit transmits to the user terminal data that can be output by the audio output means when the posted audio data after the deletion are played back with the ends of the data aligned so that they finish simultaneously.

The posted audio playback control system according to claim 1, wherein the calculation unit calculates the average playback time length of the posted audio data after the deletion, sets a playback speed slower than a reference speed for posted audio data whose playback time length is less than the average, sets a playback speed faster than the reference speed for posted audio data whose playback time length exceeds the average, thereby unifying the playback time lengths of the posted audio data, and transmits to the user terminal data that can be output by the audio output means when the posted audio data after this processing are played back simultaneously.

A posted audio playback control method in which a computer, comprising a communication unit that communicates via a network with a user terminal used by a user of a social network service and a storage unit that stores a determination table in which identification information of an event is associated with selection keywords selected in advance for that event, executes:
processing of identifying, among the posted audio data received from the user terminal via the communication unit, the items to which predetermined publication-destination designation information is attached, performing speech recognition on each identified item of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination;
processing of collating each item of text data having a common publication destination against the selection keywords in the determination table, identifying the items whose text data contain the same selection keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group to be played simultaneously; and
processing of searching each item of posted audio data included in the group to be played simultaneously for the start time or end time of the audio signal of the same selection keyword, deleting the audio signal of the unnecessary section from the head of the data to the start time or of the unnecessary section from the end time to the end of the data, and transmitting to the user terminal data that can be output by audio output means when the posted audio data after the deletion are played back simultaneously.

A posted audio playback control program that causes a computer, comprising a communication unit that communicates via a network with a user terminal used by a user of a social network service and a storage unit that stores a determination table in which identification information of an event is associated with selection keywords selected in advance for that event, to execute:
processing of identifying, among the posted audio data received from the user terminal via the communication unit, the items to which predetermined publication-destination designation information is attached, performing speech recognition on each identified item of posted audio data to generate text data, and storing the generated text data in the storage unit for each publication destination;
processing of collating each item of text data having a common publication destination against the selection keywords in the determination table, identifying the items whose text data contain the same selection keyword as relating to the same publication destination and the same event, and storing them in the storage unit as a group to be played simultaneously; and
processing of searching each item of posted audio data included in the group to be played simultaneously for the start time or end time of the audio signal of the same selection keyword, deleting the audio signal of the unnecessary section from the head of the data to the start time or of the unnecessary section from the end time to the end of the data, and transmitting to the user terminal data that can be output by audio output means when the posted audio data after the deletion are played back simultaneously.