JP2022185174A

JP2022185174A - Message service providing method, message service providing program and message service system

Info

Publication number: JP2022185174A
Application number: JP2021092665A
Authority: JP
Inventors: 心根岸; Kokoro Negishi
Original assignee: 6assets; 6assets Co Ltd
Current assignee: 6assets; 6assets Co Ltd
Priority date: 2021-06-02
Filing date: 2021-06-02
Publication date: 2022-12-14

Abstract

PROBLEM TO BE SOLVED: Conventional techniques for posting audio content merely make posting more convenient. Because audio content, unlike text or images, has poor visibility, its appeal and value as audio content are not conveyed well to users. MEANS FOR SOLVING THE PROBLEM: A method of providing a message service through which a plurality of voice messages, each composed within a predetermined time, can be posted via a network comprises the steps of: acquiring user information, which is information for identifying a user; acquiring voice in association with the user information of the user who is the acquisition source; displaying a voice playback UI, which is an interface for enabling playback of the acquired voice; and outputting, upon receiving a selection of the voice playback UI, the voice according to the received selection. SELECTED DRAWING: Figure 2

Description

The present invention relates to a method, a program, and a system for providing a message service through which a plurality of voice messages, each composed within a predetermined time, can be posted via a network.

Techniques for promoting various kinds of communication through social networking services (SNS) and the like have long been known.

Such services have traditionally exchanged text-based and image-based (including video) information. Recently, however, services built on voice-based posting have been launched and related techniques have been disclosed. For example, Patent Literature 1 discloses a technique for recording a voice call and, on condition that the other party consents, publicly posting the recorded data to social media as audio content.

Patent Literature 1: Japanese Patent Application Laid-Open No. 2020-052794

However, the prior art described in Patent Literature 1 is merely a technique for making it more convenient to post content, and it can hardly be said to provide a suitable means of presentation for users who encounter the posted content. That is, because audio content, unlike text-based or image-based content, has little visual presence in itself, it is difficult to say that the appeal or value of audio content has been successfully conveyed to users.

To solve the above problem, the present invention proposes, among other things, a method of providing a message service system through which a plurality of voice messages, each composed within a predetermined time, can be posted via a network, the method causing a computer to execute: a user information acquisition step of acquiring user information, which is information for identifying a user; a voice acquisition step of acquiring voice in association with the user information of the user who is the acquisition source; a UI display step of displaying a voice playback UI, which is an interface for enabling playback of the acquired voice; a selection reception step of receiving a selection of the voice playback UI; and a voice output step of outputting the voice according to the received selection.

In relation to the above method, a method is also proposed in which the voice acquisition step further includes an edit acquisition sub-step of editing the voice to be acquired according to its playback time length.

In relation to the above method, a method is also proposed in which the voice output step further includes a time-series output sub-step of outputting a plurality of posted voice messages consecutively in order.

In relation to the above method, a method is also proposed that further includes a pre-processing step of, when a selection is received in the selection reception step, performing acoustic processing on the selected voice according to the user information of the user who made the selection, and in which the voice output step includes a processed voice output sub-step of outputting the voice processed in the pre-processing step.

In relation to the above method, a method is also proposed that further includes a transcription step of applying speech recognition to the acquired voice and converting part or all of it into text, and in which the UI display step further includes a text display sub-step of displaying the voice playback UI together with the text data produced in the transcription step.

Programs and systems for realizing the above methods are also proposed.

The present invention, which mainly has the configuration described above, makes it possible to improve the visibility of posted audio content for many users.

FIG. 1 is a schematic diagram of the system of the present invention.
FIG. 2 is a diagram showing an example of functional blocks of the system of Embodiment 1.
FIG. 3 is a schematic diagram showing an example of a configuration in which the functional components of the system of Embodiment 1 are collectively realized as one piece of hardware.
FIG. 4 is a diagram showing an example of the flow of processing in the system of Embodiment 1.
FIG. 5 is a diagram showing an example of functional blocks of the system of Embodiment 2.
FIG. 6 is a diagram showing an example of the flow of processing in the system of Embodiment 2.
FIG. 7 is a diagram showing an example of functional blocks of the system of Embodiment 3.
FIG. 8 is a diagram showing an example of the flow of processing in the system of Embodiment 3.
FIG. 9 is a diagram showing an example of functional blocks of the system of Embodiment 4.
FIG. 10 is a diagram showing an example of the flow of processing in the system of Embodiment 4.
FIG. 11 is a diagram showing an example of functional blocks of the system of Embodiment 5.
FIG. 12 is a diagram showing an example of the flow of processing in the system of Embodiment 5.

First, refer to FIG. 1, which shows an outline of the present invention. As shown in the figure, the present invention can be realized by a computer 0121 and a server 0131 connected via a network to a plurality of smartphones 0101, 0102, 0111, a tablet 0112, and other terminals. Because the technical feature of the present invention lies in voice playback, each of the smartphones and tablets must be equipped with some kind of audio output interface. In that sense, the type of terminal is not particularly limited as long as it has such an interface, and may also include, for example, smart glasses, smart watches, and AR/VR goggles. In addition, the computer 0121 and the server 0131 may constitute a single system as a whole in the form of so-called cloud computing, in which they are connected and cooperate via a network as shown in FIG. 1, or may of course be configured integrally as an on-premises system.

Each embodiment of the present invention is described below with reference to the drawings. The embodiments correspond to the claims as follows. Embodiment 1 mainly corresponds to claims 1, 6, and 7. Embodiment 2 mainly corresponds to claim 2. Embodiment 3 mainly corresponds to claim 3. Embodiment 4 mainly corresponds to claim 4. Embodiment 5 mainly corresponds to claim 5.

The present invention is not limited in any way to these embodiments. It embodies the technical ideas set out in each claim as understood in light of common technical knowledge, and can be implemented in various forms without departing from the gist of those claims.

<<Embodiment 1>>
<Overview>
FIG. 2 is a diagram showing an example of functional blocks of the message service providing system of this embodiment. As shown in the figure, the "message service providing system" 0200 of this embodiment has a "user information acquisition unit" 0201, a "voice acquisition unit" 0202, a "UI display unit" 0203, a "selection reception unit" 0204, and a "voice output unit" 0205.

Note that the message service providing system described in detail below may also be configured so that one or more of its functions are realized by a plurality of devices, and each of its functional blocks may be realized as hardware or software. Taking a computer-based implementation as an example, these include hardware components such as a CPU, main memory, GPU, TPU, image memory, bus, secondary storage device (hard disk or non-volatile memory), various input devices such as a keyboard, microphone, touch panel, and electronic pen for touching the touch panel, various output devices such as a speaker and display, and other external peripheral devices, as well as interfaces for those external peripheral devices, communication interfaces, driver programs for controlling that hardware, and other application programs.

Through arithmetic processing according to a program loaded into the main memory, data that has been input from input devices or other interfaces and is held in memory or on hardware is processed and stored, and instructions for controlling the hardware and software described above are generated. The program may be realized as a plurality of modularized programs, or two or more programs may be combined, by cloud computing or other means, and realized as a single program.

<Functional configuration>
Each functional block constituting the message service providing system of this embodiment is configured to provide a message service through which a plurality of voice messages, each composed within a predetermined time, can be posted via a network. Specifically, the service can be used by a plurality of users, and by allowing each user to post voice messages, it can be configured as a service that lets those users communicate with one another in a conversational format. The conversational communication can be viewed by all or some of the users of the service, and each user can set the range of users who are allowed to view it as appropriate.

Although the technical feature of the message service providing system of this embodiment is that voice messages can be posted, the information that can be posted is not limited to voice messages. Text, images, and video can also be posted together with a voice message or in place of one. Adopting such a configuration makes it possible to provide an environment in which users can communicate using rich means of expression.

The "user information acquisition unit" 0201 is configured to acquire user information, which is information for identifying a user. The user information is generated or acquired in response to an application, made by the user via a terminal that the user manages, to use the message service providing system of this embodiment. Specifically, an ID arbitrarily set by the user or the system may serve as the user information, and information acquired in association with that ID, such as the user's name, email address, SNS account, face photo, profile, and any other information that can be used to identify the user, can also be treated as user information.

The user information acquisition unit acquires the user information, for example, when a user who wishes to use the service operates the terminal and performs login processing beforehand. The user information may be acquired every time use is requested, or, once login processing has been performed and the user information acquired, the acquired user information may be used to allow use of the service for a predetermined period.
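The two login policies described above (acquire on every request, or reuse for a predetermined period) can be sketched as follows. This is a minimal illustration only: `Session`, `must_acquire_user_info`, and the 30-day `SESSION_LIFETIME` are hypothetical names and values, since the source does not specify the length of the period.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed value: the source only says "a predetermined period".
SESSION_LIFETIME = timedelta(days=30)


@dataclass
class Session:
    user_id: str           # ID set by the user or by the system
    logged_in_at: datetime


def must_acquire_user_info(session: Optional[Session], now: datetime,
                           every_request: bool = False) -> bool:
    """Return True when user information has to be acquired (again)."""
    if every_request or session is None:
        return True
    # Reuse previously acquired user information until the period ends.
    return now - session.logged_in_at > SESSION_LIFETIME
```

Setting `every_request=True` models the stricter configuration in which user information is acquired each time use is requested.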

The "voice acquisition unit" 0202 is configured to acquire voice in association with the user information of the user who is the acquisition source. One specific way of acquiring voice is to capture the voice uttered by the user through a microphone provided in the terminal the user manages. Other conceivable configurations are to accept transmission of audio data recorded on the user terminal in advance, or to read out audio data that was previously received from a user terminal or the like and stored on a content server of this system or an external content server.

Along with the voice, the voice acquisition unit can also acquire voice-related information, which is information that accompanies or relates to the voice, such as the date and time the voice was acquired, its time length, frequency, and wavelength, and its association with a specific user. By having the UI display unit described later use this voice-related information when displaying the voice playback UI, the characteristics of the voice can be made easier for users to recognize visually.

When voice is acquired from a user, the user may be allowed to set in advance the time period or the time of day at which the voice becomes playable, and information on that setting may be acquired as voice-related information. With this configuration, when a user wants, for example, to send a birthday message to a certain user or to send a particular message just before a specific scheduled event, the voice acquisition unit can acquire the voice in advance together with the time at which it becomes playable, and the UI display unit described later can display the voice playback UI based on that information. This ensures that the voice is played at an opportune moment.
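As a rough sketch of how a playable time stored as voice-related information might gate the display of the voice playback UI, consider the following. The `VoiceMessage` fields and `should_show_playback_ui` are assumptions made for illustration; the source does not define a data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VoiceMessage:
    sender_id: str
    acquired_at: datetime
    # Hypothetical voice-related information set by the sender in advance.
    playable_from: Optional[datetime] = None   # None: playable immediately
    playable_until: Optional[datetime] = None  # None: no end time


def should_show_playback_ui(msg: VoiceMessage, now: datetime) -> bool:
    """Show the playback UI only inside the sender's playable window."""
    if msg.playable_from is not None and now < msg.playable_from:
        return False
    if msg.playable_until is not None and now > msg.playable_until:
        return False
    return True
```

A birthday greeting recorded a week early would carry `playable_from` set to the birthday, so the UI stays hidden until then.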

Furthermore, the acquired voice may be deleted as data once a certain period (for example, 24 hours) has elapsed after acquisition. Voice data often requires careful handling because the person who recorded it can easily be identified from its content, tone of voice, and other factors. Adopting this configuration lets users feel safe and secure about providing voice input from a security standpoint.
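The time-based deletion described above could look roughly like this. The in-memory `store` dictionary and `purge_expired` are hypothetical; only the 24-hour figure comes from the text, where it is given as an example.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

RETENTION = timedelta(hours=24)  # example period given in the text


def purge_expired(store: Dict[str, Tuple[datetime, bytes]],
                  now: datetime) -> List[str]:
    """Delete voice data older than RETENTION; return the deleted IDs.

    `store` maps a message ID to (acquired_at, audio_bytes).
    """
    expired = [mid for mid, (acquired_at, _) in store.items()
               if now - acquired_at >= RETENTION]
    for mid in expired:
        del store[mid]
    return expired
```

In a real deployment the same rule would more likely be applied by a scheduled job against persistent storage; the dictionary here only keeps the sketch self-contained.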

The voice acquisition unit can also acquire a plurality of voices from a single user in advance. How those voices are processed can be set as appropriate, and one example is described specifically in Embodiment 3.

The "UI display unit" 0203 is configured to display a voice playback UI, which is an interface for enabling playback of the acquired voice. One example of displaying the voice playback UI is to show, on the terminal screen, a button or other icon that allows the acquired voice to be played.

When the voice-related information acquired together with the voice includes an association with a specific user, the UI display unit can also be controlled so that the voice playback UI is displayed only to that associated user. With this configuration, voice with private content, for example, can be handled so that the voice playback UI is shown only to specific users, which avoids the risk of voice data leaking to an unintended audience, as happens when a message is sent to the wrong destination by mistake.
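The recipient-restricted display described above amounts to a filter over posts. The sketch below assumes a hypothetical `VoicePost` record whose `recipients` field carries the association with specific users; whether the sender also sees the UI is not stated in the source and is left out here.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class VoicePost(NamedTuple):
    post_id: str
    sender_id: str
    # Hypothetical field: None means no association with specific users.
    recipients: Optional[FrozenSet[str]]


def visible_posts(posts: Iterable[VoicePost], viewer_id: str) -> List[VoicePost]:
    """Return the posts whose playback UI may be shown to this viewer."""
    return [p for p in posts
            if p.recipients is None or viewer_id in p.recipients]
```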

A voice playback UI whose selection has been received by the selection reception unit described later, or one associated with voice that has been output by the voice output unit described later, may continue to be displayable on the user terminal as before. Alternatively, the UI display unit may be controlled so that the UI is hidden on the user terminal after each process has been performed once or a predetermined number of times, or after a certain period (for example, 24 hours) has elapsed. The number of times each process may be performed before the UI is hidden can be made selectable by the user from whom the voice was acquired. Adopting this configuration gives the voice scarcity, which in turn makes it possible to attach charges or other incentives to providing the voice.
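One way to model hiding the UI after a set number of plays or after a set time is a small state object per voice playback UI. `PlaybackUiState` and its fields are illustrative assumptions, not structures defined in the source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PlaybackUiState:
    acquired_at: datetime
    max_plays: Optional[int] = None        # chosen by the sender; None: unlimited
    lifetime: Optional[timedelta] = None   # e.g. 24 hours; None: no time limit
    plays: int = 0

    def record_play(self) -> None:
        """Count one completed selection/output of this voice."""
        self.plays += 1

    def is_displayed(self, now: datetime) -> bool:
        """Hide the UI once either limit has been reached."""
        if self.max_plays is not None and self.plays >= self.max_plays:
            return False
        if self.lifetime is not None and now - self.acquired_at >= self.lifetime:
            return False
        return True
```

With `max_plays=1` the message behaves as a play-once item, which is the kind of scarcity the paragraph above refers to.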

Furthermore, so that the most recently acquired voice can be played preferentially, its voice playback UI may be displayed in a prominent manner, for example by pop-up display processing. With this configuration, the freshest voice information can be presented with a visual cue that it should be listened to first, which further increases the recipient's motivation to listen.

The "selection reception unit" 0204 is configured to receive a selection of the voice playback UI. Specifically, the selection may be received when the button or other icon described above is tapped or clicked. Along with the selection of the voice playback UI, the selection reception unit can also receive from the user a selection concerning the mode of voice output. Specifically, the output volume, the time length or range of the output (for example, output limited to a predetermined time at the beginning), and the designation of the output destination terminal may be made selectable as output modes. Adopting this configuration serves the convenience not only of the user who sends the voice but also of the user who receives it.

The "voice output unit" 0205 is configured to output voice according to the received selection. Here, "according to the received selection" means not only that the selected voice is output but also that the output mode, that is, how the voice is output, can be controlled according to the received selection. In terms of the example above, the selected voice may be output at the selected volume, only the selected range may be output, or the voice may be transmitted to and output on the designated terminal. With this configuration, a smart speaker without a display function, for example, can be selected as the output destination and the voice output from that device, so users can enjoy voice output in a variety of situations that differ with their environment and timing.
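The three output modes mentioned above (volume, output range, destination terminal) can be illustrated with a short sketch. `OutputSelection`, `render_output`, the float-sample representation, and the device names are assumptions made for this example; actual audio routing is outside its scope.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OutputSelection:
    volume: float = 1.0                   # linear gain applied to each sample
    max_seconds: Optional[float] = None   # output only the first N seconds
    target_device: str = "this-terminal"  # hypothetical device identifier


def render_output(samples: List[float], sample_rate: int,
                  sel: OutputSelection) -> Tuple[str, List[float]]:
    """Apply the selected output mode; return (destination, samples)."""
    if sel.max_seconds is not None:
        samples = samples[: int(sel.max_seconds * sample_rate)]
    return sel.target_device, [s * sel.volume for s in samples]
```

For instance, selecting a display-less smart speaker as `target_device` routes the trimmed, volume-adjusted samples to that device instead of the viewing terminal.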

After the voice is output, processing may also be performed on the user terminal to enable a call with the user who is the source of that voice. For example, a button for starting a call may be displayed, or a voice input program that accepts a spoken instruction on whether to request a call may be launched. The timing at which this processing is performed can be set as appropriate. Preferably, it is performed at a moment when the user is likely to feel that talking directly by phone would be quicker, for example when voice input and output have alternated several times. Adopting this configuration makes it possible to offer, from among various means of communication, the one best suited to the situation.
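The "alternated several times" trigger could be approximated by counting how often the sender changes in a conversation. The threshold of 4 and both function names below are assumptions; the source leaves the timing to configuration.

```python
from typing import Sequence


def count_alternations(sender_ids: Sequence[str]) -> int:
    """Count how many times the sender changes between consecutive messages."""
    return sum(1 for a, b in zip(sender_ids, sender_ids[1:]) if a != b)


def should_offer_call(sender_ids: Sequence[str], threshold: int = 4) -> bool:
    """Offer a call button once the exchange has gone back and forth enough."""
    return count_alternations(sender_ids) >= threshold
```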

<Specific configuration>
Refer now to FIG. 3, which is a schematic diagram showing an example of a configuration in which the functional components of the message service providing system of this embodiment are collectively realized as one piece of hardware. Each device includes a "CPU" 0301 for executing various kinds of arithmetic processing, a "storage device (storage medium)" 0302, a "main memory" 0303, an "input interface" 0304, an "output interface" 0305, and a "network interface" 0306, and exchanges information with external peripheral devices such as a "touch panel" 0307 and a "display" 0308 via the input and output interfaces. It may also exchange information with external devices such as a plurality of "user terminals" 0309 and a "content server" 0310 via the network interface. The network interface may be wired or wireless, and the communication may be direct or indirect. It is therefore also possible to adopt a so-called cloud computing arrangement in which information is exchanged with a specific external device or with a server managed by a third party associated with the user of that device.

The storage device stores the various programs described below, and the CPU reads these programs into the work area of the main memory, loads them, and executes them. These components are connected to one another by a data communication path such as a "system bus" 0399 and exchange and process information over it (this basic configuration is the same for every other device described below).

(Specific configuration of the user information acquisition unit)
The user information acquisition unit is composed of a computer program and computer hardware. Specifically, the CPU reads the "user information acquisition program" 0320 from the storage device into the main memory and executes it, acquires user information, which is information for identifying a user, and stores it at a predetermined address in the main memory.

(Specific configuration of the voice acquisition unit)
The voice acquisition unit is composed of a computer program and computer hardware. Specifically, the CPU reads the "voice acquisition program" 0330 from the storage device into the main memory and executes it, acquires voice in association with the user information of the user who is the acquisition source, and stores it at a predetermined address in the main memory.

(Specific configuration of the UI display unit)
The UI display unit is composed of a computer program and computer hardware. Specifically, the CPU reads the "UI display program" 0340 from the storage device into the main memory and executes it, and causes a user terminal to display a voice playback UI, which is an interface for enabling the voice acquired by executing the voice acquisition program to be played on that user terminal.

(Specific configuration of the selection reception unit)
The selection reception unit is composed of a computer program and computer hardware. Specifically, the CPU reads the "selection reception program" 0350 from the storage device into the main memory and executes it, and receives a selection of the voice playback UI from the user terminal on which that UI was displayed by executing the UI display program.

(Specific configuration of the voice output unit)
The voice output unit is composed of a computer program and computer hardware. Specifically, the CPU reads the "voice output program" 0360 from the storage device into the main memory and executes it, reads out the voice data acquired by executing the voice acquisition program according to the selection received by executing the selection reception step, and outputs that voice.

<Process flow>
FIG. 4 is a diagram showing an example of the flow of processing in the message service providing system of this embodiment. The flow consists of the following steps. First, in step S0401, user information, which is information for identifying a user, is acquired (user information acquisition step). In step S0402, voice is acquired in association with the user information of the user who is the acquisition source (voice acquisition step), and in step S0403, a voice playback UI, which is an interface for enabling playback of the acquired voice, is displayed (UI display step). Then, when a selection of the voice playback UI is received in step S0404 (selection reception step), the voice is output according to the received selection in step S0405 (voice output step).
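Steps S0401 to S0405 can be traced in a minimal in-memory sketch. The class `MessageService`, its method names, and the text rendering of the playback UI are all hypothetical stand-ins for the functional units; real login, audio capture, and screen rendering are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VoiceMessage:
    message_id: int
    user_id: str   # user information of the acquisition source
    audio: bytes


@dataclass
class MessageService:
    messages: Dict[int, VoiceMessage] = field(default_factory=dict)

    def acquire_user_info(self, login_id: str) -> str:            # S0401
        return login_id

    def acquire_voice(self, user_id: str, audio: bytes) -> VoiceMessage:  # S0402
        msg = VoiceMessage(len(self.messages) + 1, user_id, audio)
        self.messages[msg.message_id] = msg
        return msg

    def display_playback_ui(self) -> List[str]:                   # S0403
        # One selectable "button" per acquired voice message.
        return [f"[play #{m.message_id}] from {m.user_id}"
                for m in self.messages.values()]

    def accept_selection(self, message_id: int) -> VoiceMessage:  # S0404
        return self.messages[message_id]

    def output_voice(self, selected: VoiceMessage) -> bytes:      # S0405
        return selected.audio
```

Calling the five methods in order mirrors the flow of FIG. 4: a user is identified, a voice message is stored against that user, a button is listed, one button is chosen, and the corresponding audio is returned for output.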

<Effect>
Using a message service providing system with the above configuration makes it possible to improve the visibility of posted audio content for many users.

<<Embodiment 2>>
<Overview>
The message service providing system of this embodiment basically shares the technical features of the message service providing system described in Embodiment 1, but has the further feature that, when acquiring voice, it edits the voice according to the playback time length of the voice to be acquired.

＜機能的構成＞
図５は、本実施形態のメッセージサービス提供システムの機能ブロックの一例を示す図である。同図において示されているように、本実施形態の「メッセージサービス提供システム」０５００は、「ユーザ情報取得部」０５０１と、「音声取得部」０５０２と、「ＵＩ表示部」０５０３と、「選択受付部」０５０４と、「音声出力部」０５０５と、を有し、音声取得部はさらに「編集取得手段」０５１２を有する。基本的な構成は、実施形態１の図２を用いて説明したメッセージサービス提供システムと共通するため、以下では相違点である「編集取得手段」０５１２の機能について説明する。 <Functional configuration>
FIG. 5 is a diagram showing an example of functional blocks of the message service providing system of this embodiment. As shown in the figure, the 'message service providing system' 0500 of this embodiment includes a 'user information acquisition unit' 0501, a 'speech acquisition unit' 0502, a 'UI display unit' 0503, and a 'selection unit' 0503. It has a reception section 0504 and a 'speech output section' 0505 , and the speech acquisition section further has an 'edit acquisition means' 0512 . Since the basic configuration is the same as that of the message service providing system described with reference to FIG. 2 of Embodiment 1, the function of the 'edit acquisition means' 0512, which is the difference, will be described below.

The "edit acquisition means" 0512 is configured so that the audio acquisition unit edits the audio being acquired according to its playback duration. Specifically, if the playback duration of the audio to be acquired is longer than a predetermined time, an editing process that cuts out part of the audio is conceivable. More specifically, the editing process may cut out only the opening portion (for example, 10 seconds or less), after which the edited audio is acquired.

Alternatively, when the playback duration of the audio to be acquired is longer than the predetermined time, an editing process that speeds up the playback pace so that the audio fits the predetermined duration is also conceivable. With this configuration, the message can be heard within the predetermined time, which is convenient even for busy users.

After editing, the pre-edit audio and the edited audio may be held in association with each other. By acquiring audio processed in this way and making it available for output, even when the audio acquisition unit acquires a relatively long recording, the audio can be output at a length that is easier to take in intuitively, so that its content can be grasped.
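The two editing options above (cutting out the opening portion, or speeding playback up to fit the limit) and the idea of keeping both versions can be sketched as follows. The sample-list representation, the function names, and the 10-second default are assumptions for illustration; the speed-up is shown only as the rate multiplier that would be needed, not as an actual time-stretch.

```python
from dataclasses import dataclass


@dataclass
class EditedVoice:
    original: list  # samples before editing
    edited: list    # samples after editing, kept alongside the original


def trim_head(samples, sample_rate, limit_sec):
    """Keep only the opening limit_sec seconds of the clip."""
    max_len = int(sample_rate * limit_sec)
    return samples[:max_len]


def speed_factor(duration_sec, limit_sec):
    """Playback-rate multiplier needed to fit the clip into limit_sec.

    Clips that already fit are left at normal speed (factor 1.0).
    """
    return max(1.0, duration_sec / limit_sec)


def acquire_with_edit(samples, sample_rate, limit_sec=10.0):
    """Acquire a clip and keep both the unedited and the trimmed version."""
    edited = trim_head(samples, sample_rate, limit_sec)
    return EditedVoice(original=samples, edited=edited)


# A 15-second clip at a toy sample rate of 4 samples per second.
clip = list(range(60))
record = acquire_with_edit(clip, sample_rate=4)
factor = speed_factor(duration_sec=15.0, limit_sec=10.0)
```

Keeping `original` next to `edited` is what later allows the unedited audio to be offered on request.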

The edit acquisition means may edit the audio at the time the audio acquisition unit acquires it, or after the selection reception unit accepts a selection and before the audio output unit outputs the audio.

When audio has been edited, the audio output unit is expected to output the edited audio, but a separate configuration that allows the pre-edit audio to be output may also be provided. In this case, for example, the edited audio is output once, and if information is then received indicating that the user wishes to hear the pre-edit version of the audio already output, the pre-edit audio is output. This configuration meets the needs of a user who wants to listen to the edited audio first to grasp the outline and then listen to the pre-edit audio later when time allows.
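The choice of edit timing and the option of replaying the unedited audio can be sketched together. The class name `StoredVoice`, the `edit_on_acquire` flag, and the use of a simple sample-count limit are assumptions for this sketch.

```python
class StoredVoice:
    """Holds a clip and produces its edited form eagerly or lazily."""

    def __init__(self, samples, limit, edit_on_acquire=True):
        self.original = samples
        self.limit = limit
        # Eager: edit at acquisition time. Lazy: defer until first playback.
        self.edited = samples[:limit] if edit_on_acquire else None

    def play(self, want_original=False):
        if want_original:
            # The listener asked for the unedited version.
            return self.original
        if self.edited is None:
            # Lazy edit, performed after selection and just before output.
            self.edited = self.original[:self.limit]
        return self.edited


voice = StoredVoice(list(range(20)), limit=5, edit_on_acquire=False)
first = voice.play()                      # edited version, produced lazily
full = voice.play(want_original=True)     # unedited version on request
```

Eager editing moves the cost to posting time, while lazy editing avoids work for clips that are never played; the source leaves this choice open.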

Up to this point, the edit acquisition means has been described as editing the audio based on a specific function, but it may of course be configured to edit the audio according to its playback duration in response to an instruction selected by the user. With this configuration, the user can extract the portion of the input audio that suits them best and make it available for output.

<Specific configuration>
The hardware configuration of each device constituting the message service providing system of this embodiment is basically the same as the hardware configuration of the message service providing system of Embodiment 1 described with reference to FIG. 3. The following therefore describes the specific processing of the "edit acquisition means", which has not been described so far.

(Specific configuration of edit acquisition means)
The edit acquisition means is specifically composed of a computer program and computer hardware. When the audio acquisition program is executed, the CPU reads the "edit acquisition subprogram" from the storage device into the main memory and executes it, edits the audio being acquired according to its playback duration, and stores the result at a predetermined address in the main memory.

<Process flow>
FIG. 6 is a diagram showing an example of the flow of processing in the message service providing system of this embodiment. The flow of processing in the figure consists of the following steps. First, in step S0601, user information, which is information that identifies a user, is acquired (user information acquisition step). Then, in step S0602, audio is acquired in association with the user information of the user who is the acquisition source and is edited according to its playback duration (audio acquisition step), and in step S0603, an audio playback UI, which is an interface for enabling playback of the acquired audio, is displayed (UI display step). After that, when selection of the audio playback UI is accepted in step S0604 (selection acceptance step), audio is output in step S0605 according to the accepted selection (audio output step).

<Effect>
By using the message service providing system of this embodiment, audio content can be grasped more intuitively than when using the message service providing system of Embodiment 1.

<<Embodiment 3>>
<Overview>
The message service providing system of this embodiment basically shares the technical features of the message service providing systems described in Embodiments 1 and 2, but is characterized in that, at audio output, a plurality of posted voice messages are output consecutively in order.

<Functional configuration>
FIG. 7 is a diagram showing an example of the functional blocks when the message service providing system of this embodiment is realized on a single computer (device). As shown in the figure, the "message service providing system" 0700 of this embodiment has a "user information acquisition unit" 0701, an "audio acquisition unit" 0702, a "UI display unit" 0703, a "selection reception unit" 0704, and an "audio output unit" 0705, and the audio output unit further has a "time-series output means" 0715. Since the basic configuration is the same as that of the message service providing system described with reference to FIG. 2 of Embodiment 1, only the function of the "time-series output means" 0715, which is the point of difference, is described below.

The "time-series output means" 0715 is configured so that the audio output unit outputs a plurality of posted voice messages consecutively in order. "In order" here may be determined with reference to the voice messages posted by a specific poster, or with reference to the voice messages posted by a plurality of posters. Likewise, "consecutively in order" may mean outputting messages in posting order regardless of poster, that is, oldest posting date and time first, or outputting only the voice messages that a single poster has posted on multiple occasions, oldest first. With this configuration, audio is output continuously without the user having to operate the playback control for each message, providing an environment in which the user feels as if they are communicating with the poster.

Whether a voice message has already been output once may also be used to control whether it is included in the output order. That is, if some of the voice messages have already been output once, audio may be output consecutively in order for only the remaining voice messages. This avoids the annoyance of hearing the same audio over and over.

Conversely, if some of the voice messages have already been output once, a configuration that outputs consecutively in order only those already-output messages is also possible. This makes it possible to listen again to important voice messages and confirm their intent.
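The ordering and filtering variants described for the time-series output means can be sketched as one queue-building function. The dictionary fields (`poster`, `posted_at`, `played`) and the `played` mode names are assumptions for this example.

```python
from datetime import datetime


def playback_queue(messages, poster=None, played="skip"):
    """Return messages oldest-first, optionally filtered.

    poster: restrict to one poster, or None for all posters.
    played: "skip" drops already-played messages, "only" keeps just those,
            "all" ignores the played flag.
    """
    selected = [m for m in messages if poster is None or m["poster"] == poster]
    if played == "skip":
        selected = [m for m in selected if not m["played"]]
    elif played == "only":
        selected = [m for m in selected if m["played"]]
    # Oldest posting date and time first.
    return sorted(selected, key=lambda m: m["posted_at"])


messages = [
    {"id": 1, "poster": "a", "posted_at": datetime(2021, 6, 2, 9, 0), "played": True},
    {"id": 2, "poster": "b", "posted_at": datetime(2021, 6, 2, 8, 0), "played": False},
    {"id": 3, "poster": "a", "posted_at": datetime(2021, 6, 2, 10, 0), "played": False},
]

unheard = [m["id"] for m in playback_queue(messages)]
from_a = [m["id"] for m in playback_queue(messages, poster="a", played="all")]
replay = [m["id"] for m in playback_queue(messages, played="only")]
```

A player would then iterate over the returned list and output each message without further user operation.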

<Specific configuration>
The hardware configuration of each device constituting the message service providing system of this embodiment is basically the same as the hardware configuration of the message service providing system of Embodiment 1 described with reference to FIG. 3. The following therefore describes the specific processing of the "time-series output means", which has not been described so far.

(Specific configuration of time-series output means)
The time-series output means is specifically composed of a computer program and computer hardware. When the CPU reads the audio output program from the storage device, it reads the "time-series output subprogram" into the main memory and executes it, outputting the plurality of posted voice messages consecutively in order.

<Process flow>
FIG. 8 is a diagram showing an example of the flow of processing in the message service providing system of this embodiment. The flow of processing in the figure consists of the following steps. First, in step S0801, user information, which is information that identifies a user, is acquired (user information acquisition step). Then, in step S0802, audio is acquired in association with the user information of the user who is the acquisition source (audio acquisition step), and in step S0803, an audio playback UI, which is an interface for enabling playback of the acquired audio, is displayed (UI display step). After that, when selection of the audio playback UI is accepted in step S0804 (selection acceptance step), audio is output in step S0805 according to the accepted selection (audio output step).

<Effect>
By using the message service providing system of this embodiment, compared with using the message service providing systems of Embodiments 1 and 2, the user can avoid having to listen to multiple voice messages piecemeal and can listen to just the voice messages they need, stress-free, without operating the controls for each one. Audio suitable for communication can also be output without being affected by the environment in which the audio was uttered.

<<Embodiment 4>>
<Overview>
The message service providing system of this embodiment basically shares the technical features of the message service providing system described in Embodiment 1, but is further characterized in that, when selection of the audio playback UI is accepted, acoustic processing is applied to the selected audio according to the user information of the user who made the selection, and the acoustically processed audio is output at audio output.

<Functional configuration>
FIG. 9 is a diagram showing an example of the functional blocks of the message service providing system of this embodiment. As shown in the figure, the "message service providing system" 0900 of this embodiment has a "user information acquisition unit" 0901, an "audio acquisition unit" 0902, a "UI display unit" 0903, a "selection reception unit" 0904, an "audio output unit" 0905, and a "preprocessing unit" 0906, and the audio output unit further has a "processed audio output means" 0915. Since the basic configuration is the same as that of the message service providing system described with reference to FIG. 2 of Embodiment 1, only the functions of the "preprocessing unit" 0906 and the "processed audio output means" 0915, which are the points of difference, are described below.

The "preprocessing unit" 0906 is configured so that, when the selection reception unit accepts selection of the audio playback UI, it performs acoustic processing on the selected audio according to the user information of the user who made the selection. "According to the user information" here means, for example, performing acoustic processing using the user information. In this case, the user information may include a voice dictionary for the voice timbre of that user, of the user who posted the voice message, or of another specific person. The "other specific person" may be, for example, a celebrity, a character's voice actor, or an acquaintance, and can be chosen by the user as appropriate. Acoustic processing using these voice dictionaries lets the user who made the selection receive audio output in the voice timbre they want.

The voice dictionary may be edited or modified as appropriate, and entries may be added or deleted; the technical approach on which it is built is also a matter of design. Voice dictionary data may be obtained from an external server, or the system may of course hold the voice dictionary itself.

In some cases, the frequency, volume, and similar properties of the voice a given user utters are held as user information, and acoustic processing is performed using that information. Specifically, if the user information indicates that a user normally speaks quietly, that is, at low volume, then when audio posted by that user is to be output, acoustic processing that raises the volume of the output audio by a certain amount may be performed in advance. With this configuration, the listening user can hear the audio comfortably without making adjustments themselves.
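The volume example above can be sketched as a simple gain stage. The `quiet_speaker` flag, the gain value of 2.0, and the representation of audio as floating-point samples in the range -1.0 to 1.0 are all assumptions for this sketch; voice-timbre conversion with a voice dictionary is not attempted here.

```python
def apply_gain(samples, gain):
    """Scale samples in [-1.0, 1.0] by gain, clipping to the valid range."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def preprocess(samples, poster_info, quiet_gain=2.0):
    """Raise the volume in advance when the poster is known to speak quietly.

    poster_info["quiet_speaker"] is a hypothetical flag held as user information.
    """
    if poster_info.get("quiet_speaker"):
        return apply_gain(samples, quiet_gain)
    return samples


boosted = preprocess([0.1, -0.2, 0.6], {"quiet_speaker": True})
unchanged = preprocess([0.1, -0.2, 0.6], {})
```

Clipping keeps the boosted signal inside the valid sample range, at the cost of distortion on loud peaks; a production system would more likely normalize or compress instead.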

The "processed audio output means" 0915 is configured so that the audio output unit outputs the audio that has been acoustically processed by the preprocessing unit. The specific nature of "audio acoustically processed by the preprocessing unit" is as already described, but the processed audio output means is preferably configured to output the audio together with information indicating that it has been acoustically processed. Specifically, along with the audio output, a message such as "Acoustically processed" or "Playing in XX's voice" is shown on a display or other screen. A spoken notice such as "This audio has been acoustically processed" may also be added at the beginning or end of the audio. These configurations help prevent this function from being misused for improper purposes such as impersonating the speaker.
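Attaching a "processed" notice to the output can be sketched as follows. The function name, the returned dictionary layout, and the exact notice strings are assumptions for illustration.

```python
def output_processed(samples, voice_name=None):
    """Bundle processed audio with a notice that it was acoustically processed.

    voice_name is the (hypothetical) name of the voice used for conversion;
    when omitted, a generic notice is attached instead.
    """
    if voice_name:
        notice = f"Playing in {voice_name}'s voice"
    else:
        notice = "Acoustically processed"
    return {"audio": samples, "processed": True, "notice": notice}


named = output_processed([0.0, 0.1], voice_name="XX")
generic = output_processed([0.0, 0.1])
```

The player would show `notice` on screen while the audio plays, so that the listener is never led to believe the converted voice is the poster's own.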

<Specific configuration>
The hardware configuration of each device constituting the message service providing system of this embodiment is basically the same as the hardware configuration of the message service providing system of Embodiment 1 described with reference to FIG. 3. The following therefore describes the specific processing of the "preprocessing unit" and the "processed audio output means", which have not been described so far.

(Specific configuration of preprocessing unit)
The preprocessing unit is specifically composed of a computer program and computer hardware. When the audio output program is executed, the CPU reads the "preprocessing program" from the storage device into the main memory and executes it.

(Specific configuration of processed audio output means)
The processed audio output means is specifically composed of a computer program and computer hardware. When the audio output program is executed, the CPU reads the "processed audio output means subprogram" from the storage device into the main memory and executes it.

<Process flow>
FIG. 10 is a diagram showing an example of the flow of processing in the message service providing system of this embodiment. The flow of processing in the figure consists of the following steps. First, in step S1001, user information, which is information that identifies a user, is acquired (user information acquisition step). Then, in step S1002, audio is acquired in association with the user information of the user who is the acquisition source (audio acquisition step), and in step S1003, an audio playback UI, which is an interface for enabling playback of the acquired audio, is displayed (UI display step). After that, when selection of the audio playback UI is accepted in step S1004 (selection acceptance step), audio is output in step S1005 according to the accepted selection (audio output step).

<Effect>
By using the message service providing system of this embodiment, the user can enjoy the voice timbre they want to a greater degree than when using the message service providing system of Embodiment 1.

<<Embodiment 5>>
<Overview>
The message service providing system of this embodiment basically shares the technical features of the message service providing system described in Embodiment 1, but is further characterized in that the acquired audio is subjected to speech recognition and converted in part or in whole to text, and that when the audio playback UI is displayed, it is displayed together with the resulting text data.

<Functional configuration>
FIG. 11 is a diagram showing an example of the functional blocks of the message service providing system of this embodiment. As shown in the figure, the "message service providing system" 1100 of this embodiment has a "user information acquisition unit" 0201, an "audio acquisition unit" 1102, a "UI display unit" 1103, a "selection reception unit" 1104, an "audio output unit" 1105, and a "transcription processing unit" 1106, and the UI display unit further has a "text display means" 1113. Since the basic configuration is the same as that of the message service providing system described with reference to FIG. 2 of Embodiment 1, only the functions of the "transcription processing unit" 1106 and the "text display means" 1113, which are the points of difference, are described below.

The "transcription processing unit" 1106 is configured to perform speech recognition on the acquired audio and convert part or all of it to text. This processing requires a speech recognition engine of some kind, which is used to recognize the acquired audio. Processing such as noise cancelling of unwanted sound is performed at this stage, and how such processing is carried out may be set as appropriate.

The "text display means" 1113 is configured so that the UI display unit displays the audio playback UI together with the text data produced by the transcription processing unit. Displaying the audio playback UI together with text data lets a user who has received a message grasp the content of the posted voice message in advance, even in an environment where playing audio is difficult.

The text data may be displayed after decoration processing related to its content has been applied. Specifically, for text with a positive image such as "Happy birthday" or "Goal achieved!", the text may be output in a bright color or a playful font, or over a bright background image. To analyze the text content for decoration, processing using a dictionary database that associates text with predetermined semantic content is conceivable. By analyzing the text content with this database and displaying the text after suitable decoration, the system can give the user receiving the audio output rich information visually as well as aurally.
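The dictionary-based decoration step can be sketched as a keyword lookup. The dictionary contents, the label names, and the style attributes are all hypothetical; the source only says that a dictionary database linking text to semantic content may be used. The speech recognition step itself is assumed to have already produced the text.

```python
# Hypothetical dictionary database: keyword -> semantic label.
SENTIMENT_DICT = {
    "happy birthday": "positive",
    "goal achieved": "positive",
}

# Hypothetical display styles per label.
STYLES = {
    "positive": {"color": "orange", "font": "rounded", "background": "bright"},
    "neutral": {"color": "black", "font": "default", "background": "plain"},
}


def decorate(text):
    """Pick a display style for transcribed text from the dictionary."""
    lowered = text.lower()
    for keyword, label in SENTIMENT_DICT.items():
        if keyword in lowered:
            return {"text": text, **STYLES[label]}
    # No dictionary entry matched: fall back to the plain style.
    return {"text": text, **STYLES["neutral"]}


festive = decorate("Happy birthday, Ken!")
plain = decorate("See you at 3")
```

The returned dictionary would be handed to the UI layer, which renders the text next to the audio playback UI.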

<Specific configuration>
The hardware configuration of each device constituting the message service providing system of this embodiment is basically the same as the hardware configuration of the message service providing system of Embodiment 1 described with reference to FIG. 3. The following therefore describes the specific processing of the "transcription processing unit" and the "text display means", which have not been described so far.

(Specific configuration of transcription processing unit)
The transcription processing unit is specifically composed of a computer program and computer hardware, and the CPU reads the "transcription processing program" from the storage device into the main memory and executes it.

(Specific configuration of text display means)
The text display means is specifically composed of a computer program and computer hardware. When the UI display program is executed, the CPU reads the "text display subprogram" from the storage device into the main memory and executes it.

<Process flow>
FIG. 12 is a diagram showing an example of the flow of processing in the message service providing system of this embodiment. The flow of processing in the figure consists of the following steps. First, in step S1201, user information, which is information that identifies a user, is acquired (user information acquisition step). Then, in step S1202, audio is acquired in association with the user information of the user who is the acquisition source (audio acquisition step), and in step S1203, an audio playback UI, which is an interface for enabling playback of the acquired audio, is displayed (UI display step). After that, when selection of the audio playback UI is accepted in step S1204 (selection acceptance step), audio is output in step S1205 according to the accepted selection (audio output step).

<Effect>
By using the message service providing system of this embodiment, communication is possible even in environments where audio output is difficult, unlike when using the message service providing system of Embodiment 1.

0200 ... message service providing system, 0201 ... user information acquisition unit, 0202 ... audio acquisition unit, 0203 ... UI display unit, 0204 ... selection reception unit, 0205 ... audio output unit

Claims

1. A method for providing a message service in which a plurality of voice messages composed within a predetermined time can be posted via a network, the method causing a computer to execute:
a user information acquisition step of acquiring user information, which is information that identifies a user;
an audio acquisition step of acquiring audio in association with the user information of the user who is the acquisition source;
a UI display step of displaying an audio playback UI, which is an interface for enabling playback of the acquired audio;
a selection acceptance step of accepting selection of the audio playback UI; and
an audio output step of outputting audio according to the accepted selection.

2. The message service providing method according to claim 1, wherein the audio acquisition step further comprises an edit acquisition substep of editing the audio being acquired according to its playback duration.

3. The message service providing method according to claim 1 or 2, wherein the audio output step further comprises a time-series output substep of outputting the plurality of posted voice messages consecutively in order.

4. The message service providing method according to any one of claims 1 to 3, further comprising a preprocessing step of, when a selection is accepted in the selection acceptance step, performing acoustic processing on the selected audio according to the user information of the user who made the selection,
wherein the audio output step comprises a processed audio output substep of outputting the audio acoustically processed in the preprocessing step.

5. The message service providing method according to any one of claims 1 to 4, further comprising a transcription processing step of performing speech recognition on the acquired audio and converting part or all of it to text,
wherein the UI display step further comprises a text display substep of displaying the audio playback UI together with the text data produced in the transcription processing step.

6. A message service providing program for a message service in which a plurality of voice messages composed within a predetermined time can be posted via a network, the program causing a computer to execute:
a user information acquisition step of acquiring user information, which is information that identifies a user;
an audio acquisition step of acquiring audio in association with the user information of the user who is the acquisition source;
a UI display step of displaying an audio playback UI, which is an interface for enabling playback of the acquired audio;
a selection acceptance step of accepting selection of the audio playback UI; and
an audio output step of outputting audio according to the accepted selection.

7. A message service system in which a plurality of voice messages composed within a predetermined time can be posted and used via a network, the system comprising:
a user information acquisition unit that acquires user information, which is information that identifies a user;
an audio acquisition unit that acquires audio in association with the user information of the user who is the acquisition source;
a UI display unit that displays an audio playback UI, which is an interface for enabling playback of the acquired audio;
a selection reception unit that accepts selection of the audio playback UI; and
an audio output unit that outputs audio according to the accepted selection.