JP2021107873A5

Info

Publication number: JP2021107873A5
Application number: JP2019239264A
Authority: JP
Filing date: 2019-12-27
Publication date: 2022-12-23

Description

The present disclosure also provides a voice characteristic changing method executed by a voice characteristic changing system composed of a server and a receiver that receives video and an operator's uttered voice from an operator terminal and outputs them. The method includes: a step in which the receiver, which has a camera that captures a customer viewing the video and the uttered voice, acquires a captured image of the customer captured by the camera; a step in which the server derives, based on the captured image of the customer sent from the receiver, emotion data indicating the customer's emotion toward the video and the uttered voice; a step in which the server generates, based on the derivation result of the emotion data of the customer, a processing instruction for changing characteristics of the operator's uttered voice and sends the processing instruction to the receiver; and a step in which the receiver changes the characteristics of the operator's uttered voice based on the processing instruction sent from the server and outputs the changed voice.

The present disclosure further provides a voice characteristic changing device communicably connected to a receiver that receives video and an operator's uttered voice from an operator terminal and outputs them. The device acquires, from the receiver connected to a camera that captures a customer viewing the video and the uttered voice, a captured image of the customer captured by the camera; derives, based on the captured image of the customer sent from the receiver, emotion data indicating the customer's emotion toward the video and the uttered voice; and generates, based on the derivation result of the emotion data of the customer, a processing instruction for changing characteristics of the operator's uttered voice and sends the processing instruction to the receiver.
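As an illustration only, the server-side flow described above (derive emotion data, then generate a processing instruction) might be sketched as follows. The names `Emotion`, `ProcessingInstruction`, `derive_emotion`, and `generate_instruction` are hypothetical, as is the assumption that some facial-expression classifier has already produced per-label confidence scores; the source names neither a classifier nor a message format. The two rules shown (anger lowers the pitch of the ending part, distress raises the volume) follow the dependent claims.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Emotion(Enum):
    # Hypothetical label set; the source mentions anger and distress.
    NEUTRAL = "neutral"
    ANGER = "anger"
    DISTRESS = "distress"


@dataclass(frozen=True)
class ProcessingInstruction:
    """Hypothetical message sent from the server to the receiver."""
    target: str     # part of the utterance to modify: "ending" or "whole"
    parameter: str  # characteristic to modify: "pitch" or "volume"
    direction: str  # "lower" or "raise"


def derive_emotion(scores: Dict[str, float]) -> Emotion:
    """Stand-in for image-based emotion derivation.

    `scores` is assumed to map an emotion label to a confidence value
    produced by an unspecified classifier run on the captured image.
    """
    label = max(scores, key=scores.get)
    return Emotion(label)


def generate_instruction(emotion: Emotion) -> Optional[ProcessingInstruction]:
    """Map derived emotion data to a processing instruction."""
    if emotion is Emotion.ANGER:
        # Anger: lower the pitch of the ending part of the operator's voice.
        return ProcessingInstruction("ending", "pitch", "lower")
    if emotion is Emotion.DISTRESS:
        # Distress: increase the volume of the operator's voice.
        return ProcessingInstruction("whole", "volume", "raise")
    return None  # no change requested for other emotions


if __name__ == "__main__":
    emotion = derive_emotion({"neutral": 0.1, "anger": 0.7, "distress": 0.2})
    print(emotion, generate_instruction(emotion))
```

Returning `None` for emotions without a rule is a design choice of this sketch, not something the source specifies.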

Claims

A voice characteristic changing system in which a server is communicably connected to a receiver that receives video and an operator's uttered voice from an operator terminal and outputs them, wherein
the receiver is connected to a camera that captures an image of a customer viewing the video and the uttered voice, acquires the captured image of the customer from the camera, and sends the captured image to the server;
the server derives emotion data indicating the customer's emotion toward the video and the uttered voice based on the captured image of the customer sent from the receiver, and,
based on the derivation result of the emotion data of the customer, generates a processing instruction for changing characteristics of the operator's uttered voice and sends the processing instruction to the receiver; and
the receiver changes the characteristics of the operator's uttered voice based on the processing instruction sent from the server and outputs the changed voice.

The voice characteristic changing system according to claim 1, wherein
the receiver is connected to a microphone that picks up the customer's uttered voice, acquires the customer's uttered voice picked up by the microphone, and sends it to the server; and
the server derives the emotion data of the customer based on the captured image of the customer or the customer's uttered voice sent from the receiver.

The voice characteristic changing system according to claim 1, wherein
the server, upon determining that the emotion data of the customer indicates anger, generates the processing instruction to lower the pitch of the ending part of the operator's uttered voice.

The voice characteristic changing system according to claim 1, wherein
the server, upon determining that the emotion data of the customer indicates anger, generates advice information prompting the operator to stop speaking and sends the advice information to the operator terminal; and
the operator terminal receives and displays the advice information sent from the server.

The voice characteristic changing system according to claim 1, wherein
the server, upon determining that the emotion data of the customer indicates distress, generates the processing instruction to increase the volume of the operator's uttered voice.

The voice characteristic changing system according to claim 2, wherein
the server derives the emotion data of the customer based on both the captured image of the customer and the customer's uttered voice sent from the receiver.

The voice characteristic changing system according to any one of claims 1 to 6, wherein
the receiver is a face-to-face information providing device that supports dialogue with the operator.

The voice characteristic changing system according to any one of claims 1 to 5, wherein
the receiver is a television receiver placed in a home.

The voice characteristic changing system according to claim 8, wherein
at least one receiver is arranged in each of a plurality of homes; and
the server generates, for each receiver in the homes, a different processing instruction for changing characteristics of the operator's uttered voice and sends it to the corresponding receiver.

The voice characteristic changing system according to claim 8, wherein,
when a plurality of customers are viewing the video and the uttered voice output from the receiver, the receiver generates a processing instruction for changing characteristics of the operator's uttered voice based on the derivation result of predetermined emotion data.

A voice characteristic changing method executed by a voice characteristic changing system composed of a server and a receiver that receives video and an operator's uttered voice from an operator terminal and outputs them, the method comprising:
a step in which the receiver, which has a camera that captures an image of a customer viewing the video and the uttered voice, acquires the captured image of the customer captured by the camera;
a step in which the server derives emotion data indicating the customer's emotion toward the video and the uttered voice based on the captured image of the customer sent from the receiver;
a step in which the server generates, based on the derivation result of the emotion data of the customer, a processing instruction for changing characteristics of the operator's uttered voice and sends the processing instruction to the receiver; and
a step in which the receiver changes the characteristics of the operator's uttered voice based on the processing instruction sent from the server and outputs the changed voice.

A voice characteristic changing device communicably connected to a receiver that receives video and an operator's uttered voice from an operator terminal and outputs them, wherein the device
acquires, from the receiver connected to a camera that captures an image of a customer viewing the video and the uttered voice, the captured image of the customer captured by the camera;
derives emotion data indicating the customer's emotion toward the video and the uttered voice based on the captured image of the customer sent from the receiver; and
generates, based on the derivation result of the emotion data of the customer, a processing instruction for changing characteristics of the operator's uttered voice and sends the processing instruction to the receiver.
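
For illustration, the receiver-side step recited in the claims (changing a characteristic of the operator's voice according to a processing instruction) could be sketched for the volume case as below. The function name `apply_volume_gain`, the 16-bit PCM sample format, and the gain value of 1.5 are assumptions of this sketch; the source does not state how much the volume is raised or how audio is represented. Lowering the pitch of an utterance ending is not covered here.

```python
from typing import List

# Valid range for signed 16-bit PCM samples (assumed audio format).
PCM_MIN, PCM_MAX = -32768, 32767


def apply_volume_gain(samples: List[int], gain: float) -> List[int]:
    """Scale each sample by `gain`, clipping to the 16-bit range."""
    return [max(PCM_MIN, min(PCM_MAX, int(round(s * gain)))) for s in samples]


if __name__ == "__main__":
    # A hypothetical "raise volume" instruction applied with gain 1.5.
    print(apply_volume_gain([1000, -2000, 30000], 1.5))
```

Clipping keeps loud samples within the valid range instead of letting them wrap around, at the cost of some distortion for large gains.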