WO2018179227A1 - Answering machine text providing system, answering machine text providing method, and program - Google Patents

Answering machine text providing system, answering machine text providing method, and program

Info

Publication number
WO2018179227A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
answering machine
speaker
voice
providing system
Prior art date
Application number
PCT/JP2017/013260
Other languages
English (en)
Japanese (ja)
Inventor
俊二 菅谷
Original Assignee
株式会社オプティム
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-03-30
Filing date
2017-03-30
Publication date
2018-10-04
Application filed by 株式会社オプティム
Priority to PCT/JP2017/013260
Publication of WO2018179227A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M1/65 Recording arrangements for recording a message from the calling party
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/537 Arrangements for indicating the presence of a recorded message, whereby the presence information might include a preview or summary of the message

Definitions

  • The present invention relates to an answering machine text providing system, an answering machine text providing method, and a program that convert the contents of an answering machine message into text and provide it.
  • In Patent Document 1, a voice message left on an answering machine is voice-recognized, converted into text, and sent to the other party by e-mail.
  • However, Patent Document 1 has the problem that the person who left the voice message cannot be identified from the voice.
  • An object of the present invention is to perform text conversion and speaker identification from the result of analyzing voice recorded on an answering machine, and to notify the callee of the text in association with the speaker.
  • The present invention provides the following solutions.
  • The invention according to the first feature provides an answering machine text providing system comprising: analysis means for analyzing voice recorded on an answering machine; conversion means for converting the voice into text based on the analyzed result; specifying means for specifying the speaker of the voice based on the analyzed result; and providing means for providing the converted text and the specified speaker in association with each other.
  • The invention according to the first feature also provides an answering machine text providing method comprising: an analysis step of analyzing voice recorded on an answering machine; a conversion step of converting the voice into text based on the analyzed result; a specifying step of specifying the speaker of the voice based on the analyzed result; and a providing step of providing the converted text and the specified speaker in association with each other.
  • The invention according to the first feature further provides a program that causes a computer to execute: a step of analyzing voice recorded on an answering machine; a step of converting the voice into text based on the analyzed result; a step of specifying the speaker of the voice based on the analyzed result; and a step of providing the converted text and the specified speaker in association with each other.
  • FIG. 1 is a schematic diagram of an answering machine text providing system.
  • The answering machine text providing system of the present invention is a system that converts the contents of an answering machine message into text and provides it.
  • FIG. 1 is a diagram for explaining the outline of an answering machine text providing system which is a preferred embodiment of the present invention.
  • The answering machine text providing system includes analysis means, conversion means, specifying means, and providing means, each realized by a control unit reading a predetermined program. Changing means may additionally be provided. These may be implemented as an application installed on a smart device, as a cloud service, or in another form. Each means described above may be realized by a single computer or by two or more computers (for example, a server and a terminal).
  • The analysis means analyzes the voice recorded on the answering machine.
  • The audio waveform may be analyzed, and speech recognition may also be performed. Further, machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the analysis.
  • The conversion means converts the voice into text based on the analyzed result.
  • Machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the conversion.
  • The specifying means specifies the speaker of the voice based on the analyzed result.
  • The speaker may be identified by collating the voice against a voice database.
  • The voice database in which voices are registered may be internal or external to the system.
  • Speaker recognition may also be performed to identify the speaker.
  • Further, machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the identification.
  • The providing means provides the converted text and the specified speaker in association with each other.
  • The converted text may be provided in association with an image of the specified speaker. Providing an image of the speaker helps the user understand who the speaker is. Alternatively, the converted text may be provided in association with an image, other than one of the speaker, that has been associated with the specified speaker in advance. This lets the user tell who the speaker is even when the user does not want people nearby to know. For example, if a bear stamp is associated with Mr. A and a lover mark with Mr. B in advance, then when the specified speaker is Mr. A the converted text is provided in association with the bear stamp, and when the specified speaker is Mr. B it is provided in association with the lover mark. Further, the converted text may be provided in association with the name under which the specified speaker is registered in the telephone directory. Synchronizing the provided name with the phone book helps the user understand who the speaker is.
  • The changing means changes the attention level of the converted text based on the analyzed result.
  • The attention level of the converted text may be changed based on the strength of the voice, the intonation of the voice, the content of the call, and the like. Changing the attention level helps the user understand the text.
  • The changing means may also change the attention level of the converted text according to the specified speaker. For example, the text color is changed to red when the speaker is Mr. C, and the text size is increased when the speaker is Mr. D. Changing the attention level helps the user understand both the text and who is speaking.
[Description of operation]
  • The answering machine text providing method includes an analysis step, a conversion step, a specifying step, and a providing step. A changing step may additionally be provided.
  • In the analysis step, the analysis means described above analyzes the voice recorded on the answering machine.
  • The audio waveform may be analyzed, and speech recognition may also be performed. Further, machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the analysis.
  • In the conversion step, the conversion means described above converts the voice into text based on the analyzed result.
  • Machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the conversion.
  • In the specifying step, the specifying means described above specifies the speaker of the voice based on the analyzed result.
  • The speaker may be identified by collating the voice against a voice database.
  • The voice database in which voices are registered may be internal or external to the system.
  • Speaker recognition may also be performed to identify the speaker.
  • Further, machine learning may be performed with AI (artificial intelligence) using past speech as teacher data. Performing machine learning can improve the accuracy of the identification.
  • In the providing step, the providing means described above provides the converted text in association with the specified speaker.
  • The converted text may be provided in association with an image of the specified speaker. Providing an image of the speaker helps the user understand who the speaker is. Alternatively, the converted text may be provided in association with an image, other than one of the speaker, that has been associated with the specified speaker in advance. This lets the user tell who the speaker is even when the user does not want people nearby to know. For example, if a bear stamp is associated with Mr. A and a lover mark with Mr. B in advance, then when the specified speaker is Mr. A the converted text is provided in association with the bear stamp, and when the specified speaker is Mr. B it is provided in association with the lover mark. Further, the converted text may be provided in association with the name under which the specified speaker is registered in the telephone directory. Synchronizing the provided name with the phone book helps the user understand who the speaker is.
  • In the changing step, the changing means described above changes the attention level of the converted text based on the analyzed result.
  • The attention level of the converted text may be changed based on the strength of the voice, the intonation of the voice, the content of the call, and the like. Changing the attention level helps the user understand the text.
  • The attention level of the converted text may also be changed according to the specified speaker. For example, the text color is changed to red when the speaker is Mr. C, and the text size is increased when the speaker is Mr. D. Changing the attention level helps the user understand both the text and who is speaking.
  • The means and functions described above are realized by a computer (including a CPU, an information processing apparatus, and various terminals) reading and executing a predetermined program.
  • The program may be, for example, an application installed on a computer, or may be provided as SaaS (software as a service) from a computer via a network. It may also be provided in a form recorded on a computer-readable recording medium such as a flexible disk, a CD (CD-ROM, etc.), or a DVD (DVD-ROM, DVD-RAM, etc.).
  • In this case, the computer reads the program from the recording medium, transfers it to an internal or external storage device, stores it, and executes it.
  • The program may also be recorded in advance in a storage device (recording medium) such as a magnetic disk, an optical disk, or a magneto-optical disk, and provided from that storage device to a computer via a communication line.
  • As a specific method for the machine learning described above, a nearest neighbor method, a naive Bayes method, a decision tree, a support vector machine, or the like may be used.
  • Deep learning, in which the feature values used for learning are generated by a neural network, may also be used.
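
The specifying, providing, and changing means described above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the application: it assumes the analysis and conversion steps have already produced a numeric feature vector, a loudness value, and the recognized text, and it uses a simple nearest neighbor comparison against a small in-memory voice database. The names `Contact`, `specify_speaker`, `provide`, the distance threshold, and the loudness threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import Optional, Sequence


@dataclass
class Contact:
    """One entry of a hypothetical voice database linked to the phone book."""
    name: str                  # name registered in the phone book
    mark: str                  # stamp or mark associated with the speaker in advance
    features: Sequence[float]  # enrolled voice feature vector (assumed to exist)
    color: str = "black"       # per-speaker text color used by the changing means


def specify_speaker(features: Sequence[float],
                    voice_db: Sequence[Contact],
                    max_distance: float = 0.5) -> Optional[Contact]:
    """Nearest neighbor collation: return the closest enrolled contact,
    or None when the database is empty or nobody is close enough."""
    best = min(voice_db, key=lambda c: dist(features, c.features), default=None)
    if best is None or dist(features, best.features) > max_distance:
        return None
    return best


def provide(text: str, features: Sequence[float], loudness: float,
            voice_db: Sequence[Contact]) -> dict:
    """Associate the converted text with the specified speaker and set an
    attention level (color by speaker, size by loudness)."""
    speaker = specify_speaker(features, voice_db)
    label = f"{speaker.mark} {speaker.name}" if speaker else "Unknown caller"
    color = speaker.color if speaker else "black"
    size = "large" if loudness > 0.8 else "normal"  # illustrative threshold
    return {"speaker": label, "text": text, "color": color, "size": size}


# Usage with made-up feature vectors.
voice_db = [
    Contact("Mr. A", "[bear]", (0.1, 0.9)),
    Contact("Mr. C", "[star]", (0.8, 0.2), color="red"),
]
result = provide("Please call me back.", (0.75, 0.25), 0.9, voice_db)
unknown = provide("Hello?", (5.0, 5.0), 0.1, voice_db)
print(result)
print(unknown)
```

A real system would replace the hand-written vectors with features extracted from the recorded audio, and could swap the nearest neighbor comparison for any of the other learning methods listed above.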

Abstract

The invention addresses the problem of performing text conversion and speaker identification based on the result of analyzing speech recorded on an answering machine, and of presenting the text and the speaker in association with each other. The solution according to the invention is an answering machine text providing system comprising: analysis means for analyzing speech recorded on an answering machine; conversion means for converting the speech into text based on the analysis result; identification means for identifying the speaker of the speech based on the analysis result; and providing means for providing the converted text and the identified speaker in association with each other.
PCT/JP2017/013260 2017-03-30 2017-03-30 Answering machine text providing system, answering machine text providing method, and program WO2018179227A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/013260 WO2018179227A1 (fr) 2017-03-30 2017-03-30 Answering machine text providing system, answering machine text providing method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/013260 WO2018179227A1 (fr) 2017-03-30 2017-03-30 Answering machine text providing system, answering machine text providing method, and program

Publications (1)

Publication Number Publication Date
WO2018179227A1 (fr) 2018-10-04

Family

ID=63677839

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/013260 WO2018179227A1 (fr) 2017-03-30 2017-03-30 Answering machine text providing system, answering machine text providing method, and program

Country Status (1)

Country Link
WO (1) WO2018179227A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109842712A (zh) * 2019-03-12 2019-06-04 贵州财富之舟科技有限公司 Method, apparatus, computer device, and storage medium for generating call records

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
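JPH08242280A (ja) * 1995-03-02 1996-09-17 Canon Inc Voice mail device
JPH10276263A (ja) * 1997-03-28 1998-10-13 Matsushita Electric Ind Co Ltd Telephone device
JP2002218066A (ja) * 2001-01-23 2002-08-02 Fujitsu Ltd Recorded information transfer system, recorded information transmitting device, recording medium, and program
JP2002271529A (ja) * 2001-03-07 2002-09-20 Sharp Corp Communication device
JP2007206501A (ja) * 2006-02-03 2007-08-16 Advanced Telecommunication Research Institute International Optimal speech recognition method determination device, speech recognition device, parameter calculation device, information terminal device, and computer program
JP2009171336A (ja) * 2008-01-17 2009-07-30 Nec Corp Mobile communication terminal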
US20160260435A1 (en) * 2014-04-01 2016-09-08 Sony Corporation Assigning voice characteristics to a contact information record of a person

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17903692

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 17903692

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP