WO2018179227A1 - Answering machine text providing system, answering machine text providing method, and program
- Publication number
- WO2018179227A1 (application PCT/JP2017/013260)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- answering machine
- speaker
- voice
- providing system
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/64—Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
- H04M1/65—Recording arrangements for recording a message from the calling party
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/50—Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers ; Centralised arrangements for recording messages
- H04M3/53—Centralised arrangements for recording incoming messages, i.e. mailbox systems
- H04M3/537—Arrangements for indicating the presence of a recorded message, whereby the presence information might include a preview or summary of the message
Definitions
- The present invention relates to an answering machine text providing system, an answering machine text providing method, and a program that convert the content of an answering machine message into text and provide it.
- In Patent Document 1, a voice message left on an answering machine is processed by speech recognition, converted into text, and sent to the other party by e-mail.
- However, Patent Document 1 has the problem that the person who left the voice message cannot be identified from the voice.
- An object of the present invention is to perform text conversion and speaker identification based on the result of analyzing voice recorded on an answering machine, and to notify the callee of the text in association with the speaker.
- The present invention provides the following solutions.
- The invention according to the first feature provides an answering machine text providing system comprising: analysis means for analyzing voice recorded on an answering machine; conversion means for converting the voice into text based on the analyzed result; specifying means for specifying the speaker of the voice based on the analyzed result; and providing means for providing the converted text and the specified speaker in association with each other.
- The invention according to the first feature also provides an answering machine text providing method comprising: an analysis step of analyzing voice recorded on an answering machine; a conversion step of converting the voice into text based on the analyzed result; a specifying step of specifying the speaker of the voice based on the analyzed result; and a providing step of providing the converted text and the specified speaker in association with each other.
- The invention according to the first feature further provides a program that causes a computer to execute: a step of analyzing voice recorded on an answering machine; a step of converting the voice into text based on the analyzed result; a step of specifying the speaker of the voice based on the analyzed result; and a step of providing the converted text and the specified speaker in association with each other.
- FIG. 1 is a schematic diagram of an answering machine text providing system.
- The answering machine text providing system of the present invention converts the content of answering machine messages into text and provides it.
- FIG. 1 is a diagram for explaining the outline of an answering machine text providing system which is a preferred embodiment of the present invention.
- The answering machine text providing system includes analysis means, conversion means, specifying means, and providing means, each realized by a control unit reading a predetermined program. Changing means may also be provided. The system may be implemented as an application installed on a smart device, as a cloud service, or in another form. Each of the means described above may be realized by a single computer or by two or more computers (for example, a server and a terminal).
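The relationship between the four means can be sketched as a small pipeline. This is only an illustrative sketch: the names `Analysis`, `ProvidedMessage`, `analyze`, and `provide_text` are hypothetical, the mean absolute amplitude stands in for whatever analysis a real system performs, and the conversion and specifying means are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Analysis:
    """Result of the analysis means (fields are illustrative assumptions)."""
    samples: Sequence[float]
    loudness: float


@dataclass
class ProvidedMessage:
    """Converted text associated with the specified speaker."""
    speaker: str
    text: str


def analyze(samples: Sequence[float]) -> Analysis:
    # Stand-in analysis: mean absolute amplitude of the recording.
    loudness = sum(abs(s) for s in samples) / len(samples) if samples else 0.0
    return Analysis(samples=samples, loudness=loudness)


def provide_text(
    samples: Sequence[float],
    convert: Callable[[Analysis], str],
    specify: Callable[[Analysis], str],
) -> ProvidedMessage:
    analysis = analyze(samples)   # analysis means
    text = convert(analysis)      # conversion means
    speaker = specify(analysis)   # specifying means
    return ProvidedMessage(speaker=speaker, text=text)  # providing means
```

Because conversion and speaker specification both consume the same analysis result, they could run on one computer or be split between a terminal and a server, as the paragraph above allows.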
- The analysis means analyzes the voice recorded on the answering machine.
- The audio waveform may be analyzed, and speech recognition may be performed. In addition, machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of the analysis.
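As one possible reading of "analyzing the audio waveform", the sketch below computes two common frame-level features, RMS energy and zero-crossing rate. The patent does not say which features are used, so the choice of features, the function names, and the `frame_size` parameter are all assumptions made for illustration.

```python
import math


def rms(samples):
    """Root-mean-square energy of a frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def analyze_waveform(samples, frame_size=4):
    """Split the recording into frames and compute features per frame."""
    frames = [
        samples[i:i + frame_size] for i in range(0, len(samples), frame_size)
    ]
    return [{"rms": rms(f), "zcr": zero_crossing_rate(f)} for f in frames]
```

Features like these could feed both the speaker identification and the attention-level decisions described later, but that wiring is not specified in the source.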
- The conversion means converts the voice into text based on the analyzed result.
- Machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of the conversion.
- The specifying means specifies the speaker of the voice based on the analyzed result.
- The speaker may be identified by collating the voice against a voice database.
- The voice database in which voices are registered may be located inside or outside the system.
- Speaker recognition may be performed to identify the speaker.
- Machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of identification.
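Collation against a voice database can be illustrated with a simple similarity search. The sketch assumes each registered voice has already been reduced to a fixed-length feature vector; cosine similarity, the `threshold` value, and the names `identify_speaker` and `voice_db` are hypothetical choices, not something the patent prescribes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(features, voice_db, threshold=0.8):
    """Return the best-matching registered speaker, or None if no match.

    voice_db maps a speaker name to that speaker's registered feature vector.
    A match must reach the (assumed) similarity threshold.
    """
    best_name, best_score = None, threshold
    for name, registered in voice_db.items():
        score = cosine_similarity(features, registered)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` for a voice that matches nobody keeps unknown callers distinguishable from registered ones. Whether the database lives on the device or on a server does not change this logic.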
- The providing means provides the converted text and the specified speaker in association with each other.
- The converted text may be provided in association with an image of the specified speaker, which helps the user understand who the speaker is. Alternatively, the converted text may be provided in association with an image, other than an image of the speaker, that has been associated with the specified speaker in advance. In this way the user can tell who the speaker is even when the user does not want people nearby to know. For example, if a bear stamp is assigned to Mr. A and a lover mark to Mr. B in advance, then when the specified speaker is Mr. A the converted text is provided in association with the bear stamp, and when the specified speaker is Mr. B the converted text is provided in association with the lover mark. Further, the converted text may be provided in association with the name under which the specified speaker is registered in the telephone directory. Keeping the provided name in sync with the phone book helps the user understand who the speaker is.
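The association options described above (a pre-assigned substitute image, or the name registered in the phone book) can be sketched as a lookup. The data structures and the rule that a substitute image takes priority over the phone book name are assumptions for illustration; `Provision` and `associate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Provision:
    """What is shown to the user: a marker for the speaker plus the text."""
    marker: str
    text: str


def associate(speaker_id, text, phone_book, substitute_images):
    """Pick the marker shown with the text for the specified speaker.

    Assumed priority: a pre-assigned substitute image (so that bystanders
    cannot tell who called), then the phone book name, then a fallback.
    """
    if speaker_id in substitute_images:
        return Provision(marker=substitute_images[speaker_id], text=text)
    name = phone_book.get(speaker_id, "Unknown caller")
    return Provision(marker=name, text=text)
```

For example, with `substitute_images = {"A": "bear stamp"}`, a message from speaker `"A"` is shown next to the bear stamp, while a speaker who has only a phone book entry is shown by name.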
- The changing means changes the attention level of the converted text based on the analyzed result.
- The attention level of the converted text may be changed based on the strength of the voice, the intonation of the voice, the content of the message, and the like. Changing the attention level helps the user understand the text.
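A minimal sketch of deriving an attention level from the analysis result follows. It uses only two of the cues named above, voice strength and message content; the loudness threshold, the keyword list, and the way the level is rendered are all hypothetical values chosen for the example.

```python
# Hypothetical keywords that suggest the message deserves more attention.
URGENT_WORDS = ("urgent", "emergency", "immediately")


def attention_level(loudness, text, loud_threshold=0.5):
    """Return 0, 1, or 2 depending on voice strength and message content."""
    level = 0
    if loudness >= loud_threshold:  # strong voice
        level += 1
    if any(word in text.lower() for word in URGENT_WORDS):  # urgent content
        level += 1
    return level


def emphasize(text, level):
    """Render the text more conspicuously as the attention level rises."""
    if level >= 2:
        return text.upper() + " (!)"
    if level == 1:
        return text + " (!)"
    return text
```

Intonation could be added as a third cue in the same way, but extracting it reliably needs pitch tracking, which is outside this sketch.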
- The changing means may also change the attention level of the converted text according to the specified speaker. For example, the text color is changed to red when the speaker is Mr. C, and the text size is increased when the speaker is Mr. D. Changing the attention level helps the user understand both the text and who is speaking.
[Description of operation]
- The answering machine text providing method includes an analysis step, a conversion step, a specifying step, and a providing step. A changing step may also be provided.
- In the analysis step, the analysis means described above analyzes the voice recorded on the answering machine.
- The audio waveform may be analyzed, and speech recognition may be performed. In addition, machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of the analysis.
- In the conversion step, the conversion means described above converts the voice into text based on the analyzed result.
- Machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of the conversion.
- In the specifying step, the specifying means described above specifies the speaker of the voice based on the analyzed result.
- The speaker may be identified by collating the voice against a voice database.
- The voice database in which voices are registered may be located inside or outside the system.
- Speaker recognition may be performed to identify the speaker.
- Machine learning by AI (artificial intelligence) may be performed using past voice recordings as training data, which can improve the accuracy of identification.
- In the providing step, the providing means described above provides the converted text in association with the specified speaker.
- The converted text may be provided in association with an image of the specified speaker, which helps the user understand who the speaker is. Alternatively, the converted text may be provided in association with an image, other than an image of the speaker, that has been associated with the specified speaker in advance. In this way the user can tell who the speaker is even when the user does not want people nearby to know. For example, if a bear stamp is assigned to Mr. A and a lover mark to Mr. B in advance, then when the specified speaker is Mr. A the converted text is provided in association with the bear stamp, and when the specified speaker is Mr. B the converted text is provided in association with the lover mark. Further, the converted text may be provided in association with the name under which the specified speaker is registered in the telephone directory. Keeping the provided name in sync with the phone book helps the user understand who the speaker is.
- In the changing step, the changing means described above changes the attention level of the converted text based on the analyzed result.
- The attention level of the converted text may be changed based on the strength of the voice, the intonation of the voice, the content of the message, and the like. Changing the attention level helps the user understand the text.
- The attention level of the converted text may also be changed according to the specified speaker. For example, the text color is changed to red when the speaker is Mr. C, and the text size is increased when the speaker is Mr. D. Changing the attention level helps the user understand both the text and who is speaking.
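The per-speaker styling in the Mr. C and Mr. D example can be sketched as a style table applied at display time. Rendering to an HTML `span` with inline CSS is an assumption made for the example, as are the names `SPEAKER_STYLES` and `render` and the `150%` size; the patent only says that color or size changes.

```python
from html import escape

# Hypothetical per-speaker styles, following the Mr. C / Mr. D example.
SPEAKER_STYLES = {
    "Mr. C": "color: red",
    "Mr. D": "font-size: 150%",
}


def render(speaker, text, styles=SPEAKER_STYLES):
    """Return an HTML fragment showing the speaker and the converted text."""
    body = f"{escape(speaker)}: {escape(text)}"
    style = styles.get(speaker)
    if style is None:
        # Speakers without a registered style are shown unstyled.
        return f"<span>{body}</span>"
    return f'<span style="{style}">{body}</span>'
```

Escaping the text matters here because the converted message is untrusted input from the caller.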
- The means and functions described above are realized by a computer (including a CPU, an information processing apparatus, and various terminals) reading and executing a predetermined program.
- The program may be, for example, an application installed on a computer, or may be provided in the form of SaaS (software as a service) from a computer via a network. It may also be provided in a form recorded on a computer-readable recording medium such as a flexible disk, a CD (CD-ROM, etc.), or a DVD (DVD-ROM, DVD-RAM, etc.).
- In that case, the computer reads the program from the recording medium, transfers it to an internal or external storage device, stores it, and executes it.
- The program may also be recorded in advance in a storage device (recording medium) such as a magnetic disk, an optical disk, or a magneto-optical disk, and provided from the storage device to a computer via a communication line.
- As a specific machine learning technique, a nearest neighbor method, a naive Bayes method, a decision tree, a support vector machine, or the like may be used.
- Deep learning, in which a neural network itself generates the feature values used for learning, may also be used.
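Of the techniques listed, the nearest neighbor method is the simplest to show. The sketch below is a 1-nearest-neighbor classifier over labeled feature vectors from past recordings; the Euclidean distance metric and the function name `nearest_neighbor` are assumptions, since the patent does not specify them.

```python
import math


def nearest_neighbor(query, training):
    """Return the label of the training example closest to the query.

    training is a list of (feature_vector, label) pairs taken from past
    recordings used as training data. Distance is Euclidean (an assumption).
    """
    if not training:
        raise ValueError("training data must not be empty")
    _, label = min(training, key=lambda item: math.dist(query, item[0]))
    return label
```

Usage: with `training = [([0.0, 0.0], "Mr. A"), ([1.0, 1.0], "Mr. B")]`, a query near the origin is labeled as Mr. A. Adding more past recordings per speaker is how this method would improve with use.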
Abstract
The invention addresses the problem of performing text conversion and speaker identification based on the result of analyzing voice recorded on an answering machine, and of presenting the text and the speaker in association with each other. The solution according to the invention is an answering machine text providing system comprising: analysis means for analyzing voice recorded on an answering machine; conversion means for converting the voice into text based on the analysis result; specifying means for identifying the speaker of the voice based on the analysis result; and providing means for providing the converted text and the identified speaker in association with each other.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/013260 WO2018179227A1 | 2017-03-30 | 2017-03-30 | Answering machine text providing system, answering machine text providing method, and program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/013260 WO2018179227A1 | 2017-03-30 | 2017-03-30 | Answering machine text providing system, answering machine text providing method, and program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018179227A1 | 2018-10-04 |
Family
ID=63677839
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/013260 WO2018179227A1 | 2017-03-30 | 2017-03-30 | Answering machine text providing system, answering machine text providing method, and program |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018179227A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109842712A * | 2019-03-12 | 2019-06-04 | 贵州财富之舟科技有限公司 | Method, apparatus, computer device, and storage medium for generating call records |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08242280A * | 1995-03-02 | 1996-09-17 | Canon Inc | Voice mail apparatus |
JPH10276263A * | 1997-03-28 | 1998-10-13 | Matsushita Electric Ind Co Ltd | Telephone apparatus |
JP2002218066A * | 2001-01-23 | 2002-08-02 | Fujitsu Ltd | Recorded information transfer system, recorded information transmission device, recording medium, and program |
JP2002271529A * | 2001-03-07 | 2002-09-20 | Sharp Corp | Communication device |
JP2007206501A * | 2006-02-03 | 2007-08-16 | Advanced Telecommunication Research Institute International | Optimal speech recognition method determination device, speech recognition device, parameter calculation device, information terminal device, and computer program |
JP2009171336A * | 2008-01-17 | 2009-07-30 | Nec Corp | Mobile communication terminal |
US20160260435A1 (en) * | 2014-04-01 | 2016-09-08 | Sony Corporation | Assigning voice characteristics to a contact information record of a person |
- 2017-03-30: WO application PCT/JP2017/013260 (WO2018179227A1), status: active, Application Filing
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17903692; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 17903692; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: JP |