WO2008077336A1 - Voice response method and voice server - Google Patents

Voice response method and voice server

Info

Publication number
WO2008077336A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
response data
text
service request
processing module
Prior art date
Application number
PCT/CN2007/071104
Other languages
English (en)
French (fr)
Inventor
Yuetao Meng
Zhou Yu
Keping Chen
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP07817294A (EP1968293A1)
Priority to US12/132,185 (US20080232559A1)
Publication of WO2008077336A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/38 Displays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer

Definitions

  • The present invention relates to the field of communications, and in particular to a voice response method and a voice server.
  • VoIP: Voice over IP
  • IVR: Interactive Voice Response
  • Through fully automated IVR applications, customers can conduct end-to-end business with enterprises or operators.
  • When a customer calls the service hotline of a consumer appliance manufacturer, the customer only needs to say "water tank" to be routed to the relevant department, which greatly shortens the call.
  • In the telecom value-added service field, for example the 号码百事通 number-directory service, operators also offer a speech recognition user experience.
  • For data entry, speech technology is far superior to touch-tone IVR.
  • American Airlines recently introduced a fully automated system that lets people book tickets by telephone. Such an application could not be deployed with touch-tone dialing alone.
  • VUI: Voice User Interface
  • The following example is an interaction between a user and a flight information system:
  • The automated IVR system includes a telephone, a switch, and a voice server connected in sequence.
  • The voice server includes a service processing module, a service control module, and a voice processing module connected in sequence, and the service control module is connected to the switch. The main workflow of the IVR system is as follows:
  • A. The user dials the telephone number of the voice server, and the switch connects the transmission channel between the telephone and the voice server;
  • B. The voice server plays a welcome message or an operation prompt. Specifically, the service control module obtains a text response from the service processing module and invokes the TTS (Text to Speech) technology of the voice processing module to convert the text response into speech; the service control module then returns the speech to the telephone through the switch;
  • C. The user interacts with the voice server by voice. The service control module passes the sound signal from the telephone to the voice processing module; the voice processing module performs ASR (Automatic Speech Recognition) and returns text to the service control module, which submits the text to the service processing module;
  • D. If the speech is correctly recognized, the service processing module executes the service and prompts the user with the result; if the speech is not recognized or is ambiguous, the service processing module prompts the user to confirm the result or reports the error;
  • Throughout the flow, the voice server announces results or asks the user to confirm operations.
  • When the voice server cannot recognize the user's spoken operation or finds it ambiguous, it usually plays a voice prompt asking the user to resolve the ambiguity or repeat the operation.
  • If the prompt is played quickly, it is hard to understand and easy to forget; if it is played slowly, the user may lose patience.
  • In a noisy environment, noise also interferes with the user's hearing.
  • The prompt can be replayed, but this often annoys the user.
  • The IVR system has the following disadvantages:
  • A poor voice interaction interface may slow down the voice interaction system, because the user must listen to and understand the entire prompt before continuing to use the system;
  • The technical problem to be solved by the embodiments of the present invention is to provide a voice response method and a voice server that provide a visual interface alongside the speech recognition interaction interface.
  • A voice response method includes the following steps: acquiring a voice service request and converting the voice service request into a text service request; obtaining corresponding voice response data and visual response data according to the text service request; and sending the voice response data and the visual response data.
  • The technical solution adopted by the embodiments of the present invention also provides a voice server, which includes a service processing module, a service control module, and a voice processing module.
  • The voice processing module is configured to convert a received voice service request into a text service request, and the service processing module is configured to obtain corresponding voice response data and visual response data according to the text service request;
  • The service control module is configured to send the voice response data and the visual response data.
  • The beneficial effects of the embodiments of the present invention are as follows: because sound is combined with visual response data, the human-computer interface is friendlier and more harmonious; when the prompt cannot be made out, interaction is still possible through the visual interface.
  • The user can barge in by voice and answer before the prompt has finished playing, which improves the speed and efficiency of the voice interaction; the prompt also no longer has to be replayed when the user did not understand or hear it.
  • FIG. 1 is a schematic structural view of an automated IVR system in the background art.
  • FIG. 2 is a schematic structural diagram of an IVR system according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart diagram of a voice response method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of a SIP-based voice response method according to an embodiment of the present invention.
  • The IVR system of the embodiment of the present invention includes a telephone, a switch, and a voice server connected in sequence.
  • The voice server includes a service processing module, a service control module, and a voice processing module connected in sequence, and the service control module is connected to the switch.
  • The voice processing module converts a received voice service request into a text service request; the voice service request may be obtained from the service control module or directly through an interface.
  • The service processing module stores voice response data and visual response data (such as text, images, or streaming media) associated with text service requests, and obtains the corresponding voice response data and visual response data according to the text service request.
  • The service control module is connected to the service processing module and controls it, returning the voice response data and visual response data obtained by the service processing module to the telephone through the switch for presentation to the user.
  • The telephone has a display module.
  • The voice server transmits text, images, or streaming media to the telephone while transmitting voice, and the telephone shows the text, image, or streaming media content on its display module.
  • The embodiment lets people see a synthesized face (a virtual host) while listening to the computer speak, making the human-computer interface friendlier and more harmonious.
  • If the service processing module holds text response data associated with the text service request, the voice server should further include a conversion unit and a second voice processing module. The conversion unit may be a separate module or may be located in the service control module, and converts the text response data into an image and/or a media stream.
  • The second voice processing module converts the text response data into voice response data; it may be a separate module or may be located in the voice processing module.
  • In this case, the service control module controls the service processing module, obtains text response data from it, invokes the TTS technology of the second voice processing module to convert the text response data into voice response data, and directs the conversion unit to invoke Text-to-Visual Speech (TTVS) technology to convert the text response data into an image or streaming media.
  • TTVS: Text-to-Visual Speech
  • In addition to the voice interface, the telephone voice system of the embodiment can provide an auxiliary text or graphical visual interface or a video interface. Combining sound with visual information improves the speed and efficiency of the voice interaction and makes the human-computer interface friendlier and more harmonious.
  • The embodiments of the present invention do not restrict the transmission network or protocol, so text, images, and streaming media can be carried over a PSTN (Public Switched Telephone Network), an IP-based switching network, or an IP-based protocol (such as SIP).
  • The telephone of the embodiment may be a VoIP phone, a plain old telephone (POTS), a smart terminal, a mobile phone, or the like.
  • The voice response method of the embodiment of the present invention includes the following steps:
  • A. Acquire the user's voice service request and convert the voice service request into a text service request.
  • The visual response data includes at least one of text, an image, and streaming media. If the visual response data is text or an image, the text or image is sent to the user through signaling; if the visual response data is streaming media, a streaming media communication channel is established and the streaming media is sent to the user over it.
  • The SIP-based voice response method includes the following steps:
  • After the user dials, the telephone sends an INVITE message to the voice server and the voice server returns a 200 OK message. Both messages carry flags indicating whether the telephone supports text messages, images, and streaming media, together with SDP (Session Description Protocol) content describing the media streams;
  • After SDP negotiation, an audio communication channel is established between the telephone and the voice server. If the telephone supports text messages, the telephone and the voice server exchange text through signaling and voice through the audio communication channel; if the telephone supports streaming media, a video communication channel is established between them and streaming media is exchanged over it; if the telephone supports image messages, images are exchanged between them through signaling.
  • INVITE SIP:911 SIP/2.0 // initiates a call to 911
  • The telephone sends an INVITE message to the voice server, indicating that it wants to establish video and audio channels and that it supports text messages (MESSAGE) and images (INFO); the voice server then returns a 200 OK message.
  • On a PSTN network, the H.320 protocol can be used to implement the above functions; details are not described here.
  • The switch in the IVR system of the embodiment may also be replaced by a softswitch device, a router, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Description

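Voice response method and voice server
This application claims priority to Chinese Patent Application No. 200610157787.8, filed with the Chinese Patent Office on December 26, 2006 and entitled "Voice server and voice response method", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular to a voice response method and a voice server.
Background
Advances in speech recognition technology and the advent of VoIP (Voice over IP), together with newly emerging advanced "voice servers" (as opposed to touch-tone menu selection), have made fully automated IVR (Interactive Voice Response) applications possible, through which customers can conduct end-to-end business with enterprises or operators. For example, when a customer calls the service hotline of a consumer appliance manufacturer, the customer only needs to say "water tank" to be routed to the relevant department, which greatly shortens the call. In the telecom value-added service field, for example the 号码百事通 number-directory service, operators also offer a speech recognition user experience. In another application area, data entry, speech technology is far superior to touch-tone IVR. For example, American Airlines recently introduced a fully automated system that lets people book tickets by telephone. Such an application could not be deployed with touch-tone dialing alone.
In speech technology, the user interacts with the system through hearing and voice. Such an interface is called a Voice User Interface (VUI). A VUI must, as far as possible, obtain the correct result in the first interaction, reducing the number of user confirmations and the number of returns from errors.
The following example is an interaction between a user and a flight information system:
System: Hello, and thank you for calling "Blue Sky" Airlines. Our latest automated system can help you look up the flight information you need. Do you know the flight number?
User: Sorry, I don't know.
System: No problem. Please tell me the departure city of the flight.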
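User: Beijing.
Referring to FIG. 1, an automated IVR system includes a telephone, a switch, and a voice server connected in sequence. The voice server includes a service processing module, a service control module, and a voice processing module connected in sequence, and the service control module is connected to the switch. The main workflow of the IVR system is as follows:
A. The user dials the telephone number of the voice server, and the switch connects the transmission channel between the telephone and the voice server;
B. The voice server plays a welcome message or an operation prompt. Specifically, the service control module obtains a text response from the service processing module and invokes the TTS (Text to Speech) technology of the voice processing module to convert the text response into speech; the service control module then returns the speech to the telephone through the switch;
C. The user interacts with the voice server by voice. The service control module passes the sound signal from the telephone to the voice processing module; the voice processing module performs ASR (Automatic Speech Recognition) and returns text to the service control module, which submits the text to the service processing module;
D. If the speech is correctly recognized as text, the service processing module executes the service and prompts the user with the result; if the speech is not recognized or is ambiguous, the service processing module prompts the user to confirm the result or reports the error;
E. The user continues to interact with the voice server by voice, or hangs up.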
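As can be seen, throughout the flow the user inputs spoken responses and the voice server announces results or asks the user to confirm operations. However, when the voice server cannot recognize the user's spoken operation or finds it ambiguous, it usually plays a voice prompt asking the user to resolve the ambiguity or repeat the operation. If the prompt is played quickly, it is hard to understand and easy to forget; if it is played slowly, the user may lose patience. In a noisy environment, noise also interferes with the user's hearing. The prompt can be replayed, but this often annoys the user.
Therefore, the IVR system has the following disadvantages:
1. A poor voice interaction interface may slow down the voice interaction system, because the user must listen to and understand the entire prompt before continuing to use the system;
2. Repeated playback of prompts often annoys the user.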
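Summary of the Invention
The technical problem to be solved by the embodiments of the present invention is to provide a voice response method and a voice server that provide a visual interface alongside the speech recognition interaction interface.
To solve the above technical problem, the technical solution adopted by the embodiments of the present invention is a voice response method including the following steps:
acquiring a voice service request, and converting the voice service request into a text service request;
obtaining corresponding voice response data and visual response data according to the text service request;
sending the voice response data and the visual response data.
To solve the above technical problem, the embodiments of the present invention also provide a voice server, which includes a service processing module, a service control module, and a voice processing module, where:
the voice processing module is configured to convert a received voice service request into a text service request;
the service processing module is configured to obtain corresponding voice response data and visual response data according to the text service request;
the service control module is configured to send the voice response data and the visual response data.
The beneficial effects of the embodiments of the present invention are as follows: because sound is combined with visual response data, the human-computer interface is friendlier and more harmonious; when the prompt cannot be made out, interaction is still possible through the visual interface; the user can barge in by voice and answer before the prompt has finished playing, which improves the speed and efficiency of the voice interaction; and the prompt no longer has to be replayed when the user did not understand or hear it.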
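Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an automated IVR system in the background art.
FIG. 2 is a schematic structural diagram of an IVR system according to an embodiment of the present invention.
FIG. 3 is a schematic flowchart of a voice response method according to an embodiment of the present invention.
FIG. 4 is a schematic flowchart of a SIP-based voice response method according to an embodiment of the present invention.
Detailed Description
Specific embodiments of the present invention are described below by way of example with reference to the accompanying drawings.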
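Referring to FIG. 2, the IVR system of the embodiment of the present invention includes a telephone, a switch, and a voice server connected in sequence. The voice server includes a service processing module, a service control module, and a voice processing module connected in sequence, and the service control module is connected to the switch. The voice processing module converts a received voice service request into a text service request; the voice service request may be obtained from the service control module or directly through an interface. The service processing module stores voice response data and visual response data (such as text, images, or streaming media) associated with text service requests, and obtains the corresponding voice response data and visual response data according to the text service request. The service control module is connected to the service processing module and controls it, returning the voice response data and visual response data obtained by the service processing module to the telephone through the switch for presentation to the user.
The telephone has a display module. Through a video communication channel, an audio communication channel, and signaling, the voice server transmits text, images, or streaming media to the telephone while transmitting voice, and the telephone shows the text, image, or streaming media content on its display module. The embodiment lets people see a synthesized face (a virtual host) while listening to the computer speak, making the human-computer interface friendlier and more harmonious.
In addition, if the service processing module of the voice server holds text response data associated with the text service request, the voice server should further include a conversion unit and a second voice processing module. The conversion unit may be a separate module or may be located in the service control module, and converts the text response data into an image and/or a media stream. The second voice processing module converts the text response data into voice response data; it may be a separate module or may be located in the voice processing module. In this case, the service control module controls the service processing module, obtains text response data from it, invokes the TTS technology of the second voice processing module to convert the text response data into voice response data, and directs the conversion unit to invoke Text-to-Visual Speech (TTVS) technology to convert the text response data into an image or streaming media.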
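The module arrangement described above can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the patent's implementation: every class and method name is hypothetical, the "audio" is plain UTF-8 text so that no real ASR, TTS, or TTVS engine is needed, and the stored answer is invented sample data.

```python
from dataclasses import dataclass


@dataclass
class Response:
    voice: bytes   # voice response data (placeholder bytes, not real audio)
    visual: str    # visual response data (here: text for the phone display)


class VoiceProcessingModule:
    """Stands in for ASR: converts a voice service request into text."""

    def to_text(self, voice_request: bytes) -> str:
        # Assumption: the "audio" is UTF-8 text, so decoding replaces recognition.
        return voice_request.decode("utf-8").strip().lower()


class SecondVoiceProcessingModule:
    """Stands in for TTS: converts text response data into voice response data."""

    def to_voice(self, text: str) -> bytes:
        return b"AUDIO:" + text.encode("utf-8")


class ConversionUnit:
    """Stands in for TTVS: converts text response data into visual response data."""

    def to_visual(self, text: str) -> str:
        return f"[display] {text}"


class ServiceProcessingModule:
    """Holds text response data associated with text service requests."""

    def __init__(self) -> None:
        # Invented sample data for the flight-information dialogue.
        self._answers = {"beijing": "Please say the destination city."}

    def text_response(self, text_request: str) -> str:
        return self._answers.get(text_request, "Sorry, please repeat that.")


class ServiceControlModule:
    """Orchestrates the other modules and returns both kinds of response data."""

    def __init__(self) -> None:
        self.asr = VoiceProcessingModule()
        self.service = ServiceProcessingModule()
        self.tts = SecondVoiceProcessingModule()
        self.ttvs = ConversionUnit()

    def handle(self, voice_request: bytes) -> Response:
        text_request = self.asr.to_text(voice_request)
        text_response = self.service.text_response(text_request)
        return Response(
            voice=self.tts.to_voice(text_response),
            visual=self.ttvs.to_visual(text_response),
        )


if __name__ == "__main__":
    reply = ServiceControlModule().handle(b"Beijing")
    print(reply.visual)
```

Keeping the control module as the only component that calls the others mirrors the description, in which the service control module both queries the service processing module and invokes the TTS and TTVS conversions.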
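In addition to the voice interface, the telephone voice system of the embodiment of the present invention can provide an auxiliary text or graphical visual interface or a video interface. Combining sound with visual information improves the speed and efficiency of the voice interaction and makes the human-computer interface friendlier and more harmonious.
The embodiments of the present invention do not restrict the transmission network or protocol, so text, images, and streaming media can be carried over a PSTN (Public Switched Telephone Network), an IP-based switching network, or an IP-based protocol (such as SIP). The telephone of the embodiment may be a VoIP phone, a plain old telephone (POTS), a smart terminal, a mobile phone, or the like.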
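Referring to FIG. 3, the voice response method of the embodiment of the present invention includes the following steps:
A. Acquire the user's voice service request and convert the voice service request into a text service request;
B. Obtain corresponding voice response data and visual response data according to the text service request. If voice response data and/or visual response data associated with the text service request exist, obtain them directly according to the text service request. If text response data associated with the text service request exists, obtain the corresponding text response data according to the text service request and convert the text response data into voice response data, an image, and/or streaming media;
C. Send the voice response data and the visual response data to the user. The visual response data includes at least one of text, an image, and streaming media. If the visual response data is text or an image, send the text or image to the user through signaling; if the visual response data is streaming media, establish a streaming media communication channel and send the streaming media to the user over that channel.
The method further includes receiving service capability information, reported by the user, about what the terminal supports, and determining the corresponding visual response data according to the service capability information.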
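The capability-based selection and the transport rule of step C can be sketched in Python. This is a minimal sketch under assumptions: the function names and the capability labels are hypothetical, and the preference order (streaming media, then image, then text) is an assumption of this sketch, since the text only says that the visual response data is determined from the reported capabilities.

```python
from typing import Dict, Optional, Set, Tuple

# Assumed order of preference among the visual data kinds named in step C.
PREFERENCE = ("streaming", "image", "text")


def choose_visual(supported: Set[str],
                  available: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Pick the first preferred kind that the terminal supports and the server has."""
    for kind in PREFERENCE:
        if kind in supported and kind in available:
            return kind, available[kind]
    return None  # voice-only terminal: no visual response data is sent


def transport_for(kind: str) -> str:
    """Step C: text and images go in signaling; streaming media gets its own channel."""
    if kind in ("text", "image"):
        return "signaling"
    if kind == "streaming":
        return "streaming media channel"
    raise ValueError(f"unknown visual data kind: {kind}")


if __name__ == "__main__":
    # Hypothetical terminal that reports support for text and images only.
    choice = choose_visual({"text", "image"},
                           {"text": "Gate 12", "streaming": "clip-001"})
    if choice is not None:
        kind, payload = choice
        print(kind, payload, "via", transport_for(kind))
```

Returning `None` for a terminal without any display capability keeps the voice path unchanged, which matches the idea that the visual interface is an aid alongside the voice interaction.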
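Referring to FIG. 4, the voice response method of the embodiment based on SIP (Session Initiation Protocol) includes the following steps:
A1. After the user dials, the telephone sends an INVITE message to the voice server and the voice server returns a 200 OK message. Both the INVITE and the 200 OK messages carry flags indicating whether the telephone supports text messages, images, and streaming media, together with SDP (Session Description Protocol) content describing the media streams;
B1. After SDP negotiation through the INVITE and 200 OK messages that carry the SDP content, an audio communication channel is established between the telephone and the voice server. If the telephone supports text messages, the telephone and the voice server exchange text through signaling and voice through the audio communication channel. If the telephone supports streaming media, a video communication channel is established between the telephone and the voice server, and streaming media is exchanged over that channel. If the telephone supports image messages, images are exchanged between the telephone and the voice server through signaling.
For example, the user dials 911, and the telephone sends the following INVITE message: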
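INVITE SIP:911 SIP/2.0 // initiates a call to 911
Allow: MESSAGE, INFO,.... // the phone supports MESSAGE and INFO messages
Content-Type: application/SDP // the message body follows and conforms to SDP
c=IN IP4 191.169.1.112 // the phone wants to send and receive media data at IP address 191.169.1.112
m=audio 14380 RTP/AVP 0 96 97 98 // the phone's audio send/receive port is 14380
a=rtpmap:0 PCMU // audio codec
m=video 3400 RTP/AVP 98 99 // the phone's video send/receive port is 3400
a= // video codec (omitted)
By sending the INVITE message to the voice server, the telephone indicates that it wants to establish video and audio channels and tells the voice server that it supports text messages (MESSAGE) and images (INFO). The voice server returns the following 200 OK message:
SIP/2.0 200 OK
Allow: MESSAGE, INFO,.... // MESSAGE and INFO messages are supported
Content-Type: application/SDP
m=audio 14380 RTP/AVP 0 96 97 98 // the voice server's audio send/receive port is 14380
a=rtpmap:0 PCMU // audio codec
m=video 3400 RTP/AVP 98 99 // the voice server's video send/receive port is 3400
After the voice server returns the 200 OK message, the video and audio media streams are established.
From MESSAGE and INFO in the Allow field of the INVITE message, the voice server knows that the telephone can accept text messages and images. It sends text with MESSAGE messages and images with INFO messages.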
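How a server might read the phone's capabilities out of the INVITE shown above can be sketched in Python. This is a rough sketch, not a real SIP stack: `parse_capabilities` is a hypothetical helper, the parsing handles only the `Allow:` header and SDP `m=` lines in the simplified layout of this example, and treating a video media line as "supports streaming media" is an assumption of the sketch.

```python
def parse_capabilities(sip_message: str) -> dict:
    """Derive terminal capabilities from the Allow header and SDP media lines."""
    allowed = set()
    media_ports = {}
    for raw in sip_message.splitlines():
        # Drop the explanatory "// ..." annotations used in the example listing.
        line = raw.split("//", 1)[0].strip()
        if line.lower().startswith("allow:"):
            methods = line.split(":", 1)[1].split(",")
            # Skip empty items and the "...." placeholder from the listing.
            allowed = {m.strip().upper() for m in methods if m.strip().strip(".")}
        elif line.startswith("m="):
            media, port = line[2:].split()[:2]
            media_ports[media] = int(port)
    return {
        "text": "MESSAGE" in allowed,      # text is exchanged with MESSAGE
        "image": "INFO" in allowed,        # images are exchanged with INFO
        "streaming": "video" in media_ports,
        "audio_port": media_ports.get("audio"),
        "video_port": media_ports.get("video"),
    }


# The INVITE from the example, including one annotated line.
INVITE = """\
INVITE SIP:911 SIP/2.0
Allow: MESSAGE, INFO,.... // the phone supports MESSAGE and INFO messages
Content-Type: application/SDP

c=IN IP4 191.169.1.112
m=audio 14380 RTP/AVP 0 96 97 98
a=rtpmap:0 PCMU
m=video 3400 RTP/AVP 98 99
"""

if __name__ == "__main__":
    print(parse_capabilities(INVITE))
```

A production system would use a full SIP and SDP parser; the point here is only that the Allow methods and the media lines together tell the server which visual response data the phone can receive and on which channel.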
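The relevant standards are as follows:
RFC3261 describes the SIP protocol in detail
RFC3264 describes SDP session negotiation in detail
RFC3428 describes sending and receiving text with the MESSAGE message in detail
RFC2976 describes the INFO message in detail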
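On a PSTN network, the H.320 protocol may be used to implement the above functions; details are not described here.
In addition, the switch in the IVR system of the embodiment of the present invention may be replaced by a softswitch device, a router, or the like.
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software plus the necessary hardware platform, or entirely by hardware, although in many cases the former is the better implementation. Based on this understanding, all or part of the contribution that the technical solution of the present invention makes over the background art may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device, which may be a personal computer, a server, or a network device ...
The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.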

Claims

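1. A voice response method, characterized by comprising the following steps:
acquiring a voice service request, and converting the voice service request into a text service request;
obtaining corresponding voice response data and visual response data according to the text service request;
sending the voice response data and the visual response data.
2. The voice response method according to claim 1, characterized in that the method further comprises:
receiving service capability information;
determining the visual response data according to the service capability information.
3. The voice response method according to claim 1 or 2, characterized in that the visual response data comprises text and/or an image and/or streaming media.
4. The voice response method according to claim 1, characterized in that obtaining the corresponding voice response data and visual response data according to the text service request comprises:
obtaining corresponding text response data according to the text service request;
converting the text response data into the voice response data.
5. The voice response method according to claim 1, characterized in that obtaining the corresponding voice response data and visual response data according to the text service request comprises:
obtaining corresponding text response data according to the text service request;
converting the text response data into the visual response data.
6. The voice response method according to claim 3, characterized in that the method further comprises:
when the visual response data is text or an image, sending the text or image through signaling;
when the visual response data is streaming media, establishing a streaming media communication channel and sending the streaming media over the streaming media communication channel.
7. A voice server, comprising a service processing module, a service control module, and a voice processing module, characterized in that:
the voice processing module is configured to convert a received voice service request into a text service request;
the service processing module is configured to obtain corresponding voice response data and visual response data according to the text service request;
the service control module is configured to send the voice response data and the visual response data.
8. The voice server according to claim 7, characterized in that the service processing module stores voice response data and visual response data associated with the text service request.
9. The voice server according to claim 7, characterized in that the service processing module stores text response data associated with the text service request, and the voice server further comprises a second voice processing module configured to convert the text response data into the voice response data.
10. The voice server according to claim 7, characterized in that the service processing module stores text response data associated with the text service request, and the voice server further comprises a conversion unit configured to convert the text response data into the visual response data.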
PCT/CN2007/071104 2006-12-26 2007-11-21 Procédé de réponse vocale et serveur vocal WO2008077336A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP07817294A EP1968293A1 (en) 2006-12-26 2007-11-21 Speech response method and speech server
US12/132,185 US20080232559A1 (en) 2006-12-26 2008-06-03 Method for voice response and voice server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200610157787.8 2006-12-26
CNA2006101577878A CN101001287A (zh) 2006-12-26 2006-12-26 语音服务器及语音应答方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/132,185 Continuation US20080232559A1 (en) 2006-12-26 2008-06-03 Method for voice response and voice server

Publications (1)

Publication Number Publication Date
WO2008077336A1 true WO2008077336A1 (fr) 2008-07-03

Family

ID=38693089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/071104 WO2008077336A1 (fr) 2006-12-26 2007-11-21 Procédé de réponse vocale et serveur vocal

Country Status (4)

Country Link
US (1) US20080232559A1 (zh)
EP (1) EP1968293A1 (zh)
CN (1) CN101001287A (zh)
WO (1) WO2008077336A1 (zh)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9705653B2 (en) * 2009-05-04 2017-07-11 Qualcomm Inc. Downlink control transmission in multicarrier operation
CN102185982A (zh) * 2011-05-13 2011-09-14 廖公仆 一种缩短电话语音提示系统等待时间的方法
CN103020047A (zh) * 2012-12-31 2013-04-03 威盛电子股份有限公司 修正语音应答的方法及自然语言对话系统
CN104079729A (zh) * 2013-03-29 2014-10-01 上海城际互通通信有限公司 一种ivr的信息查询方法
CN103533186B (zh) * 2013-09-23 2016-03-02 安徽科大讯飞信息科技股份有限公司 一种基于语音呼叫的业务流程实现方法及系统
CN103916548A (zh) * 2014-04-17 2014-07-09 上海斐讯数据通信技术有限公司 一种嵌入式voip语音通信系统及其语音放音方法
CN104010097A (zh) * 2014-06-17 2014-08-27 携程计算机技术(上海)有限公司 基于传统pstn电话的多媒体通讯系统及方法
US9560200B2 (en) 2014-06-24 2017-01-31 Xiaomi Inc. Method and device for obtaining voice service
CN104331148A (zh) * 2014-09-23 2015-02-04 普强信息技术(北京)有限公司 一种语音用户界面信息交互方法
WO2016054977A1 (zh) * 2014-10-09 2016-04-14 腾讯科技(深圳)有限公司 一种互动应答的方法及装置
CN105120373B (zh) * 2015-09-06 2018-07-13 上海智臻智能网络科技股份有限公司 语音传输控制方法及系统
CN106559588B (zh) * 2015-09-30 2021-01-26 中兴通讯股份有限公司 一种文本信息上传的方法及装置
CN105516520B (zh) * 2016-02-04 2018-09-18 平安科技(深圳)有限公司 一种互动式语音应答装置
US10186269B2 (en) 2016-04-18 2019-01-22 Honda Motor Co., Ltd. Hybrid speech data processing in a vehicle
CN109145853A (zh) * 2018-08-31 2019-01-04 百度在线网络技术(北京)有限公司 用于识别噪音的方法和装置
CN109600525B (zh) * 2018-11-15 2021-01-05 中国联合网络通信集团有限公司 基于虚拟现实的呼叫中心的控制方法及装置
CN110379429B (zh) * 2019-07-16 2022-02-11 招联消费金融有限公司 语音处理方法、装置、计算机设备和存储介质
US11288459B2 (en) * 2019-08-01 2022-03-29 International Business Machines Corporation Adapting conversation flow based on cognitive interaction
CN113674748A (zh) * 2021-08-30 2021-11-19 疯壳(深圳)科技有限公司 一种可触发虚拟成像系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6804330B1 (en) * 2002-01-04 2004-10-12 Siebel Systems, Inc. Method and system for accessing CRM data via voice
JP2005292476A (ja) * 2004-03-31 2005-10-20 Jfe Systems Inc 顧客応対方法及び装置
CN1791157A (zh) * 2004-12-13 2006-06-21 西安大唐电信有限公司 智能输入实现电话呼叫及信息查询的系统及方法

Also Published As

Publication number Publication date
CN101001287A (zh) 2007-07-18
EP1968293A1 (en) 2008-09-10
US20080232559A1 (en) 2008-09-25

Similar Documents

Publication Publication Date Title
WO2008077336A1 (fr) Procédé de réponse vocale et serveur vocal
US8861510B1 (en) Dynamic assignment of media proxy
EP2012516B1 (en) Customised playback telephony services
US9219820B2 (en) Method and apparatus for providing voice control for accessing teleconference services
EP1829306B1 (en) Method and apparatus for providing emergency calls to a disabled endpoint device
US7688954B2 (en) System and method for identifying caller
US8442197B1 (en) Telephone-based user interface for participating simultaneously in more than one teleconference
US8223948B2 (en) Multi-tiered media services for globally interconnecting businesses and customers
US20050232169A1 (en) System and method for providing telecommunication relay services
US20080137643A1 (en) Accessing call control functions from an associated device
CA2551568A1 (en) Method and system for managing communication sessions between a text-based and a voice-based client
US8908845B2 (en) Method, device and system for implementing customized ring back tone service and customized ring tone service
US11032420B2 (en) Telephone call management system
US9215253B1 (en) Method, device, and system for real-time call annoucement
US7664237B1 (en) Method and apparatus for providing emergency ring tones for urgent calls
CN114285945B (zh) 一种视频交互方法、装置和存储介质
WO2013040832A1 (zh) 在总机业务中实现话务员插入通话的方法、装置和系统
EP2736212B1 (en) Method and system for implementing broadcast group call in click to dial service
CN114979386A (zh) 小程序语音通信方法、装置、电子设备、存储介质
US8625577B1 (en) Method and apparatus for providing audio recording
CN102664863A (zh) 终端实现呼叫等待的方法、装置和系统
US8837459B2 (en) Method and apparatus for providing asynchronous audio messaging
US8130934B1 (en) Method and apparatus for providing network based muting of call legs
US20230247136A1 (en) Automated attendant that specifies audio transmission characteristics for calls
US7817782B1 (en) System and method to support a telecommunication device for the deaf (TDD) in a voice over internet protocol (VoIP) network

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2007817294

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07817294

Country of ref document: EP

Kind code of ref document: A1

WWP Wipo information: published in national office

Ref document number: 2007817294

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE