WO2007109950A1 - A method and system for realizing speech interaction - Google Patents

Publication number
WO2007109950A1
WO2007109950A1 PCT/CN2007/000188 CN2007000188W WO2007109950A1 WO 2007109950 A1 WO2007109950 A1 WO 2007109950A1 CN 2007000188 W CN2007000188 W CN 2007000188W WO 2007109950 A1 WO2007109950 A1 WO 2007109950A1
Authority
WO
WIPO (PCT)
Prior art keywords
mrs
user terminal
message
information
media channel
Application number
PCT/CN2007/000188
Other languages
French (fr)
Chinese (zh)
Inventor
Santosh Nath
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2007109950A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment

Definitions

  • the MRS does not send a 200 OK message after sending the 180 Ringing message
  • user terminal A establishes the RTP media channel directly with the MRS after receiving the 180 Ringing message from the MRS.
  • because no 200 OK message appears, the network operator cannot charge for the network resources occupied by the voice interaction process, which reduces the service provider's overhead.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A method for realizing speech interaction comprises: a media resource server (MRS) receives an Invite message sent by a user terminal, the Invite message carrying the SDP information of the user terminal; the MRS returns a ringing message to the user terminal, the ringing message carrying the SDP information of the MRS; and the user terminal and the MRS establish a Real-time Transport Protocol (RTP) media channel according to the SDP information exchanged by both sides and perform speech interaction through the established RTP media channel. A system for realizing speech interaction is also provided. The method and system establish the RTP media channel early and prevent the network operator from charging for the speech interaction between the user terminal and the MRS, thereby reducing the operating cost of the service provider.

Description

Method and system for realizing voice interaction
Technical field
The present invention relates to Next Generation Network (NGN) technology, and in particular to a method and system for implementing voice interaction in an NGN.
Background of the invention
The Session Initiation Protocol (SIP) is an application-layer control (signaling) protocol proposed by the Internet Engineering Task Force (IETF). As its name implies, SIP is mainly used to initiate sessions. Specifically, SIP can be used to create, modify, and terminate multimedia sessions with multiple participants, and it is one of the core protocols of NGN.
A session here is an application-level association between two SIP nodes. A SIP node can be a user agent (UA), a SIP proxy, a registration server, a back-to-back user agent, a location server, and so on. A back-to-back user agent is generally referred to as an application server (AS). The application server belongs to a service provider and is mainly used to provide value-added services to end users.
In an NGN, the Media Resource Server (MRS) was introduced to support media resources such as centralized announcement playback, digit collection, and conferencing in ordinary and intelligent calls. The MRS provides voice interaction functions such as Play Announcement (PA) and Play Announcement and Collect Digits (PAC).
In an intelligent call, the MRS and the user terminal perform voice interaction over a Real-time Transport Protocol (RTP) media channel. To establish the RTP media channel between the user terminal and the MRS, the user terminal, as the calling party, loads its device capability information (for example, voice and video capabilities) together with the Internet Protocol (IP) address and port of the RTP media channel into a Session Description Protocol (SDP) packet, and sends the SDP packet to the MRS, the called party, as the body of a SIP message. Likewise, the MRS, as the called party, loads its device capability information and the IP address and port of its RTP media channel into an SDP packet and sends it to the calling user terminal as the body of a SIP message, so that the two sides can negotiate. The user terminal and the MRS then establish the RTP media channel according to the negotiation result. Under the SIP protocol, the calling party may carry its SDP packet in the Invite message or the ACK message sent to the MRS, and the called party may carry its SDP packet in the 180 Ringing or 200 OK message returned to the user terminal.
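The SDP packet and the capability negotiation described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `MediaCapability` and `SdpPacket` types, the `negotiate` helper, and the example addresses are all hypothetical, and only audio codecs are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class MediaCapability:
    """One audio codec a device can handle (hypothetical helper type)."""
    payload_type: int   # RTP payload type number, e.g. 0 for PCMU
    encoding: str       # encoding name and clock rate, e.g. "PCMU/8000"


@dataclass
class SdpPacket:
    """Device capabilities plus the IP address and port of the RTP channel."""
    ip: str
    rtp_port: int
    codecs: list[MediaCapability] = field(default_factory=list)

    def to_text(self) -> str:
        # Minimal SDP body: origin, connection address, one audio media line.
        payloads = " ".join(str(c.payload_type) for c in self.codecs)
        lines = [
            "v=0",
            f"o=- 0 0 IN IP4 {self.ip}",
            "s=-",
            f"c=IN IP4 {self.ip}",
            "t=0 0",
            f"m=audio {self.rtp_port} RTP/AVP {payloads}",
        ]
        lines += [f"a=rtpmap:{c.payload_type} {c.encoding}" for c in self.codecs]
        return "\r\n".join(lines) + "\r\n"


def negotiate(offer: SdpPacket, answer: SdpPacket) -> list[MediaCapability]:
    """Return the codecs both sides support, in the offerer's order."""
    supported = {c.payload_type for c in answer.codecs}
    return [c for c in offer.codecs if c.payload_type in supported]
```

For instance, if the terminal offers PCMU and PCMA and the MRS answers with PCMA only, `negotiate` returns the single PCMA entry, and both sides know where to send RTP from the `c=` and `m=` lines.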
The MRS then provides the voice interaction function to the user terminal over the established RTP media channel, for example by playing a recorded announcement, or by collecting the digits dialed by the user while playing the announcement; that is, it performs a PA or PAC operation.
After completing the voice interaction function, the MRS submits the collected user data to the AS in a SIP INFO message or an HTTP GET message.
The AS may then send a further voice interaction request to the MRS, in which case the MRS carries out the requested voice interaction with the user terminal over the already established RTP media channel.
Once the AS has obtained sufficient user data, it instructs the MRS to tear down the RTP media channel established with the user terminal.
In this way, the voice interaction function lets the AS obtain sufficient user data and provide the requested intelligent service to the user according to that data. For a prepaid intelligent call, for example, after the AS obtains the prepaid user's card number, password, and the called user's number through the voice interaction between the MRS and the user terminal, the AS routes the intelligent call to the called party according to the called number and, after the call ends, charges for the call against the prepaid user's card number and password.
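The repeated interaction described above, where the AS keeps requesting PAC operations until it has enough data, can be sketched as a simple loop. This is an illustrative assumption about how an AS might drive the MRS: the `PROMPTS` order, the `collect_user_data` function, and the `play_and_collect` callback are hypothetical names, not part of the patent.

```python
from typing import Callable

# The three items the prepaid example collects, in a hypothetical order.
PROMPTS = ["card_number", "password", "called_number"]


def collect_user_data(play_and_collect: Callable[[str], str]) -> dict[str, str]:
    """Run one PAC interaction per item over the same media channel.

    play_and_collect stands in for the MRS: it plays the announcement for
    the named item and returns the digits the user dialed.
    """
    data: dict[str, str] = {}
    for item in PROMPTS:
        digits = play_and_collect(item)
        if not digits.isdigit():
            raise ValueError(f"no valid digits collected for {item}")
        data[item] = digits
    # With every item collected, the AS can ask the MRS to tear down the channel.
    return data
```

Passing the MRS in as a callback keeps the sketch independent of whether the data comes back in a SIP INFO or an HTTP GET message.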
In the above process, when the RTP media channel is established between the user terminal and the MRS and used for voice interaction, it occupies real network resources, so the network operator usually charges the service provider that offers the intelligent service. In the existing intelligent call flow, the network operator starts charging for the current call once the called party sends a 200 OK message to the calling party to indicate that it has answered. For the intelligent calls described above, however, the service provider should charge the user only while the user is actually enjoying the intelligent service, not while the intelligent call is being set up. For a prepaid intelligent service, for example, in an intelligent call with the prepaid user as the calling party, the service provider should charge the prepaid user only for the conversation with the called party, and not for the period during which the card number, password, and called number are collected through voice interaction between the prepaid user and the MRS. As a result, the service provider itself has to pay the network operator for the network resources occupied by the voice interaction between the user terminal and the MRS, which greatly increases the service provider's operating cost.
Summary of the invention
An embodiment of the present invention provides a method for implementing a voice interaction function, so that the network operator does not charge the service provider for the network resources occupied by the voice interaction between the user terminal and the MRS during an intelligent call, thereby saving the service provider's operating cost.
The method for implementing voice interaction according to the embodiment of the present invention includes:
a media resource server (MRS) receives an Invite message sent by a user terminal, the Invite message carrying the Session Description Protocol (SDP) information of the user terminal;
the MRS returns a ringing message to the user terminal, the ringing message carrying the SDP information of the MRS;
the user terminal and the MRS establish a Real-time Transport Protocol (RTP) media channel according to the SDP information exchanged by the two sides, and perform voice interaction through the RTP media channel.
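The three method steps above can be sketched from the MRS side as follows. This is a simplified model under stated assumptions: `SipMessage` and `MediaResourceServer` are hypothetical types, SDP bodies are treated as opaque strings, and no real SIP stack is involved.

```python
from dataclasses import dataclass


@dataclass
class SipMessage:
    """Simplified SIP message: a kind label and an optional SDP body."""
    kind: str          # e.g. "INVITE", "180 Ringing", "200 OK"
    sdp: str = ""


class MediaResourceServer:
    """Hypothetical MRS that answers an Invite with SDP in the ringing message."""

    def __init__(self, local_sdp: str):
        self.local_sdp = local_sdp
        self.remote_sdp = None      # filled in when an Invite arrives

    def on_invite(self, invite: SipMessage) -> SipMessage:
        if invite.kind != "INVITE" or not invite.sdp:
            raise ValueError("expected an INVITE carrying SDP")
        self.remote_sdp = invite.sdp
        # The MRS SDP travels in the ringing message; no 200 OK is sent here.
        return SipMessage("180 Ringing", self.local_sdp)

    def media_channel_ready(self) -> bool:
        # Both SDP descriptions are known once the Invite has been processed.
        return self.remote_sdp is not None
```

After `on_invite` returns, both sides hold each other's SDP information, which is the precondition the method relies on for setting up the RTP media channel.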
Another embodiment of the present invention provides a system for implementing voice interaction, including a user terminal and a media resource server (MRS), wherein:
the user terminal sends an Invite message to the MRS, carrying the SDP information of the user terminal in the Invite message; receives the ringing message returned by the MRS in response to the Invite message; obtains the SDP information of the MRS carried in the ringing message; establishes an RTP media channel with the MRS according to the SDP information of both sides; and performs voice interaction through the established RTP media channel.
It can be seen that, in the method and system of the embodiments of the present invention, the user terminal and the MRS negotiate device capabilities through the Invite message and the ringing message, so the RTP media channel can be established early and the message exchange is simplified.
In addition, because the MRS does not return a 200 OK message before the RTP media channel is established between the user terminal and the MRS, the network operator does not charge for the subsequent voice interaction, which lowers the service provider's operating cost.
Furthermore, with the method and system provided by the present invention, the network operator can choose different charging policies according to the nature of the intelligent service: charging can start as soon as the calling user dials the access code of the intelligent service, or only after the intelligent call is actually established (for example, after the AS returns a 200 OK message to the calling user). Dialing some intelligent service access codes is therefore free for the calling user, while dialing others is charged. For example, the 200 service is a free intelligent access code, while 17930 is a charged intelligent access code.
Brief description of the drawings
FIG. 1 is a message flow diagram for implementing the voice interaction function according to a preferred embodiment of the present invention.
Mode for carrying out the invention
To make the technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawing and an embodiment.
In the existing intelligent call process, the network operator starts charging for the current call once the called party answers with a 200 OK message. To keep the network operator from charging for the network resources occupied by voice interaction between the user terminal and the MRS, the message flow that implements voice interaction in the existing intelligent call can therefore be improved.
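The charging trigger discussed above, combined with the two access codes named in the summary, can be sketched as a small policy lookup. This is an interpretation rather than anything the patent specifies: the `ChargingStart` enum, the `POLICY` table, and the mapping of the free code to "charge on 200 OK" and the charged code to "charge on dialing" are assumptions made for illustration.

```python
from enum import Enum


class ChargingStart(Enum):
    ON_ACCESS_CODE = "when the access code is dialed"
    ON_200_OK = "when 200 OK is returned to the calling user"


# Hypothetical policy table built from the two access codes named in the text.
POLICY = {
    "17930": ChargingStart.ON_ACCESS_CODE,  # charged access code
    "200": ChargingStart.ON_200_OK,         # free access code
}


def charging_started(access_code: str, messages: list[str]) -> bool:
    """Report whether charging has begun after the given SIP messages."""
    if POLICY[access_code] is ChargingStart.ON_ACCESS_CODE:
        return True
    # Otherwise charging waits for the 200 OK that marks a real answer.
    return any(m.startswith("200 OK") for m in messages)
```

Under this sketch, a caller on the free code is not charged while only Invite and 180 Ringing messages have been exchanged, which is exactly the window in which the voice interaction with the MRS takes place.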
The specific message flow for implementing voice interaction between the MRS and the user terminal according to the preferred embodiment of the present invention is shown in FIG. 1 and mainly includes:
101. After the user dials an intelligent call number on user terminal A, user terminal A sends an Invite message requesting the network to provide the intelligent service corresponding to that number; the Invite message is then routed to the AS responsible for handling this intelligent call.
In this step, user terminal A, as the calling party, loads its device capability information (for example, voice and video capabilities) and the IP address and port of the RTP media channel into an SDP packet, carries the SDP packet in the Invite message, and sends it to the MRS as the called party.
102. After determining that the requested intelligent service requires a voice interaction function such as PA/PAC, the AS forwards the Invite message to an MRS that provides the voice interaction function.
103. The MRS sends a 180 Ringing message to the AS, indicating that the MRS is ready to provide the required voice interaction function.
In this step, the MRS, as the called party, loads its device capability information (for example, voice and video capabilities) and the IP address and port of the RTP media channel into an SDP packet, carries the SDP packet in the 180 Ringing message, and sends it to user terminal A as the calling party.
104. The AS forwards the received 180 Ringing message to user terminal A.
Through steps 101 to 104 above, user terminal A and the MRS complete the negotiation of device capabilities.
105、所述用户终端 A以及 MRS根据设备能力的协商结果建立 RTP 媒体通道。  105. The user terminal A and the MRS establish an RTP media channel according to the negotiation result of the device capability.
106、 所述 MRS通过建立的 RTP媒体通道为所述用户终端 A提供 语音交互功能。  106. The MRS provides a voice interaction function to the user terminal A through the established RTP media channel.
所述语音交互功能包括: PA或 PAC操作, 例如开始播放录音通知 或者在播放录音通知的同时进行用户所拨打号码的收集等等。  The voice interaction function includes: PA or PAC operation, for example, starting to play a recording notification or collecting the number dialed by the user while playing the recording notification.
107、在完成一次语音交互功能后,所述 MRS使用 SIP INFO或 HTTP GET消息将收集到的用户数据提交给所述 AS。  107. After completing a voice interaction function, the MRS submits the collected user data to the AS by using a SIP INFO or an HTTP GET message.
如果在本次呼叫过程中 AS请求下一次语音交互过程, 则 AS可以 通过借道法( Piggybacking )利用 SIP INFO或 HTTP GET消息将下一次 的语音交互请求发送到 MRS。 这样, 在 MRS在接收到新的语音交互请 求后将返回上述步骤 206,通过已建立的 RTP连接与用户终端 A进行下 一次语音交互过程。  If the AS requests the next voice interaction process during the call, the AS can use the SIP INFO or HTTP GET message to send the next voice interaction request to the MRS through Piggybacking. In this way, after receiving the new voice interaction request, the MRS returns to the above step 206 to perform the next voice interaction process with the user terminal A through the established RTP connection.
108. After the AS has obtained sufficient user data, it instructs the MRS to tear down the RTP media channel established with user terminal A.
In this way, once the AS has obtained sufficient user data through one or more of the voice interactions described above, it provides the requested intelligent service to user terminal A. For a prepaid intelligent call, for example, the AS obtains the prepaid user's card number and password and the called user's number through the voice interaction between the MRS and the user terminal. The AS then routes the intelligent call to the called party according to the called user's number and, after the call ends, deducts the cost of the call between the calling and called users from the account identified by the prepaid user's card number and password.
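The prepaid example can be sketched in Python as follows. The account store, the per-minute rate, the rounding of partial minutes, and all names are assumptions introduced for illustration. The patent states only that the cost of the call is deducted from the account matching the card number and password. Amounts are kept in integer cents to avoid floating-point rounding.

```python
class PrepaidAccounts:
    """Hypothetical in-memory store of prepaid accounts, keyed by card number."""

    def __init__(self):
        self._accounts = {}

    def add(self, card_number, password, balance_cents):
        self._accounts[card_number] = {"password": password, "balance": balance_cents}

    def balance(self, card_number):
        return self._accounts[card_number]["balance"]

    def charge(self, card_number, password, call_seconds, rate_cents_per_minute):
        """Deduct the call cost after the call ends; started minutes are billed in full."""
        account = self._accounts.get(card_number)
        if account is None or account["password"] != password:
            raise PermissionError("unknown card number or wrong password")
        minutes = -(-call_seconds // 60)          # integer ceiling division
        cost = minutes * rate_cents_per_minute
        account["balance"] -= cost
        return cost


# Hypothetical data collected through the voice interaction.
accounts = PrepaidAccounts()
accounts.add("1234567890", "4321", balance_cents=1000)
cost = accounts.charge("1234567890", "4321", call_seconds=125, rate_cents_per_minute=10)
```

A production system would also check the balance before routing the call, which this sketch leaves out.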
As the message flow above for establishing the voice interaction function shows, in the method of this embodiment user terminal A sends its device capability information to the MRS in the Invite message, and the MRS sends its device capability information to user terminal A in the 180 Ringing message. The MRS and the user terminal do not need to exchange a 200 OK message and an ACK message. The device capability information of user terminal A and the MRS is therefore exchanged earlier, the RTP media channel is established in advance, and the information exchange flow is simplified.
In addition, in the method of this embodiment the MRS does not send a 200 OK message after sending the 180 Ringing message. On receiving the 180 Ringing message from the MRS, user terminal A completes establishment of the RTP media channel with the MRS directly, without waiting for a 200 OK message from the MRS. No 200 OK message therefore appears while user terminal A and the MRS establish the RTP media channel, so the network operator cannot charge for the network resources occupied by this voice interaction, which reduces the service operator's costs.
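The terminal-side behaviour described above can be sketched as a small state machine in Python. The class, state names, and method names are assumptions for illustration. The point shown is only that the media channel is considered established as soon as a 180 Ringing response carrying SDP arrives, with no 200 OK or ACK step.

```python
class UserTerminal:
    """Hypothetical caller that sets up media on 180 Ringing instead of 200 OK."""

    def __init__(self, local_sdp):
        self.local_sdp = local_sdp
        self.remote_sdp = None
        self.state = "idle"
        self.sent = []   # messages sent by this terminal, for inspection

    def send_invite(self):
        # The Invite message carries the terminal's own SDP information.
        self.sent.append(("INVITE", self.local_sdp))
        self.state = "inviting"

    def on_response(self, status_code, sdp=None):
        if self.state == "inviting" and status_code == 180 and sdp is not None:
            # Both SDP descriptions are now known: establish the RTP media
            # channel immediately, without waiting for a 200 OK message.
            self.remote_sdp = sdp
            self.state = "media_established"
        return self.state


# Hypothetical SDP fragments stand in for full descriptions.
terminal = UserTerminal("m=audio 5004 RTP/AVP 0 8")
terminal.send_invite()
final_state = terminal.on_response(180, "m=audio 40000 RTP/AVP 0 8")
```

In this sketch the terminal never sends an ACK, since no final response arrives during media channel setup.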
The above describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for implementing voice interaction, characterized in that the method comprises:
a media resource server (MRS) receiving an invite message sent by a user terminal, the invite message carrying Session Description Protocol (SDP) information of the user terminal;
the MRS returning a ringing message to the user terminal, the ringing message carrying SDP information of the MRS; and
the user terminal and the MRS establishing a Real-time Transport Protocol (RTP) media channel according to the SDP information exchanged by both parties, and performing voice interaction through the RTP media channel.
2. The method according to claim 1, characterized in that the method further comprises: the MRS collecting user data and submitting the collected user data to an application server (AS).
3. The method according to claim 2, characterized in that the method further comprises: the AS providing the requested intelligent service to the user terminal according to the collected user data.
4. The method according to claim 1, characterized in that the SDP information comprises: device capability information and corresponding port information, and the Internet Protocol address of the RTP media channel.
5. The method according to claim 4, characterized in that the device capability information comprises: voice capability information and video capability information of the device.
6. The method according to claim 1 or 2, characterized in that the voice interaction is playing an announcement, or playing an announcement and collecting a number.
7. A system for implementing voice interaction, characterized by comprising a user terminal and a media resource server (MRS), wherein:
the user terminal sends an invite message to the MRS, the invite message carrying Session Description Protocol (SDP) information of the user terminal; receives a ringing message returned by the MRS in response to the invite message; obtains the SDP information of the MRS carried in the ringing message; establishes a Real-time Transport Protocol (RTP) media channel with the MRS according to the SDP information of both parties; and performs voice interaction through the established RTP media channel.
8. The system according to claim 7, characterized by further comprising an application server (AS), wherein the AS is configured to receive the collected user data that the MRS submits, after completing the voice interaction, using a Session Initiation Protocol information message (SIP INFO) or a Hypertext Transfer Protocol get message (HTTP GET), and to provide the requested intelligent service to the user terminal according to the user data from the MRS.
9. The system according to claim 7, characterized in that the SDP information comprises: device capability information and corresponding port information, and the Internet Protocol address of the RTP media channel.
10. The system according to claim 9, characterized in that the device capability information comprises: voice capability information and video capability information of the device.
PCT/CN2007/000188 2006-03-27 2007-01-18 A method and system for realizing speech interaction WO2007109950A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 200610065193 CN100486282C (en) 2006-03-27 2006-03-27 Method for realizing interactive voice
CN200610065193.4 2006-03-27

Publications (1)

Publication Number Publication Date
WO2007109950A1 true WO2007109950A1 (en) 2007-10-04

Family

ID=37298367

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/000188 WO2007109950A1 (en) 2006-03-27 2007-01-18 A method and system for realizing speech interaction

Country Status (2)

Country Link
CN (1) CN100486282C (en)
WO (1) WO2007109950A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009135375A1 (en) * 2008-05-07 2009-11-12 中兴通讯股份有限公司 Call establishing method for realizing the single dialog color ring service
CN105306420B (en) * 2014-06-27 2019-08-30 中兴通讯股份有限公司 Realize the method, apparatus played from Text To Speech cycle of business operations and server
CN115842808A (en) * 2021-08-04 2023-03-24 中国移动通信有限公司研究院 Call interaction method, device, network node and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1567950A (en) * 2003-06-25 2005-01-19 中兴通讯股份有限公司 Method for implementing telephone conference service by using media resource server
JP2005252939A (en) * 2004-03-08 2005-09-15 Oki Electric Ind Co Ltd Interworking apparatus
KR20060017019A (en) * 2004-08-19 2006-02-23 주식회사 케이티 A session control method and apparatus in ip based voice/videophone network
KR20060025458A (en) * 2004-09-16 2006-03-21 주식회사 케이티 System for monitoring the voice call in next generation network and method thereof

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114978485A (en) * 2022-04-21 2022-08-30 中国电信股份有限公司 Voice data transmission method, system, electronic device and storage medium
CN114978485B (en) * 2022-04-21 2023-09-08 中国电信股份有限公司 Voice data transmission method, system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN100486282C (en) 2009-05-06
CN1859506A (en) 2006-11-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 07702120; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 07702120; Country of ref document: EP; Kind code of ref document: A1)