WO2016169319A1 - Service triggering method, apparatus, system, and media server - Google Patents

Service triggering method, apparatus, system, and media server

Info

Publication number
WO2016169319A1
Authority
WO
WIPO (PCT)
Prior art keywords
service
media channel
voice recognition
media
resource
Prior art date
Application number
PCT/CN2016/073370
Other languages
English (en)
French (fr)
Inventor
张伟
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016169319A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications

Definitions

  • the present invention relates to the field of communications, and in particular to a service triggering method, apparatus, system, and media server.
  • the Media Server is a stand-alone device that provides dedicated media resource functions in the softswitch system. It is also an important device in the packet network, providing media processing functions in basic and enhanced services, including DTMF signal acquisition and decoding, and signal tone generation.
  • the media server performs a playback service, starts playing a prompt tone, and waits for receiving data transmitted by the terminal during the playback of the prompt tone.
  • the DTMF number packet sent by the terminal is received through the media channel corresponding to the number receiving service, the number information is obtained by parsing the number packet, and the number information is reported to the application server.
  • with speech recognition technology, it is also possible to trigger a service by means of speech recognition.
  • an existing service triggering system can support only one of the number receiving mode and the voice recognition mode to trigger a service.
  • the user can only passively accept the triggering mode supported by the service triggering system. As a result, the user has little choice and the limitation is obvious.
  • the invention provides a service triggering method, apparatus, system, and media server, so as to at least solve the problem in the related art that a service can only be triggered passively through one of the number receiving mode and the voice recognition mode.
  • a service triggering method is provided, including: creating a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number receiving service; receiving a trigger signal for triggering a service while a prompt tone is played through the media channel of the playback service; obtaining the information contained in the trigger signal through the media channel of the number receiving service or the media channel of the voice recognition service; and reporting the obtained information to the application server.
  • obtaining the information contained in the trigger signal through the media channel of the number receiving service or the media channel of the voice recognition service includes: when the category of the trigger signal is a number data packet, obtaining the number information contained in the number data packet through the media channel of the number receiving service; when the category of the trigger signal is a voice data packet, obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service.
  • creating the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number receiving service includes: receiving a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number receiving service; setting the creation order of the media channels of the playback service, the voice recognition service, and the number receiving service; and, according to the set creation order, sequentially creating the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number receiving service.
  • the set creation order is: the media channel of the playback service, then the media channel of the voice recognition service, then the media channel of the number receiving service.
  • obtaining the number information contained in the number data packet through the media channel of the number receiving service includes: receiving the number data packet through the media channel of the number receiving service; and parsing the number data packet to obtain the number information it contains.
  • obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service includes: sending the voice data packet to the voice recognition server through the media channel of the voice recognition service; receiving a notification sent by the voice recognition server, where the notification indicates that the voice recognition server has detected voice input; disconnecting the media channels of the playback service and the number receiving service; and receiving the text information corresponding to the voice data packet returned by the voice recognition server.
  • creating the media channel of the playback service includes: receiving resource application information, where the resource application information is used to reserve a playback service resource; and opening the media channel of the reserved playback service resource.
  • creating the media channel of the voice recognition service includes: receiving resource application information, where the resource application information is used to reserve a voice recognition service resource; and opening the media channel of the reserved voice recognition service resource.
  • creating the media channel of the number receiving service includes: receiving resource application information, where the resource application information is used to reserve a number receiving service resource; and opening the media channel of the reserved number receiving service resource.
  • the method further includes: releasing the resources reserved for the playback service, the number receiving service, and the voice recognition service; closing the media channels of the playback service resource, the number receiving service resource, and the voice recognition service resource; and closing the voice recognition channel between the media server and the voice recognition server.
  • the method further includes: releasing the resource reserved for the voice recognition service; closing the media channel of the voice recognition service resource; and closing the voice recognition channel between the media server and the voice recognition server.
  • a service triggering apparatus is provided, including: a creating module, configured to create a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number receiving service; a receiving module, configured to receive a trigger signal for triggering a service while a prompt tone is played through the media channel of the playback service; an acquiring module, configured to acquire the information contained in the trigger signal through the media channel of the number receiving service or the media channel of the voice recognition service; and a sending module, configured to report the obtained information to the application server.
  • the acquiring module includes: a first acquiring submodule, configured to obtain the number information contained in the number data packet through the media channel of the number receiving service when the category of the trigger signal is a number data packet; and a second acquiring submodule, configured to obtain the text information corresponding to the voice data packet through the media channel of the voice recognition service when the category of the trigger signal is a voice data packet.
  • the creating module includes: a first receiving unit, configured to receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number receiving service; a setting unit, configured to set the creation order of the media channels of the playback service, the voice recognition service, and the number receiving service; and a creating unit, configured to sequentially create the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number receiving service according to the set creation order.
  • the first acquiring submodule includes: a second receiving unit, configured to receive the number data packet through the media channel of the number receiving service; and a first acquiring unit, configured to parse the number data packet and obtain the number information it contains.
  • the second acquiring submodule includes: a sending unit, configured to send the voice data packet to the voice recognition server through the media channel of the voice recognition service; a processing unit, configured to disconnect the media channels of the playback service and the number receiving service; and a third receiving unit, configured to receive the text information corresponding to the voice data packet returned by the voice recognition server.
  • the creating unit includes: a receiving subunit, configured to receive resource application information, where the resource application information is used to reserve a playback service resource, a voice recognition service resource, or a number receiving service resource; and an opening subunit, configured to open the media channel of the reserved playback service resource, voice recognition service resource, or number receiving service resource.
  • a media server is provided, and the media server includes any one of the service triggering devices in the embodiment of the present invention.
  • a service triggering system including: any one of the media servers and the application server in the embodiment of the present invention, where the application server is configured to receive number information or text information sent by the media server.
  • in the embodiment of the present invention, a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number receiving service are created; a trigger signal for triggering a service is received while the prompt tone is played through the media channel of the playback service; when the category of the trigger signal is a number data packet, the number information contained in the number data packet is obtained through the media channel of the number receiving service; when the category of the trigger signal is a voice data packet, the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service; and the number information or the text information is reported to the application server. This solves the problem in the related art that a service can only be triggered passively through one of the number receiving mode and the voice recognition mode, and achieves the effect of actively selecting the number receiving mode or the voice recognition mode according to the input trigger signal.
  • FIG. 1 is a flowchart of a service triggering method according to an embodiment of the present invention.
  • FIG. 2 is a sequence diagram of a service triggering method when a category of a trigger signal is a number data packet according to an embodiment of the present invention.
  • FIG. 3 is a sequence diagram of a service triggering method when a category of a trigger signal is a voice data packet according to an embodiment of the present invention.
  • FIG. 4 is a code flow diagram of a service triggering method according to an embodiment of the present invention.
  • FIG. 5 is a structural block diagram of a service triggering apparatus according to an embodiment of the present invention.
  • FIG. 6 is a structural block diagram of a media server according to an embodiment of the present invention.
  • FIG. 7 is a structural block diagram of a service triggering system according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a service triggering method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:
  • Step S102: creating a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number receiving service.
  • the media channel of the playback service may be a channel of the resources used for the playback service in the media server; the media channel of the voice recognition service may be a channel of the resources used for the voice recognition service in the media server; and the media channel of the number receiving service may be a channel of the resources used for the number receiving service in the media server.
  • the playback service, or play service, enables the media server to play a prompt tone and to wait for the user to input number information or voice information while the prompt tone is playing; the voice recognition service, or speech service, receives the voice information input by the user and recognizes it; the number receiving service, or dtmf service, obtains the user's dialing information.
  • Step S104 Receive a trigger signal for triggering a service when playing a prompt tone through a media channel of the playback service.
  • the trigger signal may include a number data packet or a voice data packet, the number data packet includes user dialing information, and the voice data packet includes user voice input information.
  • the trigger signal can also be used to interrupt the prompt sound being played.
  • Step S106: acquiring the information contained in the trigger signal through the media channel of the number receiving service or the media channel of the voice recognition service.
  • if the media server creates only the media channel of the playback service and the media channel of the number receiving service, the only trigger signal that can be identified is the number data packet; that is, the user can only interrupt the playback and trigger the service by dialing, and voice input does not produce a correct response.
  • if the media server creates only the media channel of the playback service and the media channel of the voice recognition service, the only trigger signal that can be recognized is the voice data packet; that is, the user can only interrupt the playback and trigger the service by voice input, and dialing does not produce a correct response.
  • in the embodiment of the present invention, the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number receiving service are all open, so the user can actively choose to generate the trigger signal either by dialing or by voice input.
  • when the user generates the trigger signal by dialing, the information contained in the trigger signal is obtained through the media channel of the number receiving service; when the user generates the trigger signal by voice input, the information contained in the trigger signal is obtained through the media channel of the voice recognition service.
  • step S108 the information included in the obtained trigger signal is reported to the application server.
  • the media server reports the obtained information to the application server to trigger the application server to start the next service.
  • in this embodiment, while the prompt tone is played, the service is triggered according to the input number data packet or voice data packet. This solves the problem in the related art that the upper-layer service can only be triggered passively through one of the number receiving mode and the voice recognition mode, and improves the flexibility of service triggering.
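The flow of steps S102 to S108 can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class `MediaServerSketch`, the `recognize` callable standing in for the voice recognition server, and the `reported` list standing in for reporting to the application server are all hypothetical names that do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class MediaServerSketch:
    # Hypothetical stand-in for the voice recognition server.
    recognize: Callable[[bytes], str]
    # Hypothetical stand-in for messages reported to the application server.
    reported: List[Tuple[str, str]] = field(default_factory=list)
    channels: List[str] = field(default_factory=list)

    def create_channels(self) -> None:
        # Step S102: playback, voice recognition, number receiving.
        self.channels = ["play", "speech", "dtmf"]

    def on_trigger(self, category: str, payload: bytes) -> None:
        # Step S104: a trigger signal arrives while the prompt tone plays.
        if "play" not in self.channels:
            raise RuntimeError("media channels have not been created")
        # Step S106: pick the channel that matches the signal category.
        if category == "number" and "dtmf" in self.channels:
            info = ("number", payload.decode("ascii"))
        elif category == "voice" and "speech" in self.channels:
            info = ("text", self.recognize(payload))
        else:
            raise ValueError(f"unsupported trigger category: {category}")
        # Step S108: report the obtained information.
        self.reported.append(info)


server = MediaServerSketch(recognize=lambda audio: "check balance")
server.create_channels()
server.on_trigger("number", b"123#")
server.on_trigger("voice", b"\x00\x01")
```

Because all three channels exist, either category of trigger signal is handled; with only two channels created, one of the branches would be unreachable, which is the limitation the text attributes to the related art.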
  • a media server is an independent device that provides dedicated media resource functions in a softswitch system and is also an important device in a packet network. It provides media processing functions for basic and enhanced services, including resource functions such as DTMF signal acquisition and decoding, generation and transmission of signal tones, transmission of recorded announcements, conferencing, and conversion between different codec algorithms, as well as communication functions and management and maintenance functions.
  • the media server in the embodiment of the present invention includes: a service control unit, referred to as Call; a logical resource management module, referred to as ResManage; and a resource board processing module, referred to as ResBoard.
  • the resource board processing module further includes a media storage transmission audio unit, referred to as MSTU in the embodiment of the present invention, and a media processing unit, referred to as MRU in the embodiment of the present invention.
  • the service control unit is an important unit in the media server; it mainly performs capability negotiation with other entities and provides management, maintenance, and control of the resources themselves in order to complete complex services. The media storage transmission audio unit is the service resource unit of the media server; it stores large volumes of audio data and implements audio file playback and recording. It has an external network port through which data can be sent and received directly. The media processing unit handles audio codec conversion, DTMF, and conference mixing.
  • the application server or APP for short, is responsible for the logic generation and management of various value-added services and intelligent services, and also provides various open APIs to provide a creation platform for the development of third-party services.
  • the application server is a separate component, which has nothing to do with the softswitch of the control layer, thus separating the service from the call control and facilitating the introduction of new services.
  • the step S106 is: acquiring information included in the trigger signal by using the media channel of the number-receiving service or the media channel of the voice recognition service, including:
  • Step S106a When the category of the trigger signal is a number data packet, obtain the number information included in the number data packet by using the media channel of the number receiving service;
  • Step S106b When the category of the trigger signal is a voice data packet, the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service.
  • in the embodiment of the present invention, the media channels of the playback service, the voice recognition service, and the number receiving service are already open, so the user can freely choose either the dialing mode or the voice mode to generate the trigger signal. When the user generates the trigger signal by dialing, the information contained in the trigger signal is obtained through the media channel of the number receiving service; when the user generates the trigger signal by voice, the information contained in the trigger signal is obtained through the media channel of the voice recognition service.
  • the dialing behavior of the user and the voice input behavior of the user are distinguished by time. If the user is detected to dial first, the number information is obtained through the media channel of the number receiving service. If the user is detected to input voice first, the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service.
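The time-based distinction above amounts to "the earliest input wins". A minimal sketch, assuming each detected input is represented as a hypothetical `(arrival_time, category)` pair and that the channel names `dtmf` and `speech` follow the service names used in the info message:

```python
from typing import Iterable, Tuple

# Hypothetical mapping from trigger category to the media channel that handles it.
CHANNEL_FOR_CATEGORY = {"number": "dtmf", "voice": "speech"}


def select_channel(events: Iterable[Tuple[float, str]]) -> str:
    """Return the channel for whichever input was detected first."""
    events = list(events)
    if not events:
        raise ValueError("no trigger signal received")
    # The event with the smallest arrival time decides the triggering mode.
    _, category = min(events, key=lambda event: event[0])
    return CHANNEL_FOR_CATEGORY[category]
```

For example, `select_channel([(2.4, "voice"), (1.1, "number")])` selects the `dtmf` channel because the dialing was detected before the voice input.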
  • the number information or the text information is reported to the application server, which triggers the application server to deliver a new service. Specifically, after the media server reports the number information or the text information to the application server, the media server receives a report completion message returned by the application server, and the application server delivers a new service to the media server according to the reported number information or text information.
  • step S102 is: creating a media channel of the voice playback service, a media channel of the voice recognition service, and a media channel of the number collection service, including:
  • Step S1022 Receive a service message sent by the application server, where the service message includes a voice playback service, a voice recognition service, and a number collection service;
  • Step S1024 setting a sequence of creating media channels of the voice playback service, the voice recognition service, and the number collection service;
  • Step S1026 sequentially create a media channel of the sound reproduction service, a media channel of the voice recognition service, and a media channel of the number collection service according to the set creation order.
  • the application server sends an info message to the media server. The message takes the form of a group and includes a play service, a dtmf service, and a speech service, where the play service is the playback service, the dtmf service is the number receiving service, and the speech service is the voice recognition service.
  • the media server parses the info message according to the SIP and XML protocols and sets the execution order of the services to play, speech, dtmf, so that the media channels of the three services can be created in that order in the subsequent process.
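As a rough illustration of setting the execution order, the sketch below parses an XML body and returns the services in the order play, speech, dtmf regardless of how they appear in the message. The XML layout (a `group` element with one child per service) is a hypothetical assumption; the patent does not give the actual schema of the info message.

```python
import xml.etree.ElementTree as ET
from typing import List

# Creation order stated in the text: playback, voice recognition, number receiving.
CREATION_ORDER = ["play", "speech", "dtmf"]


def ordered_services(xml_body: str) -> List[str]:
    """Return the requested services sorted into the fixed creation order."""
    root = ET.fromstring(xml_body)
    requested = {child.tag for child in root}
    missing = set(CREATION_ORDER) - requested
    if missing:
        raise ValueError(f"service message lacks: {sorted(missing)}")
    return [name for name in CREATION_ORDER if name in requested]


# Hypothetical message body; the element names are assumptions.
body = "<group><dtmf/><play/><speech/></group>"
services = ordered_services(body)
```

Fixing the order in one place keeps the later channel-creation steps independent of the order in which the application server happened to list the services.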
  • the method further includes step S1021: the media server performs media negotiation with the application server, including: the application server sends an invite message to the media server to request media negotiation with the media server; the service control unit in the media server applies to the logical resource management module for the first external port resource; the logical resource management module replies to the service control unit that the resource application succeeded; the application server and the media server complete media negotiation, and the media server sends an acknowledgement message, such as a 200 OK message, to the application server; the application server sends an acknowledgement message, such as an ack message, to the media server to confirm that the media negotiation was successful.
  • the set creation order is: the media channel of the playback service, then the media channel of the voice recognition service, then the media channel of the number receiving service.
  • the use of such media channel creation order is more conducive to the execution of the program and can save hardware resources.
  • obtaining the number information included in the number data packet by using the media channel of the number receiving service includes:
  • Step S106a1 receiving a number data packet by using a media channel of the number receiving service; specifically, receiving a number data packet by using the second import resource;
  • Step S106a3 parsing the number data packet, and acquiring the number information included in the number data packet; specifically, the DTMF decoding operation is completed by the media processing unit, and the number information included in the number data packet is obtained.
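The patent does not state how the number data packet is encoded. As one plausible illustration, the sketch below assumes the digits arrive as RTP telephone-event payloads in the RFC 4733 layout (8-bit event code, end bit, reserved bit, 6-bit volume, 16-bit duration). The function names are hypothetical, and the de-duplication rule is a simplification that ignores RTP timestamps.

```python
import struct
from typing import Iterable, Tuple

# RFC 4733 event codes 0-15 map to these DTMF symbols.
EVENT_TO_DIGIT = "0123456789*#ABCD"


def parse_telephone_event(payload: bytes) -> Tuple[str, bool, int, int]:
    """Decode one 4-byte telephone-event payload into (digit, end, volume, duration)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(EVENT_TO_DIGIT):
        raise ValueError(f"not a DTMF event code: {event}")
    end = bool(flags & 0x80)   # E bit marks the end of the key press
    volume = flags & 0x3F      # lower 6 bits carry the volume
    return EVENT_TO_DIGIT[event], end, volume, duration


def collect_digits(payloads: Iterable[bytes]) -> str:
    """Count each key press once, at its first end-marked packet."""
    digits = []
    last_was_end = False
    for payload in payloads:
        digit, end, _volume, _duration = parse_telephone_event(payload)
        if end and not last_was_end:
            digits.append(digit)
        last_was_end = end
    return "".join(digits)
```

End packets are commonly retransmitted, which is why the collector appends a digit only on the first end-marked packet of each press.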
  • step S106b acquiring text information corresponding to the voice data packet by using the media channel of the voice recognition service includes:
  • Step S106b1: sending the voice data packet to the voice recognition server through the media channel of the voice recognition service; specifically, the voice data packet is received through the second import resource and, after the necessary processing by the media processing unit, is sent to the voice recognition server through the second export resource and the first external interface unit.
  • Step S106b3: disconnecting the media channels of the playback service and the number receiving service; specifically, after receiving the notification sent by the voice recognition server indicating that the voice recognition server has detected voice input, the service control unit notifies the logical resource management module to release the media resources applied for by the playback service and the number receiving service, and notifies the resource board processing module to disconnect the media channels of those media resources.
  • the media resources applied for by the playback service include the first import resource, the first export resource, and the first pair of internal interface units. The media resources applied for by the number receiving service include the third import unit. Alternatively, when the media resource applied for by the number receiving service is the second import unit, the second import unit is not released at this time, because it also belongs to the media resources applied for by the voice recognition service.
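The release rule above, where a resource shared with the voice recognition service is kept, can be expressed as a set difference. This is a sketch only; the `RESOURCES` table and its identifier names are hypothetical labels for the units named in the text, shown for the variant in which the number receiving service shares the second import unit.

```python
from typing import Dict, Iterable, Set

# Hypothetical resource table for the shared-import-unit variant.
RESOURCES: Dict[str, Set[str]] = {
    "play": {"first_import", "first_export", "first_internal_interface"},
    "speech": {"second_external_interface", "second_import", "second_export"},
    "dtmf": {"second_import"},
}


def releasable(to_release: Iterable[str], still_active: Iterable[str],
               resources: Dict[str, Set[str]] = RESOURCES) -> Set[str]:
    """Resources of the released services that no active service still needs."""
    kept = set().union(*(resources[s] for s in still_active))
    freed = set().union(*(resources[s] for s in to_release))
    return freed - kept


# Voice input detected: release playback and number receiving, keep speech.
freed_now = releasable(["play", "dtmf"], ["speech"])
```

Here `freed_now` contains only the three playback resources; `second_import` stays reserved because the voice recognition service still uses it.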
  • Step S106b5: receiving the text information corresponding to the voice data packet returned by the voice recognition server. Specifically, after the voice recognition server completes recognition, it sends the result to the service control unit of the media server.
  • the voice recognition server, or ASR (Automatic Speech Recognition) server for short, recognizes the input audio, converts it into text, and sends the text information to the user through message reporting; the speech keyword is used for this service in the info message.
  • the media channel for creating the playback service includes:
  • Step S10261: receiving resource application information, where the resource application information is used to reserve a playback service resource, and the playback service resource is a resource used for the playback service; specifically, the logical resource management module receives an application to reserve the playback service resource, which includes a first pair of internal interface units of the media storage transmission audio unit, and a first import resource and a first export resource of the media processing unit.
  • Step S10262: opening the media channel of the reserved playback service resource; specifically, the logical resource management module sends an open channel message to the resource board processing module, notifying it to open the media channel of the reserved playback service resource. That is, the service logic processing module in the media server applies to the logical resource management module for the above three resources used for the playback service, and notifies each resource board processing module to open the media channels of those three resources.
  • the media channel for creating the voice recognition service includes:
  • Step S10264: receiving resource application information, where the resource application information is used to reserve a voice recognition service resource, and the voice recognition service resource is a resource used for the voice recognition service; specifically, the logical resource management module receives an application to reserve the voice recognition service resource, which includes a second external interface unit of the media storage transmission audio unit, and a second import resource and a second export resource of the media processing unit, where the second external interface unit is configured to receive the voice recognition data packet.
  • Step S10265: opening the media channel of the reserved voice recognition service resource; specifically, the logical resource management module sends an open channel message to the resource board processing module, notifying it to open the media channel of the reserved voice recognition service resource. That is, the service logic processing module in the media server applies to the logical resource management module for the above three resources used for the voice recognition service, and notifies each resource board processing module to open the media channels of those three resources.
  • the media server further performs media negotiation with the voice recognition server to open a voice recognition channel between the media server and the voice recognition server. Specifically, the media server sends an invite message to the voice recognition server to request media negotiation; the media server then sends an MRCP request to notify the voice recognition server to prepare to receive audio input. MRCP refers to the Media Resource Control Protocol, through which the voice recognition server provides various voice services to clients.
  • the media channel for creating the number-collecting service includes:
  • Step S10267: receiving resource application information, where the resource application information is used to reserve a number receiving service resource, and the number receiving service resource is a resource used for the number receiving service; specifically, the logical resource management module receives an application to reserve the number receiving service resource. The number receiving service resource includes a third import unit of the media processing unit or, alternatively, the second import unit of the media processing unit. In the latter case, the second import unit serves both as the import unit for voice data packets of the voice recognition service and as the import unit for number data packets of the number receiving service.
  • Step S10268: opening the media channel of the reserved number receiving service resource; specifically, the logical resource management module sends an open channel message to the resource board processing module, notifying it to open the media channel of the reserved number receiving service resource.
  • the service logic processing module in the media server applies to the logical resource management module for the above resource used for the number receiving service, and notifies each resource board processing module to open the media channel of that resource. Specifically, when the resource used for the number receiving service is the second import unit of the media processing unit, then, given the creation order set by the media server (playback service, voice recognition service, number receiving service), the resource application and the opening of the media channel for the number receiving service can be regarded as already completed in steps S10264 and S10265 above.
  • The method further includes: the media server sends a message to the application server, informing it that the media channels for the voice playback service, the voice recognition service, and the number-collection service have been opened and that the prompt tone has started playing, and waits for the terminal to send a number data packet or a voice data packet.
  • the service triggering method according to the embodiment of the present invention further includes:
  • Step S1071 Release the resources reserved for the voice playback service, the number-collection service, and the voice recognition service. Specifically, the service control unit notifies the logical resource management module to release the resources applied for by these three services. The resources to be released include the second external interface unit, the first import resource, the first export resource, the second import unit, the second export unit, and the first internal interface unit.
  • Step S1073 Close the media channels of the voice playback service resource, the number-collection service resource, and the voice recognition service resource. Specifically, the service control unit notifies the resource board processing module to close the media channels of the voice playback service, the number-collection service, and the voice recognition service.
  • Step S1075 Close the voice recognition channel between the media server and the voice recognition server. Specifically, the service control unit sends a bye message to the voice recognition server to release the session resources related to the voice recognition server.
  • the service triggering method according to the embodiment of the present invention further includes:
  • Step S1072 Release the resource reserved for the voice recognition service. Specifically, the service control unit notifies the logical resource management module to release the media resource applied for by the voice recognition service.
  • Step S1074 Close the media channel of the voice recognition service resource. Specifically, the service control unit notifies the resource board processing module to close the media channel of the voice recognition service. While the text information corresponding to the voice data packet was being obtained through the media channel of the voice recognition service, the media resources applied for by the number-collection service and the playback service had already been released and their media channels closed. Therefore, after the text information is obtained, only the media resource applied for by the voice recognition service needs to be released and only its media channel needs to be closed.
  • The media resources applied for by the voice recognition service include: the second external interface unit, the second import resource, and the second export resource.
  • Step S1076 Close the voice recognition channel between the media server and the voice recognition server. Specifically, the service control unit of the media server sends a bye message to the voice recognition server to release the session resources related to the voice recognition server.
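  • The two teardown variants above differ only in which resources are still held when the final release happens. The following is a minimal sketch of that difference, not the patented implementation: the resource names come from the walkthrough in this document, while the function `teardown_plan`, the tuple grouping, and the `asr_session` label are hypothetical and used for illustration only.

```python
# Resource names are taken from this document; the grouping into tuples
# and the function itself are illustrative assumptions.
PLAY = ("mstuin", "mru1", "mru2")
SPEECH = ("mstuout2", "playmru1", "playmru2")  # playmru2 also collects digits


def teardown_plan(trigger):
    """Return the ordered (action, resources) steps of the final teardown."""
    if trigger == "dtmf":
        # Steps S1071/S1073/S1075: nothing has been released yet.
        resources = PLAY + SPEECH
    elif trigger == "speech":
        # Steps S1072/S1074/S1076: playback and number-collection resources
        # were already released when voice input was detected.
        resources = SPEECH
    else:
        raise ValueError(f"unknown trigger category: {trigger!r}")
    return [
        ("release", resources),         # logical resource management module
        ("close_channels", resources),  # resource board processing module
        ("bye", ("asr_session",)),      # voice recognition server
    ]
```

  • In this sketch the number-packet case releases six resources and the voice-packet case releases three, which matches the resource lists given for steps S1071 and S1072.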
  • FIG. 2 is a timing diagram of a service triggering method when a category of a trigger signal is a number data packet according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a service triggering method according to the present invention.
  • Depending on the category of the trigger signal, steps 31 to 38 of FIG. 2 or steps 41 to 52 of FIG. 3 are performed.
  • the service triggering method includes:
  • Step 1 The APP sends an invite message to the MS, and the invite message is used for media negotiation between the APP and the MS.
  • Step 2 Call sends a GetResReq (mstuout1) request to ResManage to apply for the first external interface unit of the MSTU, that is, the mstuout1 external port resource;
  • Step 3 ResManage sends GetResRsp information to the Call, and the GetResRsp information is used to indicate that the first external interface unit is successfully applied.
  • Step 4 The MS and the APP complete the media negotiation, and the MS sends a 200 OK message to the APP.
  • Step 5 The APP sends an ACK message to the MS, where the ACK message is used to confirm that the media negotiation is successful.
  • Step 6 The APP sends an info (<play><speech><dtmf>) message to the MS, which includes three services: the play playback service, the dtmf number-collection service, and the speech voice recognition service. The MS parses the info message and sets the service execution order to play, speech, and dtmf, which facilitates the subsequent resource application and channel opening;
  • Step 7 Call sends a GetResReq (mstuin, mru1, mru2) request to ResManage to apply for the first internal interface unit, the first import resource, and the first export resource for the playback service, that is, the mstuin, mru1, and mru2 resources;
  • Step 8 ResManage sends GetResRsp information to the Call, and the GetResRsp information is used to indicate that the resource application is successful.
  • Step 9 Call sends an OpenResReq (mstuout1-nat, mru2, mru1, mstuin) request to ResBoard, notifying it to open the media channel of the first internal interface unit, the first import resource, and the first export resource, that is, the mstuin, mru1, and mru2 resources;
  • Step 10 ResBoard sends an OpenResRsp message to Call, indicating that the media channel of the mstuin, mru1, and mru2 resources has been successfully opened;
  • Step 11 The MS sends an invite message to the ASR server, and the invite message is used for media negotiation between the MS and the ASR.
  • Step 12 The MS and the ASR complete the media negotiation, and the ASR sends a 200 OK message to the MS.
  • Step 13 The MS sends a MrcpReq request to the ASR, and the MrcpReq request is used to notify the ASR server to prepare to receive the audio input.
  • Step 14 The ASR replies to the MS with a MrcpRsp response, which notifies the MS that the ASR server is ready;
  • Step 15 Call sends a GetResReq (mstuout2-nat, playmru2, playmru1) request to ResManage to apply for the second external interface unit, the second import resource, and the second export resource for the voice recognition service, that is, the mstuout2, playmru1, and playmru2 resources;
  • Step 16 ResManage sends GetResRsp information to the Call, and the GetResRsp information is used to indicate that the resource application is successful.
  • Step 17 Call sends an OpenResReq (mstuout2-nat, playmru2, playmru1) request to ResBoard, notifying it to open the media channel of the second external interface unit, the second import resource, and the second export resource, that is, the mstuout2, playmru1, and playmru2 resources;
  • Step 18 ResBoard sends an OpenResRsp message to Call, indicating that the media channel of the mstuout2, playmru1, and playmru2 resources has been successfully opened and is waiting to receive the audio stream sent by the terminal;
  • Step 19 Call sends an OpenDtmfChannel (playmru2) message to ResBoard, notifying it to open the media channel of the second import resource for the number-collection service, that is, the Call module notifies the playmru2 resource in ResBoard to start collecting digits;
  • Step 20 ResBoard sends a GetResRsp message to Call, indicating that the media channel of the playmru2 resource has been successfully opened.
  • Step 21 The MS replies to the info with a 200 OK message, informing the APP that the media server has opened the media channels for the play, speech, and dtmf services and has started playing the prompt tone, and waits for the terminal to send digits or a voice stream.
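  • Steps 7 to 20 can be summarized as opening the channels in a fixed order while reusing a resource that is already held. The following is a minimal sketch under stated assumptions: the service and resource names come from the steps above, but the dictionary layout and the function `create_channels` are hypothetical and do not model the actual ResManage and ResBoard message exchange.

```python
# Creation order set by the media server in Step 6.
CREATION_ORDER = ("play", "speech", "dtmf")

# Resources each service needs, using the names from Steps 7 to 19.
# The dictionary layout is an illustrative assumption.
SERVICE_RESOURCES = {
    "play": ("mstuin", "mru1", "mru2"),
    "speech": ("mstuout2", "playmru1", "playmru2"),
    "dtmf": ("playmru2",),  # reuses the import unit opened for speech
}


def create_channels(requested):
    """Open channels in the fixed order; return (opened services, allocated resources)."""
    allocated = []  # resources applied for, in application order
    opened = []     # services whose media channel is open
    for service in CREATION_ORDER:
        if service not in requested:
            continue
        for res in SERVICE_RESOURCES[service]:
            if res not in allocated:  # a resource already held is not applied for again
                allocated.append(res)
        opened.append(service)
    return opened, allocated
```

  • Because `dtmf` comes last and only needs `playmru2`, its channel can be opened without a further resource application, which is the situation Step 19 describes.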
  • If a number data packet is received first while the prompt tone is playing, steps 31 to 38 shown in FIG. 2 are performed; if a voice data packet is received first while the prompt tone is playing, steps 41 to 52 shown in FIG. 3 are performed.
  • When a number data packet is received, the method includes:
  • Step 31 Call receives the RecDtmf information sent by ResBoard, which indicates that the MS has received a number data packet; the MS parses the number data packet and obtains the number information in it.
  • Step 32 The MS sends an info (SendDtmf) message to the APP, and the info (SendDtmf) message is used to report the obtained number information to the APP.
  • Step 33 The Call sends ClossAllRes (mstu, mru) information to the ResBoard module, and the ClossAllRes (mstu, mru) information is used to notify the ResBoard module to close the media channel opened by the play, dtmf, and speech services.
  • Step 34 Call sends ReleaseAllRes (mstu, mru) information to ResManage, notifying the ResManage module to release the media resources applied for by play, dtmf, and speech, namely mstuout2, mru1, mru2, playmru1, playmru2, and mstuin;
  • Step 35 The Call module sends a Bye message to the ASR server to release the ASR server related session resources.
  • Step 36 The ASR sends a ByeRsp message to the Call to confirm the release of the resource.
  • Step 37 The MS receives the 200 OK message sent by the APP, which indicates that the MS has finished reporting the number information.
  • Step 38 The APP delivers a new service message info to the MS according to the number message reported by the MS.
  • When a voice data packet is received, the method includes:
  • Step 41 The ASR server sends a RecVoiceReport message to the MS to notify the media server that the ASR has detected voice input.
  • Step 42 Call sends ClossAllRes (mstuin, mru1, mru2) information to the ResBoard module, notifying it to close the mstuin, mru1, and mru2 resources applied for by the play service;
  • Step 43 The Call sends the ReleaseAllRes (mstuin, mru1, mru2) information to the ResManage, and the ReleaseAllRes (mstuin, mru1, mru2) information is used to notify the ResManage module to release the media resources requested by the play, that is, the mstuin, mru1, and mru2 resources;
  • Step 44 Call sends ClossDtmfChannel (playmru2) information to the ResBoard module, notifying it that the playmru2 channel stops collecting digits, which is equivalent to switching the channel attribute of playmru2 from the DTMF service to the service of receiving voice data packets;
  • Step 45 The ASR sends SendAsrResult information to the MS; after the ASR server completes recognition of the audio, it uses this information to send the text information corresponding to the voice data packet to the MS;
  • Step 46 The MS sends an info (SendSpeechResult) message to the APP, and the info (SendSpeechResult) message is used by the MS to report the text information recognized by the ASR server to the APP.
  • Step 47 Call sends ClossAllRes (mstu, mru) information to the ResBoard module, and ClossAllRes (mstu, mru) information is used to notify the ResBoard module to close the media channels of mstuout2, playmru1, playmru2 opened by the speech service;
  • Step 48 The Call sends the ReleaseAllRes (mstu, mru) information to the ResManage, and the ReleaseAllRes (mstu, mru) information is used to notify the ResManage module to release the resources requested by the speech service, that is, the resources of mstuout2, playmru1, and playmru2;
  • Step 49 The Call module sends a bye message to the ASR server, and the bye message is used to notify the ASR server to release the ASR server related session resources.
  • Step 50 The ASR sends a ByeRsp message to the Call, and the ByeRsp message is used to confirm the release of the resource.
  • Step 51 The MS receives the 200 OK message sent by the APP, which indicates that the MS has finished reporting the text information corresponding to the voice data packet.
  • Step 52 The APP sends a new service message info to the MS according to the text message of the voice recognition result reported by the MS.
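  • The two branches can be viewed as events acting on the set of open media channels. The following toy model is a sketch only: the class `MediaSession` and its method names are hypothetical, the message names `SendDtmf` and `SendSpeechResult` come from Steps 32 and 46, and the signalling with ResBoard, ResManage, and the ASR server is deliberately left out.

```python
class MediaSession:
    """Toy model of the media server's open channels during one prompt."""

    def __init__(self):
        # After Step 21, all three channels are open.
        self.open_channels = {"play", "speech", "dtmf"}
        self.reports = []  # messages reported to the application server

    def on_dtmf(self, digits):
        # Steps 31 to 35: report the digits, then tear everything down.
        self.reports.append(("SendDtmf", digits))
        self.open_channels.clear()

    def on_voice_detected(self):
        # Steps 42 to 44: stop playback and digit collection, keep recognition.
        self.open_channels -= {"play", "dtmf"}

    def on_asr_result(self, text):
        # Steps 45 to 48: report the recognized text, then close recognition.
        self.reports.append(("SendSpeechResult", text))
        self.open_channels.discard("speech")
```

  • In this model, a voice trigger leaves only the `speech` channel open until the recognition result arrives, whereas a number trigger closes all three channels at once.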
  • The MSML/MOML protocol is an open protocol that conforms to the extension principles of the SIP protocol and provides a good framework for extension without changing the SIP protocol.
  • MSML/MOML works through the INFO and INVITE message bodies of the SIP protocol.
  • the combination of SIP/MSML/MOML uses the SIP protocol to establish a session, modify a session, and delete a session.
  • the XML-based MSML/MOML is used to provide a control interface for media processing.
  • MSML is an interface for controlling media streams and internal conference resources of the media server; MOML is used to control media streams and complex media processing objects involved in the conference.
  • the combination of SIP and MSML/MOML constitutes a powerful interface framework for interaction between application servers and media servers.
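  • As an illustration of how an XML body carried in an INFO message could be turned into the creation order used by the media server, the following sketch parses a simplified body with Python's standard `xml.etree.ElementTree`. Only the element names `play`, `speech`, and `dtmf` come from Step 6; the `request` wrapper element, the `prompt` attribute, and the function name are made up for the example, and the body is not claimed to be valid MSML/MOML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified INFO body: only the <play>, <speech>, and <dtmf>
# element names come from the document; everything else is made up.
INFO_BODY = """
<request>
  <dtmf/>
  <play prompt="welcome.wav"/>
  <speech/>
</request>
"""

CREATION_ORDER = ["play", "speech", "dtmf"]


def services_in_creation_order(body):
    """Extract the requested services and sort them into the creation order."""
    root = ET.fromstring(body.strip())
    requested = {child.tag for child in root}
    return [name for name in CREATION_ORDER if name in requested]
```

  • With the sample body above, `services_in_creation_order(INFO_BODY)` returns `['play', 'speech', 'dtmf']` regardless of the order in which the elements appear in the message.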
  • FIG. 4 is a code flow diagram of a service triggering method according to an embodiment of the present invention.
  • audio data is stored in an MSTU, and audio file playback can be implemented;
  • The internal interface unit mstuin sends the audio data packets to be played to the MRU unit through the first import resource mru1 to implement audio file playback.
  • If the trigger signal is a number data packet, the PlayMru2 resource collects the digits; if the trigger signal is a voice data packet, the voice data packet is sent to the ASR server through the second export resource PlayMru2 and the second external interface unit of the MSTU, and the ASR server completes the voice-to-text conversion.
  • The method according to the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present invention, in essence or in the part that contributes over the related art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM), including a number of instructions for causing a terminal device (which may be a cell phone, computer, server, or network device, etc.) to perform the methods described in the embodiments of the present invention.
  • A service triggering device is also provided. The device is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated.
  • As used below, the term “module” may refer to a combination of software and/or hardware that implements a predetermined function.
  • Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 5 is a structural block diagram of a service triggering apparatus according to an embodiment of the present invention.
  • The apparatus includes: a creating module 10, configured to create a media channel of the voice playback service, a media channel of the voice recognition service, and a media channel of the number-collection service; a receiving module 20, configured to receive a trigger signal for triggering a service while the prompt tone is played through the media channel of the voice playback service; an acquiring module 30, configured to acquire the information included in the trigger signal through the media channel of the number-collection service or the media channel of the voice recognition service; and a sending module 40, configured to report the acquired information to the application server.
  • The acquiring module 30 further includes: a first acquiring submodule 301, configured to obtain, when the category of the trigger signal is a number data packet, the number information included in the number data packet through the media channel of the number-collection service; and a second acquiring submodule 302, configured to obtain, when the category of the trigger signal is a voice data packet, the text information corresponding to the voice data packet through the media channel of the voice recognition service.
  • The creating module 10 includes: a first receiving unit, configured to receive a service message sent by the application server, where the service message includes the voice playback service, the voice recognition service, and the number-collection service; a setting unit, configured to set the creation order of the media channel of the voice playback service, the media channel of the voice recognition service, and the media channel of the number-collection service; and a creation unit, configured to create the media channel of the voice playback service, the media channel of the voice recognition service, and the media channel of the number-collection service in turn according to the set creation order.
  • The first acquiring submodule 301 includes: a second receiving unit, configured to receive the number data packet through the media channel of the number-collection service; and a first acquiring unit, configured to parse the number data packet and obtain the number information it contains.
  • The second acquiring submodule 302 includes: a sending unit, configured to send the voice data packet to the voice recognition server through the media channel of the voice recognition service; a processing unit, configured to disconnect the media channels of the voice playback service and the number-collection service; and a third receiving unit, configured to receive the text information corresponding to the voice data packet returned by the voice recognition server.
  • The creation unit includes: a receiving subunit, configured to receive resource application information, where the resource application information is used to reserve a voice playback service resource, a voice recognition service resource, or a number-collection service resource; and a processing subunit, configured to open the media channel of the reserved voice playback service resource, voice recognition service resource, or number-collection service resource.
  • Each of the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor, or the modules are located in multiple processors.
  • FIG. 6 is a structural block diagram of a media server according to an embodiment of the present invention. As shown in FIG. 6, the media server includes any one of the service triggering devices in the embodiment of the present invention.
  • A service triggering system is also provided. The system is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated.
  • FIG. 7 is a structural block diagram of a service triggering system according to an embodiment of the present invention. As shown in FIG. 7, the system includes:
  • the application server receives the number information or the text information sent by the media server.
  • the media server 1, which creates media channels for the voice playback service, the voice recognition service, and the number-collection service. The media server is configured to receive a trigger signal for triggering a service while the prompt tone is played through the media channel of the voice playback service; when the category of the trigger signal is a number data packet, obtain the number information contained in the number data packet through the media channel of the number-collection service; when the category of the trigger signal is a voice data packet, obtain the text information corresponding to the voice data packet through the media channel of the voice recognition service; and send the number information or the text information to the application server;
  • the service triggering system further includes: a voice recognition server, the voice recognition server receives the voice data packet sent by the media server, converts the voice data packet into corresponding text information, and sends the text information to the media server.
  • In the service triggering system, the media server 1 is further configured to receive the service message sent by the application server 2, where the service message includes the voice playback service, the voice recognition service, and the number-collection service. The media server 1 sets the creation order of the media channels as: the media channel of the voice playback service, the media channel of the voice recognition service, and the media channel of the number-collection service.
  • In the service triggering system, the media server 1 is further configured to perform media negotiation with the application server 2, including: the application server 2 sends an invite to the media server 1 to request media negotiation; the service control unit of the media server 1 applies to the logical resource management module for the first external port resource; the logical resource management module replies to the service control unit that the resource application succeeded; the application server 2 and the media server 1 complete the media negotiation, and the media server 1 sends a confirmation message, such as a 200 OK message, to the application server 2; the application server 2 sends an acknowledgment message, such as an ack message, to the media server 1 to confirm that the media negotiation was successful.
  • The media server 1 is further configured to: receive the number data packet through the media channel of the number-collection service, specifically through the second import resource; and parse the number data packet to obtain the number information it contains. Specifically, the media processing unit performs DTMF decoding to obtain the number information contained in the number data packet.
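  • The document does not describe how the media processing unit performs DTMF decoding. As a hedged illustration of the final lookup step only, the sketch below maps a detected pair of tone frequencies to a keypad symbol using the standard DTMF frequency grid; tone detection itself is out of scope, and the names `DTMF_TABLE` and `decode` are hypothetical.

```python
# Standard DTMF keypad: each key is a (low-group Hz, high-group Hz) pair.
LOW = (697, 770, 852, 941)
HIGH = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")

# Build the lookup table: row index selects the low tone, column the high tone.
DTMF_TABLE = {
    (low, high): KEYS[row][col]
    for row, low in enumerate(LOW)
    for col, high in enumerate(HIGH)
}


def decode(pairs):
    """Turn a sequence of detected (low, high) frequency pairs into a digit string."""
    return "".join(DTMF_TABLE[pair] for pair in pairs)
```

  • For example, the pairs (697, 1209), (941, 1336), and (852, 1477) correspond to the keys 1, 0, and 9, so `decode` returns the string `"109"` for them.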
  • The media server 1 is further configured so that: the service control unit notifies the logical resource management module to release the resources applied for by the voice playback service, the number-collection service, and the voice recognition service, specifically the second external interface unit, the first import resource, the first export resource, the second import unit, the second export unit, and the first internal interface unit; the service control unit notifies the resource board processing module to close the media channels of the voice playback service, the number-collection service, and the voice recognition service; and the service control unit closes the voice recognition channel between the media server and the voice recognition server, specifically by sending a bye message to the voice recognition server to release the session resources related to the voice recognition server.
  • The media server 1 is further configured to: send the voice data packet to the voice recognition server through the media channel of the voice recognition service; disconnect the media channels of the voice playback service and the number-collection service; and receive the text information corresponding to the voice data packet returned by the voice recognition server. Specifically, the voice data packet is received through the second import resource and, after necessary processing by the media processing unit, is sent to the voice recognition server through the second export resource and the second external interface unit. After receiving a notification from the voice recognition server indicating that it has detected voice input, the service control unit notifies the logical resource management module to release the media resources applied for by the voice playback service and the number-collection service, and notifies the resource board processing module to close the media channels opened for those services. The media resources applied for by the voice playback service include the first import resource, the first export resource, and the first internal interface unit; the media resources applied for by the number-collection service include the third import unit. Optionally, when the media resource applied for by the number-collection service is the second import unit, the second import unit is not released at this point because it is also a media resource applied for by the voice recognition service; nevertheless, the media resource applied for by the number-collection service is regarded as released and its media channel as disconnected. After the voice recognition server completes recognition, it sends the result to the service control unit of the media server.
  • The media server 1 is further configured so that: the service control unit notifies the logical resource management module to release the media resource applied for by the voice recognition service; the service control unit notifies the resource board processing module to close the media channel of the voice recognition service; and the service control unit closes the voice recognition channel between the media server and the voice recognition server. Specifically, while the text information corresponding to the voice data packet was being obtained through the media channel of the voice recognition service, the media resources applied for by the number-collection service and the voice playback service had already been released and their media channels closed, so only the media resource applied for by the voice recognition service needs to be released and only its media channel needs to be closed. The media resources applied for by the voice recognition service include the second external interface unit, the second import resource, and the second export resource. To close the voice recognition channel, the service control unit of the media server sends a bye message to the voice recognition server to release the session resources related to the voice recognition server.
  • The media server 1 is further configured to send a message to the application server 2, informing it that the media channels for the voice playback service, the voice recognition service, and the number-collection service have been opened and that the prompt tone has started playing, and to wait for the terminal to send a number data packet or a voice data packet.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be configured to store program code for performing the following steps:
  • S1 Receive a service message sent by an application server, where the service message includes a voice play service, a voice recognition service, and a number receiving service;
  • S3 sequentially create a media channel of the sound reproduction service, a media channel of the voice recognition service, and a media channel of the number collection service.
  • the storage medium is further arranged to store program code for performing the following steps:
  • S1 receiving a number data packet by using a media channel of the number receiving service
  • the storage medium is further arranged to store program code for performing the following steps:
  • S2 Disconnect the media channels of the voice playback service and the number-collection service;
  • S3 Receive text information corresponding to the voice data packet returned by the voice recognition server.
  • the storage medium is further arranged to store program code for performing the following steps:
  • S1 Receive resource application information, where the resource application information is used to reserve a voice recognition service resource;
  • the storage medium is further arranged to store program code for performing the following steps:
  • The foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a mobile hard disk, and a magnetic memory.
  • the processor executes according to the stored program code in the storage medium:
  • S1 Receive a service message sent by an application server, where the service message includes a voice play service, a voice recognition service, and a number receiving service;
  • S3 sequentially create a media channel of the sound reproduction service, a media channel of the voice recognition service, and a media channel of the number collection service.
  • the processor executes according to the stored program code in the storage medium:
  • S1 receiving a number data packet by using a media channel of the number receiving service
  • the processor executes according to the stored program code in the storage medium:
  • S2 Disconnect the media channels of the voice playback service and the number-collection service;
  • S3 Receive text information corresponding to the voice data packet returned by the voice recognition server.
  • the processor executes according to the stored program code in the storage medium:
  • S1 Receive resource application information, where the resource application information is used to reserve a voice recognition service resource;
  • the processor executes according to the stored program code in the storage medium:
  • it will be apparent to those skilled in the art that the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from the one given here, or the modules or steps may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • in the embodiments of the present invention, a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service are created; while the prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service is received; when the trigger signal is a number data packet, the number information contained in the packet is obtained through the media channel of the number-receiving service; when the trigger signal is a voice data packet, the text information corresponding to the packet is obtained through the media channel of the voice recognition service; and the number information or the text information is reported to the application server. This solves the problem in the related art that a service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode, and achieves the effect of actively selecting the number-receiving mode or the voice recognition mode according to the input trigger signal.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

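The present invention provides a service triggering method, apparatus, system, and media server. The method includes: creating a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number-receiving service; receiving, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service; obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service; and reporting the obtained information to an application server. The invention solves the problem in the related art that a service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode, and achieves the effect of actively selecting the number-receiving mode or the voice recognition mode according to the input trigger signal.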

Description

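Service triggering method, apparatus, system, and media server
Technical Field
The present invention relates to the field of communications, and in particular to a service triggering method, apparatus, system, and media server.
Background
A media server (MS) is a stand-alone device that provides dedicated media resource functions in a softswitch architecture, and is also an important device in packet networks. It provides the media processing functions used by basic and enhanced services, including the collection and decoding of DTMF signals, the generation and sending of signal tones, the sending of recorded announcements, conferencing, and conversion between different codec algorithms, as well as communication functions and management and maintenance functions.
In the related art, the media server performs the playback service: it starts playing a prompt tone and, while the tone is playing, waits to receive data sent by the terminal. Typically, during playback the media server receives a DTMF number packet from the terminal through the media channel corresponding to the number-receiving service, parses the packet to obtain the number information, and reports the number information to the application server, which triggers the application server to start the next service. With the development of voice recognition technology, a service can also be triggered by voice recognition. However, existing service triggering systems support one, and only one, of the number-receiving mode and the voice recognition mode. In practical scenarios the user can only passively accept whichever triggering mode the system supports, so the user's choice is limited and the restriction is obvious.
Summary
The present invention provides a service triggering method, apparatus, system, and media server, to solve at least the problem in the related art that a service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode.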
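According to one aspect of the embodiments of the present invention, a service triggering method is provided, including: creating a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number-receiving service; receiving, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service; obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service; and reporting the obtained information to an application server.
In an embodiment of the present invention, obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service includes: when the trigger signal is a number data packet, obtaining the number information contained in the number data packet through the media channel of the number-receiving service; when the trigger signal is a voice data packet, obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service.
In an embodiment of the present invention, creating the media channels of the playback service, the voice recognition service, and the number-receiving service includes: receiving a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service; setting the creation order of the media channels of the three services; and creating, in the set order, the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
In an embodiment of the present invention, the set creation order is: the media channel of the playback service, then the media channel of the voice recognition service, then the media channel of the number-receiving service.
In an embodiment of the present invention, obtaining the number information contained in the number data packet through the media channel of the number-receiving service includes: receiving the number data packet through the media channel of the number-receiving service; and parsing the number data packet to obtain the number information it contains.
In an embodiment of the present invention, obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service includes: sending the voice data packet to the voice recognition server through the media channel of the voice recognition service; receiving a notification from the voice recognition server, where the notification indicates that the voice recognition server has detected voice input; disconnecting the media channels of the playback service and the number-receiving service; and receiving the text information corresponding to the voice data packet returned by the voice recognition server.
In an embodiment of the present invention, creating the media channel of the playback service includes: receiving resource request information, where the resource request information is used to reserve playback service resources; and opening the media channel of the reserved playback service resources.
In an embodiment of the present invention, creating the media channel of the voice recognition service includes: receiving resource request information, where the resource request information is used to reserve voice recognition service resources; and opening the media channel of the reserved voice recognition service resources.
In an embodiment of the present invention, creating the media channel of the number-receiving service includes: receiving resource request information, where the resource request information is used to reserve number-receiving service resources; and opening the media channel of the reserved number-receiving service resources.
In an embodiment of the present invention, after the number information contained in the number data packet is obtained through the media channel of the number-receiving service, the method further includes: releasing the resources reserved for the playback service, the number-receiving service, and the voice recognition service; closing the media channels of the playback service resources, the number-receiving service resources, and the voice recognition service resources; and closing the voice recognition channel between the media server and the voice recognition server.
In an embodiment of the present invention, after the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service, the method further includes: releasing the resources reserved for the voice recognition service; closing the media channel of the voice recognition service resources; and closing the voice recognition channel between the media server and the voice recognition server.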
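According to another aspect of the embodiments of the present invention, a service triggering apparatus is provided, including: a creation module, configured to create a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number-receiving service; a receiving module, configured to receive, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service; an obtaining module, configured to obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service; and a sending module, configured to report the obtained information to an application server.
In an embodiment of the present invention, the obtaining module includes: a first obtaining submodule, configured to obtain, when the trigger signal is a number data packet, the number information contained in the number data packet through the media channel of the number-receiving service; and a second obtaining submodule, configured to obtain, when the trigger signal is a voice data packet, the text information corresponding to the voice data packet through the media channel of the voice recognition service.
In an embodiment of the present invention, the creation module includes: a first receiving unit, configured to receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service; a setting unit, configured to set the creation order of the media channels of the playback service, the voice recognition service, and the number-receiving service; and a creation unit, configured to create the three media channels in the set order.
In an embodiment of the present invention, the first obtaining submodule includes: a second receiving unit, configured to receive the number data packet through the media channel of the number-receiving service; and a first obtaining unit, configured to parse the number data packet and obtain the number information it contains.
In an embodiment of the present invention, the second obtaining submodule includes: a sending unit, configured to send the voice data packet to the voice recognition server through the media channel of the voice recognition service; a processing unit, configured to disconnect the media channels of the playback service and the number-receiving service; and a third receiving unit, configured to receive the text information corresponding to the voice data packet returned by the voice recognition server.
In an embodiment of the present invention, the creation unit includes: a receiving subunit, configured to receive resource request information, where the resource request information is used to reserve playback service resources, voice recognition service resources, or number-receiving service resources; and a processing subunit, configured to open the media channel of the reserved playback service resources, voice recognition service resources, or number-receiving service resources.
According to another aspect of the embodiments of the present invention, a media server is provided; the media server includes any one of the service triggering apparatuses of the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, a service triggering system is provided, including: any one of the media servers of the embodiments of the present invention, and an application server configured to receive the number information or text information sent by the media server.
In the embodiments of the present invention, a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service are created; while the prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service is received; when the trigger signal is a number data packet, the number information contained in the packet is obtained through the media channel of the number-receiving service; when the trigger signal is a voice data packet, the text information corresponding to the packet is obtained through the media channel of the voice recognition service; and the number information or the text information is reported to the application server. This solves the problem in the related art that a service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode, and achieves the effect of actively selecting the number-receiving mode or the voice recognition mode according to the input trigger signal.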
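Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Figure 1 is a flowchart of the service triggering method according to an embodiment of the present invention;
Figure 2 is a sequence diagram of the service triggering method according to an embodiment of the present invention when the trigger signal is a number data packet;
Figure 3 is a sequence diagram of the service triggering method according to an embodiment of the present invention when the trigger signal is a voice data packet;
Figure 4 is a media stream flow diagram of the service triggering method according to an embodiment of the present invention;
Figure 5 is a structural block diagram of the service triggering apparatus according to an embodiment of the present invention;
Figure 6 is a structural block diagram of the media server according to an embodiment of the present invention;
Figure 7 is a structural block diagram of the service triggering system according to an embodiment of the present invention.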
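Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features of those embodiments may be combined with one another.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the drawings are used to distinguish similar objects, and do not necessarily describe a particular order or sequence.
This embodiment provides a service triggering method. Figure 1 is a flowchart of the service triggering method according to an embodiment of the present invention. As shown in Figure 1, the flow includes the following steps:
Step S102: create a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service;
In the embodiments of the present invention, the media channel of the playback service may be the channel of the resources used for the playback service inside the media server, the media channel of the voice recognition service may be the channel of the resources used for the voice recognition service inside the media server, and the media channel of the number-receiving service may be the channel of the resources used for the number-receiving service inside the media server. The playback service, also called the play service, lets the media server play a prompt tone and, while the tone is playing, wait for number information or voice information entered by the user. The voice recognition service, also called the speech service, receives the voice information entered by the user and recognizes it. The number-receiving service, also called the dtmf service, obtains the digits dialed by the user.
Step S104: while a prompt tone is played through the media channel of the playback service, receive a trigger signal for triggering a service;
In the embodiments of the present invention, the trigger signal may be a number data packet or a voice data packet. A number data packet contains the digits dialed by the user, and a voice data packet contains the user's voice input. The trigger signal may also be used to interrupt the prompt tone that is being played.
Step S106: obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service;
In the related art, in one application scenario the media server creates only the media channel of the playback service and the media channel of the number-receiving service. The only trigger signal that can then be recognized is a number data packet: the user can interrupt playback and trigger a service only by dialing, and voice input produces no correct response. In another application scenario the media server creates only the media channel of the playback service and the media channel of the voice recognition service. The only trigger signal that can then be recognized is a voice data packet: the user can interrupt playback and trigger a service only by voice input, and dialing produces no correct response.
In the embodiments of the present invention, when the prompt tone is played through the media channel of the playback service, the media channel of the playback service and the media channel of the number-receiving service are both already open. The user can then actively choose to generate the trigger signal by dialing or by voice. When the user dials, the information contained in the trigger signal is obtained through the media channel of the number-receiving service; when the user speaks, the information contained in the trigger signal is obtained through the media channel of the voice recognition service.
Step S108: report the obtained information contained in the trigger signal to the application server.
In the embodiments of the present invention, the media server reports the obtained information to the application server to trigger the application server to start the next service.
Through the above steps, a service can be triggered by either an input number data packet or an input voice data packet while the prompt tone is playing. This solves the problem in the related art that the upper-layer service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode, and makes service triggering more flexible.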
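The flow of steps S102 to S108 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `MediaServerSketch`, the `Trigger` type, and the `report` and `recognize` callbacks are hypothetical names that stand in for the media server, the trigger signal, the report to the application server, and the voice recognition server; none of them appear in the original text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trigger:
    kind: str       # "number" (number data packet) or "voice" (voice data packet)
    payload: bytes


@dataclass
class MediaServerSketch:
    report: Callable[[str], None]       # hypothetical stand-in for reporting to the application server
    recognize: Callable[[bytes], str]   # hypothetical stand-in for the voice recognition server
    channels: List[str] = field(default_factory=list)

    def create_channels(self) -> None:
        # Step S102: all three media channels exist before the prompt is played.
        self.channels = ["play", "speech", "dtmf"]

    def on_trigger(self, trigger: Trigger) -> str:
        # Step S104: a trigger signal arrives while the prompt tone is playing.
        if trigger.kind == "number" and "dtmf" in self.channels:
            # Step S106, number branch: digits are taken from the packet (toy ASCII encoding).
            info = trigger.payload.decode("ascii")
        elif trigger.kind == "voice" and "speech" in self.channels:
            # Step S106, voice branch: the recognizer turns the audio into text.
            info = self.recognize(trigger.payload)
        else:
            raise ValueError(f"no open channel for trigger kind {trigger.kind!r}")
        # Step S108: report the obtained information.
        self.report(info)
        return info
```

Because both the dtmf and the speech channel are open at the same time, the same server instance can serve a caller who dials and a caller who speaks, which is the point of the method.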
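In the embodiments of the present invention, the media server, or MS for short, is a stand-alone device that provides dedicated media resource functions in a softswitch architecture, and is also an important device in packet networks. It provides the media processing functions used by basic and enhanced services, including the collection and decoding of DTMF signals, the generation and sending of signal tones, the sending of recorded announcements, conferencing, and conversion between different codec algorithms, as well as communication functions and management and maintenance functions. The media server also includes a service control unit, referred to in the embodiments as Call; a logical resource management module, referred to as ResManage; and a resource board processing module, referred to as ResBoard. The resource board processing module in turn includes a media storage and transmission audio unit, referred to as MSTU, and a media processing unit, referred to as MRU. The service control unit is an important unit of the media server: it negotiates capabilities with other entities, manages and maintains the resources themselves, and controls the other service resource units to complete complex services. The media storage and transmission audio unit is a service resource unit of the media server: it stores large volumes of audio data and plays and records audio files. It has an external network port and can send and receive directly through that port. The media processing unit handles audio codec conversion, DTMF, and conference mixing.
In the embodiments of the present invention, the application server, or APP for short, is responsible for generating and managing the logic of value-added and intelligent services, and also provides open APIs that give third parties a platform for developing services. The application server is an independent component that does not depend on the softswitch in the control layer, which separates services from call control and makes it easier to introduce new services.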
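Optionally, step S106, obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service, includes:
Step S106a: when the trigger signal is a number data packet, obtain the number information contained in the number data packet through the media channel of the number-receiving service;
Step S106b: when the trigger signal is a voice data packet, obtain the text information corresponding to the voice data packet through the media channel of the voice recognition service.
In the embodiments of the present invention, when the prompt tone is played through the media channel of the playback service, the media channel of the playback service and the media channel of the number-receiving service are both already open. The user can then actively choose to generate the trigger signal by dialing or by voice. When the user dials, the information contained in the trigger signal is obtained through the media channel of the number-receiving service; when the user speaks, the information contained in the trigger signal is obtained through the media channel of the voice recognition service.
In the embodiments of the present invention, the user freely chooses one of dialing and voice input to generate the trigger signal. If the user both dials and speaks, the two actions are distinguished by the order in which they occur: if dialing is detected first, the number information is obtained through the media channel of the number-receiving service; if voice input is detected first, the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service.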
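The "whichever input is detected first" rule can be illustrated with a small helper. This is a sketch only: the function `select_trigger_mode` and the `(arrival_time, kind)` event tuples are hypothetical, and the text does not say how the media server timestamps or compares the two inputs.

```python
from typing import Iterable, Optional, Tuple

Event = Tuple[float, str]  # (arrival time in seconds, "number" or "voice")


def select_trigger_mode(events: Iterable[Event]) -> Optional[str]:
    """Return the kind of the earliest trigger, or None if nothing arrived."""
    valid = [event for event in events if event[1] in ("number", "voice")]
    if not valid:
        return None
    # The earliest detected input decides which media channel is used.
    return min(valid, key=lambda event: event[0])[1]
```

For example, `select_trigger_mode([(2.0, "voice"), (1.5, "number")])` selects the number-receiving branch because the dialed digit arrived first.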
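Optionally, the number information or text information is reported to the application server, which triggers the application server to deliver a new service. Specifically, after the media server reports the number information or text information to the application server, the media server receives a message from the application server confirming that the report is complete, and the application server delivers a new service to the media server based on the reported number information or text information.
Optionally, step S102, creating the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service, includes:
Step S1022: receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service;
Step S1024: set the creation order of the media channels of the playback service, the voice recognition service, and the number-receiving service;
Step S1026: in the set order, create the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
Specifically, in the embodiments of the present invention, the application server sends an info message to the media server. The message uses the group form and contains the play service, the dtmf service, and the speech service, where the play service is the playback service, the dtmf service is the number-receiving service, and the speech service is the voice recognition service. The media server parses the info message according to the SIP and XML protocols and sets the execution order of the services to play, speech, dtmf, which simplifies the subsequent creation of the media channels of the three services.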
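How a media server might derive the play, speech, dtmf order from the info body can be sketched as follows. The XML body used here (`<group>` with empty `<play/>`, `<speech/>`, and `<dtmf/>` children) is a simplified, hypothetical stand-in for the real message body, and `plan_channel_creation` is an illustrative name rather than part of any real library.

```python
import xml.etree.ElementTree as ET
from typing import List

# Fixed creation order stated in the description: play, then speech, then dtmf.
CREATION_ORDER = ("play", "speech", "dtmf")


def plan_channel_creation(info_body: str) -> List[str]:
    """Return the requested services in the order their channels should be created."""
    root = ET.fromstring(info_body)
    requested = {element.tag for element in root.iter() if element.tag in CREATION_ORDER}
    # The order inside the message does not matter; the server imposes its own.
    return [service for service in CREATION_ORDER if service in requested]
```

Imposing a fixed order regardless of how the services appear in the message means the later resource requests always happen in the same sequence, which is what lets the dtmf channel reuse a resource that the speech channel has already opened.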
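Optionally, before step S1022, in which the media server receives the info message sent by the application server, the method further includes step S1021: the media server performs media negotiation with the application server. This includes: the application server sends an invite to the media server to request media negotiation; the service control unit in the media server requests the first external port resource from the logical resource management module, and after the request succeeds the logical resource management module replies to the service control unit that the resource request succeeded; the application server and the media server complete media negotiation, and the media server sends a confirmation message, for example a 200 OK message, to the application server; the application server sends a confirmation message, for example an ack message, to the media server to confirm that media negotiation succeeded.
Optionally, the set creation order is: the media channel of the playback service, then the media channel of the voice recognition service, then the media channel of the number-receiving service. Creating the media channels in this order favors program execution and saves hardware resources.
Optionally, in step S106a, obtaining the number information contained in the number data packet through the media channel of the number-receiving service includes:
Step S106a1: receive the number data packet through the media channel of the number-receiving service; specifically, the number data packet is received through the second inlet resource;
Step S106a3: parse the number data packet and obtain the number information it contains; specifically, the media processing unit performs the DTMF decoding and obtains the number information contained in the number data packet.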
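The description does not state the wire format of the number data packet. Assuming, purely for illustration, that it carries an RTP telephone-event payload as defined in RFC 4733 (one event byte, one byte holding the end flag and volume, and a 16-bit duration), the parsing step could look like the sketch below; `parse_telephone_event` and `TelephoneEvent` are hypothetical names.

```python
import struct
from typing import NamedTuple

# RFC 4733 event codes 0-15 map to the sixteen DTMF symbols.
DTMF_EVENTS = "0123456789*#ABCD"


class TelephoneEvent(NamedTuple):
    digit: str
    end: bool        # E bit: set on the final packet(s) of an event
    volume: int      # 6-bit field, power level expressed as a positive value
    duration: int    # in RTP timestamp units


def parse_telephone_event(payload: bytes) -> TelephoneEvent:
    """Decode one 4-byte telephone-event payload (assumed format, see lead-in)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload needs at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_EVENTS):
        raise ValueError(f"event code {event} is not a DTMF digit")
    return TelephoneEvent(DTMF_EVENTS[event], bool(flags & 0x80), flags & 0x3F, duration)
```

If the deployment instead carries DTMF as in-band audio tones, this step would be a tone detector in the media processing unit rather than a packet parser.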
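Optionally, in step S106b, obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service includes:
Step S106b1: send the voice data packet to the voice recognition server through the media channel of the voice recognition service; specifically, the voice data packet is received through the second inlet resource and, after any necessary processing in the media processing unit, is sent to the voice recognition server through the second outlet resource and the second external interface unit.
Step S106b3: disconnect the media channels of the playback service and the number-receiving service; specifically, after receiving the notification from the voice recognition server indicating that it has detected voice input, the service control unit notifies the logical resource management module to release the media resources requested by the playback service and the number-receiving service, and notifies the resource board processing module to disconnect the media channels of those media resources. The media resources requested by the playback service include the first inlet resource, the first outlet resource, and the first internal interface unit; the media resources requested by the number-receiving service include the third inlet unit. Optionally, when the media resources requested by the number-receiving service include the second inlet unit, the second inlet unit also belongs to the media resources requested by the voice recognition service; in that case the second inlet unit is not released and its channel is not disconnected, but the media resources requested by the number-receiving service are regarded as released and its media channel as disconnected;
Step S106b5: receive the text information corresponding to the voice data packet returned by the voice recognition server; specifically, after recognition is complete, the voice recognition server sends the result to the service control unit of the media server.
In the embodiments of the present invention, the voice recognition server, or ASR (Automated Speech Recognition) for short, recognizes the input audio, converts it into text, and reports the text to the user in a message; in the info message it is identified by the speech keyword.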
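Optionally, in step S102, creating the media channel of the playback service includes:
Step S10261: receive resource request information, where the resource request information is used to reserve playback service resources, that is, the resources used for the playback service; specifically, the logical resource management module receives the request to reserve playback service resources, which include the first internal interface unit of the media storage and transmission audio unit and the first inlet resource and first outlet resource of the media processing unit;
Step S10262: open the media channel of the reserved playback service resources; specifically, the logical resource management module sends an open-channel message to the resource board processing module, where the open-channel message notifies the resource board processing module to open the media channel of the reserved playback service resources. The service logic processing module in the media server requests the above three resources for the playback service from the logical resource management module and notifies each resource board processing module to open the media channels of the three resources.
Optionally, in step S102, creating the media channel of the voice recognition service includes:
Step S10264: receive resource request information, where the resource request information is used to reserve voice recognition service resources, that is, the resources used for the voice recognition service; specifically, the logical resource management module receives the request to reserve voice recognition service resources, which include the second external interface unit of the media storage and transmission audio unit and the second inlet resource and second outlet resource of the media processing unit, where the second external interface unit is used to send the received voice recognition data packets to the voice recognition server;
Step S10265: open the media channel of the reserved voice recognition service resources; specifically, the logical resource management module sends an open-channel message to the resource board processing module, where the open-channel message notifies the resource board processing module to open the media channel of the reserved voice recognition service resources. The service logic processing module in the media server requests the above three resources for the voice recognition service from the logical resource management module and notifies each resource board processing module to open the media channels of the three resources.
Optionally, before the media channel of the voice recognition service is created, the method further includes: the media server performs media negotiation with the voice recognition server and opens the voice recognition channel between them. Specifically, the media server sends an invite message to the voice recognition server to request media negotiation; the media server then sends an MRCP request to notify the voice recognition server to prepare to receive audio input. MRCP stands for Media Resource Control Protocol, which the voice recognition server uses to provide speech services to clients.
Optionally, in step S102, creating the media channel of the number-receiving service includes:
Step S10267: receive resource request information, where the resource request information is used to reserve number-receiving service resources, that is, the resources used for the number-receiving service; specifically, the logical resource management module receives the request to reserve number-receiving service resources. The number-receiving service resources include the third inlet unit of the media processing unit or, alternatively, the second inlet unit of the media processing unit. In the latter case the second inlet unit serves both as the inlet for voice data packets of the voice recognition service and as the inlet for number data packets of the number-receiving service;
Step S10268: open the media channel of the reserved number-receiving service resources; specifically, the logical resource management module sends an open-channel message to the resource board processing module, where the open-channel message notifies the resource board processing module to open the media channel of the reserved number-receiving service resources. The service logic processing module in the media server requests the above resources for the number-receiving service from the logical resource management module and notifies each resource board processing module to open the media channels of those resources. Specifically, when the resources used for the number-receiving service include the second inlet unit of the media processing unit, then, given the creation order that the media server sets for the playback, voice recognition, and number-receiving media channels, the resource request and channel opening for the number-receiving service can be regarded as already completed in steps S10264 and S10265.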
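The shared-inlet case, in which the number-receiving service reuses a resource that the voice recognition service has already reserved, can be modeled with simple ownership tracking. This is a minimal sketch, not the ResManage implementation: `ResourcePool` is a hypothetical class, and the resource names (`mstuin`, `mru1`, `mru2`, `mstuout2`, `playmru1`, `playmru2`) are borrowed from the message examples later in this description.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class ResourcePool:
    """Tracks which services hold each media resource (illustrative only)."""

    def __init__(self) -> None:
        self._owners: Dict[str, Set[str]] = defaultdict(set)

    def reserve(self, service: str, resources: Iterable[str]) -> List[str]:
        """Reserve resources for a service; return those whose channel must be newly opened."""
        newly_opened = []
        for name in resources:
            if not self._owners[name]:
                newly_opened.append(name)   # nobody holds it yet, so open its channel
            self._owners[name].add(service)
        return newly_opened

    def release(self, service: str) -> List[str]:
        """Release a service's resources; return those whose channel can now be closed."""
        closed = []
        for name, owners in self._owners.items():
            if service in owners:
                owners.discard(service)
                if not owners:
                    closed.append(name)     # last holder gone, so close its channel
        return closed
```

With the creation order play, speech, dtmf, reserving `playmru2` for dtmf opens nothing new, and releasing dtmf later closes nothing while speech still holds that resource. This matches the statement that the shared inlet is "regarded as" released for the number-receiving service even though it stays open.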
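Optionally, before step S104, the method further includes: the media server sends a message to the application server to notify it that the media channels for the playback service, the voice recognition service, and the number-receiving service have been opened; the media server then starts playing the prompt tone and waits for the terminal to send a number data packet or a voice data packet.
Optionally, after step S106a, in which the number information contained in the number data packet is obtained through the media channel of the number-receiving service, the service triggering method according to the embodiment of the present invention further includes:
Step S1071: release the resources reserved for the playback service, the number-receiving service, and the voice recognition service; specifically, the service control unit notifies the logical resource management module to release the resources requested by the three services. The resources to be released include the second external interface unit, the first inlet resource, the first outlet resource, the second inlet unit, the second outlet unit, and the first internal interface unit.
Step S1073: close the media channels of the playback service resources, the number-receiving service resources, and the voice recognition service resources; specifically, the service control unit notifies the resource board processing module to close the media channels of the three services.
Step S1075: close the voice recognition channel between the media server and the voice recognition server; specifically, the service control unit closes the voice recognition channel between the media server and the voice recognition server by sending a bye message to the voice recognition server, which releases the related session resources on the voice recognition server.
Optionally, after step S106b, in which the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service, the service triggering method according to the embodiment of the present invention further includes:
Step S1072: release the resources reserved for the voice recognition service; specifically, the service control unit notifies the logical resource management module to release the media resources requested by the voice recognition service;
Step S1074: close the media channel of the voice recognition service resources; specifically, the service control unit notifies the resource board processing module to close the media channel of the voice recognition service. While the text information corresponding to the voice data packet was being obtained through the media channel of the voice recognition service, the media resources requested by the number-receiving service and the playback service were already released and their media channels closed. After the text information has been obtained, only the media resources requested by the voice recognition service need to be released and its media channel closed. The media resources requested by the voice recognition service include the second external interface unit, the second inlet resource, and the second outlet resource.
Step S1076: close the voice recognition channel between the media server and the voice recognition server; specifically, the service control unit of the media server closes the voice recognition channel by sending a bye message to the voice recognition server, which releases the related session resources on the voice recognition server.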
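The service triggering method of the embodiments of the present invention is further described below through a concrete example. Figure 2 is a sequence diagram of the method when the trigger signal is a number data packet, and Figure 3 is a sequence diagram of the method when the trigger signal is a voice data packet. Steps 1 to 21 are the same in Figure 2 and Figure 3. After step 21, depending on the type of trigger signal received, either steps 31 to 38 of Figure 2 or steps 41 to 52 of Figure 3 are performed.
Steps 1 to 21 are described first. As shown in Figure 2 or Figure 3, the service triggering method includes:
Step 1: APP sends an invite message to MS; the invite message is used for media negotiation between APP and MS;
Step 2: Call sends a GetResReq(mstuout1) request to ResManage to request the first external interface unit of the MSTU, that is, the Mstuout1 external port resource;
Step 3: ResManage sends a GetResRsp message to Call, indicating that the first external interface unit was reserved successfully;
Step 4: MS and APP complete media negotiation, and MS sends a 200 OK message to APP;
Step 5: APP sends an ACK message to MS to confirm that media negotiation succeeded;
Step 6: APP sends an info(<play><speech><dtmf>) message, which contains three services: the play playback service, the dtmf number-receiving service, and the speech voice recognition service. MS parses the info message and fixes the service execution order as play, speech, dtmf, which simplifies the subsequent resource requests and channel opening;
Step 7: Call sends a GetResReq(mstuin, mru1, mru2) request to ResManage to request the first internal interface unit, the first inlet resource, and the first outlet resource for the playback service, that is, the mstuin, mru1, and mru2 resources;
Step 8: ResManage sends a GetResRsp message to Call, indicating that the resource request succeeded;
Step 9: Call sends an OpenResReq(mstuout1-nat, mru2, mru1, mstuin) request to ResBoard to notify ResBoard to open the media channels of the first internal interface unit, the first inlet resource, and the first outlet resource, that is, the Mstuin, Mru1, and Mru2 resources;
Step 10: ResBoard sends an OpenResRsp message to Call, indicating that the media channels of the Mstuin, Mru1, and Mru2 resources have been opened successfully;
Step 11: MS sends an invite message to the ASR server; the invite message is used for media negotiation between MS and ASR;
Step 12: MS and ASR complete media negotiation, and ASR sends a 200 OK message to MS;
Step 13: MS sends an MrcpReq request to ASR to notify the ASR server to prepare to receive audio input;
Step 14: ASR replies to MS with an MrcpRsp response to notify MS that the ASR server is ready;
Step 15: Call sends a GetResReq(mstuout2-nat, playmru2, playmru1) request to ResManage to request the second external interface unit, the second inlet resource, and the second outlet resource for the voice recognition service, that is, the mstuout2, playmru1, and playmru2 resources;
Step 16: ResManage sends a GetResRsp message to Call, indicating that the resource request succeeded;
Step 17: Call sends an OpenResReq(mstuout2-nat, playmru2, playmru1) request to ResBoard to notify ResBoard to open the media channels of the second external interface unit, the second inlet resource, and the second outlet resource, that is, the mstuout2, playmru1, and playmru2 resources;
Step 18: ResBoard sends an OpenResRsp message to Call, indicating that the media channels of the mstuout2, playmru1, and playmru2 resources have been opened successfully; the media server then waits for the audio stream sent by the terminal;
Step 19: Call sends an OpenDtmfChannel(playmru2) message to ResBoard to notify ResBoard to open the media channel of the second inlet resource for the number-receiving service, that is, the Call module notifies the PlayMru2 resource in ResBoard to receive numbers;
Step 20: ResBoard sends a GetResRsp message to Call, indicating that the media channel of the PlayMru2 resource has been opened successfully;
Step 21: MS replies to the info with a 200 OK message, notifying APP that the media channels inside the media server for the three services, playback (play), voice recognition (speech), and number receiving (dtmf), have been opened; MS starts playing the prompt tone and waits for the terminal to send DTMF digits or a voice stream.
At this point, if MS first receives a number data packet while the prompt tone is playing, steps 31 to 38 shown in Figure 2 are performed; if it first receives a voice data packet, steps 41 to 52 shown in Figure 3 are performed.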
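As shown in Figure 2, when a number data packet is received, the method includes:
Step 31: Call receives a RecDtmf message from ResBoard, indicating that the MS module has received the number data packet, finished parsing it, and obtained the number information it contains;
Step 32: MS sends an info(SendDtmf) message to APP to report the obtained number information;
Step 33: Call sends a ClossAllRes(mstu, mru) message to the ResBoard module to notify it to close the media channels opened for the play, dtmf, and speech services;
Step 34: Call sends a ReleaseAllRes(mstu, mru) message to ResManage to notify the ResManage module to release the media resources requested by play, dtmf, and speech, that is, mstuout2, mru1, mru2, playmru1, playmru2, and mstuin;
Step 35: the Call module sends a Bye message to the ASR server to release the related session resources on the ASR server;
Step 36: ASR sends a ByeRsp message to Call to confirm that the resources have been released;
Step 37: MS receives a 200 OK message from APP, indicating that the number information report is complete;
Step 38: based on the number information reported by MS, APP sends a new service message info to MS.
As shown in Figure 3, when a voice data packet is received, the method includes:
Step 41: the ASR server sends a RecVoiceReport message to MS to notify the media server that ASR has detected voice input;
Step 42: Call sends a ClossAllRes(mstuin, mru1, mru2) message to the ResBoard module to notify it to close the mstuin, mru1, and mru2 resources requested by the play service;
Step 43: Call sends a ReleaseAllRes(mstuin, mru1, mru2) message to ResManage to notify the ResManage module to release the media resources requested by play, that is, the mstuin, mru1, and mru2 resources;
Step 44: Call sends a ClossDtmfChannel(playmru2) message to the ResBoard module to notify it that the playmru2 channel should stop receiving numbers, which amounts to switching the playmru2 channel from the DTMF service to receiving voice data packets;
Step 45: ASR sends a SendAsrResult message to MS; after the ASR server finishes recognizing the audio, this message carries the text information corresponding to the voice data packet to MS;
Step 46: MS sends an info(SendSpeechResult) message to APP to report the text information recognized by the ASR server;
Step 47: Call sends a ClossAllRes(mstu, mru) message to the ResBoard module to notify it to close the media channels of mstuout2, playmru1, and playmru2 opened for the speech service;
Step 48: Call sends a ReleaseAllRes(mstu, mru) message to ResManage to notify the ResManage module to release the resources requested for the speech service, that is, the mstuout2, playmru1, and playmru2 resources;
Step 49: the Call module sends a bye message to the ASR server to notify it to release its related session resources;
Step 50: ASR sends a ByeRsp message to Call to confirm that the resources have been released;
Step 51: MS receives a 200 OK message from APP, indicating that the report of the text information corresponding to the voice data packet is complete;
Step 52: based on the text of the voice recognition result reported by MS, APP sends a new service message info to MS.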
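The two branches that follow step 21 can be written down as data, which makes it easy to compare their length and ordering. The message names and step numbers below are taken from the steps above; the tuple layout and the functions `branch_after_prompt` and `render` are hypothetical helpers added for illustration.

```python
from typing import List, Tuple

Message = Tuple[int, str, str, str]  # (step, sender, receiver, message)

# Steps 31-38: a number data packet arrived first (Figure 2).
NUMBER_BRANCH: List[Message] = [
    (31, "ResBoard", "Call", "RecDtmf"),
    (32, "MS", "APP", "info(SendDtmf)"),
    (33, "Call", "ResBoard", "ClossAllRes(mstu, mru)"),
    (34, "Call", "ResManage", "ReleaseAllRes(mstu, mru)"),
    (35, "Call", "ASR", "Bye"),
    (36, "ASR", "Call", "ByeRsp"),
    (37, "APP", "MS", "200 OK"),
    (38, "APP", "MS", "info"),
]

# Steps 41-52: a voice data packet arrived first (Figure 3).
VOICE_BRANCH: List[Message] = [
    (41, "ASR", "MS", "RecVoiceReport"),
    (42, "Call", "ResBoard", "ClossAllRes(mstuin, mru1, mru2)"),
    (43, "Call", "ResManage", "ReleaseAllRes(mstuin, mru1, mru2)"),
    (44, "Call", "ResBoard", "ClossDtmfChannel(playmru2)"),
    (45, "ASR", "MS", "SendAsrResult"),
    (46, "MS", "APP", "info(SendSpeechResult)"),
    (47, "Call", "ResBoard", "ClossAllRes(mstu, mru)"),
    (48, "Call", "ResManage", "ReleaseAllRes(mstu, mru)"),
    (49, "Call", "ASR", "bye"),
    (50, "ASR", "Call", "ByeRsp"),
    (51, "APP", "MS", "200 OK"),
    (52, "APP", "MS", "info"),
]


def branch_after_prompt(first_trigger: str) -> List[Message]:
    """Pick the message sequence that follows step 21 for the given trigger kind."""
    if first_trigger == "number":
        return NUMBER_BRANCH
    if first_trigger == "voice":
        return VOICE_BRANCH
    raise ValueError(f"unknown trigger kind: {first_trigger!r}")


def render(branch: List[Message]) -> List[str]:
    """Format a branch as readable trace lines."""
    return [f"step {step}: {src} -> {dst}: {msg}" for step, src, dst, msg in branch]
```

Laid out this way, the difference between the branches is visible at a glance: the voice branch tears down the play and dtmf channels early (steps 42 to 44), before the recognition result arrives, while the number branch releases everything in one pass after the report.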
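In the embodiments of the present invention, the MSML/MOML protocols are open protocols that follow the extension principles of SIP; without changing SIP itself, they provide a good framework for extension. MSML/MOML operate through the message bodies of SIP INFO and INVITE messages. In the SIP/MSML/MOML combination, SIP is used to establish, modify, and delete sessions, while the XML-based MSML/MOML provide the control interface for media processing. MSML is the interface for controlling media streams and the internal conference resources of the media server; MOML is used to control the complex media processing objects involved in media streams and conferences. Together, SIP and MSML/MOML form a powerful interface framework for interaction between the application server and the media server.
Figure 4 is a media stream flow diagram of the service triggering method according to an embodiment of the present invention. As shown in Figure 4, for the playback service, the MSTU stores audio data and can play audio files. Through the first internal interface unit mstuin of the MSTU, the audio data packets to be played are sent to the MRU through the first inlet resource mru1, where the audio file is transcoded, and the prompt tone is then played to the terminal through the first outlet resource mru2 and the first external interface unit mstuout1. After the terminal generates a trigger signal, the media server receives it through the first external interface unit mstuout1, and the MRU receives it through the second inlet resource PlayMru1. If the trigger signal is a number data packet, the PlayMru2 resource receives the number; if the trigger signal is a voice data packet, the packet is sent to the ASR server through the second outlet resource PlayMru2 and the second external interface unit of the MSTU, and the ASR server converts the speech to text.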
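From the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present invention that is essential, or that contributes over the related art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
This embodiment also provides a service triggering apparatus, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Figure 5 is a structural block diagram of the service triggering apparatus according to an embodiment of the present invention. As shown in Figure 5, the apparatus includes: a creation module 10, configured to create a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service; a receiving module 20, configured to receive, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service; an obtaining module 30, configured to obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service; and a sending module 40, configured to report the obtained information to the application server.
Optionally, the obtaining module 30 further includes: a first obtaining submodule 301, configured to obtain, when the trigger signal is a number data packet, the number information contained in the number data packet through the media channel of the number-receiving service; and a second obtaining submodule 302, configured to obtain, when the trigger signal is a voice data packet, the text information corresponding to the voice data packet through the media channel of the voice recognition service.
Optionally, the creation module 10 includes: a first receiving unit, configured to receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service; a setting unit, configured to set the creation order of the media channels of the playback service, the voice recognition service, and the number-receiving service; and a creation unit, configured to create the three media channels in the set order.
Optionally, the first obtaining submodule 301 includes: a second receiving unit, configured to receive the number data packet through the media channel of the number-receiving service; and a first obtaining unit, configured to parse the number data packet and obtain the number information it contains.
Optionally, the second obtaining submodule 302 includes: a sending unit, configured to send the voice data packet to the voice recognition server through the media channel of the voice recognition service; a processing unit, configured to disconnect the media channels of the playback service and the number-receiving service; and a third receiving unit, configured to receive the text information corresponding to the voice data packet returned by the voice recognition server.
Optionally, the creation unit includes: a receiving subunit, configured to receive resource request information, where the resource request information is used to reserve playback service resources, voice recognition service resources, or number-receiving service resources; and a processing subunit, configured to open the media channel of the reserved playback service resources, voice recognition service resources, or number-receiving service resources.
It should be noted that the above modules can be implemented in software or hardware. For the latter, they can be implemented in the following ways, among others: the modules are all located in the same processor, or the modules are located in multiple processors.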
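This embodiment also provides a media server, which is configured to implement the above embodiments and preferred implementations; what has already been described is not repeated. Figure 6 is a structural block diagram of the media server according to an embodiment of the present invention. As shown in Figure 6, the media server includes any one of the service triggering apparatuses of the embodiments of the present invention.
This embodiment also provides a service triggering system, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated.
Figure 7 is a structural block diagram of the service triggering system according to an embodiment of the present invention. As shown in Figure 7, the system includes:
the media server 1 of any one of the above implementations, and an application server 2, where the application server receives the number information or text information sent by the media server.
Optionally, media channels of the playback service, the voice recognition service, and the number-receiving service are created in the media server 1. The media server is configured to receive, while a prompt tone is played through the media channel of the playback service, a trigger signal sent by the terminal for triggering a service; when the trigger signal is a number data packet, to obtain the number information contained in the packet through the media channel of the number-receiving service; when the trigger signal is a voice data packet, to obtain the text information corresponding to the packet through the media channel of the voice recognition service; and to send the number information or text information to the application server;
Optionally, the service triggering system further includes a voice recognition server, which receives the voice data packet sent by the media server, converts it into the corresponding text information, and sends the text information to the media server.
Optionally, in the service triggering system, the media server 1 is further configured to receive a service message sent by the application server 2, where the service message includes the playback service, the voice recognition service, and the number-receiving service; the media server 1 sets the creation order of the media channels of the three services and creates them in that order. Optionally, the set creation order is: the media channel of the playback service, then the media channel of the voice recognition service, then the media channel of the number-receiving service.
Optionally, in the service triggering system, the media server 1 is further configured to perform media negotiation with the application server 2, including: the application server 2 sends an invite to the media server 1 to request media negotiation; the service control unit in the media server 1 requests the first external port resource from the logical resource management module, and after the request succeeds the logical resource management module replies to the service control unit that the resource request succeeded; the application server 2 and the media server 1 complete media negotiation, and the media server 1 sends a confirmation message, for example a 200 OK message, to the application server 2; the application server 2 sends a confirmation message, for example an ack message, to the media server 1 to confirm that media negotiation succeeded.
Optionally, when the trigger signal is a number data packet, the media server 1 is further configured to: receive the number data packet through the media channel of the number-receiving service, specifically through the second inlet resource; and parse the number data packet to obtain the number information it contains, specifically by having the media processing unit perform the DTMF decoding.
Optionally, when the trigger signal is a number data packet, the media server 1 is further configured so that: the service control unit notifies the logical resource management module to release the resources requested by the playback service, the number-receiving service, and the voice recognition service; the service control unit notifies the resource board processing module to close the media channels of the three services, where, specifically, the service control unit notifies the logical resource management module to release the second external interface unit, the first inlet resource, the first outlet resource, the second inlet unit, the second outlet unit, and the first internal interface unit; and the service control unit closes the voice recognition channel between the media server and the voice recognition server, specifically by sending a bye message to the voice recognition server, which releases the related session resources on the voice recognition server.
Optionally, when the trigger signal is a voice data packet, the media server 1 is further configured to: send the voice data packet to the voice recognition server through the media channel of the voice recognition service, disconnect the media channels of the playback service and the number-receiving service, and receive the text information corresponding to the voice data packet returned by the voice recognition server. Specifically, the voice data packet is received through the second inlet resource and, after any necessary processing in the media processing unit, is sent to the voice recognition server through the second outlet resource and the second external interface unit. After receiving the notification from the voice recognition server indicating that it has detected voice input, the service control unit notifies the logical resource management module to release the media resources requested by the playback service and the number-receiving service, and notifies the resource board processing module to disconnect the media channels of those media resources. The media resources requested by the playback service include the first inlet resource, the first outlet resource, and the first internal interface unit; the media resources requested by the number-receiving service include the third inlet unit. Optionally, when the media resources requested by the number-receiving service include the second inlet unit, the second inlet unit also belongs to the media resources requested by the voice recognition service; in that case the second inlet unit is not released and its channel is not disconnected, but the media resources requested by the number-receiving service are regarded as released and its media channel as disconnected. After recognition is complete, the voice recognition server sends the result to the service control unit of the media server.
Optionally, when the trigger signal is a voice data packet, the media server 1 is further configured so that: the service control unit notifies the logical resource management module to release the media resources requested by the voice recognition service; and the service control unit notifies the resource board processing module to close the media channel of the voice recognition service. Specifically, while the text information corresponding to the voice data packet was being obtained through the media channel of the voice recognition service, the media resources requested by the number-receiving service and the playback service were already released and their media channels closed. After the text information has been obtained, only the media resources requested by the voice recognition service need to be released and its media channel closed. The media resources requested by the voice recognition service include the second external interface unit, the second inlet resource, and the second outlet resource. The service control unit also closes the voice recognition channel between the media server and the voice recognition server, specifically by sending a bye message to the voice recognition server, which releases the related session resources on the voice recognition server.
Optionally, the media server 1 is further configured to send a message to the application server 2 to notify the application server 2 that the media channels for the playback service, the voice recognition service, and the number-receiving service have been opened, then to start playing the prompt tone and wait for the terminal to send a number data packet or a voice data packet.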
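An embodiment of the present invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1, create a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service;
S2, while a prompt tone is played through the media channel of the playback service, receive a trigger signal for triggering a service;
S3, obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service;
S4, report the obtained information contained in the trigger signal to the application server.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1, when the trigger signal is a number data packet, obtain the number information contained in the number data packet through the media channel of the number-receiving service;
S2, when the trigger signal is a voice data packet, obtain the text information corresponding to the voice data packet through the media channel of the voice recognition service.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1, receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service;
S2, set the creation order of the media channels of the playback service, the voice recognition service, and the number-receiving service;
S3, in the set order, create the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: receive the number data packet through the media channel of the number-receiving service;
S2: parse the number data packet and obtain the number information it contains.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: send the voice data packet to the voice recognition server through the media channel of the voice recognition service;
S2: disconnect the media channels of the playback service and the number-receiving service;
S3: receive the text information corresponding to the voice data packet returned by the voice recognition server.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: receive resource request information, where the resource request information is used to reserve playback service resources;
S2: open the media channel of the reserved playback service resources.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: receive resource request information, where the resource request information is used to reserve voice recognition service resources;
S2: open the media channel of the reserved voice recognition service resources.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: receive resource request information, where the resource request information is used to reserve number-receiving service resources;
S2: open the media channel of the reserved number-receiving service resources.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: release the resources reserved for the playback service, the number-receiving service, and the voice recognition service;
S2: close the media channels of the playback service resources, the number-receiving service resources, and the voice recognition service resources;
S3: close the voice recognition channel between the media server and the voice recognition server.
Optionally, the storage medium is further configured to store program code for performing the following steps:
S1: release the resources reserved for the voice recognition service;
S2: close the media channel of the voice recognition service resources;
S3: close the voice recognition channel between the media server and the voice recognition server.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.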
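Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1, create a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service;
S2, while a prompt tone is played through the media channel of the playback service, receive a trigger signal for triggering a service;
S3, obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service;
S4, report the obtained information contained in the trigger signal to the application server.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1, when the trigger signal is a number data packet, obtain the number information contained in the number data packet through the media channel of the number-receiving service;
S2, when the trigger signal is a voice data packet, obtain the text information corresponding to the voice data packet through the media channel of the voice recognition service.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1, receive a service message sent by the application server, where the service message includes the playback service, the voice recognition service, and the number-receiving service;
S2, set the creation order of the media channels of the playback service, the voice recognition service, and the number-receiving service;
S3, in the set order, create the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: receive the number data packet through the media channel of the number-receiving service;
S2: parse the number data packet and obtain the number information it contains.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: send the voice data packet to the voice recognition server through the media channel of the voice recognition service;
S2: disconnect the media channels of the playback service and the number-receiving service;
S3: receive the text information corresponding to the voice data packet returned by the voice recognition server.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: receive resource request information, where the resource request information is used to reserve playback service resources;
S2: open the media channel of the reserved playback service resources.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: receive resource request information, where the resource request information is used to reserve voice recognition service resources;
S2: open the media channel of the reserved voice recognition service resources.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: receive resource request information, where the resource request information is used to reserve number-receiving service resources;
S2: open the media channel of the reserved number-receiving service resources.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: release the resources reserved for the playback service, the number-receiving service, and the voice recognition service;
S2: close the media channels of the playback service resources, the number-receiving service resources, and the voice recognition service resources;
S3: close the voice recognition channel between the media server and the voice recognition server.
Optionally, in this embodiment, the processor performs the following according to the program code stored in the storage medium:
S1: release the resources reserved for the voice recognition service;
S2: close the media channel of the voice recognition service resources;
S3: close the voice recognition channel between the media server and the voice recognition server.
Optionally, for specific examples of this embodiment, refer to the examples described in the above embodiments and optional implementations; they are not repeated here.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from the one given here, or the modules or steps may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Industrial Applicability
In the embodiments of the present invention, a media channel of the playback service, a media channel of the voice recognition service, and a media channel of the number-receiving service are created; while the prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service is received; when the trigger signal is a number data packet, the number information contained in the packet is obtained through the media channel of the number-receiving service; when the trigger signal is a voice data packet, the text information corresponding to the packet is obtained through the media channel of the voice recognition service; and the number information or the text information is reported to the application server. This solves the problem in the related art that a service can only be triggered by passively accepting one of the number-receiving mode and the voice recognition mode, and achieves the effect of actively selecting the number-receiving mode or the voice recognition mode according to the input trigger signal.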

Claims (18)

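  1. A service triggering method, comprising:
    creating a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number-receiving service;
    receiving, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service;
    obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service;
    reporting the obtained information contained in the trigger signal to an application server.
  2. The method according to claim 1, wherein obtaining the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service comprises:
    when the trigger signal is a number data packet, obtaining the number information contained in the number data packet through the media channel of the number-receiving service;
    when the trigger signal is a voice data packet, obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service.
  3. The method according to claim 1 or 2, wherein creating the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service comprises:
    receiving a service message sent by the application server, wherein the service message includes the playback service, the voice recognition service, and the number-receiving service;
    setting the creation order of the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service;
    creating, in the set creation order, the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
  4. The method according to claim 3, wherein the set creation order is: the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
  5. The method according to claim 2, wherein obtaining the number information contained in the number data packet through the media channel of the number-receiving service comprises:
    receiving the number data packet through the media channel of the number-receiving service;
    parsing the number data packet to obtain the number information contained in the number data packet.
  6. The method according to claim 2, wherein obtaining the text information corresponding to the voice data packet through the media channel of the voice recognition service comprises:
    sending the voice data packet to a voice recognition server through the media channel of the voice recognition service;
    disconnecting the media channels of the playback service and the number-receiving service;
    receiving the text information corresponding to the voice data packet returned by the voice recognition server.
  7. The method according to claim 2, wherein creating the media channel of the playback service, or the media channel of the voice recognition service, or the media channel of the number-receiving service comprises:
    receiving resource request information, wherein the resource request information is used to reserve playback service resources, or voice recognition service resources, or number-receiving service resources;
    opening the media channel of the reserved playback service resources, or voice recognition service resources, or number-receiving service resources.
  8. The method according to claim 7, wherein, after the number information contained in the number data packet is obtained through the media channel of the number-receiving service, the method further comprises:
    releasing the resources reserved for the playback service, the number-receiving service, and the voice recognition service;
    closing the media channels of the playback service resources, the number-receiving service resources, and the voice recognition service resources;
    closing the voice recognition channel between the media server and the voice recognition server.
  9. The method according to claim 7, wherein, after the text information corresponding to the voice data packet is obtained through the media channel of the voice recognition service, the method further comprises:
    releasing the resources reserved for the voice recognition service;
    closing the media channel of the voice recognition service resources;
    closing the voice recognition channel between the media server and the voice recognition server.
  10. A service triggering apparatus, comprising:
    a creation module, configured to create a media channel of a playback service, a media channel of a voice recognition service, and a media channel of a number-receiving service;
    a receiving module, configured to receive, while a prompt tone is played through the media channel of the playback service, a trigger signal for triggering a service;
    an obtaining module, configured to obtain the information contained in the trigger signal through the media channel of the number-receiving service or the media channel of the voice recognition service;
    a sending module, configured to report the obtained information contained in the trigger signal to an application server.
  11. The apparatus according to claim 10, wherein the obtaining module comprises:
    a first obtaining submodule, configured to obtain, when the trigger signal is a number data packet, the number information contained in the number data packet through the media channel of the number-receiving service;
    a second obtaining submodule, configured to obtain, when the trigger signal is a voice data packet, the text information corresponding to the voice data packet through the media channel of the voice recognition service.
  12. The apparatus according to claim 10, wherein the creation module comprises:
    a first receiving unit, configured to receive a service message sent by the application server, wherein the service message includes the playback service, the voice recognition service, and the number-receiving service;
    a setting unit, configured to set the creation order of the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service;
    a creation unit, configured to create, in the set creation order, the media channel of the playback service, the media channel of the voice recognition service, and the media channel of the number-receiving service.
  13. The apparatus according to claim 11, wherein the first obtaining submodule comprises:
    a second receiving unit, configured to receive the number data packet through the media channel of the number-receiving service;
    a first obtaining unit, configured to parse the number data packet and obtain the number information contained in the number data packet.
  14. The apparatus according to claim 11, wherein the second obtaining submodule comprises:
    a sending unit, configured to send the voice data packet to a voice recognition server through the media channel of the voice recognition service;
    a processing unit, configured to disconnect the media channels of the playback service and the number-receiving service;
    a third receiving unit, configured to receive the text information corresponding to the voice data packet returned by the voice recognition server.
  15. The apparatus according to claim 12, wherein the creation unit comprises:
    a receiving subunit, configured to receive resource request information, wherein the resource request information is used to reserve playback service resources, or voice recognition service resources, or number-receiving service resources;
    a processing subunit, configured to open the media channel of the reserved playback service resources, or voice recognition service resources, or number-receiving service resources.
  16. A media server, comprising the apparatus according to any one of claims 10 to 15.
  17. A service triggering system, comprising: the media server according to claim 16;
    an application server, wherein the application server is configured to receive the information contained in the trigger signal sent by the media server.
  18. The system according to claim 17, further comprising a voice recognition server, wherein the voice recognition server is configured to receive the voice data packet sent by the media server, convert the voice data packet into the corresponding text information, and send the text information to the media server.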
PCT/CN2016/073370 2015-04-23 2016-02-03 Service triggering method, apparatus, system, and media server WO2016169319A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510196967.6 2015-04-23
CN201510196967.6A CN106161407A (zh) 2015-04-23 2015-04-23 Service triggering method, apparatus, system, and media server

Publications (1)

Publication Number Publication Date
WO2016169319A1 true WO2016169319A1 (zh) 2016-10-27

Family

ID=57142856

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/073370 WO2016169319A1 (zh) 2015-04-23 2016-02-03 Service triggering method, apparatus, system, and media server

Country Status (2)

Country Link
CN (1) CN106161407A (zh)
WO (1) WO2016169319A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110875058A (zh) * 2018-08-31 2020-03-10 中国移动通信有限公司研究院 Voice communication processing method, terminal device, and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6999564B1 (en) * 2002-03-29 2006-02-14 Nortel Networks Limited System and method for telephonic switching and signaling based on voice recognition
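CN101163119A (zh) * 2006-10-10 2008-04-16 中兴通讯股份有限公司 Method for processing user voice dialing in an access gateway
CN101621712A (zh) * 2009-07-22 2010-01-06 中兴通讯股份有限公司 System and method for implementing voice recognition in a ring-back tone system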

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
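CN1234229C (zh) * 2003-01-15 2005-12-28 唐海县信通物资经销公司 Telephone control method and system
CN101222541A (zh) * 2005-10-21 2008-07-16 华为技术有限公司 Method for implementing a voice recognition function
CN101437047B (zh) * 2008-12-09 2012-09-05 中兴通讯股份有限公司 Method, system, and media server for playing audio to and recording audio from a user terminal
CN102148911A (zh) * 2011-01-25 2011-08-10 中兴通讯股份有限公司 Method and apparatus for implementing a voice management service
CN104092829A (zh) * 2014-07-21 2014-10-08 苏州工业园区服务外包职业学院 Voice calling method and access gateway based on voice recognition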

Also Published As

Publication number Publication date
CN106161407A (zh) 2016-11-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16782467

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16782467

Country of ref document: EP

Kind code of ref document: A1