CN105206273B - Voice transfer control method and system - Google Patents

Publication number
CN105206273B
Authority
CN
China
Prior art keywords
voice
mrcp
voice instruction
control
answer device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510560931.1A
Other languages
Chinese (zh)
Other versions
CN105206273A (en)
Inventor
李波
陈迪
朱频频
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhizhen Intelligent Network Technology Co Ltd
Original Assignee
Shanghai Zhizhen Intelligent Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhizhen Intelligent Network Technology Co Ltd
Priority to CN201510560931.1A
Publication of CN105206273A
Application granted
Publication of CN105206273B
Legal status: Active
Anticipated expiration

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

A voice transmission control method and system. The method includes: a voice answer device receives a voice instruction; the voice instruction is sent to a voice control server over an MRCP bearer channel using the MRCP protocol; the voice control server performs semantic recognition on the voice instruction; corresponding MRCP control information is generated based on the recognition result; the MRCP control information is sent to the voice answer device over the MRCP bearer channel; and the voice answer device performs the corresponding operation according to the received MRCP control information. The method can simplify the system architecture and workflow of a communication system and reduce its design difficulty.

Description

Voice transfer control method and system
Technical field
The present invention relates to the field of voice transmission technology, and in particular to a voice transmission control method and system.
Background
At present, enterprise application systems may include an enterprise communication platform, a UC system, and the like, which make work more convenient for employees. For example, employees can call one another through the enterprise communication platform, or convene a multimedia conference through the UC system.
When using the enterprise communication platform or the UC system, an employee must first look up the target phone number in an address book and then enter that number manually by keypad into the enterprise communication platform or UC system before the communication can take place.
To make enterprise application systems easier to use, they have been improved so that they can perform operations based on an employee's voice instructions. However, these improvements make the architecture and workflow of the enterprise application system very complex, and the system is difficult to design.
Summary of the invention
The problem solved by the present invention is how to simplify the system architecture and workflow of a communication system and reduce its design difficulty.
An embodiment of the present invention provides a voice transmission control method, the method comprising:
A voice answer device receives a voice instruction;
The voice instruction is sent to a voice control server over an MRCP bearer channel using the MRCP protocol;
The voice control server performs semantic recognition on the voice instruction;
Corresponding MRCP control information is generated based on the recognition result;
The MRCP control information is sent to the voice answer device over the MRCP bearer channel;
The voice answer device performs the corresponding operation according to the received MRCP control information.
Optionally, the MRCP control information is first MRCP control information, which includes: a first MRCP control instruction that controls the voice answer device to broadcast a voice prompt, and the storage address of the data corresponding to the prompt to be broadcast;
The voice answer device performing the corresponding operation according to the received MRCP control information comprises: the voice answer device, according to the first MRCP control instruction, obtains the data corresponding to the voice prompt and broadcasts the prompt, prompts the user to provide input, and sends the voice instruction entered by the user to the voice control server over the MRCP bearer channel.
Optionally, after the voice answer device receives the first MRCP control information, the method further comprises:
Sending a speech synthesis request to a speech synthesis device;
The speech synthesis device synthesizes the MRCP control information into corresponding voice according to the speech synthesis request and sends the voice to the voice control server.
Optionally, when the voice answer device receives the voice instruction, the method further comprises:
The voice answer device sends the voice instruction to a voice conversion device;
The voice conversion device converts the voice instruction into corresponding text data;
The text data corresponding to the voice instruction is sent to the voice control server using the MRCP protocol.
Optionally, the MRCP control information is second MRCP control information, which includes: response target information corresponding to the voice instruction, and a second MRCP control instruction that controls the voice answer device to execute the voice instruction;
The voice answer device performing the corresponding operation according to the received MRCP control information comprises:
The voice answer device executes the voice instruction according to the second MRCP control instruction and the response target information.
Optionally, the response target information is a destination number.
Optionally, the MRCP bearer channel uses the Session Initiation Protocol (SIP) as its data bearer protocol.
Optionally, the voice answer device receives the voice instruction through a human-computer interaction interface or by remote control.
Optionally, before the voice answer device receives the voice instruction, the method further comprises:
The voice answer device receives a trigger action from the user;
An operation request is sent to the voice control server according to the trigger action;
The voice control server sends corresponding MRCP control information to the voice answer device according to the operation request;
The voice answer device receiving the voice instruction comprises: the voice answer device receives the voice instruction according to the MRCP control information.
An embodiment of the present invention also provides a voice transmission control system, the system comprising:
A voice answer device, adapted to receive a voice instruction; send the voice instruction to a voice control server over an MRCP bearer channel using the MRCP protocol; and perform the corresponding operation according to the received MRCP control information;
The voice control server, adapted to perform semantic recognition on the voice instruction; generate corresponding MRCP control information based on the recognition result; and send the MRCP control information to the voice answer device over the MRCP bearer channel.
Optionally, the voice answer device comprises:
A first receiving unit, adapted to receive the voice instruction;
A first transmission unit, adapted to send the voice instruction over the MRCP bearer channel, using the MRCP protocol, to the MRCP service unit of the voice control server;
An operating unit, adapted to perform the corresponding operation according to the received MRCP control information;
The voice control server comprises an MRCP service unit and a control device, wherein
The control device comprises:
A recognition unit, adapted to perform semantic recognition on the voice instruction;
A generation unit, adapted to generate corresponding MRCP control information based on the recognition result;
The MRCP service unit comprises:
A second transmission unit, adapted to send the MRCP control information to the voice answer device over the MRCP bearer channel.
Optionally, the MRCP control information generated by the generation unit is first MRCP control information, which includes: a first MRCP control instruction that controls the voice answer device to broadcast a voice prompt, and the storage address of the data corresponding to the prompt to be broadcast;
The operating unit is adapted to obtain the data corresponding to the voice prompt according to the first MRCP control instruction, broadcast the prompt, and prompt the user to provide input;
The first transmission unit is further adapted to send the voice instruction entered by the user to the voice control server over the MRCP bearer channel.
Optionally, the voice answer device is further adapted to send a speech synthesis request to a speech synthesis device after receiving the first MRCP control information;
The system further comprises: a speech synthesis device, adapted to synthesize the first MRCP control information into corresponding voice according to the speech synthesis request and send the voice to the voice answer device.
Optionally, the voice answer device is further adapted to send the voice instruction to a voice conversion device upon receiving the voice instruction;
Optionally, the MRCP control information generated by the generation unit is second MRCP control information, which includes: response target information corresponding to the voice instruction, and a second MRCP control instruction that controls the voice answer device to execute the voice instruction;
The operating unit is adapted to execute the voice instruction according to the second MRCP control instruction and the response target information.
Optionally, the system further comprises a voice conversion device, adapted to convert the voice instruction received by the voice answer device into corresponding text data;
The first transmission unit is adapted to send the text data corresponding to the voice instruction to the MRCP service unit over a network data transmission channel.
Optionally, the response target information is a destination number.
Optionally, the voice answer device further comprises:
A second receiving unit, adapted to receive a trigger action from the user before the voice answer device receives the voice instruction;
A third transmission unit, adapted to send an operation request to the voice control server according to the trigger action;
The voice control server further comprises:
A fourth transmission unit, adapted to send corresponding MRCP control information to the voice answer device according to the operation request;
The first receiving unit of the voice answer device is adapted to receive the voice instruction according to the MRCP control information.
Optionally, the voice answer device is located in a third-party service, and the voice control server is integrated in an artificial intelligence robot.
Compared with the prior art, the technical solution of the present invention has at least the following advantages:
The voice answer device receives the voice instruction, the voice control server recognizes it and generates corresponding control information from the recognition result, and the control information then controls the voice answer device to perform the corresponding operation. This simplifies the system architecture and workflow for transmission control of voice instructions. In addition, using the MRCP protocol as the control protocol between the voice answer device and the voice control server effectively reduces the design difficulty of transmission control of voice instructions.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of a voice transmission control system in an embodiment of the present invention;
Fig. 2 is a flow chart of a voice transmission control method in an embodiment of the present invention;
Fig. 3 is a workflow diagram of a voice transmission control system in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a voice answer device in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a voice control server in an embodiment of the present invention.
Detailed description of the embodiments
Although current enterprise application systems can provide functions such as multi-party calls and call forwarding based on an employee's voice instructions, their architecture and workflow are usually complex, and they are difficult to design.
In view of this problem, embodiments of the present invention provide a voice transmission control method. In the method, a voice answer device receives a voice instruction, a voice control server performs semantic recognition on the voice instruction and generates corresponding control information, and the control information controls the voice answer device to perform the corresponding operation. This simplifies the system architecture and workflow for transmission control of voice instructions. In addition, using the Media Resource Control Protocol (MRCP) as the control protocol between the voice answer device and the voice control server effectively reduces the design difficulty of transmission control of voice instructions.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, an embodiment of the present invention provides a voice transmission control system 10, which may include a voice answer device 11 and a voice control server 12. The voice answer device 11 and the voice control server 12 exchange data using the MRCP protocol over an MRCP bearer channel 13.
The voice answer device 11 is adapted to receive a voice instruction and send it to the voice control server 12 over the MRCP bearer channel 13 using the MRCP protocol. The voice control server 12 is adapted to perform semantic recognition on the voice instruction, generate corresponding MRCP control information based on the recognition result, and send the MRCP control information to the voice answer device 11 over the MRCP bearer channel 13. The voice answer device 11 is further adapted to perform the corresponding operation according to the received MRCP control information.
In a specific implementation, the voice instruction may be natural language spoken by the user. Through this natural language, the user issues a request or command to the voice answer device 11. The voice instruction may contain only target object information, only target operation information, or both. For example, the voice instruction may be "Please find Bin Ya of the sales department", where "Bin Ya of the sales department" is the target object information. The voice instruction may also be "Find Jiang Huai of the technology department for a meeting right away", where "Jiang Huai of the technology department" is the target object information and "meeting" is the target operation information.
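The three possible shapes of a voice instruction described above can be sketched as a small data structure. This is only an illustration: the class name `VoiceInstruction`, its field names, and the `kind` labels are assumptions introduced here and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceInstruction:
    """Parsed content of one spoken instruction (hypothetical field names)."""
    target_object: Optional[str] = None     # who or what the request is about
    target_operation: Optional[str] = None  # what should be done, e.g. "meeting"

    def kind(self) -> str:
        """Classify the instruction by which pieces of information it carries."""
        if self.target_object and self.target_operation:
            return "object+operation"
        if self.target_object:
            return "object-only"
        if self.target_operation:
            return "operation-only"
        return "empty"


# The two examples from the text, already parsed by hand.
lookup = VoiceInstruction(target_object="Bin Ya, sales department")
meeting = VoiceInstruction(
    target_object="Jiang Huai, technology department",
    target_operation="meeting",
)
```

An instruction that carries only one of the two fields is the case in which a server would plausibly need to prompt the user for the missing part.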
In a specific implementation, the voice answer device 11 may be a standalone terminal device, such as a hand-held terminal, a laptop, a network PC, a minicomputer, or a mainframe computer. It may also be located in a third-party service; for example, the voice answer device 11 may be an interactive voice response (IVR) system. Whatever its specific form, the voice answer device 11 only needs to be able to receive the voice instruction, forward it to the voice control server 12 over the MRCP bearer channel 13, and perform the corresponding response operation according to the control information generated by the voice control server 12.
In a specific implementation, the voice control server 12 may include an MRCP service unit 121 and a control device 122. After receiving the voice instruction, the MRCP service unit 121 extracts the MRCP message from the data packet of the voice instruction and converts it into a form that the control device 122 can recognize. The control device 122 then performs semantic recognition on the MRCP message and generates corresponding control information. The control device 122 can then send the generated MRCP control information through the MRCP service unit 121.
It should be noted that the MRCP service unit 121 and the control device 122 may each be an independent dedicated server; for example, the MRCP service unit 121 may be an MRCP proxy server and the control device 122 an artificial intelligence robot. The MRCP service unit 121 and the control device 122 may also share hardware with other services; for example, a dedicated storage area and memory area may be set aside on another server to provide the voice control service. The MRCP service unit 121 and the control device 122 may also both be integrated in a single artificial intelligence robot. Whichever arrangement is used, the voice control server 12 can be connected to the voice answer device 11 through the MRCP bearer channel 13.
In a specific implementation, the MRCP bearer channel 13 may be a wired network data transmission channel or a wireless network data transmission channel. The wireless network may use any of several wireless connection methods, such as WiFi, Bluetooth, or infrared. The specific type of connection between the voice answer device 11 and the voice control server 12 does not limit the present invention, and all such connections fall within the scope of protection of the present invention.
Since MRCP is not a standalone protocol, messages based on MRCP must be carried by another data transport protocol in order to reach the other party. In one embodiment of the present invention, the Session Initiation Protocol (SIP) may be used as the underlying data transport protocol. A SIP session is established through a handshake between the voice answer device and the voice control server, and MRCP messages are then carried over SIP. For example, an MRCP message is encapsulated in the message body of a SIP message, and the exchange of SIP messages between the two parties allows the voice control server to control the voice answer device based on MRCP.
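The encapsulation described above, with an MRCP message placed in the body of a SIP message, can be sketched as follows. This is a deliberately simplified illustration and not a conformant SIP or MRCP implementation: it omits mandatory SIP headers such as Via, Call-ID, and CSeq, and the function names, the content type, the addresses, and the payload string are all hypothetical.

```python
def wrap_in_sip(mrcp_message: str, sender: str, receiver: str) -> str:
    """Place an MRCP payload in the body of a simplified SIP-style message."""
    body = mrcp_message.encode("utf-8")
    headers = [
        f"MESSAGE sip:{receiver} SIP/2.0",
        f"From: <sip:{sender}>",
        f"To: <sip:{receiver}>",
        "Content-Type: application/x-mrcp-sketch",  # hypothetical content type
        f"Content-Length: {len(body)}",             # body length in bytes
    ]
    # A blank line separates the headers from the message body.
    return "\r\n".join(headers) + "\r\n\r\n" + mrcp_message


def unwrap_from_sip(sip_message: str) -> str:
    """Recover the carried payload: everything after the first blank line."""
    _headers, _sep, body = sip_message.partition("\r\n\r\n")
    return body


# Placeholder payload; it does not follow the real MRCP message syntax.
payload = "SPEAK prompt=welcome"
packet = wrap_in_sip(payload, "server@example.invalid", "ivr@example.invalid")
```

On the receiving side, a component playing the role of the MRCP service unit would call something like `unwrap_from_sip` before handing the payload to the control logic.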
It should be noted that the MRCP bearer channel 13 may also use other data transport protocols to carry MRCP messages and is not limited to SIP. Whichever data transport protocol is used to carry MRCP messages, it does not limit the present invention and falls within the scope of protection of the present invention.
In a specific implementation, the voice transmission control system 10 may further include a voice conversion device 14. The voice answer device 11 may first send the voice instruction to the voice conversion device 14. After the voice conversion device 14 converts the voice instruction into corresponding text data, the text data is sent to the voice control server 12 using the MRCP protocol, and the voice control server 12 processes the text data corresponding to the voice instruction.
The Real-time Transport Protocol (RTP) may be used for data transmission between the voice conversion device 14 and the voice answer device 11. Between the voice conversion device 14 and the voice control server 12, SIP may be used as the underlying bearer protocol to carry messages based on MRCP.
In a specific implementation, the voice transmission control system 10 may further include a speech synthesis device 15. After receiving the MRCP control information, the voice answer device 11 may send a speech synthesis request to the speech synthesis device 15. The speech synthesis device 15 synthesizes the MRCP control information into corresponding voice according to the speech synthesis request and then sends the voice to the voice answer device 11.
It should be noted that the voice conversion device 14 and the speech synthesis device 15 may be standalone terminal devices; for example, the voice conversion device 14 or the speech synthesis device 15 may be a hand-held terminal, a laptop, a network PC, a minicomputer, or a mainframe computer. They may also be located in a third-party service; for example, the voice conversion device 14 may be an automatic speech recognition (ASR) system, and the speech synthesis device 15 may be a text-to-speech (TTS) system.
To help those skilled in the art better understand and implement the present invention, the method corresponding to the voice transmission control system is described in detail below.
As shown in Fig. 2, an embodiment of the present invention provides a voice transmission control method. The method is described in detail below with reference to Fig. 1.
Specifically, the method may include the following steps:
Step 21: the voice answer device 11 receives a voice instruction.
In a specific implementation, the voice answer device 11 may receive the voice instruction in several ways. For example, the voice instruction may be received through a human-computer interaction interface, or by remote control.
In a specific implementation, before receiving the voice instruction, the voice answer device 11 may first receive a trigger action from the user, for example pressing the hot key "11" to activate the voice transmission control system 10. After receiving the trigger action, the voice answer device 11 may send an operation request to the voice control server 12 according to the trigger action. The voice control server 12 sends corresponding MRCP control information to the voice answer device 11 according to the operation request, and the voice answer device 11 may perform the corresponding operation according to that MRCP control information, for example playing a welcome message and waiting for the user's voice instruction. In other words, the voice answer device 11 may receive the voice instruction under the control of the MRCP control information sent by the voice control server 12.
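The activation sequence described above (hot key, operation request, control information, welcome prompt, then listening) can be sketched as two cooperating objects. The class names, method names, and the dictionary layout of the control information are assumptions made for illustration; the patent does not specify them, and the network transport between the two sides is left out.

```python
class VoiceControlServer:
    """Stand-in for the voice control server (hypothetical interface)."""

    def handle_operation_request(self, request: str) -> dict:
        # Reply with control information telling the device to greet the
        # user and then start listening for a voice instruction.
        return {"instruction": "PLAY_AND_LISTEN", "prompt": "welcome"}


class VoiceAnswerDevice:
    """Stand-in for the voice answer device (hypothetical interface)."""

    HOT_KEY = "11"  # the example hot key mentioned in the text

    def __init__(self, server: VoiceControlServer) -> None:
        self.server = server
        self.listening = False
        self.played: list[str] = []

    def on_key(self, keys: str) -> bool:
        """Handle a key press; return True if it activated voice control."""
        if keys != self.HOT_KEY:
            return False
        control = self.server.handle_operation_request("activate")
        self.played.append(control["prompt"])  # play the welcome prompt
        self.listening = control["instruction"] == "PLAY_AND_LISTEN"
        return True


device = VoiceAnswerDevice(VoiceControlServer())
device.on_key("11")
```

The point of the sketch is that the device does not decide on its own to start listening; it does so only after the server's control information tells it to.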
Step 22: the voice instruction is sent to the voice control server 12 over the MRCP bearer channel using the MRCP protocol.
In a specific implementation, the voice answer device 11 may send the received voice instruction to the voice control server 12 either directly or indirectly.
When the voice answer device 11 sends the voice instruction directly to the voice control server 12, the voice control server 12 may first convert the voice instruction into corresponding text data and then recognize it.
When the voice answer device 11 sends the voice instruction to the voice control server 12 indirectly, the voice answer device 11 may first send the voice instruction to the voice conversion device 14 according to the MRCP protocol, and the voice conversion device 14 converts the voice instruction into corresponding text data. The voice conversion device 14 may then send the text data corresponding to the voice instruction to the voice control server 12 using the MRCP protocol.
Step 23: the voice control server 12 performs semantic recognition on the voice instruction.
In a specific implementation, after receiving the text data corresponding to the voice instruction, the voice control server 12 may recognize the voice instruction by fuzzy matching on that text data. The fuzzy matching rules may be set by those skilled in the art according to actual use. For example, the text data may first be converted into a corresponding pinyin string, the pinyin string may then be segmented into words, and finally a corpus database may be searched for text corresponding to the segmented pinyin string.
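A minimal sketch of the pinyin-based fuzzy matching described above is given below. It assumes a toy character-to-pinyin table and a two-entry corpus, both hypothetical, and it uses `difflib.get_close_matches` from the Python standard library as a stand-in for whatever matching rule an actual implementation would choose. Segmentation is reduced to one syllable per character.

```python
import difflib
from typing import Optional

# Hypothetical toy table; a real system would use a full pinyin dictionary.
PINYIN = {
    "宾": "bin", "滨": "bin", "雅": "ya",
    "江": "jiang", "淮": "huai", "怀": "huai",
}

# Hypothetical corpus of names the server knows about.
CORPUS = ["宾雅", "江淮"]


def to_pinyin(text: str) -> list[str]:
    """Convert text to a list of pinyin syllables, skipping unknown characters."""
    return [PINYIN[ch] for ch in text if ch in PINYIN]


def fuzzy_lookup(recognized: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the corpus entry whose pinyin is closest to the recognized text."""
    key = " ".join(to_pinyin(recognized))
    index = {" ".join(to_pinyin(entry)): entry for entry in CORPUS}
    matches = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None


# A speech recognizer may return a homophone with the wrong characters;
# matching on pinyin still finds the intended corpus entry.
result = fuzzy_lookup("滨雅")
```

Matching on pronunciation rather than on characters is what lets a misrecognized homophone still resolve to the right entry, which is presumably why the text converts to pinyin before searching.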
Step 24: corresponding MRCP control information is generated based on the recognition result.
In a specific implementation, the MRCP control information may be first MRCP control information. The first MRCP control information may include: a first MRCP control instruction that controls the voice answer device 11 to broadcast a voice prompt, and the storage address of the data corresponding to the prompt to be broadcast.
The voice prompt to be broadcast may be stored in the voice answer device 11, in the voice control server 12, or in another device. The storage medium for the prompt is not limited, provided that the voice answer device 11 can obtain the prompt.
In a specific implementation, the first MRCP control information generated by the voice control server 12 may take the form of voice data or text data. When the first MRCP control information is in text form, the voice answer device 11 may first send a speech synthesis request to the speech synthesis device 15. The speech synthesis device 15 synthesizes the first MRCP control information into corresponding voice according to the speech synthesis request and then sends the voice to the voice control server 12.
In a specific implementation, the MRCP control information may also be second MRCP control information. The second MRCP control information may include: response target information corresponding to the voice instruction, and a second MRCP control instruction that controls the voice answer device to execute the voice instruction. For example, the response target information may be a destination number. The response target information may correspond to the first voice instruction the user enters, or to any voice instruction the user enters during subsequent use.
It should be noted that the response target information may be stored in the voice answer device 11, in the voice control server 12, or in another device. The storage medium for the response target information is not limited, provided that the voice control server 12 can obtain the response target information.
Step 25: the MRCP control information is sent to the voice answer device 11 over the MRCP bearer channel 13.
Step 26: the voice answer device 11 performs the corresponding operation according to the received MRCP control information.
Specifically, when the MRCP control information is first MRCP control information, the voice answer device 11 may, according to the first MRCP control instruction, obtain the data corresponding to the voice prompt and broadcast the prompt, prompt the user to provide input, and send the voice instruction entered by the user to the voice control server 12 over the MRCP bearer channel 13.
When the MRCP control information is second MRCP control information, the voice answer device 11 may execute the voice instruction according to the second MRCP control instruction and the response target information.
For example, when the response target information is a destination number, the voice answer device 11 may dial the destination number according to the second MRCP control instruction to provide functions such as multi-party calls.
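The two kinds of control information and the way the voice answer device acts on each can be sketched as a small dispatch function. The class names, field names, instruction strings, storage address, and phone number below are hypothetical; the patent only states what each kind of control information contains, not how it is encoded.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass
class FirstControlInfo:
    """Tells the device to broadcast a prompt and wait for more input."""
    instruction: str     # e.g. "BROADCAST" (hypothetical value)
    prompt_address: str  # storage address of the prompt data


@dataclass
class SecondControlInfo:
    """Tells the device to carry out the user's instruction."""
    instruction: str      # e.g. "DIAL" (hypothetical value)
    response_target: str  # e.g. a destination number


ControlInfo = Union[FirstControlInfo, SecondControlInfo]


def execute(info: ControlInfo, prompt_store: Mapping[str, str]) -> str:
    """Return a description of the action the device would take."""
    if isinstance(info, FirstControlInfo):
        prompt = prompt_store[info.prompt_address]  # fetch the prompt data
        return f"broadcast: {prompt}; waiting for voice instruction"
    if isinstance(info, SecondControlInfo):
        return f"{info.instruction.lower()}: {info.response_target}"
    raise TypeError(f"unsupported control information: {info!r}")


store = {"prompts/ask_name": "Who would you like to call?"}
ask = execute(FirstControlInfo("BROADCAST", "prompts/ask_name"), store)
dial = execute(SecondControlInfo("DIAL", "5550100"), store)
```

Keeping the prompt behind an address, as in `FirstControlInfo`, mirrors the text's point that the prompt data may live on the device, on the server, or elsewhere.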
It is below IVR with the voice answer device 11, voice conversion device 14 is ASR, and speech synthetic device 15 is TTS, for Voice-operated services device 12 is artificial intelligence robot, to using the voice transfer control system to realize real-time voice meeting The workflow of view is described in detail, wherein the Voice-operated services device 12 includes MRCP service unit 121 and control device 122。
Specifically, as shown in figure 3, the workflow of the voice transfer control system may include steps of:
Step s1, a user's call accesses IVR 11;
Step s2, IVR 11 receives a phonetic order for a real-time voice conference;
Step s3, IVR 11 sends the phonetic order to ASR 14;
Step s4, ASR 14 converts the phonetic order into corresponding text data;
Step s5, ASR 14 sends the text data to MRCP service unit 121;
Step s6, MRCP service unit 121 sends the text data to control device 122;
Step s7, control device 122 performs semantics recognition on the text data and generates first MRCP control information, the first MRCP control information being used to control IVR 11 to broadcast voice and prompt the user for a phonetic order;
Step s8, control device 122 sends the first MRCP control information to MRCP service unit 121;
Step s9, MRCP service unit 121 sends the first MRCP control information to IVR 11;
Step s10, IVR 11 sends a synthesis request to TTS 15;
Step s11, TTS 15 synthesizes the MRCP control information into corresponding voice according to the synthesis request;
Step s12, TTS 15 sends the voice corresponding to the MRCP control information to IVR 11;
Step s13, IVR 11 broadcasts the voice and prompts the user to input a phonetic order;
Step s14, the user inputs a phonetic order to IVR 11;
Step s15, IVR 11 sends the phonetic order input by the user to ASR 14;
Step s16, ASR 14 converts the phonetic order input by the user into corresponding text data;
Step s17, ASR 14 sends the text data to MRCP service unit 121;
Step s18, MRCP service unit 121 sends the text data to control device 122;
Step s19, control device 122 performs semantics recognition on the text data and generates second MRCP control information, the second MRCP control information being used to control IVR 11 to execute the phonetic order;
Step s20, control device 122 sends the second MRCP control information to MRCP service unit 121;
Step s21, MRCP service unit 121 sends the second MRCP control information to IVR 11;
Step s22, IVR 11 executes the phonetic order.
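The two rounds of steps s1 to s22 can be traced with a toy simulation. It is a sketch under strong assumptions: voice is modelled as plain strings, ASR 14 and TTS 15 are stubs, the keyword rule inside the control device is invented for illustration, and no real MRCP or SIP traffic is involved. All function names are hypothetical.

```python
from typing import Dict, List


def asr(voice: str) -> str:
    """Stub for ASR 14: voice is modelled as text, so conversion only normalizes it."""
    return voice.strip().lower()


def tts(text: str) -> str:
    """Stub for TTS 15: wraps text to mark it as synthesized voice."""
    return f"<audio:{text}>"


def control_device(text: str) -> Dict[str, str]:
    """Stub for control device 122 (the keyword rule is an assumption)."""
    if "conference" in text:
        # First MRCP control information: broadcast a prompt and ask for input.
        return {"kind": "first", "prompt": "Who should join the conference?"}
    # Second MRCP control information: treat the text as the destination number.
    return {"kind": "second", "target": text}


def mrcp_service_unit(text: str) -> Dict[str, str]:
    """Stub for MRCP service unit 121: relays text in and control information out."""
    return control_device(text)


def run_call(utterances: List[str]) -> List[str]:
    """Run the IVR 11 loop and return a log of what the IVR did."""
    log: List[str] = []
    for voice in utterances:
        info = mrcp_service_unit(asr(voice))           # s3-s9 and s15-s21
        if info["kind"] == "first":
            log.append("play " + tts(info["prompt"]))  # s10-s13
        else:
            log.append("dial " + info["target"])       # s22
            break
    return log
```

Calling `run_call(["Start a real-time voice conference", "13800000000"])` walks through one prompt round followed by one execution round, mirroring the first and second MRCP control information in the steps above.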
To enable those skilled in the art to implement the voice transfer control system 10 more clearly, the voice transfer control system 10 is described in detail below:
In one embodiment of the present invention, as shown in Fig. 4, the voice answer device 11 may include: a first receiving unit 41, a first transmission unit 42, and an operating unit 43. The first receiving unit 41 is suitable for receiving the phonetic order. The first transmission unit 42 is suitable for sending the phonetic order to the MRCP service unit 121 of the Voice-operated services device 12 through the MRCP bearer path 13 using the MRCP protocol. The operating unit 43 is suitable for executing a corresponding operation according to the received MRCP control information.
In one embodiment of the present invention, as shown in Fig. 5, the Voice-operated services device 12 may include: an MRCP service unit 121 and a control device 122. The control device 122 may include: a recognition unit 51 and a generation unit 52. The recognition unit 51 is suitable for performing semantics recognition on the phonetic order. The generation unit 52 is suitable for generating corresponding MRCP control information based on the recognition result. The MRCP service unit 121 includes: a second transmission unit 53. The second transmission unit 53 is suitable for sending the MRCP control information to the voice answer device 11 through the MRCP bearer path 13.
The voice transfer control system is described in detail below with reference to Fig. 4 and Fig. 5:
In specific implementations, the first receiving unit 41 can receive the phonetic order in several ways. For example, the first receiving unit 41 can receive the phonetic order through a human-computer interaction interface, or by remote control.
The phonetic order can first be converted into corresponding text data by the voice conversion device 14, and the voice conversion device 14 then sends the text data corresponding to the phonetic order to the Voice-operated services device 12 through the MRCP bearer path 13.
After the MRCP service unit 121 receives the text data corresponding to the phonetic order, it sends the text data to the control device 122. The recognition unit 51 of the control device 122 performs fuzzy matching on the text data corresponding to the phonetic order in order to identify the phonetic order. The generation unit 52 then generates MRCP control information according to the phonetic order.
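The text does not say which fuzzy matching technique the recognition unit 51 uses. As one possible illustration, the sketch below matches recognized text against a small list of known phonetic orders using the similarity ratio from Python's standard `difflib` module. The order list, the function name, and the 0.6 cutoff are assumptions made for this example only.

```python
import difflib
from typing import Optional

# Hypothetical set of phonetic orders the control device understands.
KNOWN_ORDERS = ["start voice conference", "end voice conference", "add participant"]


def identify_order(text: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the known order closest to the recognized text, or None if nothing is close."""
    normalized = text.strip().lower()
    # get_close_matches returns candidates sorted by similarity, best first.
    matches = difflib.get_close_matches(normalized, KNOWN_ORDERS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A tolerant match of this kind lets small ASR transcription errors (for example a misspelled word) still resolve to the intended order, while unrelated text yields no match and can be handled by re-prompting the user.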
In specific implementations, the MRCP control information generated by the generation unit 52 can be the first MRCP control information or the second MRCP control information. The first MRCP control information may include: a first MRCP control instruction that controls the voice answer device to broadcast voice, and the storage address of the data corresponding to the voice to be broadcast. The second MRCP control information may include: response target information corresponding to the phonetic order, and a second MRCP control instruction that controls the voice answer device to execute the phonetic order. The response target information can be a destination number.
When the MRCP control information generated by the generation unit 52 is the first MRCP control information, the voice answer device 11 is suitable for sending a speech synthesis request to the speech synthetic device 15 after receiving the first MRCP control information; the speech synthetic device 15 synthesizes the first MRCP control information into corresponding voice according to the speech synthesis request and sends it to the voice answer device 11. The operating unit 43 is suitable for obtaining, according to the first MRCP control instruction, the data corresponding to the voice to be broadcast, broadcasting the voice, and prompting the user to perform an input operation. The first transmission unit 42 is further suitable for sending the phonetic order input by the user to the Voice-operated services device 12 through the MRCP bearer path.
When the MRCP control information generated by the generation unit 52 is the second MRCP control information, the operating unit 43 is suitable for obtaining and executing the phonetic order according to the second MRCP control instruction.
In specific implementations, the voice answer device 11 can also include: a second receiving unit (not shown) and a third transmission unit (not shown). The second receiving unit is suitable for receiving a trigger action from the user before the voice answer device receives the phonetic order. The third transmission unit is suitable for sending an operation request to the Voice-operated services device according to the trigger action.
Correspondingly, the Voice-operated services device 12 may include: a fourth transmission unit 54 (not shown). The fourth transmission unit 54 is suitable for sending corresponding MRCP control information to the voice answer device according to the operation request. In this case, the first receiving unit 41 of the voice answer device 11 is suitable for receiving the phonetic order according to the MRCP control information.
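The trigger handshake described above can be sketched as follows. This is a hypothetical illustration only: the class names, the method names, and the `START_LISTENING` instruction label are not taken from the patent, and the exchange is modelled as direct method calls rather than messages on an MRCP bearer path.

```python
from typing import Dict


class VoiceControlServer:
    """Stand-in for the Voice-operated services device 12."""

    def on_operation_request(self, request: str) -> Dict[str, str]:
        # Role of the fourth transmission unit: answer the operation request
        # with MRCP control information telling the device to start receiving.
        return {"instruction": "START_LISTENING", "request": request}


class VoiceAnswerDevice:
    """Stand-in for the voice answer device 11."""

    def __init__(self, server: VoiceControlServer) -> None:
        self.server = server
        self.listening = False

    def on_trigger(self, trigger: str) -> None:
        # Second receiving unit gets the trigger; third transmission unit
        # forwards an operation request to the server.
        info = self.server.on_operation_request(f"operation:{trigger}")
        if info["instruction"] == "START_LISTENING":
            self.listening = True

    def receive_order(self, voice: str) -> str:
        # First receiving unit only accepts a phonetic order once the
        # control information has been received.
        if not self.listening:
            raise RuntimeError("no MRCP control information received yet")
        return voice
```

Gating `receive_order` on the earlier exchange reflects the ordering in the text: trigger action, operation request, MRCP control information, and only then the phonetic order.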
As can be seen from the above, the system architecture and workflow of the voice transfer control system in the embodiments of the present invention are simpler, and setting the control information as MRCP control information can greatly reduce the design difficulty of the voice transfer control system.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.
Although the present invention is disclosed as above, the present invention is not limited thereto. Any person skilled in the art can make various changes or modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be subject to the scope defined by the claims.

Claims (15)

1. A voice transfer control method, characterized by comprising:
A voice answer device receives a phonetic order, the voice answer device being an independent terminal device or located in a third-party service system;
The phonetic order is sent to a Voice-operated services device through an MRCP bearer path using the MRCP protocol, the Voice-operated services device and the voice answer device performing data transmission using the MRCP protocol and through the MRCP bearer path;
The Voice-operated services device performs semantics recognition on the phonetic order;
Corresponding MRCP control information is generated based on the recognition result;
The MRCP control information is sent to the voice answer device through the MRCP bearer path;
The voice answer device executes a corresponding operation according to the received MRCP control information;
The MRCP control information is first MRCP control information or second MRCP control information, the first MRCP control information comprising: a first MRCP control instruction that controls the voice answer device to broadcast voice, and the storage address of the data corresponding to the broadcast voice;
The second MRCP control information comprises: response target information corresponding to the phonetic order, and a second MRCP control instruction that controls the voice answer device to execute the phonetic order;
The voice answer device executing a corresponding operation according to the received MRCP control information comprises: the voice answer device, according to the first MRCP control instruction, obtains the data corresponding to the broadcast voice and broadcasts the voice, prompts the user to perform an input operation, and sends the phonetic order input by the user to the Voice-operated services device through the MRCP bearer path; or
The voice answer device executes the phonetic order according to the second MRCP control instruction and the response target information.
2. The voice transfer control method as claimed in claim 1, characterized in that, after the voice answer device receives the first MRCP control information, the method further comprises:
A speech synthesis request is sent to a speech synthetic device;
The speech synthetic device synthesizes the first MRCP control information into corresponding voice according to the speech synthesis request and sends it to the Voice-operated services device.
3. The voice transfer control method as claimed in claim 1, characterized in that, when the voice answer device receives the phonetic order, the method further comprises:
The voice answer device sends the phonetic order to a voice conversion device;
The voice conversion device converts the phonetic order into corresponding text data;
The text data corresponding to the phonetic order is sent to the Voice-operated services device using the MRCP protocol.
4. The voice transfer control method as claimed in claim 1, characterized in that the response target information is a destination number.
5. The voice transfer control method as claimed in claim 1, characterized in that the MRCP bearer path uses the Session Initiation Protocol (SIP) as its data bearer protocol.
6. The voice transfer control method as claimed in claim 1, characterized in that the voice answer device receives the phonetic order through a human-computer interaction interface or by remote control.
7. The voice transfer control method as claimed in claim 1, characterized in that,
Before the voice answer device receives the phonetic order, the method further comprises:
The voice answer device receives a trigger action from the user;
An operation request is sent to the Voice-operated services device according to the trigger action;
The Voice-operated services device sends corresponding MRCP control information to the voice answer device according to the operation request;
The voice answer device receiving the phonetic order comprises: the voice answer device receives the phonetic order according to the MRCP control information.
8. A voice transfer control system, characterized by comprising:
A voice answer device, suitable for receiving a phonetic order; sending the phonetic order to a Voice-operated services device through an MRCP bearer path using the MRCP protocol; and executing a corresponding operation according to the received MRCP control information, the voice answer device being an independent terminal device or located in a third-party service system;
The Voice-operated services device, suitable for identifying the phonetic order; generating corresponding MRCP control information based on the recognition result; and sending the MRCP control information to the voice answer device through the MRCP bearer path;
The voice answer device comprises:
A first receiving unit, suitable for receiving the phonetic order;
A first transmission unit, suitable for sending the phonetic order to the MRCP service unit of the Voice-operated services device through the MRCP bearer path using the MRCP protocol;
An operating unit, suitable for executing a corresponding operation according to the received MRCP control information;
The Voice-operated services device comprises: an MRCP service unit and a control device, wherein
The control device comprises:
A recognition unit, suitable for performing semantics recognition on the phonetic order;
A generation unit, suitable for generating corresponding MRCP control information based on the recognition result;
The MRCP service unit comprises:
A second transmission unit, suitable for sending the MRCP control information to the voice answer device through the MRCP bearer path;
The MRCP control information generated by the generation unit is first MRCP control information or second MRCP control information, the first MRCP control information comprising: a first MRCP control instruction that controls the voice answer device to broadcast voice, and the storage address of the data corresponding to the broadcast voice;
The second MRCP control information comprises: response target information corresponding to the phonetic order, and a second MRCP control instruction that controls the voice answer device to execute the phonetic order;
The operating unit is suitable for obtaining, according to the first MRCP control instruction, the data corresponding to the broadcast voice, broadcasting the voice, and prompting the user to perform an input operation;
The first transmission unit is further suitable for sending the phonetic order input by the user to the Voice-operated services device through the MRCP bearer path; or
The operating unit is suitable for executing the phonetic order according to the second MRCP control instruction and the response target information.
9. The voice transfer control system as claimed in claim 8, characterized in that the voice answer device is further suitable for sending a speech synthesis request to a speech synthetic device after receiving the first MRCP control information;
The system further comprises: the speech synthetic device, suitable for synthesizing the first MRCP control information into corresponding voice according to the speech synthesis request and sending it to the voice answer device.
10. The voice transfer control system as claimed in claim 8, characterized in that the voice answer device is further suitable for sending the phonetic order to a voice conversion device when receiving the phonetic order;
The system further comprises: the voice conversion device, suitable for converting the phonetic order into corresponding text data, and for sending the text data corresponding to the phonetic order to the MRCP service unit using the MRCP protocol.
11. The voice transfer control system as claimed in claim 8, characterized by further comprising: a voice conversion device, suitable for converting the phonetic order received by the voice answer device into corresponding text data;
The first transmission unit is suitable for sending the text data corresponding to the phonetic order to the MRCP service unit through a network data transmission channel.
12. The voice transfer control system as claimed in claim 8, characterized in that the MRCP control information generated by the generation unit is second MRCP control information, the second MRCP control information comprising: response target information corresponding to the phonetic order, and a second MRCP control instruction that controls the voice answer device to execute the phonetic order;
The operating unit executes the phonetic order according to the second MRCP control instruction and the response target information.
13. The voice transfer control system as claimed in claim 8, characterized in that the response target information is a destination number.
14. The voice transfer control system as claimed in claim 8, characterized in that,
The voice answer device further comprises:
A second receiving unit, suitable for receiving a trigger action from the user before the voice answer device receives the phonetic order;
A third transmission unit, suitable for sending an operation request to the Voice-operated services device according to the trigger action;
The Voice-operated services device further comprises:
A fourth transmission unit, suitable for sending corresponding MRCP control information to the voice answer device according to the operation request;
The first receiving unit of the voice answer device is suitable for receiving the phonetic order according to the MRCP control information.
15. The voice transfer control system as claimed in claim 8, characterized in that the voice answer device is located in a third-party service system, and the Voice-operated services device is integrated in an artificial intelligence robot.
CN201510560931.1A 2015-09-06 2015-09-06 Voice transfer control method and system Active CN105206273B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510560931.1A CN105206273B (en) 2015-09-06 2015-09-06 Voice transfer control method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510560931.1A CN105206273B (en) 2015-09-06 2015-09-06 Voice transfer control method and system

Publications (2)

Publication Number Publication Date
CN105206273A CN105206273A (en) 2015-12-30
CN105206273B true CN105206273B (en) 2019-05-10

Family

ID=54953902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510560931.1A Active CN105206273B (en) 2015-09-06 2015-09-06 Voice transfer control method and system

Country Status (1)

Country Link
CN (1) CN105206273B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105808501A (en) * 2016-03-09 2016-07-27 北京众星智联科技有限责任公司 Implementation of artificial intelligence learning
CN106531163A (en) * 2016-11-14 2017-03-22 北京小米移动软件有限公司 Method and device for controlling terminal
CN110769124B (en) * 2019-10-30 2020-11-06 国网江苏省电力有限公司镇江供电分公司 Electric power marketing customer communication system
CN111128198B (en) * 2019-12-25 2022-10-28 厦门快商通科技股份有限公司 Voiceprint recognition method, voiceprint recognition device, storage medium, server and voiceprint recognition system
CN111785293B (en) * 2020-06-04 2023-04-25 杭州海康威视系统技术有限公司 Voice transmission method, device and equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1984201A (en) * 2005-12-13 2007-06-20 国际商业机器公司 Voice services system and method
CN101030994A (en) * 2007-04-11 2007-09-05 华为技术有限公司 Speech discriminating method system and server
CN101453446A (en) * 2007-11-30 2009-06-10 华为技术有限公司 Method, apparatus and system for establishing MRCP control and bearing channel
US8296148B1 (en) * 2008-06-13 2012-10-23 West Corporation Mobile voice self service device and method thereof
CN103119644A (en) * 2010-07-23 2013-05-22 奥尔德巴伦机器人公司 Humanoid robot equipped with a natural dialogue interface, method for controlling the robot and corresponding program
CN103151041A (en) * 2013-01-28 2013-06-12 中兴通讯股份有限公司 Method and system for achieving automatic speech recognition business and media server
CN104732982A (en) * 2013-12-18 2015-06-24 中兴通讯股份有限公司 Method and device for recognizing voice in interactive voice response (IVR) service

Also Published As

Publication number Publication date
CN105206273A (en) 2015-12-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Voice transmission control method and system

Effective date of registration: 20221124

Granted publication date: 20190510

Pledgee: Shanghai Lingang Financial Leasing Co.,Ltd.

Pledgor: SHANGHAI XIAOI ROBOT TECHNOLOGY Co.,Ltd.

Registration number: Y2022980023447

EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20151230

Assignee: Shanghai Lingang Financial Leasing Co.,Ltd.

Assignor: SHANGHAI XIAOI ROBOT TECHNOLOGY Co.,Ltd.

Contract record no.: X2022980023270

Denomination of invention: Voice transmission control method and system

Granted publication date: 20190510

License type: Exclusive License

Record date: 20221128