CN1427394A - Speech sound browsing network - Google Patents

Publication number
CN1427394A
Authority
CN
China
Legal status
Pending
Application number
CN 02106040
Other languages
Chinese (zh)
Inventor
廖杰远
Current Assignee
WUXIANSHANGJI COMMUNICATION TECHNOLOGY Co Ltd BEIJING
Original Assignee
WUXIANSHANGJI COMMUNICATION TECHNOLOGY Co Ltd BEIJING

Abstract

A voice browsing gateway comprises a speech input channel, a speech recognizer, and a VoiceXML parser that controls the speech input channel and the speech recognizer. The parser consists of a device for detecting the state of the speech channel and sending a detected voice command to the speech recognizer; a device for parsing the text command returned by the speech recognizer into an XML command and establishing a communication link to the target host; and a device for downloading the page script, analyzing the page, and starting the speech input device.

Description

Voice browsing gateway
Technical field
The present invention relates to gateways used in the communications field, in particular to a voice-based gateway, and to an internet voice browsing system implemented with this gateway.
Background art
The rapid development and wide adoption of the Internet rest largely on the success of the Web browsing mechanism. The client/server architecture, combined effectively with the HTML markup language and transport protocols such as HTTP, gives the Internet a powerful distributed yet centrally accessible structure and a simple mechanism for developing applications. In a sense, browsing is the service.
Earlier voice applications, by contrast, were built on simple, closed interaction mechanisms. Their data came almost entirely from pre-recorded audio, and their operation was limited to simple menu-style key selection.
As new human-machine interaction technologies such as speech recognition and speech synthesis matured, traditional CTI systems gained new interaction capabilities. Voice browsing emerged to combine these new interaction modes with Internet applications. It turns the ordinary telephone into a data access terminal that is both powerful and simple to operate, and it builds data and interaction on the Internet browsing structure, so that a device as simple as a telephone can reach the Internet more conveniently than other network terminals. Voice browsing is analogous to the familiar browsing mechanism between the Internet and a client computer. It thereby integrates the huge, widely deployed voice communication network into the content-rich Internet and extends the reach of the many applications built on data networks.
Application No. 99125249.7, entitled "information service system of interactive telephone phonetic and method", filed on 30 November 1999 by Fuzhou Shutong Information Technology Co., Ltd, discloses an interactive telephone voice information service system and a method for querying information by voice. In that system, the user dials a short code on a terminal device and accesses a city node through the telephone network. The user then speaks a business name, and the voice signal passes through the access and switching module to the speech recognition service module, which identifies the service type and determines the business category and its location. If the service is not local, the request is forwarded to the destination node through the remote communication module and the wide area network. After the speech recognition module identifies the business name, the message control module enters the business flow for that service, and the user can then interact further with the system.
The system described in that application largely achieves voice-based information queries, but it has the following deficiencies:
1) The system can only query predetermined city nodes managed by the system provider; that is, the system provider must also supply the content for voice communication to be of value;
2) The city nodes are isolated from one another. A user who dials a short code can only query the content of one node; to obtain content from another node, the user must dial a different short code, and cannot browse the internet freely.
The Web browsing mechanism based on the HTML markup language is the foundation of the Internet's wide adoption. That adoption has in turn raised the demands placed on how Internet information is obtained. A new markup language, XML, has therefore emerged and brought a completely new concept to data browsing technology. XML shifts the focus from how data is displayed, which was the concern of earlier markup languages such as HTML, to the meaning and content of the data. With HTML, a computer program knows how the data should be displayed on screen, but it is hard to make the program know what the data means. XML marks up the meaning and content of the data, so a program can easily identify and process the data and present it in whatever form is suitable.
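The contrast between marking up appearance and marking up meaning can be illustrated with a minimal sketch. The record below, including its element and attribute names, is a hypothetical example and does not come from this application; it only shows that a program can pick fields out of XML by name.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the element and attribute names are illustrative only.
QUOTE_XML = (
    "<quote>"
    "<stock>Example Corp</stock>"
    "<price currency='CNY'>12.50</price>"
    "</quote>"
)


def read_quote(xml_text):
    """Look fields up by what they mean (tag name), not by how they are drawn."""
    root = ET.fromstring(xml_text)
    price = root.find("price")
    return root.findtext("stock"), float(price.text), price.get("currency")


print(read_quote(QUOTE_XML))
```

The same record could then be rendered on a screen or read aloud over a telephone, because nothing in it prescribes a visual layout.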
XML has been described as "a more honest network language": it makes obtaining and exchanging data on the network more convenient and flexible, and the data can be rendered on a wider range of terminal devices, such as computers, televisions and mobile phones.
Voice-controlled information access is simple to use and efficient. It makes human-machine interaction more natural and approachable, and it is therefore favored by a growing number of people. VoiceXML, an XML-based extensible markup language for voice applications, emerged in response. It is a markup language for voice browsing proposed in 2000 by four international companies: IBM, Lucent, Motorola and AT&T. The VoiceXML parser is the core of voice browsing technology. Because VoiceXML is itself an XML language, it can exchange data with databases, HTML, WML and other document processing and publishing systems with almost no obstacles.
Summary of the invention
To this end, the present invention proposes a voice browsing gateway for the internet. With it, a voice application system can be built as easily as an HTML Web application, and such a voice application can be widely supported by VoiceXML-based voice browsing gateways. The voice browsing gateway parses VoiceXML and combines it with speech recognition, speech synthesis and similar techniques to carry out human-machine interaction, so that users can go online simply by talking. Through this gateway, the user can browse the internet, rather than a single node, by listening and speaking. The voice browsing gateway comprises:
A speech input channel, for receiving voice information or commands input by the user;
A speech recognition device, for converting the voice command received from the voice channel into a text command;
A VoiceXML parser, for analyzing and interpreting scripts that conform to the VoiceXML standard and thereby controlling the speech recognition device and the speech input channel. It specifically comprises: a device for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the target host corresponding to the voice command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol and, when user voice input is required, starting the speech input device to record the voice command.
Preferably, the voice browsing gateway further comprises: a voice output channel, for outputting voice information; and a speech synthesis device, for receiving text content from the VoiceXML parser, converting the text content into voice information and returning it to the parser. The parser sends the text to the speech synthesis device, receives the converted voice information that it returns, and sends it to the voice output channel for playback.
According to another aspect of the present invention, a voice-based internet information service system is provided, comprising:
A telephone terminal;
A server, which stores information provided by an ICP and stored according to the VoiceXML protocol;
A voice browsing gateway, for querying and receiving information from the server over a network protocol according to the voice command sent from the telephone terminal, characterized in that the gateway comprises:
A speech input channel, for receiving voice information or commands input by the user;
A speech recognition device, for converting the voice command received from the voice channel into a text command;
A VoiceXML parser, for analyzing and interpreting scripts that conform to the VoiceXML standard and thereby controlling the speech recognition device and the speech input channel. It specifically comprises: a device for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the specific service on the server that corresponds to the voice command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol and, when user voice input is required, starting the speech input device to record the voice command.
According to another aspect of the present invention, a voice internet information service system is provided, comprising:
A telephone terminal;
A WEB server, which stores address identification information related to ICPs, or application scenarios stored according to the VoiceXML protocol;
A database server, which stores the data related to the applications;
A voice browsing gateway, for looking up and receiving information from the WEB server according to the interactive voice commands sent from the telephone terminal, characterized in that the gateway comprises:
A speech input channel (2), for receiving voice information or commands input by the user;
A speech recognition device (3), for converting the voice command received from the voice channel into a text command;
A VoiceXML parser (4), for analyzing and interpreting scripts that conform to the VoiceXML standard and thereby controlling the speech recognition device (3) and the speech input channel (2), which specifically comprises:
A device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device;
A device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, then retrieving from the WEB server, according to the XML command, the network address of the target host corresponding to the voice command, and establishing on that basis a communication link with the database server over a network protocol; and
A device (43) for downloading the page script, analyzing the page script, parsing the information content of the script into text content according to the VoiceXML protocol and, when user voice input is required, starting the speech input device to record the voice command.
Preferably, the voice browsing gateway further comprises a voice output channel, for outputting voice information, and a speech synthesis device, for receiving text content from the VoiceXML parser and converting the text content into voice information under the control of the parser, wherein
the parser receives the converted voice information from the speech synthesis device and sends it to the voice output channel.
According to yet another aspect of the present invention, a method of obtaining internet information by voice is provided, comprising: receiving, via a telephone terminal, a voice command issued by the user; the VoiceXML parser, on detecting the voice command, passing it to a speech recognizer; the speech recognizer converting the user's voice command into text information and submitting it to the VoiceXML parser; the VoiceXML parser parsing the text information to determine the network address of the target host desired by the user and establishing a link to that network host accordingly; the VoiceXML parser receiving the information content required by the user from the network host over a network protocol, according to the voice commands exchanged between the user and the network host; the VoiceXML parser determining the format of the received information and, when the format conforms to the VoiceXML standard, parsing the content into text information according to the VoiceXML protocol; the speech synthesizer receiving the parsed text information from the VoiceXML parser, converting it into voice information and returning it to the parser; and the parser sending the received voice information to the voice output channel.
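The sequence of steps in this method can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Gateway` class, its field names, the example address and the stub components are all hypothetical, and real speech recognition, network access and speech synthesis are replaced by trivial stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Gateway:
    """Hypothetical wiring of the method steps; every component is pluggable."""
    recognize: Callable[[bytes], str]   # speech recognizer: audio -> text command
    resolve: Callable[[str], str]       # text command -> address of the target host
    fetch: Callable[[str], str]         # address -> VoiceXML document text
    to_text: Callable[[str], str]       # VoiceXML document -> text to be spoken
    synthesize: Callable[[str], bytes]  # speech synthesizer: text -> audio

    def handle(self, voice_command: bytes) -> bytes:
        text_command = self.recognize(voice_command)
        address = self.resolve(text_command)
        document = self.fetch(address)
        prompt = self.to_text(document)
        return self.synthesize(prompt)  # audio for the voice output channel


# Stub components so the sketch runs without telephony or network access.
gateway = Gateway(
    recognize=lambda audio: audio.decode("utf-8"),
    resolve=lambda cmd: {"weather": "http://example.invalid/weather.vxml"}[cmd],
    fetch=lambda url: "<vxml><form><block>Sunny today</block></form></vxml>",
    to_text=lambda doc: doc.split("<block>")[1].split("</block>")[0],
    synthesize=lambda text: text.encode("utf-8"),
)

print(gateway.handle(b"weather"))
```

Keeping each step behind its own callable mirrors the way the parser coordinates separate recognition and synthesis devices rather than implementing them itself.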
According to yet another aspect of the present invention, a voice browsing gateway is provided, comprising: an input channel, for receiving information or commands input by the user; a key recognition device, for converting the command received from the input channel into a text command; and a VoiceXML parser, for analyzing and interpreting scripts that conform to the VoiceXML standard and thereby controlling the recognition device and the input channel, which specifically comprises:
A device for detecting the state of the input channel and sending a detected command to the key recognition device; a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the target host corresponding to the user command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol and, when user command input is required, starting the key input device to record the user command.
Brief description of the drawings
Fig. 1 is a schematic diagram of a prior-art voice-based network system;
Fig. 2 is a schematic diagram of a voice-based internet system according to the present invention;
Fig. 3 is a structural diagram of the voice browsing gateway according to the present invention;
Fig. 4 is a flowchart of voice-based network browsing according to the present invention;
Fig. 5 is a structural diagram of the parser in the voice browsing gateway according to the present invention;
Fig. 6 is a more detailed structural diagram of the parser shown in Fig. 5.
Detailed description of the embodiments
Embodiment 1
Fig. 3a is a structural diagram of the voice browsing gateway 1 of the present invention. The gateway 1 comprises an input voice channel 21, a speech recognition device 3 and a VoiceXML parser 4.
The input voice channel 21 receives the voice commands that the user enters by telephone. A voice channel is the channel that carries and processes the user's speech data signals in the VoiceXML voice browsing gateway, and it physically connects to the voice capture and playback equipment. In a voice application system, the voice channel is typically a sound card, a voice card, or a virtual channel that exists in digitally encoded form, such as voice-coded IP data packets. The voice channels supported on different platforms determine the platforms on which the VoiceXML voice browsing gateway can actually be deployed.
The speech recognition device 3 is connected to the voice channel 21. It converts the voice command received from the voice channel 21 into a text command and returns it to the VoiceXML parser for processing. In the voice browsing gateway, the speech recognition device is the recognition engine: it recognizes the user's speech signal according to a limited grammar and produces the recognition result defined by that grammar. Grammar is therefore a key concept in the VoiceXML voice browsing gateway 1. The grammar determines what the user can say and how to say it. A good grammar gives the user a good interactive experience and also raises the recognition rate of the speech recognition device at the logical level, making the whole voice application smooth and easy to browse. In the VoiceXML voice browsing gateway, the speech recognition device must handle recognition not only of the user's speech signal but also of the user's key presses; keys and speech are processed and passed on through the same mechanism.
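How a limited grammar constrains recognition, and how key presses can share the mechanism used for speech, can be sketched as follows. The grammar contents and the `recognize` function are hypothetical, and an exact text match stands in for acoustic recognition, which this sketch does not attempt.

```python
# Hypothetical grammar: each rule maps the accepted utterances or keys to one result.
GRAMMAR = {
    "weather": {"weather", "weather report", "1"},
    "stocks": {"stocks", "stock quotes", "2"},
}


def recognize(user_input, grammar=GRAMMAR):
    """Return the rule that accepts the input, or None when nothing matches."""
    normalized = " ".join(user_input.lower().split())
    for result, phrases in grammar.items():
        if normalized in phrases:
            return result
    return None


print(recognize("Weather  Report"))  # spoken phrase
print(recognize("2"))                # key press handled by the same grammar
print(recognize("news"))             # outside the grammar, so no match
```

Because anything outside the grammar is rejected, a small and well chosen grammar narrows what the recognizer has to distinguish, which is the logical-level gain in recognition rate that the text describes.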
The VoiceXML parser 4 is the core of the voice browsing gateway. It analyzes and interprets scripts that conform to the VoiceXML standard and thereby controls the related resources (such as the speech recognition device, the speech synthesis device and the voice input and output devices). Specifically, it: detects the state of the voice channel; downloads the page script and analyzes the page and, when user voice input is required, starts the speech input device, records the voice command, sends it to the speech recognition device and receives the text command that is returned; sends text to the speech synthesis device, receives the voice content that is returned and sends it to the voice output channel for playback; and parses the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, establishes according to that XML command a communication link with the target host corresponding to the voice command, receives from the target host the information content corresponding to the voice command, and parses that content into text content according to the VoiceXML protocol.
The VoiceXML parser 4 continuously monitors the voice channel to determine whether a voice command has been input; if one has, it directs the speech recognition device 3 to receive and convert it. The parser then receives the converted text command from the speech recognition device 3 and parses it into an XML command conforming to the VoiceXML protocol. According to the XML command, it establishes a communication link with the target host corresponding to the voice command, receives from the target host the information content corresponding to the voice command, and parses that content into text content according to the VoiceXML protocol. Specifically, the parser 4 establishes an application and a session, obtains the document containing the control commands, and sets up dialogs according to the tags in the document. It interprets each dialog and controls the triggering of the speech recognition and speech synthesis engines and the opening, closing and hang-up of the voice channel, thereby realizing conversational interaction with the user. Based on the recognition result for the user's response, it decides where to go next, performing transitions between documents and between applications.
As shown in Fig. 5, the parser 4 comprises: a device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the target host corresponding to the voice command; and a device (43) for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol and, when user voice input is required, starting the speech input device to record the voice command.
Preferably, as shown in Fig. 3b, the voice browsing gateway of the present invention also comprises a speech synthesizer (TTS) 5 connected to the VoiceXML parser 4. Under the control of the parser 4, it receives the parsed text content and converts it into voice information, which the parser outputs to a voice output channel 22 to be played to the user. Speech synthesis can convert text into an audio file or into an audio data stream. Its quality determines the user's immediate impression of the system: smooth, natural synthesized speech makes for a good experience. Improving synthesis quality is therefore a key factor in the effectiveness of a voice application.
The input and output voice channels (21, 22) of the present invention can be implemented with any currently known voice processing board. The board connects through a suitable interface converter to a telephone network such as PSTN or ISDN and receives the user's speech directly.
Preferably, as shown in Fig. 6, the device (42) in the parser of the gateway of the present invention further comprises: a first translating device (421), for receiving the text command from the speech recognition device and translating it into an XML command according to the VoiceXML protocol; and a link establishing device (422), for establishing a link to the target host corresponding to the command according to the parsed XML command. The device (43) further comprises: a device (431) for downloading from the target host the information content corresponding to the parsed XML command; an information detector (432), for determining the format of the information received over the link set up by the link establishing device; and a second translating device (433), connected to the information detector, for translating the information content into text information according to the VoiceXML protocol when the information is not in an audio format.
Embodiment 2
Fig. 2 is a schematic diagram of the voice browsing gateway of the present invention applied to the internet. The system comprises a telephone terminal, a telephone network, the voice browsing gateway 1, a WEB server and a database server.
The voice browsing gateway 1 receives, over a telephone network such as PSTN or ISDN, the voice command that the user issues from the telephone terminal. It parses the voice command according to the working principle described in Embodiment 1 and obtains from it the identification information of the target host from which the user wants information. Using this identification information, it retrieves from the WEB server the address of the corresponding target host, that is, the network address of the database server, and establishes a virtual link between the voice browsing gateway and that database server. Over this virtual channel, the voice browsing gateway and the database server communicate using a common network protocol such as HTTP. According to the parsed user voice command, the voice browsing gateway obtains from the database server information stored according to a specific grammar protocol. In this embodiment, that protocol is the voice-oriented VoiceXML protocol; the information can of course also be an audio-format file provided by the ICP. The VoiceXML parser obtains the VoiceXML script (document) over a network protocol such as HTTP, parses the document, interprets each tag in it, generates the corresponding control commands, directs the other components to carry out the corresponding actions, and collects the results. The results determine the direction of execution and the sequence flow of the application.
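The step of interpreting each tag and generating control commands can be sketched with a standard XML parser. The element names used here (`vxml`, `form`, `block`, `field`, `prompt`, `audio`, `goto`) are VoiceXML elements, but the sample document, the command names `SPEAK`, `PLAY`, `LISTEN` and `GOTO`, and the function itself are assumptions made for illustration. The sample also omits the XML namespace that a standards-conforming VoiceXML document would declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; a real VoiceXML page would also declare a namespace.
DOC = """<vxml version="2.0">
  <form>
    <block><prompt>Welcome to the weather service.</prompt></block>
    <field name="city"><prompt>Which city?</prompt></field>
    <block><goto next="forecast.vxml"/></block>
  </form>
</vxml>"""


def control_commands(vxml_text):
    """Walk each form item in document order and emit (command, argument) pairs."""
    commands = []
    root = ET.fromstring(vxml_text)
    for form in root.findall("form"):
        for item in form:
            for child in item:
                if child.tag == "prompt":
                    commands.append(("SPEAK", (child.text or "").strip()))
                elif child.tag == "audio":
                    commands.append(("PLAY", child.get("src")))
                elif child.tag == "goto":
                    commands.append(("GOTO", child.get("next")))
            if item.tag == "field":
                # A field needs user input once its prompts have been played.
                commands.append(("LISTEN", item.get("name")))
    return commands


for command in control_commands(DOC):
    print(command)
```

In a gateway of the kind described, `SPEAK` would be routed to the speech synthesizer, `LISTEN` to the speech input channel and recognizer, and `GOTO` would trigger the download of the next document.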
When the parser inside the voice browsing gateway determines that the information received from the database server is an audio-format file, it can send the audio file directly to the voice output channel, from which it reaches the user's telephone over the telephone network. When the information retrieved from the database server is stored according to the VoiceXML protocol, the text-to-speech synthesizer inside the gateway converts the text produced by the parser into voice information and delivers it to the user as speech; alternatively, the processed text can be sent over the telephone network to a display terminal of the user. The two approaches can of course be combined to obtain the required information.
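The branching between audio files and VoiceXML content can be sketched as below. The text does not say how the format is detected; deciding by MIME type is an assumption of this sketch, as are the `route` function and the stand-in synthesizer.

```python
import xml.etree.ElementTree as ET

# Assumption: the format is judged from the MIME type reported by the server.
VOICEXML_TYPES = ("application/voicexml+xml", "text/xml")


def route(content_type, body, synthesize):
    """Return the audio bytes that should go to the voice output channel."""
    if content_type.startswith("audio/"):
        return body  # pre-recorded audio passes straight through
    if content_type in VOICEXML_TYPES:
        root = ET.fromstring(body)
        # Collect the text of every prompt and collapse runs of whitespace.
        text = " ".join(
            " ".join("".join(p.itertext()).split()) for p in root.iter("prompt")
        )
        return synthesize(text)
    raise ValueError("unsupported content type: " + content_type)


def fake_tts(text):
    """Stand-in for the speech synthesizer; it only tags and encodes the text."""
    return ("TTS:" + text).encode("utf-8")


page = b"<vxml><form><block><prompt>Hello  caller</prompt></block></form></vxml>"
print(route("audio/wav", b"RIFF", fake_tts))
print(route("application/voicexml+xml", page, fake_tts))
```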
In another embodiment of the present invention, the network information can also be stored on a dedicated server, for example a server on which ICPs host their content and which is managed by the operator of that server. In that case, after the voice browsing gateway receives a voice command and determines that the user wants content provided by a certain ICP, it can look up the information for that ICP directly on the hosting server. The hosting server can be physically co-located with the voice browsing gateway and connected to it through a local area network, or it can be remote and connected, for example, through the internet.
Embodiment 3
A voice browsing gateway is provided, comprising: an input channel, for receiving information or commands input by the user; a key recognition device, for converting the command received from the input channel into a text command; and a VoiceXML parser, for analyzing and interpreting scripts that conform to the VoiceXML standard and thereby controlling the recognition device and the input channel, which specifically comprises:
A device for detecting the state of the input channel and sending a detected command to the key recognition device; a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the target host corresponding to the user command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol and, when user command input is required, starting the key input device to record the user command.
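The role of the key recognition device in this embodiment, turning key presses into text commands, can be sketched as follows. The menu, the use of `#` as a terminator and the function name are assumptions made for illustration and are not specified in the text.

```python
# Hypothetical menu mapping key sequences to text commands.
MENU = {"1": "weather", "2": "stocks", "0": "operator"}
VALID_KEYS = set("0123456789*")


def keys_to_command(keys, menu=MENU):
    """Convert a key sequence, optionally ended by '#', into a text command."""
    digits = keys.split("#", 1)[0]  # assumption: '#' marks the end of input
    if not digits or any(key not in VALID_KEYS for key in digits):
        return None
    return menu.get(digits)


print(keys_to_command("1#"))
print(keys_to_command("2"))
print(keys_to_command("9#"))
```

The resulting text command can then follow the same path through the parser as a text command produced by speech recognition.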
With the voice browsing gateway of the present invention, new voice applications and services, such as voice portals, voice call centers, voice information services and voice-enabled e-commerce, can be set up easily. These applications and services can readily be combined with existing data systems and can even be extended from existing applications. Because VoiceXML voice applications represent data in XML, they can also interact easily with other application systems and data systems. Voice applications based on the VoiceXML mechanism have the following characteristics: the application structure is organized in units of application, session and document; the dialog is the unit of interaction, the completion of a dialog determines where the flow goes next, and grammars are activated or deactivated according to their scope; and complex application hierarchies are built from voice web pages as units.
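The scope-based activation of grammars mentioned above can be sketched in a few lines. The `Grammar` class and the sample grammars are hypothetical; the two scope values follow the VoiceXML convention that a grammar is active either throughout a document or only within the dialog that declares it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    name: str
    scope: str   # "document": active in every dialog; "dialog": only in its own
    dialog: str  # the dialog that declares the grammar


def active_grammars(grammars, current_dialog):
    """Return the names of the grammars the recognizer should listen for."""
    return [
        g.name
        for g in grammars
        if g.scope == "document" or g.dialog == current_dialog
    ]


GRAMMARS = [
    Grammar("help", "document", "main_menu"),
    Grammar("city_names", "dialog", "weather"),
    Grammar("stock_names", "dialog", "stocks"),
]

print(active_grammars(GRAMMARS, "weather"))
```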
With the voice browsing gateway of the present invention, a commerce system based on voice browsing can be built and integrated closely with traditional e-commerce. It can be combined with an existing e-commerce system, or a new voice-enabled e-commerce system can easily be built on a powerful middleware platform.
With the voice browsing gateway of the present invention, a voice portal can be built, so that the telephone can also be used to surf the Internet. A portal site can adopt this voice browsing gateway, together with voice application site construction and management tools, publishing systems and the like, to extend its user base to the huge population of telephone subscribers. Setting up such a voice portal is as simple as setting up a WEB site, and the portal can even strongly support an existing WEB site by presenting it to users in a richer form.
With the voice browsing gateway of the present invention, a UMS (unified messaging system) platform can be built. As a personal communication service, UMS is increasingly active. Users can query and obtain information and receive feedback through a variety of tools and forms of communication, such as E-Mail, telephone, fax, short message, PDA and pager (BP). VoiceXML can describe not only the handling of telephone messages but also the communication and interaction with other kinds of messages, making the whole UMS platform an integrated whole.
With the voice browsing gateway of the present invention, a call center spanning the internet and the telephone network can be built. Call centers no longer serve telephone users only, and WEB-based call centers are attracting more and more attention from businesses. XML data markup makes interaction between the two networks simple.
Application examples:
Using the voice browsing gateway of the present invention, the following applications can be implemented:
1. VoiceXML voice mail
A VoiceXML voice mail application lets the user send and receive e-mail through audio devices such as a telephone. In a VoiceXML-based voice mail application, the user can freely choose which messages to listen to, listen only to the subject or only to the content, browse in order, and delete at any time. With the address book function, the user can send a voice message simply by saying the recipient's name, so that the recipient hears the sender's own voice.
2. VoiceXML stock inquiry
With a VoiceXML-based stock inquiry application, the user does not need to remember stock codes and only has to say the stock name. The user can customize a selection of stocks of interest and query only those stocks. With more elaborate customization, the user can also choose the details of interest, such as stock price and trading volume, and a preferred listening style, and can set up functions such as reminders and alerts so as to act in time.
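The name-based lookup and the per-user customization described above can be sketched as follows. This is only an illustration under stated assumptions: the name-to-code table, the placeholder codes, and the function `resolve_stock` are hypothetical, and a real system would load names and codes from a market data source and rely on the speech recognizer's grammar rather than exact string matching.

```python
from typing import Optional, Set

# Hypothetical name-to-code table with placeholder codes (not real listings).
STOCK_CODES = {
    "example bank": "EX0001",
    "sample steel": "EX0002",
    "demo airlines": "EX0003",
}


def resolve_stock(spoken_name: str, watchlist: Optional[Set[str]] = None) -> Optional[str]:
    """Map a recognized stock name to its code.

    If the user has customized a watchlist, only stocks on that list
    are answered; anything else yields None.
    """
    # Normalize case and whitespace in the recognizer's text output.
    name = " ".join(spoken_name.lower().split())
    code = STOCK_CODES.get(name)
    if code is None:
        return None
    if watchlist is not None and code not in watchlist:
        return None
    return code


if __name__ == "__main__":
    print(resolve_stock("Example Bank"))
    print(resolve_stock("sample steel", watchlist={"EX0001"}))
```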
3. VoiceXML weather inquiry
With a VoiceXML-based weather inquiry system, the user selects several cities of interest and can query their weather at any time in order to plan trips and travel.
4. VoiceXML voice game
Try playing rock-paper-scissors with the computer. Listen to it gloat when it wins and grumble when it loses, and see whether a few words from you can make the computer bow its head and admit defeat.
Although the present invention has been described above with reference to specific embodiments, it should be understood that these embodiments are not limiting, and that those skilled in the art can make various changes and modifications to them without departing from the spirit and scope of the present invention.

Claims (18)

1. A voice browsing gateway (1), comprising:
a voice input channel (21) for receiving voice information or commands input by a user;
a speech recognition device (3) for converting a voice command received from the voice channel into a text command;
a VoiceXML parser (4) for analyzing and interpreting scripts conforming to the VoiceXML standard and thereby controlling the speech recognition device (3) and the voice input channel (21), the parser specifically comprising:
a device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device;
a device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the destination host corresponding to the voice command; and
a device (43) for downloading a page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.
2. The voice browsing gateway of claim 1, wherein the speech recognition device further comprises a key recognition device for recognizing key-press information sent by the user.
3. The voice browsing gateway of claim 1, wherein the device (42) further comprises:
a first translation device (421) for receiving the text command from the speech recognition device and translating it into an XML-language command according to the VoiceXML protocol;
a link establishment device (422) for establishing, according to the parsed XML command, a link to the destination host corresponding to the command;
and wherein the device (43) further comprises:
a device (431) for downloading, from the destination host, the information content corresponding to the parsed XML command;
an information detector (432) for determining the format of the information received from the link establishment device;
a second translation device (433), connected to the information detector, for translating the information content into text information according to the VoiceXML protocol when the information format is a non-voice format.
4. The voice browsing gateway of any one of claims 1 to 3, further comprising:
a voice output channel (22) for outputting voice information, wherein text is sent to a text-to-speech synthesizer, and the returned voice content is received and sent to the voice output channel (22) for playback;
a speech synthesis device (5) for receiving text content from the VoiceXML parser, converting the text content into voice information, and returning it to the parser (4), wherein
the parser receives the converted voice information from the speech synthesis device and sends it to the voice output channel.
5. The voice browsing gateway of claim 4, further comprising: an audio recording device for recording voice information or commands on the voice channel and submitting them to the VoiceXML parser for parsing.
6. The voice browsing gateway of claim 5, further comprising: a sound playback device for playing voice prompts or voice content returned from the parser.
7. An Internet information service system based on voice browsing, comprising:
a telephone terminal;
a content server storing information that is provided by an Internet content provider (ICP) and stored according to the VoiceXML protocol;
a voice browsing gateway, connected to the telephone terminal via a network, for querying and receiving information from the server according to a certain network protocol in response to voice commands sent from the telephone terminal, characterized in that the gateway comprises:
a voice input channel (2) for receiving voice information or commands input by a user;
a speech recognition device (3) for converting a voice command received from the voice channel into a text command;
a VoiceXML parser (4) for analyzing and interpreting scripts conforming to the VoiceXML standard and thereby controlling the speech recognition device (3) and the voice input channel (2), the parser specifically comprising:
a device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device;
a device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the specific service in the server corresponding to the voice command; and
a device (43) for downloading a page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.
8. An Internet information service system based on voice browsing, comprising:
a telephone terminal;
a web server storing address identification information related to an ICP, or application scenarios stored according to the VoiceXML protocol;
a database server storing data related to the application;
a voice browsing gateway for looking up and receiving information from the web server according to interactive voice commands sent from the telephone terminal, characterized in that the gateway comprises:
a voice input channel (2) for receiving voice information or commands input by a user;
a speech recognition device (3) for converting a voice command received from the voice channel into a text command;
a VoiceXML parser (4) for analyzing and interpreting scripts conforming to the VoiceXML standard and thereby controlling the speech recognition device (3) and the voice input channel (2), the parser specifically comprising:
a device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device;
a device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, then retrieving from the web server, according to the XML command, the network address of the destination host corresponding to the voice command, and accordingly establishing a communication link with the database server according to a certain protocol; and
a device (43) for downloading a page script, analyzing the page script, parsing the information content of the script into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.
9. The system of claim 7 or 8, wherein the speech recognition device further comprises a key recognition device for recognizing key-press information sent by the user.
10. The system of claim 9, wherein the device (42) further comprises:
a first translation device (421) for receiving the text command from the speech recognition device and translating it into an XML-language command according to the VoiceXML protocol;
a link establishment device (422) for establishing, according to the parsed XML command, a link to the specific service or database server corresponding to the command;
and wherein the device (43) further comprises:
a device (431) for downloading, from the specific service or database server, the information content corresponding to the parsed XML command;
an information detector (432) for determining the format of the information received from the link establishment device;
a second translation device (433), connected to the information detector, for translating the information content into text information according to the VoiceXML protocol when the information format is a non-voice format.
11. The system of claim 7 or 8, further comprising:
a voice output channel for outputting voice information;
a speech synthesis device for receiving text content from the VoiceXML parser and converting the text content into voice information under the control of the parser, wherein
the parser receives the converted voice information from the speech synthesis device and sends it to the voice output channel.
12. The system of claim 11, further comprising: an audio recording device for recording voice commands on the voice channel and submitting the recorded voice commands to the VoiceXML parser.
13. The system of claim 12, further comprising: a sound playback device for playing voice prompts or voice content received from the voice output channel.
14. The system of claim 7 or 8, further comprising a display device for displaying the text content received from the VoiceXML parser.
15. A method of obtaining Internet information by voice, comprising:
1) receiving, by means of a telephone terminal, a voice command issued by a user;
2) a VoiceXML parser, upon detecting the voice command, delivering it to a speech recognizer;
3) the speech recognizer converting the voice command issued by the user into text information and submitting it to the VoiceXML parser;
4) the VoiceXML parser parsing the text information to determine the network address of the destination host desired by the user, and establishing a link to that network host accordingly;
5) the VoiceXML parser receiving, according to a certain network protocol, the information content required by the user from the network host, in accordance with the voice commands exchanged between the user and the network host;
6) the VoiceXML parser determining the format of the received information and, when the format conforms to the VoiceXML standard, parsing the content into text information according to the VoiceXML protocol;
7) a speech synthesizer receiving the parsed text information from the VoiceXML parser, converting it into voice information, and providing the voice information to the parser;
8) the parser sending the received voice information to the voice output channel.
16. The method of claim 15, wherein the network protocol is the TCP/IP or HTTP protocol.
17. The method of claim 15 or 16, further comprising an optional step of displaying the text information obtained in step 6) on a display.
18. A voice browsing gateway, comprising:
an input channel for receiving information or commands input by a user;
a key recognition device for converting a command received from the input channel into a text command;
a VoiceXML parser for analyzing and interpreting scripts conforming to the VoiceXML standard and thereby controlling the speech recognition device and the voice input channel, the parser specifically comprising:
a device for detecting the state of the input channel and sending a detected command to the key recognition device;
a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the destination host corresponding to the user command; and
a device for downloading a page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user command input is required, starting the key input device to record the user command.
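As an informal illustration, the flow recited in method claim 15 (recognize the command, resolve the destination host, fetch the page, parse it to text, synthesize speech) can be sketched in Python. This is a minimal sketch under stated assumptions and not the patented implementation: the function `browse_by_voice`, the routing table `ROUTES`, and the URL are hypothetical, and the recognizer, network fetch, and synthesizer are passed in as plain callables so that stubs can replace real engines.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: recognized text command -> VoiceXML page address.
ROUTES = {"weather": "http://content.example/weather.vxml"}


def browse_by_voice(audio, recognize, fetch, synthesize, routes=ROUTES):
    """Turn one spoken command into synthesized speech for playback."""
    # Steps 2-3: hand the detected audio to the recognizer and get text back.
    text_command = recognize(audio).strip().lower()
    # Step 4: determine the destination host for the command.
    if text_command not in routes:
        raise LookupError(f"no destination host for command {text_command!r}")
    # Step 5: receive the requested content from the network host.
    page = fetch(routes[text_command])
    # Step 6: check the format and parse VoiceXML content into text.
    try:
        root = ET.fromstring(page)
    except ET.ParseError as exc:
        raise ValueError("received content is not well-formed XML") from exc
    if root.tag != "vxml":
        raise ValueError("received content is not a VoiceXML page")
    text = " ".join(p.text.strip() for p in root.iter("prompt") if p.text)
    # Steps 7-8: synthesize speech; the caller sends it to the output channel.
    return synthesize(text)


def demo():
    """Run the sketch with stub engines in place of real ones."""
    pages = {
        ROUTES["weather"]: (
            "<vxml version='2.0'><form><block>"
            "<prompt>Sunny today.</prompt>"
            "</block></form></vxml>"
        )
    }
    return browse_by_voice(
        b"\x00\x01",                                   # placeholder audio bytes
        recognize=lambda audio: " Weather ",           # stub speech recognizer
        fetch=pages.__getitem__,                       # stub network download
        synthesize=lambda text: text.encode("utf-8"),  # stub text-to-speech
    )


if __name__ == "__main__":
    print(demo())
```

Passing the engines as parameters mirrors the separation in the claims between the parser, which coordinates the flow, and the recognition and synthesis devices it controls.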
CN 02106040 2002-04-09 2002-04-09 Speech sound browsing network Pending CN1427394A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 02106040 CN1427394A (en) 2002-04-09 2002-04-09 Speech sound browsing network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 02106040 CN1427394A (en) 2002-04-09 2002-04-09 Speech sound browsing network

Publications (1)

Publication Number Publication Date
CN1427394A true CN1427394A (en) 2003-07-02

Family

ID=4740191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 02106040 Pending CN1427394A (en) 2002-04-09 2002-04-09 Speech sound browsing network

Country Status (1)

Country Link
CN (1) CN1427394A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008009158A1 (en) * 2006-06-20 2008-01-24 Han Yi Chen A system and method for a multi-languages speech domain name and a voice search based on internet
CN100414546C (en) * 2005-11-30 2008-08-27 鼎华电脑股份有限公司 Method for downloading script language association file group and computer recording medium
CN100456234C (en) * 2005-06-16 2009-01-28 国际商业机器公司 Method and system for synchronizing visual and speech events in a multimodal application
WO2010111861A1 (en) * 2009-03-30 2010-10-07 中兴通讯股份有限公司 Voice interactive method for mobile terminal based on vocie xml and apparatus thereof
CN102546542A (en) * 2010-12-20 2012-07-04 福建星网视易信息系统有限公司 Electronic system and embedded device and transit device of electronic system
CN102752019A (en) * 2011-04-20 2012-10-24 深圳盒子支付信息技术有限公司 Data sending, receiving and transmitting method and system based on headset jack
CN103366729A (en) * 2012-03-26 2013-10-23 富士通株式会社 Speech dialogue system, terminal apparatus, and data center apparatus
CN101207656B (en) * 2006-12-19 2014-04-09 纽奥斯通讯有限公司 Method and system for switching between modalities in speech application environment
CN105551488A (en) * 2015-12-15 2016-05-04 深圳Tcl数字技术有限公司 Voice control method and system
WO2017045399A1 (en) * 2015-09-16 2017-03-23 广州市动景计算机科技有限公司 Method for reading webpage information by voice, browser client and server
CN107211020A (en) * 2015-01-26 2017-09-26 Lg电子株式会社 Host device and its control method

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100456234C (en) * 2005-06-16 2009-01-28 国际商业机器公司 Method and system for synchronizing visual and speech events in a multimodal application
CN100414546C (en) * 2005-11-30 2008-08-27 鼎华电脑股份有限公司 Method for downloading script language association file group and computer recording medium
WO2008009158A1 (en) * 2006-06-20 2008-01-24 Han Yi Chen A system and method for a multi-languages speech domain name and a voice search based on internet
CN101207656B (en) * 2006-12-19 2014-04-09 纽奥斯通讯有限公司 Method and system for switching between modalities in speech application environment
CN101527755B (en) * 2009-03-30 2011-07-13 中兴通讯股份有限公司 Voice interactive method based on VoiceXML movable termination and movable termination
WO2010111861A1 (en) * 2009-03-30 2010-10-07 中兴通讯股份有限公司 Voice interactive method for mobile terminal based on vocie xml and apparatus thereof
US8724780B2 (en) 2009-03-30 2014-05-13 Zte Corporation Voice interaction method of mobile terminal based on voiceXML and mobile terminal
CN102546542B (en) * 2010-12-20 2015-04-29 福建星网视易信息系统有限公司 Electronic system and embedded device and transit device of electronic system
CN102546542A (en) * 2010-12-20 2012-07-04 福建星网视易信息系统有限公司 Electronic system and embedded device and transit device of electronic system
CN102752019A (en) * 2011-04-20 2012-10-24 深圳盒子支付信息技术有限公司 Data sending, receiving and transmitting method and system based on headset jack
CN102752019B (en) * 2011-04-20 2015-01-28 深圳盒子支付信息技术有限公司 Data sending, receiving and transmitting method and system based on headset jack
CN103366729A (en) * 2012-03-26 2013-10-23 富士通株式会社 Speech dialogue system, terminal apparatus, and data center apparatus
CN103366729B (en) * 2012-03-26 2016-05-04 富士通株式会社 Speech dialogue system, terminal installation and data center's device
CN107211020A (en) * 2015-01-26 2017-09-26 Lg电子株式会社 Host device and its control method
CN107211020B (en) * 2015-01-26 2020-06-16 Lg电子株式会社 Sink device and control method thereof
WO2017045399A1 (en) * 2015-09-16 2017-03-23 广州市动景计算机科技有限公司 Method for reading webpage information by voice, browser client and server
US10714074B2 (en) 2015-09-16 2020-07-14 Guangzhou Ucweb Computer Technology Co., Ltd. Method for reading webpage information by speech, browser client, and server
US11308935B2 (en) 2015-09-16 2022-04-19 Guangzhou Ucweb Computer Technology Co., Ltd. Method for reading webpage information by speech, browser client, and server
CN105551488A (en) * 2015-12-15 2016-05-04 深圳Tcl数字技术有限公司 Voice control method and system

Similar Documents

Publication Publication Date Title
CN1160700C (en) System and method for providing network coordinated conversational services
US7415537B1 (en) Conversational portal for providing conversational browsing and multimedia broadcast on demand
CN101341532B (en) Sharing voice application processing via markup
CN102591856B (en) A kind of translation system and interpretation method
US20070043868A1 (en) System and method for searching for network-based content in a multi-modal system using spoken keywords
CN1666199A (en) An arrangement and a method relating to access to internet content
US20090030687A1 (en) Adapting an unstructured language model speech recognition system based on usage
US20030161298A1 (en) Multi-modal content and automatic speech recognition in wireless telecommunication systems
US20080221901A1 (en) Mobile general search environment speech processing facility
CN1329739A (en) Voice control of a user interface to service applications
US20080221898A1 (en) Mobile navigation environment speech processing facility
US20090030696A1 (en) Using results of unstructured language model based speech recognition to control a system-level function of a mobile communications facility
CN1689015A (en) System and method for accessing services and/or applications and/or content on a communication network
WO2001069422A2 (en) Multimodal information services
CN1752975A (en) Method and system for voice-enabled autofill
JP2003295890A (en) Apparatus, system, and method for speech recognition interactive selection, and program
CN1427394A (en) Speech sound browsing network
US7054421B2 (en) Enabling legacy interactive voice response units to accept multiple forms of input
JPH10177469A (en) Mobile terminal voice recognition, database retrieval and resource access communication system
CN1235387C (en) Distributed speech recognition for internet access
CN1489861A (en) Radio mobile terminal communication system
Lazzari Spoken translation: challenges and opportunities
CN1750499A (en) Voice browse system
CN1805403A (en) Method of using communication services with packet user terminal and its system
CN1168264C (en) Interactive telephone phonetic information service system and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication