CN1427394A - Speech sound browsing network

Info

Publication number: CN1427394A
Application number: CN 02106040
Authority: CN
Inventors: 廖杰远
Original assignee: WUXIANSHANGJI COMMUNICATION TECHNOLOGY Co Ltd BEIJING
Current assignee: WUXIANSHANGJI COMMUNICATION TECHNOLOGY Co Ltd BEIJING
Priority date: 2002-04-09
Filing date: 2002-04-09
Publication date: 2003-07-02

Abstract

A speech browsing gateway comprises a speech input channel, a speech recognizer, and a VoiceXML analyzer that controls the speech input channel and the speech recognizer. The analyzer consists of: a device for detecting the state of the speech channel and sending a detected speech command to the speech recognizer; a device for parsing the text command returned by the speech recognizer, converting it to an XML command, and creating a communication link to the target host; and a device for downloading the page script, analyzing the page, and starting the speech input device.

Description

Speech sound browsing network

Technical field

The present invention relates to gateways used in the communications field, in particular a voice-based gateway, and to an Internet voice-browsing system realized with this gateway.

Background technology

The rapid development and widespread use of the Internet rest largely on the success of the Web browsing mechanism. It is precisely the effective combination of the client/server structure, markup languages such as HTML, and transport protocols such as HTTP that gives the Internet its powerful distributed/centralized access structure and its simple mechanism for developing applications. One could say that browsing is the service.

Earlier voice applications, by contrast, were built on simple, closed interaction mechanisms. Their data came almost entirely from prerecorded audio, and operation was limited to simple menu-driven key selection.

As new human-machine interaction technologies such as speech recognition and speech synthesis have matured, traditional CTI systems have gained new interaction capabilities. Voice browsing arose to combine these new interaction modes with Internet applications. It turns the traditional, simple telephone into a powerful and easy-to-use data access terminal and builds data access and interaction on the Internet browsing structure, so that a device as simple as a telephone can roam the Internet more easily and conveniently than other network terminals. Voice browsing is analogous to the familiar browsing mechanism between the Internet and a client computer. The huge, widely deployed voice communication network is thereby integrated into the content-rich Internet, and the many applications built on the data network gain their widest reach.

Application No. 99125249.7, filed on 1999-11-30 by Fuzhou Shutong Information Technology Co., Ltd and entitled "Information service system of interactive telephone phonetic and method", discloses an interactive telephone voice information service system, that is, a system and method for querying information by voice. In that system the user dials a short code from a terminal device and is connected through the telephone network to a city node. The user then speaks the name of a service; the voice signal passes through the access and switching module to the speech recognition service module, which identifies the service type and determines the service category and its location. If the service is not local, the call is forwarded to the destination node through the remote communication module and the wide area network. Once the speech recognition module has identified the service name, the message control module enters the workflow of that service, and the user can then interact further with the system.

The system described in that application essentially achieves information query by voice, but it has the following deficiencies:

1) the system can only query predetermined city nodes managed by the system provider; that is, the system provider must also supply the content before the voice service has any value;

2) the city nodes are isolated from one another: a user who dials one short code can query the content of only one node and must dial another short code to obtain the content of another node, so the user cannot browse the Internet freely.

The Web browsing mechanism based on the HTML markup language is the foundation of the Internet's wide adoption. That adoption has in turn raised the demands placed on the ways of obtaining Internet information. A new markup language, XML, has therefore emerged, bringing a completely new concept to data browsing technology. XML shifts the focus from how data is presented, the concern of earlier markup languages such as HTML, to what the data means and contains. With HTML, a program can tell how data should be displayed on screen, but it is hard for the program to know what the data means. XML instead marks the meaning and content of the data, so a program can readily recognize and process the data and present it in whatever form is suitable.
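
The distinction between marking presentation and marking meaning can be illustrated with a minimal Python sketch. The `<quote>` document, its tag names, and the two rendering functions are hypothetical and are not taken from the application; the point is only that a program reads the meaning from the tags and then chooses the presentation itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names state what each value means.
QUOTE_XML = """
<quote>
  <stock>Example Co</stock>
  <price currency="CNY">12.50</price>
</quote>
"""


def read_quote(xml_text):
    """Extract the meaning-bearing fields from the XML document."""
    root = ET.fromstring(xml_text.strip())
    price = root.find("price")
    return {
        "stock": root.findtext("stock"),
        "price": float(price.text),
        "currency": price.get("currency"),
    }


def render_for_speech(quote):
    """One possible presentation: a sentence suitable for speech output."""
    return f"{quote['stock']} is trading at {quote['price']:.2f} {quote['currency']}."


def render_for_screen(quote):
    """Another presentation of the same data: a compact display line."""
    return f"{quote['stock']}: {quote['price']:.2f} {quote['currency']}"
```

The same parsed data feeds both renderers, which is the property that lets an XML-based language such as VoiceXML serve a telephone as readily as a screen.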

XML is a "more honest network language": it makes data easier and more flexible to obtain and exchange over the network, and the data can be rendered on a wider range of terminal devices, such as computers, televisions, and mobile phones.

Voice-controlled information access is simple and efficient; it makes human-machine interaction more natural and friendly and is therefore favored by more and more people. VoiceXML, an XML-based extensible markup language for voice applications, arose in response. It is a markup language for voice browsing proposed in 2000 by four international companies: IBM, Lucent, Motorola, and AT&T. The VoiceXML resolver is the core of voice browsing technology. Because VoiceXML is itself an XML language, it exchanges data with databases, HTML, WML, and other document processing and publishing systems almost without obstacle.

Summary of the invention

To this end, the present invention proposes a voice browsing gateway for the Internet, so that a voice application system can be built as easily as an HTML Web application, and such a voice application system is broadly supported by any VoiceXML-based voice browsing gateway. The voice browsing gateway interprets VoiceXML and combines it with speech recognition, speech synthesis, and similar techniques to carry out human-machine interaction, realizing the dream of surfing the Internet simply by talking. Through this gateway the user can browse the Internet, rather than a single node, by listening and speaking. The voice browsing gateway comprises:

A voice input channel, for receiving voice information or commands input by the user;

A speech recognition device, for converting the voice command received from the voice channel into a text command;

A VoiceXML resolver, for analyzing and interpreting scripts written to the VoiceXML standard and thereby controlling the speech recognition device and the voice input channel, specifically comprising: a device for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the destination host corresponding to the voice command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.

Preferably, the voice browsing gateway further comprises: a voice output channel, for outputting voice information; and a speech synthesis device, for receiving text content from the VoiceXML resolver, converting the text content into voice information, and returning it to the resolver, wherein the resolver sends the text to the speech synthesis device, receives the converted voice information returned from it, and sends that voice information to the voice output channel for playback.

According to another aspect of the present invention, a voice-based Internet information server is provided, comprising:

A telephone terminal;

A server, storing information that is provided by an ICP and stored according to the VoiceXML protocol;

A voice browsing gateway, for querying and receiving information from the server, over a given network protocol, according to the voice command sent from the telephone terminal, characterized in that the gateway comprises:

A VoiceXML resolver, for analyzing and interpreting scripts written to the VoiceXML standard and thereby controlling the speech recognition device and the voice input channel, specifically comprising: a device for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the specific service on the server that corresponds to the voice command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.

According to another aspect of the present invention, a voice Internet information server is provided, comprising:

A telephone terminal;

A WEB server, storing address identification information relevant to ICPs, or application scenarios stored according to the VoiceXML protocol;

A database server, storing data relevant to the applications;

A voice browsing gateway, for looking up and receiving information from the WEB server according to the interactive voice commands sent from the telephone terminal, characterized in that the gateway comprises:

A voice input channel (2), for receiving voice information or commands input by the user;

A speech recognition device (3), for converting the voice command received from the voice channel into a text command;

A VoiceXML resolver (4), for analyzing and interpreting scripts written to the VoiceXML standard and thereby controlling the speech recognition device (3) and the voice input channel (2), specifically comprising:

A device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device;

A device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, then retrieving from the WEB server, according to the XML command, the network address of the destination host corresponding to the voice command, and establishing on that basis a communication link with the database server over a given protocol; and

A device (43) for downloading the page script, analyzing the page script, parsing the information content of the script into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.

Preferably, the voice browsing gateway further comprises a voice output channel, for outputting voice information, and a speech synthesis device, for receiving text content from the VoiceXML resolver and converting it into voice information under the control of the resolver, wherein the resolver receives the converted voice information from the speech synthesis device and sends it to the voice output channel.

According to another aspect of the present invention, a method of obtaining Internet information by voice is provided, comprising: receiving, via a telephone terminal, a voice command issued by the user; the VoiceXML resolver, on detecting the voice command, passing it to a speech recognizer; the speech recognizer converting the user's voice command into text information and submitting it to the VoiceXML resolver; the VoiceXML resolver parsing the text information to determine the network address of the destination host the user wants and establishing a link to that network host accordingly; the VoiceXML resolver receiving the information content the user needs from the network host over a given network protocol, according to the voice commands exchanged between the user and the network host; the VoiceXML resolver judging the format of the received information and, when the format conforms to the VoiceXML standard, parsing the content into text information according to the VoiceXML protocol; a speech synthesizer receiving the parsed text information from the VoiceXML resolver, converting it into voice information, and supplying it to the resolver; and the resolver sending the received voice information to the voice output channel.

According to another aspect of the present invention, a voice browsing gateway is provided, comprising: an input channel, for receiving information or commands input by the user; a key recognition device, for converting the command received from the input channel into a text command; and a VoiceXML resolver, for analyzing and interpreting scripts written to the VoiceXML standard and thereby controlling the speech recognition device and the voice input channel, specifically comprising:

A device for detecting the state of the input channel and sending a detected command to the key recognition device; a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the destination host corresponding to the user command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user command input is required, starting the key input device to record the user command.

Description of drawings

Fig. 1 is a schematic diagram of a prior-art voice-based network system;

Fig. 2 is a schematic diagram of the voice-based Internet system according to the present invention;

Fig. 3 is a structural diagram of the voice browsing gateway according to the present invention;

Fig. 4 is a flow chart of voice-based network browsing according to the present invention;

Fig. 5 is a structural diagram of the resolver in the voice browsing gateway according to the present invention;

Fig. 6 is a more detailed structural diagram of the resolver shown in Fig. 5.

Specific embodiment

Embodiment 1

Fig. 3a is a structural diagram of a voice browsing gateway 1 embodying the present invention. The gateway 1 comprises an input voice channel 21, a speech recognition device 3, and a VoiceXML resolver 4.

The input voice channel 21 receives the voice commands that the user inputs by telephone. In a VoiceXML voice browsing gateway, the voice channel is the processing channel that carries the user's voice data signal; physically it connects the voice capture and playback equipment and the speech recognition device. In a voice application system the voice channel is typically a sound card, a voice channel, or a channel that exists in digitally encoded form, such as voice-coded data packets over IP. The voice channels that a platform supports determine the platforms on which the VoiceXML voice browsing gateway can actually be deployed.

The speech recognition device 3 is connected to the voice channel 2 and converts the voice command received from the voice channel 2 into a text command, which it returns to the VoiceXML resolver for processing. Within the voice browsing gateway, the speech recognition device is the authoritative recognition engine: it recognizes the user's voice signal against a limited grammar and produces the recognition result that the grammar defines. Grammar is therefore a key concept in the VoiceXML voice browsing gateway 1. The grammar determines what the user can say and how to say it. A good grammar gives the user a good interactive experience and can also raise the recognition rate of the speech recognition device at the logical level, making browsing through the whole voice application smooth and easy. In the VoiceXML voice browsing gateway the speech recognition device must recognize not only the user's voice signal but also the user's key presses; key presses and voice are processed and passed on by the same mechanism.
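
The role of a limited grammar, and the handling of voice and key input by one mechanism, can be sketched as follows. This is a minimal illustration under stated assumptions: the `GRAMMAR` table, the `UserInput` type, and the `recognize` function are hypothetical names, and the "voice" input is assumed to be already transcribed, since real acoustic recognition is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserInput:
    kind: str   # "voice" (already transcribed) or "key" (pressed digits)
    value: str


# Hypothetical grammar: each result lists the utterances and keys it accepts.
GRAMMAR = {
    "weather": {"voice": {"weather", "weather forecast"}, "key": {"1"}},
    "stocks": {"voice": {"stocks", "stock quotes"}, "key": {"2"}},
}


def recognize(user_input: UserInput, grammar=GRAMMAR) -> Optional[str]:
    """Return the grammar-defined result, or None if the input is out of grammar."""
    token = user_input.value.strip().lower()
    for result, accepted in grammar.items():
        if token in accepted.get(user_input.kind, set()):
            return result
    return None
```

For example, `recognize(UserInput("voice", "Weather Forecast"))` and `recognize(UserInput("key", "1"))` both yield the same text command, while an utterance outside the grammar yields `None`, which a caller could treat as a cue to re-prompt.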

As the core of the voice browsing gateway, the VoiceXML resolver 4 analyzes and interprets scripts written to the VoiceXML standard and thereby controls the relevant resources (such as the speech recognition device, the speech synthesis device, and the voice input and output devices). Specifically, it: detects the state of the voice channel; downloads the page script and analyzes the page and, when user voice input is required, starts the voice input device, records the voice command, sends the voice command to the speech recognition device, and receives the text command returned; sends text to the speech synthesis device, receives the voice content returned, and sends it to the voice output channel for playback; and parses the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol, establishes according to the XML command a communication link with the destination host corresponding to the voice command, receives from the destination host the information content corresponding to the voice command, and parses that content into text content according to the VoiceXML protocol.

The VoiceXML resolver 4 continuously monitors the voice channel to determine whether a voice command has been input and, if so, directs the speech recognition device 3 to receive and convert it. The VoiceXML resolver 4 receives the converted text command from the speech recognition device 3 and parses it into an XML command conforming to the VoiceXML protocol, then establishes, according to the XML command, a communication link with the destination host corresponding to the voice command, receives from the destination host the information content corresponding to the voice command, and parses that content into text content according to the VoiceXML protocol. Specifically, the resolver 4 establishes an application and a session, obtains the document containing the control commands, and sets up dialogs according to the tags in the document. It interprets each dialog and controls the triggering, opening, closing, and hang-up of the speech recognition engine, the speech synthesis engine, and the voice channel, and so carries on a conversational interaction with the user. Guided by the recognition result of the user's response, it performs transfers between documents and between applications.
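
The transfer between documents driven by recognition results can be pictured as a small state machine. The sketch below is illustrative only: the document names and the `DOCUMENTS` transition table are hypothetical, and an out-of-grammar result is assumed to keep the session on the current document.

```python
# Hypothetical transition table: document -> {recognition result -> next document}.
DOCUMENTS = {
    "main.vxml": {"weather": "weather.vxml", "stocks": "stocks.vxml"},
    "weather.vxml": {"main menu": "main.vxml"},
    "stocks.vxml": {"main menu": "main.vxml"},
}


def next_document(current, recognition_result):
    """Pick the next document; stay on the current one if nothing matches."""
    return DOCUMENTS[current].get(recognition_result, current)


def run_session(recognition_results, start="main.vxml"):
    """Replay a sequence of recognition results and record the documents visited."""
    path = [start]
    for result in recognition_results:
        path.append(next_document(path[-1], result))
    return path
```

Calling `run_session(["weather", "main menu", "stocks"])` visits the main menu, the weather document, the main menu again, and then the stocks document.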

As shown in Fig. 5, the resolver 4 comprises: a device (41) for detecting the state of the voice channel and sending a detected voice command to the speech recognition device; a device (42) for parsing the text command received from the speech recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the destination host corresponding to the voice command; and a device (43) for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting the voice input device to record the voice command.
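
The division of labor among devices (41), (42), and (43) can be sketched in Python as three methods of one class. Everything named here is an assumption made for illustration: `ResolverSketch`, the `directory` mapping from text command to document URL, the injected `recognize` and `fetch` callables, and the use of a `<goto>` element as the "XML command". The application does not specify these details.

```python
import xml.etree.ElementTree as ET


class ResolverSketch:
    def __init__(self, recognize, fetch, directory):
        self.recognize = recognize   # stand-in for the speech recognition device
        self.fetch = fetch           # stand-in for downloading a page script
        self.directory = directory   # hypothetical: text command -> document URL

    def detect(self, channel):
        """Device (41): check the channel and forward any pending voice command."""
        if not channel:
            return None
        return self.recognize(channel.pop(0))

    def to_xml_command(self, text_command):
        """Device (42): turn the text command into an XML command naming the target."""
        return ET.Element("goto", {"next": self.directory[text_command]})

    def load_page(self, xml_command):
        """Device (43): download the page, extract its text, and check for input."""
        page = ET.fromstring(self.fetch(xml_command.get("next")))
        text = [(p.text or "").strip() for p in page.iter("prompt")]
        needs_input = page.find(".//field") is not None
        return text, needs_input


# Usage with in-memory stand-ins instead of a telephone line and a network.
PAGES = {
    "http://gateway.example/weather.vxml":
        '<vxml version="2.0"><form><field name="city">'
        "<prompt>Which city?</prompt></field></form></vxml>",
}
resolver = ResolverSketch(
    recognize=lambda audio: audio.decode("ascii"),
    fetch=PAGES.__getitem__,
    directory={"weather": "http://gateway.example/weather.vxml"},
)
channel = [b"weather"]
command_text = resolver.detect(channel)
xml_command = resolver.to_xml_command(command_text)
page_text, needs_input = resolver.load_page(xml_command)
```

Injecting `recognize` and `fetch` keeps the sketch runnable without audio hardware or a network; a real gateway would place a recognition engine and an HTTP client behind the same two calls.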

Preferably, as shown in Fig. 3b, the voice browsing gateway of the present invention also comprises a speech synthesizer (TTS) 5. The synthesizer is connected to the VoiceXML resolver 4, receives the parsed text content, converts it into voice information under the control of the resolver 4, and outputs it through the resolver to a voice output channel 22 for playback to the user. Speech synthesis can convert text into a voice file or into an audio data stream. The quality of the synthesis determines the user's immediate impression of the system: smooth, natural synthesized speech leaves a good impression. Improving the quality of speech synthesis is therefore a key factor in the effectiveness of a voice application.

The voice input/output channels (21, 22) of the present invention can be realized with any currently known voice processing board. The board is connected through suitable interface conversion to a telephone network such as the PSTN or ISDN and receives the user's voice directly.

Preferably, as shown in Fig. 6, the device (42) in the resolver of the gateway further comprises: a first translating device (421), for receiving the text command from the speech recognition device and translating it into an XML language command according to the VoiceXML protocol; and a link establishment device (422), for establishing, according to the parsed XML command, a link with the destination host corresponding to the command. The device (43) further comprises: a device (431) for downloading from the destination host the information content corresponding to the parsed XML command; an information detector (432), for judging the format of the information received from the link establishment device; and a second translating device (433), connected to the information detector, for translating the information content into text information according to the VoiceXML protocol when the information format is a non-voice format.
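
The information detector (432) and the second translating device (433) can be sketched as a format check followed by a routing decision. The sketch makes several assumptions that the application does not state: audio is recognized by the RIFF/WAVE header of a WAV file, VoiceXML is recognized by a root element named `vxml`, and the text to be spoken is taken from `<prompt>` elements. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}prompt' down to 'prompt'."""
    return tag.rsplit("}", 1)[-1]


def detect_format(payload: bytes) -> str:
    """Detector (432): classify the payload as 'audio', 'voicexml', or 'unknown'."""
    if payload[:4] == b"RIFF" and payload[8:12] == b"WAVE":
        return "audio"
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return "unknown"
    return "voicexml" if local_name(root.tag) == "vxml" else "unknown"


def translate_to_text(payload: bytes) -> str:
    """Translator (433): collect the prompt text of a VoiceXML document."""
    root = ET.fromstring(payload)
    parts = []
    for element in root.iter():
        if local_name(element.tag) == "prompt":
            parts.append(" ".join("".join(element.itertext()).split()))
    return " ".join(parts)


def route(payload: bytes):
    """Audio goes straight to the output channel; VoiceXML text goes to synthesis."""
    kind = detect_format(payload)
    if kind == "audio":
        return ("voice_output", payload)
    if kind == "voicexml":
        return ("speech_synthesis", translate_to_text(payload))
    raise ValueError("unsupported information format")
```

A WAV payload is thus passed through unchanged, while a VoiceXML payload is reduced to the text that a speech synthesizer would read out.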

Embodiment 2

Fig. 2 is a schematic diagram of the voice browsing gateway of the present invention applied to the Internet. The system comprises a telephone terminal, a telephone network, the voice browsing gateway 1, a WEB server, and a database server.

The voice browsing gateway 1 receives the voice command that the user sends from the telephone terminal over a telephone network such as the PSTN or ISDN. After parsing the voice command according to the working principle described in Embodiment 1, it obtains the identification information of the destination host from which the user wants information and uses this information to retrieve from the WEB server the address of the corresponding destination host, that is, the network address of the database server. A virtual link channel is thereby established between the voice browsing gateway and the database server, over which the two communicate using a common network protocol such as HTTP. According to the parsed user voice command, the voice browsing gateway obtains from the database server information stored according to a particular grammar protocol. In this embodiment that protocol is the voice-oriented VoiceXML protocol, although the information may of course also be an audio-format file provided by the ICP. Using a network protocol such as HTTP, the VoiceXML resolver obtains the description file (document) of the VoiceXML script, parses it, interprets each tag in it, generates the corresponding control commands, directs the other components to carry out the corresponding actions, obtains the results, and determines from those results the direction of execution and the sequence flow of the application.
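
The step of interpreting each tag and generating a control command can be sketched as a walk over the parsed document. The sample document, the command names (`play`, `synthesize`, `listen`, `goto`), and the choice of tags handled are assumptions for illustration; the sketch covers a tiny, un-namespaced subset of VoiceXML and is not a conforming interpreter.

```python
import xml.etree.ElementTree as ET

# Hypothetical VoiceXML document, as it might be returned over HTTP.
SAMPLE_VXML = """
<vxml version="2.0">
  <form>
    <block><audio src="welcome.wav"/></block>
    <field name="city"><prompt>Which city?</prompt></field>
    <block><goto next="forecast.vxml"/></block>
  </form>
</vxml>
"""


def walk(element, commands):
    """Map each known tag to a control command for another gateway component."""
    tag = element.tag
    if tag == "prompt":
        text = " ".join("".join(element.itertext()).split())
        commands.append(("synthesize", text))       # to the speech synthesizer
        return
    if tag == "audio":
        commands.append(("play", element.get("src")))   # to the output channel
        return
    if tag == "goto":
        commands.append(("goto", element.get("next")))  # transfer to a document
        return
    for child in element:
        walk(child, commands)
    if tag == "field":
        # Listen only after the field's prompts have been queued.
        commands.append(("listen", element.get("name")))


def control_commands(vxml_text):
    commands = []
    walk(ET.fromstring(vxml_text.strip()), commands)
    return commands
```

For the sample document the walk yields, in order, a play command, a synthesize command, a listen command for the `city` field, and a goto command; the result of the listen step would then decide how execution continues.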

When the resolver inside the voice browsing gateway determines that the information received from the database server is an audio-format file, it can send the audio file directly to the voice output channel, from which it reaches the user's telephone over the telephone network. When the information retrieved from the database server is stored according to the VoiceXML protocol, the text-to-speech synthesizer inside the gateway converts the text information processed by the resolver into voice information, which is sent to the user in spoken form; the processed text information can also be sent over the telephone network to the user's display terminal for display. The two approaches can of course be combined to obtain the required information.

In another embodiment of the present invention, the network information may also be stored on a dedicated server, for example a server on which ICPs host their content and which is managed by the server's operator. In that case, once the voice browsing gateway has received a voice command and determined that the user wants content provided by a certain ICP, it can search the hosting server directly for the information relevant to that ICP. The hosting server may be physically co-located with the voice browsing gateway and connected to it by a local area network, or it may be remote, for example connected over the Internet.

Embodiment 3

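A voice browsing gateway is provided, comprising: an input channel, for receiving information or commands input by the user; a key recognition device, for converting the command received from the input channel into a text command; and a VoiceXML resolver, for analyzing and interpreting scripts written to the VoiceXML standard and thereby controlling the speech recognition device and the voice input channel, specifically comprising: a device for detecting the state of the input channel and sending a detected command to the key recognition device; a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol and then, according to the XML command, establishing a communication link with the destination host corresponding to the user command; and a device for downloading the page script, analyzing the page, parsing the information content into text content according to the VoiceXML protocol, and, when user command input is required, starting the key input device to record the user command.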

With the voice browsing gateway of the present invention, new voice applications and services can be set up easily, such as voice portals, voice call centers, voice information services, and voice e-commerce. These applications and services can readily be combined with existing data systems, or even be extended from existing applications. Moreover, because VoiceXML voice applications represent data in XML, they interact easily with other application systems and data systems. Voice applications based on the VoiceXML mechanism have the following characteristics: the application structure is built in units of application, session, and document; the dialog is the unit of interaction, the completion of a dialog determines the flow direction, and grammars are activated and deactivated according to scope; and complex application hierarchies are built in units of voice pages.

With the voice browsing gateway of the present invention, business systems based on voice browsing can be built and integrated with traditional e-commerce: the gateway can be combined with an existing e-commerce system, or a new voice e-commerce system can easily be built on a powerful middleware platform.

With the voice browsing gateway of the present invention, a voice portal can be built, so that the telephone too can surf the Internet. Portal websites can adopt the voice browsing gateway, combined with tools such as voice application site construction and management tools and publishing systems, to extend their user base to the huge population of telephone users. Building such a voice portal is as simple as building a WEB site, and the portal can even strongly complement the existing WEB site by presenting it to users in a richer form.

With the voice browsing gateway of the present invention, a UMS (unified information system) platform can be built. UMS is increasingly active as a personal communication service: users can query and obtain information and receive feedback through a variety of tools, including e-mail, telephone, fax, short message, PDA, and pager (BP). VoiceXML can describe not only how telephone information is processed but also the process of communicating and interacting through the other channels, making the whole UMS platform an organic whole.

With the voice browsing gateway of the present invention, a call center spanning the Internet and the telephone network can be built. Such a call center serves more than telephone users, and WEB-based call centers will attract growing attention from businesses. XML data markup makes interaction between the two networks simple.

Application examples:

With the voice browsing gateway of the present invention, the following applications can be realized:

1. VoiceXML voice mail

The VoiceXML voice mail application lets the user send and receive e-mail through audio devices such as a telephone. In a VoiceXML-based voice mail application the user is free to listen to mail selectively, hearing only the subject or the body, browsing in order, and deleting at any time. Using the address book function, the user can send a mail in voice form by saying a name, so that the recipient hears the sender's own voice.

2. VoiceXML stock inquiry

In a VoiceXML-based stock inquiry system the user need not remember stock codes and only has to say the stock name. The user can select the few stocks of personal interest and query information on those stocks only. With more elaborate customization the user can also choose the details of interest, such as stock price and trading volume, and a preferred listening style, and can set up functions such as reminders and alerts in order to act in time.

3. VoiceXML weather inquiry

The VoiceXML-based weather inquiry system lets the user select the cities of interest and query their weather at any time, to help plan trips and travel.

4. VoiceXML voice game

Try playing a finger-guessing game against the computer, and listen to its gloating when it wins and its grumbling when it loses. Can you talk the computer into conceding defeat in just a few words?

Although the present invention has been described above with reference to specific embodiments, it should be understood that these embodiments are not limiting; those skilled in the art can make variations and modifications without departing from the spirit and scope of the present invention.

Claims

1. A speech sound browsing network (1), comprising:

a voice input channel (21) for receiving voice information or commands input by the user;

a device (42) for parsing a text command received from a speech recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the destination host corresponding to said voice command; and

a device (43) for downloading a page script, analyzing the page, parsing said information content into text content according to the VoiceXML protocol, and, when user voice input is required, starting a voice input device to record the voice command.

2. The speech sound browsing network as claimed in claim 1, wherein said speech recognition device further comprises a key recognition device for recognizing key-press information sent by the user.

3. The speech sound browsing network as claimed in claim 1, wherein said device (42) further comprises:

a first translation device (421) for receiving a text command from the speech recognition device and translating it into an XML command according to the VoiceXML protocol;

a link establishment device (422) for establishing, according to the parsed XML command, a link to the destination host corresponding to said command;

and wherein said device (43) further comprises:

a device (431) for downloading, from the destination host, the information content corresponding to the parsed XML command;

an information detection device (432) for determining the format of the information received from said link establishment device;

a second translation device (433), connected to said information detection device, for translating said information content into text information according to the VoiceXML protocol when said information format is a non-voice format.

4. The speech sound browsing network as claimed in any one of claims 1 to 3, further comprising:

a voice output channel (22) for outputting voice information;

a speech synthesis device (5) for receiving text content from said VoiceXML resolver, converting the text content into voice information, and returning it to the resolver (4), wherein the resolver sends the text content to the text-to-speech synthesizer, receives the returned voice content, and sends it to the voice output channel (22) for playback.

5. The speech sound browsing network as claimed in claim 4, further comprising: an audio recording device for recording voice information or commands on said voice channel and submitting them to the VoiceXML resolver for parsing.

6. The speech sound browsing network as claimed in claim 5, further comprising: a sound playback device for playing voice prompt commands or voice content returned from the resolver.

7. An Internet information service system based on voice browsing, comprising:

a telephone terminal;

a content server storing information provided by an ICP in accordance with the VoiceXML protocol;

a speech sound browsing network, connected to said telephone terminal through a network, for querying and receiving information from said server according to a certain network protocol in response to a voice command sent from the telephone terminal, characterized in that this gateway comprises:

a device (42) for parsing a text command received from a speech recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the specific service in the server corresponding to said voice command; and

8. An Internet information service system based on voice browsing, comprising:

a telephone terminal;

a database server storing application-related data;

9. The system as claimed in claim 7 or 8, wherein said speech recognition device further comprises a key recognition device for recognizing key-press information sent by the user.

10. The system as claimed in claim 9, wherein said device (42) further comprises:

a link establishment device (422) for establishing, according to the parsed XML command, a link to said specific service or database server corresponding to said command;

and wherein said device (43) further comprises:

a device (431) for downloading, from said specific service or database server, the information content corresponding to the parsed XML command;

11. The system as claimed in claim 7 or 8, further comprising:

an output voice channel for outputting voice information;

a speech synthesis device for receiving text content from said VoiceXML resolver and converting the text content into voice information under the control of the resolver, wherein,

12. The system as claimed in claim 11, further comprising: an audio recording device for recording voice commands on said voice channel and submitting the recorded voice commands to the VoiceXML resolver.

13. The system as claimed in claim 12, further comprising: a sound playback device for playing voice prompt commands or voice content received from said output voice channel.

14. The system as claimed in claim 7 or 8, further comprising a display device for displaying text content received from said VoiceXML resolver.

15. A method for obtaining Internet information by voice, comprising:

1) receiving, by means of a telephone terminal, a voice command sent by the user;

2) the VoiceXML resolver, upon detecting the voice command, passes it to a speech recognizer;

3) the speech recognizer converts the voice command sent by the user into text information and submits it to the VoiceXML resolver;

4) the VoiceXML resolver parses said text information to determine the network address of the destination host desired by the user, and establishes a link to that network host accordingly;

5) the VoiceXML resolver receives the information content required by the user from the network host according to a certain network protocol, in accordance with the voice commands exchanged between the user and the network host;

6) the VoiceXML resolver determines the format of the received information and, when the format conforms to the VoiceXML standard, parses the content into text information according to the VoiceXML protocol;

7) a speech synthesizer receives the parsed text information from the VoiceXML resolver, converts it into voice information, and provides it to the resolver;

8) the resolver sends the received voice information to the voice output channel.

16. The method as claimed in claim 15, wherein said network protocol is the TCP/IP or HTTP protocol.

17. The method as claimed in claim 15 or 16, further comprising an optional step of displaying on a display the text information obtained in step 6).

18. A speech sound browsing network, comprising:

an input channel for receiving information or commands input by the user;

a key recognition device for converting commands received from the input channel into text commands;

a VoiceXML resolver for analyzing and interpreting scripts conforming to the VoiceXML standard, and thereby controlling the speech recognition device and the voice input channel, specifically comprising:

a device for detecting the state of said input channel and sending detected commands to the key recognition device;

a device (42) for parsing the text command received from the key recognition device into an XML command conforming to the VoiceXML protocol, and then establishing, according to the XML command, a communication link with the destination host corresponding to said user command; and

a device for downloading a page script, analyzing the page, parsing said information content into text content according to the VoiceXML protocol, and, when user command input is required, starting a key input device to record the user command.