CN102520792A

CN102520792A - Voice-type interaction method for network browser

Info

Publication number: CN102520792A
Application number: CN2011103887723A
Authority: CN
Inventors: 林云
Original assignee: JIANGSU QIYIDIAN NETWORKS CO Ltd
Current assignee: JIANGSU QIYIDIAN NETWORKS CO Ltd
Priority date: 2011-11-30
Filing date: 2011-11-30
Publication date: 2012-06-27

Abstract

The invention discloses a voice-type interaction method for a web browser, comprising the following steps: 1) a speech recognition engine is established on a server; 2) after a client opens a web browser, the user's speech is collected through a microphone, speech feature information is extracted from the collected speech, and the speech feature information is sent to the server; 3) the server receives the speech feature information sent by the client, calls the speech recognition engine to convert the speech feature information into a browser control command, and sends the browser control command to the client; and 4) the client receives the browser control command sent by the server and executes it to interact with the web browser. The method has the following advantages: the browser's own network capability is fully used to call the speech recognition engine on the server and to realize voice-type interaction with the web browser, the user experience is good, and the method is simple and convenient to use.

Description

Voice-type interaction method for a web browser

Technical field

The present invention relates to the field of human-computer interaction, and in particular to a voice-type interaction method for a web browser.

Background technology

Speech recognition research in China began in 1958, when the Institute of Acoustics of the Chinese Academy of Sciences used vacuum tube circuits to recognize 10 vowels. It was not until 1973 that the Institute of Acoustics began work on computer speech recognition. Owing to the conditions of the time, speech recognition research in China long remained at a slow stage of development. After the 1980s, as computer application technology gradually became widespread in China and digital signal technology developed further, many domestic institutions acquired the basic conditions for researching speech technology. Meanwhile, after years of silence, speech recognition again became a research focus internationally and developed rapidly. Against this background, many domestic institutions joined this research. In March 1986, China's National High-Tech Research and Development Program (863 Program) was launched, and speech recognition, as an important component of intelligent computer systems research, was specifically listed as a research topic. With the support of the 863 Program, China began organized research on speech recognition technology and decided to hold a dedicated speech recognition conference every two years. Speech recognition technology in China thereby entered an unprecedented stage of development. In recent years in particular, with the attention paid to speech recognition by the state and by commercial organizations, speech recognition technology has become largely mature and is widely used in commercial applications.

Web browsers have become the main entry point to operating systems and to all kinds of application platforms, and have gradually become one of the principal application programs in an operating system. Improving the user experience of a web browser has therefore become one of the main ways for a browser to attract users. Moreover, because the content to be recognized is relatively limited, web browsers are comparatively well suited to speech recognition technology.

Summary of the invention

The technical problem to be solved by the present invention is to provide a voice-type interaction method for a web browser that makes full use of the browser's own network capability to call a server-side speech recognition engine, realizes voice-type interaction with the web browser, offers a good user experience, and is easy to use.

To solve the above technical problem, the present invention adopts the following technical solution:

A voice-type interaction method for a web browser, implemented by the following steps:

1) a server establishes a speech recognition engine;

2) after opening a web browser, a client collects the user's speech through a microphone, extracts speech feature information from the collected speech, and sends said speech feature information to the server;

3) said server receives the speech feature information sent by the client, calls the speech recognition engine to convert the speech feature information into a browser control command, and sends said browser control command to the client;

4) the client receives the browser control command sent by said server and executes said browser control command to interact with the web browser.
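
The four steps above can be sketched as a single round trip. This is a minimal in-process sketch rather than the implementation described by the invention: the names `extract_features`, `RecognitionServer`, and `BrowserClient` are hypothetical, the network transport is replaced by a direct method call, and the "engine" is a stub lookup table instead of a real speech recognizer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Features = Tuple[int, ...]  # stand-in for speech feature information


def extract_features(samples: List[int]) -> Features:
    """Step 2 (client): reduce raw samples to a feature tuple.

    Assumption: a real client would compute acoustic features; here only the
    sign of each sample is kept so the sketch stays deterministic.
    """
    return tuple(1 if s >= 0 else 0 for s in samples)


@dataclass
class RecognitionServer:
    """Steps 1 and 3 (server): hold an engine and map features to a command."""
    engine: Dict[Features, str]  # stub engine: features -> control command

    def recognize(self, features: Features) -> str:
        # An empty string means that no command was recognized.
        return self.engine.get(features, "")


@dataclass
class BrowserClient:
    """Steps 2 and 4 (client): send features, then execute the command."""
    server: RecognitionServer
    executed: List[str] = field(default_factory=list)

    def handle_utterance(self, samples: List[int]) -> str:
        features = extract_features(samples)
        command = self.server.recognize(features)  # stands in for a network call
        if command:
            self.executed.append(command)  # stands in for driving the browser
        return command


server = RecognitionServer(engine={(1, 0, 1): "refresh"})
client = BrowserClient(server=server)
print(client.handle_utterance([5, -3, 7]))  # prints "refresh"
```

Keeping the engine behind `RecognitionServer.recognize` mirrors the point made in the text that the engine lives on the server: the stub table could be swapped for a real recognizer without touching `BrowserClient`.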

As a further improvement of the above technical solution:

In said step 3), the server calling the speech recognition engine to convert the speech feature information into a browser control command specifically comprises: calling the speech recognition engine to convert the speech feature information into text information; and dividing said text information into control mode information and control command information, wherein said control mode information comprises three types, namely web address input, current page and tab control, and browser program control, and said control command information comprises the shortcut key to be used under the corresponding control mode information.
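
The split of recognized text into control mode information and control command information can be sketched as follows. The phrase table, the `ControlMode` enum, and the specific shortcut keys (for example `ctrl+t` for a new tab) are illustrative assumptions; the source does not list the spoken phrases or the key bindings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class ControlMode(Enum):
    URL_INPUT = "url_input"              # web address input
    PAGE_AND_TAB = "page_and_tab"        # current page and tab control
    BROWSER_PROGRAM = "browser_program"  # browser program control


@dataclass(frozen=True)
class BrowserControlCommand:
    mode: ControlMode  # control mode information
    shortcut: str      # control command information: the shortcut key


# Hypothetical phrase table: recognized text -> (mode, shortcut).
PHRASES: Dict[str, Tuple[ControlMode, str]] = {
    "go to address": (ControlMode.URL_INPUT, "enter"),
    "new tab": (ControlMode.PAGE_AND_TAB, "ctrl+t"),
    "refresh": (ControlMode.PAGE_AND_TAB, "f5"),
    "close browser": (ControlMode.BROWSER_PROGRAM, "alt+f4"),
}


def to_command(text: str) -> Optional[BrowserControlCommand]:
    """Map recognized text to a command, or None if the phrase is unknown."""
    entry = PHRASES.get(text.strip().lower())
    if entry is None:
        return None
    mode, shortcut = entry
    return BrowserControlCommand(mode=mode, shortcut=shortcut)
```

Returning `None` for an unknown phrase leaves the decision about what to do next to the caller, instead of guessing a mode on the server.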

In said step 4), the client executing said browser control command specifically comprises: the client reads the control mode information of the browser control command; if the control mode information is web address input, the current focus of the operating system is set to the address input field of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is current page and tab control, the current focus of the operating system is set to the page or tab of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is browser program control, the current focus of the operating system is set to the window of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system.
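
The focus-then-key-event dispatch described above might look like the sketch below. `RecordingOS` is a hypothetical stand-in that only records calls; a real client would use the focus and input facilities of the operating system, which the source does not name. The mode strings and focus target names are also assumptions.

```python
from typing import Dict, List, Tuple

# Assumed mapping from control mode to the element that receives focus.
FOCUS_TARGET: Dict[str, str] = {
    "url_input": "address_bar",           # web address input
    "page_and_tab": "page_or_tab",        # current page and tab control
    "browser_program": "browser_window",  # browser program control
}


class RecordingOS:
    """Hypothetical stand-in for the operating system's focus and input APIs."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str]] = []

    def set_focus(self, target: str) -> None:
        self.events.append(("focus", target))

    def send_key_event(self, shortcut: str) -> None:
        self.events.append(("key", shortcut))


def execute(os_api: RecordingOS, mode: str, shortcut: str) -> bool:
    """Focus the element for the mode, then send the shortcut key event.

    Returns False without touching the OS when the mode is not recognized.
    """
    target = FOCUS_TARGET.get(mode)
    if target is None:
        return False
    os_api.set_focus(target)         # focus first ...
    os_api.send_key_event(shortcut)  # ... then deliver the key event
    return True
```

The order matters: sending the key event before the focus change would deliver the shortcut to whichever element happened to hold focus.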

If said client fails to read the control mode information when reading the control mode information of the browser control command, the current tab or current page of the web browser is navigated to a preset web address.
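
The fallback behaviour can be sketched as a small decision function. The command is modelled here as a plain mapping, and `PRESET_URL`, `KNOWN_MODES`, and the returned action strings are assumptions made for illustration; the source does not say what the preset address is or how a read failure is detected.

```python
from typing import Mapping, Optional

PRESET_URL = "about:blank"  # assumption: the source does not name the preset address
KNOWN_MODES = frozenset({"url_input", "page_and_tab", "browser_program"})


def read_mode(command: Mapping[str, object]) -> Optional[str]:
    """Return the control mode, or None when it is missing or unrecognized."""
    mode = command.get("mode")
    if isinstance(mode, str) and mode in KNOWN_MODES:
        return mode
    return None


def next_action(command: Mapping[str, object]) -> str:
    """Return 'execute:<mode>' on success, else 'navigate:<preset URL>'."""
    mode = read_mode(command)
    if mode is None:
        # Reading the control mode failed: fall back to the preset address.
        return f"navigate:{PRESET_URL}"
    return f"execute:{mode}"
```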

The present invention has the following advantages:

In the present invention, the server establishes a speech recognition engine; after opening a web browser, the client collects the user's speech through a microphone, extracts speech feature information from the collected speech, and sends the speech feature information to the server; the server receives the speech feature information sent by the client, calls the speech recognition engine to convert the speech feature information into a browser control command, and sends the browser control command to the client; and the client receives the browser control command sent by the server and executes it to interact with the web browser. The browser's own network capability is thus fully used to call the server-side speech recognition engine. Because the speech recognition engine is located on the server, it can be upgraded conveniently at any time, and speech recognition performance can be improved without any change to the client. The invention therefore offers a good user experience and is easy to use.

Description of drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawing described below shows only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from it without creative effort.

Fig. 1 is a schematic diagram of the main flow of an embodiment of the present invention.

Embodiment

The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawing, so that the advantages and features of the present invention can be more easily understood by those skilled in the art and the scope of protection of the present invention can be defined more clearly.

As shown in Fig. 1, the voice-type interaction method for a web browser of this embodiment is implemented by the following steps:

1) a server establishes a speech recognition engine;

2) after opening a web browser, a client collects the user's speech through a microphone, extracts speech feature information from the collected speech, and sends the speech feature information to the server;

3) the server receives the speech feature information sent by the client, calls the speech recognition engine to convert the speech feature information into a browser control command, and sends the browser control command to the client;

4) the client receives the browser control command sent by the server and executes the browser control command to interact with the web browser.
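
Step 2) leaves open which speech features the client extracts. As one possible illustration, the sketch below computes two classic frame-level features, short-time energy and zero-crossing rate, from a list of samples. The choice of features, the function name `frame_features`, and the non-overlapping framing are assumptions, not details taken from the source.

```python
from typing import List, Tuple


def frame_features(samples: List[float], frame_size: int) -> List[Tuple[float, float]]:
    """Return (energy, zero_crossing_rate) for each full frame of samples.

    Frames do not overlap; a trailing partial frame is dropped.
    """
    if frame_size < 2:
        raise ValueError("frame_size must be at least 2")
    features: List[Tuple[float, float]] = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_size
        # Zero-crossing rate: fraction of adjacent pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_size - 1)
        features.append((energy, zcr))
    return features


signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
print(frame_features(signal, 4))  # prints [(1.0, 1.0), (0.25, 0.0)]
```

Sending compact per-frame features instead of raw audio is consistent with the method's split of work, in which the client extracts features and the server performs recognition.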

In step 3) of this embodiment, the server calling the speech recognition engine to convert the speech feature information into a browser control command specifically comprises: calling the speech recognition engine to convert the speech feature information into text information; and dividing the text information into control mode information and control command information, wherein the control mode information comprises three types, namely web address input, current page and tab control, and browser program control, and the control command information comprises the shortcut key to be used under the corresponding control mode information.
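
Once the text has been divided into control mode information and control command information, the server has to send both parts to the client. The source does not specify a wire format; the sketch below assumes a JSON object with the hypothetical field names `mode` and `shortcut`.

```python
import json
from typing import Dict


def encode_command(mode: str, shortcut: str) -> str:
    """Server side: serialize a browser control command as JSON text."""
    return json.dumps({"mode": mode, "shortcut": shortcut}, sort_keys=True)


def decode_command(message: str) -> Dict[str, str]:
    """Client side: parse the JSON text and check that both fields are present."""
    payload = json.loads(message)
    if not isinstance(payload, dict):
        raise ValueError("command must be a JSON object")
    missing = {"mode", "shortcut"} - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return {"mode": str(payload["mode"]), "shortcut": str(payload["shortcut"])}


message = encode_command("page_and_tab", "ctrl+t")
print(message)  # prints {"mode": "page_and_tab", "shortcut": "ctrl+t"}
```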

In step 4) of this embodiment, the client executing the browser control command specifically comprises: the client reads the control mode information of the browser control command; if the control mode information is web address input, the current focus of the operating system is set to the address input field of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is current page and tab control, the current focus of the operating system is set to the page or tab of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is browser program control, the current focus of the operating system is set to the window of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system.
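
Before a key event containing the shortcut key can be sent, the client needs the shortcut in a form that separates modifier keys from the main key. The sketch below assumes shortcuts are written as `+`-separated strings such as `ctrl+shift+t`; this notation and the `MODIFIERS` set are assumptions, since the source does not define how shortcuts are represented.

```python
from typing import FrozenSet, Tuple

MODIFIERS = frozenset({"ctrl", "alt", "shift"})  # assumed set of modifier names


def parse_shortcut(shortcut: str) -> Tuple[FrozenSet[str], str]:
    """Split a shortcut such as 'Ctrl+Shift+T' into (modifiers, key)."""
    parts = [p.strip().lower() for p in shortcut.split("+") if p.strip()]
    if not parts:
        raise ValueError("empty shortcut")
    *mods, key = parts  # the last part is the main key, the rest are modifiers
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {unknown}")
    return frozenset(mods), key
```

For example, `parse_shortcut("Ctrl+Shift+T")` yields the modifier set `{"ctrl", "shift"}` and the key `"t"`, while `parse_shortcut("F5")` yields an empty modifier set and the key `"f5"`.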

In this embodiment, if the client fails to read the control mode information when reading the control mode information of the browser control command, the current tab or current page of the web browser is navigated to a preset web address.

The above is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above embodiment; all technical solutions that fall within the concept of the present invention belong to its scope of protection. It should be noted that, for those of ordinary skill in the art, improvements and refinements made without departing from the principle of the present invention shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A voice-type interaction method for a web browser, characterized in that it is implemented by the following steps:

1) a server establishes a speech recognition engine;

2) after opening a web browser, a client collects the user's speech through a microphone, extracts speech feature information from the collected speech, and sends said speech feature information to the server;

3) said server receives the speech feature information sent by the client, calls the speech recognition engine to convert the speech feature information into a browser control command, and sends said browser control command to the client;

4) the client receives the browser control command sent by said server and executes said browser control command to interact with the web browser.

2. The voice-type interaction method for a web browser according to claim 1, characterized in that: in said step 3), the server calling the speech recognition engine to convert the speech feature information into a browser control command specifically comprises: calling the speech recognition engine to convert the speech feature information into text information; and dividing said text information into control mode information and control command information, wherein said control mode information comprises three types, namely web address input, current page and tab control, and browser program control, and said control command information comprises the shortcut key to be used under the corresponding control mode information.

3. The voice-type interaction method for a web browser according to claim 2, characterized in that: in said step 4), the client executing said browser control command specifically comprises: the client reads the control mode information of the browser control command; if the control mode information is web address input, the current focus of the operating system is set to the address input field of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is current page and tab control, the current focus of the operating system is set to the page or tab of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system; if the control mode information is browser program control, the current focus of the operating system is set to the window of the web browser, and a key event containing the shortcut key of the control command information is then sent to the operating system.

4. The voice-type interaction method for a web browser according to claim 3, characterized in that: if said client fails to read the control mode information when reading the control mode information of the browser control command, the current tab or current page of the web browser is navigated to a preset web address.