CN103376990A

CN103376990A - Speech control method and system for web page operations

Info

Publication number: CN103376990A
Application number: CN2012101202020A
Authority: CN
Inventors: 周晓波; 刘玉国; 司天歌
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd; Tencent Cloud Computing Beijing Co Ltd
Priority date: 2012-04-23
Filing date: 2012-04-23
Publication date: 2013-10-30
Anticipated expiration: 2032-04-23
Also published as: CN103376990B

Abstract

An embodiment of the invention provides a speech control method and a system for web page operations. The method comprises the steps of setting speech text domains and control command domains corresponding to the speech text domains in hypertext markup language (HTML) tags of web pages, wherein the control command domains contain web page control commands; identifying key words of speech commands, retrieving the speech text domains matched with the key words from the HTML tags of the web pages, and implementing the web page control commands contained in the control command domains corresponding to the speech text domains. According to the speech control method and the system, by the aid of extension of the HTML tags and identification of the speech key words, speech control of the web page operations of web page content elements is achieved. The control method aims at the specific web pages instead of common commands, so that the operation universality is improved, and various choices are provided for developers.

Description

A voice control method and system for web page operations

Technical field

Embodiments of the present invention relate to the field of voice control technology, and more specifically to a voice control method and system for web page operations.

Background technology

With the rapid development of computer and network technology, the Internet plays an increasingly important role in daily life, study and work. Hypermedia documents on the Internet are called web pages. A web page generally contains pointers (hyperlinks) to other related pages or other nodes. A series of web pages that logically forms an organic whole is called a website (or site).

Hypertext Markup Language (HTML) is a markup language for describing web documents. HTML is a coding specification that uses tags to mark the parts of a web page to be displayed. A web page file is itself a text file; by adding tags to the text, the author tells the browser how to display the content (for example, how text is processed, how pictures are arranged and how pictures are displayed). The browser reads the web page file in order and interprets and displays the marked content according to the tags. It does not report errors for incorrectly written tags, nor does it stop interpreting the page because of them.

At present, voice technology is becoming widespread in browser products. Voice technology in browsers mainly has two modes: a voice input method mode and a voice command mode. In the voice input method mode, text is entered by voice; in the voice command mode, basic browser actions such as forward and back are controlled by voice.

However, in the current voice command mode, control command operations must be generic, that is, they must be operations that every web page can perform, such as forward and back. In other words, the control command operations are in fact functions of the browser itself and are unrelated to the specific content of the web page. Commands cannot be customized according to web page content, so operation versatility is poor.

Summary of the invention

Embodiments of the present invention propose a voice control method for web page operations, so as to improve operation versatility.

Embodiments of the present invention also propose a voice control system for web page operations, so as to improve operation versatility.

The technical solution of the embodiments of the present invention is as follows:

A voice control method for web page operations comprises:

setting, in a Hypertext Markup Language (HTML) tag of a web page, a speech text domain and a control command domain corresponding to the speech text domain, the control command domain containing a web page control command;

identifying a keyword from a voice command, retrieving from the HTML tags of the web page the speech text domain that matches the keyword, and executing the web page control command contained in the control command domain corresponding to that speech text domain.

A voice control system for web page operations comprises a web page setting unit and a browser, wherein:

the web page setting unit is configured to set, in an HTML tag of a web page, a speech text domain and a control command domain corresponding to the speech text domain, the control command domain containing a web page control command;

the browser is configured to identify a keyword from a voice command, retrieve from the HTML tags of the web page the speech text domain that matches the keyword, and execute the web page control command contained in the control command domain corresponding to that speech text domain.

As can be seen from the above technical solution, in embodiments of the present invention a speech text domain and a corresponding control command domain are first set in an HTML tag of a web page, and the control command domain contains a web page control command. A keyword is then identified from a voice command, the speech text domain that matches the keyword is retrieved from the HTML tags of the web page, and the web page control command contained in the corresponding control command domain is executed. Thus, by extending HTML tags and recognizing voice keywords, embodiments of the present invention achieve voice control of web page operations on web page content elements. The control targets the specific web page rather than generic commands, and therefore significantly improves operation versatility.

Description of drawings

Fig. 1 is a schematic flowchart of the voice control method for web page operations according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the input for posting a microblog according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the relay input according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the comment input according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the voice control system for web page operations according to an embodiment of the present invention.

Detailed description of the embodiments

To make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings.

Embodiments of the present invention address web pages that have a relatively simple structure, in which user-submitted content keeps growing but the entry points for submitting it are few. For such pages, a number of voice control commands are provided; when a specific element in the web page matches a voice control command, the corresponding operation on that element is triggered.

Fig. 1 is a schematic flowchart of the voice control method for web page operations according to an embodiment of the present invention.

As shown in Fig. 1, the method comprises:

Step 101: set, in a Hypertext Markup Language (HTML) tag of a web page, a speech text domain and a control command domain corresponding to the speech text domain, the control command domain containing a web page control command.

Here, the HTML specification can be extended by adding a speech text domain and a control command domain to certain tags. Tags are the basic elements of HTML, and each web page element corresponds to a tag in the HTML specification. Web page elements are the basic units of a web page; a button in a web page, for example, is a web page element.

In embodiments of the present invention, each speech text domain corresponds to a control command domain, and the control command domain contains a web page control command.

For example, the speech text domain and the corresponding control command domain can be set in commonly used HTML tags such as the Input, Div, Table, Tbody, Tfoot or Caption tag.

For example, in the HTML tags of a web page: for a web page control command whose type is posting a microblog, a post-microblog speech text domain and a corresponding control command domain can be set; for a web page control command whose type is relaying a microblog, a relay-microblog speech text domain and a corresponding control command domain can be set; for a web page control command whose type is commenting on a microblog, a comment-microblog speech text domain and a corresponding control command domain can be set; and for a web page control command whose type is commenting on and relaying a microblog, a comment-and-relay speech text domain and a corresponding control command domain can be set.

Although some HTML tags that can be extended are listed above, those skilled in the art will appreciate that the list is merely exemplary and is not intended to limit the scope of protection of the embodiments of the present invention.

Moreover, the specific content of the web page control command contained in the control command domain can be set in advance by means of a custom function.

By way of example, the speech text domain can be named voicetext and the control command domain voicecmd, and the function name of the relay-microblog operation command can be set to forwardweibo by means of a function definition.

Taking the Input tag as an example, an embodiment of the present invention can be implemented as follows:

<input type="button" class="inputBtn sendBtn" value="relay" title="relay" voicetext="please relay" voicecmd="forwardweibo" onclick='forwardweibo'>

Here, voicecmd and voicetext are the domains newly added by embodiments of the present invention. The voicetext domain holds the text "please relay", and the voicecmd domain holds forwardweibo, the function name of the relay-microblog operation command.
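To make the tag extension concrete, the following minimal Python sketch reads the voicetext and voicecmd domains out of a page and pairs them up. It uses the standard-library html.parser module as a stand-in for the browser's own parser; the class name VoiceTagCollector and the function collect_voice_bindings are hypothetical names introduced only for this illustration.

```python
from html.parser import HTMLParser


class VoiceTagCollector(HTMLParser):
    """Collects voicetext -> voicecmd pairs from tags that carry both domains."""

    def __init__(self):
        super().__init__()
        self.bindings = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases attribute names and strips the quotes.
        attr_map = dict(attrs)
        text = attr_map.get("voicetext")
        cmd = attr_map.get("voicecmd")
        if text and cmd:
            self.bindings[text.strip()] = cmd.strip()


def collect_voice_bindings(html):
    """Return a dict mapping each speech text to its control command name."""
    collector = VoiceTagCollector()
    collector.feed(html)
    collector.close()
    return collector.bindings


# The Input tag from the example above, plus an ordinary tag without the domains.
PAGE = (
    '<div class="toolbar">'
    '<input type="button" class="inputBtn sendBtn" value="relay" title="relay" '
    'voicetext="please relay" voicecmd="forwardweibo" onclick="forwardweibo">'
    "</div>"
)

print(collect_voice_bindings(PAGE))  # {'please relay': 'forwardweibo'}
```

Tags that lack either domain, such as the div above, are simply ignored, which mirrors the way a browser skips attributes it does not act on.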

Step 102: identify a keyword from a voice command, retrieve from the HTML tags of the web page the speech text domain that matches the keyword, and execute the web page control command contained in the control command domain corresponding to that speech text domain.

To do this, the browser needs to apply speech recognition technology.

Speech recognition, also known as automatic speech recognition (ASR), aims to convert the lexical content of human speech into computer-readable input such as key presses, binary codes or character strings.

Depending on the application environment, the web page control commands supported by the browser can include at least one of the following: posting a microblog; relaying a microblog; commenting on a microblog; commenting on and relaying a microblog; sending an email; sending a private message; or uploading an attachment.

When an embodiment of the present invention is applied to posting a microblog by voice in a browser, the method specifically comprises:

first identifying the "post microblog" keyword from the voice command; the browser then retrieves from the HTML tags of the web page the speech text domain that matches the "post microblog" keyword (that is, the post-microblog speech text domain) and parses the post-microblog function command from the control command domain corresponding to that speech text domain; the post-microblog function command is then run to post a microblog in the web page.

When an embodiment of the present invention is applied to relaying a microblog by voice in a browser, the method specifically comprises:

first identifying the "relay microblog" keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the "relay microblog" keyword (that is, the relay-microblog speech text domain), and parsing the relay-microblog function command from the control command domain corresponding to that speech text domain; the relay-microblog function command is then run to relay a microblog in the web page.

When an embodiment of the present invention is applied to commenting on a microblog by voice in a browser, the method specifically comprises:

first identifying the "comment on microblog" keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the "comment on microblog" keyword (that is, the comment-microblog speech text domain), and parsing the comment-microblog function command from the control command domain corresponding to that speech text domain; the comment-microblog function command is then run to comment on a microblog in the web page.

When an embodiment of the present invention is applied to commenting on and relaying a microblog by voice in a browser, the method specifically comprises:

first identifying the "comment and relay microblog" keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the "comment and relay microblog" keyword (that is, the comment-and-relay speech text domain), and parsing the comment-and-relay function command from the control command domain corresponding to that speech text domain; the comment-and-relay function command is then run to comment on and relay a microblog in the web page.

Although some embodiments of web page control commands are listed above, those skilled in the art will appreciate that the list is merely exemplary and is not intended to limit the scope of protection of the embodiments of the present invention.

In one embodiment, the browser can identify the keyword in the user's voice command using one of three kinds of speech recognition method: methods based on vocal tract models and phonetic knowledge, template matching methods, and methods that use artificial neural networks. Embodiments of the present invention preferably use template matching, which is comparatively mature and has reached the practical stage. Template matching goes through four steps: feature extraction, template training, template classification and decision. Three techniques are commonly used: dynamic time warping (DTW), hidden Markov models (HMM) and vector quantization (VQ).
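As an illustration of the template matching approach, the following minimal Python sketch classifies an utterance by dynamic time warping (DTW) against stored templates. The scalar feature sequences, the template labels and the function names are hypothetical; a real recognizer would compare feature vectors extracted from audio, and the source does not specify this implementation.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of scalar features."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(features, templates):
    """Decision step: return the label of the template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# Hypothetical trained templates: one toy feature sequence per keyword.
TEMPLATES = {
    "please relay": [1, 3, 5, 3],
    "post microblog": [5, 5, 1, 1],
}

# The same contour as "please relay", spoken more slowly (one frame repeated).
utterance = [1, 3, 3, 5, 3]
print(classify(utterance, TEMPLATES))  # please relay
```

DTW tolerates differences in speaking rate because a frame in one sequence may be aligned with several frames in the other, which is why the stretched utterance still matches its template at zero cost.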

By way of example: while browsing a page, and after entering some text (or none), the user issues the voice command "please relay" (that is, speaks the phrase "please relay"). The browser then searches the web page, finds the input element whose voicetext domain matches the keyword "please relay", and determines the voicecmd domain corresponding to that voicetext domain. It then performs the 'forwardweibo' operation according to the value of voicecmd, that is, it executes the relay-microblog operation command.
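The lookup and execution described in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the bindings dictionary stands for the voicetext and voicecmd pairs already read from the page, the COMMANDS registry stands for the page's custom functions, and matching is done by case-insensitive equality, which the source does not prescribe. The function execute_voice_command and the return strings are hypothetical.

```python
from typing import Callable, Dict, Optional


def forwardweibo() -> str:
    """Stand-in for the page's custom relay function named in voicecmd."""
    return "microblog relayed"


# Hypothetical registry of custom functions defined by the page author.
COMMANDS: Dict[str, Callable[[], str]] = {"forwardweibo": forwardweibo}

# voicetext -> voicecmd pairs as they would be read from the page's tags.
BINDINGS: Dict[str, str] = {"please relay": "forwardweibo"}


def execute_voice_command(
    keyword: str,
    bindings: Dict[str, str],
    commands: Dict[str, Callable[[], str]],
) -> Optional[str]:
    """Find the voicetext that matches the keyword and run its voicecmd.

    Returns the command's result, or None when nothing matches or the
    named function is not defined on the page.
    """
    normalized = keyword.strip().lower()
    for voicetext, voicecmd in bindings.items():
        if voicetext.strip().lower() == normalized:
            handler = commands.get(voicecmd)
            return handler() if handler is not None else None
    return None


print(execute_voice_command("Please relay", BINDINGS, COMMANDS))  # microblog relayed
print(execute_voice_command("go back", BINDINGS, COMMANDS))       # None
```

Looking the function up by name in a registry, rather than evaluating the attribute value as code, keeps the set of voice-triggerable operations limited to those the page explicitly defines.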

Preferably, an input/output device control command domain can additionally be set in the HTML tag, and this domain also contains the web page control command. In this way, when an operation of an input/output device is received, the web page control command contained in the input/output device control command domain can be executed directly, without speech recognition.

For example, taking the Input tag as an example, an input/output device control command domain (such as onclick) can be added, with onclick='forwardweibo'. When the mouse clicks the button corresponding to the tag, the 'forwardweibo' operation is performed directly, that is, the relay-microblog operation is executed directly.
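A short Python sketch can show how the mouse path and the voice path end at the same command. The element is modelled as a plain dictionary of attributes, and handle_click, handle_voice and the calls list are hypothetical names used only for this illustration; the exact-equality match in the voice path is likewise an assumption.

```python
calls = []  # records which commands ran, for demonstration only


def forwardweibo():
    """Stand-in for the page's custom relay function."""
    calls.append("forwardweibo")


COMMANDS = {"forwardweibo": forwardweibo}

# Attributes of the example Input tag, modelled as a dictionary.
BUTTON = {
    "type": "button",
    "value": "relay",
    "voicetext": "please relay",
    "voicecmd": "forwardweibo",
    "onclick": "forwardweibo",
}


def handle_click(element, commands):
    """Mouse path: run the command named in onclick; no speech recognition."""
    name = element.get("onclick")
    if name in commands:
        commands[name]()
        return True
    return False


def handle_voice(element, keyword, commands):
    """Voice path: run the command named in voicecmd if voicetext matches."""
    if element.get("voicetext") != keyword:
        return False
    name = element.get("voicecmd")
    if name in commands:
        commands[name]()
        return True
    return False
```

Calling handle_click(BUTTON, COMMANDS) and handle_voice(BUTTON, "please relay", COMMANDS) both append "forwardweibo" to calls, since the two domains name the same function.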

A web page can provide the user with multiple interaction points for operations, which makes it easier for the user to control the page. Taking a microblog application as an example, there can be interaction points for posting a microblog, relaying a microblog, commenting on a microblog, or commenting on and relaying a microblog.

With embodiments of the present invention, these operations can be controlled by voice.

By way of example, Fig. 2 is a schematic diagram of the input for posting a microblog according to an embodiment of the present invention; Fig. 3 is a schematic diagram of the relay input according to an embodiment of the present invention; Fig. 4 is a schematic diagram of the comment input according to an embodiment of the present invention.

Based on the above analysis, embodiments of the present invention also propose a voice control system for web page operations.

As shown in Fig. 5, the system comprises a web page setting unit 501 and a browser 502, wherein:

The web page setting unit 501 is configured to set, in an HTML tag of a web page, a speech text domain and a control command domain corresponding to the speech text domain, the control command domain containing a web page control command.

For example, in the HTML tags of a web page: for a web page control command whose type is posting a microblog, a post-microblog speech text domain and a corresponding control command domain can be set; for a web page control command whose type is relaying a microblog, a relay-microblog speech text domain and a corresponding control command domain can be set; for a web page control command whose type is commenting on a microblog, a comment-microblog speech text domain and a corresponding control command domain can be set; and for a web page control command whose type is commenting on and relaying a microblog, a comment-and-relay speech text domain and a corresponding control command domain can be set.
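The role of the web page setting unit can be sketched as a small Python generator that emits Input tags carrying the two domains for each command type. Only forwardweibo and the text "please relay" come from the source; the other speech texts and function names (postweibo, commentweibo, commentforwardweibo) and the helper names are hypothetical placeholders.

```python
from html import escape


def render_voice_button(label, voicetext, voicecmd):
    """Return an Input button tag carrying the voicetext and voicecmd domains."""
    return (
        '<input type="button" value="{label}" '
        'voicetext="{text}" voicecmd="{cmd}" onclick="{cmd}">'
    ).format(
        label=escape(label, quote=True),
        text=escape(voicetext, quote=True),
        cmd=escape(voicecmd, quote=True),
    )


# (button label, speech text, custom function name) per command type.
# Only the "relay" row reflects values given in the source.
MICROBLOG_COMMANDS = [
    ("post", "please post", "postweibo"),
    ("relay", "please relay", "forwardweibo"),
    ("comment", "please comment", "commentweibo"),
    ("comment and relay", "please comment and relay", "commentforwardweibo"),
]


def render_page_controls(commands):
    """Render one voice-enabled button per command type, one tag per line."""
    return "\n".join(render_voice_button(*entry) for entry in commands)


print(render_page_controls(MICROBLOG_COMMANDS))
```

Escaping the attribute values keeps the generated markup well formed even if a speech text contains quotation marks or angle brackets.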

The browser 502 is configured to identify a keyword from a voice command, retrieve from the HTML tags of the web page the speech text domain that matches the keyword, and execute the web page control command contained in the control command domain corresponding to that speech text domain.

In one embodiment, an input/output device control command domain can additionally be set in the HTML tag, and this domain also contains the web page control command. In this way, when an operation of an input/output device is received, the web page control command contained in the input/output device control command domain can be executed directly, without speech recognition.

For example, taking the Input tag as an example, an input/output device control command domain (such as onclick) can be added, with onclick='forwardweibo'. When the mouse clicks the button corresponding to the tag, the 'forwardweibo' operation is performed directly, that is, the relay-microblog operation is executed.

Specifically:

The web page setting unit 501 is further configured to set an input/output device control command domain in the HTML tag, the input/output device control command domain containing the web page control command;

The browser 502 is further configured to execute, when an operation of an input/output device is received, the web page control command contained in the input/output device control command domain.

Preferably, the web page setting unit 501 can be further configured to set the web page control command by means of a custom function.

Moreover, the speech text domain and the corresponding control command domain can be set in commonly used HTML tags such as the Input, Div, Table, Tbody, Tfoot or Caption tag.

In one embodiment, the web page control command is posting a microblog. In this case, the browser 502 is configured to identify the post-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the post-microblog keyword (that is, the post-microblog speech text domain), parse the post-microblog function command from the control command domain corresponding to that speech text domain, and run the post-microblog function command to post a microblog in the web page.

In one embodiment, the web page control command is relaying a microblog. In this case, the browser 502 is configured to identify the relay-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the relay-microblog keyword (that is, the relay-microblog speech text domain), parse the relay-microblog function command from the control command domain corresponding to that speech text domain, and run the relay-microblog function command to relay a microblog in the web page.

In one embodiment, the web page control command is commenting on a microblog. In this case, the browser 502 is configured to identify the comment-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the comment-microblog keyword (that is, the comment-microblog speech text domain), parse the comment-microblog function command from the control command domain corresponding to that speech text domain, and run the comment-microblog function command to comment on a microblog in the web page.

In one embodiment, the web page control command is commenting on and relaying a microblog. In this case, the browser 502 is configured to identify the comment-and-relay keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the comment-and-relay keyword (that is, the comment-and-relay speech text domain), parse the comment-and-relay function command from the control command domain corresponding to that speech text domain, and run the comment-and-relay function command to comment on and relay a microblog in the web page.

Although some embodiments of web page control commands and extensible tags are listed above, those skilled in the art will appreciate that the list is merely exemplary and is not intended to limit the scope of protection of the embodiments of the present invention.

As can be seen from the above technical solution, in embodiments of the present invention a speech text domain and a corresponding control command domain are first set in an HTML tag of a web page, and the control command domain contains a web page control command. A keyword is then identified from a voice command, the speech text domain that matches the keyword is retrieved from the HTML tags of the web page, and the web page control command contained in the corresponding control command domain is executed. Thus, by extending HTML tags and recognizing voice keywords, embodiments of the present invention achieve voice control of web page operations on web page content elements. Because the control targets the specific web page rather than generic commands, embodiments of the present invention significantly improve operation versatility.

In addition, the present invention can extend any tag selected from the many HTML tags, so its concrete application forms are highly varied, which also gives developers a wide range of choices.

The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A voice control method for web page operations, characterized by comprising:

2. The method according to claim 1, characterized in that the method further comprises: setting an input/output device control command domain in the HTML tag, the input/output device control command domain containing the web page control command;

when an operation of an input/output device is received, executing the web page control command contained in the input/output device control command domain.

3. The method according to claim 1, characterized in that the method further comprises: setting the web page control command by means of a custom function.

4. The method according to any one of claims 1-3, characterized in that the web page control command comprises at least one of the following:

posting a microblog;

relaying a microblog;

commenting on a microblog;

commenting on and relaying a microblog;

sending an email;

sending a private message; or

uploading an attachment.

5. The method according to any one of claims 1-3, characterized in that the HTML tag comprises at least one of the following:

the Input tag;

the Div tag;

the Table tag;

the Tbody tag;

the Tfoot tag; or

the Caption tag.

6. The method according to claim 1, characterized in that the web page control command is posting a microblog; the method comprises:

identifying a post-microblog keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the post-microblog keyword, and parsing a post-microblog function command from the control command domain corresponding to that speech text domain;

running the post-microblog function command to post a microblog in the web page.

7. The method according to claim 1, characterized in that the web page control command is relaying a microblog; the method comprises:

identifying a relay-microblog keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the relay-microblog keyword, and parsing a relay-microblog function command from the control command domain corresponding to that speech text domain;

running the relay-microblog function command to relay a microblog in the web page.

8. The method according to claim 1, characterized in that the web page control command is commenting on a microblog; the method comprises:

identifying a comment-microblog keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the comment-microblog keyword, and parsing a comment-microblog function command from the control command domain corresponding to that speech text domain;

running the comment-microblog function command to comment on a microblog in the web page.

9. The method according to claim 1, characterized in that the web page control command is commenting on and relaying a microblog; the method comprises:

identifying a comment-and-relay keyword from the voice command, retrieving from the HTML tags of the web page the speech text domain that matches the comment-and-relay keyword, and parsing a comment-and-relay function command from the control command domain corresponding to that speech text domain;

running the comment-and-relay function command to comment on and relay a microblog in the web page.

10. A voice control system for web page operations, characterized in that the system comprises a web page setting unit and a browser, wherein:

11. The system according to claim 10, characterized in that

the web page setting unit is further configured to set an input/output device control command domain in the HTML tag, the input/output device control command domain containing the web page control command;

the browser is further configured to execute, when an operation of an input/output device is received, the web page control command contained in the input/output device control command domain.

12. The system according to claim 10, characterized in that

the web page setting unit is further configured to set the web page control command by means of a custom function.

13. The system according to any one of claims 10-12, characterized in that the web page control command comprises at least one of the following:

posting a microblog;

relaying a microblog;

commenting on a microblog;

commenting on and relaying a microblog;

sending an email;

sending a private message; or

uploading an attachment.

14. The system according to any one of claims 10-12, characterized in that the HTML tag comprises at least one of the following:

the Input tag;

the Div tag;

the Table tag;

the Tbody tag;

the Tfoot tag; or

the Caption tag.

15. The system according to claim 10, characterized in that the web page control command is posting a microblog;

the browser is configured to identify a post-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the post-microblog keyword, parse a post-microblog function command from the control command domain corresponding to that speech text domain, and run the post-microblog function command to post a microblog in the web page.

16. The system according to claim 10, characterized in that the web page control command is relaying a microblog;

the browser is configured to identify a relay-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the relay-microblog keyword, parse a relay-microblog function command from the control command domain corresponding to that speech text domain, and run the relay-microblog function command to relay a microblog in the web page.

17. The system according to claim 10, characterized in that the web page control command is commenting on a microblog;

the browser is configured to identify a comment-microblog keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the comment-microblog keyword, parse a comment-microblog function command from the control command domain corresponding to that speech text domain, and run the comment-microblog function command to comment on a microblog in the web page.

18. The system according to claim 10, characterized in that the web page control command is commenting on and relaying a microblog;

the browser is configured to identify a comment-and-relay keyword from the voice command, retrieve from the HTML tags of the web page the speech text domain that matches the comment-and-relay keyword, parse a comment-and-relay function command from the control command domain corresponding to that speech text domain, and run the comment-and-relay function command to comment on and relay a microblog in the web page.