WO2017028601A1 - Voice control method and apparatus for a smart terminal, and television system - Google Patents

Info

Publication number
WO2017028601A1
Authority
WO
WIPO (PCT)
Prior art keywords
control
voice
text information
controllable
information
Application number
PCT/CN2016/084476
Inventor
韩菁
Original Assignee
深圳Tcl数字技术有限公司
Application filed by 深圳Tcl数字技术有限公司
Publication of WO2017028601A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • the present invention relates to the field of voice control technologies, and in particular, to a voice control method, apparatus, and television system for an intelligent terminal.
  • Under the trend of triple play, smart terminals have gradually become the center of home entertainment. Given the complex new functions of smart terminals and the ever-increasing variety of application software, manual operation can no longer meet consumers' need for simple and convenient control of smart terminals.
  • Smartphones have mainstream voice assistant tools such as Siri and Xiao-i Robot, whereas smart TVs still lack intelligent voice assistants.
  • The voice assistant tools currently used by smart TVs rely only on preset templates of user utterances and their corresponding data, or offer only voice interaction themed on chat and entertainment. They do not provide true full-range voice control of the television, so users cannot set aside the remote control and control all of the television's functions by voice.
  • the main object of the present invention is to provide a voice control method, device and television system for an intelligent terminal, which aims to solve the problem that the existing smart terminal cannot implement full-range voice control.
  • the present invention provides a voice control method for an intelligent terminal, where the voice control method of the smart terminal includes:
  • the voice command sent by the voice input device is an audio stream.
  • the present invention further provides a voice control device for an intelligent terminal, where the voice control device of the smart terminal includes:
  • the collecting module is configured to collect, according to the view attribute of the currently displayed page of the smart terminal, parameter information of all controllable control objects on the current display page when receiving the voice instruction sent by the voice input device;
  • a matching module configured to match the semantic text information obtained from the voice instruction through speech semantic recognition with the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
  • a triggering module configured to trigger a corresponding control operation of the controllable control object.
  • the present invention further provides a television system, characterized in that the television system comprises a television set, a voice input device, a voice semantic recognition server, and the television set is provided with a voice receiving device and a voice control device.
  • the voice receiving device receives the voice command input by the voice input device, and sends the voice command to the voice semantic recognition server for voice semantic recognition to obtain semantic text information;
  • the voice control device is the voice control device of the smart terminal described above, configured to obtain a controllable control object matching the semantic text information on the current display page of the television, and to trigger the corresponding control operation of the controllable control object.
  • the invention provides a voice control method and device for a smart terminal and a television system.
  • Working at the system layer of the smart terminal's background system, the invention collects the parameter information of all controllable control objects on the current display page, which enables voice control of any controllable control object on that page. Because the controllable control objects on every display page can be collected, full-range voice control of the smart terminal is achieved.
  • FIG. 1 is a schematic flowchart of a voice control method of a smart terminal according to a first embodiment of the present invention
  • FIG. 2 is a schematic diagram of a first refinement process of collecting parameter information of all controllable control objects currently displayed on an intelligent terminal in a second embodiment of the voice control method of the smart terminal according to the present invention
  • FIG. 3 is a schematic diagram of a first refinement process of acquiring a controllable control object corresponding to a voice instruction in a third embodiment of a voice control method of a smart terminal according to the present invention
  • FIG. 4 is a schematic diagram of a second refinement process of collecting parameter information of all controllable control objects currently displayed on the smart terminal in a fourth embodiment of the voice control method of the smart terminal according to the present invention
  • FIG. 5 is a schematic diagram of a second refinement process of acquiring a controllable control object corresponding to a voice instruction in a fifth embodiment of a voice control method of a smart terminal according to the present invention
  • FIG. 6 is a schematic diagram of functional modules of a first embodiment of a voice control apparatus of a smart terminal according to the present invention.
  • FIG. 7 is a schematic diagram of a first refinement function module of an acquisition module in a second embodiment of a voice control device of an intelligent terminal according to the present invention.
  • FIG. 8 is a schematic diagram of a first refinement function module of a matching module in a third embodiment of a voice control device of an intelligent terminal according to the present invention.
  • FIG. 9 is a schematic diagram of a second refinement function module of the acquisition module in the fourth embodiment of the voice control device of the intelligent terminal of the present invention.
  • FIG. 10 is a schematic diagram of a second refinement function module of a matching module in a fifth embodiment of a voice control device of a smart terminal according to the present invention.
  • FIG. 11 is a schematic structural view of a television system of the present invention.
  • the present invention provides a voice control method for an intelligent terminal. As shown in FIG. 1 , a flow chart of a voice control method for a smart terminal according to a first embodiment of the present invention is shown.
  • the voice input device can be a mobile terminal or a remote controller.
  • the mobile terminal may be a mobile phone, a tablet computer, or the like, which can input voice through an instant messaging voice module or a multi-screen interactive voice module.
  • For example, a mobile phone can control the television by voice after a WeChat TV application is installed.
  • The remote controller may be any remote controller that supports voice input.
  • the controls on the display page are divided into controllable controls and non-controllable controls.
  • Controllable controls can perform further operations according to instructions, and their control attribute is controllable; non-controllable controls only display content on the page and cannot perform further operations, and their control attribute is uncontrollable.
  • The collected parameter information of a controllable control object includes its text information, its control identifier, the control type to which it belongs (for example, a button class, a radio box class, or a list class), the URL address of the control object, and so on.
  • the user inputs a voice command using a voice input device such as a mobile terminal or a remote controller. While the user inputs the voice command, the voice input device converts the voice command input by the user into an audio stream in real time and sends the voice command to the smart terminal.
  • the smart terminal may be a smart TV, or may be a smart mobile terminal such as a mobile phone or a tablet computer.
  • The smart terminal starts collecting, at the system layer of its background system, the control information of all control objects on the current display page, filters out of the collected control objects those whose control attribute is controllable as the controllable control objects, and then obtains the parameter information of the controllable control objects.
  • the control base class on which each control is based is defined in the system layer of the background system of the smart terminal, and the control types to which all the control objects belong are derived based on the base class of the control.
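The collection step described above can be sketched roughly as follows. This is a minimal illustration only: the `Control` base class, its fields, and the `Button` and `Label` subclasses are hypothetical names chosen for the example, not the actual system-layer classes of any platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Control:
    """Hypothetical control base class; every control type derives from it."""
    text: str = ""
    control_id: str = ""
    controllable: bool = False
    children: List["Control"] = field(default_factory=list)


class Button(Control):
    """Hypothetical control type derived from the base class."""


class Label(Control):
    """Hypothetical display-only control type."""


def walk(root):
    """Yield the root control and every descendant, depth first."""
    yield root
    for child in root.children:
        yield from walk(child)


def collect_controllable(root):
    """Return parameter information for each controllable control on the page."""
    return [
        {"text": c.text, "id": c.control_id, "type": type(c).__name__}
        for c in walk(root)
        if c.controllable
    ]


# A page with two controllable buttons and one display-only label.
page = Control(children=[
    Button(text="Play", control_id="btn_play", controllable=True),
    Label(text="Now showing", control_id="lbl_title"),
    Button(text="Settings", control_id="btn_settings", controllable=True),
])
params = collect_controllable(page)
```

Because the traversal depends only on the shared base class, the same code would cover controls defined by any application, which is the property the text attributes to system-layer collection.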
  • After receiving the audio stream of the voice command, the smart terminal sends the audio stream to a speech recognizer, which may be a module or unit in the smart terminal or a third-party speech recognition server.
  • The speech recognizer recognizes the audio stream and outputs the final recognition result, that is, voice text information, and then returns the voice text information to the smart terminal.
  • After receiving the voice text information, the smart terminal sends it to a semantic recognizer, which may be a module or unit in the smart terminal or a third-party semantic recognition server.
  • The semantic recognizer receives the voice text information, performs word segmentation, identifies the key verb and the key search object, and outputs the final recognition result, that is, the semantic text information, which it then returns to the smart terminal.
  • Alternatively, after receiving the audio stream of the voice instruction, the smart terminal sends the audio stream to a voice semantic recognition server, which performs speech recognition on the input audio stream and then semantic recognition; once the final semantic text information is obtained, it is returned to the smart terminal.
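The two-stage recognition flow described above might be wired together as in the sketch below. Everything here is an assumption made for illustration: the `ACTION_WORDS` list, the whitespace-based word segmentation, and the stand-in speech recognizer that simply decodes bytes are not part of the source, which leaves both recognizers unspecified.

```python
from typing import Callable

# Assumed key verbs, for illustration only.
ACTION_WORDS = ("play", "open", "select")


def semantic_recognize(voice_text: str) -> dict:
    """Toy semantic step: split voice text into a key verb and a search object."""
    words = voice_text.lower().split()
    verb = next((w for w in words if w in ACTION_WORDS), None)
    target = " ".join(w for w in words if w != verb)
    return {"verb": verb, "object": target}


def recognize(audio: bytes, speech_recognizer: Callable[[bytes], str]) -> dict:
    """Run speech recognition first, then semantic recognition on its output."""
    return semantic_recognize(speech_recognizer(audio))


def fake_speech_recognizer(audio: bytes) -> str:
    # Stand-in for a local module or third-party server; decodes bytes as text.
    return audio.decode("utf-8")


result = recognize(b"Play Evening News", fake_speech_recognizer)
```

Passing the speech recognizer in as a callable mirrors the point that it may be either a local module or a remote server; the caller does not need to know which.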
  • After receiving the semantic text information, the smart terminal matches it against the text information of all collected controllable control objects of the current display page to obtain the matched text information. From the matched text information and the collected parameter information of all controllable control objects, it then obtains the controllable control object that matches the semantic text information of the voice instruction.
  • a fuzzy matching algorithm may be used in performing matching, and the fuzzy matching algorithm may be any one of the existing fuzzy matching algorithms, for example, a fast Chinese string fuzzy matching algorithm.
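As one possible illustration of the matching step, the sketch below uses the similarity ratio from Python's standard `difflib` module. This is a stand-in, not the fast Chinese string fuzzy matching algorithm mentioned above, and the `threshold` value of 0.6 is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def best_match(semantic_text, control_texts, threshold=0.6):
    """Return the control text most similar to semantic_text, or None."""
    best, best_score = None, 0.0
    for text in control_texts:
        score = SequenceMatcher(None, semantic_text.lower(), text.lower()).ratio()
        if score > best_score:
            best, best_score = text, score
    # Reject weak matches so unrelated speech does not trigger a control.
    return best if best_score >= threshold else None


texts = ["Play", "Settings", "Channel list"]
matched = best_match("setting", texts)
```

A threshold keeps the terminal from acting on utterances that resemble no control on the page; the right value would depend on the language and the recognizer's error rate.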
  • When the obtained controllable control object matching the semantic text information of the voice instruction is a play-program button, the corresponding operation of playing the program is triggered; when it is a drop-down list, the operation of expanding the drop-down list and displaying its content is triggered; when it is the "OK" button in a dialog box, the corresponding operation of the "OK" button is triggered; when it is a link on a web page, a jump to the page that the link points to is triggered.
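The type-dependent triggering described above could look like the following sketch. The dictionary keys (`type`, `id`, `url`) and the returned action strings are hypothetical; a real terminal would invoke the control's own click, expand, or navigation handler instead of returning a string.

```python
def trigger(control):
    """Pick the operation for a matched control from its control type."""
    kind = control["type"]
    if kind == "button":
        return f"click:{control['id']}"
    if kind == "dropdown":
        return f"expand:{control['id']}"
    if kind == "link":
        return f"navigate:{control['url']}"
    raise ValueError(f"unsupported control type: {kind}")


action = trigger({"type": "dropdown", "id": "lst_channels"})
```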
  • The voice control method of the smart terminal proposed by the invention collects controls at the system layer of the smart terminal's background system and can collect the parameter information of every control derived from the system-layer control base class. It is therefore applicable to any third-party application and achieves unified adaptation, which greatly improves the coverage and controllable range of voice control on the smart terminal and realizes full-range voice control in the true sense.
  • step S10 includes:
  • the view attributes are classified into three categories: a dialog view class, a web view class, and an image display view class.
  • Parameter information of the current display page is acquired; it includes the view attribute of the page and all the control objects on the page.
  • Based on this parameter information and the defined control base class, when the view attribute of the current display page is a dialog box or an image display page, parameter information of all control objects currently displayed on the smart terminal is collected. The parameter information includes the text information, controllable attribute, control identifier, and control type of each control object. All controllable control objects are then filtered out of the collected control objects according to whether each object's controllable attribute is controllable.
  • A classification control object list is constructed according to the control types to which the controllable control objects belong; the parameter information of all controllable control objects belonging to the same control type is stored in the same list. The parameter information includes the text information of the control object, the control identifier, and the control type to which it belongs.
  • The text information and the control identifiers of all controllable control objects are encapsulated in a JSON data format (each text information item corresponds to its control identifier), thereby obtaining the parameter information of all controllable control objects.
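A rough sketch of the classification list and the JSON encapsulation follows. The field names and sample controls are invented for the example, and the flat text-to-identifier mapping is only one plausible reading of "encapsulated in a JSON data format"; the source does not specify the JSON layout.

```python
import json
from collections import defaultdict

# Sample controllable controls (hypothetical data).
controls = [
    {"text": "Play", "id": "btn_play", "type": "button"},
    {"text": "OK", "id": "btn_ok", "type": "button"},
    {"text": "Channel list", "id": "lst_channels", "type": "list"},
]

# Classification control object list: one list per control type.
by_type = defaultdict(list)
for c in controls:
    by_type[c["type"]].append(c)

# Text information paired with its control identifier, serialized as JSON.
text_to_id = {c["text"]: c["id"] for c in controls}
payload = json.dumps(text_to_id, ensure_ascii=False)
```

`ensure_ascii=False` keeps non-ASCII control text, such as Chinese labels, readable in the serialized payload.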
  • This method for collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal collects parameters, at the system layer of the smart terminal's background system, according to the view attribute of the current display page and the control base class on which each control object is based. It can therefore collect the parameter information of every controllable control object in the current display page that inherits from the control base class, and is applicable to collecting the various controls of any third-party application.
  • step S20 includes:
  • After the control identifier of the controllable control object corresponding to the matched text information is obtained, the controllable control object corresponding to that control identifier is obtained from the constructed classification control object list, and the corresponding control operation is carried out according to the obtained controllable control object and the control type to which it belongs.
  • The method of the third embodiment of the present invention for obtaining the controllable control object matching the semantic text information of the voice instruction finds the controllable control object corresponding to the matched text information through its control identifier, so the search steps are simpler and easier to implement.
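The identifier-based lookup of the third embodiment can be illustrated as below. The data structures reuse the hypothetical shapes assumed for this example (a text-to-identifier mapping and a per-type list of controls); none of the names come from the source.

```python
def find_control(matched_text, text_to_id, by_type):
    """Resolve matched text to its control via the control identifier."""
    control_id = text_to_id.get(matched_text)
    if control_id is None:
        return None
    # Search the classification list for the control with that identifier.
    for controls in by_type.values():
        for control in controls:
            if control["id"] == control_id:
                return control
    return None


by_type = {
    "button": [
        {"text": "Play", "id": "btn_play", "type": "button"},
        {"text": "OK", "id": "btn_ok", "type": "button"},
    ],
    "list": [{"text": "Channel list", "id": "lst_channels", "type": "list"}],
}
text_to_id = {"Play": "btn_play", "OK": "btn_ok", "Channel list": "lst_channels"}
found = find_control("OK", text_to_id, by_type)
```

The returned entry carries the control type, which is what the triggering step needs in order to choose the operation.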
  • step S10 includes:
  • the webpage parsing information includes label information, text information, and a URL address;
  • the view attributes are classified into three categories: a dialog view class, a web view class, and an image display view class.
  • the HTML source code of the webpage is obtained by performing webpage parsing on the current display page, and the label information, the text information, the URL address, and the like of all the control objects can be obtained in the HTML source code.
  • the parameter information of all controllable control objects that can be linked is filtered out from the webpage parsing information according to the label information of all the control objects, and the parameter information includes text information and a URL address of the control object.
  • the parameter information is then encapsulated in a JSON data format (the text information and the URL address are in a corresponding relationship), thereby obtaining parameter information of all controllable control objects.
  • This method for collecting the parameter information of all controllable control objects on the page currently displayed by the smart terminal obtains, after the web page is parsed, the parameter information of all controllable control objects on the current web page according to their label information; the acquisition procedure is simple and easy to implement.
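For the web view case, the parsing and filtering steps might be sketched with Python's standard `html.parser` module as shown below. Treating `<a>` elements with an `href` as the linkable controllable control objects is an assumption made for this example, and the sample HTML is invented.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the text and href of every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text inside an open link counts as that link's text information.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append({"text": "".join(self._text).strip(), "url": self._href})
            self._href = None


html = '<p>Today</p><a href="/news">News</a> <a href="/sports">Sports</a><span>footer</span>'
parser = LinkCollector()
parser.feed(html)
payload = json.dumps(parser.links)
```

Text outside links, such as the paragraph and footer here, is dropped, so only elements that can actually be activated are exposed to voice matching.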
  • step S20 includes:
  • the method for searching for the corresponding controllable control object according to the matched text information proposed by the fifth embodiment of the present invention is applicable to the searching of all web-based controllable control objects, and the searching step is simple and easy to implement.
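A minimal sketch of the web-page lookup follows, assuming the collected parameter information is a list of text and URL pairs. The `url_for` helper, the sample links, and the `http://tv.example/home` base address are hypothetical; `urljoin` from the standard library resolves a relative link against the current page address.

```python
from urllib.parse import urljoin


def url_for(matched_text, links, base_url):
    """Return the absolute URL of the link whose text equals matched_text."""
    for link in links:
        if link["text"] == matched_text:
            return urljoin(base_url, link["url"])
    return None


links = [
    {"text": "News", "url": "/news"},
    {"text": "Sports", "url": "/sports"},
]
target = url_for("Sports", links, "http://tv.example/home")
```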
  • the present invention also provides a voice control device for a smart terminal.
  • a schematic diagram of a functional module of a voice control device of a smart terminal of the present invention is shown, including:
  • the collecting module 100 is configured to collect, according to the view attribute of the currently displayed page of the smart terminal, parameter information of all controllable control objects on the current display page when receiving the voice instruction sent by the voice input device;
  • the voice input device may be a mobile terminal or a remote controller.
  • the mobile terminal may be a mobile phone, a tablet computer, or the like, which can input voice through an instant messaging voice module or a multi-screen interactive voice module.
  • the mobile phone can implement a voice control television by installing a WeChat TV application software.
  • The remote controller may be any remote controller that supports voice input.
  • the control on the display page is divided into a controllable control and a non-controllable control, wherein the controllable control can perform further operations according to the instruction, and the control attribute is controllable; and the non-controllable control can be used to display the content on the page. No further operations can be performed and its control properties are uncontrollable.
  • The collected parameter information of a controllable control object includes its text information, its control identifier, the control type to which it belongs (for example, a button class, a radio box class, or a list class), the URL address of the control object, and so on.
  • the user inputs a voice command using a voice input device such as a mobile terminal or a remote controller. While the user inputs the voice command, the voice input device converts the voice command input by the user into an audio stream in real time and sends the voice command to the smart terminal.
  • the smart terminal may be a smart TV, or may be a smart mobile terminal such as a mobile phone or a tablet computer.
  • The collecting module 100 starts collecting, at the system layer of the smart terminal's background system, the control information of all control objects on the current display page of the smart terminal, filters out of the collected control objects those whose control attribute is controllable as the controllable control objects, and then obtains the parameter information of the controllable control objects.
  • the control base class on which each control is based is defined in the system layer of the background system of the smart terminal, and the control types to which all the control objects belong are derived based on the base class of the control.
  • The matching module 200 is configured to match the semantic text information obtained from the voice instruction through speech semantic recognition with the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
  • After receiving the audio stream of the voice command, the smart terminal sends the audio stream to a speech recognizer, which may be a module or unit in the smart terminal or a third-party speech recognition server.
  • The speech recognizer recognizes the audio stream and outputs the final recognition result, that is, voice text information, and then returns the voice text information to the smart terminal.
  • After receiving the voice text information, the smart terminal sends it to a semantic recognizer, which may be a module or unit in the smart terminal or a third-party semantic recognition server.
  • The semantic recognizer receives the voice text information, performs word segmentation, identifies the key verb and the key search object, and outputs the final recognition result, that is, the semantic text information, which it then returns to the smart terminal.
  • Alternatively, after receiving the audio stream of the voice instruction, the smart terminal sends the audio stream to a voice semantic recognition server, which performs speech recognition on the input audio stream and then semantic recognition; once the final semantic text information is obtained, it is returned to the smart terminal.
  • The matching module 200 matches the semantic text information against the text information of all collected controllable control objects of the current display page to obtain the matched text information. Then, according to the matched text information and the collected parameter information of all controllable control objects, it obtains the controllable control object that matches the semantic text information of the voice instruction.
  • a fuzzy matching algorithm may be used for matching, and the fuzzy matching algorithm may be any of the existing fuzzy matching algorithms, for example, a fast Chinese string fuzzy matching algorithm.
  • the triggering module 300 is configured to trigger a corresponding control operation of the controllable control object.
  • When the obtained controllable control object matching the semantic text information of the voice instruction is a play-program button, the triggering module 300 triggers the corresponding operation of playing the program; when it is a drop-down list, the triggering module 300 triggers the operation of expanding the drop-down list and displaying its content; when it is the "OK" button in a dialog box, the triggering module 300 triggers the corresponding operation of the "OK" button; when it is a link on a web page, the triggering module 300 triggers a jump to the web page that the link points to.
  • The voice control device of the smart terminal proposed by the present invention collects controls at the system layer of the smart terminal's background system and can collect the parameter information of every control derived from the system-layer control base class. It is therefore applicable to any third-party application and achieves unified adaptation, which greatly improves the coverage and controllable range of voice control on the smart terminal and realizes full-range voice control in the true sense.
  • the collection module 100 includes:
  • The first collecting unit 101 is configured to collect parameter information of all control objects currently displayed on the smart terminal when the view attribute of the current display page is a dialog box or an image display page, and to filter out all controllable control objects from those control objects;
  • the first collecting unit 101 acquires parameter information of the currently displayed page, where the parameter information includes a view attribute of the currently displayed page and all control objects on the page.
  • the first collection unit 101 collects parameter information of all control objects currently displayed on the smart terminal.
  • the parameter information of all the control objects includes the text information of the control object, the controllable attribute, the control identifier, the type of the control, and the like.
  • the first collection unit 101 filters out all controllable control objects from all the collected control objects according to whether the controllable property of the control object is controllable.
  • the first obtaining unit 102 is configured to extract parameter information of all controllable control objects according to the type of the control to which all the controllable control objects belong.
  • the first obtaining unit 102 constructs a classification control object list according to the control type to which all the controllable control objects belong, in which the parameter information of all controllable control objects belonging to the same control type is stored in the classification control object list.
  • the parameter information includes the text information of the control object, the control identifier, and the type of the control to which it belongs.
  • The first obtaining unit 102 encapsulates the text information and the control identifiers of all controllable control objects in a JSON data format (each text information item corresponds to its control identifier), thereby obtaining the parameter information of all controllable control objects.
  • The voice control device of the smart terminal collects parameters according to the view attribute of the current display page and the control base class on which each control object is based. It can collect the parameter information of every controllable control object in the current display page that inherits from the control base class, and is applicable to collecting the various controls of any third-party application.
  • the matching module 200 includes:
  • the first matching unit 201 is configured to match the semantic text information obtained by the voice instruction through the voice semantic recognition with the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
  • The second obtaining unit 202 is configured to obtain, according to the correspondence between the text information and the control identifier, the control identifier of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of the controllable control object according to the control identifier.
  • After obtaining the control identifier, the second obtaining unit 202 obtains the corresponding controllable control object from the constructed classification control object list, so that the corresponding control operation can be carried out according to the obtained controllable control object and the control type to which it belongs.
  • the voice control device of the smart terminal according to the third embodiment of the present invention realizes that the controllable control object corresponding to the matched text information is obtained by searching the control identifier, and the operation is simple and easy to implement.
  • the collection module 100 includes:
  • the second collecting unit 103 is configured to: when the view attribute of the current display page is a webpage, parse the webpage to obtain webpage parsing information; the webpage parsing information includes label information, text information, and a URL address;
  • the view attributes are classified into three categories: a dialog view class, a web view class, and an image display view class.
  • The second collecting unit 103 obtains the HTML source code of the web page by parsing the current display page, and obtains from the HTML source code the label information, text information, URL address, and so on of all the control objects.
  • the third obtaining unit 104 is configured to extract parameter information of all controllable control objects from the webpage parsing information according to the label information.
  • The third obtaining unit 104 filters out of the webpage parsing information, according to the label information of all the control objects, the parameter information of all linkable controllable control objects; the parameter information includes the text information and URL address of each control object.
  • the parameter information is then encapsulated in a JSON data format (the text information and the URL address are in a corresponding relationship), thereby obtaining parameter information of all controllable control objects.
  • the voice control device of the smart terminal according to the fourth embodiment of the present invention realizes obtaining parameter information of all controllable control objects on the current webpage according to the tag information of the controllable control object after the webpage is parsed, and the operation is simple and easy to implement.
  • the matching module 200 includes:
  • the second matching unit 203 is configured to match the semantic text information obtained by the voice instruction through the voice semantic recognition with the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
  • The fourth obtaining unit 204 is configured to obtain, according to the correspondence between the text information and the URL address, the URL address of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of the controllable control object according to the URL address.
  • the voice control device of the smart terminal according to the fifth embodiment of the present invention is applicable to the searching of all web-based controllable control objects, and the operation is simple and easy to implement.
  • the present invention also provides a television system, as shown in Fig. 11, which shows a schematic structural view of a television system of the present invention.
  • the television system includes a television set 500, a voice input device 400, and a voice semantic recognition server 600.
  • the television set is provided with a voice receiving device and a voice control device.
  • the voice receiving device receives the voice instruction input through the voice input device 400 and sends it to the speech semantic recognition server 600 for speech semantic recognition to obtain semantic text information.
  • the voice control device is the voice control device of any of the smart terminals described above, and is used to obtain the controllable control object on the current display page of the television set that matches the semantic text information, and to trigger the corresponding control operation of that controllable control object.
  • the speech semantic recognition server 600 can be a server that can perform both speech recognition and semantic recognition. It can also be two separate servers, namely a speech recognition server and a semantic recognition server.
  • it can be understood that the speech semantic recognition server 600 in the television system can be replaced by a speech semantic recognition module in the voice control device of the television set 500; this module has the same speech semantic recognition function as the speech semantic recognition server 600.
  • the television system provided by the invention supports full voice control of television operation. The television set in the television system is provided with a voice receiving device and a voice control device, which support voice control of the television set. The voice control device collects the parameters of all controllable control objects of the current display page at the system layer of the television's background system, so it can collect the parameter information of every control implemented on the system-layer control base class. It is therefore applicable to any third-party application and achieves uniform adaptation, which greatly improves the coverage and controllable range of voice control on the television set and realizes full voice control in the true sense.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

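A voice control method for a smart terminal, comprising: upon receiving a voice instruction sent by a voice input device, collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal (S10); matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction (S20); and triggering the corresponding control operation of the controllable control object (S30). A voice control device for a smart terminal and a television system are also provided.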

Description

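Voice control method and device for a smart terminal, and television system
Technical Field
The present invention relates to the field of voice control technologies, and in particular to a voice control method and device for a smart terminal and a television system.
Background
Under the trend of triple-play network convergence, smart terminals are gradually becoming the center of home entertainment. Given the complex new functions of smart terminals and the ever-growing variety of application software, manual operation can no longer meet consumers' need for simple, convenient control of smart terminals. At present, smartphones have mainstream voice assistant tools such as Siri and Xiao i Robot, whereas smart TVs are still at a blank stage as far as intelligent voice assistants are concerned. Taking smart TVs as an example, the voice assistant tools that many smart TVs currently use merely preset templates of user utterances and their corresponding data, or offer only voice interaction themed on chat and entertainment. They do not achieve full voice control of the television in any real sense, so users cannot put the remote control aside and control all functions of the television by voice.
Summary of the Invention
The main object of the present invention is to provide a voice control method and device for a smart terminal and a television system, aiming to solve the problem that existing smart terminals cannot be fully controlled by voice.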
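To achieve the above object, the present invention provides a voice control method for a smart terminal, the method comprising:
S10: upon receiving a voice instruction sent by a voice input device, collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal;
S20: matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
S30: triggering the corresponding control operation of the controllable control object;
wherein the voice instruction sent by the voice input device is an audio stream.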
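To achieve the above object, the present invention further provides a voice control device for a smart terminal, the device comprising:
a collection module, configured to, upon receiving a voice instruction sent by a voice input device, collect parameter information of all controllable control objects on the currently displayed page according to the view attribute of the page currently displayed by the smart terminal;
a matching module, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
a trigger module, configured to trigger the corresponding control operation of the controllable control object.
To achieve the above object, the present invention further provides a television system comprising a television set, a voice input device, and a voice semantic recognition server. The television set is provided with a voice receiving device and a voice control device. The voice receiving device receives the voice instruction input through the voice input device and sends it to the voice semantic recognition server for voice semantic recognition to obtain semantic text information. The voice control device is the voice control device for a smart terminal described above, and is configured to obtain the controllable control object on the page currently displayed by the television set that matches the semantic text information, and to trigger the corresponding control operation of that controllable control object.
The present invention proposes a voice control method and device for a smart terminal and a television system. Upon receiving a voice instruction sent by a voice input device, the parameter information of all controllable control objects on the currently displayed page can be collected at the system layer of the smart terminal's background system, which enables voice control of any controllable control object on that page. Because all controllable control objects on every displayed page can be collected, full voice control of the smart terminal is achieved.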
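Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a first embodiment of the voice control method for a smart terminal of the present invention;
Fig. 2 is a first detailed flowchart of collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal, in a second embodiment of the voice control method;
Fig. 3 is a first detailed flowchart of obtaining the controllable control object corresponding to the voice instruction, in a third embodiment of the voice control method;
Fig. 4 is a second detailed flowchart of collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal, in a fourth embodiment of the voice control method;
Fig. 5 is a second detailed flowchart of obtaining the controllable control object corresponding to the voice instruction, in a fifth embodiment of the voice control method;
Fig. 6 is a schematic diagram of the functional modules of a first embodiment of the voice control device for a smart terminal of the present invention;
Fig. 7 is a first detailed functional module diagram of the collection module in a second embodiment of the voice control device;
Fig. 8 is a first detailed functional module diagram of the matching module in a third embodiment of the voice control device;
Fig. 9 is a second detailed functional module diagram of the collection module in a fourth embodiment of the voice control device;
Fig. 10 is a second detailed functional module diagram of the matching module in a fifth embodiment of the voice control device;
Fig. 11 is a schematic structural diagram of the television system of the present invention.
The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.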
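Detailed Description
It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.
The present invention provides a voice control method for a smart terminal. Fig. 1 shows a schematic flowchart of a first embodiment of the method, which comprises:
S10: upon receiving a voice instruction sent by a voice input device, collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal;
The voice input device may be a mobile terminal or a remote control. The mobile terminal may be a mobile phone, tablet computer, or other terminal that can take voice input through an instant-messaging voice module or a multi-screen interaction voice module; for example, with a television set, a mobile phone can control the television by voice once the WeChat TV application is installed. The remote control may be any remote control that supports voice input.
Controls on a displayed page are divided into controllable controls and non-controllable controls. A controllable control can perform a further operation in response to an instruction, and its control attribute is "controllable"; a non-controllable control can be used to display content on the page but cannot perform a further operation, and its control attribute is "non-controllable". The collected parameter information of a controllable control object includes its text information, its control identifier, the control type to which it belongs (for example, button, radio button, or list), its URL address, and so on.
The user inputs a voice instruction through a voice input device such as a mobile terminal or remote control. While the user is speaking, the voice input device converts the voice instruction into an audio stream in real time and sends it to the smart terminal. The smart terminal may be a smart TV, or a smart mobile terminal such as a mobile phone or tablet computer. As soon as the smart terminal receives the audio stream of the voice instruction, it begins collecting, at the system layer of its background system, the control information of all control objects on the currently displayed page, and filters out of the collected control objects those whose controllable attribute is "controllable", thereby obtaining the parameter information of the controllable control objects. The system layer of the smart terminal's background system defines the control base class on which every control is based, and the control types of all control objects are derived from that control base class.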
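The filtering part of step S10 can be illustrated with a minimal Python sketch. The `Control` class, its field names, and the sample page are assumptions made purely for illustration; the patent does not specify how the system-layer control base class is represented.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical stand-in for an object derived from the control base class."""
    control_id: str
    text: str
    control_type: str   # e.g. "button", "radio", "list"
    controllable: bool  # the "controllable" attribute described in S10


def collect_controllable(controls):
    """Keep only the controls whose controllable attribute is set."""
    return [c for c in controls if c.controllable]


# Illustrative page content, not taken from the patent.
page = [
    Control("btn_play", "Play", "button", True),
    Control("lbl_title", "Now showing", "label", False),
    Control("lst_episodes", "Episodes", "list", True),
]
controllable_ids = [c.control_id for c in collect_controllable(page)]
```

Here the label is dropped because it only displays content, while the button and the list remain as candidates for voice control.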
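S20: matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
After the smart terminal has received the complete audio stream of the voice instruction, it sends the audio stream to a speech recognizer, which may be a module or unit within the smart terminal or a third-party speech recognition server. The speech recognizer recognizes the received audio stream, outputs the final recognition result, namely the voice text information, and returns it to the smart terminal.
On receiving the voice text information, the smart terminal sends it to a semantic recognizer, which may likewise be a module or unit within the smart terminal or a third-party semantic recognition server. The semantic recognizer performs word segmentation on the voice text information, identifies the key verb and the key search object, outputs the final recognition result, namely the semantic text information, and returns it to the smart terminal.
Another recognition scenario is also possible: after receiving the complete audio stream of the voice instruction, the smart terminal sends it to a voice semantic recognition server, which performs speech recognition on the input audio stream and then semantic recognition; once the final semantic text information has been obtained, it is returned to the smart terminal.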
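The two-stage recognition flow (speech recognition, then semantic recognition) can be sketched as a simple function composition. This is only an illustrative sketch: `recognize` and the two `fake_*` stand-ins are hypothetical names, and the toy semantic step merely drops filler words instead of performing real word segmentation.

```python
from typing import Callable


def recognize(audio: bytes,
              speech_to_text: Callable[[bytes], str],
              text_to_semantics: Callable[[str], str]) -> str:
    """Run speech recognition first, then semantic recognition on its output."""
    voice_text = speech_to_text(audio)     # audio stream -> voice text
    return text_to_semantics(voice_text)   # voice text -> semantic text


# Toy stand-ins so the sketch runs without any real recognizer.
def fake_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_text_to_semantics(text: str) -> str:
    filler = {"please", "the"}
    return " ".join(w for w in text.lower().split() if w not in filler)


semantic_text = recognize(b"Please play the news",
                          fake_speech_to_text, fake_text_to_semantics)
```

Because both stages are passed in as callables, the same flow covers a local module, two separate servers, or one combined voice semantic recognition server.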
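After receiving the semantic text information, the smart terminal matches it against the text information of all the collected controllable control objects of the currently displayed page to obtain the matched text information. From the matched text information and the collected parameter information of all controllable control objects, the controllable control object that matches the semantic text information of the voice instruction can then be obtained. A fuzzy matching algorithm may be used for the matching; it may be any existing fuzzy matching algorithm, for example a fast Chinese string fuzzy matching algorithm.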
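As a rough illustration of the fuzzy matching step, the sketch below uses `difflib.get_close_matches` from the Python standard library. This is a swapped-in, general-purpose similarity measure chosen for brevity, not the fast Chinese string fuzzy matching algorithm the text mentions; the cutoff value and the sample control texts are assumptions.

```python
import difflib


def match_control_text(semantic_text, control_texts, cutoff=0.6):
    """Return the control text most similar to the semantic text, or None."""
    matches = difflib.get_close_matches(semantic_text, control_texts,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Illustrative control texts collected from a page.
texts = ["Play", "OK", "Cancel", "Next page"]
best = match_control_text("Next pages", texts)
no_match = match_control_text("Volume", texts)
```

A slightly inexact utterance such as "Next pages" still resolves to the "Next page" control, while an utterance with no similar control text yields no match, so nothing is triggered.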
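S30: triggering the corresponding control operation of the controllable control object.
For example, when the controllable control object that matches the semantic text information of the voice instruction is a play-program button, the operation of playing that program is triggered; when it is a drop-down list, the operation of expanding the list and displaying its contents is triggered; when it is the "OK" button in a dialog box, the operation corresponding to the "OK" button is triggered; and when it is a link on a web page, the operation of jumping to the web page that the link points to is triggered.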
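The examples for step S30 amount to dispatching on the type of the matched control. A minimal sketch follows, assuming controls are plain dictionaries and that each operation can be described by a string; the type names and keys are hypothetical.

```python
def trigger(control):
    """Dispatch on control type and return a description of the action taken."""
    handlers = {
        "play_button": lambda c: "play:" + c["program"],
        "dropdown": lambda c: "expand:" + c["id"],
        "dialog_button": lambda c: "press:" + c["id"],
        "link": lambda c: "navigate:" + c["url"],
    }
    handler = handlers.get(control["type"])
    if handler is None:
        raise ValueError("no handler for control type " + repr(control["type"]))
    return handler(control)


actions = [
    trigger({"type": "play_button", "id": "btn_play", "program": "Evening News"}),
    trigger({"type": "dropdown", "id": "lst_source"}),
    trigger({"type": "link", "id": "a1", "url": "http://example.com/sports"}),
]
```

In a real terminal each handler would invoke the control's own operation; returning strings simply keeps the sketch self-contained.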
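The voice control method proposed by the present invention collects controls at the system layer of the smart terminal's background system, so it can collect the parameter information of every control implemented on the system-layer control base class. It therefore works with any third-party application and achieves uniform adaptation, greatly increasing the coverage and controllable range of voice control on the smart terminal and realizing full voice control in the true sense.
Further, a second embodiment of the voice control method for a smart terminal is proposed based on the first embodiment. In this embodiment, as shown in Fig. 2, step S10 comprises:
S11: when the view attribute of the currently displayed page is a dialog box or an image display page, collecting parameter information of all control objects on the page currently displayed by the smart terminal, and filtering out all controllable control objects from them;
Every displayed page has a view attribute. In this embodiment, view attributes fall into three classes: dialog box view, web page view, and image display view.
The parameter information of the currently displayed page is obtained; it includes the page's view attribute and all control objects on the page. Based on this parameter information and the defined control base class, when the view attribute of the currently displayed page is a dialog box or an image display page, parameter information of all control objects on the page is collected, including each control object's text information, controllable attribute, control identifier, and control type. All controllable control objects are then filtered out of the collected control objects according to whether each control object's controllable attribute is "controllable".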
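The role of the view attribute can be pictured as selecting a collection strategy: dialog box and image display pages are handled by collecting control objects, whereas web pages are handled by parsing the page (as the fourth embodiment describes). The enum and the strategy names below are hypothetical labels used only for illustration.

```python
from enum import Enum


class ViewType(Enum):
    """The three view attribute classes named in this embodiment."""
    DIALOG = "dialog"
    WEB = "web"
    IMAGE = "image"


def collection_strategy(view_type: ViewType) -> str:
    """Pick how controllable controls are collected for a given view attribute."""
    if view_type is ViewType.WEB:
        return "parse_html"      # web pages: parse the page source
    return "walk_controls"       # dialog box / image display: collect control objects


strategies = {v.name: collection_strategy(v) for v in ViewType}
```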
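S12: extracting the parameter information of all controllable control objects according to the control types to which they belong.
A classified control object list is built according to the control types of all the controllable control objects. In this list, the parameter information of all controllable control objects of the same control type is stored in the same list; the parameter information includes each control object's text information, control identifier, and control type. Once the classified control object list has been built, the text information and control identifiers of all controllable control objects are encapsulated in JSON data format (each text corresponds to a control identifier), thereby obtaining the parameter information of all controllable control objects.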
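Step S12 can be sketched as grouping controls by type and then serializing a text-to-identifier mapping. The dictionary layout and the exact JSON shape are assumptions; the patent states only that text information and control identifiers are encapsulated in JSON in a corresponding relationship.

```python
import json
from collections import defaultdict


def build_classified_list(controls):
    """Group controllable controls so that each control type has its own list."""
    by_type = defaultdict(list)
    for control in controls:
        by_type[control["type"]].append(control)
    return dict(by_type)


def encapsulate(controls):
    """Serialize the text -> control identifier correspondence as JSON."""
    return json.dumps({c["text"]: c["id"] for c in controls}, ensure_ascii=False)


# Illustrative controllable controls, not taken from the patent.
controls = [
    {"id": "btn_ok", "text": "OK", "type": "button"},
    {"id": "btn_cancel", "text": "Cancel", "type": "button"},
    {"id": "lst_source", "text": "Input source", "type": "list"},
]
classified = build_classified_list(controls)
payload = encapsulate(controls)
```

Keying the JSON object by text is one simple way to express the correspondence; it assumes that control texts on a page are unique.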
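The method of the second embodiment for collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal collects parameters, according to the view attribute of the current page, at the system layer of the smart terminal's background system on the basis of the control base class underlying each control object. It can collect the parameter information of every controllable control object on the current page that inherits from the control base class, and is applicable to collecting all kinds of controls in any third-party application.
Further, a third embodiment of the voice control method for a smart terminal is proposed based on the second embodiment. In this embodiment, as shown in Fig. 3, step S20 comprises:
S21: matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
S22: obtaining, from the correspondence between text information and control identifiers, the control identifier of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the control identifier.
In this embodiment, after the control identifier of the controllable control object corresponding to the matched text information has been obtained, the controllable control object corresponding to that control identifier is retrieved from the classified control object list that was built, so that the corresponding control operation can be carried out according to the retrieved object and its control type.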
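The lookup chain in this embodiment (matched text, then control identifier, then control object and its type) can be sketched as follows. The data layout mirrors a type-keyed classified list and is an assumption; `find_control` is a hypothetical helper name.

```python
def find_control(classified, text_to_id, matched_text):
    """Resolve matched text -> control identifier -> (control type, control)."""
    control_id = text_to_id.get(matched_text)
    if control_id is None:
        return None
    for control_type, controls in classified.items():
        for control in controls:
            if control["id"] == control_id:
                return control_type, control
    return None


# Illustrative classified control object list and text -> identifier mapping.
classified = {
    "button": [{"id": "btn_ok", "text": "OK"},
               {"id": "btn_cancel", "text": "Cancel"}],
    "list": [{"id": "lst_source", "text": "Input source"}],
}
text_to_id = {"OK": "btn_ok", "Cancel": "btn_cancel", "Input source": "lst_source"}
found = find_control(classified, text_to_id, "Input source")
```

Returning the control type together with the object lets the caller choose the operation appropriate to that type, for example expanding a list rather than pressing a button.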
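The method of the third embodiment for obtaining the controllable control object that matches the semantic text information of the voice instruction looks up the object corresponding to the matched text information through its control identifier, which makes the lookup simpler and easy to implement.
Further, a fourth embodiment of the voice control method for a smart terminal is proposed based on the first embodiment. In this embodiment, as shown in Fig. 4, step S10 comprises:
S13: when the view attribute of the currently displayed page is a web page, parsing the web page to obtain web page parsing information, which includes tag information, text information, and URL addresses;
Every displayed page has a view attribute. In this embodiment, view attributes fall into three classes: dialog box view, web page view, and image display view.
When the view attribute is a web page, the currently displayed page is parsed to obtain the HTML source of the web page, from which the tag information, text information, URL addresses, and so on of all control objects can be obtained.
S14: extracting the parameter information of all controllable control objects from the web page parsing information according to the tag information.
According to the tag information of all control objects, the parameter information of all controllable control objects whose tag attribute is linkable is filtered out of the web page parsing information; this parameter information includes each control object's text information and URL address. The parameter information is then encapsulated in JSON data format (each text corresponds to a URL address), thereby obtaining the parameter information of all controllable control objects.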
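Steps S13 and S14 can be approximated with the standard-library `html.parser` module. The sketch treats `<a href=...>` elements as the "linkable" tags, which is an assumption; the patent does not list which tags count as linkable. The sample HTML is illustrative.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the text and URL of every <a href=...> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = {}      # text -> URL correspondence
        self._href = None    # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if text:
                self.links[text] = self._href
            self._href = None


html = '<p>Today</p><a href="/news">News</a><span>ad</span><a href="/sports">Sports</a>'
collector = LinkCollector()
collector.feed(html)
collector.close()
payload = json.dumps(collector.links)
```

Text outside anchors (the paragraph and the span) is ignored, so only linkable elements end up in the JSON payload.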
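The method of the fourth embodiment for collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal obtains the parameter information of all controllable control objects on the current web page from the tag information of the control objects after the web page is parsed; the collection steps are simple and easy to implement.
Further, a fifth embodiment of the voice control method for a smart terminal is proposed based on the fourth embodiment. In this embodiment, as shown in Fig. 5, step S20 comprises:
S23: matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
S24: obtaining, from the correspondence between text information and URLs, the URL address of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the URL address.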
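Step S24 reduces to a dictionary lookup from matched text to URL. The sketch below additionally resolves relative links against the current page address with `urllib.parse.urljoin`; that resolution step, the helper name `resolve_target`, and the example addresses are assumptions, not something the patent specifies.

```python
from urllib.parse import urljoin


def resolve_target(page_url, text_to_url, matched_text):
    """Look up the URL for the matched text and make it absolute."""
    href = text_to_url.get(matched_text)
    if href is None:
        return None
    return urljoin(page_url, href)


text_to_url = {"News": "/news", "Sports": "/sports"}
target = resolve_target("http://tv.example.com/home/index.html",
                        text_to_url, "News")
```

The resulting address is what the terminal would navigate to in order to trigger the link's operation.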
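The method of the fifth embodiment for looking up the corresponding controllable control object from the matched text information is applicable to all web-based controllable control objects; the lookup steps are simple and easy to implement.
The present invention further provides a voice control device for a smart terminal. Fig. 6 shows a schematic diagram of the functional modules of a first embodiment of the device, which comprises:
a collection module 100, configured to, upon receiving a voice instruction sent by a voice input device, collect parameter information of all controllable control objects on the currently displayed page according to the view attribute of the page currently displayed by the smart terminal;
The voice input device may be a mobile terminal or a remote control. The mobile terminal may be a mobile phone, tablet computer, or other terminal that can take voice input through an instant-messaging voice module or a multi-screen interaction voice module; for example, with a television set, a mobile phone can control the television by voice once the WeChat TV application is installed. The remote control may be any remote control that supports voice input. Controls on a displayed page are divided into controllable controls and non-controllable controls. A controllable control can perform a further operation in response to an instruction, and its control attribute is "controllable"; a non-controllable control can be used to display content on the page but cannot perform a further operation, and its control attribute is "non-controllable".
The collected parameter information of a controllable control object includes its text information, its control identifier, the control type to which it belongs (for example, button, radio button, or list), its URL address, and so on.
The user inputs a voice instruction through a voice input device such as a mobile terminal or remote control. While the user is speaking, the voice input device converts the voice instruction into an audio stream in real time and sends it to the smart terminal. The smart terminal may be a smart TV, or a smart mobile terminal such as a mobile phone or tablet computer. As soon as the smart terminal receives the audio stream of the voice instruction, the collection module 100 begins collecting, at the system layer of the smart terminal's background system, the control information of all control objects on the currently displayed page, and filters out of the collected control objects those whose controllable attribute is "controllable", thereby obtaining the parameter information of the controllable control objects. The system layer of the smart terminal's background system defines the control base class on which every control is based, and the control types of all control objects are derived from that control base class.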
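a matching module 200, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
After the smart terminal has received the complete audio stream of the voice instruction, it sends the audio stream to a speech recognizer, which may be a module or unit within the smart terminal or a third-party speech recognition server. The speech recognizer recognizes the received audio stream, outputs the final recognition result, namely the voice text information, and returns it to the smart terminal.
On receiving the voice text information, the smart terminal sends it to a semantic recognizer, which may likewise be a module or unit within the smart terminal or a third-party semantic recognition server. The semantic recognizer performs word segmentation on the voice text information, identifies the key verb and the key search object, outputs the final recognition result, namely the semantic text information, and returns it to the smart terminal.
Another recognition scenario is also possible: after receiving the complete audio stream of the voice instruction, the smart terminal sends it to a voice semantic recognition server, which performs speech recognition on the input audio stream and then semantic recognition; once the final semantic text information has been obtained, it is returned to the smart terminal.
After the smart terminal receives the semantic text information, the matching module 200 matches it against the text information of all the collected controllable control objects of the currently displayed page to obtain the matched text information. From the matched text information and the collected parameter information of all controllable control objects, the controllable control object that matches the semantic text information of the voice instruction can then be obtained. A fuzzy matching algorithm may be used for the matching; it may be any existing fuzzy matching algorithm, for example a fast Chinese string fuzzy matching algorithm.
a trigger module 300, configured to trigger the corresponding control operation of the controllable control object.
For example, when the controllable control object that matches the semantic text information of the voice instruction is a play-program button, the trigger module 300 triggers the operation of playing that program; when it is a drop-down list, the trigger module 300 triggers the operation of expanding the list and displaying its contents; when it is the "OK" button in a dialog box, the trigger module 300 triggers the operation corresponding to the "OK" button; and when it is a link on a web page, the trigger module 300 triggers the operation of jumping to the web page that the link points to.
The voice control device proposed by the present invention collects controls at the system layer of the smart terminal's background system, so it can collect the parameter information of every control implemented on the system-layer control base class. It therefore works with any third-party application and achieves uniform adaptation, greatly increasing the coverage and controllable range of voice control on the smart terminal and realizing full voice control in the true sense.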
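Further, referring to Fig. 7, a second embodiment of the voice control device for a smart terminal is proposed based on the first embodiment shown in Fig. 6. In this embodiment, the collection module 100 comprises:
a first collection unit 101, configured to, when the view attribute of the currently displayed page is a dialog box or an image display page, collect parameter information of all control objects on the page currently displayed by the smart terminal, and filter out all controllable control objects from them;
Every displayed page has a view attribute. In this embodiment, view attributes fall into three classes: dialog box view, web page view, and image display view. The first collection unit 101 obtains the parameter information of the currently displayed page, which includes the page's view attribute and all control objects on the page. Based on this parameter information and the defined control base class, when the view attribute of the currently displayed page is a dialog box or an image display page, the first collection unit 101 collects parameter information of all control objects on the page, including each control object's text information, controllable attribute, control identifier, and control type. The first collection unit 101 then filters all controllable control objects out of the collected control objects according to whether each control object's controllable attribute is "controllable".
a first obtaining unit 102, configured to extract the parameter information of all controllable control objects according to the control types to which they belong.
The first obtaining unit 102 builds a classified control object list according to the control types of all the controllable control objects. In this list, the parameter information of all controllable control objects of the same control type is stored in the same list; the parameter information includes each control object's text information, control identifier, and control type. Once the classified control object list has been built, the first obtaining unit 102 encapsulates the text information and control identifiers of all controllable control objects in JSON data format (each text corresponds to a control identifier), thereby obtaining the parameter information of all controllable control objects.
The voice control device of the second embodiment collects parameters, according to the view attribute of the currently displayed page, at the system layer of the smart terminal's background system on the basis of the control base class underlying each control object. It can collect the parameter information of every controllable control object on the current page that inherits from the control base class, and is applicable to collecting all kinds of controls in any third-party application.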
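Further, referring to Fig. 8, a third embodiment of the voice control device for a smart terminal is proposed based on the second embodiment shown in Fig. 7. In this embodiment, the matching module 200 of the embodiment shown in Fig. 6 comprises:
a first matching unit 201, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
a second obtaining unit 202, configured to obtain, from the correspondence between text information and control identifiers, the control identifier of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the control identifier.
In this embodiment, after the control identifier of the controllable control object corresponding to the matched text information has been obtained, the second obtaining unit 202 retrieves the controllable control object corresponding to that control identifier from the classified control object list that was built, so that the corresponding control operation can be carried out according to the retrieved object and its control type.
The voice control device of the third embodiment looks up the controllable control object corresponding to the matched text information through its control identifier; the operation is simple and easy to implement.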
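Further, referring to Fig. 9, a fourth embodiment of the voice control device for a smart terminal is proposed based on the first embodiment shown in Fig. 6. In this embodiment, the collection module 100 comprises:
a second collection unit 103, configured to, when the view attribute of the currently displayed page is a web page, parse the web page to obtain web page parsing information, which includes tag information, text information, and URL addresses;
Every displayed page has a view attribute. In this embodiment, view attributes fall into three classes: dialog box view, web page view, and image display view.
When the view attribute is a web page, the second collection unit 103 parses the currently displayed page to obtain the HTML source of the web page, from which the tag information, text information, URL addresses, and so on of all control objects can be obtained.
a third obtaining unit 104, configured to extract the parameter information of all controllable control objects from the web page parsing information according to the tag information.
According to the tag information of all the control objects, the third obtaining unit 104 filters out of the web page parsing information the parameter information of all controllable control objects whose tag attribute is linkable; this parameter information includes each control object's text information and URL address. The parameter information is then encapsulated in JSON data format (each text corresponds to a URL address), thereby obtaining the parameter information of all controllable control objects.
The voice control device of the fourth embodiment obtains the parameter information of all controllable control objects on the current web page from the tag information of the control objects after the web page is parsed; the operation is simple and easy to implement.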
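Further, referring to Fig. 10, a fifth embodiment of the voice control device for a smart terminal is proposed based on the fourth embodiment shown in Fig. 9. In this embodiment, the matching module 200 of the embodiment shown in Fig. 6 comprises:
a second matching unit 203, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
a fourth obtaining unit 204, configured to obtain, from the correspondence between text information and URLs, the URL address of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the URL address.
The voice control device of the fifth embodiment is applicable to looking up all web-based controllable control objects; the operation is simple and easy to implement.
The present invention further provides a television system; Fig. 11 shows a schematic structural diagram of the television system. The television system comprises a television set 500, a voice input device 400, and a voice semantic recognition server 600. The television set is provided with a voice receiving device and a voice control device. The voice receiving device receives the voice instruction input through the voice input device 400 and sends it to the voice semantic recognition server 600 for voice semantic recognition to obtain semantic text information. The voice control device is the voice control device of any of the smart terminals described above, and is configured to obtain the controllable control object on the page currently displayed by the television set that matches the semantic text information, and to trigger the corresponding control operation of that controllable control object.
The voice semantic recognition server 600 may be a single server capable of both speech recognition and semantic recognition, or two separate servers, namely a speech recognition server and a semantic recognition server.
It can be understood that the voice semantic recognition server 600 in the television system may be replaced by a voice semantic recognition module in the voice control device of the television set 500; this module has the same voice semantic recognition function as the voice semantic recognition server 600.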
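The interchangeability of the remote server and an on-device module can be illustrated with a shared interface. This is only a sketch under stated assumptions: `SemanticRecognizer`, `ServerRecognizer`, `LocalRecognizer`, and `handle_voice` are hypothetical names, and the backend is a toy function rather than a real recognizer or network call.

```python
from typing import Protocol


class SemanticRecognizer(Protocol):
    """Anything that turns an audio stream into semantic text."""
    def recognize(self, audio: bytes) -> str: ...


class ServerRecognizer:
    """Hypothetical client for a remote voice semantic recognition server."""
    def __init__(self, send):
        self._send = send  # callable standing in for the network round trip

    def recognize(self, audio: bytes) -> str:
        return self._send(audio)


class LocalRecognizer:
    """Hypothetical on-device module offering the same function."""
    def __init__(self, model):
        self._model = model

    def recognize(self, audio: bytes) -> str:
        return self._model(audio)


def handle_voice(audio: bytes, recognizer: SemanticRecognizer) -> str:
    """The voice control device does not care which recognizer it is given."""
    return recognizer.recognize(audio)


def fake_backend(audio: bytes) -> str:
    return audio.decode("utf-8").strip().lower()


via_server = handle_voice(b" Next Page ", ServerRecognizer(fake_backend))
via_local = handle_voice(b" Next Page ", LocalRecognizer(fake_backend))
```

Because both classes expose the same `recognize` method, the rest of the flow (matching and triggering) is unaffected by which one is used.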
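The television system proposed by the present invention supports full voice control of television operation. The television set in the television system is provided with a voice receiving device and a voice control device, which support voice control of the television set. The voice control device collects the parameters of all controllable control objects of the currently displayed page at the system layer of the television's background system, so it can collect the parameter information of every control implemented on the system-layer control base class. It is therefore applicable to any third-party application and achieves uniform adaptation, greatly increasing the coverage and controllable range of voice control on the television set and realizing full voice control in the true sense.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.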

Claims (15)

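  1. A voice control method for a smart terminal, characterized in that the method comprises:
    S10: upon receiving a voice instruction sent by a voice input device, collecting parameter information of all controllable control objects on the page currently displayed by the smart terminal;
    S20: matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
    S30: triggering the corresponding control operation of the controllable control object;
    wherein the voice instruction sent by the voice input device is an audio stream.
  2. The voice control method for a smart terminal according to claim 1, characterized in that step S10 comprises:
    when the view attribute of the currently displayed page is a dialog box or an image display page, collecting parameter information of all control objects on the page currently displayed by the smart terminal, and filtering out all controllable control objects from them;
    extracting the parameter information of all controllable control objects according to the control types to which they belong.
  3. The voice control method for a smart terminal according to claim 2, characterized in that the parameter information includes text information and control identifiers, the text information and control identifiers being in a corresponding relationship; step S20 comprises:
    matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
    obtaining, from the correspondence between the text information and the control identifiers, the control identifier of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the control identifier.
  4. The voice control method for a smart terminal according to claim 1, characterized in that step S10 comprises:
    when the view attribute of the currently displayed page is a web page, parsing the web page to obtain web page parsing information, the web page parsing information including tag information, text information, and URL addresses;
    extracting the parameter information of all controllable control objects from the web page parsing information according to the tag information.
  5. The voice control method for a smart terminal according to claim 4, characterized in that the parameter information includes text information and URL addresses, the text information and URL addresses being in a corresponding relationship; step S20 comprises:
    matching the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
    obtaining, from the correspondence between the text information and the URL addresses, the URL address of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the URL address.
  6. The voice control method for a smart terminal according to claim 2, characterized in that the parameter information of a control object includes at least one of the following: the text information of the control object, its controllable attribute, its control identifier, and the control type to which it belongs.
  7. The voice control method for a smart terminal according to claim 6, characterized in that filtering out all controllable control objects comprises:
    filtering all controllable control objects out of the collected control objects according to whether the controllable attribute of each control object is "controllable".
  8. The voice control method for a smart terminal according to claim 4, characterized in that, when the view attribute of the currently displayed page is a web page, parsing the web page to obtain web page parsing information comprises:
    parsing the currently displayed page to obtain the HTML source of the web page;
    taking the tag information, text information, and URL addresses of all control objects in the HTML source as the web page parsing information.
  9. The voice control method for a smart terminal according to claim 8, characterized in that extracting the parameter information of all controllable control objects from the web page parsing information according to the tag information comprises:
    filtering out of the web page parsing information, according to the tag information, the parameter information of all controllable control objects whose tag attribute is linkable, wherein the parameter information includes the text information and URL address of each control object; and encapsulating the parameter information in JSON data format.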
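  10. A voice control device for a smart terminal, characterized in that the device comprises:
    a collection module, configured to, upon receiving a voice instruction sent by a voice input device, collect parameter information of all controllable control objects on the currently displayed page according to the view attribute of the page currently displayed by the smart terminal;
    a matching module, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the controllable control object that matches the semantic text information of the voice instruction;
    a trigger module, configured to trigger the corresponding control operation of the controllable control object.
  11. The voice control device for a smart terminal according to claim 10, characterized in that the collection module comprises:
    a first collection unit, configured to, when the view attribute of the currently displayed page is a dialog box or an image display page, collect parameter information of all control objects on the page currently displayed by the smart terminal, and filter out all controllable control objects from them;
    a first obtaining unit, configured to extract the parameter information of all controllable control objects according to the control types to which they belong.
  12. The voice control device for a smart terminal according to claim 11, characterized in that the parameter information includes text information and control identifiers, the text information and control identifiers being in a corresponding relationship; the matching module comprises:
    a first matching unit, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
    a second obtaining unit, configured to obtain, from the correspondence between the text information and the control identifiers, the control identifier of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the control identifier.
  13. The voice control device for a smart terminal according to claim 10, characterized in that the collection module comprises:
    a second collection unit, configured to, when the view attribute of the currently displayed page is a web page, parse the web page to obtain web page parsing information, the web page parsing information including tag information, text information, and URL addresses;
    a third obtaining unit, configured to extract the parameter information of all controllable control objects from the web page parsing information according to the tag information.
  14. The voice control device for a smart terminal according to claim 13, characterized in that the parameter information includes text information and URL addresses, the text information and URL addresses being in a corresponding relationship; the matching module comprises:
    a second matching unit, configured to match the semantic text information obtained from the voice instruction through voice semantic recognition against the text information in the collected parameter information of all controllable control objects, to obtain the matched text information;
    a fourth obtaining unit, configured to obtain, from the correspondence between the text information and the URL addresses, the URL address of the controllable control object corresponding to the matched text information, so as to trigger the corresponding control operation of that controllable control object according to the URL address.
  15. A television system, characterized in that the television system comprises a television set, a voice input device, and a voice semantic recognition server, the television set being provided with a voice receiving device and a voice control device; the voice receiving device receives the voice instruction input through the voice input device and sends it to the voice semantic recognition server for voice semantic recognition to obtain semantic text information; the voice control device is the voice control device for a smart terminal according to claim 10, and is configured to obtain the controllable control object on the page currently displayed by the television set that matches the semantic text information, and to trigger the corresponding control operation of that controllable control object.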
PCT/CN2016/084476 2015-08-20 2016-06-02 智能终端的语音控制方法、装置及电视机系统 WO2017028601A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510514619.9A CN105161106A (zh) 2015-08-20 2015-08-20 智能终端的语音控制方法、装置及电视机系统
CN201510514619.9 2015-08-20

Publications (1)

Publication Number Publication Date
WO2017028601A1 true WO2017028601A1 (zh) 2017-02-23

Family

ID=54801939

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/084476 WO2017028601A1 (zh) 2015-08-20 2016-06-02 智能终端的语音控制方法、装置及电视机系统

Country Status (2)

Country Link
CN (1) CN105161106A (zh)
WO (1) WO2017028601A1 (zh)

Families Citing this family (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105161106A (zh) * 2015-08-20 2015-12-16 深圳Tcl数字技术有限公司 智能终端的语音控制方法、装置及电视机系统
CN107093424A (zh) * 2016-02-17 2017-08-25 中兴通讯股份有限公司 语音控制方法及装置
CN105764185B (zh) * 2016-03-18 2017-12-12 深圳Tcl数字技术有限公司 交流驱动混合调光电路和电视机
CN106101789B (zh) * 2016-07-06 2020-04-24 深圳Tcl数字技术有限公司 终端的语音交互方法及装置
CN107659839A (zh) * 2016-07-26 2018-02-02 北京优朋普乐科技有限公司 智能终端的控制方法、视频搜索方法、设备及系统
CN106023993A (zh) * 2016-07-29 2016-10-12 西安旭天电子科技有限公司 基于自然语言的机器人控制系统及控制方法
CN106297791B (zh) * 2016-08-25 2020-08-18 Tcl科技集团股份有限公司 一种全程语音实现方法及系统
CN106484270A (zh) * 2016-09-12 2017-03-08 深圳市金立通信设备有限公司 一种语音操作事件添加方法及终端
CN106845766A (zh) * 2016-12-14 2017-06-13 国网北京市电力公司 信息采集方法
CN106710598A (zh) * 2017-03-24 2017-05-24 上海与德科技有限公司 语音识别方法及装置
CN107204185B (zh) * 2017-05-03 2021-05-25 深圳车盒子科技有限公司 车载语音交互方法、系统及计算机可读存储介质
CN107342083B (zh) * 2017-07-05 2021-07-20 百度在线网络技术(北京)有限公司 用于提供语音服务的方法和装置
CN107608652B (zh) * 2017-08-28 2020-05-22 三星电子(中国)研发中心 一种语音控制图形界面的方法和装置
CN109474843B (zh) * 2017-09-08 2021-09-03 腾讯科技(深圳)有限公司 语音操控终端的方法、客户端、服务器
CN109545223B (zh) * 2017-09-22 2022-03-01 Tcl科技集团股份有限公司 应用于用户终端的语音识别方法及终端设备
CN107909997A (zh) * 2017-09-29 2018-04-13 威创集团股份有限公司 一种拼接墙控制方法及系统
CN107749297B (zh) * 2017-10-25 2021-09-07 深圳市愚公科技有限公司 一种语音控制智能硬件的方法
CN109862170B (zh) * 2017-11-30 2021-02-12 Tcl科技集团股份有限公司 一种通信控制的方法、装置和穿戴设备
CN107948698A (zh) * 2017-12-14 2018-04-20 深圳市雷鸟信息科技有限公司 智能电视的语音控制方法、系统及智能电视
CN108538300B (zh) * 2018-02-27 2021-01-29 科大讯飞股份有限公司 语音控制方法及装置、存储介质、电子设备
CN108597499B (zh) * 2018-04-02 2020-09-25 联想(北京)有限公司 语音处理方法以及语音处理装置
CN108877791B (zh) * 2018-05-23 2021-10-08 百度在线网络技术(北京)有限公司 基于视图的语音交互方法、装置、服务器、终端和介质
EP3805914A4 (en) * 2018-05-25 2021-06-30 Sony Corporation INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING SYSTEM
CN110570846B (zh) * 2018-06-05 2022-04-22 青岛海信移动通信技术股份有限公司 一种语音控制方法、装置及手机
CN108877796A (zh) * 2018-06-14 2018-11-23 合肥品冠慧享家智能家居科技有限责任公司 语音控制智能设备终端操作的方法和装置
CN110691160A (zh) * 2018-07-04 2020-01-14 青岛海信移动通信技术股份有限公司 一种语音控制方法、装置及手机
CN110795175A (zh) * 2018-08-02 2020-02-14 Tcl集团股份有限公司 模拟控制智能终端的方法、装置及智能终端
CN109448727A (zh) * 2018-09-20 2019-03-08 李庆湧 语音交互方法以及装置
CN109166584A (zh) * 2018-10-30 2019-01-08 深圳融昕医疗科技有限公司 语音控制方法、装置、呼吸机和存储介质
CN111200744B (zh) * 2018-11-19 2021-05-25 Tcl科技集团股份有限公司 一种多媒体播放控制方法、装置及智能设备
CN109597996B (zh) * 2018-12-07 2023-09-05 深圳创维数字技术有限公司 一种语义解析方法、装置、设备和介质
CN109600646B (zh) * 2018-12-11 2021-03-23 未来电视有限公司 语音定位的方法及装置、智能电视、存储介质
CN111383631B (zh) * 2018-12-11 2024-01-23 阿里巴巴集团控股有限公司 一种语音交互方法、装置及系统
CN110085224B (zh) * 2019-04-10 2021-06-01 深圳康佳电子科技有限公司 智能终端全程语音操控处理方法、智能终端及存储介质
CN110765375B (zh) * 2019-10-22 2022-08-02 普信恒业科技发展(北京)有限公司 一种页面跳转方法、装置及系统
CN113127609A (zh) * 2019-12-31 2021-07-16 华为技术有限公司 语音控制方法、装置、服务器、终端设备及存储介质
CN111263236B (zh) * 2020-02-21 2022-04-12 广州欢网科技有限责任公司 电视机应用的语音适配方法和装置及语音控制方法
CN111768777A (zh) * 2020-06-28 2020-10-13 广州小鹏车联网科技有限公司 语音控制方法、信息处理方法、车辆和服务器
CN111768780B (zh) * 2020-06-28 2021-12-07 广州小鹏汽车科技有限公司 语音控制方法、信息处理方法、车辆和服务器
CN112333501A (zh) * 2020-07-29 2021-02-05 深圳Tcl新技术有限公司 智能电视语音控制方法、装置、智能电视及存储介质
CN114255745A (zh) * 2020-09-10 2022-03-29 华为技术有限公司 一种人机交互的方法、电子设备及系统
CN112102832B (zh) * 2020-09-18 2021-12-28 广州小鹏汽车科技有限公司 语音识别方法、装置、服务器和计算机可读存储介质
CN112770157B (zh) * 2020-12-17 2023-03-28 深圳创维-Rgb电子有限公司 电视web前端界面的语音控制方法、装置、设备及介质
CN114996622B (zh) * 2022-08-02 2022-11-11 北京弘玑信息技术有限公司 信息获取方法、值网络模型的训练方法及电子设备
CN115396709A (zh) * 2022-08-22 2022-11-25 海信视像科技股份有限公司 显示设备、服务器及免唤醒语音控制方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7194412B2 (en) * 2001-07-19 2007-03-20 Overhead Door Corporation Speech activated door operator system
US20120005701A1 (en) * 2010-06-30 2012-01-05 Rovi Technologies Corporation Method and Apparatus for Identifying Video Program Material or Content via Frequency Translation or Modulation Schemes
CN102520789A (zh) * 2011-11-18 2012-06-27 上海聚力传媒技术有限公司 一种用于实现语音控制受控设备的方法与设备
CN103246648A (zh) * 2012-02-01 2013-08-14 腾讯科技(深圳)有限公司 语音输入控制方法及装置
CN103970839A (zh) * 2014-04-24 2014-08-06 四川长虹电器股份有限公司 语音控制网页浏览的方法
CN104599669A (zh) * 2014-12-31 2015-05-06 乐视致新电子科技(天津)有限公司 一种语音控制方法和装置
CN105161106A (zh) * 2015-08-20 2015-12-16 深圳Tcl数字技术有限公司 智能终端的语音控制方法、装置及电视机系统

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007171809A (ja) * 2005-12-26 2007-07-05 Canon Inc 情報処理装置及び情報処理方法
DE102007019796A1 (de) * 2007-04-26 2008-10-30 CCS Technology, Inc., Wilmington Spleißgerät für Lichtleitfasern und Verfahren zum Betreiben eines Spleißgerätes für Lichtleitfasern
CN102137309A (zh) * 2010-11-30 2011-07-27 广东星海数字家庭产业技术研究院有限公司 应用面向数字电视终端的内容描述语言的处理方法
CN102207872B (zh) * 2011-06-04 2014-08-06 中国移动通信集团内蒙古有限公司 按照用户需求定制Web UI控件的方法和系统
CN102681841A (zh) * 2012-02-01 2012-09-19 中兴通讯(香港)有限公司 一种手机应用开发方法和系统
CN102629246B (zh) * 2012-02-10 2017-06-27 百纳(武汉)信息技术有限公司 识别浏览器语音命令的服务器及浏览器语音命令识别方法
CN103442138A (zh) * 2013-08-26 2013-12-11 华为终端有限公司 语音控制方法、装置及终端
CN104462186A (zh) * 2014-10-17 2015-03-25 百度在线网络技术(北京)有限公司 一种语音搜索方法及装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11221822B2 (en) * 2017-11-15 2022-01-11 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for controlling page
CN110503975A (zh) * 2019-08-02 2019-11-26 广州长嘉电子有限公司 基于多麦克风降噪的智能电视语音增强控制方法及系统
CN112732379A (zh) * 2020-12-30 2021-04-30 智道网联科技(北京)有限公司 智能终端上应用程序的运行方法、终端和存储介质
CN112732379B (zh) * 2020-12-30 2023-12-15 智道网联科技(北京)有限公司 智能终端上应用程序的运行方法、终端和存储介质

Also Published As

Publication number Publication date
CN105161106A (zh) 2015-12-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16836465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.07.2018)

122 Ep: pct application non-entry in european phase

Ref document number: 16836465

Country of ref document: EP

Kind code of ref document: A1