WO2020007225A1 - Voice control method and device - Google Patents

Voice control method and device

Info

Publication number
WO2020007225A1
Authority
WO
WIPO (PCT)
Prior art keywords
control
user
control instruction
parsed
screen information
Prior art date
Application number
PCT/CN2019/093222
Other languages
English (en)
French (fr)
Inventor
李凯
朱众微
宋亮
Original Assignee
青岛海信移动通信技术股份有限公司
Priority date
2018-07-04
Filing date
2019-06-27
Publication date
2020-01-09
Application filed by 青岛海信移动通信技术股份有限公司
Publication of WO2020007225A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/725 Cordless telephones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Definitions

  • This application relates to the field of mobile communication technology.
  • Mobile phone voice control means that the user can use the voice control instead of buttons, touches, etc. to operate the phone, which can free the user's hands and make it easier to use the phone.
  • With the development of speech recognition technology, mobile phone voice control based on speech recognition is increasingly favored by mobile phone users.
  • This application provides a voice control scheme.
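  • In a first aspect, the present application provides a voice control method, including:
  • acquiring voice data containing a control instruction;
  • parsing the control instruction in the voice data;
  • acquiring screen information of the current operation interface of the terminal based on the accessibility function;
  • in response to finding in the screen information a first control that matches the parsed control instruction, performing an operation associated with the first control according to the control instruction.
  • In a second aspect, the present application also provides a computer device including a processor and a memory;
  • the memory is configured to store computer instructions;
  • the processor is configured to run the computer instructions stored in the memory to perform the above-mentioned voice control method.
  • In a third aspect, the present application provides a non-volatile storage medium that stores processor-executable instructions.
  • When executed by a processor, the processor-executable instructions cause the processor to execute the above-mentioned voice control method.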
  • FIG. 1 is a flowchart of a voice control method according to an embodiment of the present application
  • FIG. 2 is a flowchart of a voice control method according to another embodiment of the present application.
  • FIG. 3 is a flowchart of a voice control method according to another embodiment of the present application.
  • FIG. 4 is a flowchart of a voice control method according to another embodiment of the present application.
  • FIG. 5 is a structural diagram of a computer device according to an embodiment of the present application.
  • the embodiments of the present application are based on the Android operating system. It should be understood that the embodiments of the present application can also be applied to other suitable operating systems.
  • Existing mobile phone voice control methods generally set keywords in advance and use those keywords to match the corresponding applications and/or controls.
  • Specifically, the user speaks the relevant control instruction.
  • The mobile phone receives the spoken control instruction, recognizes it locally or through a server, and analyzes its semantics. It then compares the analyzed semantics with the preset keywords to find a match and performs the corresponding operation, such as a click, on the application/control adapted to the matched keyword, thereby implementing operation control of the phone.
  • For example, when the user says "Open Settings", the phone receives the voice of "Open Settings", recognizes the semantics "Settings", and finds that it matches the preset keyword "Settings".
  • Based on the keyword "Settings", the phone finds "System-Settings" and clicks it to complete the opening of "System-Settings".
  • FIG. 1 is a flowchart of a voice control method according to an embodiment of the present application. As shown in FIG. 1, a voice control method provided by an embodiment of the present application includes:
  • S100: Acquire voice data containing a control instruction.
  • After voice control is started, the voice containing the control instruction issued by the user is obtained.
  • In an embodiment, voice control may be started by long pressing a physical key of the terminal device, such as the Home button of a mobile device.
  • S200: Parse the control instruction contained in the voice data.
  • After the voice containing the control instruction issued by the user is obtained, the voice is parsed to obtain the control instruction it contains.
  • Parsing the control instruction contained in the voice usually means extracting the keywords from the received voice with the help of an intelligent speech database and concatenating those keywords.
  • S300: Obtain screen information of the current operation interface of the terminal based on the accessibility function.
  • In some embodiments, the accessibility function (Accessibility) is a set of functions in the Android system that assist users in operating mobile phone applications.
  • In this application, the screen information of the current operation interface of the terminal is obtained through the accessibility function.
  • Specifically, the screen information of the current operation interface of the mobile device can be obtained by calling the interface provided by Accessibility, for example:
  • AccessibilityNodeInfo nodeInfo = getRootInActiveWindow();
  • This nodeInfo is a collection class of the current screen information.
  • the screen information of the current operation interface includes, but is not limited to, text, pictures, and controls on the interface.
  • Text, ID, Clickable, etc. are all attribute elements of the control.
  • The Text element is the value of the control; the ID element is a unique identifier used to identify the control; the Clickable element indicates whether the control is clickable. When the Clickable attribute is false, the control cannot be clicked.
  • S400: Based on the parsed control instruction and the obtained screen information, find the control in the screen information that matches the parsed control instruction and simulate the operation corresponding to that control, for example a click, to complete the voice control. Finding the control in the screen information that matches the parsed control instruction means finding, in the screen information of the current operation interface, the control that the user needs to operate.
  • In the voice control method provided in the embodiment of the present application, the screen information of the current operation interface is acquired based on the accessibility function when voice control is performed, so that the operation performed by voice control is combined with the screen information of the terminal's current operation interface, and voice control related to any interface of the terminal can be implemented on that interface.
  • That is, the voice control method provided in the embodiment of the present application lets the user directly control by voice any operation visible on the screen. It is no longer limited to applications that were adapted in advance, helps implement voice control of all applications on the mobile phone, and expands the control range of voice control.
  • the voice control method of the present application can be applied to terminal devices such as smart phones and televisions.
  • FIG. 2 is a flowchart of a voice control method according to another embodiment of the present application.
  • As shown in FIG. 2, the method further includes:
  • S500: When no control matching the parsed control instruction exists in the screen information, find the control in the terminal system that is adapted to the parsed control instruction and perform the operation associated with that control according to the control instruction.
  • Traverse the screen information to determine whether it contains a control that matches the parsed control instruction.
  • When such a control exists, step S400 is performed; when it does not, step S500 is performed. In this way, when no matching control can be found in the screen information of the current operation interface, the search there stops promptly and the method instead finds the control in the terminal system that is adapted to the parsed control instruction and performs the associated operation, which ensures the effectiveness of voice control.
  • FIG. 3 is a flowchart of a voice control method according to another embodiment of the present application.
  • As shown in FIG. 3, step S400 (in response to finding in the screen information a control that matches the parsed control instruction, performing the operation associated with the control according to the control instruction) includes:
  • S401: when a control matching the parsed control instruction exists in the screen information, determining whether the control is clickable;
  • S402: in response to the control being clickable, clicking the control;
  • S403: in response to the control not being clickable, finding a clickable control in the control's parent container and clicking that clickable control.
  • Controls in the screen information have a Clickable attribute that marks them as clickable or non-clickable.
  • When a control is marked non-clickable, a simulated click has no effect, that is, the control cannot be clicked. Determining whether the found control is clickable before clicking it therefore ensures that the click is effective when the screen information contains a control that matches the parsed control instruction.
  • For example, the user says "open Discover" on the WeChat interface. The system obtains the voice containing the control instruction, parses the control instruction it contains, and recognizes the semantics "Discover". Using the obtained screen information of the current operation interface, it searches for a control containing "Discover". When such a control is found, its Clickable attribute is read to determine whether it is clickable. If it is, the found control is considered to match the parsed control instruction and is clicked, which completes the voice control "open Discover".
  • FIG. 4 is a flowchart of a voice control method according to another embodiment of the present application.
  • As shown in FIG. 4, before obtaining the screen information of the current operation interface of the terminal based on the accessibility function, the method further includes:
  • S600: determining whether the parsed control instruction matches a preset entry;
  • S601: in response to the parsed control instruction matching a preset entry, splitting the parsed control instruction according to the preset entry.
  • A preset entry usually represents several control instructions and records the steps of each of them. For example, the preset entry "view Moments" records three steps: "enter WeChat", "click Discover" and "click Moments".
  • After the control instruction contained in the voice is parsed, it is first compared with the preset entries to determine whether it matches one of them, that is, whether the preset entries contain the parsed control instruction. When they do, the parsed control instruction is split according to the preset entry, and the operations associated with the controls are performed in sequence according to the resulting control instructions, for example by finding and clicking the control for each instruction in the screen information of the corresponding operation interface.
  • For example, a user who wants to open WeChat Moments starts voice control and says "enter WeChat" on the desktop. The voice control system obtains the voice, parses it into the control instruction to open WeChat, obtains the screen information of the current desktop, finds the WeChat control in it, and clicks the control to enter WeChat.
  • Moments sits under Discover in WeChat, so the user can next say "Discover". The voice control system obtains the voice, parses it into the control instruction to enter Discover, obtains the screen information of the current WeChat interface, finds the Discover control, and clicks it to enter Discover. On the Discover interface the user can then say "Moments". The voice control system obtains the voice, parses it into the control instruction to enter Moments, obtains the screen information of the current Discover interface, finds the Moments control, and clicks it to enter Moments. In this way, obtaining the screen information of the current interface through the accessibility function allows direct control of the controls in that interface.
  • Based on the voice control method above, when "view Moments" has been set as a preset entry, the user can say "view Moments" directly on the desktop, and the voice control system follows the steps "enter WeChat", "click Discover" and "click Moments" recorded in the preset entry "view Moments".
  • It obtains the screen information of the current desktop, finds the WeChat control, and clicks it to enter WeChat; on the WeChat interface it obtains the screen information, finds the "Discover" control, and clicks it; on the Discover interface it obtains the screen information, finds the "Moments" control, and clicks it to enter Moments, which completes the operation of viewing Moments.
  • the voice control method provided in the embodiment of the present application implements multiple control operations by combining screen information of the interface in which it is located.
  • The voice control method provided in the embodiment of the present application further includes:
  • when the control in the screen information that matches the parsed control instruction is not unique, reminding the user to select manually;
  • when no manual selection signal is received from the user within a preset waiting time, performing the operation associated with the first control in the screen information that matches the parsed control instruction.
  • When searching the screen information for a control that matches the parsed control instruction, two or more matching clickable controls may be found, that is, the matching control in the screen information is not unique.
  • In that case the user is reminded to select manually, for example by clicking. A text or voice reminder such as "The instruction you gave is not unique, please select manually" is presented, and the related control is opened according to the user's manual selection.
  • The preset waiting time is the time the voice control system waits for the user to make a selection after giving the reminder.
  • When no manual selection signal is received from the user within the preset waiting time, the first control in the screen information that matches the parsed control instruction is clicked.
  • The default is not limited to the first matching control in the screen information; it may be any matching control and can be set as required.
  • For example, the user searches for the movie XX by voice on the search interface of a video website, and N movies related to XX are found according to the control instruction.
  • When the click on the movie XX is to be performed, the current interface contains N controls for XX movies, so the user is reminded "N movies found, please select manually" and can select manually according to this reminder. The user can also wait for a while and let the voice control system click a control according to its default rule, such as the first "XX movie" control or the most recently updated one.
  • The voice control method provided in the present application further includes:
  • when a manual selection signal is received from the user within the preset waiting time, recording the ID of the control selected by the user, and uniquely binding the parsed control instruction to the ID of that control.
  • Specifically, when the matching control is not unique and the user selects manually within the preset waiting time, the voice control system records the ID of the selected control and then uniquely binds the parsed control instruction to that ID. In this way, when the user performs the same voice control in the same situation, this control can be selected directly, and voice control no longer has to pause for a reminder because several matching controls were found.
  • To ensure a one-to-one correspondence between the parsed control instruction and the control,
  • the ID of the control clicked by the user is recorded, and attribute information such as the Text of that control can also be recorded.
  • The voice control method provided in the present application further includes:
  • when a manual selection signal is received from the user within the preset waiting time, asking the user whether to record the operation; and, when a signal to record the operation is received, uniquely binding the parsed control instruction to the ID of the control.
  • Specifically, when the matching control is not unique, the user is reminded to select manually. If the user selects manually, for example by clicking, within the preset waiting time, the voice control system asks the user whether to record the operation. When a signal to record the operation is received, the ID of the control selected by the user is recorded and the parsed control instruction is uniquely bound to that ID. Besides providing the functions of the embodiments above, asking the user whether to record the operation prevents the voice control system from setting an unsuitable match on its own, for example when the user clicked a control unrelated to the voice control instruction. The voice control method provided in the embodiments of the present application thus improves the accuracy of voice control operations.
  • In an embodiment, the user is asked by voice whether to record the operation.
  • The user also answers by voice. For example, the user answers "record"/"yes" to send a record-operation signal to the voice control system, or answers "do not record"/"no" to signal that the operation should not be recorded.
  • In an embodiment, the user is asked through a pop-up window whether to record the operation. For example, the user sends a record-operation signal to the voice control system by clicking a "record"/"yes" control in the pop-up window.
  • this application further provides an embodiment of a computer device.
  • As shown in FIG. 5, a computer device 700 provided by an embodiment of the present application includes a processor 701, a memory 702, an internal memory 703, a network interface 704, and an internal bus 705.
  • The processor 701, the memory 702, the internal memory 703 and the network interface 704 are connected through the internal bus 705;
  • the memory 702 is configured to store computer instructions
  • the processor 701 is configured to run computer instructions stored in the memory 702 to execute the voice control method according to any one of the foregoing embodiments.
  • The processor in the embodiments of the present application may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device, transistor logic device, hardware component, or any combination thereof.
  • a processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the processor 701 is internally provided with a micro memory for storing a program.
  • the program may include a program code, and the program code includes a computer operation instruction.
  • The micro memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory. Only one processor is shown in the figure; multiple microprocessors may of course be provided as required. The microprocessor is used to read the program code stored in the memory.
  • the voice control device provided in the embodiments of the present application can be used for terminal devices such as a smart phone and a television.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

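The present application provides a voice control method and a terminal device. The method includes: receiving voice data containing a control instruction; parsing the control instruction in the voice data; acquiring screen information of the terminal's current operation interface based on an accessibility function; and, in response to finding in the screen information a control that matches the parsed control instruction, performing the operation associated with the control according to the control instruction.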

Description

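Voice control method and device
Cross-reference to related applications
This application claims priority to Chinese invention patent application No. 201810724986.5, filed on July 4, 2018 and entitled "Voice control method, apparatus and mobile phone", the entire contents of which are incorporated herein by reference for all purposes.
Technical field
This application relates to the field of mobile communication technology.
Background
Mobile phone voice control means that a user operates the phone by voice instead of by pressing keys or touching the screen, which frees the user's hands and makes the phone easier to use. With the development of speech recognition technology, mobile phone voice control based on speech recognition is increasingly favored by mobile phone users.
Summary
This application provides a voice control scheme.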
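In a first aspect, the present application provides a voice control method, including:
acquiring voice data containing a control instruction;
parsing the control instruction in the voice data;
acquiring screen information of the current operation interface of the terminal based on an accessibility function;
in response to finding in the screen information a first control that matches the parsed control instruction, performing the operation associated with the first control according to the control instruction.
In a second aspect, the present application also provides a computer device including a processor and a memory;
the memory is configured to store computer instructions;
the processor is configured to run the computer instructions stored in the memory to perform the voice control method described above.
In a third aspect, the present application provides a non-volatile storage medium storing processor-executable instructions which, when executed by a processor, cause the processor to perform the voice control method described above.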
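Brief description of the drawings
To explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly introduced below. Obviously, a person of ordinary skill in the art can derive other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a voice control method according to an embodiment of the present application;
FIG. 2 is a flowchart of a voice control method according to another embodiment of the present application;
FIG. 3 is a flowchart of a voice control method according to a further embodiment of the present application;
FIG. 4 is a flowchart of a voice control method according to yet another embodiment of the present application;
FIG. 5 is a structural diagram of a computer device according to an embodiment of the present application.
Detailed description
To make the purpose, technical solution and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.
The embodiments of the present application are based on the Android operating system. It should be understood that they can also be applied to other suitable operating systems.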
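Existing mobile phone voice control methods generally set keywords in advance and use those keywords to match the corresponding applications and/or controls. Specifically, the user speaks a control instruction; the phone receives the spoken instruction, recognizes it locally or through a server, and analyzes its semantics. It then compares the analyzed semantics with the preset keywords to find a match and performs the corresponding operation, such as a click, on the application/control adapted to the matched keyword, thereby implementing operation control of the phone. For example, when the user says "open Settings", the phone receives the voice, recognizes the semantics "Settings", finds that it matches the preset keyword "Settings", uses that keyword to find "System-Settings", and clicks "System-Settings" to open it.
In use, however, it turns out that voice control often cannot take the current interface into account. Taking the instant messaging application WeChat as an example, a user on the "WeChat-Me" interface who wants to open the WeChat settings says "open Settings", but the phone ends up opening "System-Settings" rather than "WeChat-Settings". The same instruction spoken on different interfaces therefore always triggers the same operation, which can cause confusion and inconveniences the user. Moreover, voice control only opens applications that have been adapted to a keyword; an application for which no keyword was set in advance cannot be controlled by voice at all, which is also inconvenient.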
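FIG. 1 is a flowchart of a voice control method according to an embodiment of the present application. As shown in FIG. 1, the voice control method provided by an embodiment of the present application includes:
S100: Acquire voice data containing a control instruction.
After voice control is started, the voice containing the control instruction issued by the user is obtained. In an embodiment, voice control may be started by long pressing a physical key of the terminal device, such as the Home key of a mobile device.
S200: Parse the control instruction contained in the voice data.
After the voice containing the control instruction issued by the user is obtained, the voice is parsed to obtain the control instruction it contains. Parsing the control instruction usually means extracting the keywords from the received voice with the help of an intelligent speech database and concatenating those keywords.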
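The keyword extraction and concatenation described for S200 can be illustrated with a minimal sketch that works on already-transcribed text. The `CommandParser` class and its small action word list are hypothetical stand-ins for the "intelligent speech database", which the source does not specify; real speech recognition is outside the scope of this sketch.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical parser: the action word list stands in for the
// "intelligent speech database", whose contents the source does not give.
final class CommandParser {
    static final List<String> ACTIONS = List.of("open", "enter", "click", "view");

    // Returns {action, target}, or null when no known action word occurs.
    static String[] parse(String utterance) {
        String lower = utterance.toLowerCase(Locale.ROOT);
        int bestIndex = -1;
        String bestAction = null;
        for (String action : ACTIONS) {
            int index = lower.indexOf(action);
            // Keep the action word that appears earliest in the utterance.
            if (index >= 0 && (bestIndex < 0 || index < bestIndex)) {
                bestIndex = index;
                bestAction = action;
            }
        }
        if (bestAction == null) {
            return null;
        }
        // Everything after the action word is treated as the target keyword.
        String target = utterance.substring(bestIndex + bestAction.length()).trim();
        return new String[] {bestAction, target};
    }

    public static void main(String[] args) {
        String[] cmd = parse("I want to open Settings");
        System.out.println(cmd[0] + " '" + cmd[1] + "'"); // prints: open 'Settings'
    }
}
```

Plain substring matching is used only to keep the sketch short; it would also fire on words that merely contain an action word, so a production parser would need proper tokenization.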
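S300: Obtain screen information of the current operation interface of the terminal based on the accessibility function.
In some embodiments, the accessibility function (Accessibility) is a set of functions in the Android system that assist users in operating mobile phone applications. In this application, the screen information of the terminal's current operation interface is obtained through the accessibility function. Specifically, the screen information of the current operation interface of the mobile device can be obtained by calling the interface provided by Accessibility, for example:
Call getRootInActiveWindow to obtain the screen information of the current operation interface:
AccessibilityNodeInfo nodeInfo = getRootInActiveWindow();
This nodeInfo object is a collection class of the current screen information.
The screen information of the current operation interface includes, but is not limited to, the text, pictures and controls on the interface. Text, ID, Clickable and so on are attribute elements of a control. The Text element is the value of the control; the ID element is a unique identifier used to identify the control; the Clickable element indicates whether the control is clickable. When the Clickable attribute is false, the control cannot be clicked.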
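S400: In response to finding in the screen information a control that matches the parsed control instruction, perform the operation associated with the control according to the control instruction.
In some embodiments, based on the parsed control instruction and the obtained screen information, the control in the screen information that matches the parsed control instruction is found and the operation corresponding to that control, for example a click, is simulated to complete the voice control. Finding the control in the screen information that matches the parsed control instruction means finding, in the screen information of the current operation interface, the control that the user needs to operate.
For example, after starting voice control, with the "WeChat-Me" interface as the current operation interface, the user says "I want to open Settings", and the terminal's voice control system obtains that voice. From the received voice, the keywords "open" and "Settings" are extracted and concatenated into the control instruction "open 'Settings'". The screen information of the "WeChat-Me" operation interface is then obtained, which includes controls such as "Wallet", "Favorites" and "Settings". The screen information is traversed for a control matching "Settings", and the "Settings" control is found. A click is simulated on it, that is, the control is clicked, which carries out the voice control "I want to open Settings". Performing the "open Settings" voice control on the "WeChat-Me" interface therefore no longer opens "System-Settings"; voice control takes the current operation interface into account, which improves its accuracy.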
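The traversal in this example can be sketched without Android dependencies. `UiNode` below is a hypothetical stand-in for `AccessibilityNodeInfo`, reduced to the three attributes the text names (Text, ID, Clickable); the node IDs are invented for illustration. On a real device the tree would come from `getRootInActiveWindow()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AccessibilityNodeInfo with Text, ID and Clickable.
final class UiNode {
    final String id;
    final String text;
    final boolean clickable;
    final List<UiNode> children = new ArrayList<>();

    UiNode(String id, String text, boolean clickable) {
        this.id = id;
        this.text = text;
        this.clickable = clickable;
    }

    UiNode add(UiNode child) {
        children.add(child);
        return this;
    }
}

final class ScreenMatcher {
    // Depth-first traversal collecting every node whose text contains the target.
    static List<UiNode> findByText(UiNode root, String target) {
        List<UiNode> hits = new ArrayList<>();
        collect(root, target, hits);
        return hits;
    }

    private static void collect(UiNode node, String target, List<UiNode> hits) {
        if (node.text != null && node.text.contains(target)) {
            hits.add(node);
        }
        for (UiNode child : node.children) {
            collect(child, target, hits);
        }
    }

    public static void main(String[] args) {
        // Invented model of the "WeChat-Me" screen from the example above.
        UiNode me = new UiNode("page_me", null, false)
            .add(new UiNode("item_wallet", "Wallet", true))
            .add(new UiNode("item_favorites", "Favorites", true))
            .add(new UiNode("item_settings", "Settings", true));
        System.out.println(findByText(me, "Settings").get(0).id); // prints: item_settings
    }
}
```

Returning a list rather than a single node keeps the case of several matching controls visible to the caller, which matters for the non-unique match handling described later.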
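In the voice control method provided in the embodiment of the present application, the screen information of the current operation interface is acquired based on the accessibility function when voice control is performed, so that the operation performed by voice control is combined with the screen information of the terminal's current operation interface, and voice control related to any interface of the terminal can be implemented on that interface. In other words, the method lets the user directly control by voice any operation visible on the screen. It is no longer limited to applications that were adapted in advance, helps implement voice control of all applications on the phone, and expands the control range of voice control. The voice control method of the present application can be used in terminal devices such as smart phones and televisions.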
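FIG. 2 is a flowchart of a voice control method according to another embodiment of the present application. In a specific implementation of the present application, as shown in FIG. 2, the method further includes:
S500: When no control matching the parsed control instruction exists in the screen information, find the control in the terminal system that is adapted to the parsed control instruction and perform the operation associated with that control according to the control instruction.
The screen information is traversed to determine whether it contains a control that matches the parsed control instruction. When such a control exists, step S400 is performed; when it does not, step S500 is performed. In this way, when no matching control can be found in the screen information of the current operation interface, the search there stops promptly and the method instead finds the control in the terminal system that is adapted to the parsed control instruction and performs the associated operation, which ensures the effectiveness of voice control.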
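The choice between S400 and S500 amounts to a two-level lookup: the current screen first, then the system-wide keyword table. The sketch below models both levels as plain maps from a target keyword to a control ID. The `Dispatcher` class, the map representation and the ID strings are assumptions made for illustration, not part of the source.

```java
import java.util.Map;

// Hypothetical dispatcher: both lookup tables map a target keyword to a control ID.
final class Dispatcher {
    static String resolve(String target,
                          Map<String, String> onScreen,
                          Map<String, String> systemTable) {
        // S400: prefer a control on the current operation interface.
        String id = onScreen.get(target);
        if (id != null) {
            return "screen:" + id;
        }
        // S500: fall back to the control adapted to the keyword system-wide.
        id = systemTable.get(target);
        if (id != null) {
            return "system:" + id;
        }
        return "none";
    }

    public static void main(String[] args) {
        Map<String, String> wechatMe = Map.of("Settings", "wechat_settings");
        Map<String, String> system = Map.of("Settings", "system_settings");
        System.out.println(resolve("Settings", wechatMe, system)); // prints: screen:wechat_settings
        System.out.println(resolve("Settings", Map.of(), system)); // prints: system:system_settings
    }
}
```

The ordering is the point of the design: the same utterance resolves to "WeChat-Settings" inside WeChat and to "System-Settings" only when the current screen offers no match.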
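FIG. 3 is a flowchart of a voice control method according to a further embodiment of the present application. In a specific implementation of the present application, as shown in FIG. 3, step S400 (in response to finding in the screen information a control that matches the parsed control instruction, performing the operation associated with the control according to the control instruction) includes:
S401: when a control matching the parsed control instruction exists in the screen information, determining whether the control is clickable;
S402: in response to the control being clickable, clicking the control;
S403: in response to the control not being clickable, finding a clickable control in the control's parent container and clicking that clickable control.
Controls in the screen information have a Clickable attribute that marks them as clickable or non-clickable. When a control is marked non-clickable, a simulated click has no effect, that is, the control cannot be clicked. Determining whether the found control is clickable before clicking it therefore ensures that the click is effective when the screen information contains a control that matches the parsed control instruction.
For example, the user says "open Discover" on the WeChat interface. The system obtains the voice containing the control instruction, parses the control instruction it contains, and recognizes the semantics "Discover". Using the obtained screen information of the current operation interface, it searches for a control containing "Discover". When such a control is found, its Clickable attribute is read to determine whether it is clickable. If it is, the found control is considered to match the parsed control instruction and is clicked, which completes the voice control "open Discover".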
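Steps S401 to S403 can be sketched as a walk up the node tree. The source only says to find "a clickable control in the parent container"; the sketch below takes one possible reading, namely the nearest clickable ancestor. The `Widget` class and the IDs are hypothetical.

```java
// Hypothetical node with a parent pointer, standing in for AccessibilityNodeInfo.
final class Widget {
    final String id;
    final boolean clickable;
    final Widget parent;

    Widget(String id, boolean clickable, Widget parent) {
        this.id = id;
        this.clickable = clickable;
        this.parent = parent;
    }
}

final class ClickResolver {
    // S401/S402: return the matched widget if it is clickable.
    // S403 (one interpretation): otherwise climb to the nearest clickable ancestor.
    // Returns null when nothing on the path to the root is clickable.
    static Widget clickTarget(Widget matched) {
        Widget current = matched;
        while (current != null && !current.clickable) {
            current = current.parent;
        }
        return current;
    }

    public static void main(String[] args) {
        Widget row = new Widget("row_discover", true, null);
        Widget label = new Widget("label_discover", false, row);
        System.out.println(clickTarget(label).id); // prints: row_discover
    }
}
```

This pattern is common in practice because the text label that matches the spoken keyword is often a non-clickable child of the row or button that actually handles the click.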
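FIG. 4 is a flowchart of a voice control method according to yet another embodiment of the present application. In a specific implementation of the present application, as shown in FIG. 4, before obtaining the screen information of the terminal's current operation interface based on the accessibility function, the voice control method provided by the embodiment further includes:
S600: determining whether the parsed control instruction matches a preset entry;
S601: in response to the parsed control instruction matching a preset entry, splitting the parsed control instruction according to the preset entry.
A preset entry usually represents several control instructions and records the steps of each of them. For example, the preset entry "view Moments" records three steps: "enter WeChat", "click Discover" and "click Moments".
After the control instruction contained in the voice is parsed, it is first compared with the preset entries to determine whether it matches one of them, that is, whether the preset entries contain the parsed control instruction. When they do, the parsed control instruction is split according to the preset entry, and the operations associated with the controls are performed in sequence according to the resulting control instructions, for example by finding and clicking the control for each instruction in the screen information of the corresponding operation interface.
For example, a user who wants to open WeChat Moments starts voice control and says "enter WeChat" on the desktop. The voice control system obtains the voice, parses it into the control instruction to open WeChat, obtains the screen information of the current desktop, finds the WeChat control in it, and clicks the control to enter WeChat. Moments sits under Discover in WeChat, so the user can next say "Discover". The voice control system obtains the voice, parses it into the control instruction to enter Discover, obtains the screen information of the current WeChat interface, finds the Discover control, and clicks it to enter Discover. On the Discover interface the user can then say "Moments". The voice control system obtains the voice, parses it into the control instruction to enter Moments, obtains the screen information of the current Discover interface, finds the Moments control, and clicks it to enter Moments. In this way, obtaining the screen information of the current interface through the accessibility function allows direct control of the controls in that interface.
Based on the voice control method of the embodiments above, when "view Moments" has been set as a preset entry, the user can say "view Moments" directly on the desktop. Following the steps "enter WeChat", "click Discover" and "click Moments" recorded in the preset entry, the voice control system obtains the screen information of the current desktop, finds the WeChat control, and clicks it to enter WeChat; on the WeChat interface it obtains the screen information, finds the "Discover" control, and clicks it; on the Discover interface it obtains the screen information, finds the "Moments" control, and clicks it to enter Moments, which completes the operation of viewing Moments. The voice control method provided in the embodiment of the present application thus performs multiple control operations by combining the screen information of each interface along the way.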
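Steps S600 and S601 can be read as a table lookup that expands one spoken instruction into an ordered list of sub-instructions. In the sketch below the `PresetEntries` class and its single table entry are hypothetical; the entry simply mirrors the "view Moments" example from the text.

```java
import java.util.List;
import java.util.Map;

// Hypothetical preset-entry table mirroring the "view Moments" example.
final class PresetEntries {
    static final Map<String, List<String>> PRESETS = Map.of(
        "view Moments", List.of("enter WeChat", "click Discover", "click Moments"));

    // S600: check for a matching preset entry.
    // S601: if one exists, split the instruction into its recorded steps;
    // otherwise the instruction is passed through as a single step.
    static List<String> split(String instruction) {
        return PRESETS.getOrDefault(instruction, List.of(instruction));
    }

    public static void main(String[] args) {
        for (String step : split("view Moments")) {
            // Each step would be matched against the screen information
            // of whichever interface is current at that point.
            System.out.println(step);
        }
    }
}
```

Because every step is resolved against the screen that is current when the step runs, the executor has to re-read the screen information after each click instead of reusing the first snapshot.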
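Further, in a specific implementation of the present application, the voice control method provided by the embodiment also includes:
when the control in the screen information that matches the parsed control instruction is not unique, reminding the user to select manually;
when no manual selection signal is received from the user within a preset waiting time, performing the operation associated with the first control in the screen information that matches the parsed control instruction.
When the voice control method provided by the present application is implemented, the search for a control in the screen information that matches the parsed control instruction may find two or more matching clickable controls, that is, the matching control in the screen information is not unique. To keep voice control working normally in that case, the user is reminded to select manually, for example by clicking. A text or voice reminder such as "The instruction you gave is not unique, please select manually" is presented, and the related control is opened according to the user's manual selection. Furthermore, the preset waiting time is the time the voice control system waits for the user to make a selection after giving the reminder. When no manual selection signal is received from the user within the preset waiting time, the first control in the screen information that matches the parsed control instruction is clicked. In the embodiments of the present application the default is not limited to the first matching control; it may be any matching control and can be set as required.
For example, the user searches for the movie XX by voice on the search interface of a video website, and N movies related to XX are found according to the control instruction. When the click on the movie XX is to be performed, the current interface contains N controls for XX movies, so the user is reminded "N movies found, please select manually" and can select manually according to this reminder. The user can also wait for a while and let the voice control system click a control according to its default rule, such as the first "XX movie" control or the most recently updated "XX movie" control.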
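The wait-then-default behaviour can be sketched with a `CompletableFuture` that represents the user's manual selection. The `AmbiguityResolver` class, the string control IDs and the millisecond timeout parameter are assumptions for illustration; the source does not say how the selection signal is delivered.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical resolver for the case of several matching controls.
final class AmbiguityResolver {
    static String choose(List<String> candidateIds,
                         CompletableFuture<String> userChoice,
                         long waitMillis) {
        if (candidateIds.isEmpty()) {
            throw new IllegalArgumentException("no matching control");
        }
        if (candidateIds.size() == 1) {
            return candidateIds.get(0); // unique match, no reminder needed
        }
        try {
            // Wait up to the preset waiting time for the manual selection.
            return userChoice.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return candidateIds.get(0); // default rule: first matching control
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return candidateIds.get(0);
        }
    }

    public static void main(String[] args) {
        List<String> movies = List.of("movie_1", "movie_2", "movie_3");
        // No selection arrives, so the first control is used after the timeout.
        System.out.println(choose(movies, new CompletableFuture<>(), 50)); // prints: movie_1
    }
}
```

The default rule is isolated in one place so that it could be swapped for another policy, such as the most recently updated item mentioned in the example.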
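In a specific implementation of the present application, the voice control method provided by the present application further includes:
when a manual selection signal is received from the user within the preset waiting time, recording the ID of the control selected by the user;
uniquely binding the parsed control instruction to the ID of that control.
Specifically, when the control in the screen information that matches the parsed control instruction is not unique, the user is reminded to select manually. If the user selects manually within the preset waiting time, the voice control system records the ID of the selected control and then uniquely binds the parsed control instruction to that ID. In this way, when the user performs the same voice control in the same situation, this control can be selected directly, and voice control no longer has to pause for a reminder because several matching controls were found.
To ensure a one-to-one correspondence between the parsed control instruction and the control, when a manual selection signal is received from the user within the preset waiting time, the ID of the clicked control is recorded, and attribute information such as the Text of that control can also be recorded.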
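Furthermore, in a specific implementation of the present application, the voice control method provided by the present application also includes:
when a manual selection signal is received from the user within the preset waiting time, asking the user whether to record the operation;
when a signal to record the operation is received, uniquely binding the parsed control instruction to the ID of the control.
Specifically, when the control in the screen information that matches the parsed control instruction is not unique, the user is reminded to select manually. If the user selects manually, for example by clicking, within the preset waiting time, the voice control system receives the manual selection signal and asks the user whether to record the operation. When a signal to record the operation is received, the ID of the control selected by the user is recorded and the parsed control instruction is uniquely bound to that ID. Besides providing the functions of the embodiments above, asking the user whether to record the operation prevents the voice control system from setting an unsuitable match on its own, for example when the user clicked a control unrelated to the voice control instruction. The voice control method provided in the embodiments of the present application thus improves the accuracy of voice control operations.
In an embodiment, the user is asked by voice whether to record the operation and also answers by voice. For example, the user answers "record"/"yes" to send a record-operation signal to the voice control system, or answers "do not record"/"no" to signal that the operation should not be recorded. In an embodiment, the user is asked through a pop-up window whether to record the operation. For example, the user sends a record-operation signal to the voice control system by clicking a "record"/"yes" control in the pop-up window.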
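The binding between a parsed instruction and the ID of the control the user picked can be sketched as a small in-memory store that writes only after the user confirms. The `BindingStore` class is hypothetical, and keying on the instruction text alone is a simplification: the source speaks of "the same situation", so a fuller version would probably also key on the current interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical store for instruction-to-control-ID bindings.
final class BindingStore {
    private final Map<String, String> bindings = new HashMap<>();

    // Records the binding only when the user agreed to record the operation,
    // which keeps accidental clicks from becoming permanent matches.
    void record(String instruction, String controlId, boolean userConfirmed) {
        if (userConfirmed) {
            bindings.put(instruction, controlId);
        }
    }

    // Looked up before prompting, so a known instruction skips the reminder.
    Optional<String> lookup(String instruction) {
        return Optional.ofNullable(bindings.get(instruction));
    }

    public static void main(String[] args) {
        BindingStore store = new BindingStore();
        store.record("open XX movie", "movie_2", true);
        System.out.println(store.lookup("open XX movie").orElse("ask user")); // prints: movie_2
        System.out.println(store.lookup("open YY movie").orElse("ask user")); // prints: ask user
    }
}
```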
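Corresponding to the voice control method provided by the embodiments above, the present application also provides an embodiment of a computer device.
As shown in FIG. 5, the computer device 700 provided by the embodiment of the present application includes a processor 701, a memory 702, an internal memory 703, a network interface 704 and an internal bus 705, where the processor 701, the memory 702, the internal memory 703 and the network interface 704 are connected through the internal bus 705;
the memory 702 is configured to store computer instructions;
the processor 701 is configured to run the computer instructions stored in the memory 702 to perform the voice control method of any one of the embodiments above.
It should be noted that the processor in the embodiments of the present application may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device, transistor logic device, hardware component, or any combination thereof. The processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The processor 701 is internally provided with a micro memory for storing a program. The program may include program code, and the program code includes computer operation instructions. The micro memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory. Only one processor is shown in the figure; multiple microprocessors may of course be provided as required. The microprocessor is used to read the program code stored in the memory. The voice control apparatus provided in the embodiments of the present application can be used in terminal devices such as smart phones and televisions.
It should be noted that the embodiments in this specification are described progressively. For parts that are the same or similar, the embodiments may refer to one another; each embodiment focuses on its differences from the others, and related parts can be found in the description of the method embodiments. After considering the specification and practicing the solution here, those skilled in the art will readily conceive of other embodiments of the present application. The present application is intended to cover any variations, uses or adaptations that follow its general principles and include common knowledge or customary technical means in the art that are not specifically described here. The specification and embodiments are to be regarded as exemplary only; the true scope and spirit of the present application are indicated by the claims below.
It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present application is limited only by the appended claims.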

Claims (18)

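  1. A voice control method for a terminal device, the method comprising:
    acquiring voice data containing a control instruction;
    parsing the control instruction in the voice data;
    acquiring screen information of the current operation interface of the terminal device based on an accessibility function;
    in response to finding in the screen information a first control that matches the parsed control instruction, performing the operation associated with the first control according to the control instruction.
  2. The voice control method according to claim 1, further comprising:
    in response to no first control matching the parsed control instruction existing in the screen information, finding a second control in the system of the terminal device that matches the parsed control instruction; and
    performing the operation associated with the second control according to the control instruction.
  3. The voice control method according to claim 1, wherein acquiring the screen information of the current operation interface of the terminal device based on the accessibility function comprises:
    acquiring the screen information of the current operation interface of the terminal device through an interface provided by the accessibility function,
    wherein the screen information includes one or more of a Text element, an ID element and an element indicating whether a control is Clickable.
  4. The voice control method according to claim 1, wherein performing the operation associated with the first control according to the control instruction comprises:
    checking whether the first control is clickable;
    in response to the first control being clickable, performing the operation associated with the first control;
    in response to the first control not being clickable, finding a clickable control in the parent container of the first control and performing the operation associated with the clickable control.
  5. The voice control method according to claim 1, further comprising, before acquiring the screen information of the current operation interface of the terminal device based on the accessibility function:
    determining whether the parsed control instruction matches a first entry in a preset entry list;
    in response to the parsed control instruction matching the first entry in the preset entry list, splitting the parsed control instruction according to the first entry.
  6. The voice control method according to claim 1, wherein performing the operation associated with the first control according to the control instruction comprises:
    determining whether the number of first controls is 1,
    in response to the number of first controls being greater than 1, issuing a prompt to the user, the prompt instructing the user to select manually.
  7. The voice control method according to claim 6, further comprising:
    when no manual selection by the user is received within a preset waiting time, performing the operation associated with the first one of the first controls.
  8. The voice control method according to claim 6, further comprising:
    in response to receiving the user's manual selection within a preset waiting time, recording the ID of the target control selected by the user from the first controls;
    uniquely binding the parsed control instruction to the ID of the target control.
  9. The voice control method according to claim 6, further comprising:
    in response to receiving the user's manual selection within a preset waiting time, issuing a prompt to the user, the prompt reminding the user to record the operation;
    when a record-operation signal is received, uniquely binding the parsed control instruction to the ID of the target control selected by the user from the first controls.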
  10. 一种终端设备,包括存储器和处理器,
    所述存储器,用于存储计算机指令;
    所述处理器,配置为运行所述计算机指令以实现:
    获取包含控制指令的语音数据;
    解析所述语音数据中的所述控制指令;
    基于辅助功能获取所述终端设备当前操作界面的屏幕信息;
    响应于在所述屏幕信息中查找到与所解析出的控制指令相匹配的第一控件,按照所述控制指令执行所述第一控件关联的操作。
  11. 根据权利要求10所述的设备,所述处理器还配置为:
    响应于所述屏幕信息中不存在与所解析出的控制指令相匹配的第一控件,查找所述终端设备的系统中与所解析出的控制指令相匹配的第二控件;并
    按照所述控制指令执行所述第二控件关联的操作。
  12. 根据权利要求10所述的设备,当基于辅助功能获取所述终端设备当前操作界面的所述屏幕信息时,所述处理器配置为:
    根据辅助功能提供的接口,获取所述终端设备当前操作界面的所述屏幕信息,
    其中,所述屏幕信息包括控件的Text元素、ID元素和指示是否Clickable的元素中的一种或多种。
  13. 根据权利要求10所述的设备,当按照所述控制指令执行所述第一控件关联的操 作时,所述处理器配置为:
    核查所述第一控件是否可点击;
    响应于所述第一控件可点击,执行所述第一控件关联的操作;
    响应于所述第一控件不可点击,查找所述第一控件的父容器中的可点击控件,并执行所述可点击控件关联的操作。
  14. 根据权利要求10所述的设备,在基于辅助功能获取所述终端设备当前操作界面的所述屏幕信息前,所述处理器配置为:
    确定所解析出的控制指令是否与预置词条列表中的第一词条匹配;
    响应于所解析出的控制指令与所述预置词条列表中的第一词条匹配,根据所述第一词条拆分所解析出的控制指令。
  15. 根据权利要求10所述的设备,当按照所述控制指令执行所述第一控件关联的操作时,所述处理器配置为:
    确定所述第一控件的个数是否为1,
    响应于所述第一控件的个数大于1,向用户发出提示,所述提示用于指示所述用户手动选择。
  16. 根据权利要求15所述的设备,所述处理器还配置为:
    在预设等待时间内未接收到所述用户手动选择,执行所述第一控件中的第一个控件关联的操作。
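  17. The device according to claim 15, wherein the processor is further configured to:
    in response to receiving the manual selection by the user within a preset waiting time, record the ID of the target control selected by the user from among the first controls; and
    set the parsed control instruction to match uniquely the ID of the target control.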
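  18. The device according to claim 15, wherein the processor is further configured to:
    in response to receiving the manual selection by the user within a preset waiting time, issue a prompt to the user, the prompt reminding the user to record the operation; and
    upon receiving a record-operation signal, set the parsed control instruction to match uniquely the ID of the target control selected by the user from among the first controls.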
PCT/CN2019/093222 2018-07-04 2019-06-27 Voice control method and device WO2020007225A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810724986.5 2018-07-04
CN201810724986.5A CN110691160A (zh) 2018-07-04 2018-07-04 Voice control method, apparatus, and mobile phone

Publications (1)

Publication Number Publication Date
WO2020007225A1 (zh) 2020-01-09

Family

ID=69059841

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/093222 WO2020007225A1 (zh) 2018-07-04 2019-06-27 Voice control method and device

Country Status (2)

Country Link
CN (1) CN110691160A (zh)
WO (1) WO2020007225A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114489437A (zh) * 2022-01-14 2022-05-13 深圳优美创新科技有限公司 Smart watch, control method therefor, and computer-readable storage medium

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
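CN112309388A (zh) * 2020-03-02 2021-02-02 北京字节跳动网络技术有限公司 Method and apparatus for processing information
CN114007117B (zh) * 2020-07-28 2023-03-21 华为技术有限公司 Control display method and device
CN112017656A (zh) * 2020-08-11 2020-12-01 博泰车联网(南京)有限公司 Voice control method, apparatus, and computer storage medium
CN112581957B (zh) * 2020-12-04 2023-04-11 浪潮电子信息产业股份有限公司 Computer voice control method, system, and related apparatus
CN112712806A (zh) * 2020-12-31 2021-04-27 南方科技大学 Assisted reading method for visually impaired people, apparatus, mobile terminal, and storage medium
CN114115777A (zh) * 2021-11-19 2022-03-01 武汉虹信技术服务有限责任公司 Enhanced text display method based on the Android system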

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102541438A (zh) * 2010-11-01 2012-07-04 微软公司 User interface integrating a voice command modality
US20130342457A1 (en) * 2012-06-22 2013-12-26 Cape Evolution Limited Data manipulation on electronic device and remote terminal
CN103869931A (zh) * 2012-12-10 2014-06-18 三星电子(中国)研发中心 Method and apparatus for voice control of a user interface

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
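CN103442138A (zh) * 2013-08-26 2013-12-11 华为终端有限公司 Voice control method, apparatus, and terminal
CN105161106A (zh) * 2015-08-20 2015-12-16 深圳Tcl数字技术有限公司 Voice control method and apparatus for a smart terminal, and television system
CN105895093A (zh) * 2015-11-02 2016-08-24 乐视致新电子科技(天津)有限公司 Voice information processing method and apparatus
CN105551488A (zh) * 2015-12-15 2016-05-04 深圳Tcl数字技术有限公司 Voice control method and system
CN105957530B (zh) * 2016-04-28 2020-01-03 海信集团有限公司 Voice control method, apparatus, and terminal device
CN108010523B (zh) * 2016-11-02 2023-05-09 松下电器(美国)知识产权公司 Information processing method and recording medium
CN106683675A (zh) * 2017-02-08 2017-05-17 张建华 Control method and voice operating system
CN107358953A (zh) * 2017-06-30 2017-11-17 努比亚技术有限公司 Voice control method, mobile terminal, and storage medium
CN107948698A (zh) * 2017-12-14 2018-04-20 深圳市雷鸟信息科技有限公司 Voice control method and system for a smart TV, and smart TV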

Also Published As

Publication number Publication date
CN110691160A (zh) 2020-01-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19831470

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.05.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19831470

Country of ref document: EP

Kind code of ref document: A1