WO2016112644A1 - Voice control method, apparatus, and terminal - Google Patents

Voice control method, apparatus, and terminal

Info

Publication number
WO2016112644A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
execution
instruction
user
model
Prior art date
Application number
PCT/CN2015/082221
Other languages
French (fr)
Chinese (zh)
Inventor
党松
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2016112644A1

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems


Abstract

A voice control method, apparatus, and terminal. The method includes: during a voice control process, learning operations demonstrated by a user to obtain a voice response execution model (S101), the voice response execution model including execution instructions corresponding to performing the operations; associating the learned voice response execution model with a corresponding voice instruction (S102); and, when the voice instruction input by the user is received, triggering execution of the execution instructions in the voice response execution model associated with that voice instruction (S103), so that a corresponding function application on the terminal is started.

Description

Voice control method, apparatus, and terminal
Technical field
This document relates to the field of communications, and in particular to a voice control method, apparatus, and terminal.
Background
With the popularization of smart terminals such as smartphones and iPads, terminals equipped with voice interaction assistants have become common, for example Google Now on Android and Siri on Apple systems. Relying on powerful database support, such voice interaction systems can hold simple "conversations" with people; although entertaining, their practical value varies from person to person. For example, in current voice interaction systems, when you say "I am hungry", the system searches for nearby places to eat according to a fixed pattern. However, a user who says "I am hungry" may simply want to call home and ask what is being cooked for dinner, not look for a restaurant. As another example, when encountering a dangerous situation, a user may need to quietly use the mobile phone to call the police, notify family members, and report his or her location, but the voice interaction functions of the related art clearly cannot let the user secretly control the phone to complete such a coherent sequence of actions. These problems arise because current voice interaction assistants "communicate" with people only through programs and patterns that have already been fixed, and therefore cannot meet the ever-changing individual needs of different users.
Summary of the invention
This document provides a voice control method, apparatus, and terminal to solve the problem that the voice interaction of the related art can only implement fixed functions through fixed voice instructions and cannot meet the individualized needs of different users.
A voice control method includes:
learning operations demonstrated by a user to obtain a voice response execution model, the voice response execution model including execution instructions corresponding to performing the operations;
associating the voice response execution model with a corresponding voice instruction;
after receiving the voice instruction, executing the execution instructions in the voice response execution model associated with the voice instruction.
In an embodiment of the present invention, the voice instruction is a private voice instruction recorded by the user or a standard voice instruction preset in the terminal.
In an embodiment of the present invention, when the voice instruction is a private voice instruction, acquiring the private voice instruction includes:
collecting a voice instruction input by the user before or after learning the operations demonstrated by the user to obtain the voice response execution model;
performing acoustic feature extraction on the collected voice instruction to obtain the corresponding private voice instruction.
In an embodiment of the present invention, the operations demonstrated by the user include one operation controlling a single application, multiple consecutive operations controlling a single application, or operations controlling at least two applications.
In an embodiment of the present invention, learning the operations demonstrated by the user to obtain the voice response execution model includes:
recording the operations demonstrated by the user;
converting each demonstrated operation into a corresponding execution instruction;
fixing the execution order of the execution instructions according to the execution order of the operations to obtain the voice response execution model.
In an embodiment of the present invention, the execution instructions include execution request instructions and execution response instructions.
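As a concrete illustration of the structures described above, the execution instructions (a request and a response, each with parameters) and the voice response execution model (an ordered list of such instructions) could be represented roughly as in the following Python sketch; the class and field names here are illustrative assumptions rather than structures defined by this document.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class ExecutionInstruction:
        # One learned step: an execution request and its expected response, each with parameters.
        level: str                                  # "SYS_LEVEL" or "APP_LEVEL"
        request: str                                # e.g. "OPEN_APP_Req"
        request_params: Dict[str, str] = field(default_factory=dict)
        response: str = ""                          # e.g. "OPEN_APP_Res"
        response_params: Dict[str, str] = field(default_factory=dict)

    @dataclass
    class VoiceResponseExecutionModel:
        # An ordered ("solidified") list of execution instructions.
        instructions: List[ExecutionInstruction] = field(default_factory=list)

    # Association of a voice instruction with its model.
    model_registry: Dict[str, VoiceResponseExecutionModel] = {}
    model_registry["my voice instruction"] = VoiceResponseExecutionModel([
        ExecutionInstruction("SYS_LEVEL", "WAKE_UP_Req", {}, "WAKE_UP_Res", {"result": "OK"}),
        ExecutionInstruction("SYS_LEVEL", "OPEN_APP_Req", {"app": "Call"}, "OPEN_APP_Res", {"result": "OK"}),
    ])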
A voice control apparatus includes a model establishing module, an association module, and an execution module:
the model establishing module is configured to learn operations demonstrated by a user to obtain a voice response execution model, the voice response execution model including execution instructions corresponding to performing the operations;
the association module is configured to associate the voice response execution model with a corresponding voice instruction;
the execution module is configured to, after receiving the voice instruction, execute the execution instructions in the voice response execution model associated with the voice instruction.
In an embodiment of the present invention, the apparatus further includes a voice acquisition module and a voice processing module;
the voice acquisition module is configured to collect a voice instruction input by the user before or after the model establishing module learns the operations demonstrated by the user to obtain the voice response execution model;
the voice processing module is configured to perform acoustic feature extraction on the collected voice instruction to obtain the corresponding private voice instruction.
In an embodiment of the present invention, the model establishing module includes a recording submodule, an analysis submodule, and a solidification submodule;
the recording submodule is configured to record the operations demonstrated by the user;
the analysis submodule is configured to convert each demonstrated operation into a corresponding execution instruction;
the solidification submodule is configured to fix the execution order of the execution instructions according to the execution order of the operations to obtain the voice response execution model.
A terminal includes the voice control apparatus described above.
A computer-readable storage medium stores computer-executable instructions for performing any of the methods described above.
With the voice control method, apparatus, and terminal provided by the embodiments of the present invention, operations demonstrated by the user can be learned during the voice control process to obtain a voice response execution model, the voice response execution model including execution instructions corresponding to performing each operation; the learned voice response execution model is then associated with a corresponding voice instruction. After the voice instruction input by the user is received, the execution instructions in the voice response execution model associated with that voice instruction are triggered, thereby starting the corresponding function application on the terminal. The embodiments of the present invention can therefore learn operations demonstrated by the user (different users can customize different operations to implement different functions) and obtain corresponding voice response execution models; that is, different users can privately customize the voice control function according to their own needs, so the individualized needs of different users can be met. The fixed mode of voice interaction with the terminal in the related art is abandoned, the user's personalized experience is enhanced, and the user's privatization needs are better satisfied.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a voice control method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of acquiring a user's private voice instruction according to Embodiment 1 of the present invention;
FIG. 3 is a schematic flowchart of learning operations demonstrated by a user according to Embodiment 1 of the present invention;
FIG. 4 is a schematic structural diagram of a voice control apparatus according to Embodiment 2 of the present invention;
FIG. 5 is a schematic structural diagram of another voice control apparatus according to Embodiment 2 of the present invention;
FIG. 6 is a schematic structural diagram of a model establishing module of a voice control apparatus according to Embodiment 2 of the present invention;
FIG. 7 is a schematic flowchart of a user's private customization process according to Embodiment 3 of the present invention;
FIG. 8 is a schematic flowchart of a user triggering a customized function by voice according to Embodiment 3 of the present invention.
Embodiments of the present invention
Embodiments of the present invention are described below with reference to the accompanying drawings.
Embodiment 1:
With the voice control method provided in this embodiment, the terminal can learn operations demonstrated by the user and obtain a corresponding voice response execution model; that is, different users can privately customize the voice control function according to their own needs, thereby obtaining a terminal that best "understands" the user and best "obeys" the user. The fixed mode of voice interaction with the terminal is abandoned, which enhances the user's personalized experience while better meeting the user's privatization needs. Referring to FIG. 1, the voice control method in this embodiment includes:
Step 101: learning operations demonstrated by the user to obtain a voice response execution model, the voice response execution model including execution instructions corresponding to performing each demonstrated operation;
Step 102: associating the obtained voice response execution model with a corresponding voice instruction;
Step 103: after receiving the voice instruction input by the user, executing the execution instructions in the voice response execution model associated with the voice instruction, so as to start the corresponding function application on the terminal.
It should be understood that the voice instruction in step 102 may be a private voice instruction recorded by the user or a standard voice instruction preset in the terminal; the standard voice instruction may be preset before the terminal leaves the factory or downloaded from a corresponding network platform. It should also be understood that, in this embodiment, one voice instruction may be associated with one voice response execution model, or multiple voice instructions may be associated with one voice response execution model; that is, a single voice response execution model may be triggered by any of several voice instructions. For example, the four voice instructions "呼叫" (call), "拨打" (dial), "call", and "发起" (initiate) may all be associated with one voice response execution model used to implement a calling function.
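A many-to-one association of voice instructions with a single voice response execution model can be kept in a simple lookup table, as in the minimal sketch below; the model identifier and function name are hypothetical.

    from typing import Optional

    calling_model_id = "calling_model"                 # hypothetical identifier of one execution model
    voice_to_model = {
        "call": calling_model_id,                      # several voice instructions trigger the same model
        "dial": calling_model_id,
        "initiate": calling_model_id,
    }

    def lookup_model(voice_instruction: str) -> Optional[str]:
        # Returns the identifier of the associated model, or None if the instruction is unknown.
        return voice_to_model.get(voice_instruction)

    assert lookup_model("dial") == "calling_model"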
When the voice instruction in step 102 is a private voice instruction recorded by the user, the terminal is more secure to use: first, other users do not know whether the private voice instruction enables a terminal function at all, nor which function it enables; second, the private voice instruction can also be bound to the user, which further improves security. Referring to FIG. 2, the process of acquiring the user's private voice instruction in this embodiment includes:
Step 201: before or after learning the operations demonstrated by the user to obtain the voice response execution model, collecting the voice instruction input by the user; the user's voice may be captured by a device such as the microphone (MIC) built into the terminal;
Step 202: performing acoustic feature extraction on the collected voice instruction to obtain the corresponding private voice instruction, and saving it.
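Steps 201 and 202 could be prototyped along the following lines. This is a minimal sketch that assumes the open-source librosa and numpy libraries and a crude averaged-MFCC comparison; a practical terminal would use a more robust keyword-spotting or speaker-verification method, but the division into collection, feature extraction, and later matching would be similar.

    import numpy as np
    import librosa

    def extract_acoustic_features(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
        # Step 202: derive a compact acoustic-feature template from a recorded voice instruction.
        signal, sample_rate = librosa.load(wav_path, sr=16000)
        mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)                     # average over time to obtain one template vector

    def matches(template: np.ndarray, candidate: np.ndarray, threshold: float = 25.0) -> bool:
        # Crude later check of whether a newly collected instruction matches the saved template.
        return float(np.linalg.norm(template - candidate)) < threshold

    # Usage: template = extract_acoustic_features("private_instruction.wav")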
In this embodiment, the user can customize a privatized voice-triggered operation procedure according to his or her own needs. The operations demonstrated by the user may be one operation or multiple consecutive operations controlling a single application in the terminal, or may include operations controlling at least two applications; for example, the user may demonstrate the following sequence: wake up the terminal -> open the camera application -> focus -> take 3 continuous shots and save them -> exit the camera application -> open the WeChat application -> select the 3 most recently taken photos -> share them to Moments. Referring to FIG. 3, the process of learning this series of demonstrated operations to obtain a voice response execution model includes:
Step 301: recording the operations demonstrated by the user, that is, first capturing each demonstrated operation;
Step 302: converting each demonstrated operation into a corresponding execution instruction, where the execution instruction includes an execution request instruction (with corresponding parameters) and an execution response instruction (with corresponding parameters) for the action;
Step 303: fixing the execution order of the execution instructions according to the execution order of the demonstrated operations to obtain the voice response execution model;
Step 304: saving the obtained voice response execution model, for example to a model database local to the terminal; it may, of course, also be saved to a remote database and retrieved when needed.
Repeating steps 301-304 yields multiple voice response execution models privately customized by the user. In this embodiment, each voice response execution model may be associated with its corresponding voice instruction as soon as it is obtained, before the next model is learned; alternatively, multiple voice response execution models may first be learned and then each associated with its corresponding voice instruction. The user may also modify the voice instruction corresponding to each voice response execution model, or modify a voice response execution model that has already been learned. In this embodiment, each demonstrated operation may be classified as system-level or application-level: waking up the terminal and opening an application are system-level operations, while the corresponding operations performed inside an opened application are application-level.
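A minimal sketch of steps 301 to 304 follows. The Operation tuple, the helper names, and the way a demonstrated operation is mapped onto a request/response pair are assumptions made for illustration; the actual mapping depends on the terminal platform.

    from typing import Dict, List, Tuple

    # A recorded operation: (level, action, parameters), e.g. ("SYS_LEVEL", "OPEN_APP", {"app": "Camera"}).
    Operation = Tuple[str, str, Dict[str, str]]

    def convert(op: Operation) -> Dict[str, object]:
        # Step 302: turn one demonstrated operation into its request/response instruction pair.
        level, action, params = op
        return {"level": level, "request": action + "_Req", "request_params": params,
                "response": action + "_Res", "response_params": {"result": "OK"}}

    def learn_model(recorded_ops: List[Operation]) -> List[Dict[str, object]]:
        # Steps 301 and 303: keep the instructions in the demonstrated ("solidified") order.
        return [convert(op) for op in recorded_ops]

    model_db: Dict[str, List[Dict[str, object]]] = {}   # step 304: a local model database

    demonstrated = [("SYS_LEVEL", "WAKE_UP", {}),
                    ("SYS_LEVEL", "OPEN_APP", {"app": "Camera"}),
                    ("APP_LEVEL", "TAKE_BURST", {"count": "3"})]
    model_db["camera_and_share"] = learn_model(demonstrated)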
Suppose the voice response execution model corresponding to the demonstrated sequence wake up the terminal -> open the camera application -> focus -> take 3 continuous shots and save them -> exit the camera application -> open the WeChat application -> select the 3 most recently taken photos -> share them to Moments is associated with the user's privately customized voice instruction "天王盖地虎" (a code phrase). After the voice response execution model has been associated with the "天王盖地虎" voice instruction, when a voice instruction issued by the user is received, the terminal determines whether it is the "天王盖地虎" voice instruction; if so, the associated voice response execution model is invoked and each execution instruction in it is executed in the fixed order, and the terminal correspondingly performs: wake up the terminal -> open the camera application -> focus -> take 3 continuous shots and save them -> exit the camera application -> open the WeChat application -> select the 3 most recently taken photos -> share them to Moments.
Embodiment 2:
This embodiment further provides a voice control apparatus, which can be applied to various smart terminals such as smartphones and iPads. Referring to FIG. 4, the apparatus includes a model establishing module 41, an association module 42, and an execution module 43:
the model establishing module 41 is configured to learn operations demonstrated by the user to obtain a voice response execution model, the voice response execution model including execution instructions corresponding to performing each operation;
the association module 42 is configured to associate the voice response execution model obtained by the model establishing module with a corresponding voice instruction;
the execution module 43 is configured to, after receiving the voice instruction input by the user, execute the execution instructions in the voice response execution model associated with the voice instruction, so as to start the corresponding function application on the terminal.
It should be understood that the voice instruction in this embodiment may be a private voice instruction recorded by the user or a standard voice instruction preset in the terminal; the standard voice instruction may be preset before the terminal leaves the factory or downloaded from a corresponding network platform. In this embodiment, one voice instruction may be associated with one voice response execution model, or multiple voice instructions may be associated with one voice response execution model; that is, a single voice response execution model may be triggered by any of several voice instructions.
The voice instruction in this embodiment may be a private voice instruction recorded by the user, because the terminal is then more secure to use: first, other users do not know whether the private voice instruction enables a terminal function at all, nor which function it enables; second, the private voice instruction can also be bound to the user, which further improves security. Referring to FIG. 5, the voice control apparatus further includes a voice acquisition module 44 and a voice processing module 45;
the voice acquisition module 44 is configured to collect the voice instruction input by the user before or after the model establishing module 41 learns the operations demonstrated by the user to obtain the voice response execution model; the user's voice may be captured by a device such as the microphone (MIC) built into the terminal;
the voice processing module 45 is configured to perform acoustic feature extraction on the voice instruction collected by the voice acquisition module 44 to obtain the corresponding private voice instruction.
In this embodiment, the user can customize a privatized voice-triggered operation procedure according to his or her own needs. The operations demonstrated by the user may control a single application in the terminal or may include operations controlling at least two applications; for example, the user may demonstrate the following sequence: wake up the terminal -> dial the emergency call xxx -> immediately initiate location positioning -> send a message, with the positioning result attached, to the emergency contact. Referring to FIG. 6, the model establishing module 41 in this embodiment includes a recording submodule 411, an analysis submodule 412, and a solidification submodule 413;
the recording submodule 411 is configured to record the operations demonstrated by the user;
the analysis submodule 412 is configured to convert each demonstrated operation into a corresponding execution instruction; the execution instruction includes an execution request instruction (with corresponding parameters) and an execution response instruction (with corresponding parameters) for the action;
the solidification submodule 413 is configured to fix the execution order of the execution instructions according to the execution order of the demonstrated operations to obtain the voice response execution model.
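The module structure of FIGs. 4 to 6 could be outlined as a set of cooperating classes, as in the sketch below; the method bodies are placeholders and the names are assumptions, intended only to show how the submodules of the model establishing module and the association and execution logic might fit together.

    class RecordingSubmodule:
        def record(self):
            # Capture the operations demonstrated by the user; platform-specific details omitted.
            return []

    class AnalysisSubmodule:
        def to_instruction(self, operation):
            # Convert one demonstrated operation into a request/response execution instruction.
            return {"level": operation["level"],
                    "request": operation["action"] + "_Req",
                    "response": operation["action"] + "_Res"}

    class SolidificationSubmodule:
        def solidify(self, instructions):
            # Fix the execution order to match the order of the demonstrated operations.
            return list(instructions)

    class ModelEstablishingModule:
        def __init__(self):
            self.recording = RecordingSubmodule()
            self.analysis = AnalysisSubmodule()
            self.solidification = SolidificationSubmodule()

        def learn(self, operations=None):
            ops = operations if operations is not None else self.recording.record()
            return self.solidification.solidify([self.analysis.to_instruction(o) for o in ops])

    class VoiceControlApparatus:
        def __init__(self):
            self.model_establishing = ModelEstablishingModule()
            self.associations = {}                   # association module: voice instruction -> model

        def associate(self, voice_instruction, model):
            self.associations[voice_instruction] = model

        def on_voice_instruction(self, voice_instruction):
            # Execution module: run each instruction of the associated model in its fixed order.
            for instruction in self.associations.get(voice_instruction, []):
                print("executing", instruction)      # placeholder for a real dispatch to the terminal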
Suppose the voice response execution model corresponding to the demonstrated sequence wake up the terminal -> dial the emergency call xxx -> immediately initiate location positioning -> send a message, with the positioning result attached, to the emergency contact is associated with the user's privately customized voice instruction "菠萝菠萝蜜" (literally "pineapple jackfruit"). After the voice response execution model has been associated with the "菠萝菠萝蜜" voice instruction, when a voice instruction issued by the user is received, the terminal determines whether it is the "菠萝菠萝蜜" voice instruction; if so, the associated voice response execution model is invoked and each execution instruction in it is executed in the fixed order, and the terminal correspondingly performs: wake up the terminal -> dial the emergency call xxx -> immediately initiate location positioning -> send a message, with the positioning result attached, to the emergency contact.
Embodiment 3:
In this embodiment, the voice instruction is taken to be a private voice instruction recorded by the user before the voice response execution model is established, and the entire voice control process is described by way of example.
Referring to FIG. 7, the user's private customization process includes:
Step 701: the user turns on the voice control customization mode of the terminal;
Step 702: the terminal prompts the user to record a private voice instruction, and the user, following the prompt, says to the terminal: "菠萝菠萝蜜";
Step 703: after the first recording pass is completed, the terminal asks the user to record once more for confirmation, and the user, following the prompt, again says: "菠萝菠萝蜜"; a single recording pass may, of course, also suffice;
Step 704: after the second recording pass is completed, the terminal analyzes and models the two recordings and compares them for consistency; if they are consistent, the process proceeds to step 705, and if not, it returns to step 702 and the user records again;
Step 705: the terminal saves the voice instruction and prompts the user that the private voice instruction has been recorded and that the corresponding operations should now be entered; following the prompt, the user begins to demonstrate the example operations:
wake up the terminal -> dial the emergency call xxx -> immediately initiate location positioning -> send a message, with the positioning result attached, to the emergency contact.
Step 706: having recorded the operations demonstrated by the user, the terminal decomposes them and obtains the following voice response execution model:
(Wake up the terminal:)
SYS_LEVEL: WAKE_UP_Req ->
SYS_LEVEL: WAKE_UP_Res(OK) ->
(Open the phone application:)
SYS_LEVEL: OPEN_APP_Req(Call) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
(Dial the emergency call:)
APP_LEVEL: SET_UP_EMERGENCY_CALL_Req(number) ->
APP_LEVEL: SET_UP_EMERGENCY_CALL_Res(OK) ->
(Turn on GPS positioning:)
SYS_LEVEL: OPEN_APP_Req(LBS) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
(Initiate positioning:)
APP_LEVEL: SET_UP_LBS_SERVICE_Req(Local position) ->
APP_LEVEL: SET_UP_LBS_SERVICE_Res(Local position result) ->
(Open the message application:)
SYS_LEVEL: OPEN_APP_Req(Message) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
(Edit the message, attaching the positioning result:)
APP_LEVEL: EDIT_req(Local position result and other information) ->
APP_LEVEL: EDIT_res(OK) ->
(Send the message:)
APP_LEVEL: SEND_MESSAGE_Req(number) ->
APP_LEVEL: SEND_MESSAGE_Res(OK)
Step 707: the terminal saves the obtained voice response execution model and prompts the user that recording is complete, asking whether to continue recording; if the user selects yes, the process goes to step 702; if no, it goes to step 708;
Step 708: exit the voice control customization mode.
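The two-pass recording and consistency check of steps 702 to 705 could be prototyped as follows; the inputs are assumed to be acoustic-feature vectors such as those sketched in Embodiment 1, and the threshold is purely illustrative.

    import numpy as np

    def enroll_private_instruction(first_pass, second_pass, threshold: float = 25.0):
        first, second = np.asarray(first_pass, dtype=float), np.asarray(second_pass, dtype=float)
        # Step 704: compare the two recordings (as feature vectors) for consistency.
        if float(np.linalg.norm(first - second)) < threshold:
            # Step 705: save an averaged template as the private voice instruction.
            return (first + second) / 2.0
        return None                                  # inconsistent: return to step 702 and re-record

    # Usage with two equal-length feature vectors, e.g. from extract_acoustic_features above:
    # template = enroll_private_instruction(features_take1, features_take2)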
Referring to FIG. 8, the process by which the user triggers the pre-configured function with the privatized voice instruction on a particular occasion includes:
Step 801: receiving a voice instruction issued by the user; for example, when the user encounters a dangerous situation, such as being held hostage and unable to call the police openly, the user secretly triggers the terminal to raise the alarm by saying the privatized instruction "菠萝菠萝蜜";
Step 802: after receiving the voice instruction issued by the user, the terminal analyzes the instruction and compares it with the stored private voice instructions; if the instruction is not valid, the process returns to step 801; if it is valid, the process proceeds to step 803;
Step 803: invoking the voice response execution model corresponding to the voice instruction, and starting to parse and execute it:
The voice response execution model executed here is as follows:
SYS_LEVEL: WAKE_UP_Req ->
SYS_LEVEL: WAKE_UP_Res(OK) ->
SYS_LEVEL: OPEN_APP_Req(Call) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
APP_LEVEL: SET_UP_EMERGENCY_CALL_Req(number) ->
APP_LEVEL: SET_UP_EMERGENCY_CALL_Res(OK) ->
SYS_LEVEL: OPEN_APP_Req(LBS) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
APP_LEVEL: SET_UP_LBS_SERVICE_Req(Local position) ->
APP_LEVEL: SET_UP_LBS_SERVICE_Res(Local position result) ->
SYS_LEVEL: OPEN_APP_Req(Message) ->
SYS_LEVEL: OPEN_APP_Res(OK) ->
APP_LEVEL: EDIT_req(Local position result and other information) ->
APP_LEVEL: EDIT_res(OK) ->
APP_LEVEL: SEND_MESSAGE_Req(number) ->
APP_LEVEL: SEND_MESSAGE_Res(OK)
In this model:
SYS_LEVEL: WAKE_UP_Req ->
SYS_LEVEL: WAKE_UP_Res(OK) corresponds to the action: wake up the terminal.
SYS_LEVEL: OPEN_APP_Req(Call) ->
SYS_LEVEL: OPEN_APP_Res(OK) corresponds to the action: open the phone application.
APP_LEVEL: SET_UP_EMERGENCY_CALL_Req(number) ->
APP_LEVEL: SET_UP_EMERGENCY_CALL_Res(OK) corresponds to the action: dial the emergency call.
SYS_LEVEL: OPEN_APP_Req(LBS) ->
SYS_LEVEL: OPEN_APP_Res(OK) corresponds to the action: turn on GPS positioning.
APP_LEVEL: SET_UP_LBS_SERVICE_Req(Local position) ->
APP_LEVEL: SET_UP_LBS_SERVICE_Res(Local position result) corresponds to the action: initiate positioning.
SYS_LEVEL: OPEN_APP_Req(Message) ->
SYS_LEVEL: OPEN_APP_Res(OK) corresponds to the action: open the message application.
APP_LEVEL: EDIT_req(Local position result and other information) ->
APP_LEVEL: EDIT_res(OK) corresponds to the action: edit the message, attaching the positioning result.
APP_LEVEL: SEND_MESSAGE_Req(number) ->
APP_LEVEL: SEND_MESSAGE_Res(OK) corresponds to the action: send the message.
The final complete model execution process is therefore:
wake up the terminal -> dial the emergency call xxx -> immediately initiate location positioning -> send a message, with the positioning result attached, to the emergency contact.
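A minimal sketch of the parse-and-execute loop of step 803 is given below, using the instruction strings listed above; the handler functions are placeholders standing in for the real platform calls, which this document does not define.

    MODEL = [
        ("SYS_LEVEL", "WAKE_UP_Req", None),
        ("SYS_LEVEL", "OPEN_APP_Req", "Call"),
        ("APP_LEVEL", "SET_UP_EMERGENCY_CALL_Req", "number"),
        ("SYS_LEVEL", "OPEN_APP_Req", "LBS"),
        ("APP_LEVEL", "SET_UP_LBS_SERVICE_Req", "Local position"),
        ("SYS_LEVEL", "OPEN_APP_Req", "Message"),
        ("APP_LEVEL", "EDIT_req", "Local position result and other information"),
        ("APP_LEVEL", "SEND_MESSAGE_Req", "number"),
    ]

    def system_level_handler(request, param):
        print("[system]", request, param, "-> OK")   # placeholder for a real system-level call

    def app_level_handler(request, param):
        print("[app]", request, param, "-> OK")      # placeholder for a real in-application action

    def execute_model(model):
        # Step 803: execute each request in the solidified order, dispatching by level.
        for level, request, param in model:
            handler = system_level_handler if level == "SYS_LEVEL" else app_level_handler
            handler(request, param)

    execute_model(MODEL)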
It can be seen that the voice control scheme provided by the embodiments of the present invention allows the user to break free of the current fixed mode of voice interaction with the terminal and to train the terminal that best obeys and best understands the user, which not only enhances the user's personalized experience but can also address many of the user's privatization needs.
Those of ordinary skill in the art will appreciate that all or some of the steps of the above embodiments may be implemented by a computer program flow; the computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, device, apparatus, or component), and when executed it includes one of, or a combination of, the steps of the method embodiments.
Optionally, all or some of the steps of the above embodiments may also be implemented with integrated circuits; these steps may be fabricated as individual integrated circuit modules, or several of the modules or steps among them may be fabricated as a single integrated circuit module.
The apparatuses/function modules/functional units in the above embodiments may be implemented by general-purpose computing devices; they may be centralized on a single computing device or distributed over a network formed by multiple computing devices.
When the apparatuses/function modules/functional units in the above embodiments are implemented in the form of software function modules and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
Industrial applicability
Through the embodiments of the present invention, different users can privately customize the voice control function according to their own needs, so the personalized requirements of different users can be met; the fixed mode of voice interaction with the terminal is eliminated, the user's personalized experience is enhanced, and the user's private requirements are better satisfied.

Claims (11)

  1. A voice control method, comprising:
    learning operations demonstrated by a user to obtain a voice response execution model, the voice response execution model comprising execution instructions corresponding to performing the operations;
    associating the voice response execution model with a corresponding voice instruction;
    after the voice instruction is received, executing the execution instructions in the voice response execution model associated with the voice instruction.
  2. The voice control method according to claim 1, wherein the voice instruction is a private voice instruction recorded by the user or a standard voice instruction preset in the terminal.
  3. The voice control method according to claim 2, wherein, when the voice instruction is a private voice instruction, acquiring the private voice instruction comprises:
    collecting a voice instruction input by the user before learning the operations demonstrated by the user to obtain the voice response execution model, or after learning the operations demonstrated by the user to obtain the voice response execution model;
    performing acoustic feature extraction on the collected voice instruction to obtain the corresponding private voice instruction.
  4. The voice control method according to any one of claims 1 to 3, wherein the operations demonstrated by the user comprise one operation controlling one application, or a plurality of consecutive operations controlling one application, or operations controlling at least two applications.
  5. The voice control method according to any one of claims 1 to 3, wherein learning the operations demonstrated by the user to obtain the voice response execution model comprises:
    recording the operations demonstrated by the user;
    converting each operation demonstrated by the user into a corresponding execution instruction;
    solidifying the execution order of the execution instructions according to the execution order of the operations to obtain the voice response execution model.
  6. The voice control method according to claim 5, wherein the execution instructions comprise execution request instructions and execution response instructions.
  7. A voice control apparatus, comprising a model establishing module, an association module and an execution module, wherein:
    the model establishing module is configured to learn operations demonstrated by a user to obtain a voice response execution model, the voice response execution model comprising execution instructions corresponding to performing the operations;
    the association module is configured to associate the voice response execution model with a corresponding voice instruction;
    the execution module is configured to, after the voice instruction is received, execute the execution instructions in the voice response execution model associated with the voice instruction.
  8. The voice control apparatus according to claim 7, further comprising a voice acquiring module and a voice processing module, wherein:
    the voice acquiring module is configured to collect a voice instruction input by the user before the model establishing module learns the operations demonstrated by the user to obtain the voice response execution model, or after the model establishing module learns the operations demonstrated by the user to obtain the voice response execution model;
    the voice processing module is configured to perform acoustic feature extraction on the collected voice instruction to obtain a corresponding private voice instruction.
  9. The voice control apparatus according to claim 7 or 8, wherein the model establishing module comprises a recording sub-module, an analysis sub-module and a solidifying sub-module, wherein:
    the recording sub-module is configured to record the operations demonstrated by the user;
    the analysis sub-module is configured to convert each operation demonstrated by the user into a corresponding execution instruction;
    the solidifying sub-module is configured to solidify the execution order of the execution instructions according to the execution order of the operations to obtain the voice response execution model.
  10. A terminal, comprising the voice control apparatus according to any one of claims 7 to 9.
  11. A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used for performing the method according to any one of claims 1 to 6.
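Claims 3 and 8 above recite acquiring a private voice instruction by performing acoustic feature extraction on audio recorded by the user. Purely as a toy, non-limiting sketch under assumed details (the class name PrivateVoiceInstructionStore, the per-frame energy feature and the distance threshold are introduced here for illustration and are not part of the claims), storing and matching such private voice instructions in Java might look like this:

    import java.util.HashMap;
    import java.util.Map;

    // Toy illustration only: reduce a recorded voice instruction to a small acoustic
    // feature vector (per-frame energy here; a real system would use richer features)
    // and match incoming audio against the stored private voice instructions.
    final class PrivateVoiceInstructionStore {

        private final Map<String, double[]> templates = new HashMap<>();

        // Acoustic feature extraction: average energy of fixed-length PCM frames.
        static double[] extractFeatures(short[] pcm, int frameSize) {
            int frames = pcm.length / frameSize;
            double[] features = new double[frames];
            for (int f = 0; f < frames; f++) {
                double energy = 0;
                for (int i = 0; i < frameSize; i++) {
                    double s = pcm[f * frameSize + i];
                    energy += s * s;
                }
                features[f] = energy / frameSize;
            }
            return features;
        }

        // Associate a recorded private voice instruction with a model identifier.
        void associate(String modelId, short[] recordedPcm) {
            templates.put(modelId, extractFeatures(recordedPcm, 160));
        }

        // Return the model whose stored features are closest to the incoming audio,
        // or null if no template is within the given distance threshold.
        String match(short[] incomingPcm, double threshold) {
            double[] incoming = extractFeatures(incomingPcm, 160);
            String best = null;
            double bestDistance = Double.MAX_VALUE;
            for (Map.Entry<String, double[]> entry : templates.entrySet()) {
                double[] stored = entry.getValue();
                double distance = 0;
                int n = Math.min(stored.length, incoming.length);
                for (int i = 0; i < n; i++) {
                    double d = stored[i] - incoming[i];
                    distance += d * d;
                }
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = entry.getKey();
                }
            }
            return bestDistance <= threshold ? best : null;
        }
    }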
PCT/CN2015/082221 2015-01-13 2015-06-24 Voice control method, apparatus, and terminal WO2016112644A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510016343.1 2015-01-13
CN201510016343.1A CN105845136A (en) 2015-01-13 2015-01-13 Voice control method and device, and terminal

Publications (1)

Publication Number Publication Date
WO2016112644A1 true WO2016112644A1 (en) 2016-07-21

Family

ID=56405183

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/082221 WO2016112644A1 (en) 2015-01-13 2015-06-24 Voice control method, apparatus, and terminal

Country Status (2)

Country Link
CN (1) CN105845136A (en)
WO (1) WO2016112644A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106773923B (en) * 2016-11-30 2020-04-21 北京光年无限科技有限公司 Multi-mode emotion data interaction method and device for robot
CN106773817B (en) * 2016-12-01 2020-11-17 北京光年无限科技有限公司 Command analysis method for intelligent robot and robot
CN107342085A (en) * 2017-07-24 2017-11-10 深圳云知声信息技术有限公司 Method of speech processing and device
CN107951368A (en) * 2017-09-13 2018-04-24 浙江苏泊尔家电制造有限公司 Method, cooking apparatus and the computer-readable storage medium of culinary art
CN108470567B (en) * 2018-03-15 2021-08-24 青岛海尔科技有限公司 Voice interaction method and device, storage medium and computer equipment
CN108632463A (en) * 2018-04-24 2018-10-09 维沃移动通信有限公司 A kind of sound control method and mobile terminal
CN108737933A (en) * 2018-05-30 2018-11-02 上海与德科技有限公司 A kind of dialogue method, device and electronic equipment based on intelligent sound box
CN108831469B (en) * 2018-08-06 2021-02-12 珠海格力电器股份有限公司 Voice command customizing method, device and equipment and computer storage medium
CN109754797A (en) * 2018-12-18 2019-05-14 广东金祺盛工业设备有限公司 Intelligent terminal operation system based on interactive voice
CN112309373A (en) * 2020-09-28 2021-02-02 惠州市德赛西威汽车电子股份有限公司 System and method for self-defining vehicle-mounted voice technology
CN112735387A (en) * 2020-12-25 2021-04-30 惠州市德赛西威汽车电子股份有限公司 User-defined vehicle-mounted voice skill system and method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101246687A (en) * 2008-03-20 2008-08-20 北京航空航天大学 Intelligent voice interaction system and method thereof
US20100179812A1 (en) * 2009-01-14 2010-07-15 Samsung Electronics Co., Ltd. Signal processing apparatus and method of recognizing a voice command thereof
CN102842306A (en) * 2012-08-31 2012-12-26 深圳Tcl新技术有限公司 Voice control method and device as well as voice response method and device
CN103188406A (en) * 2011-12-27 2013-07-03 中国电信股份有限公司 Method and system for achieving interactive voice response flow control
CN103426429A (en) * 2013-07-15 2013-12-04 三星半导体(中国)研究开发有限公司 Voice control method and voice control device
CN103632669A (en) * 2012-08-20 2014-03-12 上海闻通信息科技有限公司 A method for a voice control remote controller and a voice remote controller
CN103646646A (en) * 2013-11-27 2014-03-19 联想(北京)有限公司 Voice control method and electronic device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101889836B1 (en) * 2012-02-24 2018-08-20 삼성전자주식회사 Method and apparatus for cotrolling lock/unlock state of terminal through voice recognition
CN103366743A (en) * 2012-03-30 2013-10-23 北京千橡网景科技发展有限公司 Voice-command operation method and device
US9384732B2 (en) * 2013-03-14 2016-07-05 Microsoft Technology Licensing, Llc Voice command definitions used in launching application with a command
CN103885783A (en) * 2014-04-03 2014-06-25 深圳市三脚蛙科技有限公司 Voice control method and device of application program
CN104269170B (en) * 2014-09-17 2018-04-20 成都博智维讯信息技术有限公司 A kind of ERP authorities audio recognition method

Also Published As

Publication number Publication date
CN105845136A (en) 2016-08-10

Similar Documents

Publication Publication Date Title
WO2016112644A1 (en) Voice control method, apparatus, and terminal
US11315405B2 (en) Systems and methods for provisioning appliance devices
JP6811758B2 (en) Voice interaction methods, devices, devices and storage media
US10609199B1 (en) Providing hands-free service to multiple devices
CN105677460B (en) Applied program processing method and device
KR102178896B1 (en) Provides a personal auxiliary module with an optionally steerable state machine
JP2023051963A (en) Implementation of voice assistant on device
US20150228281A1 (en) Device, system, and method for active listening
CN109410952B (en) Voice awakening method, device and system
CN104902086B (en) Alarm clock ringing method and device
US20140297288A1 (en) Telephone voice personal assistant
US9854439B2 (en) Device and method for authenticating a user of a voice user interface and selectively managing incoming communications
CN109658927A (en) Wake-up processing method, device and the management equipment of smart machine
JP6783339B2 (en) Methods and devices for processing audio
CN107370772A (en) Account login method, device and computer-readable recording medium
CN111556197B (en) Method and device for realizing voice assistant and computer storage medium
CN108806714B (en) Method and device for adjusting volume
CN108632653A (en) Voice management-control method, smart television and computer readable storage medium
CN105898890A (en) Device searching method and electronic device for supporting the same
CN105871561A (en) Wireless wakeup device for cell module
JP6619488B2 (en) Continuous conversation function in artificial intelligence equipment
CN108648754B (en) Voice control method and device
WO2022088964A1 (en) Control method and apparatus for electronic device
CN109325337A (en) Unlocking method and device
US20140315520A1 (en) Recording and playing back portions of a telephone call

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15877559

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15877559

Country of ref document: EP

Kind code of ref document: A1