WO2016184095A1 - Method and apparatus for executing an operation event, and terminal - Google Patents

Method and apparatus for executing an operation event, and terminal

Info

Publication number
WO2016184095A1
WO2016184095A1 (PCT/CN2015/098022)
Authority
WO
WIPO (PCT)
Prior art keywords
voice
mode
operation instruction
terminal
voice operation
Prior art date
Application number
PCT/CN2015/098022
Other languages
English (en)
Chinese (zh)
Inventor
张大凯
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016184095A1

Links

Images

Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 - Programme-control systems
    • G05B19/02 - Programme-control systems electric
    • G05B19/04 - Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification techniques
    • G10L17/22 - Interactive procedures; Man-machine interfaces
    • G10L17/24 - Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase

Definitions

  • the present invention relates to the field of communications, and in particular to a method, an apparatus, and a terminal for performing an operation event.
  • The existing operating system is a single-instruction execution system: when the user speaks one instruction, only one action is performed.
  • However, in some situations the user may want the mobile terminal to do several things at once. For example, when the user gets into a car, the user may want the phone to turn on Bluetooth, play music, start navigation, and even adjust the volume to a level appropriate for the car.
  • With an existing voice control system, these actions require at least four voice commands. This makes the operation feel very cumbersome, degrades the user experience of voice control, and ultimately leaves the voice control system unused.
  • an embodiment of the present invention provides a method, an apparatus, and a terminal for performing an operation event.
  • A method for performing an operation event is provided, including: receiving a voice operation instruction of a terminal user; opening a scene mode corresponding to the voice operation instruction, where the scene mode includes two or more operation events; and performing the operation events included in the scene mode.
  • Before the scene mode corresponding to the voice operation instruction is opened, the method further includes: configuring a correspondence between the voice operation instruction and the scene mode.
  • Before receiving the voice operation instruction of the terminal user, the method further includes: detecting a specified event, and starting the voice mode under the trigger of the specified event, where, in the voice mode, the voice operation instruction of the terminal user is received.
  • Before receiving the voice operation instruction of the terminal user, the method further includes: configuring the operation events included in the scene mode.
  • the operation event includes at least one of adjusting a volume to a specified value, turning on a navigation function, opening a music player, and calling a designated contact.
  • the voice operation instruction includes: a single voice operation instruction.
  • An apparatus for performing an operation event is also provided, including: a receiving module configured to receive a voice operation instruction of the terminal user; a first opening module configured to open the scene mode corresponding to the voice operation instruction, where the scene mode includes two or more operation events; and an execution module configured to perform the operation events included in the scene mode.
  • the device further includes: a first configuration module, connected to the first opening module, configured to configure a correspondence between the voice operation instruction and the scene mode.
  • The device further includes: a detecting module configured to detect a specified event; and a second opening module, connected to the detecting module, configured to start a voice mode under the trigger of the specified event, where, in the voice mode, a voice operation instruction of the terminal user is received.
  • The device further includes: a second configuration module, configured to configure the operation events included in the scene mode.
  • A terminal is also provided, comprising the apparatus for performing an operation event according to any of the above.
  • With the above technical solution, the terminal receives a voice operation instruction and opens the scene mode corresponding to that instruction, which includes multiple operation events; that is, a single voice operation instruction can control the terminal to execute multiple operation events. This solves the problem in the related art that a single voice instruction can perform only one operation, with the resulting poor user experience, thereby reducing the operation complexity of the terminal and greatly improving the user experience.
  • FIG. 1 is a flowchart of a method for executing an operation event according to an embodiment of the present invention;
  • FIG. 2 is a structural block diagram of an apparatus for executing an operation event according to an embodiment of the present invention;
  • FIG. 3 is another structural block diagram of an apparatus for executing an operation event according to an embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of a terminal according to an example of the present invention;
  • FIG. 5 is a flowchart of a scene mode based voice control method according to a preferred embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for executing an operation event according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S102: receiving a voice operation instruction of the terminal user;
  • Step S104: opening the scene mode corresponding to the voice operation instruction, where the scene mode includes two or more operation events;
  • Step S106: executing the operation events included in the above scene mode.
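  • Purely as an illustration (not part of the patent disclosure), the following Python sketch shows how steps S102-S106 could fit together; the dictionary, function names, and printed actions are assumptions made for this example.

```python
# Minimal sketch of steps S102-S106; all names here are illustrative assumptions.
SCENE_MODES = {
    # one voice operation instruction -> a scene mode bundling two or more operation events
    "i am in the car": ["turn on bluetooth", "play music", "set volume to 10", "open navigation"],
}

def execute_operation_event(event: str) -> None:
    print(f"executing operation event: {event}")  # stand-in for the real terminal behaviour

def handle_voice_instruction(recognized_text: str) -> bool:
    """S102: receive the instruction; S104: open the matching scene mode; S106: run its events."""
    events = SCENE_MODES.get(recognized_text.strip().lower())
    if not events:
        return False  # no scene mode corresponds to this instruction
    for event in events:
        execute_operation_event(event)
    return True

if __name__ == "__main__":
    handle_voice_instruction("I am in the car")
```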
  • Through the above steps, the scene mode that corresponds to the voice operation instruction and includes multiple operation events is opened; that is, the terminal can execute multiple operation events in response to a single voice operation instruction. This solves the problem in the related art that a single voice instruction can perform only one operation, with the resulting poor user experience, thereby reducing the operation complexity of the terminal and greatly improving the user experience.
  • The execution subject of steps S102-S106 may be a terminal such as a mobile phone or a tablet computer, but is not limited thereto.
  • Before step S104 is performed, that is, before the scene mode corresponding to the voice operation instruction is opened, the following may also be performed: configuring a correspondence between a voice operation instruction and a scene mode.
  • For example, the voice operation instruction "I am in the car" can, through the configured correspondence, locate a pre-configured scene mode and trigger execution of the multiple operation events in that scene mode, such as opening music, opening navigation, and the like; a small configuration sketch follows.
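  • A minimal sketch of such a correspondence configuration, under the assumption that the mapping is kept in a simple in-memory registry; the registry and function names below are not taken from the patent.

```python
# Illustrative configuration of the instruction-to-scene-mode correspondence (assumed API).
scene_mode_registry: dict[str, list[str]] = {}

def configure_scene_mode(voice_instruction: str, operation_events: list[str]) -> None:
    """Bind a single voice operation instruction to a scene mode of two or more operation events."""
    if len(operation_events) < 2:
        raise ValueError("a scene mode is expected to bundle two or more operation events")
    scene_mode_registry[voice_instruction.strip().lower()] = operation_events

# Example: the single instruction "I am in the car" maps to several pre-configured events.
configure_scene_mode("I am in the car", ["open music player", "open navigation"])
print(scene_mode_registry)
```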
  • It should be noted that a voice operation instruction does not open the scene mode at just any time. A specified event is first detected, and under the trigger of that specified event the voice mode is started; in the voice mode, the voice operation instruction of the terminal user is received. That is, the user needs to start the voice mode in advance, so that when the terminal receives the specified voice operation instruction, the scene mode is opened and the multiple operation events are executed.
  • Before receiving the voice operation instruction of the terminal user, the method may further include: configuring the operation events included in the scene mode. The configured operation events may include multiple executable events, selected according to the user's needs, and include at least one of the following: adjusting the volume to a specified value, turning on the navigation function, opening the music player, and calling a designated contact.
  • End users can summarize the scene modes they are typically in and add them, such as "I am in the car", "I am taking a shower", "I am in a meeting", and so on.
  • the terminal provides various executable actions for the user to select.
  • The user selects a different action set for each scene mode, for example selecting, for the scene mode "I am in the car": turning on Bluetooth, playing music, adjusting the music volume to level 10, and opening navigation.
  • When the user later speaks "I am in the car", the terminal voice recognition module automatically recognizes the instruction and, on a successful match, automatically turns on Bluetooth, plays music, adjusts the volume to level 10, and turns on navigation for the user; a sketch of this dispatch follows.
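  • The dispatch of the action set selected for "I am in the car" might look like the sketch below; the handler functions are placeholders standing in for the terminal's real Bluetooth, media, volume, and navigation controls, and are assumptions rather than the patent's implementation.

```python
# Illustrative dispatch of the action set configured for one scene mode.
def turn_on_bluetooth() -> None: print("Bluetooth turned on")
def play_music() -> None: print("music playing")
def set_volume(level: int) -> None: print(f"volume adjusted to level {level}")
def open_navigation() -> None: print("navigation opened")

ACTION_SETS = {
    "i am in the car": [
        (turn_on_bluetooth, ()),
        (play_music, ()),
        (set_volume, (10,)),   # the level-10 volume from the example above
        (open_navigation, ()),
    ],
}

def on_recognized(phrase: str) -> None:
    # run every action bound to the recognized scene mode phrase
    for handler, args in ACTION_SETS.get(phrase.strip().lower(), []):
        handler(*args)

on_recognized("I am in the car")
```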
  • In this way, when the user is in a certain scene, the terminal can conveniently perform a number of specified actions for the user through a single instruction. This avoids the situation in which, every time the user enters that scene, multiple voice commands are needed to make the terminal perform the multiple actions: such an approach involves many steps and is cumbersome to operate, and each voice interaction takes a certain amount of time, so the efficiency is very low. Both of these aspects greatly reduce users' enthusiasm for using the voice control system.
  • With this solution, a process that previously required multiple interactions needs only one interaction. If the original number of interactions is N, the interaction efficiency is increased roughly N-fold; and when the interaction failures caused by the misrecognition rate, which are hard to avoid in each interaction, are also taken into account, the gain is even greater. This significantly improves the user experience of the voice control system.
  • In an implementation, the user may add custom scene modes, and the terminal may also preset various common scene modes; the user may add custom executable actions, and the terminal may also preset various commonly used executable actions; the user selects a number of actions from the executable actions for the customized scene mode; the terminal then saves the user-defined scene modes and the corresponding actions to the instruction library.
  • When the user is in a scene, the voice recognition application is started by a voice wake-up command, a headset button, or other hardware means; after the voice recognition application enters the voice recognition mode, the user speaks the instruction corresponding to the scene mode.
  • The voice recognition application recognizes the user instruction and matches it against the existing scene mode instructions. If the match is successful, the actions required by the instruction are executed; if the match fails, the user is prompted to re-enter, and if the number of failures exceeds the specified number of times, the process ends (see the sketch below).
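  • A rough sketch of that match-or-retry loop, assuming a stand-in recognizer and an assumed retry limit of three; recognize_once() is not a real API, only a placeholder for whatever speech recognizer the terminal actually uses.

```python
# Illustrative match-or-retry loop; the recognizer and the retry limit are assumptions.
MAX_FAILURES = 3

def recognize_once() -> str:
    # placeholder for the terminal's speech recognizer
    return input("speak an instruction (simulated by typing): ")

def recognition_loop(scene_modes: dict[str, list[str]]) -> None:
    failures = 0
    while failures < MAX_FAILURES:
        phrase = recognize_once().strip().lower()
        if phrase in scene_modes:
            for event in scene_modes[phrase]:  # successful match: run every required action
                print("executing:", event)
            return
        failures += 1
        print("no matching scene mode instruction, please re-enter")
    print("failure count exceeded the specified number of times, ending the process")
```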
  • FIG. 2 is a structural block diagram of an apparatus for executing an operation event according to an embodiment of the present invention. As shown in FIG. 2, the apparatus comprises:
  • The receiving module 20 is configured to receive a voice operation instruction of the terminal user;
  • The first opening module 22 is connected to the receiving module 20 and is configured to open the scene mode corresponding to the voice operation instruction, where the scene mode includes two or more operation events;
  • The execution module 24 is configured to perform the operation events included in the above scene mode.
  • Through this apparatus, a single voice operation instruction can control the terminal to execute multiple operation events, which solves the problem in the related art that a single voice instruction can perform only one operation, with the resulting poor user experience, thereby reducing the operation complexity of the terminal and greatly improving the user experience.
  • FIG. 3 is another structural block diagram of an apparatus for performing an operation event according to an embodiment of the present invention.
  • As shown in FIG. 3, the apparatus further includes: a first configuration module 26, connected to the first opening module 22 and configured to configure the correspondence between voice operation instructions and scene modes.
  • The apparatus further includes: a detecting module 28, configured to detect a specified event; a second opening module 30, connected to the detecting module 28 and configured to start the voice mode under the trigger of the specified event, where, in the voice mode, the voice operation instruction of the terminal user is received; and a second configuration module 32, connected to the first opening module 22 and configured to configure the operation events included in the scene mode.
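  • Purely to illustrate how the modules of FIG. 2 and FIG. 3 could be composed, the sketch below wires a receiving module, an opening module, and an execution module together; the class names follow the description, but everything else is an assumption.

```python
# Illustrative composition of receiving module 20, first opening module 22 and execution module 24.
class ExecutionModule:                 # execution module 24
    def run(self, events: list[str]) -> None:
        for event in events:
            print("executing:", event)

class FirstOpeningModule:              # first opening module 22
    def __init__(self, scene_modes: dict[str, list[str]], executor: ExecutionModule) -> None:
        self.scene_modes, self.executor = scene_modes, executor
    def open_scene_mode(self, instruction: str) -> None:
        events = self.scene_modes.get(instruction)
        if events:                     # scene mode found: hand its events to the executor
            self.executor.run(events)

class ReceivingModule:                 # receiving module 20
    def __init__(self, opener: FirstOpeningModule) -> None:
        self.opener = opener
    def on_voice_instruction(self, text: str) -> None:
        self.opener.open_scene_mode(text.strip().lower())

scene_modes = {"i am in the car": ["open bluetooth", "play music", "open navigation"]}
device = ReceivingModule(FirstOpeningModule(scene_modes, ExecutionModule()))
device.on_voice_instruction("I am in the car")
```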
  • A terminal is also provided, including the apparatus for executing an operation event according to any one of the above.
  • FIG. 4 is a schematic structural diagram of a terminal according to an example of the present invention. As shown in FIG. 4, the terminal includes:
  • Scene mode customization unit 40: through this unit the user adds a custom scene mode according to the rules; the scene mode name is generally a semantic phrase that the voice recognition unit can recognize.
  • Executable action customization unit 42: through this unit the user adds custom executable actions according to the rules; the rules divide an executable action into an action type and an action object, which together can be translated into behavior that the terminal can understand.
  • Scene mode configuration unit 44 (corresponding to the second configuration module 32 in the above embodiment): through this unit the user selects the actions to be performed for a custom scene mode.
  • Scene mode storage unit 46: a many-to-many database model configured to store the many-to-many relationship between scene modes and executable actions, as sketched below.
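  • One possible shape for such a many-to-many model, sketched with SQLite; the table and column names below are assumptions made for the example, not the patent's schema.

```python
# Illustrative many-to-many schema linking scene modes to executable actions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scene_mode  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE action      (id INTEGER PRIMARY KEY, action_type TEXT, action_object TEXT);
CREATE TABLE scene_action(scene_id  INTEGER REFERENCES scene_mode(id),
                          action_id INTEGER REFERENCES action(id),
                          PRIMARY KEY (scene_id, action_id));
""")
conn.execute("INSERT INTO scene_mode(name) VALUES ('I am in the car')")
conn.execute("INSERT INTO action(action_type, action_object) VALUES ('open', 'bluetooth')")
conn.execute("INSERT INTO scene_action VALUES (1, 1)")   # link the scene mode to the action

rows = conn.execute("""
SELECT a.action_type, a.action_object
FROM scene_mode s
JOIN scene_action sa ON sa.scene_id = s.id
JOIN action a        ON a.id = sa.action_id
WHERE s.name = 'I am in the car'
""").fetchall()
print(rows)   # [('open', 'bluetooth')]
```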
  • Voice recognition wake-up unit 48 (corresponding to the second opening module 30 in the above embodiment): used to wake up the voice recognition application; this can be implemented by a voice wake-up command, a headset button, a terminal button, or other hardware means.
  • Voice recognition unit 50 (corresponding to the receiving module 20 in the above embodiment): receives the user's scene mode instruction, recognizes it, and matches it against the custom scene mode names stored in the scene mode storage unit. If the match is successful, the actions required by the instruction are executed; if the match fails, the user is prompted to re-enter, and if the number of failures exceeds the specified number of times, the process ends.
  • Action execution unit 52 (corresponding to the execution module 24 in the above embodiment): after the scene mode instruction is successfully matched, this unit parses out, according to the rules, the action type and action object of each action required by the instruction, translates the parsing result into instructions that the terminal can understand, and the terminal executes those instructions to complete each action.
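  • The parsing rule could, for example, encode each action as an "action_type:action_object" string; the sketch below assumes that encoding, which is an illustrative choice rather than the patent's actual rule.

```python
# Illustrative parse-and-dispatch for an action execution unit (assumed "type:object" rule).
def parse_action(action: str) -> tuple[str, str]:
    """Split e.g. 'open:navigation' into an action type and an action object."""
    action_type, _, action_object = action.partition(":")
    return action_type.strip(), action_object.strip()

def execute_actions(actions: list[str]) -> None:
    # handlers stand in for the behaviours the terminal can actually carry out
    handlers = {
        "open":   lambda obj: print(f"opening {obj}"),
        "adjust": lambda obj: print(f"adjusting {obj}"),
        "call":   lambda obj: print(f"calling {obj}"),
    }
    for action in actions:
        action_type, action_object = parse_action(action)
        handlers.get(action_type, lambda obj: print(f"unknown action: {obj}"))(action_object)

execute_actions(["open:bluetooth", "open:music player", "adjust:volume to level 10", "open:navigation"])
```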
  • FIG. 5 is a flowchart of a scenario-based voice control method according to a preferred embodiment of the present invention. As shown in FIG. 5, the method includes the following steps:
  • Step S502: the user adds a custom scene mode. The terminal may also preset various common scene modes; both manners fall within the scope of the present invention.
  • Step S504: the user adds custom executable actions. The terminal may also preset various commonly used executable actions; both manners fall within the scope of the present invention.
  • Step S506: the user selects and configures multiple actions from the executable actions for the customized scene mode. Scene modes and executable actions are in a many-to-many relationship; the selection may take the scene mode as the main body (choosing actions for a scene mode) or take the action as the main body (choosing scene modes for an action), and both manners fall within the scope of the present invention.
  • Step S508: the terminal saves the user-defined scene mode and the corresponding actions to the instruction library. The manner in which the terminal saves them includes, but is not limited to, database storage, file storage, and the like; a file-storage sketch follows.
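  • A minimal sketch of the file-storage variant, assuming a JSON instruction library; the file name and layout are illustrative assumptions, and database storage (as sketched earlier) would serve equally well.

```python
# Illustrative persistence of the instruction library to a JSON file (assumed layout).
import json
from pathlib import Path

LIBRARY_PATH = Path("instruction_library.json")

def save_instruction_library(library: dict[str, list[str]]) -> None:
    LIBRARY_PATH.write_text(json.dumps(library, ensure_ascii=False, indent=2), encoding="utf-8")

def load_instruction_library() -> dict[str, list[str]]:
    if not LIBRARY_PATH.exists():
        return {}
    return json.loads(LIBRARY_PATH.read_text(encoding="utf-8"))

save_instruction_library({"I am in the car": ["open:bluetooth", "play:music", "open:navigation"]})
print(load_instruction_library())
```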
  • Step S510: when the user is in the scene, the voice recognition application is started by a voice wake-up command, a headset button, or other hardware means; after the voice recognition application enters the voice recognition mode, the user speaks the instruction corresponding to the scene mode;
  • Step S512: the voice recognition application recognizes the user instruction and matches it against the existing scene mode instructions. If the match is successful, the actions required by the instruction are executed; if the match fails, the user is prompted to re-enter, and if the number of failures exceeds the specified number of times, the process ends.
  • In summary, the embodiments of the present invention achieve the following technical effects: the problem in the related art of a poor user experience caused by a single voice instruction performing only one operation is solved, thereby reducing the operation complexity of the terminal, greatly improving the user experience, and extending the application scenarios of the terminal.
  • In another embodiment, a storage medium is further provided, in which the above-mentioned software is stored; the storage medium includes, but is not limited to, an optical disk, a floppy disk, a hard disk, an erasable memory, and the like.
  • Obviously, the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from that described herein; alternatively, they may each be made into an individual integrated circuit module, or multiple modules or steps among them may be made into a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • The foregoing technical solutions provided by the present invention can be applied to the execution of operation events: a voice operation instruction is received, and the scene mode that corresponds to the voice operation instruction and includes multiple operation events is opened, so that a single voice operation instruction can control the terminal to execute multiple operation events. This solves the problem in the related art that a single voice instruction can perform only one operation, with the resulting poor user experience, thereby reducing the operation complexity of the terminal.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computational Linguistics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephone Function (AREA)

Abstract

Disclosed are a method and an apparatus for executing an operation event, and a terminal. The method comprises: receiving a voice operation instruction of a terminal user (S102); opening a scene mode corresponding to the voice operation instruction, the scene mode comprising at least two operation events (S104); and executing the operation events included in the scene mode (S106). The above technical solution solves the problem in the related art of a poor user experience caused by the fact that only one operation can be executed for a single voice instruction, thereby reducing the operation complexity of a terminal and greatly improving the user experience.
PCT/CN2015/098022 2015-10-16 2015-12-21 Method and apparatus for executing an operation event, and terminal WO2016184095A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510673325.0A CN106601242A (zh) 2015-10-16 2015-10-16 操作事件的执行方法及装置、终端
CN201510673325.0 2015-10-16

Publications (1)

Publication Number Publication Date
WO2016184095A1 true WO2016184095A1 (fr) 2016-11-24

Family

ID=57319349

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/098022 WO2016184095A1 (fr) 2015-10-16 2015-12-21 Procédé et appareil d'exécution d'événement de mise en œuvre, et terminal

Country Status (2)

Country Link
CN (1) CN106601242A (fr)
WO (1) WO2016184095A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018099000A1 (fr) * 2016-12-01 2018-06-07 中兴通讯股份有限公司 Voice input processing method, terminal and network server
WO2019033437A1 (fr) * 2017-08-18 2019-02-21 广东欧珀移动通信有限公司 Call control method and apparatus, terminal device and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107277225B (zh) * 2017-05-04 2020-04-24 北京奇虎科技有限公司 Method and apparatus for controlling a smart device by voice, and smart device
CN110164426B (zh) * 2018-02-10 2021-10-26 佛山市顺德区美的电热电器制造有限公司 Voice control method and computer storage medium
CN109117233A (zh) 2018-08-22 2019-01-01 百度在线网络技术(北京)有限公司 Method and apparatus for processing information
CN111048078A (zh) * 2018-10-15 2020-04-21 阿里巴巴集团控股有限公司 Voice compound instruction processing method and system, voice processing device and medium
CN113401134A (zh) * 2021-06-10 2021-09-17 吉利汽车研究院(宁波)有限公司 Scene mode customization method and apparatus, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4513189A (en) * 1979-12-21 1985-04-23 Matsushita Electric Industrial Co., Ltd. Heating apparatus having voice command control operative in a conversational processing manner
CN1460050A (zh) * 2001-03-27 2003-12-03 索尼公司 Action teaching apparatus and method for a robot apparatus, and storage medium
CN103197571A (zh) * 2013-03-15 2013-07-10 张春鹏 Control method, apparatus and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202798881U (zh) * 2012-07-31 2013-03-13 北京播思软件技术有限公司 Apparatus for controlling operation of a mobile device using voice commands
CN102833421B (zh) * 2012-09-17 2014-06-18 东莞宇龙通信科技有限公司 Mobile terminal and reminding method
CN105739940A (zh) * 2014-12-08 2016-07-06 中兴通讯股份有限公司 Storage method and apparatus
CN104866181B (zh) * 2015-06-08 2018-05-08 北京金山安全软件有限公司 Method and apparatus for executing multiple operation events

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4513189A (en) * 1979-12-21 1985-04-23 Matsushita Electric Industrial Co., Ltd. Heating apparatus having voice command control operative in a conversational processing manner
CN1460050A (zh) * 2001-03-27 2003-12-03 索尼公司 Action teaching apparatus and method for a robot apparatus, and storage medium
CN103197571A (zh) * 2013-03-15 2013-07-10 张春鹏 Control method, apparatus and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018099000A1 (fr) * 2016-12-01 2018-06-07 中兴通讯股份有限公司 Voice input processing method, terminal and network server
CN108132768A (zh) * 2016-12-01 2018-06-08 中兴通讯股份有限公司 Voice input processing method, terminal and network server
WO2019033437A1 (fr) * 2017-08-18 2019-02-21 广东欧珀移动通信有限公司 Call control method and apparatus, terminal device and storage medium

Also Published As

Publication number Publication date
CN106601242A (zh) 2017-04-26

Similar Documents

Publication Publication Date Title
WO2016184095A1 (fr) Procédé et appareil d'exécution d'événement de mise en œuvre, et terminal
US10832682B2 (en) Methods and apparatus for reducing latency in speech recognition applications
US11393472B2 (en) Method and apparatus for executing voice command in electronic device
JP6811758B2 (ja) 音声対話方法、装置、デバイス及び記憶媒体
JP6115941B2 (ja) 対話シナリオにユーザ操作を反映させる対話プログラム、サーバ及び方法
US20190179610A1 (en) Architecture for a hub configured to control a second device while a connection to a remote system is unavailable
US9047857B1 (en) Voice commands for transitioning between device states
US9698999B2 (en) Natural language control of secondary device
US11340566B1 (en) Interoperability of secondary-device hubs
US9466286B1 (en) Transitioning an electronic device between device states
US20170076208A1 (en) Terminal application launching method, and terminal
US10559303B2 (en) Methods and apparatus for reducing latency in speech recognition applications
WO2016078214A1 (fr) Procédé et dispositif de traitement de terminal, et support de stockage informatique
CN103873655A (zh) 移动终端防盗系统及方法
WO2017032025A1 (fr) Procédé de commande de lecture de musique et dispositif terminal
WO2021196617A1 (fr) Procédé et appareil d'interaction vocale, dispositif électronique et support de stockage
CN107948854B (zh) 一种操作音频生成方法、装置、终端及计算机可读介质
US10693944B1 (en) Media-player initialization optimization
US11470201B2 (en) Systems and methods for providing real time assistance to voice over internet protocol (VOIP) users
WO2016172846A1 (fr) Procédé basé sur une action de soufflage pour faire fonctionner un terminal mobile et terminal mobile
US20230362026A1 (en) Output device selection
CN111161734A (zh) 基于指定场景的语音交互方法及装置
CN111344781A (zh) 音频处理
CN105788598B (zh) 一种语音处理方法和电子设备
CN107767857B (zh) 一种信息播放方法、第一电子设备和计算机存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15892469

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15892469

Country of ref document: EP

Kind code of ref document: A1