WO2019228140A1 - Instruction execution method, apparatus, storage medium, and electronic device

Publication number
WO2019228140A1
WO2019228140A1 PCT/CN2019/085563 CN2019085563W WO2019228140A1 WO 2019228140 A1 WO2019228140 A1 WO 2019228140A1 CN 2019085563 W CN2019085563 W CN 2019085563W WO 2019228140 A1 WO2019228140 A1 WO 2019228140A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
voice information
executed instructions
voiceprint feature
preset
Application number
PCT/CN2019/085563
Other languages
English (en)
French (fr)
Inventor
李冠
达剑
熊万江
刘嘉飞
周伍润
朱忠磊
董治
李海泉
文昭彦
高亮
Original Assignee
Oppo广东移动通信有限公司
Application filed by Oppo广东移动通信有限公司
Publication of WO2019228140A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • The present application relates to the technical field of electronic devices, and in particular to an instruction execution method, an apparatus, a storage medium, and an electronic device.
  • Electronic devices can perform specific operations in response to voice instructions. For example, when a user says "play music", the electronic device recognizes "play music" as a music playback instruction and executes that instruction to play music.
  • An embodiment of the present application provides an instruction execution method, including:
  • sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions.
  • An instruction execution apparatus, including:
  • a receiving module configured to receive input first voice information;
  • an acquisition module configured to acquire a plurality of first to-be-executed instructions included in the first voice information;
  • a sorting module configured to sort the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions;
  • an execution module configured to sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
  • An embodiment of the present application provides a storage medium on which a computer program is stored; when the computer program is run on a computer, the computer is caused to execute:
  • sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions.
  • An embodiment of the present application provides an electronic device including a processor and a memory, where the memory stores a computer program and the processor calls the computer program to execute:
  • sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions.
  • FIG. 1 is a schematic flowchart of an instruction execution method according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of operations of acquiring multiple first to-be-executed instructions included in the first voice information provided in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of executing multiple first to-be-executed instructions in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of splicing and executing multiple first to-be-executed instructions and multiple second to-be-executed instructions in the embodiment of the present application.
  • FIG. 5 is another schematic flowchart of an instruction execution method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an instruction execution apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 8 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
  • Computer execution as referred to herein includes operations, performed by a computer processing unit, on electronic signals that represent data in a structured form. These operations transform the data or maintain it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art.
  • The data structures in which the data is maintained are physical locations of the memory that have specific characteristics defined by the data format.
  • Those skilled in the art will understand that the various steps and operations described below can also be implemented in hardware.
  • The term "module" as used herein can be viewed as a software object executing on the computing system.
  • The different components, modules, engines, and services described herein can be considered as implementation objects on the computing system.
  • The apparatus and methods described herein can be implemented in software and, of course, can also be implemented in hardware, all of which fall within the protection scope of this application.
  • Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application.
  • The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to independent or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
  • An embodiment of the present application provides an instruction execution method.
  • The execution subject of the instruction execution method may be the instruction execution apparatus provided by the embodiments of the present application, or an electronic device in which the instruction execution apparatus is integrated.
  • The instruction execution apparatus may be implemented in hardware or in software.
  • The electronic device may be a smart phone, a tablet computer, a palmtop computer, a notebook computer, or a desktop computer.
  • An embodiment of the present application provides an instruction execution method, including:
  • the instruction execution method further includes:
  • the plurality of second to-be-executed instructions are sequentially executed according to the second sorting information.
  • the acquiring a plurality of first to-be-executed instructions included in the first voice information includes:
  • before the acquiring a plurality of first to-be-executed instructions included in the first voice information, the method further includes:
  • when the voiceprint feature matches a preset voiceprint feature, acquiring the plurality of first to-be-executed instructions included in the first voice information.
  • determining whether the voiceprint feature matches a preset voiceprint feature includes:
  • the method further includes:
  • the method further includes:
  • when the voiceprint feature does not match the preset voiceprint feature, the first voice information is discarded.
  • the receiving the input first voice information includes:
  • FIG. 1 is a schematic flowchart of an instruction execution method according to an embodiment of the present application. As shown in FIG. 1, the process of the instruction execution method provided by the embodiment of the present application may be as follows:
  • the input first voice information is received.
  • The electronic device may collect sound in the external environment through an audio collection module to obtain sound information in audio format. Noise reduction is then performed on the collected sound information to extract the human voice information it contains, and that human voice information is recorded as the input first voice information.
  • The audio collection module may be a microphone built into the electronic device or a microphone externally connected to it. This application does not specifically limit this, and the electronic device may choose between them according to a set selection rule.
  • The selection rule is configured as follows: if an external microphone is connected, sound in the external environment is collected through the connected external microphone; if no external microphone is connected, sound in the external environment is collected through the built-in microphone.
  • For example, when the user needs to control the electronic device by voice to download and install the XX application, the user can say "Please help me download and install the XX application".
  • The electronic device then collects sound information that contains both the voice "Please help me download and install the XX application" and environmental noise. The electronic device performs noise reduction on the collected sound information, removes the environmental noise, extracts the human voice information "Please help me download and install the XX application", and uses it as the input first voice information.
  • a plurality of first to-be-executed instructions included in the first voice information are acquired.
  • After receiving the input first voice information in audio format, the electronic device determines whether a speech parsing engine exists locally. If one exists, the electronic device inputs the first voice information to the local speech parsing engine for speech parsing to obtain parsed text. Here, parsing the voice information means converting it from "audio" to "text".
  • When multiple speech parsing engines exist locally, the electronic device may select one of them to perform speech parsing on the received first voice information in any of the following ways:
  • The electronic device may randomly select one speech parsing engine from the multiple local speech parsing engines.
  • The electronic device may select the speech parsing engine with the highest parsing success rate.
  • The electronic device may select the speech parsing engine with the shortest parsing time.
  • The electronic device may also select, from the speech parsing engines whose parsing success rate reaches a preset success rate, the one with the shortest parsing time.
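The selection strategies listed above can be sketched as follows. This is a minimal illustration only: the `EngineStats` record, its field names, and the sample figures are hypothetical and do not come from the application, which does not say how success rates or parsing times are measured.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EngineStats:
    """Hypothetical per-engine statistics (names are illustrative)."""
    name: str
    success_rate: float    # fraction of past parses that succeeded, 0.0 to 1.0
    avg_parse_time: float  # average parsing time in seconds


def pick_highest_success(engines: Sequence[EngineStats]) -> EngineStats:
    """Strategy: the engine with the highest parsing success rate."""
    return max(engines, key=lambda e: e.success_rate)


def pick_fastest(engines: Sequence[EngineStats]) -> EngineStats:
    """Strategy: the engine with the shortest parsing time."""
    return min(engines, key=lambda e: e.avg_parse_time)


def pick_fastest_above(engines: Sequence[EngineStats],
                       preset_success_rate: float) -> Optional[EngineStats]:
    """Strategy: among engines reaching the preset success rate, the fastest.

    Returns None when no engine reaches the preset rate; what the device
    does in that case is not stated in the text.
    """
    qualified = [e for e in engines if e.success_rate >= preset_success_rate]
    if not qualified:
        return None
    return min(qualified, key=lambda e: e.avg_parse_time)


# Made-up sample data for illustration.
engines = [
    EngineStats("engine_a", success_rate=0.98, avg_parse_time=0.80),
    EngineStats("engine_b", success_rate=0.92, avg_parse_time=0.30),
    EngineStats("engine_c", success_rate=0.96, avg_parse_time=0.50),
]
```

The random strategy is omitted because it reduces to a single call such as `random.choice(engines)`.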
  • For example, the electronic device may parse the first voice information with two speech parsing engines and, when the parsed texts obtained by the two engines are the same, use that text as the parsed text of the first voice information.
  • As another example, the electronic device may parse the first voice information with at least three speech parsing engines and, when at least two of them produce the same parsed text, use that text as the parsed text of the first voice information.
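The agreement check between engines can be sketched as a simple vote over the parsed texts. The function name `consensus_text` and the `min_agree` parameter are illustrative assumptions, and the text does not say what happens when no two engines agree, so this sketch returns `None` in that case.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_text(parsed_texts: Sequence[str], min_agree: int = 2) -> Optional[str]:
    """Return the parsed text produced by at least `min_agree` engines.

    Returns None if no text reaches the required agreement. If several
    texts tie, the one encountered first is returned.
    """
    if not parsed_texts:
        return None
    text, count = Counter(parsed_texts).most_common(1)[0]
    return text if count >= min_agree else None
```

With two engines, `min_agree=2` corresponds to the case where both results must be identical; with three or more engines, the same default corresponds to "at least two of them agree".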
  • After the parsed text of the first voice information is obtained, the electronic device further obtains, from the parsed text, the plurality of first to-be-executed instructions included in the first voice information.
  • The electronic device stores a plurality of instruction keywords in advance, and each instruction keyword corresponds to one instruction.
  • The electronic device first performs word segmentation on the parsed text to obtain the word sequence corresponding to the parsed text; the word sequence includes multiple words.
  • After obtaining the word sequence, the electronic device matches the instruction keywords against it, that is, finds the instruction keywords that occur in the word sequence, thereby obtaining the instructions corresponding to those keywords, and uses these instructions as the plurality of first to-be-executed instructions included in the first voice information.
  • The matching of instruction keywords includes exact matching and/or fuzzy matching.
  • For example, the electronic device uses the local speech parsing engine to parse the first voice information "Please help me download and install the XX application" in audio format and obtains the parsed text "Please help me download and install the XX application". Word segmentation of the parsed text yields the word sequence {Please, help me, download, and, install, XX application}. Matching the instruction keywords against the word sequence identifies the instruction keywords "download" and "install", which gives two first to-be-executed instructions: "download XX application" and "install XX application".
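The keyword-matching step can be sketched as below, under several stated assumptions. The word sequence is taken as already segmented; the keyword table `INSTRUCTION_KEYWORDS`, the `CONNECTIVES` set, and the rule that pairs each keyword with the next word that is neither a keyword nor a connective are hypothetical simplifications, since the text does not describe how the instruction's target is determined. Only exact (case-insensitive) matching is shown, not fuzzy matching.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical pre-stored table: instruction keyword -> instruction identifier.
INSTRUCTION_KEYWORDS: Dict[str, str] = {"download": "DOWNLOAD", "install": "INSTALL"}
# Hypothetical connecting words to skip when looking for a keyword's target.
CONNECTIVES = {"and", "then"}


def extract_instructions(words: Sequence[str]) -> List[Tuple[int, str, str]]:
    """Return (position, instruction id, target) for each keyword in the sequence."""
    found = []
    for i, word in enumerate(words):
        key = word.lower()
        if key not in INSTRUCTION_KEYWORDS:
            continue
        # Assumed pairing rule: the target is the next word after the keyword
        # that is neither another keyword nor a connective.
        target = next(
            (w for w in words[i + 1:]
             if w.lower() not in INSTRUCTION_KEYWORDS
             and w.lower() not in CONNECTIVES),
            "",
        )
        found.append((i, INSTRUCTION_KEYWORDS[key], target))
    return found


words = ["Please", "help me", "download", "and", "install", "XX application"]
instructions = extract_instructions(words)
```

Keeping the keyword's position in each result makes it possible to order the instructions later without scanning the word sequence again.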
  • a plurality of first to-be-executed instructions are sorted to obtain first sorting information.
  • After obtaining the plurality of first to-be-executed instructions included in the first voice information, the electronic device sorts them to obtain the first sorting information.
  • For example, the word sequence corresponding to the first voice information is {Please, help me, download, and, install, XX application}, and the two first to-be-executed instructions obtained are "download XX application" and "install XX application".
  • The instruction keyword corresponding to "download XX application" is "download", and the instruction keyword corresponding to "install XX application" is "install".
  • Sorting "download XX application" and "install XX application" yields the first sorting information: "download XX application", "install XX application", in which "download XX application" precedes "install XX application".
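The sorting criterion is not spelled out in this extract; the example suggests that instructions follow the order in which their keywords appear in the word sequence. The sketch below assumes exactly that, and the function name `sort_instructions` and the `keyword_of` mapping are illustrative.

```python
from typing import Dict, List, Sequence


def sort_instructions(words: Sequence[str], keyword_of: Dict[str, str]) -> List[str]:
    """Order instructions by where their keyword first occurs in the word sequence.

    Assumption: spoken order equals execution order. Raises ValueError if a
    keyword does not occur in the word sequence.
    """
    lowered = [w.lower() for w in words]
    return sorted(keyword_of, key=lambda instr: lowered.index(keyword_of[instr].lower()))


words = ["Please", "help me", "download", "and", "install", "XX application"]
# Deliberately listed out of order to show that sorting restores spoken order.
keyword_of = {
    "install XX application": "install",
    "download XX application": "download",
}
first_sorting_info = sort_instructions(words, keyword_of)
```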
  • a plurality of first to-be-executed instructions are sequentially executed according to the first sorting information.
  • When the electronic device has finished sorting the plurality of first to-be-executed instructions and obtained the first sorting information, it can sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
  • For example, the two first to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is: "download XX application", "install XX application".
  • The electronic device first executes "download XX application", downloading the installation package of the XX application from the Internet, and then executes "install XX application", installing the XX application from the downloaded installation package.
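Sequential execution according to the sorting information can be sketched with a dispatch table. The handlers here only record what they would do; the names `HANDLERS`, `download`, `install`, and `execute_in_order` are hypothetical, and real handlers would perform the actual download and installation.

```python
from typing import Callable, Dict, List

log: List[str] = []


def download(target: str) -> None:
    # Placeholder for fetching the installation package.
    log.append(f"downloaded installation package of {target}")


def install(target: str) -> None:
    # Placeholder for installing from the previously downloaded package.
    log.append(f"installed {target} from downloaded package")


# Hypothetical table: instruction keyword -> handler.
HANDLERS: Dict[str, Callable[[str], None]] = {"download": download, "install": install}


def execute_in_order(sorted_instructions: List[str]) -> None:
    """Run each instruction in the order given by the sorting information."""
    for instruction in sorted_instructions:
        keyword, _, target = instruction.partition(" ")
        HANDLERS[keyword](target)


execute_in_order(["download XX application", "install XX application"])
```

Because installation depends on the downloaded package, the order carried by the sorting information matters here, not just the set of instructions.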
  • As can be seen from the above, the electronic device receives the input first voice information, acquires the plurality of first to-be-executed instructions included in it, sorts them to obtain their first sorting information, and executes them sequentially according to that sorting information. Therefore, even if the voice information spoken by the user includes multiple instructions, those instructions can be executed one after another with none omitted, which improves the accuracy of voice control.
  • the instruction execution method may further include:
  • the plurality of second to-be-executed instructions are sequentially executed according to the second sorting information.
  • The electronic device continues to collect sound in the external environment through the audio collection module to obtain sound information in audio format. Noise reduction is then performed on the collected sound information to extract the human voice information it contains, and the human voice information extracted at this time is recorded as the input second voice information.
  • After receiving the input second voice information, the electronic device obtains the plurality of second to-be-executed instructions included in the second voice information and sorts them to obtain the second sorting information.
  • These operations may be implemented in the same way as the processing of the first voice information in the foregoing embodiment, and details are not described herein again.
  • The electronic device splices the obtained second to-be-executed instructions onto the tail of the foregoing first to-be-executed instructions, so that when execution of the plurality of first to-be-executed instructions is completed, the plurality of second to-be-executed instructions are sequentially executed according to the second sorting information.
  • For example, the two first to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is: "download XX application", "install XX application".
  • The second voice information is then received; the two second to-be-executed instructions it includes are "start XX application" and "play XX video in XX application", and sorting them yields the second sorting information: "start XX application", "play XX video in XX application".
  • In this way, the electronic device can splice instructions that were input in separate, non-continuous utterances and execute them continuously as the combined instruction sequence obtained by splicing, thereby improving the intelligence of voice interaction between the electronic device and the user.
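Splicing later instructions onto the tail of the pending ones maps naturally onto a first-in, first-out queue. The sketch below is one possible arrangement, not the application's implementation; the class `InstructionQueue` and its methods are hypothetical, and "executing" an instruction is reduced to recording it.

```python
from collections import deque
from typing import Deque, Iterable, List


class InstructionQueue:
    """FIFO queue; new sorted instructions are spliced onto the tail."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self.executed: List[str] = []

    def splice(self, sorted_instructions: Iterable[str]) -> None:
        # Append in sorted order so earlier utterances always run first.
        self._pending.extend(sorted_instructions)

    def run_next(self) -> bool:
        """Execute one pending instruction; return False when the queue is empty."""
        if not self._pending:
            return False
        self.executed.append(self._pending.popleft())
        return True


queue = InstructionQueue()
queue.splice(["download XX application", "install XX application"])
queue.run_next()  # the first instruction runs before the second utterance arrives
queue.splice(["start XX application", "play XX video in XX application"])
while queue.run_next():
    pass
```

The second utterance arrives while the first batch is still in progress, yet its instructions only run after both first instructions have finished.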
  • acquiring the plurality of first to-be-executed instructions included in the first voice information includes:
  • After receiving the input first voice information in audio format, the electronic device determines whether a speech parsing engine exists locally. If none exists, the electronic device sends the received first voice information to a server (a server that provides a speech parsing service), instructing the server to parse the first voice information and return the parsed text obtained.
  • After receiving the parsed text returned by the server, the electronic device can obtain the plurality of first to-be-executed instructions included in the first voice information from the parsed text. For how to obtain the plurality of first to-be-executed instructions from the parsed text, reference may be made to the related description in the foregoing embodiment, and details are not described herein again.
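The choice between local and server-side parsing can be sketched as a simple fallback. The function `parse_voice` and the `ParseFn` type are hypothetical; both parsers are passed in as callables because the application does not name any concrete engine or server interface, and the stubs below stand in for them.

```python
from typing import Callable, Optional

# Hypothetical signature: audio bytes in, parsed text out.
ParseFn = Callable[[bytes], str]


def parse_voice(audio: bytes,
                local_engine: Optional[ParseFn],
                server_parse: ParseFn) -> str:
    """Use the local speech parsing engine if one exists, otherwise the server."""
    if local_engine is not None:
        return local_engine(audio)
    # No local engine: send the audio to the parsing server and use its reply.
    return server_parse(audio)


# Stubs standing in for a real engine and a real network call.
def fake_local(audio: bytes) -> str:
    return "local: download and install XX application"


def fake_server(audio: bytes) -> str:
    return "server: download and install XX application"
```

Either way, the caller receives parsed text and can continue with keyword matching in the same manner.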
  • the method before acquiring the plurality of first to-be-executed instructions included in the first voice information, the method further includes:
  • This characteristic of the sound is the voiceprint feature.
  • The voiceprint feature is mainly determined by two factors. The first is the size of the acoustic cavities, specifically the throat, nasal cavity, and oral cavity. The shape, size, and position of these organs determine the tension of the vocal cords and the range of sound frequencies. Therefore, even when different people say the same thing, the frequency distributions of their voices differ: some voices sound low and others sound loud.
  • The second factor that determines the voiceprint feature is the manner in which the vocal organs are manipulated.
  • The vocal organs include the lips, teeth, tongue, soft palate, and diaphragm muscles, and their interaction produces clear speech. The way they cooperate is learned more or less randomly through interaction with the people around a person. In learning to speak, by imitating the speech of different people nearby, a person gradually forms their own voiceprint feature.
  • When receiving the input first voice information, the electronic device first obtains the voiceprint feature of the first voice information.
  • After acquiring the voiceprint feature of the first voice information, the electronic device compares it with a preset voiceprint feature to determine whether the two match.
  • The preset voiceprint feature may be a voiceprint feature entered in advance by the owner. Determining whether the voiceprint feature of the input voice information matches the preset voiceprint feature therefore amounts to determining whether the user currently inputting voice information is the owner.
  • When the voiceprint feature matches the preset voiceprint feature, the electronic device determines that the user currently inputting the first voice information is the owner, and then acquires the plurality of first to-be-executed instructions included in the first voice information and executes them.
  • For how to acquire and execute the plurality of first to-be-executed instructions, reference may be made to the related descriptions of the foregoing embodiments, and details are not described herein again.
  • To determine whether the voiceprint features match, the electronic device may obtain the similarity between the voiceprint feature (the one obtained from the received voice information) and the preset voiceprint feature, and determine whether that similarity is greater than or equal to a first preset similarity (set according to actual needs; for example, 95%). When the obtained similarity is greater than or equal to the first preset similarity, the voiceprint feature is determined to match the preset voiceprint feature; when the obtained similarity is less than the first preset similarity, the voiceprint feature is determined not to match the preset voiceprint feature.
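The threshold comparison can be sketched as follows. The application does not say how similarity is computed; this sketch assumes voiceprint features are fixed-length numeric vectors compared with cosine similarity, which is a common choice but an assumption here. Only the 95% threshold comes from the text.

```python
import math
from typing import Sequence

FIRST_PRESET_SIMILARITY = 0.95  # example value given in the text


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_preset(feature: Sequence[float],
                   preset: Sequence[float],
                   threshold: float = FIRST_PRESET_SIMILARITY) -> bool:
    """Match when the similarity is greater than or equal to the threshold."""
    return cosine_similarity(feature, preset) >= threshold
```

For instance, `matches_preset([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])` is a match, while two orthogonal vectors such as `[1.0, 0.0]` and `[0.0, 1.0]` are not.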
  • When the voiceprint feature does not match the preset voiceprint feature, the electronic device determines that the user currently inputting the voice information is not the owner, discards the received first voice information, and continues to receive input voice information. Once first voice information input by the owner is received, the electronic device acquires the plurality of first to-be-executed instructions included in it and executes them.
  • In other words, before responding to the input first voice information, the electronic device first identifies the user according to the voiceprint feature of the first voice information, and responds to the input only when the user who input it is the owner. This prevents the electronic device from performing operations the owner did not intend and improves the owner's experience.
  • the method further includes:
  • Because the voiceprint feature is closely related to the physiological characteristics of the human body, it can change in daily life. For example, if the user catches a cold, the voice becomes hoarse and the voiceprint feature changes accordingly. In this case, even if the voice information received by the electronic device is spoken by the owner, the electronic device cannot recognize it. There are many other situations in which the electronic device fails to identify the owner, which are not enumerated here.
  • After the electronic device completes the similarity judgment for the voiceprint feature, if the similarity between the voiceprint feature of the received voice information and the preset voiceprint feature is less than the first preset similarity, it further judges whether the similarity is greater than or equal to a second preset similarity. The second preset similarity is configured to be less than the first preset similarity, and a person skilled in the art can choose an appropriate value according to actual needs; for example, when the first preset similarity is set to 95%, the second preset similarity may be set to 75%.
  • When the judgment result is yes, that is, the similarity between the voiceprint feature of the acquired voice information and the preset voiceprint feature is less than the first preset similarity and greater than or equal to the second preset similarity, the electronic device further obtains its current position information.
  • The electronic device may use positioning technologies such as satellite positioning or base station positioning to obtain the current position information.
  • After acquiring the current position information, the electronic device determines from it whether the device is currently within a preset position range.
  • The preset position range can be configured as the position ranges the owner commonly frequents, such as home and company.
  • When the device is currently within the preset position range, the electronic device determines that the aforementioned voiceprint feature matches the preset voiceprint feature, and recognizes the user currently inputting the voice information as the owner.
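The two-threshold decision with a position check can be sketched as below. The function `is_owner` is hypothetical. The position check is passed as a callable so that it is only evaluated in the middle band, mirroring the order of steps in the text; treating "outside the preset range" as a non-match is an assumption, since this extract does not state that outcome.

```python
from typing import Callable

FIRST_PRESET_SIMILARITY = 0.95   # example value from the text
SECOND_PRESET_SIMILARITY = 0.75  # example value from the text


def is_owner(similarity: float,
             in_preset_range: Callable[[], bool],
             first: float = FIRST_PRESET_SIMILARITY,
             second: float = SECOND_PRESET_SIMILARITY) -> bool:
    """Decide whether the speaker is treated as the owner."""
    if similarity >= first:
        return True  # clear voiceprint match, no position check needed
    if similarity >= second:
        # Borderline match (for example a hoarse voice): fall back to position.
        # Assumption: outside the preset range counts as not matching.
        return in_preset_range()
    return False  # below the second threshold: not the owner
```

For example, a similarity of 0.80 is accepted when the position callable reports that the device is at home, and rejected otherwise.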
  • the instruction execution method may include:
  • the input first voice information is received.
  • a plurality of first to-be-executed instructions included in the first voice information are acquired.
  • the electronic device after receiving the first voice information of the input audio format, determines whether a voice parsing engine exists locally. If it exists, the electronic device inputs the first voice information to the local voice parsing engine for voice. Parse to get speech parsed text. Among them, the speech information is parsed, that is, the process of converting the speech information from "audio" to "text".
  • the electronic device may select a speech parsing engine from the multiple speech parsing engines to perform voice continuation on the received first speech information in the following manner:
  • the electronic device may randomly select a speech analysis engine from a plurality of local speech analysis engines and perform speech analysis on the received first speech information.
  • the electronic device may select a speech parsing engine with the highest parsing success rate from a plurality of speech parsing engines to perform speech parsing on the received first speech information.
  • the electronic device may select a speech parsing engine with the shortest parsing time from a plurality of speech parsing engines to perform speech parsing on the received first speech information.
  • the electronic device may also select a speech parsing engine that has a parsing success rate that reaches a preset success rate and has the shortest parsing time from multiple speech parsing engines to perform speech parsing on the first voice information.
  • an electronic device may Speech analysis engines perform speech analysis on the first speech information, and when the speech analysis texts obtained by the two speech analysis engines are the same, use the same speech analysis text as the speech analysis text of the first speech information; for example, an electronic device
  • the first speech information may be parsed by at least three speech analysis engines, and when the speech analysis text obtained by at least two of the speech analysis engines is the same, the same speech analysis text is used as the speech analysis text of the first speech information.
  • the electronic device After the voice parsed text of the first voice information is parsed, the electronic device further obtains a plurality of first to-be-executed instructions included in the first voice information from the voice parsed text.
  • the electronic device stores a plurality of instruction keywords in advance, and each instruction keyword corresponds to an instruction.
  • each instruction keyword corresponds to an instruction.
  • the electronic device first performs word segmentation on the parsed text to obtain the corresponding word sequence, which includes multiple words.
  • after obtaining the word sequence, the electronic device matches instruction keywords against it, that is, finds the instruction keywords that occur in the word sequence, and thereby obtains the instructions corresponding to those keywords.
  • these instructions are used as the plurality of first to-be-executed instructions included in the first voice information.
  • the matching search of the instruction keywords includes an exact match and / or a fuzzy match.
  • for example, the electronic device uses the local speech parsing engine to parse the first voice information "Please help me download and install XX application", which is in audio format, and obtains the parsed text "Please help me download and install XX application" in text format. Word segmentation of the parsed text yields the word sequence {Please, help me, download, and, install, XX application}. Matching instruction keywords against the word sequence identifies "download" and "install", which yields two first to-be-executed instructions, namely "download XX application" and "install XX application".
  • in an embodiment, when the electronic device determines that no speech parsing engine exists locally, it sends the received first voice information to a server (a server that provides a speech parsing service), instructing the server to parse the first voice information and return the resulting parsed text.
  • after receiving the parsed text returned by the server, the electronic device can obtain the plurality of first to-be-executed instructions included in the first voice information from that text. For how to do so, refer to the related description in the foregoing embodiment; details are not repeated here.
  • a plurality of first to-be-executed instructions are sorted to obtain first sorting information.
  • after obtaining the plurality of first to-be-executed instructions included in the first voice information, the electronic device sorts them according to the order in which their corresponding instruction keywords appear in the word sequence, and obtains the first sorting information.
  • for example, the word sequence corresponding to the first voice information is {Please, help me, download, and, install, XX application}, and the two first to-be-executed instructions obtained are "download XX application" and "install XX application". The instruction keyword of "download XX application" is "download", and that of "install XX application" is "install". Sorting the two instructions by the order of their keywords in the word sequence gives the first sorting information "download XX application", "install XX application", in which "download XX application" precedes "install XX application".
  • a plurality of first to-be-executed instructions are sequentially executed according to the first sorting information.
  • once the electronic device has sorted the plurality of first to-be-executed instructions and obtained the first sorting information, it can execute them in sequence according to that information.
  • for example, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application".
  • according to the first sorting information, the electronic device first executes "download XX application", downloading the installation package of XX application from the Internet, and then executes "install XX application", installing XX application from the downloaded package.
  • the input second voice information is received.
  • the electronic device continues to collect sounds in the external environment through the audio collection module to obtain sound information in audio format. After the sound information in the external environment is collected, noise reduction processing is performed on the collected sound information to extract the human voice information in the sound information, and the human voice information extracted at this time is recorded as the input second voice information.
  • in 206, a plurality of second to-be-executed instructions included in the second voice information are acquired.
  • a plurality of second to-be-executed instructions are sorted to obtain second sorting information.
  • after receiving the input second voice information, the electronic device obtains the plurality of second to-be-executed instructions included in it and sorts them to obtain the second sorting information. This can be implemented in the same way as the processing of the first voice information in the foregoing embodiment, and details are not repeated here.
  • the plurality of second to-be-executed instructions are sequentially executed according to the second sort information.
  • after obtaining the second sorting information, the electronic device splices the obtained second to-be-executed instructions onto the tail of the first to-be-executed instructions, so that when execution of the first to-be-executed instructions is complete, the second to-be-executed instructions are executed in sequence according to the second sorting information.
  • for example, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application".
  • while these two first to-be-executed instructions are being executed, the second voice information is received, and the two second to-be-executed instructions it includes are obtained, namely "start XX application" and "play XX video in XX application". Sorting these two instructions gives the second sorting information "start XX application", "play XX video in XX application".
  • an instruction execution device is also provided.
  • FIG. 6 is a schematic structural diagram of an instruction execution apparatus 400 according to an embodiment of the present application.
  • the instruction execution device is applied to an electronic device.
  • the instruction execution device includes a receiving module 401, an obtaining module 402, a sorting module 403, and an executing module 404, as follows:
  • the receiving module 401 is configured to receive input first voice information.
  • the obtaining module 402 is configured to obtain a plurality of first to-be-executed instructions included in the first voice information.
  • a sorting module 403 is configured to sort a plurality of first instructions to be executed to obtain first sorting information.
  • the execution module 404 is configured to sequentially execute a plurality of first instructions to be executed according to the first sorting information.
  • the receiving module 401 may be further configured to receive the input second voice information during the execution of the plurality of first to-be-executed instructions by the execution module 404.
  • the acquiring module 402 may be further configured to acquire a plurality of second to-be-executed instructions included in the second voice information.
  • the sorting module 403 may be further configured to sort a plurality of second to-be-executed instructions to obtain second sorting information.
  • the execution module 404 may also be configured to sequentially execute the plurality of second to-be-executed instructions according to the second sorting information when the execution of the plurality of first-to-be-executed instructions is completed.
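  • the obtaining module 402 may be further configured to: send the first voice information to a server, instructing the server to parse the first voice information and return the resulting parsed text; receive the parsed text returned by the server; and obtain the plurality of first to-be-executed instructions included in the first voice information from the received parsed text.
  • the obtaining module 402 may be further configured to: obtain a voiceprint feature of the first voice information; determine whether the obtained voiceprint feature matches a preset voiceprint feature; and, when it does, obtain the plurality of first to-be-executed instructions included in the first voice information.
  • the obtaining module 402 may be further configured to: obtain the similarity between the voiceprint feature and the preset voiceprint feature; determine whether the similarity is greater than or equal to a first preset similarity; and, when it is, determine that the voiceprint feature matches the preset voiceprint feature.
  • the obtaining module 402 may be further configured to: when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtain current location information; determine from the location information whether the device is currently within a preset location range; and, when it is, determine that the voiceprint feature matches the preset voiceprint feature.
  • after determining whether the voiceprint feature matches the preset voiceprint feature, the obtaining module 402 is further configured to: discard the first voice information if the voiceprint feature does not match the preset voiceprint feature.
  • when receiving the input first voice information, the receiving module 401 is configured to: collect sound information in the external environment, perform noise reduction on the sound information, and extract the human voice information in it as the first voice information.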
  • the instruction execution device 400 may be integrated in an electronic device, such as a mobile phone, a tablet computer, or the like.
  • the above modules can be implemented as independent entities, or can be arbitrarily combined, and implemented as the same or several entities.
  • the specific implementation of the above units can refer to the previous embodiments, and will not be repeated here.
  • the instruction execution apparatus of this embodiment may receive the input first voice information by the receiving module 401.
  • the acquiring module 402 acquires a plurality of first to-be-executed instructions included in the first voice information.
  • the sorting module 403 sorts a plurality of first instructions to be executed to obtain first sorting information of the plurality of first instructions to be executed.
  • the execution module 404 sequentially executes a plurality of first instructions to be executed according to the first sorting information. Therefore, even if the voice information spoken by the user includes multiple instructions, the multiple instructions in the voice information can be sequentially executed, ensuring that no instructions are omitted, and the purpose of improving the accuracy of voice control is achieved.
  • an electronic device is also provided.
  • the electronic device 500 includes a processor 501 and a memory 502.
  • the processor 501 is electrically connected to the memory 502.
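  • the processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading computer programs stored in the memory 502 and calling data stored in the memory 502.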
  • the memory 502 may be configured to store software programs and modules.
  • the processor 501 executes various functional applications and data processing by running computer programs and modules stored in the memory 502.
  • the memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, the computer programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created through use of the electronic device, and the like.
  • the memory 502 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Accordingly, the memory 502 may further include a memory controller to provide the processor 501 with access to the memory 502.
  • the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502, and runs the computer programs stored in the memory 502 to implement various functions, as follows: receive input first voice information; obtain a plurality of first to-be-executed instructions included in the first voice information; sort the plurality of first to-be-executed instructions to obtain first sorting information; and sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
  • the electronic device 500 may further include a display 503, a radio frequency circuit 504, an audio circuit 505, and a power source 506.
  • the display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are electrically connected to the processor 501, respectively.
  • the display 503 may be used to display information input by the user or information provided to the user and various graphical user interfaces. These graphical user interfaces may be composed of graphics, text, icons, videos, and any combination thereof.
  • the display 503 may include a display panel.
  • the display panel may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), or an organic light emitting diode (Organic Light-Emitting Diode, OLED).
  • the radio frequency circuit 504 may be used to transmit and receive radio frequency signals to establish wireless communication with a network device or other electronic device through wireless communication, and transmit and receive signals to and from the network device or other electronic device.
  • the audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker or a microphone.
  • the power source 506 may be used to power various components of the electronic device 500.
  • the power supply 506 may be logically connected to the processor 501 through a power management system, so as to implement functions such as management of charging, discharging, and power consumption management through the power management system.
  • the electronic device 500 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
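  • the processor 501 may execute: while the plurality of first to-be-executed instructions are being executed, receive input second voice information; obtain a plurality of second to-be-executed instructions included in the second voice information; sort them to obtain second sorting information; and, when execution of the plurality of first to-be-executed instructions is complete, sequentially execute the plurality of second to-be-executed instructions according to the second sorting information.
  • when obtaining the plurality of first to-be-executed instructions included in the first voice information, the processor 501 may execute: send the first voice information to a server, instructing the server to parse it and return the resulting parsed text; receive the parsed text returned by the server; and obtain the plurality of first to-be-executed instructions from the received parsed text.
  • before obtaining the plurality of first to-be-executed instructions, the processor 501 may execute: obtain a voiceprint feature of the first voice information; determine whether the obtained voiceprint feature matches a preset voiceprint feature; and, when it does, obtain the plurality of first to-be-executed instructions included in the first voice information.
  • when determining whether the voiceprint feature matches the preset voiceprint feature, the processor 501 may further execute: obtain the similarity between the two; determine whether the similarity is greater than or equal to a first preset similarity; and, when it is, determine that the voiceprint feature matches the preset voiceprint feature.
  • after determining whether the similarity is greater than or equal to the first preset similarity, the processor 501 may further execute: when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtain current location information; determine from it whether the device is currently within a preset location range; and, when it is, determine that the voiceprint feature matches the preset voiceprint feature.
  • after determining whether the voiceprint feature matches the preset voiceprint feature, the processor 501 may further execute: discard the first voice information if the voiceprint feature does not match the preset voiceprint feature.
  • when receiving the input first voice information, the processor 501 may execute: collect sound information in the external environment, perform noise reduction on the sound information, and extract the human voice information in it as the first voice information.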
  • An embodiment of the present application further provides a storage medium that stores a computer program. When the computer program runs on a computer, the computer is caused to execute the instruction execution method in any of the foregoing embodiments, for example: receive input first voice information; obtain a plurality of first to-be-executed instructions included in the first voice information; sort the plurality of first to-be-executed instructions to obtain first sorting information; and sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
  • it should be noted that, for the instruction execution method of the embodiments of the present application, a person of ordinary skill in the art can understand that all or part of its flow can be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in that device; its execution may include the flow of the embodiments of the instruction execution method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
  • for the instruction execution apparatus of the embodiments of the present application, its functional modules may be integrated in one processing chip, each module may exist separately as a physical unit, or two or more modules may be integrated in one module.
  • the integrated module can be implemented in the form of hardware or of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.


Abstract

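An instruction execution method, apparatus, storage medium, and electronic device, in which the electronic device receives input first voice information (101); obtains a plurality of first to-be-executed instructions included in the first voice information (102); sorts the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions (103); and sequentially executes the plurality of first to-be-executed instructions according to the first sorting information (104).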

Description

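Instruction execution method, apparatus, storage medium, and electronic device
This application claims priority to Chinese patent application No. 201810542932.7, entitled "Instruction execution method, apparatus, storage medium, and electronic device" and filed with the Chinese Patent Office on May 30, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of electronic devices, and in particular to an instruction execution method, apparatus, storage medium, and electronic device.
Background
At present, electronic devices can perform specific operations in response to voice instructions. For example, when a user says "play music", the electronic device recognizes "play music" as a music playback instruction and executes it to play music.
Summary
In a first aspect, an embodiment of this application provides an instruction execution method, including:
receiving input first voice information;
obtaining a plurality of first to-be-executed instructions included in the first voice information;
sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions;
sequentially executing the plurality of first to-be-executed instructions according to the first sorting information.
In a second aspect, an embodiment of this application provides an instruction execution apparatus, including:
a receiving module, configured to receive input first voice information;
an obtaining module, configured to obtain a plurality of first to-be-executed instructions included in the first voice information;
a sorting module, configured to sort the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions;
an execution module, configured to sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
In a third aspect, an embodiment of this application provides a storage medium storing a computer program which, when run on a computer, causes the computer to perform:
receiving input first voice information;
obtaining a plurality of first to-be-executed instructions included in the first voice information;
sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions;
sequentially executing the plurality of first to-be-executed instructions according to the first sorting information.
In a fourth aspect, an embodiment of this application provides an electronic device including a processor and a memory, the memory storing a computer program, and the processor being configured, by invoking the computer program, to perform:
receiving input first voice information;
obtaining a plurality of first to-be-executed instructions included in the first voice information;
sorting the plurality of first to-be-executed instructions to obtain first sorting information of the plurality of first to-be-executed instructions;
sequentially executing the plurality of first to-be-executed instructions according to the first sorting information.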
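Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the instruction execution method provided by an embodiment of this application.
FIG. 2 is a schematic diagram of the operation of obtaining the plurality of first to-be-executed instructions included in the first voice information in an embodiment of this application.
FIG. 3 is a schematic diagram of executing the plurality of first to-be-executed instructions in an embodiment of this application.
FIG. 4 is a schematic diagram of splicing and executing the plurality of first to-be-executed instructions and the plurality of second to-be-executed instructions in an embodiment of this application.
FIG. 5 is another schematic flowchart of the instruction execution method provided by an embodiment of this application.
FIG. 6 is a schematic structural diagram of the instruction execution apparatus provided by an embodiment of this application.
FIG. 7 is a schematic structural diagram of the electronic device provided by an embodiment of this application.
FIG. 8 is another schematic structural diagram of the electronic device provided by an embodiment of this application.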
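Detailed Description
Referring to the drawings, in which like reference symbols denote like components, the principles of this application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of this application and should not be regarded as limiting other specific embodiments not detailed here.
In the following description, specific embodiments of this application are described with reference to steps and symbols executed by one or more computers, unless otherwise stated. These steps and operations are therefore referred to several times as being computer-executed, which here includes operations by a computer processing unit on electronic signals that represent data in a structured form. Such an operation transforms the data or maintains it at a location in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. Although the principles of this application are described in these terms, this is not meant as a limitation, and those skilled in the art will appreciate that the steps and operations described below may also be implemented in hardware.
The term "module" as used herein may be regarded as a software object executed on the computing system. The different components, modules, engines, and services described herein may be regarded as objects implemented on the computing system. The apparatus and method described herein may be implemented in software and may also be implemented in hardware, both of which fall within the protection scope of this application.
The terms "first", "second", "third", and the like in this application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or modules is not limited to the listed steps or modules; some embodiments further include steps or modules that are not listed, or other steps or modules inherent to the process, method, product, or device.
A reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. Occurrences of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
An embodiment of this application provides an instruction execution method. The method may be performed by the instruction execution apparatus provided by the embodiments of this application, or by an electronic device that integrates that apparatus, where the apparatus may be implemented in hardware or software. The electronic device may be a smartphone, a tablet computer, a palmtop computer, a notebook computer, a desktop computer, or a similar device.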
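An embodiment of this application provides an instruction execution method, including:
receiving input first voice information;
obtaining a plurality of first to-be-executed instructions included in the first voice information;
sorting the plurality of first to-be-executed instructions to obtain first sorting information;
sequentially executing the plurality of first to-be-executed instructions according to the first sorting information.
In an embodiment, the instruction execution method further includes:
receiving input second voice information while the plurality of first to-be-executed instructions are being executed;
obtaining a plurality of second to-be-executed instructions included in the second voice information;
sorting the plurality of second to-be-executed instructions to obtain second sorting information;
when execution of the plurality of first to-be-executed instructions is complete, sequentially executing the plurality of second to-be-executed instructions according to the second sorting information.
In an embodiment, obtaining the plurality of first to-be-executed instructions included in the first voice information includes:
sending the first voice information to a server, instructing the server to parse the first voice information and return the parsed text obtained by parsing it;
receiving the parsed text returned by the server;
obtaining the plurality of first to-be-executed instructions from the parsed text.
In an embodiment, before obtaining the plurality of first to-be-executed instructions included in the first voice information, the method further includes:
obtaining a voiceprint feature of the first voice information;
determining whether the voiceprint feature matches a preset voiceprint feature;
when the voiceprint feature matches the preset voiceprint feature, obtaining the plurality of first to-be-executed instructions included in the first voice information.
In an embodiment, determining whether the voiceprint feature matches the preset voiceprint feature includes:
obtaining the similarity between the voiceprint feature and the preset voiceprint feature;
determining whether the similarity is greater than or equal to a first preset similarity;
when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.
In an embodiment, after determining whether the similarity is greater than or equal to the first preset similarity, the method further includes:
when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtaining current location information;
determining from the location information whether the device is currently within a preset location range;
when the device is currently within the preset location range, determining that the voiceprint feature matches the preset voiceprint feature.
In an embodiment, after determining whether the voiceprint feature matches the preset voiceprint feature, the method further includes:
if the voiceprint feature does not match the preset voiceprint feature, discarding the first voice information.
In an embodiment, receiving the input first voice information includes:
collecting sound information in the external environment, performing noise reduction on the sound information, and extracting the human voice information in the sound information as the first voice information.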
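Refer to FIG. 1, which is a schematic flowchart of the instruction execution method provided by an embodiment of this application. As shown in FIG. 1, the flow of the method may be as follows:
In 101, input first voice information is received.
In this embodiment, the electronic device may collect sound from the external environment through an audio collection module to obtain sound information in audio format. After collecting the sound information, it performs noise reduction on it, extracts the human voice information, and records that human voice information as the input first voice information.
The audio collection module may be a microphone built into the electronic device or an external microphone connected to it. This application places no specific limit on this, and the electronic device may choose according to a configured selection rule. For example, the selection rule may be configured as follows: if an external microphone is connected, sound in the external environment is collected through the external microphone; if no external microphone is connected, it is collected through the built-in microphone.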
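The example selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Microphone` type and the `select_microphone` function are hypothetical names, not part of the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Microphone:
    """Hypothetical description of an audio collection module."""
    name: str
    external: bool


def select_microphone(builtin: Microphone,
                      external: Optional[Microphone]) -> Microphone:
    """Prefer a connected external microphone; otherwise use the built-in one."""
    return external if external is not None else builtin
```

Other selection rules could be substituted without changing the rest of the flow, since step 101 only needs some audio collection module to be chosen.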
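For example, when a user wants to use a voice instruction to have the electronic device download XX application and install it, the user can say "Please help me download and install XX application". The electronic device then collects, through the built-in microphone, sound information that contains the human voice information "Please help me download and install XX application" together with ambient noise. It performs noise reduction on the collected sound information, removes the ambient noise, extracts the human voice information "Please help me download and install XX application", and uses it as the input first voice information.
In 102, a plurality of first to-be-executed instructions included in the first voice information are obtained.
In this embodiment, after receiving the input first voice information in audio format, the electronic device determines whether a speech parsing engine exists locally. If one exists, the electronic device feeds the first voice information to the local speech parsing engine for parsing and obtains the parsed text. Parsing voice information means converting it from "audio" to "text".
In addition, when several speech parsing engines exist locally, the electronic device may select one of them to parse the received first voice information in any of the following ways:
First, the electronic device may randomly select one of the local speech parsing engines to parse the received first voice information.
Second, the electronic device may select the speech parsing engine with the highest parsing success rate to parse the received first voice information.
Third, the electronic device may select the speech parsing engine with the shortest parsing time to parse the received first voice information.
Fourth, the electronic device may select, from the engines whose parsing success rate reaches a preset success rate, the one with the shortest parsing time to parse the first voice information.
It should be noted that those skilled in the art may also select a speech parsing engine in ways not listed above, or may combine several speech parsing engines to parse the first voice information. For example, the electronic device may parse the first voice information with two speech parsing engines at the same time and, when the two engines produce the same parsed text, use that text as the parsed text of the first voice information. As another example, the electronic device may parse the first voice information with at least three speech parsing engines and, when at least two of them produce the same parsed text, use that text as the parsed text of the first voice information.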
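The four selection strategies and the agreement-based combination described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Engine` record, its `success_rate` and `avg_seconds` fields, and all function names are assumptions introduced for the example.

```python
import random
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Engine:
    """Hypothetical local speech parsing engine with tracked statistics."""
    name: str
    success_rate: float              # fraction of past parses that succeeded
    avg_seconds: float               # average parsing time
    parse: Callable[[bytes], str]    # audio in, parsed text out


def pick_random(engines: List[Engine]) -> Engine:
    return random.choice(engines)


def pick_most_reliable(engines: List[Engine]) -> Engine:
    return max(engines, key=lambda e: e.success_rate)


def pick_fastest(engines: List[Engine]) -> Engine:
    return min(engines, key=lambda e: e.avg_seconds)


def pick_fastest_reliable(engines: List[Engine],
                          min_rate: float) -> Optional[Engine]:
    """Fastest engine among those whose success rate reaches min_rate."""
    eligible = [e for e in engines if e.success_rate >= min_rate]
    return pick_fastest(eligible) if eligible else None


def parse_by_agreement(engines: List[Engine], audio: bytes,
                       min_votes: int = 2) -> Optional[str]:
    """Accept a parsed text only if at least min_votes engines produced it."""
    texts = [e.parse(audio) for e in engines]
    if not texts:
        return None
    text, votes = Counter(texts).most_common(1)[0]
    return text if votes >= min_votes else None
```

With two engines and `min_votes=2`, `parse_by_agreement` requires both to agree; with three or more it accepts any text that at least two engines share. The text does not say what happens when no text reaches agreement or when two different texts tie, so returning `None` in the first case and the first-seen text in the second is a choice made only for this sketch.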
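After the parsed text of the first voice information is obtained, the electronic device further obtains from it the plurality of first to-be-executed instructions included in the first voice information.
The electronic device stores a plurality of instruction keywords in advance, and each instruction keyword corresponds to one instruction. To obtain the plurality of first to-be-executed instructions from the parsed text, the electronic device first performs word segmentation on the parsed text to obtain the corresponding word sequence, which includes multiple words.
After obtaining the word sequence, the electronic device matches instruction keywords against it, that is, finds the instruction keywords that occur in the word sequence, obtains the instructions corresponding to those keywords, and uses them as the plurality of first to-be-executed instructions included in the first voice information. The keyword matching includes exact matching and/or fuzzy matching.
For example, referring to FIG. 2, the electronic device uses the local speech parsing engine to parse the first voice information "Please help me download and install XX application", which is in audio format, and obtains the parsed text "Please help me download and install XX application" in text format. Word segmentation of the parsed text yields the word sequence {Please, help me, download, and, install, XX application}. Matching instruction keywords against the word sequence identifies the instruction keywords "download" and "install", which yields two first to-be-executed instructions, namely "download XX application" and "install XX application".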
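The keyword-matching step in this example can be sketched as below, starting from an already segmented word sequence (the segmenter itself is out of scope here). Several details are assumptions made for illustration: the keyword table, the use of `difflib` similarity for fuzzy matching, the `cutoff` value, and the rule that the words after the last keyword form the shared object of every instruction. The application does not specify any of these.

```python
from difflib import get_close_matches
from typing import Dict, List, Tuple

# Hypothetical pre-stored keyword table: keyword -> instruction identifier.
KEYWORDS: Dict[str, str] = {
    "download": "DOWNLOAD",
    "install": "INSTALL",
    "start": "START",
}


def match_keywords(words: List[str],
                   keywords: Dict[str, str] = KEYWORDS,
                   cutoff: float = 0.8) -> List[Tuple[int, str]]:
    """Return (position, keyword) pairs found in the word sequence.

    Exact matches are tried first; otherwise a fuzzy match is accepted
    when its similarity ratio reaches `cutoff`.
    """
    hits = []
    for i, word in enumerate(words):
        key = word.lower()
        if key in keywords:
            hits.append((i, key))
            continue
        close = get_close_matches(key, list(keywords), n=1, cutoff=cutoff)
        if close:
            hits.append((i, close[0]))
    return hits


def extract_instructions(words: List[str]) -> List[str]:
    """Build one instruction per keyword, sharing the trailing object words."""
    hits = match_keywords(words)
    if not hits:
        return []
    last_pos = hits[-1][0]
    target = " ".join(words[last_pos + 1:])   # assumed shared object
    return [f"{kw} {target}".strip() for _, kw in hits]
```

For the word sequence in the example, `extract_instructions` returns the two instructions "download XX application" and "install XX application". Keeping the keyword positions in `match_keywords` is convenient because the same positions can later drive the ordering of the instructions.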
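In 103, the plurality of first to-be-executed instructions are sorted to obtain first sorting information.
In this embodiment, after obtaining the plurality of first to-be-executed instructions included in the first voice information, the electronic device sorts them according to the order in which their corresponding instruction keywords appear in the word sequence, and obtains the first sorting information.
For example, the word sequence corresponding to the first voice information is {Please, help me, download, and, install, XX application}, and the two first to-be-executed instructions obtained are "download XX application" and "install XX application". The instruction keyword of "download XX application" is "download", and that of "install XX application" is "install". Sorting "download XX application" and "install XX application" by the order of the two keywords in the word sequence gives the first sorting information "download XX application", "install XX application", in which "download XX application" precedes "install XX application".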
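A minimal sketch of this ordering rule: each instruction is paired with its keyword, and the instructions are sorted by where that keyword first appears in the word sequence. The function name and the pair representation are hypothetical; placing instructions whose keyword is absent at the end is an assumption of the sketch.

```python
from typing import Dict, List, Tuple


def sort_by_keyword_order(words: List[str],
                          instructions: List[Tuple[str, str]]) -> List[str]:
    """Sort (keyword, instruction) pairs by keyword position in `words`."""
    position: Dict[str, int] = {}
    for i, word in enumerate(words):
        position.setdefault(word, i)          # keep the first occurrence
    ordered = sorted(instructions,
                     key=lambda pair: position.get(pair[0], len(words)))
    return [instruction for _, instruction in ordered]
```

Because Python's `sorted` is stable, instructions whose keywords share a position (or are both missing) keep their original relative order.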
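In 104, the plurality of first to-be-executed instructions are sequentially executed according to the first sorting information.
In this embodiment, once the electronic device has sorted the plurality of first to-be-executed instructions and obtained the first sorting information, it can execute them in sequence according to that information.
For example, referring to FIG. 3, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application". According to the first sorting information, the electronic device first executes "download XX application", downloading the installation package of XX application from the Internet, and then executes "install XX application", installing XX application from the downloaded package.
As can be seen from the above, in this embodiment the electronic device can receive input first voice information, obtain the plurality of first to-be-executed instructions included in it, sort them to obtain the first sorting information of the plurality of first to-be-executed instructions, and execute them in sequence according to the first sorting information. Thus, even if the voice information spoken by the user contains several instructions, the instructions can be executed one after another without any being omitted, which improves the accuracy of voice control.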
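In an embodiment, the instruction execution method may further include:
receiving input second voice information while the plurality of first to-be-executed instructions are being executed;
obtaining a plurality of second to-be-executed instructions included in the second voice information;
sorting the plurality of second to-be-executed instructions to obtain second sorting information;
when execution of the plurality of first to-be-executed instructions is complete, sequentially executing the plurality of second to-be-executed instructions according to the second sorting information.
In this embodiment, while executing the plurality of first to-be-executed instructions, the electronic device continues to collect sound from the external environment through the audio collection module and obtains sound information in audio format. After collecting the sound information, it performs noise reduction on it, extracts the human voice information, and records the human voice information extracted at this point as the input second voice information.
After receiving the input second voice information, the electronic device obtains the plurality of second to-be-executed instructions included in it and sorts them to obtain the second sorting information. This can be implemented in the same way as the processing of the first voice information in the foregoing embodiment, and details are not repeated here.
After obtaining the second sorting information, the electronic device splices the obtained second to-be-executed instructions onto the tail of the first to-be-executed instructions, so that when execution of the first to-be-executed instructions is complete, the second to-be-executed instructions are executed in sequence according to the second sorting information.
For example, referring to FIG. 4 together with FIG. 3, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application". While these two first to-be-executed instructions are being executed, the second voice information is received, and the two second to-be-executed instructions it includes are obtained, namely "start XX application" and "play XX video in XX application". These two instructions are then sorted, giving the second sorting information "start XX application", "play XX video in XX application". After the two first to-be-executed instructions have been executed, the device executes "start XX application" to start the installed XX application, and then executes "play XX video in XX application" to play XX video through XX application.
In this way, the electronic device can splice together instructions that were input as non-continuous speech and execute the resulting combination of instructions continuously, which makes voice interaction between the electronic device and the user more intelligent.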
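The splicing behaviour can be sketched with a simple first-in, first-out queue: a sorted second batch is appended to the tail while the first batch is still being drained. The class and method names are hypothetical, and the single-threaded `on_step` hook merely simulates voice input arriving mid-run; a real device would presumably receive audio concurrently.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional


class InstructionQueue:
    """Hypothetical FIFO of to-be-executed instructions."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self.executed: List[str] = []

    def append_batch(self, sorted_instructions: Iterable[str]) -> None:
        """Splice an already sorted batch onto the tail of the queue."""
        self._pending.extend(sorted_instructions)

    def run(self, execute: Callable[[str], None],
            on_step: Optional[Callable[[str], None]] = None) -> None:
        """Execute instructions in order until the queue is empty."""
        while self._pending:
            instruction = self._pending.popleft()
            execute(instruction)
            self.executed.append(instruction)
            if on_step is not None:
                on_step(instruction)   # new batches may be appended here
```

A usage example matching the scenario above: start the queue with the download and install instructions, and append the start and play instructions from `on_step` after the download finishes; the queue then executes all four in the order download, install, start, play.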
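In an embodiment, obtaining the plurality of first to-be-executed instructions included in the first voice information includes:
sending the first voice information to a server, instructing the server to parse the first voice information and return the parsed text obtained by parsing it;
receiving the parsed text returned by the server;
obtaining the plurality of first to-be-executed instructions included in the first voice information from the received parsed text.
After receiving the input first voice information in audio format, the electronic device determines whether a speech parsing engine exists locally. If none exists, it sends the received first voice information to a server (a server that provides a speech parsing service), instructing the server to parse the first voice information and return the resulting parsed text.
After receiving the parsed text returned by the server, the electronic device can obtain the plurality of first to-be-executed instructions included in the first voice information from that text. For how to do so, refer to the related description in the foregoing embodiment; details are not repeated here.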
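The local-or-server decision can be sketched as a small dispatch function. Both callables are placeholders introduced for illustration: `local_engine` stands for a local speech parsing engine and `send_to_server` for whatever transport reaches the parsing service; the application does not define either interface.

```python
from typing import Callable, Optional


def parse_speech(audio: bytes,
                 local_engine: Optional[Callable[[bytes], str]] = None,
                 send_to_server: Optional[Callable[[bytes], str]] = None) -> str:
    """Parse locally when an engine exists; otherwise ask the server."""
    if local_engine is not None:
        return local_engine(audio)
    if send_to_server is None:
        raise RuntimeError("no local engine and no server available")
    return send_to_server(audio)
```

Either path returns parsed text, so the keyword matching that follows does not need to know where the text came from.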
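In an embodiment, before obtaining the plurality of first to-be-executed instructions included in the first voice information, the method further includes:
obtaining a voiceprint feature of the first voice information;
determining whether the obtained voiceprint feature matches a preset voiceprint feature;
when the obtained voiceprint feature matches the preset voiceprint feature, obtaining the plurality of first to-be-executed instructions included in the first voice information.
In real life, every person's voice has its own characteristics, and people who know each other can recognize one another by voice alone.
This characteristic of a voice is its voiceprint feature, which is determined mainly by two factors. The first is the size of the vocal cavities, including the throat, nasal cavity, and oral cavity; the shape, size, and position of these organs determine the tension of the vocal cords and the range of voice frequencies. So even when different people say the same words, the frequency distribution of their voices differs, and some voices sound deep while others sound resonant.
The second factor is the way the vocal organs are manipulated. The vocal organs include the lips, teeth, tongue, soft palate, and palatal muscles, and their interaction produces clear speech. The way they cooperate is picked up incidentally through a person's communication with the people around them. In learning to speak, a person imitates the speech of different people nearby and gradually forms their own voiceprint feature.
In this embodiment, on receiving the input first voice information, the electronic device first obtains the voiceprint feature of the first voice information.
After obtaining the voiceprint feature of the first voice information, the electronic device compares it with the preset voiceprint feature to determine whether the two match. The preset voiceprint feature may be a voiceprint feature recorded in advance by the owner of the device; determining whether the voiceprint feature of the input voice information matches it amounts to determining whether the user currently inputting voice information is the owner.
When the obtained voiceprint feature matches the preset voiceprint feature, the electronic device determines that the user currently inputting the first voice information is the owner, obtains the plurality of first to-be-executed instructions included in the first voice information, and executes them. For details, refer to the related description in the foregoing embodiment; they are not repeated here.
When determining whether the obtained voiceprint feature matches the preset voiceprint feature, the electronic device may obtain the similarity between the voiceprint feature (obtained from the received voice information) and the preset voiceprint feature, and determine whether the similarity is greater than or equal to a first preset similarity (set according to actual needs, for example 95%). When the similarity is greater than or equal to the first preset similarity, the obtained voiceprint feature is determined to match the preset voiceprint feature; when the similarity is less than the first preset similarity, the obtained voiceprint feature is determined not to match.
In addition, when the obtained voiceprint feature does not match the preset voiceprint feature, the electronic device determines that the user currently inputting voice information is not the owner, discards the received first voice information, and continues to receive input first voice information. When first voice information input by the owner is received, the device obtains the plurality of first to-be-executed instructions included in it and executes them. For details, refer to the related description in the foregoing embodiment; they are not repeated here.
In this embodiment, before responding to the input first voice information, the device first identifies the user from the voiceprint feature of that information, and responds only when the user who input it is the owner. This prevents the electronic device from performing operations the owner did not intend and improves the owner's experience.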
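The threshold check can be sketched as below. The application does not say how voiceprint features are represented or how their similarity is computed; this sketch assumes, purely for illustration, that a feature is a numeric vector and that similarity is cosine similarity. The function names are hypothetical, and the default threshold follows the 95% example in the text.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matches_preset(feature: Sequence[float], preset: Sequence[float],
                   first_threshold: float = 0.95) -> bool:
    """True when the similarity reaches the first preset similarity."""
    return cosine_similarity(feature, preset) >= first_threshold
```

A caller would discard the first voice information when `matches_preset` returns `False` and continue listening, as described above.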
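In an embodiment, after determining whether the obtained similarity is greater than or equal to the first preset similarity, the method further includes:
when the obtained similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtaining current location information;
determining from the location information whether the device is currently within a preset location range;
when the device is currently within the preset location range, determining that the obtained voiceprint feature matches the preset voiceprint feature.
It should be noted that voiceprint features are closely tied to a person's physiology. In daily life, if the user has a cold and an inflamed throat, the voice becomes hoarse and the voiceprint feature changes with it. In that case the electronic device cannot recognize the owner even though the voice information was spoken by the owner. There are various other situations in which the electronic device fails to recognize the owner, which are not listed here.
To handle such cases, in this embodiment, after the similarity check, if the similarity between the voiceprint feature of the received voice information and the preset voiceprint feature is less than the first preset similarity, the electronic device further determines whether that similarity is greater than or equal to a second preset similarity. The second preset similarity is configured to be less than the first preset similarity, and those skilled in the art can choose a suitable value according to actual needs; for example, when the first preset similarity is set to 95%, the second preset similarity may be set to 75%.
If so, that is, when the similarity between the voiceprint feature of the received voice information and the preset voiceprint feature is less than the first preset similarity and greater than or equal to the second preset similarity, the electronic device further obtains the current location information. It may do so using different positioning technologies, such as satellite positioning or base-station positioning.
After obtaining the current location information, the electronic device determines from it whether the device is currently within a preset location range. The preset location range may be configured as the places the owner frequents, such as home and the workplace.
When the device is currently within the preset location range, the electronic device determines that the voiceprint feature matches the preset voiceprint feature and identifies the current user who input the voice information as the owner.
This avoids cases in which the owner would otherwise not be recognized and improves the owner's experience.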
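The two-tier decision can be sketched as a single function. The names are hypothetical; the default thresholds follow the 95% and 75% examples in the text. The location check is passed in as a callable so that the position is looked up only when the similarity falls between the two thresholds, which is a design choice of this sketch rather than something the text prescribes.

```python
from typing import Callable


def is_owner(similarity: float,
             in_preset_range: Callable[[], bool],
             first_threshold: float = 0.95,
             second_threshold: float = 0.75) -> bool:
    """Decide whether the speaker is treated as the owner.

    - similarity >= first_threshold: match.
    - second_threshold <= similarity < first_threshold: match only if the
      device is currently within the preset location range.
    - otherwise: no match.
    """
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be below the first")
    if similarity >= first_threshold:
        return True
    if similarity >= second_threshold:
        return in_preset_range()
    return False
```

For instance, a similarity of 0.80 is accepted at home or at the workplace (when `in_preset_range` returns `True`) and rejected elsewhere, while a similarity of 0.60 is rejected regardless of location.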
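The instruction execution method of this application is described further below, building on the method described in the foregoing embodiments. Referring to FIG. 5, the instruction execution method may include:
In 201, input first voice information is received.
In this embodiment, the electronic device may collect sound from the external environment through an audio collection module to obtain sound information in audio format. After collecting the sound information, it performs noise reduction on it, extracts the human voice information, and records that human voice information as the input first voice information.
The audio collection module may be a microphone built into the electronic device or an external microphone connected to it. This application places no specific limit on this, and the electronic device may choose according to a configured selection rule. For example, the selection rule may be configured as follows: if an external microphone is connected, sound in the external environment is collected through the external microphone; if no external microphone is connected, it is collected through the built-in microphone.
For example, when a user wants to use a voice instruction to have the electronic device download XX application and install it, the user can say "Please help me download and install XX application". The electronic device then collects, through the built-in microphone, sound information that contains the human voice information "Please help me download and install XX application" together with ambient noise. It performs noise reduction on the collected sound information, removes the ambient noise, extracts the human voice information "Please help me download and install XX application", and uses it as the input first voice information.
In 202, a plurality of first to-be-executed instructions included in the first voice information are obtained.
In this embodiment, after receiving the input first voice information in audio format, the electronic device determines whether a speech parsing engine exists locally. If one exists, the electronic device feeds the first voice information to the local speech parsing engine for parsing and obtains the parsed text. Parsing voice information means converting it from "audio" to "text".
In addition, when several speech parsing engines exist locally, the electronic device may select one of them to parse the received first voice information in any of the following ways:
First, the electronic device may randomly select one of the local speech parsing engines to parse the received first voice information.
Second, the electronic device may select the speech parsing engine with the highest parsing success rate to parse the received first voice information.
Third, the electronic device may select the speech parsing engine with the shortest parsing time to parse the received first voice information.
Fourth, the electronic device may select, from the engines whose parsing success rate reaches a preset success rate, the one with the shortest parsing time to parse the first voice information.
It should be noted that those skilled in the art may also select a speech parsing engine in ways not listed above, or may combine several speech parsing engines to parse the first voice information. For example, the electronic device may parse the first voice information with two speech parsing engines at the same time and, when the two engines produce the same parsed text, use that text as the parsed text of the first voice information. As another example, the electronic device may parse the first voice information with at least three speech parsing engines and, when at least two of them produce the same parsed text, use that text as the parsed text of the first voice information.
After the parsed text of the first voice information is obtained, the electronic device further obtains from it the plurality of first to-be-executed instructions included in the first voice information.
The electronic device stores a plurality of instruction keywords in advance, and each instruction keyword corresponds to one instruction. To obtain the plurality of first to-be-executed instructions from the parsed text, the electronic device first performs word segmentation on the parsed text to obtain the corresponding word sequence, which includes multiple words.
After obtaining the word sequence, the electronic device matches instruction keywords against it, that is, finds the instruction keywords that occur in the word sequence, obtains the instructions corresponding to those keywords, and uses them as the plurality of first to-be-executed instructions included in the first voice information. The keyword matching includes exact matching and/or fuzzy matching.
For example, referring to FIG. 2, the electronic device uses the local speech parsing engine to parse the first voice information "Please help me download and install XX application", which is in audio format, and obtains the parsed text "Please help me download and install XX application" in text format. Word segmentation of the parsed text yields the word sequence {Please, help me, download, and, install, XX application}. Matching instruction keywords against the word sequence identifies the instruction keywords "download" and "install", which yields two first to-be-executed instructions, namely "download XX application" and "install XX application".
In an embodiment, when the check for a local speech parsing engine finds that none exists, the received first voice information is sent to a server (a server that provides a speech parsing service), instructing the server to parse the first voice information and return the resulting parsed text.
After receiving the parsed text returned by the server, the electronic device can obtain the plurality of first to-be-executed instructions included in the first voice information from that text. For how to do so, refer to the related description in the foregoing embodiment; details are not repeated here.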
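In 203, the plurality of first to-be-executed instructions are sorted to obtain first sorting information.
In this embodiment, after obtaining the plurality of first to-be-executed instructions included in the first voice information, the electronic device sorts them according to the order in which their corresponding instruction keywords appear in the word sequence, and obtains the first sorting information.
For example, the word sequence corresponding to the first voice information is {Please, help me, download, and, install, XX application}, and the two first to-be-executed instructions obtained are "download XX application" and "install XX application". The instruction keyword of "download XX application" is "download", and that of "install XX application" is "install". Sorting "download XX application" and "install XX application" by the order of the two keywords in the word sequence gives the first sorting information "download XX application", "install XX application", in which "download XX application" precedes "install XX application".
In 204, the plurality of first to-be-executed instructions are sequentially executed according to the first sorting information.
In this embodiment, once the electronic device has sorted the plurality of first to-be-executed instructions and obtained the first sorting information, it can execute them in sequence according to that information.
For example, referring to FIG. 3, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application". According to the first sorting information, the electronic device first executes "download XX application", downloading the installation package of XX application from the Internet, and then executes "install XX application", installing XX application from the downloaded package.
In 205, input second voice information is received while the plurality of first to-be-executed instructions are being executed.
While executing the plurality of first to-be-executed instructions, the electronic device continues to collect sound from the external environment through the audio collection module and obtains sound information in audio format. After collecting the sound information, it performs noise reduction on it, extracts the human voice information, and records the human voice information extracted at this point as the input second voice information.
In 206, a plurality of second to-be-executed instructions included in the second voice information are obtained.
In 207, the plurality of second to-be-executed instructions are sorted to obtain second sorting information.
After receiving the input second voice information, the electronic device obtains the plurality of second to-be-executed instructions included in it and sorts them to obtain the second sorting information. This can be implemented in the same way as the processing of the first voice information in the foregoing embodiment, and details are not repeated here.
In 208, when execution of the plurality of first to-be-executed instructions is complete, the plurality of second to-be-executed instructions are sequentially executed according to the second sorting information.
After obtaining the second sorting information, the electronic device splices the obtained second to-be-executed instructions onto the tail of the first to-be-executed instructions, so that when execution of the first to-be-executed instructions is complete, the second to-be-executed instructions are executed in sequence according to the second sorting information.
For example, referring to FIG. 4 together with FIG. 3, the two to-be-executed instructions obtained from the first voice information are "download XX application" and "install XX application", and the first sorting information obtained by sorting is "download XX application", "install XX application". While these two first to-be-executed instructions are being executed, the second voice information is received, and the two second to-be-executed instructions it includes are obtained, namely "start XX application" and "play XX video in XX application". These two instructions are then sorted, giving the second sorting information "start XX application", "play XX video in XX application". After the two first to-be-executed instructions have been executed, the device executes "start XX application" to start the installed XX application, and then executes "play XX video in XX application" to play XX video through XX application.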
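In an embodiment, an instruction execution apparatus is also provided. Refer to FIG. 6, which is a schematic structural diagram of the instruction execution apparatus 400 provided by an embodiment of this application. The instruction execution apparatus is applied to an electronic device and includes a receiving module 401, an obtaining module 402, a sorting module 403, and an execution module 404, as follows:
The receiving module 401 is configured to receive input first voice information.
The obtaining module 402 is configured to obtain a plurality of first to-be-executed instructions included in the first voice information.
The sorting module 403 is configured to sort the plurality of first to-be-executed instructions to obtain first sorting information.
The execution module 404 is configured to sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
In an embodiment, the receiving module 401 may be further configured to receive input second voice information while the execution module 404 is executing the plurality of first to-be-executed instructions.
The obtaining module 402 may be further configured to obtain a plurality of second to-be-executed instructions included in the second voice information.
The sorting module 403 may be further configured to sort the plurality of second to-be-executed instructions to obtain second sorting information.
The execution module 404 may be further configured to sequentially execute the plurality of second to-be-executed instructions according to the second sorting information when execution of the plurality of first to-be-executed instructions is complete.
In an embodiment, the obtaining module 402 may be further configured to:
send the first voice information to a server, instructing the server to parse the first voice information and return the parsed text obtained by parsing it;
receive the parsed text returned by the server;
obtain the plurality of first to-be-executed instructions included in the first voice information from the received parsed text.
In an embodiment, the obtaining module 402 may be further configured to:
obtain a voiceprint feature of the first voice information;
determine whether the obtained voiceprint feature matches a preset voiceprint feature;
when the obtained voiceprint feature matches the preset voiceprint feature, obtain the plurality of first to-be-executed instructions included in the first voice information.
In an embodiment, the obtaining module 402 may be further configured to:
obtain the similarity between the voiceprint feature and the preset voiceprint feature;
determine whether the obtained similarity is greater than or equal to a first preset similarity;
when the obtained similarity is greater than or equal to the first preset similarity, determine that the voiceprint feature matches the preset voiceprint feature.
In an embodiment, the obtaining module 402 may be further configured to:
when the obtained similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtain current location information;
determine from the location information whether the device is currently within a preset location range;
when the device is currently within the preset location range, determine that the obtained voiceprint feature matches the preset voiceprint feature.
In an embodiment, after determining whether the voiceprint feature matches the preset voiceprint feature, the obtaining module 402 is further configured to:
discard the first voice information if the voiceprint feature does not match the preset voiceprint feature.
In an embodiment, when receiving the input first voice information, the receiving module 401 is configured to:
collect sound information in the external environment, perform noise reduction on the sound information, and extract the human voice information in the sound information as the first voice information.
For the steps performed by each module of the instruction execution apparatus 400, refer to the method steps described in the foregoing method embodiments. The instruction execution apparatus 400 may be integrated in an electronic device such as a mobile phone or a tablet computer.
In a specific implementation, the above modules may each be implemented as an independent entity, or may be combined arbitrarily and implemented as one or several entities. For the specific implementation of each unit, refer to the foregoing embodiments; details are not repeated here.
As can be seen from the above, in the instruction execution apparatus of this embodiment, the receiving module 401 receives input first voice information, the obtaining module 402 obtains the plurality of first to-be-executed instructions included in the first voice information, the sorting module 403 sorts them to obtain the first sorting information of the plurality of first to-be-executed instructions, and the execution module 404 executes them in sequence according to the first sorting information. Thus, even if the voice information spoken by the user contains several instructions, the instructions can be executed one after another without any being omitted, which improves the accuracy of voice control.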
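The division into four modules can be sketched as a small class that wires four callables together, one per module. This is a structural illustration only: the class name, field names, and the choice to model each module as a plain function are assumptions, and here the receiving step is simplified to return text directly.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InstructionExecutionApparatus:
    """Hypothetical composition of modules 401 to 404."""
    receive: Callable[[], str]                   # receiving module 401
    obtain: Callable[[str], List[str]]           # obtaining module 402
    sort: Callable[[List[str]], List[str]]       # sorting module 403
    execute: Callable[[str], None]               # execution module 404

    def run_once(self) -> List[str]:
        """Receive one utterance and execute its instructions in order."""
        voice = self.receive()
        instructions = self.obtain(voice)
        ordered = self.sort(instructions)
        for instruction in ordered:
            self.execute(instruction)
        return ordered
```

Keeping the modules as separate callables mirrors the statement that they may be implemented as independent entities or combined into one or several entities.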
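In an embodiment, an electronic device is also provided. Referring to FIG. 7, the electronic device 500 includes a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading computer programs stored in the memory 502 and calling data stored in the memory 502.
The memory 502 may be used to store software programs and modules, and the processor 501 performs various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, the computer programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created through use of the electronic device, and the like. In addition, the memory 502 may include a high-speed random access memory and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Accordingly, the memory 502 may further include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and runs the computer programs stored in the memory 502 to implement various functions, as follows:
receive input first voice information;
obtain a plurality of first to-be-executed instructions included in the first voice information;
sort the plurality of first to-be-executed instructions to obtain first sorting information;
sequentially execute the plurality of first to-be-executed instructions according to the first sorting information.
Referring also to FIG. 8, in some implementations the electronic device 500 may further include a display 503, a radio frequency circuit 504, an audio circuit 505, and a power supply 506. The display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are each electrically connected to the processor 501.
The display 503 may be used to display information input by the user or provided to the user, as well as various graphical user interfaces, which may be composed of graphics, text, icons, video, and any combination of these. The display 503 may include a display panel; in some implementations the display panel may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The radio frequency circuit 504 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with a network device or another electronic device and exchange signals with it.
The audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 506 may be used to power the components of the electronic device 500. In some embodiments, the power supply 506 may be logically connected to the processor 501 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown in FIG. 8, the electronic device 500 may further include a camera, a Bluetooth module, and the like, which are not described here.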
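In some embodiments, the processor 501 may perform:
receiving input second voice information while the plurality of first instructions to be executed are being executed;
acquiring a plurality of second instructions to be executed that are included in the second voice information;
sorting the plurality of second instructions to be executed to obtain second sorting information; and
when execution of the plurality of first instructions to be executed is complete, sequentially executing the plurality of second instructions to be executed according to the second sorting information.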
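One way to picture the behavior above is a runner that holds a newly arrived, already sorted batch until the batch in progress has finished. The sketch below is single-threaded and uses re-entrant calls to stand in for voice input arriving during execution; the class name `InstructionRunner` and its interface are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


class InstructionRunner:
    """Runs sorted instruction batches strictly one batch after another."""

    def __init__(self, execute: Callable[[str], None]) -> None:
        self._execute = execute
        self._pending: Deque[List[str]] = deque()
        self._running = False

    def submit(self, ordered_instructions: Iterable[str]) -> None:
        """Queue a sorted batch; start draining if nothing is running."""
        self._pending.append(list(ordered_instructions))
        if self._running:
            # A batch is mid-execution: the new batch waits its turn.
            return
        self._running = True
        try:
            while self._pending:
                for instruction in self._pending.popleft():
                    self._execute(instruction)
        finally:
            self._running = False


if __name__ == "__main__":
    log: List[str] = []

    def execute(instruction: str) -> None:
        log.append(instruction)
        if instruction == "A1":
            # Simulate second voice information arriving during execution.
            runner.submit(["B1", "B2"])

    runner = InstructionRunner(execute)
    runner.submit(["A1", "A2"])
    print(log)
```

A real device would more likely receive audio on a separate thread, in which case the queue and the running flag would need a lock; that is outside the scope of this sketch.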
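In some embodiments, when acquiring the plurality of first instructions to be executed that are included in the first voice information, the processor 501 may perform:
sending the first voice information to a server, and instructing the server to parse the first voice information and return the parsed voice text obtained by parsing the first voice information;
receiving the parsed voice text returned by the server; and
acquiring, according to the received parsed voice text, the plurality of first instructions to be executed that are included in the first voice information.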
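In some embodiments, before acquiring the plurality of first instructions to be executed that are included in the first voice information, the processor 501 may perform:
acquiring a voiceprint feature of the first voice information;
determining whether the acquired voiceprint feature matches a preset voiceprint feature; and
when the acquired voiceprint feature matches the preset voiceprint feature, acquiring the plurality of first instructions to be executed that are included in the first voice information.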
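In some embodiments, when determining whether the acquired voiceprint feature matches the preset voiceprint feature, the processor 501 may further perform:
acquiring a similarity between the aforementioned voiceprint feature and the preset voiceprint feature;
determining whether the acquired similarity is greater than or equal to a first preset similarity; and
when the acquired similarity is greater than or equal to the first preset similarity, determining that the aforementioned voiceprint feature matches the preset voiceprint feature.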
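In some embodiments, after determining whether the acquired similarity is greater than or equal to the first preset similarity, the processor 501 may further perform:
when the acquired similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current location information;
determining, according to the location information, whether the electronic device is currently within a preset location range; and
when the electronic device is currently within the preset location range, determining that the acquired voiceprint feature matches the preset voiceprint feature.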
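In some embodiments, after determining whether the voiceprint feature matches the preset voiceprint feature, the processor 501 may further perform:
discarding the first voice information if the voiceprint feature does not match the preset voiceprint feature.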
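In one embodiment, when receiving the input first voice information, the processor 501 may perform:
collecting sound information from the external environment, performing noise reduction on the sound information, and extracting the human voice information in the sound information as the first voice information.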
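An embodiment of the present application further provides a storage medium that stores a computer program. When the computer program runs on a computer, it causes the computer to perform the instruction execution method of any of the above embodiments, for example: receiving input first voice information; acquiring a plurality of first instructions to be executed that are included in the first voice information; sorting the plurality of first instructions to be executed to obtain first sorting information; and sequentially executing the plurality of first instructions to be executed according to the first sorting information.
In the embodiments of the present application, the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.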
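It should be noted that, for the instruction execution method of the embodiments of the present application, a person of ordinary skill in the art can understand that all or part of the flow of the method may be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, such as the memory of an electronic device, and executed by at least one processor in that electronic device, and its execution may include the flow of the embodiments of the instruction execution method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.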
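For the instruction execution apparatus of the embodiments of the present application, the functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The instruction execution method, apparatus, storage medium, and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.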

Claims (20)

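  1. An instruction execution method, comprising:
    receiving input first voice information;
    acquiring a plurality of first instructions to be executed that are included in the first voice information;
    sorting the plurality of first instructions to be executed to obtain first sorting information; and
    sequentially executing the plurality of first instructions to be executed according to the first sorting information.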
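  2. The instruction execution method according to claim 1, wherein the instruction execution method further comprises:
    receiving input second voice information while the plurality of first instructions to be executed are being executed;
    acquiring a plurality of second instructions to be executed that are included in the second voice information;
    sorting the plurality of second instructions to be executed to obtain second sorting information; and
    when execution of the plurality of first instructions to be executed is complete, sequentially executing the plurality of second instructions to be executed according to the second sorting information.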
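  3. The instruction execution method according to claim 1, wherein acquiring the plurality of first instructions to be executed that are included in the first voice information comprises:
    sending the first voice information to a server, and instructing the server to parse the first voice information and return the parsed voice text obtained by parsing the first voice information;
    receiving the parsed voice text returned by the server; and
    acquiring the plurality of first instructions to be executed according to the parsed voice text.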
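  4. The instruction execution method according to claim 1, further comprising, before acquiring the plurality of first instructions to be executed that are included in the first voice information:
    acquiring a voiceprint feature of the first voice information;
    determining whether the voiceprint feature matches a preset voiceprint feature; and
    when the voiceprint feature matches the preset voiceprint feature, acquiring the plurality of first instructions to be executed that are included in the first voice information.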
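  5. The instruction execution method according to claim 4, wherein determining whether the voiceprint feature matches the preset voiceprint feature comprises:
    acquiring a similarity between the voiceprint feature and the preset voiceprint feature;
    determining whether the similarity is greater than or equal to a first preset similarity; and
    when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.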
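  6. The instruction execution method according to claim 5, further comprising, after determining whether the similarity is greater than or equal to the first preset similarity:
    when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current location information;
    determining, according to the location information, whether the current location is within a preset location range; and
    when the current location is within the preset location range, determining that the voiceprint feature matches the preset voiceprint feature.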
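  7. The instruction execution method according to claim 4, further comprising, after determining whether the voiceprint feature matches the preset voiceprint feature:
    discarding the first voice information if the voiceprint feature does not match the preset voiceprint feature.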
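  8. The instruction execution method according to claim 1, wherein receiving the input first voice information comprises:
    collecting sound information from the external environment, performing noise reduction on the sound information, and extracting the human voice information in the sound information as the first voice information.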
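  9. An instruction execution apparatus, comprising:
    a receiving module, configured to receive input first voice information;
    an acquisition module, configured to acquire a plurality of first instructions to be executed that are included in the first voice information;
    a sorting module, configured to sort the plurality of first instructions to be executed to obtain first sorting information for the plurality of first instructions to be executed; and
    an execution module, configured to sequentially execute the plurality of first instructions to be executed according to the first sorting information.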
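  10. The instruction execution apparatus according to claim 9, wherein:
    the receiving module is further configured to receive input second voice information while the execution module is executing the plurality of first instructions to be executed;
    the acquisition module is further configured to acquire a plurality of second instructions to be executed that are included in the second voice information;
    the sorting module is further configured to sort the plurality of second instructions to be executed to obtain second sorting information; and
    the execution module is further configured to, when execution of the plurality of first instructions to be executed is complete, sequentially execute the plurality of second instructions to be executed according to the second sorting information.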
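  11. The instruction execution apparatus according to claim 9, wherein the acquisition module is further configured to:
    send the first voice information to a server, and instruct the server to parse the first voice information and return the parsed voice text obtained by parsing the first voice information;
    receive the parsed voice text returned by the server; and
    acquire, according to the received parsed voice text, the plurality of first instructions to be executed that are included in the first voice information.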
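  12. A storage medium having a computer program stored thereon, wherein, when the computer program runs on a computer, the computer is caused to perform:
    receiving input first voice information;
    acquiring a plurality of first instructions to be executed that are included in the first voice information;
    sorting the plurality of first instructions to be executed to obtain first sorting information; and
    sequentially executing the plurality of first instructions to be executed according to the first sorting information.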
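  13. An electronic device, comprising a processor and a memory, the memory storing a computer program, wherein the processor, by invoking the computer program, is configured to perform:
    receiving input first voice information;
    acquiring a plurality of first instructions to be executed that are included in the first voice information;
    sorting the plurality of first instructions to be executed to obtain first sorting information; and
    sequentially executing the plurality of first instructions to be executed according to the first sorting information.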
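  14. The electronic device according to claim 13, wherein the processor is further configured to perform:
    receiving input second voice information while the plurality of first instructions to be executed are being executed;
    acquiring a plurality of second instructions to be executed that are included in the second voice information;
    sorting the plurality of second instructions to be executed to obtain second sorting information; and
    when execution of the plurality of first instructions to be executed is complete, sequentially executing the plurality of second instructions to be executed according to the second sorting information.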
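  15. The electronic device according to claim 13, wherein, when acquiring the plurality of first instructions to be executed that are included in the first voice information, the processor is configured to perform:
    sending the first voice information to a server, and instructing the server to parse the first voice information and return the parsed voice text obtained by parsing the first voice information;
    receiving the parsed voice text returned by the server; and
    acquiring the plurality of first instructions to be executed according to the parsed voice text.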
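  16. The electronic device according to claim 13, wherein, before acquiring the plurality of first instructions to be executed that are included in the first voice information, the processor is further configured to perform:
    acquiring a voiceprint feature of the first voice information;
    determining whether the voiceprint feature matches a preset voiceprint feature; and
    when the voiceprint feature matches the preset voiceprint feature, acquiring the plurality of first instructions to be executed that are included in the first voice information.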
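  17. The electronic device according to claim 16, wherein, when determining whether the voiceprint feature matches the preset voiceprint feature, the processor is configured to perform:
    acquiring a similarity between the voiceprint feature and the preset voiceprint feature;
    determining whether the similarity is greater than or equal to a first preset similarity; and
    when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.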
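  18. The electronic device according to claim 17, wherein, after determining whether the similarity is greater than or equal to the first preset similarity, the processor is further configured to perform:
    when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current location information;
    determining, according to the location information, whether the current location is within a preset location range; and
    when the current location is within the preset location range, determining that the voiceprint feature matches the preset voiceprint feature.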
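  19. The electronic device according to claim 16, wherein, after determining whether the voiceprint feature matches the preset voiceprint feature, the processor is further configured to perform:
    discarding the first voice information if the voiceprint feature does not match the preset voiceprint feature.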
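  20. The electronic device according to claim 13, wherein, when receiving the input first voice information, the processor is configured to perform:
    collecting sound information from the external environment, performing noise reduction on the sound information, and extracting the human voice information in the sound information as the first voice information.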
PCT/CN2019/085563 2018-05-30 2019-05-05 Instruction execution method, apparatus, storage medium, and electronic device WO2019228140A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810542932.7 2018-05-30
CN201810542932.7A CN108711428B (zh) 2018-05-30 2018-05-30 Instruction execution method, apparatus, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2019228140A1 true WO2019228140A1 (zh) 2019-12-05

Family

ID=63870489

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/085563 WO2019228140A1 (zh) 2018-05-30 2019-05-05 Instruction execution method, apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN108711428B (zh)
WO (1) WO2019228140A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108711428B (zh) * 2018-05-30 2021-05-25 Oppo广东移动通信有限公司 Instruction execution method, apparatus, storage medium, and electronic device
CN115240668B (zh) * 2022-07-06 2023-06-02 广东开放大学(广东理工职业学院) Voice-interaction home control method and robot

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1217506A2 (en) * 2000-12-19 2002-06-26 Hewlett-Packard Company Controlling the order of output of multiple devices
CN106023994A (zh) * 2016-04-29 2016-10-12 杭州华橙网络科技有限公司 Voice processing method, apparatus, and system
CN106373566A (zh) * 2016-08-25 2017-02-01 深圳市元征科技股份有限公司 Data transmission control method and apparatus
CN106814639A (zh) * 2015-11-27 2017-06-09 富泰华工业(深圳)有限公司 Voice control system and method
CN107347111A (zh) * 2017-05-16 2017-11-14 上海与德科技有限公司 Terminal control method and terminal
CN108711428A (zh) * 2018-05-30 2018-10-26 Oppo广东移动通信有限公司 Instruction execution method, apparatus, storage medium, and electronic device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7536304B2 (en) * 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
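CN105185380B (zh) * 2015-06-24 2020-06-23 联想(北京)有限公司 Information processing method and electronic device
CN105161099B (zh) * 2015-08-12 2019-11-26 恬家(上海)信息科技有限公司 Voice-controlled remote control apparatus and implementation method thereof
CN107180632A (zh) * 2017-06-19 2017-09-19 微鲸科技有限公司 Voice control method, apparatus, and readable storage medium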

Also Published As

Publication number Publication date
CN108711428A (zh) 2018-10-26
CN108711428B (zh) 2021-05-25

Similar Documents

Publication Publication Date Title
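KR102309540B1 (ko) Server for determining a target device based on a user's input and controlling the target device, and operation method thereof
US10811005B2 (en) Adapting voice input processing based on voice input characteristics
US20200349940A1 (en) Server for determining target device based on speech input of user and controlling target device, and operation method of the server
WO2021004481A1 (zh) Media file recommendation method and apparatus
TWI511125B (zh) Voice control method, mobile terminal device, and voice control system
WO2019242414A1 (zh) Voice processing method, apparatus, storage medium, and electronic device
CN107463700B (zh) Method, apparatus, and device for acquiring information
CN106202165B (zh) Intelligent learning method and apparatus for human-machine interaction
WO2018045646A1 (zh) Artificial intelligence-based human-machine interaction method and apparatus
US11328711B2 (en) User adaptive conversation apparatus and method based on monitoring of emotional and ethical states
CN112799630B (zh) Creating a cinematic storytelling experience using network-addressable devices
CN109671435B (zh) Method and apparatus for waking up a smart device
WO2019228138A1 (zh) Music playback method, apparatus, storage medium, and electronic device
CN110047481A (zh) Method and apparatus for speech recognition
CN109712610A (zh) Method and apparatus for recognizing speech
CN109710799B (zh) Voice interaction method, medium, apparatus, and computing device
JP6625772B2 (ja) Search method and electronic device using the same
KR20190068133A (ko) Electronic device for executing an application using phoneme information included in audio data, and operation method thereof
WO2023272616A1 (zh) Text understanding method, system, terminal device, and storage medium
CN111460231A (zh) Electronic device, and search method and medium for the electronic device
WO2019228140A1 (zh) Instruction execution method, apparatus, storage medium, and electronic device
WO2019242415A1 (zh) Location prompting method, apparatus, storage medium, and electronic device
CN109064720B (zh) Location prompting method, apparatus, storage medium, and electronic device
CN108989551B (zh) Location prompting method, apparatus, storage medium, and electronic device
KR20140086853A (ko) Apparatus and method for speaker-based content management through voice data analysis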

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19811852

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19811852

Country of ref document: EP

Kind code of ref document: A1