WO2019242415A1 - Location prompting method, device, storage medium, and electronic device

Publication number
WO2019242415A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
signal
preset
speech
voiceprint feature
Prior art date
Application number
PCT/CN2019/085557
Other languages
English (en)
French (fr)
Inventor
黄粟
Original Assignee
Oppo广东移动通信有限公司
Application filed by Oppo广东移动通信有限公司
Publication of WO2019242415A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G10L21/0202
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/057 Time compression or expansion for improving intelligibility

Definitions

  • the present application relates to the technical field of electronic devices, and in particular, to a location prompting method, device, storage medium, and electronic device.
  • the interaction modes between human and machine have become more and more abundant.
  • users can control electronic devices such as mobile phones and tablets through voice. That is, after receiving a voice signal sent by the user, the electronic device can analyze the voice signal, obtain a control instruction, and execute it. For example, when the user cannot find the electronic device, the electronic device can provide a location prompt according to the user's voice signal to guide the user to find the electronic device.
  • an embodiment of the present application provides a location prompting method, including:
  • an embodiment of the present application provides a location prompting device, including:
  • a first acquiring module configured to acquire a historical noise signal corresponding to the noisy speech signal when the noisy speech signal is received
  • a second acquisition module configured to acquire a noise signal during reception of the noisy speech signal according to the historical noise signal
  • a noise reduction module configured to perform inverse phase superposition of the noise signal and the noisy speech signal to obtain a noise reduction speech signal
  • the prompting module is configured to obtain a to-be-executed instruction included in the noise reduction voice signal, and when the to-be-executed instruction is an instruction for triggering a location prompt, perform a prompt operation to prompt the current location in a preset manner.
  • an embodiment of the present application provides a storage medium on which a computer program is stored, and when the computer program is run on a computer, the computer is caused to execute:
  • an embodiment of the present application provides an electronic device including a processor and a memory, where the memory has a computer program, and the processor calls the computer program to execute:
  • FIG. 1 is a schematic flowchart of a location prompting method according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a prompt mode setting interface provided by an embodiment of the present application.
  • FIG. 3 is an example diagram of triggering an electronic device to perform a prompt operation to prompt a current position in the embodiment of the present application.
  • FIG. 4 is a diagram illustrating an example of a prompt operation performed by the electronic device in the embodiment of the present application.
  • FIG. 5 is another schematic flowchart of a location prompting method according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a position prompting device according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 8 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
  • an embodiment herein means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application.
  • the appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to independent or alternative embodiments that are mutually exclusive with other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein may be combined with other embodiments.
  • the embodiment of the present application provides a location prompting method.
  • the execution subject of the location prompting method may be the location prompting device provided in the embodiment of the present application, or an electronic device integrated with the location prompting device.
  • the location prompting device may be implemented in hardware or software.
  • the electronic device may be a smart phone, a tablet computer, a palmtop computer, a notebook computer, or a desktop computer.
  • An embodiment of the present application provides a location prompting method, which includes:
  • the acquiring a noise signal during reception of the noisy voice signal according to the historical noise signal includes:
  • performing the prompting operation to prompt the current position in a preset manner includes:
  • acquiring the to-be-executed instruction included in the noise-reduced voice signal includes:
  • acquiring the to-be-executed instruction according to the speech parsing text includes:
  • before sending the noise-reduced voice signal to the server, the method further includes:
  • the noise-reduced voice signal is sent to the server.
  • before the step of acquiring the to-be-executed instruction included in the noise-reduced voice signal, the method further includes:
  • determining whether the voiceprint feature matches a preset voiceprint feature includes:
  • the method further includes:
  • FIG. 1 is a schematic flowchart of a location prompting method according to an embodiment of the present application. As shown in FIG. 1, the process of the location prompting method provided in the embodiment of the present application may be as follows:
  • the noisy voice signal is formed by a combination of a voice signal and an environmental noise signal.
  • the electronic device can receive the input noisy voice signal in several ways. For example, when no external microphone is connected, the electronic device can capture external sound through its built-in microphone and use the captured noisy voice signal as the received noisy voice signal; when an external microphone is connected, the electronic device can capture external sound through the external microphone and use the captured noisy voice signal as the received noisy voice signal.
  • when an electronic device receives an input noisy voice signal through a microphone (which may be a built-in microphone or an external microphone), if the microphone is an analog microphone, an analog noisy voice signal will be collected.
  • the electronic device then needs to sample the analog noisy speech signal to convert it into a digital noisy speech signal. For example, it can sample at a sampling frequency of 16 kHz.
  • if the microphone is a digital microphone, the electronic device receives the digitized noisy voice signal directly through the digital microphone, without conversion.
  • the electronic device receives a noisy speech signal when a speaker utters a voice signal in the environment where the electronic device is located; when no speaker utters a voice signal in that environment, the electronic device receives only the noise signal. The electronic device buffers both the received noisy speech signal and the received noise signal.
  • when receiving a noisy voice signal, the electronic device takes the start time of the noisy voice signal as the end time and obtains the noise signal of a preset duration received before the noisy voice signal (a suitable value for the preset duration can be chosen by those skilled in the art according to actual needs, and this embodiment of the present application does not specifically limit it; for example, it can be set to 500 ms), and uses that noise signal as the historical noise signal corresponding to the noisy speech signal.
  • for example, the electronic device obtains the 500-millisecond noise signal buffered from 11:04:56.000 to 11:04:56.500 on June 12, 2018, and uses it as the historical noise signal corresponding to the noisy speech signal.
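The lookup of the historical noise window can be sketched as follows. This is a minimal illustration, assuming the device keeps its recent microphone samples in a list and knows the sample index at which the noisy voice signal starts. The function name `historical_noise` and its parameters are hypothetical; the 16 kHz rate and 500 ms window are the example values given in the text.

```python
def historical_noise(buffer, speech_start, sample_rate=16000, window_ms=500):
    """Return the samples buffered in the window_ms just before speech_start.

    buffer: sequence of samples, oldest first (hypothetical layout).
    speech_start: index in buffer where the noisy voice signal begins.
    """
    # Number of samples in the preset duration, e.g. 8000 for 500 ms at 16 kHz.
    window = sample_rate * window_ms // 1000
    # Clamp at zero in case less than one full window has been buffered.
    begin = max(0, speech_start - window)
    return list(buffer[begin:speech_start])
```

If fewer samples than one full window are available, the sketch simply returns what is buffered; the source does not say how that case is handled.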
  • the electronic device further acquires a noise signal during the reception of the noisy speech signal according to the acquired historical noise signal after acquiring the historical noise signal corresponding to the noisy speech signal.
  • the electronic device can predict the noise distribution during the reception of the noisy speech signal based on the acquired historical noise signal, thereby obtaining the noise signal during the reception of the noisy speech signal.
  • the noise variation in continuous time is usually small.
  • the electronic device can use the acquired historical noise signal as the noise signal during the reception of the noisy speech signal. If the historical noise signal is longer than the noisy speech signal, a segment with the same duration as the noisy speech signal can be cut from the historical noise signal and used as the noise signal during the reception of the noisy speech signal; if the historical noise signal is shorter than the noisy speech signal, the historical noise signal can be copied and multiple copies spliced together to obtain a noise signal with the same duration as the noisy speech signal, which is used as the noise signal during the reception of the noisy speech signal.
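The truncate-or-splice rule can be sketched as follows. This is an illustrative sketch only: the function name `fit_noise_length` is hypothetical, and signals are assumed to be plain lists of samples.

```python
def fit_noise_length(noise, target_len):
    """Cut or repeat the historical noise so it has exactly target_len samples."""
    if not noise:
        raise ValueError("historical noise signal is empty")
    if len(noise) >= target_len:
        # Historical noise is long enough: keep a segment of the same duration.
        return list(noise[:target_len])
    # Historical noise is too short: splice copies together, then trim the excess.
    repeats = -(-target_len // len(noise))  # ceiling division
    return (list(noise) * repeats)[:target_len]
```

For example, `fit_noise_length([1, 2, 3], 7)` yields `[1, 2, 3, 1, 2, 3, 1]`. Which segment of a longer historical signal to keep is not specified in the text; the sketch keeps the first `target_len` samples.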
  • after acquiring the noise signal during the reception of the noisy voice signal, the electronic device first inverts the phase of the acquired noise signal and then superimposes the inverted noise signal on the noisy voice signal. This cancels the noise component of the noisy voice signal and yields the noise-reduced voice signal.
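The anti-phase superposition amounts to negating the estimated noise and adding it to the noisy signal sample by sample. The following is a minimal sketch, assuming both signals are time-aligned lists of equal length; the function name `denoise` is hypothetical.

```python
def denoise(noisy, noise):
    """Superimpose the phase-inverted noise estimate on the noisy signal."""
    if len(noisy) != len(noise):
        raise ValueError("signals must have the same number of samples")
    inverted = [-n for n in noise]                     # phase inversion
    return [s + i for s, i in zip(noisy, inverted)]    # superposition


speech = [0.5, -0.25, 0.0]
noise = [0.125, 0.125, -0.5]
noisy = [s + n for s, n in zip(speech, noise)]
cleaned = denoise(noisy, noise)
```

The cancellation is exact here only because the noise estimate equals the actual noise. With an estimate derived from historical noise, some residual noise would be expected to remain.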
  • the electronic device determines whether there is a voice parsing engine locally. If it exists, the electronic device inputs the noise-reduced voice signal to the local voice parsing engine for voice parsing to obtain a voice parsed text.
  • the speech signal is parsed, that is, the conversion process of the speech signal from "audio" to "text".
  • the electronic device can select a speech parsing engine from the multiple speech parsing engines to perform speech parsing on the noise-reduced speech signal in the following manner:
  • the electronic device may randomly select a speech analysis engine from a plurality of local speech analysis engines to perform speech analysis on the received noise-reduced speech signal.
  • the electronic device can select a speech parsing engine with the highest parsing success rate from multiple speech parsing engines to perform speech parsing on the received noise-reduced speech signal.
  • the electronic device may select a speech parsing engine with the shortest parsing time from multiple speech parsing engines to perform speech parsing on the received noise-reduced speech signal.
  • the electronic device may also select a speech parsing engine with a parsing success rate that reaches a preset success rate and the shortest parsing time from multiple speech parsing engines to perform speech parsing on the received noise-reduced speech signal.
  • for example, an electronic device may parse the noise-reduced speech signal with two speech parsing engines and, when the speech parsing texts obtained by the two engines are the same, use that text as the speech parsing text of the noise-reduced speech signal; as another example, an electronic device may parse the noise-reduced speech signal with at least three speech parsing engines and, when the speech parsing texts obtained by at least two of the engines are the same, use that text as the speech parsing text of the noise-reduced speech signal.
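Two of the selection strategies above can be sketched as follows: choosing the fastest engine among those that reach a preset success rate, and accepting a text only when at least two engines agree. This is an illustrative sketch. Engines are modelled as plain callables, and the names `pick_engine`, `consensus_text`, and the `stats` layout are assumptions, not part of the source.

```python
from collections import Counter


def pick_engine(stats, min_success_rate):
    """stats maps engine name -> (success_rate, average_parse_seconds)."""
    eligible = {name: s for name, s in stats.items() if s[0] >= min_success_rate}
    if not eligible:
        return None
    # Among engines that reach the preset success rate, take the fastest.
    return min(eligible, key=lambda name: eligible[name][1])


def consensus_text(signal, engines, min_agree=2):
    """Run every engine and return the text at least min_agree of them share."""
    texts = [engine(signal) for engine in engines]
    if not texts:
        return None
    text, count = Counter(texts).most_common(1)[0]
    return text if count >= min_agree else None
```

With two engines, `min_agree=2` reproduces the "both texts are the same" rule; with three or more it reproduces the "at least two agree" rule.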
  • after the speech parsing text of the noise-reduced speech signal is obtained, the electronic device further obtains from it the to-be-executed instruction included in the noise-reduced speech signal.
  • the electronic device stores multiple instruction keywords in advance, and a single instruction keyword or a combination of multiple instruction keywords corresponds to one instruction.
  • the electronic device first performs a word segmentation operation on the foregoing speech parsed text to obtain a word sequence corresponding to the speech parsed text, and the word sequence includes multiple words.
  • after obtaining the word sequence corresponding to the speech parsing text, the electronic device matches the instruction keywords against the word sequence, that is, finds the instruction keywords in the word sequence, obtains the corresponding instruction by matching, and uses the matched instruction as the to-be-executed instruction of the noise-reduced voice signal.
  • the matching search of the instruction keywords includes exact matching and/or fuzzy matching.
  • the prompt operation indicating the current position is performed in a preset manner.
  • for example, the instruction used to trigger the location prompt corresponds to the instruction keyword combination "Xiaoou" + "You" + "Where".
  • when the speech parsing text is "Xiaoou, where are you", the electronic device determines that it includes a to-be-executed instruction that is an instruction for triggering a location prompt.
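The keyword matching can be sketched as follows. This is a minimal sketch of exact matching only. It assumes English text that can be segmented on whitespace, whereas a real implementation for Chinese would need a proper word segmenter. The instruction table, the label `LOCATION_PROMPT`, and the function names are hypothetical.

```python
import string

# Hypothetical table: a combination of instruction keywords maps to one instruction.
INSTRUCTIONS = {
    frozenset({"xiaoou", "you", "where"}): "LOCATION_PROMPT",
}


def segment(text):
    """Very rough word segmentation: lowercase, drop punctuation, split on spaces."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def match_instruction(text, instructions=INSTRUCTIONS):
    """Return the instruction whose keywords all occur in the text, else None."""
    words = set(segment(text))
    for keywords, instruction in instructions.items():
        if keywords <= words:  # every keyword found in the word sequence
            return instruction
    return None
```

For instance, `match_instruction("Where are you, Xiaoou?")` returns `"LOCATION_PROMPT"`, while a text lacking any of the three keywords returns `None`. Fuzzy matching, which the text also allows, is not covered by this sketch.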
  • the manner in which the electronic device performs the prompt operation may be set by default, or may be set according to user input data.
  • for example, the default prompt mode of the electronic device is lighting up the screen.
  • the electronic device also provides a prompt mode setting interface so that the user can select the prompt mode according to actual needs.
  • for example, if the user selects the "screen on and ringing" prompt mode, then when the voice "Where are you, Xiaoou" is received from the user, the electronic device reminds the user of its current location by lighting up the screen and ringing.
  • as can be seen from the above, when receiving the noisy voice signal, the electronic device obtains the historical noise signal corresponding to it; obtains, according to the historical noise signal, the noise signal during the reception of the noisy speech signal; superimposes the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced voice signal; acquires the to-be-executed instruction included in the noise-reduced voice signal; and, when the to-be-executed instruction is an instruction for triggering a location prompt, performs a prompt operation that indicates the current position in a preset manner.
  • the noisy voice signal thus undergoes noise reduction to obtain the noise-reduced voice signal, and the prompt operation indicating the current position is then performed according to the noise-reduced voice signal. This avoids noise interference and can improve the success rate of triggering the location prompt of the electronic device.
  • obtaining a noise signal during reception of a noisy voice signal according to the acquired historical noise signal includes:
  • the acquired historical noise signal is used as sample data for model training to obtain a noise prediction model
  • the electronic device obtains the historical noise signal, uses the historical noise signal as sample data, and performs model training according to a preset training algorithm to obtain a noise prediction model.
  • the training algorithm is a machine learning algorithm.
  • the machine learning algorithm can predict the data through continuous feature learning.
  • the electronic device can predict the current noise distribution based on the historical noise distribution.
  • machine learning algorithms can include: decision tree algorithms, regression algorithms, Bayesian algorithms, neural network algorithms (which can include deep neural network algorithms, convolutional neural network algorithms and recursive neural network algorithms, etc.), clustering algorithms, etc. Which training algorithm is selected as a preset training algorithm for model training can be selected by those skilled in the art according to actual needs.
  • the preset training algorithm configured by the electronic device is a Gaussian mixture model algorithm (which is a regression algorithm).
  • the historical noise signal is used as sample data, and the model training is performed according to the Gaussian mixture model algorithm.
  • a Gaussian mixture model is obtained by training (the noise prediction model includes multiple Gaussian units for describing the noise distribution), and the Gaussian mixture model is used as the noise prediction model.
  • the electronic device inputs the start time and end time of the noisy speech signal reception period into the noise prediction model for processing, and obtains the noise signal during the reception of the noisy speech signal as output by the noise prediction model.
  • performing a prompt operation to prompt the current position according to a preset manner includes:
  • the embodiments of the present application provide a way to perform prompt operations, including:
  • the electronic device first obtains the current position information when performing the prompt operation to prompt the current position in a preset manner.
  • when in an outdoor environment (the electronic device can identify whether it is currently in an outdoor or an indoor environment according to the strength of the received satellite positioning signal; for example, when the received satellite positioning signal strength is lower than a preset threshold, it is determined to be in an indoor environment, and when the received satellite positioning signal strength is higher than or equal to the preset threshold, it is determined to be in an outdoor environment), the electronic device can use satellite positioning technology to obtain the current location information.
  • when in an indoor environment, the electronic device can use indoor positioning technology to obtain the current location information.
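The indoor/outdoor decision reduces to a single threshold comparison, sketched below. The function name `choose_positioning` and the returned labels are hypothetical, and the text does not specify the unit or the value of the threshold.

```python
def choose_positioning(signal_strength, threshold):
    """Pick a positioning technology from the satellite signal strength."""
    if signal_strength < threshold:
        # Weak satellite signal: treated as an indoor environment.
        return "indoor"
    # Strength at or above the threshold: treated as an outdoor environment.
    return "satellite"
```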
  • after acquiring the current position information, the electronic device outputs the acquired position information by voice to indicate its current position.
  • for example, when the user has left the electronic device on a desk in the conference room, the user can say "Where are you, Xiaoou" to trigger the electronic device to perform the prompt operation indicating its current position.
  • the electronic device receives the noisy voice signal "Where are you, Xiaoou" plus noise, and then performs noise reduction on the noisy voice signal to obtain the noise-reduced voice signal "Where are you, Xiaoou".
  • the electronic device determines that the to-be-executed instruction included in the noise-reduced voice signal is an instruction for triggering a location prompt.
  • the current location information "meeting room" is then obtained, and "I am in the meeting room" is output by voice to guide the user and help the user find the electronic device.
  • "acquiring the to-be-executed instruction included in the noise-reduced voice signal" includes:
  • after obtaining the noise-reduced voice signal, the electronic device determines whether a speech parsing engine exists locally. If none exists, it sends the noise-reduced voice signal to a server (a server that provides a speech parsing service) and instructs the server to parse the noise-reduced voice signal and return the resulting speech parsing text.
  • after receiving the speech parsing text returned by the server, the electronic device can obtain the to-be-executed instruction included in the noise-reduced voice signal according to the speech parsing text.
  • for how to obtain the to-be-executed instruction from the speech parsing text, refer to the foregoing description; details are not repeated here.
  • before "obtaining the to-be-executed instruction included in the noise-reduced voice signal", the method further includes:
  • the characteristic of this sound is the voiceprint feature.
  • the voiceprint feature is mainly determined by two factors. The first is the size of the acoustic cavities, specifically the throat, nasal cavity, and oral cavity. The shape, size, and position of these organs determine the tension of the vocal cords and the range of sound frequencies. Therefore, even when different people say the same words, the frequency distribution of the sound differs: some voices sound deep and others sound loud.
  • the second factor that determines the voiceprint feature is the manner in which the vocal organs are manipulated.
  • the vocal organs include the lips, teeth, tongue, soft palate, and palatal muscles, and their interaction produces clear speech. The way they work together is learned incidentally through interaction with surrounding people. In the process of learning to speak, people imitate the speech of different people around them and gradually form their own voiceprint features.
  • when the electronic device obtains the noise-reduced voice signal, it first obtains the voiceprint feature of the noise-reduced voice signal.
  • after acquiring the voiceprint feature of the noise-reduced voice signal, the electronic device compares the acquired voiceprint feature with a preset voiceprint feature to determine whether the two match.
  • the preset voiceprint feature may be a voiceprint feature previously recorded by the owner. Determining whether the voiceprint feature of the noise-reduced voice signal matches the preset voiceprint feature therefore amounts to determining whether the speaker corresponding to the noise-reduced voice signal is the owner.
  • when the voiceprint feature matches the preset voiceprint feature, the electronic device determines that the speaker corresponding to the noise-reduced voice signal is the owner, and then obtains the to-be-executed instruction included in the noise-reduced voice signal.
  • for details, refer to the related description above, which is not repeated here.
  • the identity of the speaker is identified according to the voiceprint feature of the noise-reduced voice signal, and the to-be-executed instruction included in the noise-reduced voice signal is obtained only when the speaker corresponding to the noise-reduced voice signal is the owner. The electronic device is thereby prevented from performing operations not intended by the owner, which improves the owner's experience.
  • determining whether the acquired voiceprint feature matches a preset voiceprint feature includes:
  • the electronic device may acquire the similarity between the aforementioned voiceprint feature and the preset voiceprint feature, and determine whether the acquired similarity is greater than or equal to a first preset similarity (set according to actual needs; for example, it can be set to 95%).
  • when the obtained similarity is greater than or equal to the first preset similarity, it is determined that the obtained voiceprint feature matches the preset voiceprint feature; when the obtained similarity is less than the first preset similarity, it is determined that the obtained voiceprint feature does not match the preset voiceprint feature.
  • the method further includes:
  • because the voiceprint feature is closely related to the physiological characteristics of the human body, it can change in daily life: if the user catches a cold, for example, the user's voice becomes hoarse and the voiceprint feature changes accordingly. In this case, even if the speaker corresponding to the noisy voice signal received by the electronic device is the owner, the electronic device cannot identify the owner from the noise-reduced voice signal obtained after noise reduction. Many other situations can also cause the electronic device to fail to identify the owner; they are not listed here.
  • after the electronic device completes the similarity judgment, if the similarity between the voiceprint feature of the noise-reduced voice signal and the preset voiceprint feature is less than the first preset similarity, it further judges whether the similarity is greater than or equal to a second preset similarity (the second preset similarity is configured to be less than the first preset similarity, and a suitable value can be chosen by those skilled in the art according to actual needs; for example, when the first preset similarity is set to 95%, the second preset similarity may be set to 75%).
  • when the judgment result is yes, that is, the similarity between the voiceprint feature of the noise-reduced voice signal and the preset voiceprint feature is less than the first preset similarity and greater than or equal to the second preset similarity, the electronic device further obtains the current location information.
  • when in an outdoor environment (the electronic device can identify whether it is currently in an outdoor or an indoor environment according to the strength of the received satellite positioning signal; for example, when the strength of the received satellite positioning signal is lower than a preset threshold, it is determined to be in an indoor environment, and when the strength is higher than or equal to the preset threshold, it is determined to be in an outdoor environment), the electronic device can use satellite positioning technology to obtain the current position information. When in an indoor environment, indoor positioning technology can be used to obtain the current location information.
  • after acquiring the current position information, the electronic device determines whether it is currently within a preset position range according to the position information.
  • the preset position range can be configured as a range of positions frequented by the owner, such as home and workplace.
  • when it is currently within the preset position range, the electronic device determines that the aforementioned voiceprint feature matches the preset voiceprint feature, and the speaker corresponding to the noise-reduced voice signal is identified as the owner.
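The two-threshold decision can be sketched as follows. This is a minimal sketch, assuming the similarity is already available as a number between 0 and 1 and the location check has already been reduced to a boolean. The function name `is_owner` is hypothetical; the defaults 0.95 and 0.75 are the example values from the text.

```python
def is_owner(similarity, in_preset_range, first=0.95, second=0.75):
    """Decide whether the speaker is treated as the owner.

    similarity: similarity between the extracted and preset voiceprint features.
    in_preset_range: True if the device is within the preset position range.
    """
    if similarity >= first:
        # Strong match: accept regardless of location.
        return True
    if similarity >= second and in_preset_range:
        # Weaker match (e.g. a hoarse voice): accept only at a familiar location.
        return True
    return False
```

The design trades a looser voiceprint threshold for an additional location check, so that a changed voice can still be accepted at home or at work but not elsewhere.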
  • the location prompting method may include:
  • the noisy voice signal is formed by a combination of a voice signal and an environmental noise signal.
  • the electronic device can receive the input noisy voice signal in several ways. For example, when no external microphone is connected, the electronic device can capture external sound through its built-in microphone and use the captured noisy voice signal as the received noisy voice signal; when an external microphone is connected, the electronic device can capture external sound through the external microphone and use the captured noisy voice signal as the received noisy voice signal.
  • when an electronic device receives an input noisy voice signal through a microphone (which may be a built-in microphone or an external microphone), if the microphone is an analog microphone, an analog noisy voice signal will be collected.
  • the electronic device then needs to sample the analog noisy speech signal to convert it into a digital noisy speech signal. For example, it can sample at a sampling frequency of 16 kHz.
  • if the microphone is a digital microphone, the electronic device receives the digitized noisy voice signal directly through the digital microphone, without conversion.
  • the electronic device receives a noisy speech signal when a speaker utters a voice signal in the environment where the electronic device is located; when no speaker utters a voice signal in that environment, the electronic device receives only the noise signal. The electronic device buffers both the received noisy speech signal and the received noise signal.
  • when receiving a noisy voice signal, the electronic device takes the start time of the noisy voice signal as the end time and obtains the noise signal of a preset duration received before the noisy voice signal (a suitable value for the preset duration can be chosen by those skilled in the art according to actual needs, and this embodiment of the present application does not specifically limit it; for example, it can be set to 500 ms), and uses that noise signal as the historical noise signal corresponding to the noisy speech signal.
  • for example, the electronic device obtains the 500-millisecond noise signal buffered from 11:04:56.000 to 11:04:56.500 on June 12, 2018, and uses it as the historical noise signal corresponding to the noisy speech signal.
  • the electronic device obtains the historical noise signal, uses the historical noise signal as sample data, and performs model training according to a preset training algorithm to obtain a noise prediction model.
  • the training algorithm is a machine learning algorithm.
  • the machine learning algorithm can predict the data through continuous feature learning.
  • the electronic device can predict the current noise distribution based on the historical noise distribution.
  • machine learning algorithms can include: decision tree algorithms, regression algorithms, Bayesian algorithms, neural network algorithms (which can include deep neural network algorithms, convolutional neural network algorithms and recursive neural network algorithms, etc.), clustering algorithms, etc. Which training algorithm is selected as a preset training algorithm for model training can be selected by those skilled in the art according to actual needs.
  • for example, the preset training algorithm configured by the electronic device is a Gaussian mixture model algorithm. After the historical noise signal is obtained, it is used as sample data and a model is trained according to the Gaussian mixture model algorithm to obtain a Gaussian mixture model (the noise prediction model includes multiple Gaussian units for describing the noise distribution), and the Gaussian mixture model is used as the noise prediction model.
  • after training to obtain the noise prediction model, the electronic device inputs the start time and end time of the noisy speech signal reception period into the noise prediction model for processing, and obtains the noise signal for the reception period that the noise prediction model outputs.
  • after acquiring the noise signal for the reception period of the noisy voice signal, the electronic device first inverts the phase of the acquired noise signal and then superimposes the inverted noise signal on the noisy voice signal, so as to cancel the noise part of the noisy speech signal and obtain a noise-reduced speech signal.
  • the characteristic of this sound is the voiceprint feature.
  • the voiceprint feature is mainly determined by two factors. The first is the size of the acoustic cavities, specifically the throat, nasal cavity, and oral cavity; the shape, size, and position of these organs determine the vocal cord tension and the range of sound frequencies. Therefore, even when different people say the same thing, the frequency distribution of their voices differs: some sound deep and some sound resonant.
  • the second factor that determines the voiceprint feature is the manner in which the vocal organs are manipulated.
  • the vocal organs include the lips, teeth, tongue, soft palate, and palatal muscles, and their interaction produces clear speech. The way they cooperate is learned incidentally through a person's interactions with the people around them. In the process of learning to speak, a person gradually forms their own voiceprint feature by imitating the speech of different people around them.
  • when the electronic device obtains the noise-reduced voice signal, it first obtains the voiceprint feature of the noise-reduced voice signal.
  • after acquiring the voiceprint feature of the noise-reduced voice signal, the electronic device further compares the acquired voiceprint feature with a preset voiceprint feature to determine whether the voiceprint feature matches the preset voiceprint feature.
  • the preset voiceprint feature may be a voiceprint feature previously recorded by the owner; determining whether the voiceprint feature of the noise-reduced voice signal matches the preset voiceprint feature amounts to determining whether the speaker corresponding to the noise-reduced voice signal is the owner.
  • when the acquired voiceprint feature matches the preset voiceprint feature, the electronic device determines that the speaker corresponding to the noise-reduced voice signal is the owner, and at this time acquires the to-be-executed instruction included in the noise-reduced voice signal.
  • when acquiring the to-be-executed instruction included in the noise-reduced voice signal, the electronic device first determines whether a speech parsing engine exists locally, and if so, inputs the noise-reduced voice signal to the local speech parsing engine for speech parsing to obtain the speech-parsed text. Parsing a speech signal is the process of converting the speech signal from "audio" to "text".
  • the electronic device After the speech analysis text of the noise reduction speech signal is parsed, the electronic device further obtains the to-be-executed instructions included in the noise reduction speech signal from the speech analysis text.
  • the electronic device stores multiple instruction keywords in advance, and a single instruction keyword or a combination of multiple instruction keywords corresponds to one instruction.
  • the electronic device first performs a word segmentation operation on the foregoing speech parsed text to obtain a word sequence corresponding to the speech parsed text, and the word sequence includes multiple words.
  • after obtaining the word sequence corresponding to the speech-parsed text, the electronic device matches instruction keywords against the word sequence, that is, finds the instruction keywords in the word sequence, obtains the corresponding instruction by matching, and uses the matched instruction as the to-be-executed instruction of the noise-reduced voice signal. The matching of instruction keywords includes exact matching and/or fuzzy matching.
  • if the acquired to-be-executed instruction is an instruction for triggering a position prompt, the prompt operation indicating the current position is performed in a preset manner.
  • for example, the instruction used to trigger the position prompt corresponds to the instruction keyword combination "Xiaoou" + "you" + "where".
  • when the user says "Xiaoou, where are you", the electronic device determines that the to-be-executed instruction included in "Xiaoou, where are you" is an instruction for triggering a position prompt.
  • the manner in which the electronic device performs the prompt operation may be set by default, or may be set according to user input data.
  • the default prompt mode of the electronic device is bright screen.
  • the electronic device also provides a setting interface of the prompt mode for the user to select the prompt mode according to actual needs.
  • for example, when the user has selected the prompt mode "ring while the screen is lit", if the voice "Xiaoou, where are you" from the user is received, the electronic device reminds the user of its current location by lighting the screen and ringing.
  • a position prompting device is also provided.
  • FIG. 6 is a schematic structural diagram of a position prompting device 400 according to an embodiment of the present application.
  • the position prompting device is applied to an electronic device.
  • the position prompting device includes a first obtaining module 401, a second obtaining module 402, a noise reduction module 403, and a prompting module 404, as follows:
  • the first acquiring module 401 is configured to acquire a historical noise signal corresponding to the noisy speech signal when the noisy speech signal is received.
  • the second acquisition module 402 is configured to acquire a noise signal during the reception of the noisy voice signal according to the acquired historical noise signal.
  • the noise reduction module 403 is configured to perform inverse phase superposition of the acquired noise signal and the noisy speech signal to obtain a noise-reduced speech signal.
  • the prompting module 404 is configured to obtain a to-be-executed instruction included in the noise-reduction voice signal, and when the to-be-executed instruction is an instruction for triggering a location prompt, perform a prompt operation to prompt the current location in a preset manner.
  • the second obtaining module 402 may be configured to:
  • the obtained historical noise signal is used as sample data for model training to obtain a noise prediction model
  • the noise signal during reception of the noisy speech signal is predicted according to the noise prediction model.
  • the prompting module 404 may be configured to:
  • the word sequence is matched with the instruction keywords, and the instructions to be executed are obtained by matching.
  • the prompting module 404 may be configured to:
  • the noise reduction voice signal is sent to the server.
  • the prompting module 404 may be configured to:
  • the to-be-executed instructions included in the noise reduction voice signal are acquired.
  • the position prompting device 400 may be integrated in an electronic device, such as a mobile phone, a tablet computer, or the like.
  • the above modules can be implemented as independent entities, or can be arbitrarily combined, and implemented as the same or several entities.
  • the specific implementation of the above units can refer to the previous embodiments, and will not be repeated here.
  • the location prompting device may obtain the historical noise signal corresponding to the noisy voice signal by the first acquisition module 401.
  • the second acquisition module 402 acquires a noise signal during the reception of the noisy speech signal according to the acquired historical noise signal.
  • the noise reduction module 403 performs inverse phase superposition of the acquired noise signal and the noisy speech signal to obtain a noise reduction speech signal.
  • the prompting module 404 obtains the instructions to be executed included in the noise-reduction voice signal, and when the instructions to be executed are instructions for triggering the location prompt, the prompt operation for prompting the current location is performed in a preset manner.
  • in this solution, when a noisy voice signal is received in a noisy environment, noise reduction is performed on it to obtain the noise-reduced voice signal, and a prompt operation indicating the current position is then performed according to the noise-reduced voice signal; this avoids noise interference and can improve the success rate of triggering the position prompt of the electronic device.
  • an electronic device is also provided.
  • the electronic device 500 includes a processor 501 and a memory 502.
  • the processor 501 is electrically connected to the memory 502.
  • the processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device by using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading a computer program stored in the memory 502 and calling data stored in the memory 502.
  • the memory 502 may be configured to store software programs and modules.
  • the processor 501 executes various functional applications and data processing by running computer programs and modules stored in the memory 502.
  • the memory 502 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the computer programs required for at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created through the use of the electronic device.
  • the memory 502 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Accordingly, the memory 502 may further include a memory controller to provide the processor 501 with access to the memory 502.
  • the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and runs the computer programs stored in the memory 502 to implement various functions, as follows:
  • the electronic device 500 may further include a display 503, a radio frequency circuit 504, an audio circuit 505, and a power source 506.
  • the display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are electrically connected to the processor 501, respectively.
  • the display 503 may be used to display information input by the user or information provided to the user and various graphical user interfaces. These graphical user interfaces may be composed of graphics, text, icons, videos, and any combination thereof.
  • the display 503 may include a display panel.
  • the display panel may be configured by using a liquid crystal display (Liquid Crystal Display, LCD), or an organic light emitting diode (Organic Light-Emitting Diode, OLED).
  • the radio frequency circuit 504 may be used to transmit and receive radio frequency signals to establish wireless communication with a network device or other electronic device through wireless communication, and transmit and receive signals to and from the network device or other electronic device.
  • the audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker or a microphone.
  • the power source 506 may be used to power various components of the electronic device 500.
  • the power supply 506 may be logically connected to the processor 501 through a power management system, so as to implement functions such as management of charging, discharging, and power consumption management through the power management system.
  • the electronic device 500 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the processor 501 may execute:
  • the obtained historical noise signal is used as sample data for model training to obtain a noise prediction model
  • the noise signal during reception of the noisy speech signal is predicted according to the noise prediction model.
  • the processor 501 may execute:
  • the word sequence is matched with the instruction keywords, and the instructions to be executed are obtained by matching.
  • the processor 501 may execute:
  • the noise reduction voice signal is sent to the server.
  • the processor 501 may execute:
  • the to-be-executed instructions included in the noise reduction voice signal are acquired.
  • An embodiment of the present application further provides a storage medium.
  • the storage medium stores a computer program, and when the computer program is run on a computer, the computer is caused to execute the location prompting method in any one of the foregoing embodiments, for example:
  • when the noisy speech signal is received, the historical noise signal corresponding to the noisy speech signal is obtained; according to the acquired historical noise signal, the noise signal during the reception of the noisy speech signal is obtained; the acquired noise signal and the noisy speech signal are superimposed in inverse phase to obtain a noise-reduced voice signal; the to-be-executed instruction included in the noise-reduced voice signal is obtained, and when the to-be-executed instruction is an instruction for triggering a position prompt, a prompt operation indicating the current position is performed in a preset manner.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read Only Memory, ROM), or a random access memory (Random Access Memory, RAM).
  • the computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and its execution may include the flow of an embodiment of the position prompting method.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
  • for the position prompting device according to the embodiment of the present application, its functional modules may be integrated into one processing chip, or each module may exist separately physically, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.

Abstract

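A position prompting method, apparatus, storage medium, and electronic device are provided. When the electronic device receives a noisy speech signal, it performs noise reduction on the noisy speech signal according to a historical noise signal to obtain a noise-reduced speech signal, and uses the noise-reduced speech signal to perform a prompt operation that indicates its current position.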

Description

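Position Prompting Method, Apparatus, Storage Medium, and Electronic Device
This application claims priority to Chinese patent application No. 201810648187.4, filed with the China Patent Office on June 19, 2018 and entitled "Position Prompting Method, Apparatus, Storage Medium, and Electronic Device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of electronic devices, and in particular to a position prompting method, apparatus, storage medium, and electronic device.
Background
With the development of technology, the ways in which humans and machines interact have become increasingly rich. In the related art, a user can control electronic devices such as mobile phones and tablet computers by voice: after receiving a speech signal uttered by the user, the electronic device parses the speech signal to obtain a control instruction and executes it. For example, when the user cannot find the electronic device, the electronic device can give a position prompt according to the user's speech signal to guide the user to it.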
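Summary
In a first aspect, an embodiment of this application provides a position prompting method, including:
when a noisy speech signal is received, obtaining a historical noise signal corresponding to the noisy speech signal;
obtaining, according to the historical noise signal, a noise signal for the period during which the noisy speech signal is received;
superimposing the noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
obtaining a to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, performing, in a preset manner, a prompt operation that indicates the current position.
In a second aspect, an embodiment of this application provides a position prompting apparatus, including:
a first obtaining module, configured to obtain, when a noisy speech signal is received, a historical noise signal corresponding to the noisy speech signal;
a second obtaining module, configured to obtain, according to the historical noise signal, a noise signal for the period during which the noisy speech signal is received;
a noise reduction module, configured to superimpose the noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
a prompting module, configured to obtain a to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, perform, in a preset manner, a prompt operation that indicates the current position.
In a third aspect, an embodiment of this application provides a storage medium storing a computer program which, when run on a computer, causes the computer to perform:
when a noisy speech signal is received, obtaining a historical noise signal corresponding to the noisy speech signal;
obtaining, according to the historical noise signal, a noise signal for the period during which the noisy speech signal is received;
superimposing the noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
obtaining a to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, performing, in a preset manner, a prompt operation that indicates the current position.
In a fourth aspect, an embodiment of this application provides an electronic device including a processor and a memory, the memory storing a computer program, and the processor, by calling the computer program, being configured to perform:
when a noisy speech signal is received, obtaining a historical noise signal corresponding to the noisy speech signal;
obtaining, according to the historical noise signal, a noise signal for the period during which the noisy speech signal is received;
superimposing the noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
obtaining a to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, performing, in a preset manner, a prompt operation that indicates the current position.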
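Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of this application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the position prompting method provided by an embodiment of this application.
FIG. 2 is a schematic diagram of the setting interface for the prompt mode provided by an embodiment of this application.
FIG. 3 is an example diagram of triggering the electronic device to perform the prompt operation that indicates the current position in an embodiment of this application.
FIG. 4 is an example diagram of the electronic device performing the prompt operation in an embodiment of this application.
FIG. 5 is another schematic flowchart of the position prompting method provided by an embodiment of this application.
FIG. 6 is a schematic structural diagram of the position prompting apparatus provided by an embodiment of this application.
FIG. 7 is a schematic structural diagram of the electronic device provided by an embodiment of this application.
FIG. 8 is another schematic structural diagram of the electronic device provided by an embodiment of this application.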
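Detailed Description
Referring to the drawings, in which the same reference numerals denote the same components, the principles of this application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of this application and should not be regarded as limiting other specific embodiments not detailed herein.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
An embodiment of this application provides a position prompting method. The method may be executed by the position prompting apparatus provided by an embodiment of this application, or by an electronic device integrating the position prompting apparatus, where the apparatus may be implemented in hardware or software. The electronic device may be a smartphone, a tablet computer, a palmtop computer, a notebook computer, a desktop computer, or a similar device.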
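An embodiment of this application provides a position prompting method, including:
when a noisy speech signal is received, obtaining a historical noise signal corresponding to the noisy speech signal;
obtaining, according to the historical noise signal, a noise signal for the period during which the noisy speech signal is received;
superimposing the noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
obtaining a to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, performing, in a preset manner, a prompt operation that indicates the current position.
In an embodiment, obtaining, according to the historical noise signal, the noise signal for the period during which the noisy speech signal is received includes:
performing model training with the historical noise signal as sample data to obtain a noise prediction model;
predicting the noise signal for the reception period according to the noise prediction model.
In an embodiment, performing, in a preset manner, the prompt operation that indicates the current position includes:
obtaining current position information and outputting the position information by voice.
In an embodiment, obtaining the to-be-executed instruction included in the noise-reduced speech signal includes:
sending the noise-reduced speech signal to a server, instructing the server to parse the noise-reduced speech signal and return the speech-parsed text obtained by parsing it;
receiving the speech-parsed text returned by the server, and obtaining the to-be-executed instruction according to the speech-parsed text.
In an embodiment, obtaining the to-be-executed instruction according to the speech-parsed text includes:
performing word segmentation on the speech-parsed text to obtain a word sequence corresponding to the speech-parsed text;
matching instruction keywords against the word sequence to obtain the to-be-executed instruction.
In an embodiment, before sending the noise-reduced speech signal to the server, the method further includes:
determining whether a speech parsing engine exists locally;
if not, sending the noise-reduced speech signal to the server.
In an embodiment, before the step of obtaining the to-be-executed instruction included in the noise-reduced speech signal, the method further includes:
obtaining a voiceprint feature of the noise-reduced speech signal;
determining whether the voiceprint feature matches a preset voiceprint feature;
when the voiceprint feature matches the preset voiceprint feature, obtaining the to-be-executed instruction included in the noise-reduced speech signal.
In an embodiment, determining whether the voiceprint feature matches the preset voiceprint feature includes:
obtaining a similarity between the voiceprint feature and the preset voiceprint feature;
determining whether the similarity is greater than or equal to a first preset similarity;
when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.
In an embodiment, after determining whether the similarity is greater than or equal to the first preset similarity, the method further includes:
when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtaining current position information;
determining, according to the position information, whether the device is currently within a preset position range;
when the device is currently within the preset position range, determining that the voiceprint feature matches the preset voiceprint feature.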
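Referring to FIG. 1, FIG. 1 is a schematic flowchart of the position prompting method provided by an embodiment of this application. As shown in FIG. 1, the flow of the position prompting method may be as follows:
101. When a noisy speech signal is received, obtain the historical noise signal corresponding to the noisy speech signal.
It should be noted that a noisy speech signal is formed by combining a speech signal with an environmental noise signal, and the electronic device can receive an input noisy speech signal in several different ways. For example, when no external microphone is connected, the electronic device can capture external speech through its built-in microphone and treat the captured noisy speech signal as the received noisy speech signal; when an external microphone is connected, the electronic device can capture external speech through the external microphone and treat the captured noisy speech signal as the received noisy speech signal.
When the electronic device receives an input noisy speech signal through a microphone (which may be a built-in or an external microphone), if the microphone is an analog microphone, an analog noisy speech signal is captured, and the electronic device needs to sample it to convert the analog noisy speech signal into a digitized noisy speech signal, for example at a sampling frequency of 16 kHz. If the microphone is a digital microphone, the electronic device receives the digitized noisy speech signal directly through the digital microphone, with no conversion needed.
It is easy to understand that the electronic device receives a noisy speech signal only when a speaker in its environment utters a speech signal; when no speaker in its environment is uttering a speech signal, the electronic device receives only a noise signal. The electronic device buffers both the received noisy speech signals and the received noise signals.
In this embodiment, when a noisy speech signal is received, the electronic device takes the start time of the noisy speech signal as the end time and obtains the noise signal of a preset duration received before the noisy speech signal (the preset duration can be chosen by those skilled in the art according to actual needs and is not specifically limited in this embodiment; for example, it can be set to 500 ms), and uses that noise signal as the historical noise signal corresponding to the noisy speech signal.
For example, if the preset duration is configured as 500 milliseconds and the noisy speech signal starts at 11:04:56.500 on June 12, 2018, the electronic device obtains the 500-millisecond noise signal buffered between 11:04:56.000 and 11:04:56.500 on June 12, 2018, and uses it as the historical noise signal corresponding to the noisy speech signal.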
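The windowing in step 101 can be sketched as a lookup into a buffer of recent samples. This is a minimal illustration only: the `AudioBuffer` class, its absolute sample indexing, and the 5-second capacity are assumptions made for the example and are not described in the application.

```python
from collections import deque


class AudioBuffer:
    """Hypothetical ring buffer that keeps the most recent audio samples."""

    def __init__(self, sample_rate_hz=16000, capacity_s=5.0):
        self.sample_rate_hz = sample_rate_hz
        self.samples = deque(maxlen=int(sample_rate_hz * capacity_s))
        self.total_written = 0  # absolute index of the next sample to arrive

    def write(self, chunk):
        """Append newly captured samples, discarding the oldest if full."""
        self.samples.extend(chunk)
        self.total_written += len(chunk)

    def window_before(self, end_index, duration_ms):
        """Return the samples that immediately precede absolute index end_index.

        end_index plays the role of the start time of the noisy speech signal,
        and duration_ms is the preset duration (for example 500 ms).
        """
        n = self.sample_rate_hz * duration_ms // 1000
        oldest = self.total_written - len(self.samples)  # oldest index still held
        start = max(end_index - n, oldest)
        return list(self.samples)[start - oldest:end_index - oldest]
```

If the buffer no longer holds the full preset duration, this sketch returns the shorter window that is still available rather than failing.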
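102. Obtain, according to the acquired historical noise signal, the noise signal for the period during which the noisy speech signal is received.
After obtaining the historical noise signal corresponding to the noisy speech signal, the electronic device further obtains, from that historical noise signal, the noise signal for the reception period of the noisy speech signal.
For example, the electronic device can use the acquired historical noise signal to predict the noise distribution during the reception of the noisy speech signal, and thereby obtain the noise signal for the reception period.
As another example, because noise is usually stable and changes little over a continuous stretch of time, the electronic device can use the acquired historical noise signal directly as the noise signal for the reception period. If the historical noise signal is longer than the noisy speech signal, a segment with the same duration as the noisy speech signal can be cut from the historical noise signal and used as the noise signal for the reception period; if the historical noise signal is shorter than the noisy speech signal, the historical noise signal can be copied and several copies spliced together to obtain a noise signal with the same duration as the noisy speech signal.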
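The truncate-or-repeat rule described above can be sketched as follows. The function name is hypothetical, and the choice to keep the most recent samples when truncating is an assumption, since the application does not say which segment is cut out.

```python
def fit_noise_to_length(history, target_len):
    """Return a noise estimate with exactly target_len samples.

    A longer history is truncated (assumption: the most recent samples are
    kept); a shorter history is repeated and spliced until it is long enough.
    """
    if not history:
        raise ValueError("historical noise signal is empty")
    if len(history) >= target_len:
        return history[len(history) - target_len:]
    repeats = -(-target_len // len(history))  # ceiling division
    return (history * repeats)[:target_len]
```

For instance, a 500 ms history paired with a 1.2 s utterance would be repeated three times and then trimmed to 1.2 s.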
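103. Superimpose the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal.
After obtaining the noise signal for the reception period of the noisy speech signal, the electronic device first inverts the phase of the acquired noise signal and then superimposes the inverted noise signal on the noisy speech signal, so as to cancel the noise part of the noisy speech signal and obtain the noise-reduced speech signal.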
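Anti-phase superposition on digitized samples amounts to negating the noise estimate and adding it sample by sample. The sketch below assumes both signals are equal-length lists of float samples; the function names are illustrative and not taken from the application.

```python
def invert_phase(noise):
    """Phase inversion of a sampled signal: negate every sample."""
    return [-x for x in noise]


def denoise(noisy_speech, noise_estimate):
    """Superimpose the inverted noise estimate on the noisy speech signal."""
    if len(noisy_speech) != len(noise_estimate):
        raise ValueError("signals must have the same number of samples")
    inverted = invert_phase(noise_estimate)
    return [s + n for s, n in zip(noisy_speech, inverted)]
```

The cancellation is exact only when the estimate equals the actual noise sample for sample; with a predicted or reused noise signal, the noise would presumably be reduced rather than removed.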
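104. Obtain the to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, perform, in a preset manner, the prompt operation that indicates the current position.
After obtaining the noise-reduced speech signal, the electronic device determines whether a speech parsing engine exists locally. If so, the electronic device inputs the noise-reduced speech signal into the local speech parsing engine for speech parsing and obtains the speech-parsed text. Speech parsing of a speech signal is the process of converting the speech signal from "audio" to "text".
In addition, when several speech parsing engines exist locally, the electronic device can select one of them to parse the noise-reduced speech signal in any of the following ways:
First, the electronic device can randomly select one of the local speech parsing engines to parse the noise-reduced speech signal.
Second, the electronic device can select the speech parsing engine with the highest parsing success rate to parse the noise-reduced speech signal.
Third, the electronic device can select the speech parsing engine with the shortest parsing time to parse the noise-reduced speech signal.
Fourth, the electronic device can select, from the speech parsing engines whose parsing success rate reaches a preset success rate, the one with the shortest parsing time to parse the noise-reduced speech signal.
It should be noted that those skilled in the art can also select a speech parsing engine in ways not listed above, or combine several speech parsing engines to parse the noise-reduced speech signal. For example, the electronic device can parse the noise-reduced speech signal with two speech parsing engines at the same time and, when the two engines produce the same speech-parsed text, use that text as the speech-parsed text of the noise-reduced speech signal. As another example, the electronic device can parse the noise-reduced speech signal with at least three speech parsing engines and, when at least two of them produce the same speech-parsed text, use that text as the speech-parsed text of the noise-reduced speech signal.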
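The fourth selection rule and the agreement-based combination can be sketched as below. The `EngineStats` record and the representation of an engine as a callable that maps a signal to text are assumptions made for illustration only.

```python
from collections import Counter, namedtuple

# Hypothetical per-engine statistics; field names are illustrative.
EngineStats = namedtuple("EngineStats", "name success_rate parse_ms")


def select_engine(stats, min_success_rate):
    """Among engines meeting the preset success rate, pick the fastest one."""
    eligible = [s for s in stats if s.success_rate >= min_success_rate]
    if not eligible:
        return None
    return min(eligible, key=lambda s: s.parse_ms).name


def parse_with_agreement(signal, engines, min_agree=2):
    """Run several engines and accept a text only if enough of them agree."""
    texts = [engine(signal) for engine in engines]
    if not texts:
        return None
    text, votes = Counter(texts).most_common(1)[0]
    return text if votes >= min_agree else None
```

Returning `None` when no engine qualifies or no agreement is reached leaves the fallback (for example, sending the signal to a server) to the caller.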
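After the speech-parsed text of the noise-reduced speech signal is obtained, the electronic device further obtains, from the speech-parsed text, the to-be-executed instruction included in the noise-reduced speech signal.
The electronic device stores multiple instruction keywords in advance, and a single instruction keyword or a combination of several instruction keywords corresponds to one instruction. When obtaining the to-be-executed instruction from the speech-parsed text, the electronic device first performs word segmentation on the speech-parsed text to obtain the corresponding word sequence, which contains multiple words.
After obtaining the word sequence corresponding to the speech-parsed text, the electronic device matches instruction keywords against the word sequence, that is, it finds the instruction keywords in the word sequence, obtains the corresponding instruction by matching, and uses the matched instruction as the to-be-executed instruction of the noise-reduced speech signal. The matching of instruction keywords includes exact matching and/or fuzzy matching.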
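A keyword-combination lookup over an already segmented word sequence might look like the sketch below. The instruction table, the instruction name `LOCATION_PROMPT`, the romanized keywords, and the use of `difflib` with a 0.8 cutoff for fuzzy matching are all assumptions; the application does not specify how fuzzy matching is done, and word segmentation itself is assumed to have happened upstream.

```python
import difflib

# Hypothetical instruction table: keyword combination -> instruction name.
INSTRUCTIONS = {("xiaoou", "you", "where"): "LOCATION_PROMPT"}


def match_instruction(words, fuzzy=True, cutoff=0.8):
    """Return the instruction whose keywords all occur in the word sequence."""
    for keywords, instruction in INSTRUCTIONS.items():
        if all(_contains(words, kw, fuzzy, cutoff) for kw in keywords):
            return instruction
    return None


def _contains(words, keyword, fuzzy, cutoff):
    if keyword in words:  # exact match
        return True
    # Fuzzy match: accept a word that is sufficiently similar to the keyword.
    return fuzzy and bool(difflib.get_close_matches(keyword, words, n=1, cutoff=cutoff))
```

With this table, the word sequence for "Xiaoou, where are you" maps to the position-prompt instruction regardless of word order, since only the presence of each keyword is checked.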
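After the to-be-executed instruction included in the noise-reduced speech signal is obtained, if it is recognized as an instruction for triggering a position prompt, the prompt operation that indicates the current position is performed in a preset manner. For example, the instruction for triggering a position prompt corresponds to the instruction keyword combination "Xiaoou" + "you" + "where"; when the user says "Xiaoou, where are you", the electronic device determines that the to-be-executed instruction included in "Xiaoou, where are you" is an instruction for triggering a position prompt.
The manner in which the electronic device performs the prompt operation can be set by default or set according to user input. For example, the default prompt mode of the electronic device is lighting the screen. In addition, referring to FIG. 2, the electronic device provides a setting interface for the prompt mode so that the user can choose a prompt mode according to actual needs. When the user has selected the prompt mode "ring while the screen is lit", if the user's speech "Xiaoou, where are you" is received, the electronic device reminds the user of its current position by lighting the screen and ringing.
As can be seen from the above, in this embodiment the electronic device can, when a noisy speech signal is received, obtain the historical noise signal corresponding to the noisy speech signal; obtain, according to the acquired historical noise signal, the noise signal for the reception period of the noisy speech signal; superimpose the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal; and obtain the to-be-executed instruction included in the noise-reduced speech signal and, when it is an instruction for triggering a position prompt, perform the prompt operation that indicates the current position in a preset manner. In this solution, when a noisy speech signal is received in a noisy environment, noise reduction is performed on it to obtain a noise-reduced speech signal, and the prompt operation that indicates the current position is then performed according to the noise-reduced speech signal. This avoids noise interference and can improve the success rate of triggering the electronic device to give a position prompt.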
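In an implementation, "obtaining, according to the acquired historical noise signal, the noise signal for the period during which the noisy speech signal is received" includes:
(1) performing model training with the acquired historical noise signal as sample data to obtain a noise prediction model;
(2) predicting the noise signal for the reception period of the noisy speech signal according to the noise prediction model.
After obtaining the historical noise signal, the electronic device uses it as sample data and performs model training according to a preset training algorithm to obtain the noise prediction model.
It should be noted that the training algorithm is a machine learning algorithm, which can make predictions about data through continued feature learning; for example, the electronic device can predict the current noise distribution from the historical noise distribution. Machine learning algorithms may include decision tree algorithms, regression algorithms, Bayesian algorithms, neural network algorithms (which may include deep neural network, convolutional neural network, and recursive neural network algorithms), clustering algorithms, and so on. Which training algorithm is used as the preset training algorithm can be chosen by those skilled in the art according to actual needs.
For example, the preset training algorithm configured on the electronic device is a Gaussian mixture model algorithm (described in this application as a kind of regression algorithm). After obtaining the historical noise signal, the electronic device uses it as sample data and trains a model according to the Gaussian mixture model algorithm to obtain a Gaussian mixture model (the noise prediction model includes multiple Gaussian units that describe the noise distribution), and uses this Gaussian mixture model as the noise prediction model. The electronic device then inputs the start time and end time of the reception period of the noisy speech signal into the noise prediction model for processing, and obtains the noise signal for the reception period that the noise prediction model outputs.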
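To make the idea of "multiple Gaussian units describing the noise distribution" concrete, the sketch below fits a one-dimensional Gaussian mixture to noise samples with the expectation-maximization algorithm and then draws new samples from it. It is an illustration under stated assumptions: the application does not specify the features, the number of units, or how the model turns a start and end time into a noise signal, so `fit_gmm_1d`, `sample_noise`, and their parameters are hypothetical.

```python
import math
import random


def fit_gmm_1d(samples, k=2, iters=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    xs = sorted(samples)
    n = len(xs)
    # Deterministic initialisation: spread the initial means over the sorted data.
    means = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each Gaussian unit for each sample.
        resp = []
        for x in xs:
            p = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each unit.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, var_floor)
    return weights, means, variances


def sample_noise(model, n, seed=0):
    """Draw n samples whose statistics follow the fitted mixture."""
    weights, means, variances = model
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        out.append(rng.gauss(means[j], math.sqrt(variances[j])))
    return out
```

Samples drawn this way share the amplitude statistics of the historical noise but are not aligned with the actual noise waveform, so how such a model would feed the anti-phase superposition step is left open here, as it is in the application.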
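In an implementation, "performing, in a preset manner, the prompt operation that indicates the current position" includes:
obtaining current position information and outputting the position information by voice.
To help the user find the electronic device more easily, this embodiment provides a way of performing the prompt operation, as follows:
When performing the prompt operation that indicates the current position in a preset manner, the electronic device first obtains the current position information. The electronic device can tell whether it is outdoors or indoors from the strength of the received satellite positioning signal: when the received signal strength is below a preset threshold, it determines that it is indoors, and when the received signal strength is greater than or equal to the preset threshold, it determines that it is outdoors. When outdoors, the electronic device can obtain the current position information by satellite positioning; when indoors, it can obtain the current position information by indoor positioning. After obtaining the current position information, the electronic device outputs it by voice to indicate its current position.
For example, referring to FIG. 3 and FIG. 4 together, when the user has left the electronic device on a desk in a meeting room, the user can say "Xiaoou, where are you" to trigger the electronic device to perform the prompt operation that indicates the current position. The electronic device receives the noisy speech signal "Xiaoou, where are you + noise", performs noise reduction on it to obtain the noise-reduced speech signal "Xiaoou, where are you", determines that the to-be-executed instruction included in the noise-reduced speech signal is an instruction for triggering a position prompt, obtains the current position information "meeting room", and outputs "I am in the meeting room" by voice, guiding the user to the electronic device.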
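The indoor/outdoor decision and the spoken prompt can be sketched as below. The positioning back ends are passed in as callables because the application names no specific positioning API; the function names, the threshold units, and the wording of the prompt are assumptions.

```python
def classify_environment(signal_strength, threshold):
    """Indoor if the satellite signal is weaker than the preset threshold."""
    return "indoor" if signal_strength < threshold else "outdoor"


def current_position(signal_strength, threshold, satellite_fix, indoor_fix):
    """Pick the positioning method that suits the detected environment.

    satellite_fix and indoor_fix are hypothetical callables that return a
    human-readable place name.
    """
    if classify_environment(signal_strength, threshold) == "outdoor":
        return satellite_fix()
    return indoor_fix()


def spoken_prompt(place):
    """Text that would be handed to a speech synthesiser."""
    return f"I am in the {place}"
```

In the meeting-room example, a weak satellite signal would select the indoor positioning path and produce the text "I am in the meeting room".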
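In an implementation, "obtaining the to-be-executed instruction included in the noise-reduced speech signal" includes:
(1) sending the noise-reduced speech signal to a server, instructing the server to parse the noise-reduced speech signal and return the speech-parsed text obtained by parsing it;
(2) receiving the speech-parsed text returned by the server, and obtaining, according to the received speech-parsed text, the to-be-executed instruction included in the noise-reduced speech signal.
After obtaining the noise-reduced speech signal, the electronic device determines whether a speech parsing engine exists locally. If not, it sends the noise-reduced speech signal to a server (a server that provides a speech parsing service), instructing the server to parse the noise-reduced speech signal and return the speech-parsed text obtained by parsing it.
After receiving the speech-parsed text returned by the server, the electronic device can obtain, according to that text, the to-be-executed instruction included in the noise-reduced speech signal. For how the to-be-executed instruction is obtained from the speech-parsed text, refer to the related description above; it is not repeated here.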
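The local-first, server-fallback decision can be sketched as a small dispatcher. Both `local_engine` and `send_to_server` are hypothetical callables standing in for a local speech parsing engine and a network request; no transport or server API is implied.

```python
def parse_speech(signal, local_engine=None, send_to_server=None):
    """Parse locally when an engine exists, otherwise ask the server."""
    if local_engine is not None:
        return local_engine(signal)
    if send_to_server is None:
        raise RuntimeError("no local engine and no server available")
    return send_to_server(signal)
```

Either path yields speech-parsed text, so the later keyword matching does not need to know where the parsing took place.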
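In an implementation, before "obtaining the to-be-executed instruction included in the noise-reduced speech signal", the method further includes:
(1) obtaining the voiceprint feature of the noise-reduced speech signal;
(2) determining whether the obtained voiceprint feature matches a preset voiceprint feature;
(3) when the obtained voiceprint feature matches the preset voiceprint feature, obtaining the to-be-executed instruction included in the noise-reduced speech signal.
In real life, everyone's voice has its own characteristics when speaking, and people who know each other well can recognize one another by voice alone.
These characteristics of a voice are its voiceprint feature, which is mainly determined by two factors. The first is the size of the acoustic cavities, specifically the throat, nasal cavity, and oral cavity; the shape, size, and position of these organs determine the vocal cord tension and the range of sound frequencies. Therefore, even when different people say the same thing, the frequency distribution of their voices differs: some sound deep and some sound resonant.
The second factor that determines the voiceprint feature is the way the vocal organs are manipulated. The vocal organs include the lips, teeth, tongue, soft palate, and palatal muscles, and their interaction produces clear speech. The way they cooperate is learned incidentally through a person's interactions with the people around them. In the process of learning to speak, a person gradually forms their own voiceprint feature by imitating the way different people around them speak.
In this embodiment, when the electronic device obtains the noise-reduced speech signal, it first obtains the voiceprint feature of the noise-reduced speech signal.
After obtaining the voiceprint feature of the noise-reduced speech signal, the electronic device further compares it with the preset voiceprint feature to determine whether the two match. The preset voiceprint feature may be a voiceprint feature enrolled in advance by the owner; determining whether the voiceprint feature of the noise-reduced speech signal matches the preset voiceprint feature amounts to determining whether the speaker of the noise-reduced speech signal is the owner.
When the obtained voiceprint feature matches the preset voiceprint feature, the electronic device determines that the speaker of the noise-reduced speech signal is the owner and then obtains the to-be-executed instruction included in the noise-reduced speech signal. For details, refer to the related description above; they are not repeated here.
In this embodiment, before the to-be-executed instruction included in the noise-reduced speech signal is obtained, the speaker is identified according to the voiceprint feature of the noise-reduced speech signal, and the to-be-executed instruction is obtained if and only if the speaker of the noise-reduced speech signal is the owner. This prevents the electronic device from performing operations the owner did not intend and improves the owner's experience.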
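In an implementation, "determining whether the obtained voiceprint feature matches the preset voiceprint feature" includes:
(1) obtaining the similarity between the voiceprint feature and the preset voiceprint feature;
(2) determining whether the obtained similarity is greater than or equal to a first preset similarity;
(3) when the obtained similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.
When determining whether the obtained voiceprint feature matches the preset voiceprint feature, the electronic device can obtain the similarity between the two and determine whether it is greater than or equal to the first preset similarity (set according to actual needs; for example, 95%). When the obtained similarity is greater than or equal to the first preset similarity, the electronic device determines that the obtained voiceprint feature matches the preset voiceprint feature; when the obtained similarity is less than the first preset similarity, it determines that the two do not match.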
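In an implementation, after "determining whether the obtained similarity is greater than or equal to the first preset similarity", the method further includes:
(1) when the obtained similarity is less than the first preset similarity and greater than or equal to a second preset similarity, obtaining current position information;
(2) determining, according to the position information, whether the device is currently within a preset position range;
(3) when the device is currently within the preset position range, determining that the obtained voiceprint feature matches the preset voiceprint feature.
It should be noted that, because the voiceprint feature is closely related to a person's physiological characteristics, it can change in daily life: if the user has a cold and an inflamed throat, the voice becomes hoarse and the voiceprint feature changes with it. In that case, even if the speaker of the noisy speech signal received by the electronic device is the owner, the electronic device cannot recognize the owner after performing noise reduction to obtain the noise-reduced speech signal. There are also various other situations in which the electronic device cannot recognize the owner, which are not described here.
To handle situations in which the owner might not be recognized, in this embodiment, after completing the similarity check, if the similarity between the voiceprint feature of the noise-reduced speech signal and the preset voiceprint feature is less than the first preset similarity, the electronic device further determines whether the similarity is greater than or equal to the second preset similarity (the second preset similarity is configured to be less than the first preset similarity, and its value can be chosen by those skilled in the art according to actual needs; for example, when the first preset similarity is set to 95%, the second preset similarity can be set to 75%).
If so, that is, when the similarity between the voiceprint feature of the noise-reduced speech signal and the preset voiceprint feature is less than the first preset similarity and greater than or equal to the second preset similarity, the electronic device further obtains the current position information.
The electronic device can tell whether it is outdoors or indoors from the strength of the received satellite positioning signal: when the received signal strength is below a preset threshold, it determines that it is indoors, and when the received signal strength is greater than or equal to the preset threshold, it determines that it is outdoors. When outdoors, the electronic device can obtain the current position information by satellite positioning; when indoors, it can obtain the current position information by indoor positioning.
After obtaining the current position information, the electronic device determines, according to the position information, whether it is currently within the preset position range. The preset position range can be configured as the range of places the owner frequents, such as home and the workplace.
When it determines that it is currently within the preset position range, the electronic device determines that the voiceprint feature matches the preset voiceprint feature, and the speaker of the noise-reduced speech signal is identified as the owner.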
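The two-threshold decision can be sketched as below. Cosine similarity between feature vectors is an assumption made for the example, since the application does not say how the similarity is computed; the 0.95 and 0.75 defaults mirror the 95% and 75% example values, and `in_preset_area` is a hypothetical callable so that the position is only looked up when the similarity falls in the middle band.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_owner(similarity, in_preset_area, first=0.95, second=0.75):
    """Apply the first threshold, then the position-based fallback."""
    if similarity >= first:
        return True
    if similarity >= second:
        # Middle band: accept only if the device is in a place the owner frequents.
        return in_preset_area()
    return False
```

A hoarse owner at home might score 0.80 and still be accepted, while the same score away from the preset position range would be rejected.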
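In this way, situations in which the owner might not be recognized can be avoided, improving the owner's experience.
The position prompting method of this application is further described below on the basis of the method described in the above embodiments. Referring to FIG. 5, the position prompting method may include:
201. When a noisy speech signal is received, obtain the historical noise signal corresponding to the noisy speech signal.
It should be noted that a noisy speech signal is formed by combining a speech signal with an environmental noise signal, and the electronic device can receive an input noisy speech signal in several different ways. For example, when no external microphone is connected, the electronic device can capture external speech through its built-in microphone and treat the captured noisy speech signal as the received noisy speech signal; when an external microphone is connected, the electronic device can capture external speech through the external microphone and treat the captured noisy speech signal as the received noisy speech signal.
When the electronic device receives an input noisy speech signal through a microphone (which may be a built-in or an external microphone), if the microphone is an analog microphone, an analog noisy speech signal is captured, and the electronic device needs to sample it to convert the analog noisy speech signal into a digitized noisy speech signal, for example at a sampling frequency of 16 kHz. If the microphone is a digital microphone, the electronic device receives the digitized noisy speech signal directly through the digital microphone, with no conversion needed.
It is easy to understand that the electronic device receives a noisy speech signal only when a speaker in its environment utters a speech signal; when no speaker in its environment is uttering a speech signal, the electronic device receives only a noise signal. The electronic device buffers both the received noisy speech signals and the received noise signals.
In this embodiment, when a noisy speech signal is received, the electronic device takes the start time of the noisy speech signal as the end time and obtains the noise signal of a preset duration received before the noisy speech signal (the preset duration can be chosen by those skilled in the art according to actual needs and is not specifically limited in this embodiment; for example, it can be set to 500 ms), and uses that noise signal as the historical noise signal corresponding to the noisy speech signal.
For example, if the preset duration is configured as 500 milliseconds and the noisy speech signal starts at 11:04:56.500 on June 12, 2018, the electronic device obtains the 500-millisecond noise signal buffered between 11:04:56.000 and 11:04:56.500 on June 12, 2018, and uses it as the historical noise signal corresponding to the noisy speech signal.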
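202. Perform model training with the acquired historical noise signal as sample data to obtain a noise prediction model.
After obtaining the historical noise signal, the electronic device uses it as sample data and performs model training according to a preset training algorithm to obtain the noise prediction model.
It should be noted that the training algorithm is a machine learning algorithm, which can make predictions about data through continued feature learning; for example, the electronic device can predict the current noise distribution from the historical noise distribution. Machine learning algorithms may include decision tree algorithms, regression algorithms, Bayesian algorithms, neural network algorithms (which may include deep neural network, convolutional neural network, and recursive neural network algorithms), clustering algorithms, and so on. Which training algorithm is used as the preset training algorithm can be chosen by those skilled in the art according to actual needs.
For example, the preset training algorithm configured on the electronic device is a Gaussian mixture model algorithm. After obtaining the historical noise signal, the electronic device uses it as sample data and trains a model according to the Gaussian mixture model algorithm to obtain a Gaussian mixture model (the noise prediction model includes multiple Gaussian units that describe the noise distribution), and uses this Gaussian mixture model as the noise prediction model.
203. Predict the noise signal for the reception period of the noisy speech signal according to the noise prediction model.
After the noise prediction model is obtained by training, the electronic device inputs the start time and end time of the reception period of the noisy speech signal into the noise prediction model for processing, and obtains the noise signal for the reception period that the noise prediction model outputs.
204. Superimpose the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal.
After obtaining the noise signal for the reception period of the noisy speech signal, the electronic device first inverts the phase of the acquired noise signal and then superimposes the inverted noise signal on the noisy speech signal, so as to cancel the noise part of the noisy speech signal and obtain the noise-reduced speech signal.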
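205. Obtain the voiceprint feature of the noise-reduced speech signal.
In real life, everyone's voice has its own characteristics when speaking, and people who know each other well can recognize one another by voice alone.
These characteristics of a voice are its voiceprint feature, which is mainly determined by two factors. The first is the size of the acoustic cavities, specifically the throat, nasal cavity, and oral cavity; the shape, size, and position of these organs determine the vocal cord tension and the range of sound frequencies. Therefore, even when different people say the same thing, the frequency distribution of their voices differs: some sound deep and some sound resonant.
The second factor that determines the voiceprint feature is the way the vocal organs are manipulated. The vocal organs include the lips, teeth, tongue, soft palate, and palatal muscles, and their interaction produces clear speech. The way they cooperate is learned incidentally through a person's interactions with the people around them. In the process of learning to speak, a person gradually forms their own voiceprint feature by imitating the way different people around them speak.
In this embodiment, when the electronic device obtains the noise-reduced speech signal, it first obtains the voiceprint feature of the noise-reduced speech signal.
206. Determine whether the obtained voiceprint feature matches the preset voiceprint feature.
After obtaining the voiceprint feature of the noise-reduced speech signal, the electronic device further compares it with the preset voiceprint feature to determine whether the two match. The preset voiceprint feature may be a voiceprint feature enrolled in advance by the owner; determining whether the voiceprint feature of the noise-reduced speech signal matches the preset voiceprint feature amounts to determining whether the speaker of the noise-reduced speech signal is the owner.
207. When the obtained voiceprint feature matches the preset voiceprint feature, obtain the to-be-executed instruction included in the noise-reduced speech signal.
When the obtained voiceprint feature matches the preset voiceprint feature, the electronic device determines that the speaker of the noise-reduced speech signal is the owner and then obtains the to-be-executed instruction included in the noise-reduced speech signal.
When obtaining the to-be-executed instruction included in the noise-reduced speech signal, the electronic device first determines whether a speech parsing engine exists locally. If so, the electronic device inputs the noise-reduced speech signal into the local speech parsing engine for speech parsing and obtains the speech-parsed text. Speech parsing of a speech signal is the process of converting the speech signal from "audio" to "text".
After the speech-parsed text of the noise-reduced speech signal is obtained, the electronic device further obtains, from the speech-parsed text, the to-be-executed instruction included in the noise-reduced speech signal.
The electronic device stores multiple instruction keywords in advance, and a single instruction keyword or a combination of several instruction keywords corresponds to one instruction. When obtaining the to-be-executed instruction from the speech-parsed text, the electronic device first performs word segmentation on the speech-parsed text to obtain the corresponding word sequence, which contains multiple words.
After obtaining the word sequence corresponding to the speech-parsed text, the electronic device matches instruction keywords against the word sequence, that is, it finds the instruction keywords in the word sequence, obtains the corresponding instruction by matching, and uses the matched instruction as the to-be-executed instruction of the noise-reduced speech signal. The matching of instruction keywords includes exact matching and/or fuzzy matching.
208. When the obtained to-be-executed instruction is an instruction for triggering a position prompt, perform, in a preset manner, the prompt operation that indicates the current position.
After the to-be-executed instruction included in the noise-reduced speech signal is obtained, if it is recognized as an instruction for triggering a position prompt, the prompt operation that indicates the current position is performed in a preset manner. For example, the instruction for triggering a position prompt corresponds to the instruction keyword combination "Xiaoou" + "you" + "where"; when the user says "Xiaoou, where are you", the electronic device determines that the to-be-executed instruction included in "Xiaoou, where are you" is an instruction for triggering a position prompt.
The manner in which the electronic device performs the prompt operation can be set by default or set according to user input. For example, the default prompt mode of the electronic device is lighting the screen. In addition, referring to FIG. 2, the electronic device provides a setting interface for the prompt mode so that the user can choose a prompt mode according to actual needs. When the user has selected the prompt mode "ring while the screen is lit", if the user's speech "Xiaoou, where are you" is received, the electronic device reminds the user of its current position by lighting the screen and ringing.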
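In an embodiment, a position prompting apparatus is also provided. Referring to FIG. 6, FIG. 6 is a schematic structural diagram of the position prompting apparatus 400 provided by an embodiment of this application. The position prompting apparatus is applied to an electronic device and includes a first obtaining module 401, a second obtaining module 402, a noise reduction module 403, and a prompting module 404, as follows:
The first obtaining module 401 is configured to obtain, when a noisy speech signal is received, the historical noise signal corresponding to the noisy speech signal.
The second obtaining module 402 is configured to obtain, according to the acquired historical noise signal, the noise signal for the reception period of the noisy speech signal.
The noise reduction module 403 is configured to superimpose the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal.
The prompting module 404 is configured to obtain the to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, perform, in a preset manner, the prompt operation that indicates the current position.
In an implementation, the second obtaining module 402 may be configured to:
perform model training with the acquired historical noise signal as sample data to obtain a noise prediction model;
predict the noise signal for the reception period of the noisy speech signal according to the noise prediction model.
In an embodiment, the prompting module 404 may be configured to:
obtain current position information and output the position information by voice.
In an implementation, the prompting module 404 may be configured to:
send the noise-reduced speech signal to a server, instructing the server to parse the noise-reduced speech signal and return the speech-parsed text obtained by parsing it;
receive the speech-parsed text returned by the server, and obtain, according to the received speech-parsed text, the to-be-executed instruction included in the noise-reduced speech signal.
In an embodiment, the prompting module 404 may be configured to:
perform word segmentation on the speech-parsed text to obtain the word sequence corresponding to the speech-parsed text;
match instruction keywords against the word sequence to obtain the to-be-executed instruction.
In an embodiment, the prompting module 404 may be configured to:
determine whether a speech parsing engine exists locally;
if not, send the noise-reduced speech signal to the server.
In an implementation, the prompting module 404 may be configured to:
obtain the voiceprint feature of the noise-reduced speech signal;
determine whether the obtained voiceprint feature matches the preset voiceprint feature;
when the obtained voiceprint feature matches the preset voiceprint feature, obtain the to-be-executed instruction included in the noise-reduced speech signal.
In an implementation, the prompting module 404 may be configured to:
obtain the similarity between the voiceprint feature and the preset voiceprint feature;
determine whether the obtained similarity is greater than or equal to the first preset similarity;
when the obtained similarity is greater than or equal to the first preset similarity, determine that the voiceprint feature matches the preset voiceprint feature.
In an implementation, the prompting module 404 may be configured to:
when the obtained similarity is less than the first preset similarity and greater than or equal to the second preset similarity, obtain current position information;
determine, according to the position information, whether the device is currently within the preset position range;
when the device is currently within the preset position range, determine that the obtained voiceprint feature matches the preset voiceprint feature.
For the steps performed by the modules of the position prompting apparatus 400, refer to the method steps described in the above method embodiments. The position prompting apparatus 400 can be integrated in an electronic device such as a mobile phone or a tablet computer.
In specific implementations, the above modules can each be implemented as an independent entity, or combined arbitrarily and implemented as one or several entities. For the specific implementation of the above modules, refer to the preceding embodiments; details are not repeated here.
As can be seen from the above, when a noisy speech signal is received, the position prompting apparatus of this embodiment can have the first obtaining module 401 obtain the historical noise signal corresponding to the noisy speech signal; the second obtaining module 402 obtain, according to the acquired historical noise signal, the noise signal for the reception period of the noisy speech signal; the noise reduction module 403 superimpose the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal; and the prompting module 404 obtain the to-be-executed instruction included in the noise-reduced speech signal and, when it is an instruction for triggering a position prompt, perform the prompt operation that indicates the current position in a preset manner. In this solution, when a noisy speech signal is received in a noisy environment, noise reduction is performed on it to obtain a noise-reduced speech signal, and the prompt operation that indicates the current position is then performed according to the noise-reduced speech signal. This avoids noise interference and can improve the success rate of triggering the electronic device to give a position prompt.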
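In an embodiment, an electronic device is also provided. Referring to FIG. 7, the electronic device 500 includes a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading the computer program stored in the memory 502 and calling the data stored in the memory 502.
The memory 502 can be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, where the program storage area can store an operating system and the computer programs required for at least one function (such as a sound playback function or an image playback function), and the data storage area can store data created through the use of the electronic device. In addition, the memory 502 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and runs the computer programs stored in the memory 502 to implement various functions, as follows:
when a noisy speech signal is received, obtaining the historical noise signal corresponding to the noisy speech signal;
obtaining, according to the acquired historical noise signal, the noise signal for the reception period of the noisy speech signal;
superimposing the acquired noise signal on the noisy speech signal in anti-phase to obtain a noise-reduced speech signal;
obtaining the to-be-executed instruction included in the noise-reduced speech signal, and when the to-be-executed instruction is an instruction for triggering a position prompt, performing, in a preset manner, the prompt operation that indicates the current position.
Referring also to FIG. 8, in some implementations the electronic device 500 may further include a display 503, a radio frequency circuit 504, an audio circuit 505, and a power supply 506. The display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are each electrically connected to the processor 501.
The display 503 can be used to display information input by the user or information provided to the user, as well as various graphical user interfaces, which can be composed of graphics, text, icons, video, and any combination thereof. The display 503 may include a display panel; in some implementations, the display panel can be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD) or an organic light-emitting diode (Organic Light-Emitting Diode, OLED) display.
The radio frequency circuit 504 can be used to transmit and receive radio frequency signals, so as to establish wireless communication with network devices or other electronic devices and exchange signals with them.
The audio circuit 505 can be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 506 can be used to power the components of the electronic device 500. In some embodiments, the power supply 506 can be logically connected to the processor 501 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
Although not shown in FIG. 8, the electronic device 500 may further include a camera, a Bluetooth module, and the like, which are not described here.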
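In some embodiments, when acquiring, according to the acquired historical noise signal, the noise signal during reception of the noisy speech signal, the processor 501 may perform:
performing model training using the acquired historical noise signal as sample data, to obtain a noise prediction model;
predicting, according to the noise prediction model, the noise signal during reception of the noisy speech signal.
In some embodiments, when performing the prompt operation that prompts the current position in a preset manner, the processor 501 may perform:
acquiring current position information and outputting the position information by voice.
In some embodiments, when acquiring the instruction to be executed that is included in the denoised speech signal, the processor 501 may perform:
sending the denoised speech signal to a server, instructing the server to parse the denoised speech signal and to return the speech parsing text obtained by parsing the denoised speech signal;
receiving the speech parsing text returned by the server, and acquiring, according to the received speech parsing text, the instruction to be executed that is included in the denoised speech signal.
In some embodiments, when acquiring the instruction to be executed according to the speech parsing text, the processor 501 may perform:
performing a word segmentation operation on the speech parsing text to obtain a word sequence corresponding to the speech parsing text;
matching the word sequence against instruction keywords, the match yielding the instruction to be executed.
In some embodiments, before sending the denoised speech signal to the server, the processor 501 may perform:
determining whether a speech parsing engine exists locally;
if not, sending the denoised speech signal to the server.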
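In some embodiments, before acquiring the instruction to be executed that is included in the denoised speech signal, the processor 501 may perform:
acquiring a voiceprint feature of the denoised speech signal;
determining whether the acquired voiceprint feature matches a preset voiceprint feature;
when the acquired voiceprint feature matches the preset voiceprint feature, acquiring the instruction to be executed that is included in the denoised speech signal.
In some embodiments, when determining whether the acquired voiceprint feature matches the preset voiceprint feature, the processor 501 may further perform:
acquiring the similarity between the aforementioned voiceprint feature and the preset voiceprint feature;
determining whether the acquired similarity is greater than or equal to a first preset similarity;
when the acquired similarity is greater than or equal to the first preset similarity, determining that the aforementioned voiceprint feature matches the preset voiceprint feature.
In some embodiments, after determining whether the acquired similarity is greater than or equal to the first preset similarity, the processor 501 may further perform:
when the acquired similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current position information;
determining, according to the position information, whether the device is currently located within a preset position range;
when the device is currently located within the preset position range, determining that the acquired voiceprint feature matches the preset voiceprint feature.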
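An embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to perform the position prompt method of any of the above embodiments, for example: when a noisy speech signal is received, acquiring the historical noise signal corresponding to the noisy speech signal; acquiring, according to the acquired historical noise signal, the noise signal during reception of the noisy speech signal; performing anti-phase superposition of the acquired noise signal and the noisy speech signal to obtain a denoised speech signal; acquiring the instruction to be executed that is included in the denoised speech signal and, when the instruction to be executed is an instruction for triggering a position prompt, performing a prompt operation that prompts the current position in a preset manner.
In the embodiments of the present application, the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
It should be noted that, for the position prompt method of the embodiments of the present application, a person of ordinary skill in the art can understand that all or part of the flow of implementing the position prompt method may be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in the memory of an electronic device, and executed by at least one processor in the electronic device; its execution may include the flow of the embodiments of the position prompt method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
For the position prompt apparatus of the embodiments of the present application, its functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The position prompt method, apparatus, storage medium, and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.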

Claims (20)

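  1. A position prompt method, comprising:
    when a noisy speech signal is received, acquiring a historical noise signal corresponding to the noisy speech signal;
    acquiring, according to the historical noise signal, a noise signal during reception of the noisy speech signal;
    performing anti-phase superposition of the noise signal and the noisy speech signal to obtain a denoised speech signal;
    acquiring an instruction to be executed that is included in the denoised speech signal and, when the instruction to be executed is an instruction for triggering a position prompt, performing a prompt operation that prompts a current position in a preset manner.
  2. The position prompt method according to claim 1, wherein the acquiring, according to the historical noise signal, a noise signal during reception of the noisy speech signal comprises:
    performing model training using the historical noise signal as sample data, to obtain a noise prediction model;
    predicting the noise signal during the reception according to the noise prediction model.
  3. The position prompt method according to claim 1, wherein the performing a prompt operation that prompts a current position in a preset manner comprises:
    acquiring current position information and outputting the position information by voice.
  4. The position prompt method according to claim 1, wherein the acquiring an instruction to be executed that is included in the denoised speech signal comprises:
    sending the denoised speech signal to a server, instructing the server to parse the denoised speech signal and to return speech parsing text obtained by parsing the denoised speech signal;
    receiving the speech parsing text returned by the server, and acquiring the instruction to be executed according to the speech parsing text.
  5. The position prompt method according to claim 4, wherein the acquiring the instruction to be executed according to the speech parsing text comprises:
    performing a word segmentation operation on the speech parsing text to obtain a word sequence corresponding to the speech parsing text;
    matching the word sequence against instruction keywords, the match yielding the instruction to be executed.
  6. The position prompt method according to claim 4, wherein before the sending the denoised speech signal to a server, the method further comprises:
    determining whether a speech parsing engine exists locally;
    if not, sending the denoised speech signal to the server.
  7. The position prompt method according to claim 1, wherein before the step of acquiring an instruction to be executed that is included in the denoised speech signal, the method further comprises:
    acquiring a voiceprint feature of the denoised speech signal;
    determining whether the voiceprint feature matches a preset voiceprint feature;
    when the voiceprint feature matches the preset voiceprint feature, acquiring the instruction to be executed that is included in the denoised speech signal.
  8. The position prompt method according to claim 7, wherein the determining whether the voiceprint feature matches a preset voiceprint feature comprises:
    acquiring a similarity between the voiceprint feature and the preset voiceprint feature;
    determining whether the similarity is greater than or equal to a first preset similarity;
    when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.
  9. The position prompt method according to claim 8, wherein after the determining whether the similarity is greater than or equal to a first preset similarity, the method further comprises:
    when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current position information;
    determining, according to the position information, whether the current position is within a preset position range;
    when the current position is within the preset position range, determining that the voiceprint feature matches the preset voiceprint feature.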
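  10. A position prompt apparatus, comprising:
    a first acquisition module, configured to acquire, when a noisy speech signal is received, a historical noise signal corresponding to the noisy speech signal;
    a second acquisition module, configured to acquire, according to the historical noise signal, a noise signal during reception of the noisy speech signal;
    a noise reduction module, configured to perform anti-phase superposition of the noise signal and the noisy speech signal to obtain a denoised speech signal;
    a prompt module, configured to acquire an instruction to be executed that is included in the denoised speech signal and, when the instruction to be executed is an instruction for triggering a position prompt, to perform a prompt operation that prompts a current position in a preset manner.
  11. A storage medium having a computer program stored thereon, wherein the computer program, when run on a computer, causes the computer to perform:
    when a noisy speech signal is received, acquiring a historical noise signal corresponding to the noisy speech signal;
    acquiring, according to the historical noise signal, a noise signal during reception of the noisy speech signal;
    performing anti-phase superposition of the noise signal and the noisy speech signal to obtain a denoised speech signal;
    acquiring an instruction to be executed that is included in the denoised speech signal and, when the instruction to be executed is an instruction for triggering a position prompt, performing a prompt operation that prompts a current position in a preset manner.
  12. An electronic device, comprising a processor and a memory, the memory storing a computer program, wherein the processor, by calling the computer program, is configured to perform:
    when a noisy speech signal is received, acquiring a historical noise signal corresponding to the noisy speech signal;
    acquiring, according to the historical noise signal, a noise signal during reception of the noisy speech signal;
    performing anti-phase superposition of the noise signal and the noisy speech signal to obtain a denoised speech signal;
    acquiring an instruction to be executed that is included in the denoised speech signal and, when the instruction to be executed is an instruction for triggering a position prompt, performing a prompt operation that prompts a current position in a preset manner.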
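  13. The electronic device according to claim 12, wherein when acquiring, according to the historical noise signal, the noise signal during reception of the noisy speech signal, the processor may perform:
    performing model training using the historical noise signal as sample data, to obtain a noise prediction model;
    predicting the noise signal during the reception according to the noise prediction model.
  14. The electronic device according to claim 12, wherein when performing the prompt operation that prompts the current position in a preset manner, the processor is configured to perform:
    acquiring current position information and outputting the position information by voice.
  15. The electronic device according to claim 12, wherein when acquiring the instruction to be executed that is included in the denoised speech signal, the processor is configured to perform:
    sending the denoised speech signal to a server, instructing the server to parse the denoised speech signal and to return speech parsing text obtained by parsing the denoised speech signal;
    receiving the speech parsing text returned by the server, and acquiring the instruction to be executed according to the speech parsing text.
  16. The electronic device according to claim 15, wherein when acquiring the instruction to be executed according to the speech parsing text, the processor is configured to perform:
    performing a word segmentation operation on the speech parsing text to obtain a word sequence corresponding to the speech parsing text;
    matching the word sequence against instruction keywords, the match yielding the instruction to be executed.
  17. The electronic device according to claim 15, wherein before sending the denoised speech signal to the server, the processor is further configured to perform:
    determining whether a speech parsing engine exists locally;
    if not, sending the denoised speech signal to the server.
  18. The electronic device according to claim 12, wherein before the step of acquiring the instruction to be executed that is included in the denoised speech signal, the processor is further configured to perform:
    acquiring a voiceprint feature of the denoised speech signal;
    determining whether the voiceprint feature matches a preset voiceprint feature;
    when the voiceprint feature matches the preset voiceprint feature, acquiring the instruction to be executed that is included in the denoised speech signal.
  19. The electronic device according to claim 18, wherein when determining whether the voiceprint feature matches the preset voiceprint feature, the processor is configured to perform:
    acquiring a similarity between the voiceprint feature and the preset voiceprint feature;
    determining whether the similarity is greater than or equal to a first preset similarity;
    when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint feature matches the preset voiceprint feature.
  20. The electronic device according to claim 19, wherein after determining whether the similarity is greater than or equal to the first preset similarity, the processor is further configured to perform:
    when the similarity is less than the first preset similarity and greater than or equal to a second preset similarity, acquiring current position information;
    determining, according to the position information, whether the current position is within a preset position range;
    when the current position is within the preset position range, determining that the voiceprint feature matches the preset voiceprint feature.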
PCT/CN2019/085557 2018-06-19 2019-05-05 Position prompt method, apparatus, storage medium, and electronic device WO2019242415A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810648187.4A CN108922523B (zh) 2018-06-19 2018-06-19 Position prompt method, apparatus, storage medium, and electronic device
CN201810648187.4 2018-06-19

Publications (1)

Publication Number Publication Date
WO2019242415A1 true WO2019242415A1 (zh) 2019-12-26

Family

ID=64420994

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/085557 WO2019242415A1 (zh) 2018-06-19 2019-05-05 Position prompt method, apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN108922523B (zh)
WO (1) WO2019242415A1 (zh)

Also Published As

Publication number Publication date
CN108922523A (zh) 2018-11-30
CN108922523B (zh) 2021-06-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19822270

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19822270

Country of ref document: EP

Kind code of ref document: A1