CN108922523B - Position prompting method and device, storage medium and electronic equipment

Info

Publication number
CN108922523B
Authority
CN
China
Prior art keywords
noise
signal
voice signal
preset
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810648187.4A
Other languages
Chinese (zh)
Other versions
CN108922523A (en
Inventor
黄粟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201810648187.4A priority Critical patent/CN108922523B/en
Publication of CN108922523A publication Critical patent/CN108922523A/en
Priority to PCT/CN2019/085557 priority patent/WO2019242415A1/en
Application granted granted Critical
Publication of CN108922523B publication Critical patent/CN108922523B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L21/0202
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/057 Time compression or expansion for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The embodiments of the application disclose a position prompting method, a position prompting device, a storage medium and electronic equipment. When a voice signal with noise is received, the electronic equipment acquires a historical noise signal corresponding to the voice signal with noise, and acquires the noise signal during the receiving period of the voice signal with noise according to that historical noise signal. It then performs reverse phase superposition on the acquired noise signal and the voice signal with noise to obtain a noise reduction voice signal. Finally, it acquires the instruction to be executed included in the noise reduction voice signal and, when that instruction is an instruction for triggering position prompt, executes a prompt operation for prompting the current position according to a preset mode. This scheme can improve the success rate of triggering the electronic equipment to prompt its position.

Description

Position prompting method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to a position prompting method and apparatus, a storage medium, and an electronic device.
Background
At present, with the development of technology, the modes of interaction between humans and machines are becoming richer. In the related art, a user can control electronic equipment such as a mobile phone or a tablet computer by voice; that is, after receiving a voice signal uttered by the user, the electronic equipment can analyze the voice signal to obtain a control instruction and execute the control instruction. For example, when the user cannot find the electronic device, the electronic device may perform a location prompt according to a voice signal of the user to guide the user to find the electronic device. However, when the environment where the electronic device is located is noisy, it is often difficult for the electronic device to separate the user's voice signal from the noise, so it fails to prompt the user with its location.
Disclosure of Invention
The embodiment of the application provides a position prompting method and device, a storage medium and electronic equipment, and can improve the success rate of triggering the electronic equipment to carry out position prompting.
In a first aspect, an embodiment of the present application provides a position prompting method, including:
when a voice signal with noise is received, acquiring a historical noise signal corresponding to the voice signal with noise;
acquiring a noise signal during the receiving period of the voice signal with the noise according to the historical noise signal;
performing reverse phase superposition on the noise signal and the voice signal with the noise to obtain a noise reduction voice signal;
and acquiring a command to be executed included in the noise reduction voice signal, and executing a prompt operation for prompting the current position according to a preset mode when the command to be executed is a command for triggering position prompt.
In a second aspect, an embodiment of the present application provides a position indication apparatus, including:
the first acquisition module is used for acquiring a historical noise signal corresponding to the voice signal with noise when the voice signal with noise is received;
the second acquisition module is used for acquiring a noise signal during the receiving period of the voice signal with noise according to the historical noise signal;
the noise reduction module is used for performing reverse phase superposition on the noise signal and the voice signal with the noise to obtain a noise reduction voice signal;
and the prompt module is used for acquiring the instruction to be executed included in the noise reduction voice signal and executing prompt operation for prompting the current position according to a preset mode when the instruction to be executed is an instruction for triggering position prompt.
In a third aspect, the present application provides a storage medium, on which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the steps in the position prompting method provided by the embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the steps in the position prompting method provided in any embodiment of the present application by calling the computer program.
In the embodiments of the application, when receiving a voice signal with noise, the electronic device acquires the historical noise signal corresponding to the voice signal with noise, and acquires the noise signal during the receiving period of the voice signal with noise according to that historical noise signal. It then performs reverse phase superposition on the acquired noise signal and the voice signal with noise to obtain a noise reduction voice signal, acquires the instruction to be executed included in the noise reduction voice signal, and, when that instruction is an instruction for triggering position prompt, executes a prompt operation for prompting the current position according to a preset mode. In this scheme, when the electronic device receives a voice signal with noise in a noisy environment, it first performs noise reduction on that signal and then executes the prompt operation for prompting the current position according to the resulting noise reduction voice signal, so that noise interference is avoided and the success rate of triggering the electronic device to perform position prompt can be improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic flow chart of a position indication method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram of a setting interface of a prompt mode provided in an embodiment of the present application.
Fig. 3 is an exemplary diagram of triggering an electronic device to perform a prompt operation for prompting a current location in an embodiment of the present application.
Fig. 4 is an exemplary diagram of an electronic device performing a prompt operation in an embodiment of the present application.
Fig. 5 is another schematic flow chart of a location prompting method according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a position indication device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Fig. 8 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
In the description that follows, specific embodiments of the present application will be described with reference to steps and symbolic representations of operations performed by one or more computers, unless otherwise indicated. Accordingly, these steps and operations, which are at times referred to as being computer-executed, include the manipulation by the processing unit of the computer of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular characteristics defined by the data format. However, while the principles of the application are described in the foregoing terms, this is not intended to be limiting, and those of ordinary skill in the art will recognize that various of the steps and operations described below may also be implemented in hardware.
The term module, as used herein, may be considered a software object executing on the computing system. The various components, modules, engines, and services described herein may be viewed as objects implemented on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules listed, but rather, some embodiments may include other steps or modules not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
An execution main body of the position prompting method may be the position prompting device provided in the embodiment of the present application, or an electronic device integrated with the position prompting device, where the position prompting device may be implemented in a hardware or software manner. The electronic device may be a smart phone, a tablet computer, a palm computer, a notebook computer, or a desktop computer.
Referring to fig. 1, fig. 1 is a schematic flow chart of a position prompting method according to an embodiment of the present disclosure. As shown in fig. 1, a flow of the position prompting method provided in the embodiment of the present application may be as follows:
101. When a voice signal with noise is received, acquiring a historical noise signal corresponding to the voice signal with noise.
It should be noted that the voice signal with noise is formed by combining a voice signal and an environmental noise signal, and the electronic device may receive the input voice signal with noise in a plurality of different ways, for example, when the electronic device is not externally connected with a microphone, the electronic device may collect external voice through a built-in microphone, and use the collected voice signal with noise as the received voice signal with noise; for another example, when a microphone is externally connected to the electronic device, the electronic device may collect external voice through the externally connected microphone, and use the collected voice signal with noise as the received voice signal with noise.
When the electronic device receives an input noisy speech signal through a microphone (which may be a built-in microphone or an external microphone), the handling depends on the microphone type. If the microphone is an analog microphone, an analog noisy speech signal is collected, and the electronic device needs to sample it to convert it into a digital noisy speech signal; for example, the sampling frequency may be 16 kHz. If the microphone is a digital microphone, the electronic device receives the digitized noisy speech signal directly through the digital microphone, without conversion.
It is easily understood that the electronic device will receive the noisy speech signal when a speaker in the environment of the electronic device utters the speech signal, and will only receive the noise signal when no speaker in the environment of the electronic device utters the speech signal. Wherein the electronic device will buffer the received noisy speech signal and the noise signal.
In this embodiment of the present application, when receiving a noisy speech signal, the electronic device takes the start time of the noisy speech signal as the end time, obtains the noise signal of a preset duration that was received before the noisy speech signal, and uses this noise signal as the historical noise signal corresponding to the noisy speech signal. A person skilled in the art may choose a suitable value for the preset duration according to actual needs, which is not specifically limited in this embodiment of the present application; for example, it may be set to 500 ms.
For example, suppose the preset duration is configured as 500 ms and the noisy speech signal starts at 11:04:56.500 on June 12, 2018. The electronic device then acquires the 500 ms of noise signal buffered from 11:04:56.000 to 11:04:56.500 on June 12, 2018, and uses it as the historical noise signal corresponding to the noisy speech signal.
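The buffering described above can be sketched as a fixed-size ring buffer that always holds the most recent preset duration of audio. This is only an illustrative sketch: the class name `NoiseHistoryBuffer`, its methods, and the use of the 16 kHz sampling frequency for sizing are assumptions, not part of the application.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000  # example sampling frequency (assumed for sizing)
HISTORY_MS = 500         # example preset duration of the historical window


class NoiseHistoryBuffer:
    """Hypothetical ring buffer holding the most recent HISTORY_MS of samples."""

    def __init__(self, sample_rate_hz=SAMPLE_RATE_HZ, history_ms=HISTORY_MS):
        # Number of samples that fit in the preset duration.
        self.capacity = sample_rate_hz * history_ms // 1000
        self._samples = deque(maxlen=self.capacity)

    def push(self, samples):
        """Append newly captured samples; the oldest ones drop off automatically."""
        self._samples.extend(samples)

    def snapshot(self):
        """Return the buffered window; call this at the moment speech starts."""
        return list(self._samples)


buf = NoiseHistoryBuffer()
buf.push([0.01] * 12_000)          # 750 ms of background noise arrives
historical_noise = buf.snapshot()  # only the last 500 ms is retained
```

Taking the snapshot at the start time of the noisy speech signal yields a window whose end time coincides with that start time.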
102. And acquiring a noise signal during the receiving period of the voice signal with noise according to the acquired historical noise signal.
After acquiring the historical noise signal corresponding to the voice signal with noise, the electronic equipment further acquires the noise signal during the receiving period of the voice signal with noise according to the acquired historical noise signal.
For example, the electronic device may predict the noise distribution during the reception of the noisy speech signal according to the acquired historical noise signal, thereby obtaining the noise signal during the reception of the noisy speech signal.
For another example, in consideration of noise stability, noise change in continuous time is usually small, and the electronic device may use the acquired historical noise signal as a noise signal during the reception of the noisy speech signal, wherein if the duration of the historical noise signal is greater than the duration of the noisy speech signal, a noise signal having the same duration as the noisy speech signal may be intercepted from the historical noise signal as the noise signal during the reception of the noisy speech signal; if the duration of the historical noise signal is less than the duration of the voice signal with noise, the historical noise signal can be copied, and a plurality of historical noise signals are spliced to obtain a noise signal with the same duration as the voice signal with noise as the noise signal during the receiving period of the voice signal with noise.
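The truncate-or-splice rule above can be sketched in a few lines. The function name `fit_noise_to_length` is hypothetical, and taking the leading samples when truncating is an assumption, since the description does not say which part of the historical noise is intercepted.

```python
def fit_noise_to_length(historical_noise, target_len):
    """Return a noise estimate with exactly target_len samples.

    If the history is longer than the target, its leading part is kept
    (assumption); if it is shorter, copies are spliced end to end and the
    result is cut to length.
    """
    if not historical_noise:
        raise ValueError("historical noise must not be empty")
    if target_len <= 0:
        return []
    repeats = -(-target_len // len(historical_noise))  # ceiling division
    return (list(historical_noise) * repeats)[:target_len]


spliced = fit_noise_to_length([1, 2, 3], 7)    # history shorter than speech
trimmed = fit_noise_to_length([1, 2, 3, 4], 2)  # history longer than speech
```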
103. And performing reverse phase superposition on the acquired noise signal and the voice signal with the noise to obtain a noise reduction voice signal.
After the noise signal during the receiving period of the voice signal with noise is acquired, the electronic equipment first performs phase inversion on the acquired noise signal, then superposes the phase-inverted noise signal on the voice signal with noise, thereby eliminating the noise part in the voice signal with noise and obtaining the noise reduction voice signal.
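A minimal time-domain sketch of reverse phase superposition follows. The function names are hypothetical, and the example uses an idealized case in which the noise estimate equals the actual noise exactly; with an estimate taken from historical noise, the cancellation would only be approximate.

```python
def invert_phase(signal):
    """Phase inversion: flip the sign of every sample."""
    return [-s for s in signal]


def superpose(a, b):
    """Sample-wise sum of two signals of equal length."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return [x + y for x, y in zip(a, b)]


speech = [0.0, 0.5, -0.5, 0.25]        # clean speech (illustrative values)
noise = [0.125, -0.125, 0.25, 0.0]     # noise during the receiving period
noisy_speech = superpose(speech, noise)

# Superposing the inverted noise estimate cancels the noise component.
denoised = superpose(noisy_speech, invert_phase(noise))
```

In this idealized case `denoised` equals `speech`, because adding the inverted noise is the same as subtracting the noise sample by sample.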
104. And acquiring a command to be executed included in the noise reduction voice signal, and executing a prompt operation for prompting the current position according to a preset mode when the command to be executed is a command for triggering position prompt.
After the noise-reduction voice signal is obtained, the electronic equipment judges whether a voice analysis engine exists locally, if so, the electronic equipment inputs the noise-reduction voice signal into the local voice analysis engine for voice analysis, and a voice analysis text is obtained. The voice analysis is performed on the voice signal, that is, the voice signal is converted from "audio" to "text".
Furthermore, when a plurality of speech analysis engines exist locally, the electronic device may select one speech analysis engine from the plurality of speech analysis engines to perform speech analysis on the noise-reduced speech signal in the following manner:
firstly, the electronic device can randomly select one voice analysis engine from a plurality of local voice analysis engines to perform voice analysis on the received noise-reduced voice signal.
And secondly, the electronic equipment can select the voice analysis engine with the highest analysis success rate from the plurality of voice analysis engines and carry out voice analysis on the received noise reduction voice signal.
And thirdly, the electronic equipment can select the voice analysis engine with the shortest analysis time length from the plurality of voice analysis engines and carry out voice analysis on the received noise-reduction voice signal.
Fourthly, the electronic equipment can also select a voice analysis engine with the analysis success rate reaching the preset success rate and the shortest analysis time length from the plurality of voice analysis engines to perform voice analysis on the received noise reduction voice signals.
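The fourth selection strategy (success rate at or above a preset value, then shortest analysis time) can be sketched as below. The `EngineStats` record, its field names, and the fallback used when no engine qualifies are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EngineStats:
    """Hypothetical per-engine statistics kept by the device."""
    name: str
    success_rate: float   # fraction of successful analyses, 0.0 to 1.0
    avg_parse_ms: float   # average analysis duration in milliseconds


def pick_engine(engines, min_success_rate):
    """Pick the fastest engine among those meeting the preset success rate."""
    qualified = [e for e in engines if e.success_rate >= min_success_rate]
    if not qualified:
        # Assumed fallback: use the engine with the highest success rate.
        return max(engines, key=lambda e: e.success_rate)
    return min(qualified, key=lambda e: e.avg_parse_ms)


engines = [
    EngineStats("a", success_rate=0.99, avg_parse_ms=300.0),
    EngineStats("b", success_rate=0.95, avg_parse_ms=120.0),
    EngineStats("c", success_rate=0.80, avg_parse_ms=60.0),
]
chosen = pick_engine(engines, min_success_rate=0.9)
```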
It should be noted that, a person skilled in the art may also select a speech analysis engine according to a manner not listed above, or may perform speech analysis on the noise-reduced speech signal by combining a plurality of speech analysis engines, for example, the electronic device may perform speech analysis on the noise-reduced speech signal by using two speech analysis engines at the same time, and when speech analysis texts obtained by the two speech analysis engines are the same, use the same speech analysis text as a speech analysis text of the noise-reduced speech signal; for another example, the electronic device may perform speech analysis on the noise-reduced speech signal through at least three speech analysis engines, and when speech analysis texts obtained by at least two of the speech analysis engines are the same, use the same speech analysis text as a speech analysis text of the noise-reduced speech signal.
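Combining several engines as described above amounts to accepting a text only when enough engines agree on it. The sketch below assumes each engine's output is already available as a string; the function name `agree_on_text` and the `min_votes` parameter are hypothetical.

```python
from collections import Counter


def agree_on_text(texts, min_votes=2):
    """Return the text produced by at least min_votes engines, else None.

    With two engines this requires both to agree; with three or more it
    requires at least two identical results. If several texts tie, the one
    seen first is returned (a choice made for this sketch).
    """
    if not texts:
        return None
    text, votes = Counter(texts).most_common(1)[0]
    return text if votes >= min_votes else None


two_agree = agree_on_text(["where are you", "where are you"])
two_differ = agree_on_text(["where are you", "wear are you"])
three_vote = agree_on_text(["where are you", "wear are you", "where are you"])
```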
After the voice analysis text of the noise reduction voice signal is obtained through analysis, the electronic equipment further obtains a command to be executed included in the noise reduction voice signal from the voice analysis text.
The electronic equipment stores a plurality of instruction keywords in advance, and a single instruction keyword or a plurality of instruction keyword combinations correspond to one instruction. When the to-be-executed instruction included in the noise reduction voice signal is obtained from the voice analysis text obtained through analysis, the electronic equipment firstly carries out word segmentation operation on the voice analysis text to obtain a word sequence corresponding to the voice analysis text, and the word sequence includes a plurality of words.
After the word sequence corresponding to the voice analysis text is obtained, the electronic device matches instruction keywords with the word sequence, that is, the instruction keywords in the word sequence are found out, so that a corresponding instruction is obtained through matching, and the instruction obtained through matching is used as an instruction to be executed of the noise reduction voice signal. Wherein the matching search of the instruction keywords comprises complete matching and/or fuzzy matching.
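The word segmentation and keyword matching steps can be sketched as follows. Everything here is illustrative: the keyword table uses English stand-ins, segmentation is plain whitespace splitting (Chinese text would need a real word segmenter), and fuzzy matching is done with the standard library's `difflib.get_close_matches`, which is one possible choice rather than the method of the application.

```python
import difflib
import string

# Hypothetical table: instruction name -> keyword combination.
INSTRUCTION_KEYWORDS = {
    "position_prompt": ("xiaoou", "you", "where"),
    "play_music": ("play", "music"),
}


def segment(text):
    """Naive word segmentation: lowercase, drop punctuation, split on spaces."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def keyword_present(keyword, words, cutoff):
    """Complete match first, then fuzzy match above the similarity cutoff."""
    if keyword in words:
        return True
    return bool(difflib.get_close_matches(keyword, words, n=1, cutoff=cutoff))


def match_instruction(text, fuzzy_cutoff=0.8):
    """Return the first instruction whose keywords all occur in the text."""
    words = segment(text)
    for instruction, keywords in INSTRUCTION_KEYWORDS.items():
        if all(keyword_present(kw, words, fuzzy_cutoff) for kw in keywords):
            return instruction
    return None


exact = match_instruction("Xiaoou, where are you?")
fuzzy = match_instruction("Xiaou where are you")  # misrecognized wake word
none = match_instruction("turn on the light")
```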
After the instruction to be executed included in the noise reduction voice signal is obtained, if it is recognized as an instruction for triggering position prompt, the prompt operation for prompting the current position is executed according to a preset mode. For example, the instruction for triggering the position prompt corresponds to the instruction keyword combination "Xiao Ou" + "you" + "where"; when the user says "Xiao Ou, where are you", the electronic device determines that the instruction to be executed included in "Xiao Ou, where are you" is the instruction for triggering the position prompt.
The mode in which the electronic device executes the prompt operation can be set by default or according to data input by a user. For example, the default prompting mode of the electronic device is lighting up the screen. In addition, referring to fig. 2, the electronic device provides a setting interface for the prompting mode so that the user can select a prompting mode according to actual needs. When the user has selected the prompting mode "ring while lighting up the screen" and the voice "Xiao Ou, where are you" uttered by the user is received, the electronic device prompts the user with its current position by lighting up the screen and ringing.
As can be seen from the above, in the embodiments of the present application, when receiving a noisy speech signal, the electronic device acquires the historical noise signal corresponding to the noisy speech signal, and acquires the noise signal during the receiving period of the noisy speech signal according to that historical noise signal. It then performs reverse phase superposition on the acquired noise signal and the noisy speech signal to obtain a noise reduction speech signal, acquires the instruction to be executed included in the noise reduction speech signal, and, when that instruction is an instruction for triggering position prompt, executes a prompt operation for prompting the current position according to a preset mode. In this scheme, when a noisy speech signal is received in a noisy environment, noise reduction is first performed on it and the prompt operation for prompting the current position is then executed according to the resulting noise reduction speech signal, so that noise interference is avoided and the success rate of triggering the electronic device to prompt its position can be improved.
In one embodiment, the "acquiring a noise signal during reception of a noisy speech signal from an acquired historical noise signal" includes:
(1) performing model training by taking the acquired historical noise signal as sample data to obtain a noise prediction model;
(2) the noise signal during reception of the noisy speech signal is predicted from the noise prediction model.
After the electronic equipment acquires the historical noise signal, the historical noise signal is used as sample data, model training is carried out according to a preset training algorithm, and a noise prediction model is obtained.
It should be noted that the training algorithm is a machine learning algorithm, and the machine learning algorithm may predict data by continuously performing feature learning, for example, the electronic device may predict a current noise distribution according to a historical noise distribution. Wherein the machine learning algorithm may include: decision tree algorithm, regression algorithm, bayesian algorithm, neural network algorithm (which may include deep neural network algorithm, convolutional neural network algorithm, recursive neural network algorithm, etc.), clustering algorithm, etc., and the selection of which training algorithm to use as the preset training algorithm for model training may be selected by those skilled in the art according to actual needs.
For example, suppose the preset training algorithm configured for the electronic device is a Gaussian mixture model algorithm (classed here as a regression algorithm). After a historical noise signal is obtained, the historical noise signal is used as sample data and model training is performed according to the Gaussian mixture model algorithm to obtain a Gaussian mixture model (which includes a plurality of Gaussian units and is used for describing the noise distribution), and this Gaussian mixture model is used as the noise prediction model. The electronic device then inputs the start time and end time of the receiving period of the noisy speech signal into the noise prediction model, which processes them and outputs the noise signal during the receiving period of the noisy speech signal.
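The train-then-predict flow can be illustrated with a deliberately simplified stand-in: a single Gaussian fitted with the standard library's `statistics.NormalDist`, rather than a mixture of several Gaussian units. The function names, the 16 kHz rate used to turn the start and end times into a sample count, and the fixed seed are all assumptions for this sketch.

```python
from statistics import NormalDist


def train_noise_model(historical_noise):
    """Fit one Gaussian to the historical noise samples (simplified model)."""
    return NormalDist.from_samples(historical_noise)


def predict_noise(model, start_s, end_s, sample_rate_hz=16_000, seed=0):
    """Draw a noise signal covering the receiving period [start_s, end_s)."""
    n_samples = round((end_s - start_s) * sample_rate_hz)
    return model.samples(n_samples, seed=seed)


history = [0.0, 0.5, -0.5, 0.25, -0.25]   # illustrative historical noise
model = train_noise_model(history)
predicted = predict_noise(model, start_s=1.0, end_s=1.25)
```

A model of this kind reproduces only the statistical distribution of the noise, not its waveform, so it is best read as an illustration of the data flow rather than of the achievable cancellation.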
In one embodiment, the "performing a prompting operation for prompting a current location in a preset manner" includes:
acquiring current position information and outputting the position information in a voice mode.
In order to help a user find the electronic device more easily, an embodiment of the present application provides the following way of performing the prompt operation.
When the electronic device performs a prompt operation for prompting the current position according to the preset mode, it first acquires current position information. The electronic device may identify whether it is currently in an outdoor environment or an indoor environment according to the strength of the received satellite positioning signal: when the strength is lower than a preset threshold, it determines that it is in an indoor environment, and when the strength is higher than or equal to the preset threshold, it determines that it is in an outdoor environment. When in an outdoor environment, the electronic device may acquire the current position information by using a satellite positioning technology; when in an indoor environment, it may acquire the current position information by using an indoor positioning technology. After the current position information is acquired, the electronic device outputs the acquired position information by voice to prompt its current position.
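The threshold rule for choosing a positioning technology can be sketched as below. The function name, the returned labels, and the numeric values are hypothetical; the description fixes neither the unit of signal strength nor the threshold.

```python
def choose_positioning(satellite_signal_strength, threshold):
    """Pick a positioning technology from the satellite signal strength.

    Below the preset threshold the device is treated as indoors; at or
    above it, as outdoors.
    """
    if satellite_signal_strength < threshold:
        return "indoor_positioning"
    return "satellite_positioning"


weak = choose_positioning(18.0, threshold=30.0)
at_threshold = choose_positioning(30.0, threshold=30.0)
strong = choose_positioning(42.0, threshold=30.0)
```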
For example, referring to fig. 3 and fig. 4 together, when the user has left the electronic device on a desk in a conference room, the user can say "Xiao Ou, where are you" to trigger the electronic device to perform the prompt operation for prompting the current location. Correspondingly, the electronic device receives the noisy speech signal "Xiao Ou, where are you" + noise, performs noise reduction on it to obtain the noise-reduced speech signal "Xiao Ou, where are you", determines that the instruction to be executed included in the noise-reduced speech signal is an instruction for triggering position prompt, acquires the current position information "conference room", and outputs "I am in the conference room" by voice, guiding the user and helping the user find the electronic device.
In one embodiment, the "acquiring the instruction to be executed included in the noise reduction voice signal" includes:
(1) sending the noise reduction voice signal to a server, instructing the server to analyze the noise reduction voice signal, and returning a voice analysis text obtained by analyzing the noise reduction voice signal;
(2) and receiving a voice analysis text returned by the server, and acquiring a to-be-executed instruction included in the noise reduction voice signal according to the received voice analysis text.
After obtaining the noise-reduction voice signal, the electronic device judges whether a voice analysis engine exists locally, if not, the obtained noise-reduction voice signal is sent to a server (the server is a server providing voice analysis service), the server is instructed to analyze the noise-reduction voice signal, and a voice analysis text obtained by analyzing the noise-reduction voice signal is returned.
After receiving the voice analysis text returned by the server, the electronic device can obtain the instruction to be executed included in the noise reduction voice signal according to the voice analysis text. For how to obtain the instruction to be executed from the voice parsing text, the above related description may be specifically referred to, and details are not repeated here.
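The local-engine check and server fallback can be sketched as below. The callables `local_engine` and `send_to_server` are hypothetical stand-ins, since the text does not define the engine or the server interface.

```python
from typing import Callable, Optional, Sequence

Signal = Sequence[float]


def parse_speech(
    noise_reduced: Signal,
    local_engine: Optional[Callable[[Signal], str]],
    send_to_server: Callable[[Signal], str],
) -> str:
    """Return the voice analysis text for a noise reduction voice signal.

    If a local voice analysis engine exists, use it; otherwise send the
    signal to the server and return the text that the server sends back.
    """
    if local_engine is not None:
        return local_engine(noise_reduced)
    return send_to_server(noise_reduced)


# Stand-in engines for illustration only.
text = parse_speech([0.1, 0.2], None, lambda s: "text from server")
print(text)
```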
In an embodiment, before "acquiring the instruction to be executed included in the noise reduction voice signal", the method further includes:
(1) acquiring the voiceprint characteristics of the noise reduction voice signal;
(2) judging whether the acquired voiceprint features are matched with preset voiceprint features or not;
(3) and when the acquired voiceprint feature is matched with the preset voiceprint feature, acquiring a to-be-executed instruction included in the noise reduction voice signal.
In real life, every person's speech has its own characteristics, and people who are familiar with one another can tell each other apart by voice alone.
These characteristics of the voice are the voiceprint features, which are mainly determined by two factors. The first is the size of the vocal cavities, specifically including the throat, nasal cavity, oral cavity and the like; the shape, size and position of these organs determine the tension of the vocal cords and the range of vocal frequencies. Therefore, even when different people say the same words, the frequency distribution of their voices differs, so that some voices sound deep and others sound resonant.
The second factor that determines the voiceprint features is the manner in which the vocal organs, including the lips, teeth, tongue, soft palate and palatal muscles, are manipulated; their interaction produces clear speech. The manner in which these organs cooperate is learned, in a largely random way, through a person's interaction with surrounding people after birth. In the process of learning to speak, a person gradually forms his or her own voiceprint features by imitating the speaking manners of different people nearby.
In the embodiment of the application, when the electronic device obtains the noise reduction voice signal, the voiceprint feature of the noise reduction voice signal is firstly obtained.
After acquiring the voiceprint feature of the noise reduction voice signal, the electronic device further compares the acquired voiceprint feature with a preset voiceprint feature to judge whether the voiceprint feature is matched with the preset voiceprint feature. The preset voiceprint feature can be a voiceprint feature pre-recorded by the owner, and whether the voiceprint feature of the noise reduction voice signal is matched with the preset voiceprint feature is judged, namely whether a speaker corresponding to the noise reduction voice signal is the owner is judged.
When the obtained voiceprint feature matches the preset voiceprint feature, the electronic device determines that the speaker corresponding to the noise reduction voice signal is the owner, and at this time, obtains the instruction to be executed included in the noise reduction voice signal, which may specifically refer to the above description, and is not described here any more.
According to the method and the device, before the to-be-executed instruction included in the noise reduction voice signal is obtained, the identity of a speaker is identified according to the voiceprint feature of the noise reduction voice signal, and the to-be-executed instruction included in the noise reduction voice signal is obtained only when the speaker corresponding to the noise reduction voice signal is the owner. Therefore, the electronic equipment can be prevented from executing operations which are not intended by the owner, and the use experience of the owner is improved.
In one embodiment, the "determining whether the obtained voiceprint features match the preset voiceprint features" includes:
(1) acquiring the similarity of the voiceprint characteristics and preset voiceprint characteristics;
(2) judging whether the acquired similarity is greater than or equal to a first preset similarity or not;
(3) and when the acquired similarity is greater than or equal to a first preset similarity, determining that the voiceprint features are matched with the preset voiceprint features.
When judging whether the obtained voiceprint features are matched with the preset voiceprint features, the electronic device may obtain the similarity between the voiceprint features and the preset voiceprint features, and judge whether the obtained similarity is greater than or equal to a first preset similarity (set according to actual needs, for example, may be set to 95%). When the obtained similarity is greater than or equal to a first preset similarity, the obtained voiceprint feature is determined to be matched with the preset voiceprint feature, and when the obtained similarity is smaller than the first preset similarity, the obtained voiceprint feature is determined to be not matched with the preset voiceprint feature.
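The threshold comparison can be sketched as follows. The text does not say how the similarity is computed, so cosine similarity over feature vectors is used here purely as an assumption; the 0.95 value mirrors the 95% example above.

```python
import math
from typing import Sequence

FIRST_PRESET_SIMILARITY = 0.95  # example value from the text (95%)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def voiceprint_matches(
    feature: Sequence[float],
    preset_feature: Sequence[float],
    threshold: float = FIRST_PRESET_SIMILARITY,
) -> bool:
    """Match when the similarity is greater than or equal to the threshold."""
    return cosine_similarity(feature, preset_feature) >= threshold
```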
In an embodiment, after "determining whether the obtained similarity is greater than or equal to a first preset similarity", the method further includes:
(1) when the obtained similarity is smaller than a first preset similarity and larger than or equal to a second preset similarity, obtaining current position information;
(2) judging whether the current position is within a preset position range or not according to the position information;
(3) and when the current position is within the preset position range, determining that the acquired voiceprint features are matched with the preset voiceprint features.
It should be noted that voiceprint features are closely related to the physiological characteristics of the human body. In daily life, if a user catches a cold or has an inflamed throat, the user's voice becomes hoarse and the voiceprint features change accordingly. In this case, even if the speaker corresponding to the noisy speech signal received by the electronic device is the owner, the electronic device cannot recognize the speaker as the owner from the noise reduction voice signal obtained by performing noise reduction processing on the noisy speech signal. In addition, various other situations may cause the electronic device to be unable to identify the owner, and the details are not described here.
In order to solve the possible situation that the owner cannot be identified, in this embodiment of the application, after the electronic device completes the judgment of the similarity of the voiceprint feature, if the similarity of the voiceprint feature of the noise-reduced speech signal and the preset voiceprint feature is smaller than the first preset similarity, it is further judged whether the similarity is greater than or equal to a second preset similarity (the second preset similarity is configured to be smaller than the first preset similarity, specifically, a suitable value may be taken by a person skilled in the art according to actual needs, for example, when the first preset similarity is set to 95%, the second preset similarity may be set to 75%).
And when the judgment result is yes, namely the similarity between the voiceprint feature of the noise reduction voice signal and the preset voiceprint feature is smaller than the first preset similarity and larger than or equal to the second preset similarity, the electronic equipment further acquires the current position information.
When the mobile terminal is in an outdoor environment (the electronic device may identify whether the mobile terminal is currently in the outdoor environment or in an indoor environment according to the strength of the received satellite positioning signal, for example, when the strength of the received satellite positioning signal is lower than a preset threshold, the mobile terminal is determined to be in the indoor environment, and when the strength of the received satellite positioning signal is higher than or equal to the preset threshold, the mobile terminal is determined to be in the outdoor environment), the electronic device may acquire current location information by using a satellite positioning technology, and when the mobile terminal is in the indoor environment, the electronic device may acquire the current location information by using the indoor positioning technology.
After the current position information is acquired, the electronic equipment judges whether the current position is within a preset position range according to the position information. The preset location range may be configured as a common location range of the owner, such as home and company.
When the current position is judged to be within the preset position range, the electronic device determines that the voiceprint feature matches the preset voiceprint feature, and identifies the speaker corresponding to the noise reduction voice signal as the owner.
In this way, the situation in which the owner cannot be identified is avoided, and the use experience of the owner is improved.
The position prompting method of the present application will be further described below on the basis of the methods described in the above embodiments. Referring to fig. 5, the position prompting method may include:
201. When a voice signal with noise is received, acquiring a historical noise signal corresponding to the voice signal with noise.
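The two-threshold decision with the location fallback can be sketched as below. The function and parameter names are hypothetical; the 0.95 and 0.75 defaults mirror the example values in the text, and the position check is passed as a callable so that position information is only acquired when the similarity falls between the two thresholds.

```python
from typing import Callable

def is_owner(
    similarity: float,
    in_preset_position_range: Callable[[], bool],
    first_preset: float = 0.95,
    second_preset: float = 0.75,
) -> bool:
    """Decide whether the speaker is treated as the owner.

    - similarity >= first_preset: match.
    - second_preset <= similarity < first_preset: match only if the
      current position lies within the preset position range.
    - otherwise: no match.
    """
    if similarity >= first_preset:
        return True
    if similarity >= second_preset:
        return in_preset_position_range()
    return False


# A hoarse owner at home: similarity 0.80, position within the preset range.
print(is_owner(0.80, lambda: True))
```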
It should be noted that the voice signal with noise is formed by combining a voice signal and an environmental noise signal, and the electronic device may receive the input voice signal with noise in a plurality of different ways, for example, when the electronic device is not externally connected with a microphone, the electronic device may collect external voice through a built-in microphone, and use the collected voice signal with noise as the received voice signal with noise; for another example, when a microphone is externally connected to the electronic device, the electronic device may collect external voice through the externally connected microphone, and use the collected voice signal with noise as the received voice signal with noise.
When the electronic device receives an input noisy speech signal through a microphone (the microphone here may be a built-in microphone or an external microphone), if the microphone is an analog microphone, an analog noisy speech signal is collected; in this case, the electronic device needs to sample the analog noisy speech signal to convert it into a digital noisy speech signal, for example at a sampling frequency of 16 kHz. If the microphone is a digital microphone, the electronic device receives the digitized noisy speech signal directly through the digital microphone without conversion.
It is easily understood that the electronic device will receive the noisy speech signal when a speaker in the environment of the electronic device utters the speech signal, and will only receive the noise signal when no speaker in the environment of the electronic device utters the speech signal. Wherein the electronic device will buffer the received noisy speech signal and the noise signal.
In this embodiment of the present application, when receiving a noisy speech signal, the electronic device takes the start time of the noisy speech signal as an end time, obtains the noise signal of a preset time length that was received immediately before the noisy speech signal, and uses this noise signal as the historical noise signal corresponding to the noisy speech signal. The preset time length may be set to a suitable value by a person skilled in the art according to actual needs, which is not particularly limited in this embodiment of the present application; for example, it may be set to 500 ms.
For example, the preset time length is configured to be 500 ms, and the start time of the noisy speech signal is 11:04:56.500 on June 12, 2018. The electronic device then acquires the buffered noise signal with a duration of 500 ms from 11:04:56.000 to 11:04:56.500 on June 12, 2018, and uses this noise signal as the historical noise signal corresponding to the noisy speech signal.
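Extracting the historical noise window from the buffer can be sketched as follows. This assumes the buffer is a flat list of samples and that the start of speech is known as a sample index; both are illustrative assumptions, as are the names. The 16 kHz and 500 ms defaults are the example values given in the text.

```python
from typing import List


def historical_noise(
    buffer: List[float],
    speech_start_index: int,
    sample_rate_hz: int = 16000,
    preset_duration_ms: int = 500,
) -> List[float]:
    """Return the buffered samples that end where the noisy speech begins.

    The window covers preset_duration_ms of audio immediately before
    speech_start_index; it is shorter if less audio has been buffered.
    """
    window = sample_rate_hz * preset_duration_ms // 1000
    begin = max(0, speech_start_index - window)
    return buffer[begin:speech_start_index]


# 500 ms at 16 kHz corresponds to 8000 samples.
samples = [0.0] * 20000
print(len(historical_noise(samples, 10000)))
```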
202. And performing model training by taking the acquired historical noise signal as sample data to obtain a noise prediction model.
After the electronic equipment acquires the historical noise signal, the historical noise signal is used as sample data, model training is carried out according to a preset training algorithm, and a noise prediction model is obtained.
It should be noted that the training algorithm is a machine learning algorithm, and the machine learning algorithm may predict data by continuously performing feature learning, for example, the electronic device may predict a current noise distribution according to a historical noise distribution. Wherein the machine learning algorithm may include: decision tree algorithm, regression algorithm, bayesian algorithm, neural network algorithm (which may include deep neural network algorithm, convolutional neural network algorithm, recursive neural network algorithm, etc.), clustering algorithm, etc., and the selection of which training algorithm to use as the preset training algorithm for model training may be selected by those skilled in the art according to actual needs.
For example, the preset training algorithm configured in the electronic device is a Gaussian mixture model algorithm. After a historical noise signal is obtained, the historical noise signal is used as sample data and model training is performed according to the Gaussian mixture model algorithm to obtain a Gaussian mixture model (the Gaussian mixture model comprises a plurality of Gaussian units and is used for describing the noise distribution), and this Gaussian mixture model is used as the noise prediction model.
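To make the idea of fitting several Gaussian units to noise samples concrete, the following is a minimal sketch of expectation-maximization for a one-dimensional Gaussian mixture. It is an illustration under assumptions, not the method of the application: the text does not state which features the model is trained on, how it is initialized, or how many units are used, so scalar samples, quantile initialization, and the function names here are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(
    samples: Sequence[float], k: int = 2, iterations: int = 50, min_var: float = 1e-6
) -> Tuple[List[float], List[float], List[float]]:
    """Fit k Gaussian units to scalar samples with EM; returns (weights, means, variances)."""
    n = len(samples)
    ordered = sorted(samples)
    # Deterministic initialization: means at evenly spaced quantiles.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(samples) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in samples) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each unit for each sample.
        resp = []
        for x in samples:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pi / total for pi in p])
        # M-step: re-estimate weight, mean and variance of each unit.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, samples)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, samples)) / nj,
                min_var,
            )
    return weights, means, variances
```

In practice a library implementation would normally be preferred over hand-written EM; the sketch only shows what "a plurality of Gaussian units describing the noise distribution" amounts to.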
203. The noise signal during reception of the noisy speech signal is predicted from the noise prediction model.
After the noise prediction model is obtained through training, the electronic device takes the start time and the end time of the receiving period of the noisy speech signal as the input of the noise prediction model, and obtains, as the output of the noise prediction model, the noise signal during the receiving period of the noisy speech signal.
204. And performing reverse phase superposition on the acquired noise signal and the voice signal with the noise to obtain a noise reduction voice signal.
After the noise signal during the receiving period of the noisy speech signal is acquired, the electronic device first performs phase inversion processing on the acquired noise signal, and then superposes the phase-inverted noise signal on the noisy speech signal to eliminate the noise part of the noisy speech signal, thereby obtaining the noise reduction voice signal.
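Inverse-phase superposition on sampled signals can be sketched as follows: inverting the phase negates each sample, and superposing adds the two signals sample by sample. The function names are hypothetical, and the sketch assumes the predicted noise is time-aligned with the noisy signal and has the same length.

```python
from typing import List, Sequence


def invert_phase(signal: Sequence[float]) -> List[float]:
    """Phase inversion: negate every sample."""
    return [-s for s in signal]


def superpose(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Sample-wise sum of two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return [x + y for x, y in zip(a, b)]


def reduce_noise(noisy: Sequence[float], predicted_noise: Sequence[float]) -> List[float]:
    """Superpose the phase-inverted predicted noise on the noisy signal."""
    return superpose(noisy, invert_phase(predicted_noise))


speech = [0.5, -0.2, 0.1]
noise = [0.1, 0.1, -0.1]
noisy = superpose(speech, noise)
print(reduce_noise(noisy, noise))  # approximately the original speech samples
```

If the predicted noise equals the actual noise, the result is the clean speech up to floating-point rounding; any prediction error remains as residual noise.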
205. And acquiring the voiceprint characteristics of the noise reduction voice signal.
In real life, every person's speech has its own characteristics, and people who are familiar with one another can tell each other apart by voice alone.
These characteristics of the voice are the voiceprint features, which are mainly determined by two factors. The first is the size of the vocal cavities, specifically including the throat, nasal cavity, oral cavity and the like; the shape, size and position of these organs determine the tension of the vocal cords and the range of vocal frequencies. Therefore, even when different people say the same words, the frequency distribution of their voices differs, so that some voices sound deep and others sound resonant.
The second factor that determines the voiceprint features is the manner in which the vocal organs, including the lips, teeth, tongue, soft palate and palatal muscles, are manipulated; their interaction produces clear speech. The manner in which these organs cooperate is learned, in a largely random way, through a person's interaction with surrounding people after birth. In the process of learning to speak, a person gradually forms his or her own voiceprint features by imitating the speaking manners of different people nearby.
In the embodiment of the application, when the electronic device obtains the noise reduction voice signal, the voiceprint feature of the noise reduction voice signal is firstly obtained.
206. And judging whether the acquired voiceprint features are matched with preset voiceprint features.
After acquiring the voiceprint feature of the noise reduction voice signal, the electronic device further compares the acquired voiceprint feature with a preset voiceprint feature to judge whether the voiceprint feature is matched with the preset voiceprint feature. The preset voiceprint feature can be a voiceprint feature pre-recorded by the owner, and whether the voiceprint feature of the noise reduction voice signal is matched with the preset voiceprint feature is judged, namely whether a speaker corresponding to the noise reduction voice signal is the owner is judged.
207. And when the acquired voiceprint feature is matched with the preset voiceprint feature, acquiring a to-be-executed instruction included in the noise reduction voice signal.
When the acquired voiceprint features are matched with the preset voiceprint features, the electronic equipment determines that the speaker corresponding to the noise reduction voice signal is the owner, and at the moment, the to-be-executed instruction included in the noise reduction voice signal is acquired.
When the to-be-executed instruction included in the noise-reduction voice signal is acquired, the electronic equipment firstly judges whether a voice analysis engine exists locally, if so, the electronic equipment inputs the noise-reduction voice signal into the local voice analysis engine for voice analysis, and a voice analysis text is obtained. The voice analysis is performed on the voice signal, that is, the voice signal is converted from "audio" to "text".
After the voice analysis text of the noise reduction voice signal is obtained through analysis, the electronic equipment further obtains a command to be executed included in the noise reduction voice signal from the voice analysis text.
The electronic equipment stores a plurality of instruction keywords in advance, and a single instruction keyword or a plurality of instruction keyword combinations correspond to one instruction. When the to-be-executed instruction included in the noise reduction voice signal is obtained from the voice analysis text obtained through analysis, the electronic equipment firstly carries out word segmentation operation on the voice analysis text to obtain a word sequence corresponding to the voice analysis text, and the word sequence includes a plurality of words.
After the word sequence corresponding to the voice analysis text is obtained, the electronic device matches instruction keywords with the word sequence, that is, the instruction keywords in the word sequence are found out, so that a corresponding instruction is obtained through matching, and the instruction obtained through matching is used as an instruction to be executed of the noise reduction voice signal. Wherein the matching search of the instruction keywords comprises complete matching and/or fuzzy matching.
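Matching instruction keywords against a word sequence can be sketched as below. Only exact (complete) matching is shown; fuzzy matching is omitted. The keyword table, the instruction label, and the assumption that the word segmentation step has already produced a list of words are all hypothetical placeholders.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical table: a keyword combination maps to one instruction.
INSTRUCTION_KEYWORDS: Dict[Tuple[str, ...], str] = {
    ("xiao ou", "you", "where"): "TRIGGER_POSITION_PROMPT",
}


def match_instruction(word_sequence: Sequence[str]) -> Optional[str]:
    """Return the instruction whose keywords all occur in the word sequence."""
    words = set(word_sequence)
    for keywords, instruction in INSTRUCTION_KEYWORDS.items():
        if all(keyword in words for keyword in keywords):
            return instruction
    return None


# Word sequence as it might come out of a word segmentation step.
print(match_instruction(["xiao ou", "you", "are", "where"]))
```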
208. And when the obtained instruction to be executed is an instruction for triggering position prompt, executing prompt operation for prompting the current position according to a preset mode.
After the to-be-executed instruction included in the noise reduction voice signal is obtained, if the to-be-executed instruction is recognized to be an instruction for triggering position prompt, the prompt operation for prompting the current position is executed according to a preset mode. For example, the instruction for triggering position prompt corresponds to the instruction keyword combination "Xiao Ou" + "you" + "where"; when the user says "Xiao Ou, where are you", the electronic device determines that the to-be-executed instruction included in "Xiao Ou, where are you" is the instruction for triggering position prompt.
The manner in which the electronic device executes the prompt operation can be set by default or according to data input by the user. For example, the default prompting manner of the electronic device is lighting up the screen. In addition, referring to fig. 2, the electronic device further provides a setting interface for the prompting manner, so that the user can select the prompting manner according to actual needs. When the user has selected the prompting manner "ring while lighting up the screen" and the voice "Xiao Ou, where are you" uttered by the user is received, the electronic device reminds the user of its current position by lighting up the screen and ringing.
In one embodiment, a position prompting device is further provided. Referring to fig. 6, fig. 6 is a schematic structural diagram of a position prompting device 400 according to an embodiment of the present disclosure. The position prompting device is applied to an electronic device, and includes a first obtaining module 401, a second obtaining module 402, a noise reduction module 403, and a prompting module 404, as follows:
The first obtaining module 401 is configured to obtain a historical noise signal corresponding to a noisy speech signal when the noisy speech signal is received.
A second obtaining module 402, configured to obtain a noise signal during the receiving of the voice signal with noise according to the obtained historical noise signal.
And a noise reduction module 403, configured to perform inverse phase superposition on the acquired noise signal and the voice signal with noise, so as to obtain a noise reduction voice signal.
The prompt module 404 is configured to acquire a to-be-executed instruction included in the noise reduction voice signal, and when the to-be-executed instruction is an instruction for triggering position prompt, execute a prompt operation for prompting a current position according to a preset manner.
In an embodiment, the second obtaining module 402 may be configured to:
performing model training by taking the acquired historical noise signal as sample data to obtain a noise prediction model;
the noise signal during reception of the noisy speech signal is predicted from the noise prediction model.
In an embodiment, the prompt module 404 may be configured to:
acquiring current position information and outputting the position information in a voice mode.
In an embodiment, the prompt module 404 may be configured to:
sending the noise reduction voice signal to a server, instructing the server to analyze the noise reduction voice signal, and returning a voice analysis text obtained by analyzing the noise reduction voice signal;
and receiving a voice analysis text returned by the server, and acquiring a to-be-executed instruction included in the noise reduction voice signal according to the received voice analysis text.
In an embodiment, the prompt module 404 may be configured to:
acquiring the voiceprint characteristics of the noise reduction voice signal;
judging whether the acquired voiceprint features are matched with preset voiceprint features or not;
and when the acquired voiceprint feature is matched with the preset voiceprint feature, acquiring a to-be-executed instruction included in the noise reduction voice signal.
In an embodiment, the prompt module 404 may be configured to:
acquiring the similarity of the voiceprint characteristics and preset voiceprint characteristics;
judging whether the acquired similarity is greater than or equal to a first preset similarity or not;
and when the acquired similarity is greater than or equal to a first preset similarity, determining that the voiceprint features are matched with the preset voiceprint features.
In an embodiment, the prompt module 404 may be configured to:
when the obtained similarity is smaller than a first preset similarity and larger than or equal to a second preset similarity, obtaining current position information;
judging whether the current position is within a preset position range or not according to the position information;
and when the current position is within the preset position range, determining that the acquired voiceprint features are matched with the preset voiceprint features.
The steps executed by each module in the position prompting device 400 may refer to the method steps described in the above method embodiments. The position prompting device 400 can be integrated into an electronic device, such as a mobile phone, a tablet computer, and the like.
In specific implementation, the modules may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and specific implementation of the units may refer to the foregoing embodiments, which are not described herein again.
As can be seen from the above, in the position prompting device of this embodiment, when the noisy speech signal is received, the first obtaining module 401 obtains the historical noise signal corresponding to the noisy speech signal. The second obtaining module 402 obtains a noise signal during the receiving of the voice signal with noise according to the obtained historical noise signal. The noise reduction module 403 performs inverse phase superposition on the acquired noise signal and the voice signal with noise to obtain a noise reduction voice signal. The prompt module 404 obtains a to-be-executed instruction included in the noise reduction voice signal, and when the to-be-executed instruction is an instruction for triggering position prompt, executes a prompt operation for prompting the current position according to a preset mode. In this scheme, when a noisy speech signal is received in a noisy environment, the noisy speech signal is subjected to noise reduction processing to obtain a noise reduction voice signal, and the prompt operation for prompting the current position is then executed according to the noise reduction voice signal, so that noise interference is avoided and the success rate of triggering the electronic device to prompt its position can be improved.
In an embodiment, an electronic device is also provided. Referring to fig. 7, an electronic device 500 includes a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is a control center of the electronic device 500. It connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading a computer program stored in the memory 502 and calling data stored in the memory 502.
The memory 502 may be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads instructions corresponding to one or more processes of the computer program into the memory 502, and the processor 501 runs the computer program stored in the memory 502, so as to implement various functions as follows:
when a voice signal with noise is received, acquiring a historical noise signal corresponding to the voice signal with noise;
acquiring a noise signal during the receiving period of the voice signal with noise according to the acquired historical noise signal;
performing reverse phase superposition on the acquired noise signal and the voice signal with the noise to obtain a noise reduction voice signal;
and acquiring a command to be executed included in the noise reduction voice signal, and executing a prompt operation for prompting the current position according to a preset mode when the command to be executed is a command for triggering position prompt.
Referring to fig. 8, in some embodiments, the electronic device 500 may further include: a display 503, radio frequency circuitry 504, audio circuitry 505, and a power supply 506. The display 503, the rf circuit 504, the audio circuit 505, and the power source 506 are electrically connected to the processor 501.
The display 503 may be used to display information entered by or provided to the user as well as various graphical user interfaces, which may be made up of graphics, text, icons, video, and any combination thereof. The Display 503 may include a Display panel, and in some embodiments, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The rf circuit 504 may be used for transmitting and receiving rf signals, so as to establish wireless communication with a network device or other electronic devices and to exchange signals with the network device or other electronic devices.
The audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 506 may be used to power various components of the electronic device 500. In some embodiments, power supply 506 may be logically coupled to processor 501 through a power management system, such that functions of managing charging, discharging, and power consumption are performed through the power management system.
Although not shown in fig. 8, the electronic device 500 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
In some embodiments, when acquiring a noise signal during reception of a noisy speech signal from an acquired historical noise signal, processor 501 may perform the following steps:
performing model training by taking the acquired historical noise signal as sample data to obtain a noise prediction model;
the noise signal during reception of the noisy speech signal is predicted from the noise prediction model.
In some embodiments, when the prompting operation for prompting the current position is performed in a preset manner, the processor 501 may perform the following steps:
acquiring current position information and outputting the position information by voice.
In some embodiments, when obtaining the instruction to be executed included in the noise-reduced voice signal, the processor 501 may perform the following steps:
sending the noise-reduced voice signal to a server, and instructing the server to parse the noise-reduced voice signal and return the voice parsing text obtained by the parsing;
and receiving the voice parsing text returned by the server, and acquiring the instruction to be executed included in the noise-reduced voice signal according to the received voice parsing text.
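The exchange with the server can be sketched as follows. The description does not define the server interface or the wording of the trigger instruction, so the callable `parse_on_server`, the phrase list, and the instruction name `PROMPT_POSITION` are all hypothetical placeholders.

```python
from typing import Callable, Optional

# Hypothetical trigger phrases; the source does not specify the wording.
POSITION_PROMPT_PHRASES = ("where are you", "where is my phone")


def get_instruction(denoised_signal: bytes,
                    parse_on_server: Callable[[bytes], str]) -> Optional[str]:
    """Send the signal for parsing and map the returned text to an instruction.

    `parse_on_server` stands in for the real network call, which the
    source leaves unspecified.
    """
    text = parse_on_server(denoised_signal)
    normalized = text.strip().lower()
    if any(phrase in normalized for phrase in POSITION_PROMPT_PHRASES):
        return "PROMPT_POSITION"
    return None


# Usage with a stub in place of the server.
instruction = get_instruction(b"\x00\x01", lambda signal: "Hey, where are you?")
```

Passing the server call in as a parameter keeps the sketch independent of any particular transport or speech service.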
In some embodiments, before obtaining the instruction to be executed included in the noise-reduced voice signal, the processor 501 may perform the following steps:
acquiring the voiceprint features of the noise-reduced voice signal;
judging whether the acquired voiceprint features match the preset voiceprint features;
and when the acquired voiceprint features match the preset voiceprint features, acquiring the instruction to be executed included in the noise-reduced voice signal.
In some embodiments, when determining whether the obtained voiceprint feature matches a preset voiceprint feature, the processor 501 may further perform the following steps:
acquiring the similarity between the acquired voiceprint features and the preset voiceprint features;
judging whether the acquired similarity is greater than or equal to a first preset similarity;
and when the acquired similarity is greater than or equal to the first preset similarity, determining that the acquired voiceprint features match the preset voiceprint features.
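As an illustration of the threshold test above, the sketch below compares two feature vectors. The description does not name a similarity measure or a threshold value, so cosine similarity and the default of 0.95 are assumptions, as are the function names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def voiceprint_matches(features, preset, first_threshold=0.95):
    """Match when the similarity reaches the first preset similarity.

    The threshold value is a placeholder; the source gives no number.
    """
    return cosine_similarity(features, preset) >= first_threshold
```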
In some embodiments, after determining whether the obtained similarity is greater than or equal to a first preset similarity, the processor 501 may further perform the following steps:
when the obtained similarity is smaller than the first preset similarity and greater than or equal to a second preset similarity, obtaining current position information;
judging whether the current position is within a preset position range or not according to the position information;
and when the current position is within the preset position range, determining that the acquired voiceprint features are matched with the preset voiceprint features.
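The two-threshold decision with the position fallback can be sketched as below. The circular range, the haversine distance, and the names `within_radius` and `is_voiceprint_match` are assumptions for illustration; the description only requires that the current position fall within a preset position range.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, used by the haversine formula


def within_radius(position, center, radius_m):
    """True if `position` (lat, lon in degrees) lies within radius_m of `center`.

    A circular preset range is an assumption; the source does not define
    the shape of the preset position range.
    """
    lat1, lon1 = map(math.radians, position)
    lat2, lon2 = map(math.radians, center)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))
    return distance <= radius_m


def is_voiceprint_match(similarity, first_threshold, second_threshold,
                        in_preset_range):
    """Apply the two preset similarities, consulting position only in between.

    `in_preset_range` is a zero-argument callable so that positioning is
    performed only when the similarity falls in the intermediate band.
    """
    if similarity >= first_threshold:
        return True
    if similarity >= second_threshold:
        return in_preset_range()
    return False
```

Deferring the position lookup behind a callable reflects the order of the steps: position information is obtained only after the similarity is found to lie between the two preset values.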
An embodiment of the present application further provides a storage medium storing a computer program. When the computer program runs on a computer, the computer is caused to execute the position prompting method in any one of the above embodiments, for example: when a noisy voice signal is received, acquiring a historical noise signal corresponding to the noisy voice signal; acquiring the noise signal during reception of the noisy voice signal according to the acquired historical noise signal; performing reverse-phase superposition of the acquired noise signal and the noisy voice signal to obtain a noise-reduced voice signal; and acquiring the instruction to be executed included in the noise-reduced voice signal, and, when the instruction to be executed is an instruction for triggering a position prompt, executing a prompt operation for prompting the current position in a preset manner.
In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
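The noise-reduction steps of the method executed from the storage medium can be sketched on sampled signals as follows. Representing signals as lists of samples, and the function names `history_window` and `reverse_phase_superpose`, are assumptions; the window rule (a preset duration ending at the start of the noisy voice signal) follows claim 1.

```python
def history_window(samples, speech_start, preset_length):
    """Return the preset-length stretch of samples ending where speech starts."""
    begin = max(0, speech_start - preset_length)
    return samples[begin:speech_start]


def reverse_phase_superpose(noisy, noise):
    """Add the phase-inverted noise estimate to the noisy signal, sample by sample."""
    if len(noisy) != len(noise):
        raise ValueError("signals must have the same number of samples")
    return [s + (-n) for s, n in zip(noisy, noise)]


# Usage: when the noise estimate is exact, the clean speech is recovered.
speech = [0.5, -0.25, 0.75]
noise = [0.125, 0.25, -0.5]
noisy = [s + n for s, n in zip(speech, noise)]
denoised = reverse_phase_superpose(noisy, noise)
```

In practice the noise estimate is a prediction, so the residual after superposition depends on how closely the predicted noise tracks the real noise.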
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that a person skilled in the art will understand that all or part of the process of the position prompting method in the embodiments of the present application can be implemented by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device; its execution may include the processes of the embodiments of the position prompting method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
In the position prompting device of the embodiments of the present application, the functional modules may be integrated into one processing chip, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If implemented as a software functional module and sold or used as a stand-alone product, the integrated module may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The position prompting method, the position prompting device, the storage medium, and the electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the embodiments is intended only to help in understanding the method and core idea of the present application. Meanwhile, those skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (8)

1. A position prompting method is characterized by comprising the following steps:
when a voice signal with noise is received, taking the initial time of the voice signal with noise as an ending time, and acquiring a historical noise signal which is received before the voice signal with noise is received and has preset duration as a historical noise signal corresponding to the voice signal with noise;
acquiring a noise signal during the receiving period of the voice signal with the noise according to the historical noise signal;
performing reverse phase superposition on the noise signal and the voice signal with the noise to obtain a noise reduction voice signal;
acquiring the voiceprint characteristics of the noise reduction voice signal;
acquiring the similarity of the voiceprint features and preset voiceprint features;
when the similarity is smaller than a first preset similarity and larger than or equal to a second preset similarity, judging whether the current environment is an indoor environment or an outdoor environment;
if the current environment is judged to be an indoor environment, acquiring current position information by adopting an indoor positioning technology;
if the current environment is judged to be an outdoor environment, acquiring the current position information by adopting a satellite positioning technology;
judging whether the current position is within a preset position range or not according to the current position information;
when the current position is within the preset position range, determining that the voiceprint features are matched with the preset voiceprint features, acquiring a to-be-executed instruction included in the noise-reduction voice signal, and executing a prompt operation for prompting the current position information according to a preset mode when the to-be-executed instruction is an instruction for triggering position prompt.
2. The position prompting method according to claim 1, wherein the step of obtaining a noise signal during reception of the noisy speech signal based on the historical noise signal comprises:
performing model training by taking the historical noise signal as sample data to obtain a noise prediction model;
predicting the noise signal during the receiving according to the noise prediction model.
3. The position prompt method according to claim 1, wherein the step of performing a prompt operation for prompting the current position information in a preset manner comprises:
acquiring current position information and outputting the position information in a voice mode.
4. The position prompting method according to claim 1, wherein the step of parsing the noise-reduced voice signal into a voice parsed text includes:
sending the noise reduction voice signal to a server, instructing the server to analyze the noise reduction voice signal, and returning a voice analysis text obtained by analyzing the noise reduction voice signal;
and receiving the voice analysis text returned by the server.
5. The position prompting method according to any one of claims 1-4, wherein after obtaining the similarity of the voiceprint feature and the preset voiceprint feature, further comprising:
and when the similarity is greater than or equal to the first preset similarity, determining that the voiceprint features are matched with the preset voiceprint features.
6. A position prompting device, comprising:
the first acquisition module is used for, when a voice signal with noise is received, taking the starting time of the voice signal with noise as an ending time, and acquiring a historical noise signal which is received before the voice signal with noise is received and has a preset duration as the historical noise signal corresponding to the voice signal with noise;
the second acquisition module is used for acquiring a noise signal during the receiving period of the voice signal with noise according to the historical noise signal;
the noise reduction module is used for performing reverse phase superposition on the noise signal and the voice signal with the noise to obtain a noise reduction voice signal;
a prompt module for acquiring the voiceprint features of the noise reduction voice signal and acquiring the similarity between the voiceprint features and preset voiceprint features; when the similarity is smaller than a first preset similarity and larger than or equal to a second preset similarity, judging whether the current environment is an indoor environment or an outdoor environment; if the current environment is judged to be an indoor environment, acquiring current position information by adopting an indoor positioning technology; if the current environment is judged to be an outdoor environment, acquiring the current position information by adopting a satellite positioning technology; judging whether the current position is within a preset position range according to the current position information; when the current position is within the preset position range, determining that the voiceprint features are matched with the preset voiceprint features, and acquiring a to-be-executed instruction included in the noise reduction voice signal; and when the to-be-executed instruction is an instruction for triggering position prompt, executing a prompt operation for prompting the current position information according to a preset mode.
7. A storage medium having stored thereon a computer program, characterized in that, when the computer program is run on a computer, it causes the computer to execute a position prompting method according to any one of claims 1 to 5.
8. An electronic device comprising a processor and a memory, said memory storing a computer program, wherein said processor is adapted to perform the position prompting method of any one of claims 1 to 5 by invoking said computer program.
CN201810648187.4A 2018-06-19 2018-06-19 Position prompting method and device, storage medium and electronic equipment Expired - Fee Related CN108922523B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810648187.4A CN108922523B (en) 2018-06-19 2018-06-19 Position prompting method and device, storage medium and electronic equipment
PCT/CN2019/085557 WO2019242415A1 (en) 2018-06-19 2019-05-05 Position prompt method, device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810648187.4A CN108922523B (en) 2018-06-19 2018-06-19 Position prompting method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN108922523A CN108922523A (en) 2018-11-30
CN108922523B true CN108922523B (en) 2021-06-15

Family

ID=64420994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810648187.4A Expired - Fee Related CN108922523B (en) 2018-06-19 2018-06-19 Position prompting method and device, storage medium and electronic equipment

Country Status (2)

Country Link
CN (1) CN108922523B (en)
WO (1) WO2019242415A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108922523B (en) * 2018-06-19 2021-06-15 Oppo广东移动通信有限公司 Position prompting method and device, storage medium and electronic equipment
CN113709291A (en) * 2021-08-06 2021-11-26 北京三快在线科技有限公司 Audio processing method and device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104580699A (en) * 2014-12-15 2015-04-29 广东欧珀移动通信有限公司 Method and device for acoustically controlling intelligent terminal in standby state
CN106941703A (en) * 2016-01-04 2017-07-11 上海交通大学 Indoor and outdoor seamless positioning apparatus and method based on Situation Awareness
CN106960667A (en) * 2017-03-08 2017-07-18 杭州联络互动信息科技股份有限公司 Position reminding methods, devices and systems
CN107339990A (en) * 2017-06-27 2017-11-10 北京邮电大学 Multi-pattern Fusion alignment system and method
CN108062464A (en) * 2017-11-27 2018-05-22 北京传嘉科技有限公司 Terminal control method and system based on Application on Voiceprint Recognition

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101802909B (en) * 2007-09-12 2013-07-10 杜比实验室特许公司 Speech enhancement with noise level estimation adjustment
CN103578477B (en) * 2012-07-30 2017-04-12 中兴通讯股份有限公司 Denoising method and device based on noise estimation
CN102905029A (en) * 2012-10-17 2013-01-30 广东欧珀移动通信有限公司 Mobile phone and method for looking for mobile phone through intelligent voice
CN103024157B (en) * 2012-11-28 2015-01-07 广东欧珀移动通信有限公司 Voice based mobile terminal seeking method and system
CN103002147A (en) * 2012-11-29 2013-03-27 广东欧珀移动通信有限公司 Auto-answer method and device for mobile terminal (MT)
US9619645B2 (en) * 2013-04-04 2017-04-11 Cypress Semiconductor Corporation Authentication for recognition systems
CN106034024A (en) * 2015-03-11 2016-10-19 广州杰赛科技股份有限公司 Authentication method based on position and voiceprint
CN104900237B (en) * 2015-04-24 2019-07-05 上海聚力传媒技术有限公司 A kind of methods, devices and systems for audio-frequency information progress noise reduction process
CN106297779A (en) * 2016-07-28 2017-01-04 块互动(北京)科技有限公司 A kind of background noise removing method based on positional information and device
CN107666536B (en) * 2016-07-29 2021-02-12 北京搜狗科技发展有限公司 Method and device for searching terminal
CN106101909B (en) * 2016-08-26 2019-05-17 维沃移动通信有限公司 A kind of method and mobile terminal of earphone noise reduction
CN106412272A (en) * 2016-09-23 2017-02-15 珠海格力电器股份有限公司 Method and device for prompting position of mobile terminal and mobile terminal
KR102562287B1 (en) * 2016-10-14 2023-08-02 삼성전자주식회사 Electronic device and audio signal processing method thereof
CN108922523B (en) * 2018-06-19 2021-06-15 Oppo广东移动通信有限公司 Position prompting method and device, storage medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
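Research on Real-Time Processing Algorithms for Speech Noise Reduction; Wang Haifeng; China Master's Theses Full-text Database, Information Science and Technology; 20090915; pp. 6-7 *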

Also Published As

Publication number Publication date
WO2019242415A1 (en) 2019-12-26
CN108922523A (en) 2018-11-30

Similar Documents

Publication Publication Date Title
CN108922525B (en) Voice processing method, device, storage medium and electronic equipment
US12080280B2 (en) Systems and methods for determining whether to trigger a voice capable device based on speaking cadence
CN110634483B (en) Man-machine interaction method and device, electronic equipment and storage medium
CN110998720B (en) Voice data processing method and electronic device supporting the same
CN106201424B (en) A kind of information interacting method, device and electronic equipment
CN108806684B (en) Position prompting method and device, storage medium and electronic equipment
CN108962241B (en) Position prompting method and device, storage medium and electronic equipment
CN112201246B (en) Intelligent control method and device based on voice, electronic equipment and storage medium
CN108804070B (en) Music playing method and device, storage medium and electronic equipment
CN111640434A (en) Method and apparatus for controlling voice device
EP4260314A1 (en) User speech profile management
WO2019045816A1 (en) Graphical data selection and presentation of digital content
CN109712623A (en) Sound control method, device and computer readable storage medium
CN108922523B (en) Position prompting method and device, storage medium and electronic equipment
CN114360527A (en) Vehicle-mounted voice interaction method, device, equipment and storage medium
EP3793275B1 (en) Location reminder method and apparatus, storage medium, and electronic device
CN111369992A (en) Instruction execution method and device, storage medium and electronic equipment
CN108711428B (en) Instruction execution method and device, storage medium and electronic equipment
CN109064720B (en) Position prompting method and device, storage medium and electronic equipment
CN108989551B (en) Position prompting method and device, storage medium and electronic equipment
CN112740219A (en) Method and device for generating gesture recognition model, storage medium and electronic equipment
CN114495981A (en) Method, device, equipment, storage medium and product for judging voice endpoint
US11527247B2 (en) Computing device and method of operating the same
CN112771608A (en) Voice information processing method and device, storage medium and electronic equipment
CN113066513B (en) Voice data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210615