WO2021179854A1 - Voiceprint wakeup method and apparatus, device, and storage medium - Google Patents

Info

Publication number
WO2021179854A1
Authority
WO
WIPO (PCT)
Prior art keywords
wake
preset
voiceprint
electronic device
threshold
Prior art date
Application number
PCT/CN2021/074833
Other languages
French (fr)
Chinese (zh)
Inventor
胡宁宁
陈喆
曹冰
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2021179854A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies

Definitions

  • This application relates to the technical field of voice processing, and relates to, but is not limited to, a voiceprint wake-up method, apparatus, device, and storage medium.
  • The voiceprint feature is one of the important biometric characteristics of the human body and is highly specific to the individual. It is often used as an identity authentication feature in areas such as voiceprint recognition and voiceprint authentication.
  • Voiceprint wake-up technology has been widely used in electronic devices with intelligent voice dialogue wake-up functions. For example, when it is inconvenient to control the electronic device directly, users can wake up a voice interaction application such as a "voice assistant" by speaking a voice command, and then control the electronic device through voice interaction with that application.
  • the embodiments of the present application provide a voiceprint wake-up method, device, equipment, and storage medium.
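  • An embodiment of the present application provides a voiceprint wake-up method, and the method includes:
  • determining a preset wake-up parameter matching the current scene according to a preset condition, wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement; and, in response to a voiceprint wake-up operation, waking up the application to be awakened according to the preset wake-up parameter.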
  • an embodiment of the present application provides a voiceprint wake-up device.
  • the device includes a first determination module and a first wake-up module, wherein:
  • the first determining module is configured to determine a preset wake-up parameter matching the current scene according to a preset condition; wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement;
  • the first wake-up module is configured to wake up the application to be awakened according to the preset wake-up parameter in response to the voiceprint wake-up operation.
  • an embodiment of the present application provides an electronic device, including a memory and a processor.
  • the memory stores a computer program that can run on the processor.
  • When executing the program, the processor performs the following operations: obtaining a preset wake-up parameter that matches the current scene, wherein the preset wake-up parameter is used to characterize the wake-up threshold and/or voiceprint threshold that meets the wake-up requirement; and, in response to the voiceprint wake-up operation, waking up the application to be awakened according to the preset wake-up parameter.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the voiceprint wake-up method described above are realized.
  • In the embodiments of the present application, a preset wake-up parameter that matches the current scene is first determined according to preset conditions, where the preset wake-up parameter characterizes the wake-up threshold and/or voiceprint threshold that meets the wake-up requirement; then, in response to the voiceprint wake-up operation, the application to be awakened is awakened according to the preset wake-up parameter. In this way, the wake-up and voiceprint thresholds can be adjusted automatically according to the current scene, which improves the user's wake-up rate and enhances the user experience.
  • FIG. 1A is an optional flowchart of a voiceprint wake-up method provided by an embodiment of this application;
  • FIG. 1B is an optional flowchart of a voiceprint wake-up method provided by an embodiment of this application;
  • FIG. 2 is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application;
  • FIG. 3 is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application;
  • FIG. 4A is a logical block diagram of improving the voiceprint wake-up rate in a driving scenario provided by an embodiment of this application;
  • FIG. 4B is a schematic diagram of a voiceprint monitoring process in a non-driving mode provided by an embodiment of this application;
  • FIG. 4C is a schematic diagram of a voiceprint monitoring process in a driving mode provided by an embodiment of this application;
  • FIG. 5A is a schematic diagram of the structure of the voiceprint wake-up device provided by an embodiment of this application;
  • FIG. 5B is a schematic diagram of the composition structure of the voiceprint wake-up device provided by an embodiment of this application;
  • FIG. 6 is a schematic diagram of a hardware entity of an electronic device provided by an embodiment of this application.
  • Voiceprint recognition is a type of biometric technology that converts acoustic signals into electrical signals and then uses a computer for recognition. Also known as speaker recognition, it comes in two types: speaker identification and speaker verification. The former determines which of several people spoke a given utterance, which is a "multiple choice" problem; the latter confirms whether a given utterance was spoken by a designated person, which is a "one-to-one discrimination" problem.
  • In the in-car scene, without the support of voiceprint recognition technology, a smart rearview mirror is easily awakened by wake-up words that happen to occur in a radio broadcast, which affects driving. Voiceprint recognition lets the smart device understand "who" is giving an order when it receives one, so that every user can get the most suitable content and service immediately with just one sentence, making human-computer interaction truly natural and intelligent.
  • Voiceprint biometric information therefore becomes the best choice of identity document (ID).
  • The system automatically extracts voiceprint features and can quickly identify the speaker's identity regardless of the language spoken.
  • Voice wake-up: the user wakes up the electronic device by speaking the wake-up word, causing the device to enter a state of waiting for a voice command or to directly execute a predetermined voice command.
  • Wake-up word: a string used to wake up the electronic device, that is, to trigger a voice wake-up operation.
  • For example, the wake-up words are "Xiaoai classmate", "Xiaobu Xiaobu", and so on.
  • Voice instruction: an instruction by which the user controls the electronic device by voice to perform a predetermined operation.
  • For example, the voice instruction can be "navigate home", "play music", and so on.
  • Wake-up threshold: used by the electronic device to determine whether to perform a voice wake-up operation based on the wake-up word. When the acoustic score of the wake-up word is greater than the wake-up threshold, the voice wake-up operation is performed; when the acoustic score of the wake-up word is less than the wake-up threshold, the voice wake-up operation is not performed.
  • Voiceprint: a sound wave spectrum, displayed by an electro-acoustic instrument, that carries verbal information.
  • Voiceprint threshold: used to identify whether a voice wake-up operation is performed by a specified user. It is used to judge the similarity between the input voiceprint feature and the registered voiceprint features stored for all registered users.
  • Wake-up rate: the success rate of user wake-up interactions; in technical terms, the recall rate.
  • False wake-up rate: the probability of a false wake-up within a certain period of time.
  • The wake-up threshold is obtained by weighting the posterior probabilities of the words in the wake-up word according to the order in which the words appear.
  • Lowering the wake-up threshold relaxes the requirements on some of the words in the wake-up word, which increases false wake-ups caused by deletion and substitution errors.
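The two rates defined above can be made concrete with a small sketch. It treats the wake-up rate as recall, as the definition states, and, as an assumption not taken from the application, reports false wake-ups as a count per hour of observation. The function names are hypothetical.

```python
def wake_up_rate(successful_wakeups: int, genuine_attempts: int) -> float:
    """Recall: fraction of genuine wake-up attempts that woke the device."""
    if genuine_attempts <= 0:
        raise ValueError("genuine_attempts must be positive")
    return successful_wakeups / genuine_attempts


def false_wakeups_per_hour(false_wakeups: int, observed_hours: float) -> float:
    """False wake-ups normalised by the observation window (an assumed convention)."""
    if observed_hours <= 0:
        raise ValueError("observed_hours must be positive")
    return false_wakeups / observed_hours
```

Lowering a threshold typically raises the first figure and the second one together, which is the trade-off the scene-dependent thresholds are meant to manage.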
  • A voice assistant is a smartphone application that helps users solve problems, mainly everyday-life problems, through intelligent dialogue and instant question-and-answer interaction. The user can perform some operations of the electronic device by inputting voice instructions, which frees the user's hands when controlling the electronic device in certain situations.
  • In the related art, voiceprint wake-up is generally performed with preset, fixed wake-up and voiceprint thresholds.
  • One related technology proposes a scene-based voice operation method and device. The method wakes up directly according to scene information after receiving the voice wake-up instruction, and the voice assistant is not awakened during the wake-up process.
  • Another technology proposes a voice wake-up method applied to the vehicle environment. The method activates wake-up-free control logic when the user state meets the conditions, that is, the voice control logic can be awakened by voice without a wake-up word.
  • The related technology also proposes a method and device for adjusting the wake-up sensitivity. This method is aimed at smart speaker devices and dynamically adjusts preset parameters by monitoring the noise of the surrounding environment or network traffic.
  • The existing methods for increasing the voiceprint wake-up rate have the following significant shortcomings: 1) because users use electronic devices in various, constantly changing scenarios, a fixed preset threshold cannot meet the user's multi-scenario wake-up requirements; 2) because the voice assistant is not awakened during the wake-up process, the usage scenarios are very limited, and under current technical conditions only a very limited set of applications can be awakened; 3) because the wake-up-free control logic is activated when the user state meets the conditions, the device suffers frequent false wake-ups; 4) monitoring only environmental noise or network traffic cannot meet the needs of electronic devices for more diverse wake-up scenarios.
  • The embodiments of the present application provide a voiceprint wake-up method, apparatus, device, and storage medium, which can automatically adjust the wake-up and voiceprint thresholds according to the current scene, improve the user's wake-up rate, and improve the user experience.
  • FIG. 1A is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application. The method is used for electronic devices with intelligent voice dialogue wake-up functions, such as in-vehicle voice devices, smart voice TVs, smart speakers, smart dialogue toys, and other existing smart electronic devices that support voice wake-up. As shown in FIG. 1A, the method includes:
  • Step S110a: Determine a preset wake-up parameter matching the current scene according to a preset condition.
  • The preset wake-up parameter is used to characterize the wake-up threshold and/or the voiceprint threshold that meets the wake-up requirement, where the wake-up threshold sets the threshold corresponding to the wake-up word in the preset scene, and the voiceprint threshold sets the threshold corresponding to the user's voiceprint feature in the preset scene.
  • Voice signals can be collected in a preset scene and the wake-up threshold for that scene obtained through model training; alternatively, voiceprint features can be extracted from the voice signal input by a specific user, and the sample voiceprint features stored or used to build a voiceprint database, in order to determine the preset voiceprint threshold.
  • The preset wake-up threshold and the preset voiceprint threshold can be determined according to different usage scenarios, so as to strike a balance between the false wake-up rate and the wake-up success rate.
  • The preset wake-up threshold and preset voiceprint threshold that match the current scene can be determined based on information about the environment surrounding the electronic device, such as the environmental noise extracted from the collected voice signal or the environmental state captured by the camera. For example, when the electronic device is in a relatively quiet environment such as a bedroom or an office, there is less noise in the surrounding environment and relatively little interference from other users, so the wake-up threshold and voiceprint threshold can be reduced, thereby increasing the wake-up rate.
  • The preset wake-up threshold and preset voiceprint threshold that match the scene can also be determined from user operations. For example, when the user turns on the driving mode switch, inputs the voice command "turn on (or enter) the driving mode", or connects to the car's Bluetooth, the electronic device is currently in a driving scenario and the voice assistant is used only from the driving position. Interference from other users is also relatively small, so the wake-up threshold and the voiceprint threshold can be reduced, thereby increasing the wake-up rate.
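As a minimal sketch of step S110a, the selection described above can be written as a function that returns relaxed thresholds in a quiet environment or in driving mode, and the defaults otherwise. All names, the noise cut-off, and the threshold values are hypothetical; the application does not give concrete numbers for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeParams:
    wake_threshold: float        # compared against the wake-up value
    voiceprint_threshold: float  # compared against the voiceprint similarity


# Hypothetical values chosen only for illustration.
DEFAULT_PARAMS = WakeParams(wake_threshold=0.50, voiceprint_threshold=0.60)
RELAXED_PARAMS = WakeParams(wake_threshold=0.45, voiceprint_threshold=0.40)

QUIET_NOISE_DB = 40.0  # hypothetical cut-off for "bedroom or office" quietness


def select_params(noise_db: float, driving_mode: bool) -> WakeParams:
    """Pick the preset wake-up parameters matching the current scene."""
    # Both cases imply little interference from other speakers,
    # so lower thresholds are acceptable.
    if driving_mode or noise_db < QUIET_NOISE_DB:
        return RELAXED_PARAMS
    return DEFAULT_PARAMS
```

A production system would presumably obtain these presets from model training per scene, as the text describes, rather than from hand-picked constants.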
  • Step S120a: In response to the voiceprint wake-up operation, wake up the application to be awakened according to the preset wake-up parameter.
  • The voiceprint wake-up operation may be a voice instruction including a wake-up word.
  • For example, the voice instruction may be "navigate home", "xiaobuxiaobu, play music", and so on.
  • The application to be awakened on the electronic device can be awakened from the dormant state to respond to other voice commands of the user.
  • The application to be awakened is an application installed on the electronic device that is waiting to be activated. Waking it up means that the application, whether waiting to be activated or dormant, directly enters the state of waiting for instructions.
  • The application to be awakened may be a voice assistant. Once the voice assistant is awakened, the voice command is further recognized and the user operation is responded to.
  • Before being awakened, the voice assistant is in a standby state and occupies a small amount of memory.
  • The user inputs a voice command containing a preset wake-up word.
  • The voice assistant is activated once it has been awakened.
  • The user can then say what he or she wants to do, such as "play music" or "call XXX", so that the electronic device completes the operation automatically.
  • In the embodiments of the present application, a preset wake-up parameter matching the current scene is determined according to preset conditions, where the preset wake-up parameter includes a preset wake-up threshold and a preset voiceprint threshold; an input voice signal is received; the wake-up value and voiceprint feature of the voice signal are determined; and the application to be awakened is awakened in the case that the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold. In this way, the wake-up and voiceprint thresholds can be adjusted automatically according to the current scene, which improves the user's wake-up rate and enhances the user experience.
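The wake-up condition summarized above amounts to two comparisons that must both pass. The sketch below assumes the voiceprint feature has already been reduced to a single similarity score against the enrolled user (here called `voiceprint_score`, a hypothetical name), and it treats a score exactly equal to a threshold as a failure, which the text leaves unspecified.

```python
def should_wake(
    wake_value: float,
    voiceprint_score: float,
    wake_threshold: float,
    voiceprint_threshold: float,
) -> bool:
    """Wake only if both the wake word and the speaker pass their thresholds."""
    return wake_value > wake_threshold and voiceprint_score > voiceprint_threshold


# The same utterance is rejected with one preset and accepted with a lower one.
# Threshold values are hypothetical.
utterance = {"wake_value": 0.48, "voiceprint_score": 0.70}
rejected = should_wake(**utterance, wake_threshold=0.50, voiceprint_threshold=0.60)
accepted = should_wake(**utterance, wake_threshold=0.45, voiceprint_threshold=0.60)
```

Here `rejected` is `False` and `accepted` is `True`, which illustrates how lowering the wake-up threshold for a scene raises the wake-up rate without changing the input.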
  • FIG. 1B is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application. The method is used for electronic devices with intelligent voice dialogue wake-up functions, such as in-vehicle voice devices, smart voice TVs, smart speakers, smart dialogue toys, and other existing smart electronic devices that support voice wake-up.
  • As shown in FIG. 1B, the method includes:
  • Step S110b: In response to a received preset operation, determine that the electronic device enters the driving scenario.
  • The preset operation includes at least one of the following: turning on the interface switch, inputting a first preset voice command, connecting to the car Bluetooth, and connecting to a specific Bluetooth device, where the first preset voice command includes turning on the driving mode or entering the driving mode.
  • A voiceprint monitoring module can be set up to monitor in real time whether the electronic device receives these preset operations, so as to quickly determine that the electronic device has entered the driving scenario.
  • Step S120b: In the driving scenario, determine a preset wake-up parameter matching the driving scenario.
  • The preset wake-up parameter is used to characterize the wake-up threshold and the voiceprint threshold that meet the wake-up requirement.
  • The user's core demand at this time is a more sensitive wake-up response, that is, a high wake-up rate. Therefore, when the system recognizes that it is in a driving scenario, a lower wake-up threshold and voiceprint threshold can be set as the preset wake-up parameters, thereby greatly improving the wake-up rate.
  • Step S130b: In response to the voiceprint wake-up operation, wake up the voice assistant of the electronic device according to the preset wake-up parameter.
  • The voice assistant can interact with the user by voice, so that the user can make the electronic device perform the corresponding operations through voice commands in a driving scenario.
  • In the embodiments of the present application, the preset wake-up parameter matching the driving scenario is determined, and, in response to the voiceprint wake-up operation, the voice assistant of the electronic device is awakened according to the preset wake-up parameter. In this way, when the system recognizes that it is in a driving scenario, the preset wake-up threshold and voiceprint threshold can be adjusted to greatly increase the wake-up rate.
  • FIG. 2 is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application. As shown in FIG. 2, the method includes at least the following steps:
  • Step S210: Determine that the electronic device enters a preset scene mode according to information about the environment surrounding the electronic device.
  • The surrounding environment information includes information such as multi-user interaction status, moving speed, background music, and environmental noise.
  • The preset scene modes include driving mode, audio and video mode, conference mode, game entertainment mode, and so on. Electronic devices such as mobile phones and tablet computers generally come with multiple preset scene modes. Each scene mode provides the functions, human-computer interaction, and display interfaces that suit the scene, so that users can still perform the various operations of the electronic device in that scene.
  • An application may be provided in the electronic device to detect the current scene mode. For example, after the electronic device detects the current scene mode by using the application, it can determine whether the current scene mode is the driving mode. Alternatively, the detection can be set directly in the application layer of the driving mode: when the electronic device enters the driving mode, the application layer of the driving mode sends out a notification that the mode has been entered. This is not limited here.
  • Step S220: Adjust the default wake-up parameter of the electronic device to a preset wake-up parameter matching the preset scene mode.
  • The application layer of the preset scene mode can send a notification to the module that performs the voiceprint wake-up operation, and the voiceprint wake-up module can then automatically adjust the default wake-up parameters to the preset wake-up parameters matching the preset scene mode.
  • A mapping relationship between different preset scene modes and wake-up parameters can be established.
  • The preset wake-up parameter corresponding to the preset scene mode can then be obtained from the mapping relationship.
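The mapping relationship in step S220 could be sketched as a lookup table keyed by scene mode, with the default parameters used for any mode that has no entry. The enum members mirror the scene modes listed above; the numeric pairs (wake-up threshold, voiceprint threshold) are hypothetical placeholders.

```python
from enum import Enum


class SceneMode(Enum):
    DEFAULT = "default"
    DRIVING = "driving"
    AUDIO_VIDEO = "audio_video"
    CONFERENCE = "conference"
    GAME = "game"


# (wake_threshold, voiceprint_threshold); values are hypothetical.
DEFAULT_PARAMS = (0.50, 0.60)
SCENE_PARAMS = {
    SceneMode.DRIVING: (0.45, 0.40),
    SceneMode.CONFERENCE: (0.55, 0.65),
}


def params_for(mode: SceneMode) -> tuple[float, float]:
    """Look up the preset parameters for a scene mode, else use the defaults."""
    return SCENE_PARAMS.get(mode, DEFAULT_PARAMS)
```

Keeping the table separate from the wake-up logic means a new scene mode can be supported by adding one entry, without touching the comparison code.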
  • Step S230: Receive the input voice signal.
  • The voice signal is a voice command including a wake-up word.
  • The voice signal input by the user can be collected through the recording function of the electronic device.
  • Step S240: Determine the wake-up value and voiceprint feature of the voice signal.
  • The wake-up value may be the similarity between the wake-up word spoken by the user and the fixed wake-up word that has been set.
  • The voiceprint feature is the individual-specific information in the voice signal, which can characterize the personal identity of the user.
  • The voice signal may be converted into the corresponding text information, and semantic recognition is then performed on the text information to determine the wake-up word in the voice signal. The similarity between that wake-up word and the fixed wake-up word is compared to obtain the wake-up value of the voice signal.
  • Voiceprint recognition may be used to extract feature parameters from the voice signal to obtain the voiceprint feature, which can then be compared with the sample voiceprint feature stored in the electronic device to determine whether the wake-up comes from a specific user.
  • Step S250: In the case that the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold, wake up the application to be awakened.
  • The voice signal input by the user must satisfy both the preset wake-up threshold and the preset voiceprint threshold in order to wake up the application to be awakened.
  • That is, the wake-up value of the voice signal is greater than the preset wake-up threshold, and the voiceprint feature of the voice signal is greater than the preset voiceprint threshold. Therefore, when the default wake-up threshold is reduced, the user can wake up applications such as the voice assistant more accurately, and the efficiency of wake-up interaction between the user and the electronic device is improved.
  • Alternatively, the voiceprint feature may first be extracted from the voice signal input by the user and matched against a preset voiceprint feature to determine whether the user currently inputting the voice signal is a specific user with wake-up authority. If so, it is then determined whether the input voice signal contains a word whose recognition wake-up value is higher than the wake-up threshold, that is, the wake-up word of the specific user. If so, the voiceprint wake-up module is triggered to wake up the application to be awakened; otherwise, the application to be awakened is not awakened. In this way, the user is identified by voiceprint features, and the wake-up threshold is adjusted to optimize the user's wake-up success rate and false wake-up rate.
  • In some embodiments, step S240, determining the wake-up value and voiceprint feature of the voice signal, can be implemented through the following steps:
  • Step S2401: Determine the similarity between the voice signal and a preset wake-up word, and use the similarity as the wake-up value of the voice signal.
  • The preset wake-up word can be stored in the electronic device in advance.
  • The voice signal can be converted into the corresponding text information, and semantic recognition can then be performed to obtain the wake-up word input by the user. That wake-up word is matched against the preset wake-up word, and the matching similarity is taken as the wake-up value corresponding to the voice signal.
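Step S2401 can be illustrated on the text side only. The sketch below assumes speech-to-text has already produced the candidate wake-up word, and it uses the standard library's `difflib.SequenceMatcher` ratio as a stand-in similarity measure. This is an assumption for illustration: the application does not name a similarity metric, and a real device would more likely use an acoustic score as in the wake-up threshold definition.

```python
from difflib import SequenceMatcher

PRESET_WAKE_WORD = "xiaobu xiaobu"  # stored on the device in advance


def wake_value(recognized_text: str, preset: str = PRESET_WAKE_WORD) -> float:
    """Similarity in [0, 1] between the recognized wake word and the preset one."""
    candidate = recognized_text.strip().lower()
    target = preset.strip().lower()
    # ratio() returns 2 * matches / total length of both strings.
    return SequenceMatcher(None, candidate, target).ratio()
```

An exact match yields 1.0, a near miss such as "xiaobu xiaobo" scores slightly lower, and an unrelated phrase scores much lower, so the value can be compared directly against the preset wake-up threshold.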
  • Step S2403: Perform voiceprint recognition on the input voice signal to obtain the voiceprint feature.
  • Voiceprint recognition is performed on the voice signal input by the user and the voiceprint feature is extracted. The extracted voiceprint feature is then checked against the preset voiceprint threshold to determine whether the user currently inputting the voice signal is a specific user with wake-up authority.
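One common way to compare an extracted voiceprint feature with an enrolled one is cosine similarity between fixed-length feature vectors. The sketch below assumes that representation; the application does not specify the feature format or the similarity measure, and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an empty (all-zero) feature as no match
    return dot / (norm_a * norm_b)


def is_enrolled_speaker(
    feature: Sequence[float],
    enrolled: Sequence[float],
    voiceprint_threshold: float,
) -> bool:
    """True if the input feature is similar enough to the enrolled voiceprint."""
    return cosine_similarity(feature, enrolled) > voiceprint_threshold
```

With several registered users, the same check would be run against each stored feature and the best score compared with the threshold.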
  • In the embodiments of the present application, the electronic device is determined to have entered the preset scene mode based on information about its surrounding environment, and the current wake-up threshold and voiceprint threshold of the electronic device are adjusted to the preset wake-up threshold and voiceprint threshold matching the preset scene mode.
  • In this way, the wake-up and voiceprint thresholds can be adjusted automatically according to the current scene, which increases the user's wake-up rate and enhances the user experience.
  • FIG. 3 is an optional flowchart of the voiceprint wake-up method provided by an embodiment of this application. As shown in FIG. 3, the above step S110, "determine the preset wake-up parameter matching the current scene according to the preset condition", can also be implemented through the following steps:
  • Step S310: Monitor current operation information.
  • Step S320: In the case that the current operation information satisfies a preset operation, determine the current scene in which the electronic device is located.
  • The implementation process can include at least the following steps:
  • Step S3201: In the case that the current operation information satisfies the first preset operation, determine that the electronic device enters the driving scenario.
  • The first preset operation includes at least one of the following operations: turning on the interface switch, inputting a first preset voice command, connecting to the car Bluetooth, and connecting to a specific Bluetooth device, where the first preset voice command includes turning on the driving mode or entering the driving mode.
  • Step S3202: In the case that the current operation information satisfies a second preset operation, determine that the electronic device exits the driving scenario.
  • The second preset operation includes at least one of the following operations: turning off the interface switch, inputting a second preset voice command, disconnecting the car Bluetooth, and disconnecting the specific Bluetooth device, where the second preset voice command includes turning off the driving mode or exiting the driving mode.
  • Step S330: Adjust the default wake-up parameter of the electronic device to a preset wake-up parameter matching the current scene.
  • In the case that the electronic device enters the driving scenario, the application to be awakened is awakened according to the preset wake-up parameter; or, in the case that the electronic device exits the driving scenario, the application to be awakened is awakened according to the default wake-up parameter.
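Steps S310 to S330 can be sketched as a small monitor that tracks whether the device is in the driving scenario and exposes the parameters currently in force. The operation identifiers are hypothetical labels for the first and second preset operations listed above, and the threshold values are placeholders.

```python
# Hypothetical identifiers for the first preset operations (enter driving).
ENTER_OPERATIONS = {
    "interface_switch_on",
    "voice_enter_driving_mode",
    "car_bluetooth_connected",
    "specific_bluetooth_connected",
}
# Hypothetical identifiers for the second preset operations (exit driving).
EXIT_OPERATIONS = {
    "interface_switch_off",
    "voice_exit_driving_mode",
    "car_bluetooth_disconnected",
    "specific_bluetooth_disconnected",
}

# Placeholder threshold values.
DEFAULT_PARAMS = {"wake_threshold": 0.50, "voiceprint_threshold": 0.60}
DRIVING_PARAMS = {"wake_threshold": 0.45, "voiceprint_threshold": 0.40}


class SceneMonitor:
    """Tracks the driving scenario from monitored operation information."""

    def __init__(self) -> None:
        self.driving = False

    def on_operation(self, operation: str) -> None:
        # S3201 / S3202: any one matching operation switches the scenario.
        if operation in ENTER_OPERATIONS:
            self.driving = True
        elif operation in EXIT_OPERATIONS:
            self.driving = False

    def active_params(self) -> dict:
        # S330: preset parameters while driving, defaults otherwise.
        return DRIVING_PARAMS if self.driving else DEFAULT_PARAMS
```

For example, feeding `"car_bluetooth_connected"` to `on_operation` switches `active_params()` to the driving presets, and `"voice_exit_driving_mode"` restores the defaults; unrelated operations leave the state unchanged.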
  • FIG. 4A is a logical block diagram of improving the voiceprint wake-up rate in a driving scenario provided by an embodiment of this application.
  • As shown in FIG. 4A, the method involves four modules: an interactive interface module 41, a voice assistant module 42, a Bluetooth module 43, and a voiceprint wake-up module 44.
  • The first three modules provide services to the voiceprint wake-up module 44 so that it can comprehensively determine whether the electronic device has entered the driving scenario.
  • After the electronic device is detected to have entered the driving scenario, the voiceprint wake-up module 44 is triggered to enter the driving mode. It automatically adjusts the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm to the preset values of the driving mode and wakes up the voice assistant module 42 according to those values. Similarly, after the electronic device is detected to have exited the driving scenario, the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm are automatically restored to the original preset values. This increases the wake-up rate of the user's voiceprint wake-up and improves the user experience.
  • FIG. 4B is a schematic diagram of a voiceprint monitoring process in a non-driving mode provided by an embodiment of the application. As shown in FIG. 4B, it includes the following steps:
  • Step S410: Judge whether the electronic device enters the driving scenario.
  • The interactive interface module, the voice assistant module, or the Bluetooth module can be used to determine whether the electronic device enters the driving scenario.
  • Step S420: The voiceprint wake-up module is adjusted to enter the driving mode.
  • If the voiceprint wake-up module detects that the electronic device has entered the driving scenario, it automatically enters the driving mode.
  • Step S430: The voiceprint wake-up module is adjusted to maintain the default mode.
  • If the voiceprint wake-up module detects that the electronic device has not entered the driving scenario, it remains in the default mode, that is, the original non-driving mode, and does not enter the driving mode.
  • Step S440 Adjust the wake-up threshold and the voiceprint threshold to the preset values of the driving mode.
  • the voiceprint wake-up module After the voiceprint wake-up module enters the driving mode, it automatically adjusts the wakeup threshold and the voiceprint threshold to the preset values of the driving mode.
  • the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm are fixed original preset values in non-driving mode. After entering the driving mode, the wake-up threshold and voiceprint threshold are automatically adjusted to the preset values of the driving mode.
  • In a driving scenario, the voice assistant is essentially used only from the driving position.
  • The user of the electronic device experiences relatively little interference from other people nearby, which means that false wake-ups caused by others are very rare.
  • The user's core demand is a more sensitive wake-up response, that is, a high wake-up rate. Therefore, when the system recognizes that the electronic device is in a driving scenario, the wake-up threshold and the voiceprint threshold can be reduced, thereby greatly improving the wake-up rate.
  • When the electronic device performs a voiceprint wake-up operation in the driving mode, the voiceprint wake-up module wakes up the voice assistant according to the preset values of the driving mode, and the voice assistant then interacts with the user according to the received voice commands.
  • The voiceprint wake-up module adjusts the wake-up threshold and the voiceprint threshold after detecting that the electronic device has entered the driving scenario. Because the test targets a specific driver, the voiceprint threshold can be fixed at a relatively low value. Table 1 is the test data before lowering the wake-up threshold, and Table 2 is the test data after lowering the wake-up threshold, where the wake-up threshold is lowered by 0.05:
  • With the voiceprint threshold fixed at a very low value and the wake-up threshold reduced by 0.05 from its original value, the voiceprint wake-up rate in driving scenarios increases by more than a dozen percentage points. For example, in a driving scenario with a vehicle speed of 80 km/h, the voiceprint wake-up rate increases from 83.80% to 96.04%; in a driving scenario with a vehicle speed of 80 km/h and the music mode turned on, the voiceprint wake-up rate increases from 77.20% to 97.08%.
  • the wake-up threshold of the voiceprint wake-up algorithm and the voiceprint threshold are automatically adjusted to the preset values of the driving mode, which can significantly improve the user's voiceprint wake-up rate in the driving mode. Improve user experience.
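The gains quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the four wake-up rates given in the text; the scenario labels and the function name `gain_in_points` are illustrative and do not come from the application.

```python
# Wake-up rates (percent) quoted in the text, before and after the
# wake-up threshold is lowered by 0.05. Scenario labels are illustrative.
RATES = {
    "80 km/h": (83.80, 96.04),
    "80 km/h with music": (77.20, 97.08),
}


def gain_in_points(before: float, after: float) -> float:
    """Return the improvement in percentage points, rounded to 2 decimals."""
    return round(after - before, 2)


for scenario, (before, after) in RATES.items():
    print(f"{scenario}: +{gain_in_points(before, after)} percentage points")
```

Both differences (12.24 and 19.88 percentage points) are consistent with the statement that the wake-up rate rises by more than a dozen percentage points.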
  • FIG. 4C is a schematic diagram of the voiceprint monitoring process in the driving mode provided by an embodiment of the application. As shown in FIG. 4C, it includes the following steps:
  • Step S450: determine whether the electronic device exits the driving scenario.
  • the user interaction interface module, voice assistant module, or Bluetooth module on the electronic device can be used to determine whether the electronic device exits the driving scenario.
  • Step S460: adjust the voiceprint wake-up module to exit the driving mode.
  • if the voiceprint wake-up module detects that the electronic device has exited the driving scenario, it automatically exits the driving mode.
  • Step S470: keep the voiceprint wake-up module in the driving mode.
  • if the voiceprint wake-up module detects that the electronic device has not exited the driving scenario, it remains in the driving mode.
  • Step S480: adjust the wake-up threshold and the voiceprint threshold to the original preset values.
  • the wake-up threshold and the voiceprint threshold of the voiceprint wake-up algorithm are automatically restored to the original preset values.
  • the voiceprint wake-up method automatically enters the driving mode in response to the user's voice command or interface operation, or when the device connects to the car Bluetooth or to a user-specified Bluetooth device. After detecting that the electronic device has entered the driving mode, it automatically adjusts the wake-up threshold and the voiceprint threshold of the voiceprint wake-up algorithm to the preset values of the driving mode, which improves the user's voiceprint wake-up rate. After detecting that the electronic device has exited the driving mode, it automatically restores the wake-up threshold and the voiceprint threshold to the original preset values. In this way, the method can accurately determine whether the user is in a driving situation, automatically enter the driving mode in that situation, and automatically adjust the wake-up and voiceprint thresholds after entering it, which increases the user's wake-up rate and enhances the user experience.
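The enter, keep, and exit behaviour described for the driving mode can be sketched as a small state holder. This is a minimal sketch, not the applicant's implementation: the class and field names are hypothetical, and the threshold values are invented for illustration (the text only states that the driving presets are lower and that the wake-up threshold was lowered by 0.05 in the reported test).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    wake: float        # minimum wake-word score
    voiceprint: float  # minimum speaker-match score


# Hypothetical numbers, chosen only so that the driving presets are lower.
ORIGINAL = Thresholds(wake=0.50, voiceprint=0.60)
DRIVING = Thresholds(wake=0.45, voiceprint=0.30)


class VoiceprintWakeModule:
    """Tracks the current mode and the thresholds that go with it."""

    def __init__(self) -> None:
        self.driving_mode = False
        self.thresholds = ORIGINAL

    def update(self, in_driving_scenario: bool) -> Thresholds:
        """Run one monitoring pass: enter, keep, or exit the driving mode."""
        if in_driving_scenario and not self.driving_mode:
            self.driving_mode = True    # enter the driving mode
            self.thresholds = DRIVING   # S440: switch to the driving presets
        elif not in_driving_scenario and self.driving_mode:
            self.driving_mode = False   # S460: exit the driving mode
            self.thresholds = ORIGINAL  # S480: restore the original presets
        return self.thresholds          # otherwise the current mode is kept
```

Keeping the two presets as immutable values makes the restore step a plain assignment, so the original thresholds cannot drift after repeated mode changes.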
  • the embodiment of the present application provides a voiceprint wake-up device. The modules included in the device, and the units included in each module, can be implemented by a processor in an electronic device or, alternatively, by a logic circuit.
  • the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), etc.
  • FIG. 5A is a schematic diagram of the composition structure of a voiceprint wake-up device provided by an embodiment of the application.
  • the device 500a includes a first determining module 501a and a first wake-up module 502a, wherein:
  • the first determining module 501a is configured to determine a preset wake-up parameter matching the current scene according to preset conditions
  • the first wake-up module 502a is configured to wake up the application to be awakened according to the preset wake-up parameter in response to the voiceprint wake-up operation.
  • the first determining module 501a includes a first determining unit and a first adjusting unit, wherein: the first determining unit is configured to determine that the electronic device enters a preset scene mode; the first adjusting unit is configured to adjust the current wake-up parameter of the electronic device to a preset wake-up parameter matching the preset scene mode.
  • the first determining module 501a includes a monitoring unit, a second determining unit, and a second adjusting unit, wherein: the monitoring unit is configured to monitor current operation information; the second determining unit is configured to determine the current scene in which the electronic device is located when the current operation information satisfies the preset operation; the second adjusting unit is configured to adjust the default wake-up parameter of the electronic device to the preset wake-up parameter matching the current scene.
  • the second determining unit is further configured to determine that the electronic device enters the driving scenario when the current operation information satisfies a first preset operation, wherein the first preset operation includes at least one of the following operations: turning on an interface switch, inputting a first preset voice command, connecting to the car Bluetooth, and connecting to a specific Bluetooth device; the first preset voice command includes turning on the driving mode or entering the driving mode.
  • the second determining unit is further configured to determine that the electronic device exits the driving scenario when the current operation information satisfies a second preset operation, wherein the second preset operation includes at least one of the following operations: turning off the interface switch, inputting a second preset voice command, disconnecting from the car Bluetooth, and disconnecting from the specific Bluetooth device; the second preset voice command includes turning off the driving mode or exiting the driving mode.
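The mapping from monitored operations to the driving scenario can be sketched as a lookup over two sets. The enum members and the function name below are hypothetical labels for the first and second preset operations listed above; how the device actually detects each operation is outside this sketch.

```python
from enum import Enum, auto


class Operation(Enum):
    """Operations the device might monitor; the names are illustrative."""
    INTERFACE_SWITCH_ON = auto()
    INTERFACE_SWITCH_OFF = auto()
    VOICE_COMMAND_ENTER_DRIVING = auto()   # e.g. "turn on driving mode"
    VOICE_COMMAND_EXIT_DRIVING = auto()    # e.g. "exit driving mode"
    CAR_BLUETOOTH_CONNECTED = auto()
    CAR_BLUETOOTH_DISCONNECTED = auto()
    SPECIFIC_BLUETOOTH_CONNECTED = auto()
    SPECIFIC_BLUETOOTH_DISCONNECTED = auto()
    OTHER = auto()


# First preset operations: any one of them enters the driving scenario.
FIRST_PRESET_OPERATIONS = frozenset({
    Operation.INTERFACE_SWITCH_ON,
    Operation.VOICE_COMMAND_ENTER_DRIVING,
    Operation.CAR_BLUETOOTH_CONNECTED,
    Operation.SPECIFIC_BLUETOOTH_CONNECTED,
})

# Second preset operations: any one of them exits the driving scenario.
SECOND_PRESET_OPERATIONS = frozenset({
    Operation.INTERFACE_SWITCH_OFF,
    Operation.VOICE_COMMAND_EXIT_DRIVING,
    Operation.CAR_BLUETOOTH_DISCONNECTED,
    Operation.SPECIFIC_BLUETOOTH_DISCONNECTED,
})


def driving_scenario_after(op: Operation, currently_driving: bool) -> bool:
    """Return whether the device is in the driving scenario after `op`."""
    if op in FIRST_PRESET_OPERATIONS:
        return True
    if op in SECOND_PRESET_OPERATIONS:
        return False
    return currently_driving  # unrelated operations leave the scenario unchanged
```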
  • the device 500a further includes a second wake-up module configured to wake up the application to be awakened according to the preset wake-up parameter when the electronic device enters the driving scenario, or to wake up the application to be awakened according to the default wake-up parameter when the electronic device exits the driving scenario.
  • the first wake-up module 502a includes a receiving unit, a third determining unit, and a wake-up unit, wherein: the receiving unit is configured to receive an input voice signal; the third determining unit is configured to determine the wake-up value and the voiceprint feature of the voice signal; the wake-up unit is configured to wake up the application to be awakened when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold.
  • the third determining unit is further configured to determine the similarity between the voice signal and a preset wake-up word and use the similarity as the wake-up value of the voice signal, and to perform voiceprint recognition on the voice signal to obtain the voiceprint feature.
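The two-condition wake-up decision can be written as a single comparison. This is a sketch under stated assumptions: it treats the voiceprint feature as a scalar match score so that it can be compared with a threshold, and the function name and example numbers are hypothetical.

```python
def should_wake(
    wake_value: float,
    voiceprint_score: float,
    wake_threshold: float,
    voiceprint_threshold: float,
) -> bool:
    """Wake only if both scores are strictly greater than their thresholds.

    `wake_value` is the similarity between the voice signal and the preset
    wake-up word; `voiceprint_score` is assumed to be a scalar speaker-match
    score derived from the voiceprint feature.
    """
    return wake_value > wake_threshold and voiceprint_score > voiceprint_threshold


# The same utterance can be rejected under stricter thresholds and accepted
# under lowered ones (all numbers are illustrative).
utterance = {"wake_value": 0.48, "voiceprint_score": 0.70}
print(should_wake(**utterance, wake_threshold=0.50, voiceprint_threshold=0.60))
print(should_wake(**utterance, wake_threshold=0.45, voiceprint_threshold=0.30))
```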
  • the embodiment of the present application provides a voiceprint wake-up device. The modules included in the device, and the units included in each module, can be implemented by a processor in an electronic device or, alternatively, by a logic circuit.
  • the processor may be a central processing unit, a microprocessor, a digital signal processor, or a field programmable gate array.
  • FIG. 5B is a schematic diagram of the composition structure of a voiceprint wake-up device provided by an embodiment of the application.
  • the device 500b includes a second determining module 501b, a third determining module 502b, and a third wake-up module 503b, wherein:
  • the second determining module 501b is configured to determine that the electronic device enters the driving scenario in response to receiving a preset operation of waking up the application to be awakened;
  • the third determining module 502b is configured to determine a preset wake-up parameter matching the driving scenario under the driving scenario;
  • the third wake-up module 503b is configured to wake up the voice assistant of the electronic device according to the preset wake-up parameter in response to the voiceprint wake-up operation.
  • the preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice command, connecting to the car Bluetooth, and connecting to a specific Bluetooth device, wherein the first preset voice command includes turning on the driving mode or entering the driving mode.
  • if the above voiceprint wake-up method is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the related technologies, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that enable the automatic test line of the device containing the storage medium to execute all or part of the method described in each embodiment of the present application.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), magnetic disks, optical disks, and other media that can store program code.
  • FIG. 6 is a schematic diagram of the hardware entity of the electronic device provided by an embodiment of the application.
  • the hardware entity of the device 600 includes a processor 601, a communication interface 602, and a memory 603, wherein:
  • the processor 601 generally controls the overall operation of the device 600.
  • the communication interface 602 can enable the device 600 to communicate with other electronic devices or servers through a network.
  • the memory 603 is configured to store instructions and applications executable by the processor 601, and can also cache data to be processed or already processed by the processor 601 and each module in the device 600 (for example, image data, audio data, voice communication data, and video communication data); it can be implemented by flash memory (FLASH) or random access memory (RAM).
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the voiceprint wake-up method provided in the above-mentioned embodiments are implemented.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the embodiments of the present application can be all integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit;
  • the unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • if the aforementioned integrated unit of the present application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the related technologies, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that enable the equipment automatic test line to execute all or part of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: removable storage devices, ROMs, magnetic disks, or optical disks and other media that can store program codes.
  • the preset wake-up parameters that match the current scene are first determined according to preset conditions; wherein the preset wake-up parameters include a preset wake-up threshold and a preset voiceprint threshold; then, in response to the voiceprint wake-up operation, The application program to be awakened is awakened according to the preset wake-up parameters; in this way, the wake-up and voiceprint thresholds can be automatically adjusted according to the current scene, which improves the user's wake-up rate and enhances the user experience.

Abstract

A voiceprint wakeup method and apparatus, a device, and a storage medium. The method comprises: determining, according to a preset condition, a preset wakeup parameter matching a current scene, wherein the preset wakeup parameter is used for characterizing a wakeup threshold and/or a voiceprint threshold meeting a wakeup requirement (S110a); and in response to a voiceprint wakeup operation, awakening, according to the preset wakeup parameter, an application to be awakened (S120a). In the present method, the current scene comprises a driving scene.

Description

Voiceprint wake-up method and apparatus, device, and storage medium
Cross-reference to related applications
This application is based on Chinese patent application No. 202010172688.7, filed on March 12, 2020, and claims priority to that application, the entire content of which is hereby incorporated into this application by reference.
Technical field
This application relates to the technical field of voice processing, and relates to, but is not limited to, a voiceprint wake-up method and apparatus, a device, and a storage medium.
Background
A voiceprint feature is one of the important biometric features of the human body. It is highly specific to the individual and is often used as an identity-authentication feature in fields such as voiceprint recognition and voiceprint authentication. With the rapid development of voice processing technology, voiceprint wake-up technology has been widely applied in electronic devices with an intelligent voice dialogue wake-up function. For example, when it is inconvenient to operate an electronic device directly, a user can speak a voice command to wake up a voice interaction application such as a "voice assistant", and then control the electronic device through voice interaction with that application.
Summary of the invention
The embodiments of the present application provide a voiceprint wake-up method and apparatus, a device, and a storage medium.
The technical solutions of the embodiments of the present application are implemented as follows:
In a first aspect, an embodiment of the present application provides a voiceprint wake-up method, and the method includes:
determining, according to a preset condition, a preset wake-up parameter matching the current scene, wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement;
in response to a voiceprint wake-up operation, waking up the application to be awakened according to the preset wake-up parameter.
In a second aspect, an embodiment of the present application provides a voiceprint wake-up device. The device includes a first determining module and a first wake-up module, wherein:
the first determining module is configured to determine, according to a preset condition, a preset wake-up parameter matching the current scene, wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement;
the first wake-up module is configured to wake up the application to be awakened according to the preset wake-up parameter in response to a voiceprint wake-up operation.
In a third aspect, an embodiment of the present application provides an electronic device including a memory and a processor. The memory stores a computer program that can run on the processor, and when the computer program is executed by the processor, the processor performs the following operations: obtaining a preset wake-up parameter matching the current scene, wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement; and, in response to a voiceprint wake-up operation, waking up the application to be awakened according to the preset wake-up parameter.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the voiceprint wake-up method described above are implemented.
The beneficial effects of the technical solutions provided by the embodiments of the present invention include at least the following:
In the embodiments of the present application, a preset wake-up parameter matching the current scene is first determined according to a preset condition, wherein the preset wake-up parameter is used to characterize a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement; then, in response to a voiceprint wake-up operation, the application to be awakened is woken up according to the preset wake-up parameter. In this way, the wake-up and voiceprint thresholds can be adjusted automatically according to the current scene, which improves the user's wake-up rate and enhances the user experience.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
FIG. 1A is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of this application;
FIG. 1B is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of this application;
FIG. 2 is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of this application;
FIG. 3 is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of this application;
FIG. 4A is a logical block diagram for improving the voiceprint wake-up rate in a driving scenario, provided by an embodiment of this application;
FIG. 4B is a schematic diagram of the voiceprint monitoring process in the non-driving mode provided by an embodiment of this application;
FIG. 4C is a schematic diagram of the voiceprint monitoring process in the driving mode provided by an embodiment of this application;
FIG. 5A is a schematic diagram of the composition structure of the voiceprint wake-up device provided by an embodiment of this application;
FIG. 5B is a schematic diagram of the composition structure of the voiceprint wake-up device provided by an embodiment of this application;
FIG. 6 is a schematic diagram of the hardware entity of the electronic device provided by an embodiment of this application.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. The following embodiments are used to illustrate the application, not to limit its scope. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
Before the voiceprint wake-up method provided by the embodiments of the present application is described in detail, the terms and related technologies involved in the embodiments are briefly introduced.
Voiceprint recognition (VPR) is a biometric technology that converts an acoustic signal into an electrical signal, which is then recognized by a computer. It is also known as speaker recognition and falls into two categories: speaker identification and speaker verification. The former determines which of several people spoke a given utterance, which is a one-of-many selection problem; the latter confirms whether a given utterance was spoken by a specified person, which is a one-to-one decision problem.
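The difference between the two tasks can be sketched with toy embeddings. Everything below is an assumption made for illustration: the application does not specify how voiceprints are represented or compared, so the sketch represents each voiceprint as a short vector and uses cosine similarity as a stand-in scoring function.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(probe: Sequence[float], enrolled: Dict[str, Sequence[float]]) -> str:
    """Speaker identification: pick the best match among several enrolled speakers."""
    return max(enrolled, key=lambda name: cosine(probe, enrolled[name]))


def verify(probe: Sequence[float], claimed: Sequence[float], threshold: float) -> bool:
    """Speaker verification: accept or reject a single claimed identity."""
    return cosine(probe, claimed) >= threshold


# Toy two-dimensional "voiceprints"; real systems use much longer vectors.
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
probe = [0.9, 0.1]
print(identify(probe, enrolled))
print(verify(probe, enrolled["alice"], threshold=0.8))
```

Identification always returns some enrolled speaker, whereas verification can reject the probe, which is why a threshold only appears in the second function.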
In in-vehicle scenarios, without voiceprint recognition a smart rearview mirror can easily be woken by a wake-up word that happens to occur in a radio broadcast, which interferes with driving. Voiceprint recognition lets a smart device know who is issuing a command at the moment it receives the command, so that each user can obtain the most suitable content and services with a single sentence, making human-computer interaction truly natural and intelligent.
In voice interaction scenarios, the voiceprint is the best choice of biometric identity (ID). From recorded speech, the system automatically extracts voiceprint features and can quickly identify the speaker regardless of the language spoken.
Voice wake-up: the user wakes up the electronic device by speaking a wake-up word, so that the device either enters a state of waiting for a voice command or directly executes a predetermined voice command.
Wake-up word: a character string used to wake up the electronic device to perform a voice wake-up operation, for example "Xiaoai classmate" or "Xiaobu Xiaobu".
Voice command: an instruction by which the electronic device is controlled by voice to perform a predetermined operation, for example "navigate home" or "play music".
Wake-up threshold: used by the electronic device to decide whether to perform a voice wake-up operation for a wake-up word. When the acoustic score of the wake-up word is greater than the wake-up threshold, the voice wake-up operation is performed; when the acoustic score is less than the wake-up threshold, it is not performed. The wake-up threshold is set according to the number of false wake-ups: the larger the threshold, the fewer the false wake-ups. A practical product usually has to keep the number of false wake-ups within 24 hours to at most one (<= 1). To meet this requirement, 100 hours of noise audio are prepared and the wake-up threshold is adjusted against them.
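Calibrating the wake-up threshold against a noise corpus can be sketched as a search over candidate thresholds. This is a minimal sketch under several assumptions: `noise_scores` is taken to be a list of wake-word scores produced on noise-only audio, the candidate grid is arbitrary, and the budget is scaled linearly from "at most 1 false wake-up per 24 hours" to the length of the corpus. None of these details are given in the text.

```python
from typing import Sequence


def calibrate_wake_threshold(
    noise_scores: Sequence[float],
    noise_hours: float,
    candidates: Sequence[float],
    max_false_wakes_per_24h: float = 1.0,
) -> float:
    """Return the lowest candidate threshold that keeps false wake-ups within budget.

    A noise segment counts as a false wake-up when its score is greater than
    the threshold, mirroring the "greater than" rule used for real wake-ups.
    """
    budget = max_false_wakes_per_24h * noise_hours / 24.0
    for threshold in sorted(candidates):
        false_wakes = sum(score > threshold for score in noise_scores)
        if false_wakes <= budget:
            return threshold
    raise ValueError("no candidate threshold meets the false wake-up budget")


# Illustrative scores only; with 100 hours of noise the budget is about 4.17.
scores = [0.20, 0.30, 0.35, 0.40, 0.42, 0.47, 0.55]
print(calibrate_wake_threshold(scores, noise_hours=100, candidates=[0.3, 0.4, 0.5]))
```

Choosing the lowest threshold that fits the budget keeps the wake-up rate as high as the false wake-up requirement allows.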
Voiceprint: the spectrum of a sound wave carrying speech information, as displayed by an electro-acoustic instrument.
Voiceprint threshold: used to determine whether a voice wake-up operation was performed by a specified user; it can be applied to the similarity between the input voiceprint feature and the stored enrolled voiceprint features of all registered users.
Wake-up rate: the success rate of user interaction; the technical term is recall.
False wake-up: a voice wake-up triggered by speech that does not contain the specific wake-up word.
False wake-up rate: the probability of a false wake-up within a given period of time.
Improving the wake-up rate, that is, the wake-up success rate, mainly means lowering the bar for waking up. The simplest and quickest way is to lower the wake-up threshold so that the device wakes more easily, but this also increases false wake-ups. The wake-up threshold is obtained by weighting the posteriors of the characters of the wake-up word in their order of appearance, so lowering it relaxes the requirement on some characters of the wake-up word and increases false wake-ups caused by deletions and substitutions. In other words, the lower the wake-up threshold, the higher both the wake-up rate and the false wake-up rate; the higher the wake-up threshold, the lower both the wake-up rate and the false wake-up rate.
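The trade-off can be made concrete by sweeping the threshold over two small sets of scores. The scores below are invented for illustration and the function name is hypothetical; the sketch only shows that both the wake-up rate and the number of false wake-ups rise as the threshold falls.

```python
from typing import Sequence, Tuple


def rates_at(
    threshold: float,
    wake_word_scores: Sequence[float],
    noise_scores: Sequence[float],
) -> Tuple[float, int]:
    """Return (wake-up rate, false wake-up count) at a given threshold."""
    woken = sum(score > threshold for score in wake_word_scores)
    false_wakes = sum(score > threshold for score in noise_scores)
    return woken / len(wake_word_scores), false_wakes


# Illustrative scores: genuine wake-word utterances and noise-only audio.
WAKE_WORD_SCORES = [0.42, 0.47, 0.52, 0.58, 0.63]
NOISE_SCORES = [0.30, 0.41, 0.46, 0.49]

for threshold in (0.50, 0.45, 0.40):
    wake_rate, false_wakes = rates_at(threshold, WAKE_WORD_SCORES, NOISE_SCORES)
    print(f"threshold={threshold:.2f} wake-up rate={wake_rate:.0%} false wake-ups={false_wakes}")
```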
At present, electronic devices generally have a voice interaction function, namely a voice assistant. A voice assistant is an intelligent mobile phone application that helps users solve problems, mainly everyday ones, through intelligent dialogue and instant question answering. Users can perform some operations on the electronic device by inputting voice commands, which frees their hands when controlling the device in certain situations.
In the prior art, voiceprint wake-up generally relies on preset wake-up and voiceprint thresholds. One related technology proposes a scene-based voice operation method and device which, after receiving a voice wake-up instruction, directly wakes up the relevant application according to scene information, without waking up the voice assistant during the wake-up process. Another technology proposes a voice wake-up method for in-vehicle environments which activates wake-up-free control logic when the user state meets certain conditions, so that the voice control logic can be triggered by voice without a wake-up word. A further related technology proposes a method and device for adjusting wake-up sensitivity; it targets smart speaker devices and dynamically adjusts preset parameters by monitoring ambient noise or network traffic.
In summary, the existing ways of improving the voiceprint wake-up rate have the following significant shortcomings: 1) because the scenarios in which users use electronic devices are diverse and constantly changing, fixed preset thresholds cannot meet users' wake-up needs across multiple scenarios; 2) because the voice assistant is not woken up during the wake-up process, the usage scenarios are very limited, and under current technical conditions only a very limited set of applications can be woken up; 3) because wake-up-free control logic is activated when the user state meets certain conditions, the device is frequently woken up by mistake; 4) monitoring only ambient noise or network traffic cannot meet the needs of the more diverse wake-up scenarios of electronic devices.
To address at least one of the above problems in the related technologies, the embodiments of the present application provide a voiceprint wake-up method and apparatus, a device, and a storage medium, which can automatically adjust the wake-up and voiceprint thresholds according to the current scene, improving the user's wake-up rate and enhancing the user experience.
FIG. 1A is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of the present application. The method is applied to electronic devices with an intelligent voice-dialogue wake-up function, such as in-vehicle voice devices, smart voice TVs, smart speakers, smart dialogue toys, and other existing smart electronic devices that support voice wake-up. As shown in FIG. 1A, the method includes:
Step S110a: determine, according to a preset condition, a preset wake-up parameter matching the current scene.
Here, the preset wake-up parameter characterizes the wake-up threshold and/or the voiceprint threshold that satisfy the wake-up requirement. The wake-up threshold is the threshold that applies to the wake-up word in a preset scene, and the voiceprint threshold is the threshold that applies to the user's voiceprint feature in a preset scene.
It should be noted that voice signals may be collected in a preset scene and the wake-up threshold for that scene obtained through model training; alternatively, voiceprint features may be extracted from voice signals input by a specific user, and sample voiceprint features stored or a voiceprint database built, so as to determine the preset voiceprint threshold.
Understandably, when the user enters a driving scenario or a quiet indoor scene, there is relatively little interference from other people nearby, which means that false wake-ups caused by others are very rare. The preset wake-up threshold and preset voiceprint threshold can therefore be determined per usage scenario, so as to balance the false wake-up rate against the wake-up success rate.
In some embodiments, the preset wake-up threshold and preset voiceprint threshold matching the current scene may be determined from information about the surroundings of the electronic device, such as ambient noise extracted from the collected voice signal or the environmental state captured by a camera. For example, when the electronic device is in a relatively quiet environment such as a bedroom or an office, there is little ambient noise and relatively little interference from other users, so the wake-up threshold and voiceprint threshold can be lowered to increase the wake-up rate.
In some embodiments, whether the electronic device has entered a specific usage scenario may also be determined from specific user operations on the device, such as toggling an interface switch, issuing a specific voice instruction, or connecting a specific Bluetooth device, and the preset wake-up threshold and preset voiceprint threshold matching the current scene are determined accordingly. For example, when the user turns on the driving-mode switch, inputs the voice instruction "turn on (or enter) driving mode", or connects to the in-vehicle Bluetooth, the electronic device is currently in a driving scenario in which the voice assistant is used only from the driver's seat and interference from other users is relatively low; the wake-up threshold and voiceprint threshold can therefore be lowered to increase the wake-up rate.
Step S120a: in response to a voiceprint wake-up operation, wake up the application to be woken up according to the preset wake-up parameter.
Here, the voiceprint wake-up operation may be a voice instruction that includes a wake-up word; for example, the voice instruction may be "navigate home" or "Xiaobu Xiaobu, play music".
Here, the wake-up threshold and/or voiceprint threshold are preset in the electronic device or software. When a designated user who satisfies the voiceprint threshold issues a voice instruction, or when the acoustic score of the wake-up word contained in the voice instruction is greater than the wake-up threshold, the application to be woken up on the electronic device is woken from its dormant state so that it can respond to the user's further voice instructions.
Here, the application to be woken up is an application installed on the electronic device that is in a to-be-activated state. Waking it up means moving an application that is in the to-be-activated or dormant state directly into a state of waiting for instructions. For example, the application to be woken up may be a voice assistant; once woken, the voice assistant further recognizes voice instructions and responds to user operations.
Understandably, after the electronic device is powered on, the voice assistant is in the to-be-activated state and occupies a small amount of memory. When the user's voiceprint feature matches the voiceprint feature pre-stored in the electronic device and the user inputs a voice instruction containing the preset wake-up word, or when the wake-up value of the voice instruction is greater than the preset wake-up threshold, the voice assistant is woken up, that is, activated. The user can then say what they want done, such as "play music" or "call XXX", and the electronic device completes the operation automatically.
In this embodiment of the present application, a preset wake-up parameter matching the current scene is first determined according to a preset condition, where the preset wake-up parameter includes a preset wake-up threshold and a preset voiceprint threshold; an input voice signal is received; the wake-up value and voiceprint feature of the voice signal are determined; and the application to be woken up is woken up when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold. In this way, the wake-up threshold and voiceprint threshold can be adjusted automatically according to the current scene, improving the wake-up rate and the user experience.
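The decision rule summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: `WakeParams`, `should_wake`, and all numeric values are hypothetical, and the voiceprint feature is assumed to have been reduced to a single matching score so that it can be compared with a threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeParams:
    """Hypothetical container for the two preset thresholds."""
    wake_threshold: float
    voiceprint_threshold: float


def should_wake(wake_value: float, voiceprint_score: float, params: WakeParams) -> bool:
    """Wake only if both values are strictly greater than their thresholds."""
    return (wake_value > params.wake_threshold
            and voiceprint_score > params.voiceprint_threshold)


# Illustrative values only; the source does not give concrete thresholds.
default_params = WakeParams(wake_threshold=0.80, voiceprint_threshold=0.70)
driving_params = WakeParams(wake_threshold=0.75, voiceprint_threshold=0.50)

print(should_wake(0.78, 0.60, default_params))  # False: 0.78 is not above 0.80
print(should_wake(0.78, 0.60, driving_params))  # True: both thresholds are exceeded
```

The same utterance is rejected under the default parameters and accepted under the lower driving-scenario parameters, which is the effect the embodiment relies on to raise the wake-up rate.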
FIG. 1B is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of the present application. The method is applied to electronic devices with an intelligent voice-dialogue wake-up function, such as in-vehicle voice devices, smart voice TVs, smart speakers, smart dialogue toys, and other existing smart electronic devices that support voice wake-up. As shown in FIG. 1B, the method includes:
Step S110b: in response to a received preset operation, determine that the electronic device has entered a driving scenario.
Here, the preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to in-vehicle Bluetooth, and connecting to a specific Bluetooth device, where the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
In implementation, a voiceprint monitoring module may be provided to monitor in real time whether the electronic device receives any of these preset operations, so that entry into the driving scenario can be determined quickly.
Step S120b: in the driving scenario, determine a preset wake-up parameter matching the driving scenario.
Here, the preset wake-up parameter characterizes the wake-up threshold and the voiceprint threshold that satisfy the wake-up requirement.
When the electronic device is in a driving scenario, the user's core need is a more responsive wake-up, that is, a high wake-up rate. Therefore, when the system recognizes that the device is in a driving scenario, a lower wake-up threshold and a lower voiceprint threshold can be set as the preset wake-up parameters, which considerably increases the wake-up rate.
Step S130b: in response to a voiceprint wake-up operation, wake up the voice assistant of the electronic device according to the preset wake-up parameter.
Here, once the voice assistant of the electronic device is woken up, it can interact with the user by voice, so that in the driving scenario the user can have the electronic device perform the corresponding operations through voice instructions.
In this embodiment of the present application, it is first determined, in response to a received preset operation, that the electronic device has entered a driving scenario; in the driving scenario, a preset wake-up parameter matching the driving scenario is determined; and in response to a voiceprint wake-up operation, the voice assistant of the electronic device is woken up according to the preset wake-up parameter. In this way, when the system recognizes that the device is in a driving scenario, the preset wake-up threshold and voiceprint threshold are adjusted, which considerably increases the wake-up rate.
FIG. 2 is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of the present application. As shown in FIG. 2, the method includes at least the following steps:
Step S210: determine, according to information about the surroundings of the electronic device, that the electronic device has entered a preset scene mode.
Here, the information about the surroundings includes the multi-user interaction state, moving speed, background music, ambient noise, and similar information.
Here, the preset scene modes include a driving mode, an audio/video mode, a conference mode, a game and entertainment mode, and so on. Understandably, electronic devices such as mobile phones and tablet computers are generally preconfigured with multiple scene modes, and each scene mode provides the functions, human-computer interaction, and display interface suited to that scene, so that the user can still operate the electronic device conveniently in that scene.
In some embodiments, an application for detecting the current scene mode may be provided in the electronic device; for example, after the electronic device detects the current scene mode through the application, it can determine whether the current scene mode is the driving mode. Alternatively, this may be configured directly in the application layer of the driving mode, so that when the electronic device enters the driving mode, the application layer of the driving mode issues a notification that the mode has been entered. No limitation is imposed here.
Step S220: adjust the default wake-up parameter of the electronic device to the preset wake-up parameter matching the preset scene mode.
Here, after the electronic device enables the preset scene mode, the application layer of that scene mode may send a notification to the module that performs the voiceprint wake-up operation, and the voiceprint wake-up module may automatically adjust the default wake-up parameter to the preset wake-up parameter matching the preset scene mode.
In some embodiments, a mapping between the different preset scene modes and wake-up parameters may be established. When it is determined that the electronic device has entered a given preset scene mode, the preset wake-up parameter corresponding to that scene mode is obtained from the mapping, and the default wake-up parameter is automatically adjusted to it.
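The mapping between preset scene modes and wake-up parameters described above can be sketched as a simple lookup with a fallback to the defaults. The names `SCENE_PARAMS`, `DEFAULT_PARAMS`, and `params_for_scene`, the scene keys, and every number below are assumptions made for illustration; the source states only that lower thresholds suit the driving scenario and gives no values for other modes.

```python
from typing import Dict, Tuple

# Each entry is (wake_threshold, voiceprint_threshold).
# All numbers are illustrative assumptions, not values from the source.
DEFAULT_PARAMS: Tuple[float, float] = (0.80, 0.70)

SCENE_PARAMS: Dict[str, Tuple[float, float]] = {
    "driving": (0.75, 0.50),     # little interference from others: lower thresholds
    "conference": (0.90, 0.80),  # assumed stricter, since many speakers are present
}


def params_for_scene(scene_mode: str) -> Tuple[float, float]:
    """Return the preset parameters for a scene mode, or the defaults if unmapped."""
    return SCENE_PARAMS.get(scene_mode, DEFAULT_PARAMS)


wake_threshold, voiceprint_threshold = params_for_scene("driving")
```

Falling back to the default parameters for an unmapped scene mode keeps the device's behavior unchanged whenever scene detection yields a mode that has no dedicated preset.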
Step S230: receive an input voice signal.
Here, the voice signal is a voice instruction that includes a wake-up word, and the voice signal input by the user may be collected through the recording function of the electronic device.
Step S240: determine the wake-up value and the voiceprint feature of the voice signal.
Here, the wake-up value may be the similarity between the wake-up word spoken by the user and the configured fixed wake-up word; the voiceprint feature is the speaker-specific information in the voice signal and can characterize the user's personal identity.
In some embodiments, the voice signal may be converted into the corresponding text, semantic recognition is then performed on the text to determine the wake-up word in the voice signal, and the similarity between that wake-up word and the configured fixed wake-up word is computed to obtain the wake-up value of the voice signal.
In some embodiments, feature parameters may be extracted from the voice signal through voiceprint recognition to obtain the voiceprint feature, which is then compared with the sample voiceprint features stored in the electronic device to determine whether the wake-up comes from a specific user.
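The comparison between an extracted voiceprint feature and a stored sample feature can be illustrated with cosine similarity between two feature vectors. The source does not specify the comparison metric, so cosine similarity is an assumption chosen here only because it is a common way to score such vectors; `cosine_similarity` and the example vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no speaker information
    return dot / (norm_a * norm_b)


# Toy vectors standing in for an enrolled sample and a new utterance.
enrolled = [0.2, 0.8, 0.1]
utterance = [0.25, 0.75, 0.05]
score = cosine_similarity(enrolled, utterance)
```

A score produced this way is a single number, which is what makes a comparison against the preset voiceprint threshold meaningful.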
Step S250: when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold, wake up the application to be woken up.
Here, the voice signal input by the user must satisfy both the preset wake-up threshold and the preset voiceprint threshold before the application to be woken up can be woken up; that is, the wake-up value of the voice signal is greater than the preset wake-up threshold, and the voiceprint feature of the voice signal is greater than the preset voiceprint threshold. Thus, when the default wake-up threshold is lowered, the user can wake up applications such as the voice assistant relatively accurately, which improves the efficiency of wake-up interaction between the user and the electronic device.
In some embodiments, the voiceprint feature may first be extracted from the voice signal input by the user and matched against the preset voiceprint feature to determine whether the user who input the voice signal is a specific user with wake-up permission. If so, it is then determined whether the input voice signal contains a word whose recognized wake-up value is higher than the wake-up threshold, such a word being the wake-up word of that specific user. If it does, the voiceprint wake-up module is triggered to wake up the application to be woken up; otherwise the application is not woken up. In this way, the user is identified by voiceprint feature and the wake-up threshold is adjusted, optimizing that user's wake-up success rate and false wake-up rate.
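The two-stage check described above, speaker verification first and wake-word detection second, can be sketched as follows. `two_stage_wake` and its parameters are hypothetical names; the per-word wake-up values are assumed to come from an upstream recognizer that is not shown.

```python
from typing import Iterable, Optional, Tuple


def two_stage_wake(
    voiceprint_score: float,
    voiceprint_threshold: float,
    word_scores: Iterable[Tuple[str, float]],
    wake_threshold: float,
) -> Optional[str]:
    """Return the word that triggers wake-up, or None if the device stays asleep."""
    # Stage 1: reject speakers who do not have wake-up permission.
    if voiceprint_score <= voiceprint_threshold:
        return None
    # Stage 2: look for a recognized word whose wake-up value exceeds the threshold.
    for word, wake_value in word_scores:
        if wake_value > wake_threshold:
            return word
    return None


# Illustrative scores only.
trigger = two_stage_wake(0.9, 0.7, [("hello", 0.20), ("xiaobu", 0.85)], 0.8)
```

Checking the voiceprint first means the wake-word scan is skipped entirely for speakers without permission.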
In some embodiments, step S240 above, "determine the wake-up value and the voiceprint feature of the voice signal", may be implemented through the following process:
Step S2401: determine the similarity between the voice signal and the preset wake-up word, and use the similarity as the wake-up value of the voice signal.
Here, the preset wake-up word may be stored in the electronic device in advance. After the voice signal input by the user is received, it may be converted into the corresponding text, semantic recognition may then be performed to obtain the wake-up word spoken by the user, and that wake-up word is matched against the preset wake-up word; the resulting similarity is used as the wake-up value of the voice signal.
Step S2403: perform voiceprint recognition on the input voice signal to obtain the voiceprint feature.
Here, voiceprint recognition is performed on the voice signal input by the user to extract the voiceprint feature, which is then matched against the preset voiceprint threshold to determine whether the user who input the voice signal is a specific user with wake-up permission.
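Step S2401 above can be illustrated with a text-level similarity between the recognized wake-up word and the preset one. The source does not say how the similarity is computed; the sketch below uses `difflib.SequenceMatcher` from the Python standard library purely as a stand-in, and assumes the speech-to-text step has already produced `recognized_text`. The function name `wake_value` is hypothetical.

```python
from difflib import SequenceMatcher


def wake_value(recognized_text: str, preset_wake_word: str) -> float:
    """Similarity in [0, 1] between the recognized text and the preset wake-up word."""
    a = recognized_text.strip().lower()
    b = preset_wake_word.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


exact = wake_value("Xiaobu Xiaobu", "xiaobu xiaobu")
close = wake_value("xiaobu xiaobo", "xiaobu xiaobu")
```

An exact match yields 1.0 and a near miss yields a value slightly below it, so lowering the wake-up threshold admits more near misses, which is how a threshold change trades wake-up success against false wake-ups.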
In this embodiment of the present application, it is determined from information about the surroundings of the electronic device that the device has entered a preset scene mode, and the current wake-up threshold and voiceprint threshold of the device are adjusted to the preset wake-up threshold and voiceprint threshold matching that scene mode. In this way, the wake-up threshold and voiceprint threshold can be adjusted automatically according to the current scene, improving the wake-up rate and the user experience.
FIG. 3 is an optional schematic flowchart of the voiceprint wake-up method provided by an embodiment of the present application. As shown in FIG. 3, "determine, according to a preset condition, a preset wake-up parameter matching the current scene" in step S110 above may also be implemented through the following steps:
Step S310: monitor current operation information.
Step S320: when the current operation information satisfies a preset operation, determine the current scene in which the electronic device is located.
Here, the implementation may include at least the following steps:
Step S3201: when the current operation information satisfies a first preset operation, determine that the electronic device has entered the driving scenario.
Here, the first preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to in-vehicle Bluetooth, and connecting to a specific Bluetooth device, where the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
Step S3202: when the current operation information satisfies a second preset operation, determine that the electronic device has exited the driving scenario.
Here, the second preset operation includes at least one of the following: turning off the interface switch, inputting a second preset voice instruction, disconnecting the in-vehicle Bluetooth, and disconnecting the specific Bluetooth device, where the second preset voice instruction includes "turn off driving mode" or "exit driving mode".
Step S330: adjust the default wake-up parameter of the electronic device to the preset wake-up parameter matching the current scene.
Further, when the electronic device has entered the driving scenario, the application to be woken up is woken up according to the preset wake-up parameter; when the electronic device has exited the driving scenario, the application to be woken up is woken up according to the default wake-up parameter.
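Steps S3201 and S3202 above amount to classifying each monitored operation as an "enter" or an "exit" event. A minimal sketch follows; the `Operation` enum, its member names, and `next_in_driving` are hypothetical, and how the operations are actually detected (switch callbacks, Bluetooth events, voice recognition) is outside the sketch.

```python
from enum import Enum


class Operation(Enum):
    SWITCH_ON = "switch_on"              # interface switch turned on
    SWITCH_OFF = "switch_off"            # interface switch turned off
    VOICE_ENTER = "voice_enter"          # "turn on / enter driving mode"
    VOICE_EXIT = "voice_exit"            # "turn off / exit driving mode"
    BT_CONNECTED = "bt_connected"        # in-vehicle or designated Bluetooth device
    BT_DISCONNECTED = "bt_disconnected"
    OTHER = "other"                      # any operation unrelated to driving mode


ENTER_OPS = {Operation.SWITCH_ON, Operation.VOICE_ENTER, Operation.BT_CONNECTED}
EXIT_OPS = {Operation.SWITCH_OFF, Operation.VOICE_EXIT, Operation.BT_DISCONNECTED}


def next_in_driving(in_driving: bool, op: Operation) -> bool:
    """Return whether the device is in the driving scenario after operation `op`."""
    if op in ENTER_OPS:
        return True
    if op in EXIT_OPS:
        return False
    return in_driving  # unrelated operations leave the scenario unchanged


state = False
for op in (Operation.BT_CONNECTED, Operation.OTHER, Operation.VOICE_EXIT):
    state = next_in_driving(state, op)
```

After the three operations in the usage loop the device has entered and then exited the driving scenario, so `state` ends as `False`.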
An exemplary application of the embodiments of the present application in a practical application scenario is described below.
An embodiment of the present application provides a method for improving the voiceprint wake-up rate in a driving scenario, applied to an electronic device. FIG. 4A is a logical block diagram of improving the voiceprint wake-up rate in a driving scenario provided by an embodiment of the present application. As shown in FIG. 4A, the method involves four modules: an interactive interface module 41, a voice assistant module 42, a Bluetooth module 43, and a voiceprint wake-up module 44. The first three modules serve the voiceprint wake-up module 44 and together determine whether the electronic device has entered a driving scenario. In the driving scenario, the voiceprint wake-up module 44 is triggered to enter the driving mode; it then automatically adjusts the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm to the preset values of the driving mode and wakes up the voice assistant module 42 according to those values. Likewise, after it is detected that the electronic device has exited the driving scenario, the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm are automatically restored to the original preset values. This improves the voiceprint wake-up rate and the user experience.
FIG. 4B is a schematic diagram of the voiceprint monitoring process in the non-driving mode provided by an embodiment of the present application. As shown in FIG. 4B, the process includes the following steps:
Step S410: determine whether the electronic device has entered a driving scenario.
Here, whether the electronic device has entered a driving scenario may be determined through the interactive interface module, the voice assistant module, or the Bluetooth module.
For example, in response to the user operating the interface switch, issuing a voice instruction such as "turn on driving mode" or "enter driving mode", or connecting to the in-vehicle Bluetooth or to a Bluetooth device designated by the user, it is determined that the electronic device has entered a driving scenario; otherwise, it is determined that the electronic device has not entered a driving scenario.
Step S420: switch the voiceprint wake-up module to the driving mode.
Here, after the voiceprint wake-up module detects that the electronic device has entered a driving scenario, it automatically enters the driving mode.
Step S430: keep the voiceprint wake-up module in the default mode.
Here, when the voiceprint wake-up module detects that the electronic device has not entered a driving scenario, it remains in the default mode, that is, the original non-driving mode, and does not enter the driving mode.
Step S440: adjust the wake-up threshold and the voiceprint threshold to the preset values of the driving mode.
After entering the driving mode, the voiceprint wake-up module automatically adjusts the wake-up threshold and the voiceprint threshold to the preset values of the driving mode.
Here, the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm are fixed at the original preset values in the non-driving mode; after the driving mode is entered, they are automatically adjusted to the preset values of the driving mode.
Normally, once the electronic device has entered a driving scenario, the voice assistant is essentially being used from the driver's seat. Interference from other people near the user is then relatively minimal, which means the rate of false wake-ups caused by others is very low. The user's core need at this point is a more responsive wake-up, that is, a high wake-up rate. Therefore, when the system recognizes that the electronic device is in a driving scenario, the wake-up threshold and voiceprint threshold can be lowered, which considerably increases the wake-up rate.
Further, when the electronic device performs a voiceprint wake-up operation in the driving mode, the voiceprint wake-up module wakes up the voice assistant according to the preset values of the driving mode, after which the voice assistant interacts with the user according to the voice instructions it receives.
By way of example, objective test data from before and after the voiceprint wake-up module adjusts the wake-up threshold and voiceprint threshold, following detection that the electronic device has entered a driving scenario, are shown below. Because the test targets a specific driver, the voiceprint threshold can be fixed at a relatively low value. Table 1 gives the test data before the wake-up threshold is lowered, and Table 2 gives the test data after it is lowered, where the wake-up threshold is lowered by 0.05:
Table 1: Test data before lowering the wake-up threshold
Driving scenario                            Voiceprint wake-up rate
Vehicle speed 80 km/h                       83.80%
Vehicle speed 80 km/h, music mode on        77.20%
Table 2: Test data after lowering the wake-up threshold
Driving scenario                            Voiceprint wake-up rate
Vehicle speed 80 km/h                       96.04%
Vehicle speed 80 km/h, music mode on        97.08%
As Tables 1 and 2 show, for a specific user, fixing the voiceprint threshold at a very low value and then lowering the wake-up threshold by 0.05 from its original value raises the voiceprint wake-up rate in driving scenarios by more than ten percentage points. For example, in the driving scenario with a vehicle speed of 80 km/h, the voiceprint wake-up rate rises from 83.80% to 96.04%; in the driving scenario with a vehicle speed of 80 km/h and music mode on, it rises from 77.20% to 97.08%. In other words, after it is detected that the electronic device has entered the driving mode, automatically adjusting the wake-up threshold and voiceprint threshold of the voiceprint wake-up algorithm to the preset values of the driving mode clearly improves the user's voiceprint wake-up rate in the driving mode and the user experience.
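The gains implied by Tables 1 and 2 can be checked with a short calculation. The rates are taken directly from the two tables; the dictionary keys and variable names are chosen here for illustration.

```python
# Voiceprint wake-up rates in percent, copied from Table 1 and Table 2.
rates_before = {"80 km/h": 83.80, "80 km/h with music": 77.20}
rates_after = {"80 km/h": 96.04, "80 km/h with music": 97.08}

# Gain in percentage points for each driving scenario.
gains = {k: round(rates_after[k] - rates_before[k], 2) for k in rates_before}

for scenario, gain in gains.items():
    print(f"{scenario}: +{gain:.2f} percentage points")
# Prints:
# 80 km/h: +12.24 percentage points
# 80 km/h with music: +19.88 percentage points
```

Both gains exceed ten percentage points, and the improvement is larger in the noisier scenario with music playing.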
图4C为本申请实施例提供的驾驶模式下的声纹监听过程示意图,如图4C所示,包括以下步骤:FIG. 4C is a schematic diagram of the voiceprint monitoring process in the driving mode provided by an embodiment of the application. As shown in FIG. 4C, it includes the following steps:
步骤S450,判断电子设备是否退出驾驶情景。In step S450, it is determined whether the electronic device exits the driving scenario.
这里,可以通过电子设备上的用户交互界面模块、语音助手模块或者蓝牙模块,判 断电子设备是否退出驾驶情景。Here, the user interaction interface module, voice assistant module, or Bluetooth module on the electronic device can be used to determine whether the electronic device exits the driving scenario.
例如,响应于界面开关接收操作指令,或者语音指令,如“关闭驾驶模式”或“退出驾驶模式”,或者断开车载蓝牙或者断开连接用户指定的蓝牙设备,判断为电子设备退出驾驶情景。For example, in response to the interface switch receiving operation instructions, or voice instructions, such as "turn off driving mode" or "exit driving mode", or disconnect the car Bluetooth or disconnect the Bluetooth device designated by the user, it is determined that the electronic device has exited the driving scenario.
Step S460: adjust the voiceprint wake-up module to exit the driving mode.
Here, after the voiceprint wake-up module detects that the electronic device has exited the driving scenario, it automatically exits the driving mode.
Step S470: keep the voiceprint wake-up module in the driving mode.
Here, if the voiceprint wake-up module detects that the electronic device has not exited the driving scenario, the module remains in the driving mode.
Step S480: restore the wake-up threshold and the voiceprint threshold to their original preset values.
Here, after exiting the driving mode, the voiceprint wake-up module automatically restores the wake-up threshold and the voiceprint threshold of the voiceprint wake-up algorithm to their original preset values.
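Steps S450 to S480 can be sketched as a small state holder. This is only an illustration under stated assumptions: the class name, the event strings, and the threshold values are hypothetical, since the application gives no absolute thresholds beyond the 0.05 reduction and a "very low" voiceprint threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    wake: float
    voiceprint: float


# Hypothetical values: the source only says the driving-mode wake-up threshold is
# 0.05 lower than the original and the voiceprint threshold is "very low".
ORIGINAL = Thresholds(wake=0.50, voiceprint=0.60)
DRIVING = Thresholds(wake=0.45, voiceprint=0.10)

# Hypothetical event names standing in for the exit triggers listed in the text.
EXIT_EVENTS = frozenset({
    "interface_switch_off",
    "voice:turn off driving mode",
    "voice:exit driving mode",
    "car_bluetooth_disconnected",
    "user_bluetooth_device_disconnected",
})


class VoiceprintWakeModule:
    """Tracks the driving mode and the thresholds currently in force."""

    def __init__(self) -> None:
        # The sketch starts with the module already in the driving mode.
        self.in_driving_mode = True
        self.thresholds = DRIVING

    def handle(self, event: str) -> None:
        # S450: decide whether the device has exited the driving scenario.
        if event in EXIT_EVENTS:
            self.in_driving_mode = False  # S460: exit the driving mode
            self.thresholds = ORIGINAL    # S480: restore the original presets
        # S470: otherwise nothing changes and the module keeps its current mode.
```

Any event outside `EXIT_EVENTS` leaves the module untouched, which corresponds to the "remain in driving mode" branch of step S470.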
The voiceprint wake-up method provided by the embodiments of this application enters the driving mode automatically in response to a user voice instruction, an interface operation, connection to the in-vehicle Bluetooth, or connection to a user-designated Bluetooth device. Once the electronic device is detected to have entered the driving mode, the wake-up threshold and the voiceprint threshold of the voiceprint wake-up algorithm are automatically adjusted to the driving-mode preset values, which raises the user's voiceprint wake-up rate. Once the electronic device is detected to have exited the driving mode, both thresholds are automatically restored to their original preset values. In this way, the user's driving-scenario state can be determined accurately, the driving mode is entered automatically in a driving scenario, and the wake-up and voiceprint thresholds are adjusted automatically after entry, improving both the wake-up rate and the user experience.
An embodiment of this application provides a voiceprint wake-up apparatus. The modules included in the apparatus, and the units included in each module, can be implemented by a processor in an electronic device or, alternatively, by logic circuits. In implementation, the processor may be a central processing unit (CPU), a microprocessor (Micro Processing Unit, MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
FIG. 5A is a schematic structural diagram of a voiceprint wake-up apparatus provided by an embodiment of this application. As shown in FIG. 5A, the apparatus 500a includes a first determining module 501a and a first wake-up module 502a, wherein:
the first determining module 501a is configured to determine, according to a preset condition, a preset wake-up parameter matching the current scene;
the first wake-up module 502a is configured to, in response to a voiceprint wake-up operation, wake up the application to be woken up according to the preset wake-up parameter.
In some embodiments, the first determining module 501a includes a first determining unit and a first adjusting unit, wherein: the first determining unit is configured to determine, according to information about the surrounding environment of the electronic device, that the electronic device has entered a preset scene mode; and the first adjusting unit is configured to adjust the current wake-up parameter of the electronic device to the preset wake-up parameter matching the preset scene mode.
In some embodiments, the first determining module 501a includes a monitoring unit, a second determining unit, and a second adjusting unit, wherein: the monitoring unit is configured to monitor current operation information; the second determining unit is configured to determine the current scene of the electronic device when the current operation information satisfies a preset operation; and the second adjusting unit is configured to adjust the default wake-up parameter of the electronic device to the preset wake-up parameter matching the current scene.
In some embodiments, the second determining unit is further configured to determine that the electronic device has entered a driving scenario when the current operation information satisfies a first preset operation, wherein the first preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to the in-vehicle Bluetooth, and connecting to a specific Bluetooth device; and the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
In some embodiments, the second determining unit is further configured to determine that the electronic device has exited the driving scenario when the current operation information satisfies a second preset operation, wherein the second preset operation includes at least one of the following: turning off the interface switch, inputting a second preset voice instruction, disconnecting the in-vehicle Bluetooth, and disconnecting the specific Bluetooth device; and the second preset voice instruction includes "turn off driving mode" or "exit driving mode".
In some embodiments, the apparatus 500a further includes a second wake-up module configured to wake up the application to be woken up according to the preset wake-up parameter when the electronic device has entered the driving scenario, or according to the default wake-up parameter when the electronic device has exited the driving scenario.
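The relationship between the first and second preset operations and the wake-up parameters used by the second wake-up module can be sketched as a lookup. This is a minimal illustration, not the applicant's implementation: the operation identifiers, the `Scene` enum, and all numeric values are assumptions introduced here.

```python
from enum import Enum


class Scene(Enum):
    DEFAULT = "default"
    DRIVING = "driving"


# Hypothetical identifiers for the first preset operations (enter the driving scenario).
FIRST_PRESET_OPERATIONS = frozenset({
    "interface_switch_on",
    "voice:turn on driving mode",
    "voice:enter driving mode",
    "car_bluetooth_connected",
    "specific_bluetooth_device_connected",
})

# Hypothetical identifiers for the second preset operations (exit the driving scenario).
SECOND_PRESET_OPERATIONS = frozenset({
    "interface_switch_off",
    "voice:turn off driving mode",
    "voice:exit driving mode",
    "car_bluetooth_disconnected",
    "specific_bluetooth_device_disconnected",
})

# Hypothetical parameter values; the source does not state absolute thresholds.
WAKE_PARAMETERS = {
    Scene.DEFAULT: {"wake_threshold": 0.50, "voiceprint_threshold": 0.60},
    Scene.DRIVING: {"wake_threshold": 0.45, "voiceprint_threshold": 0.10},
}


def next_scene(current: Scene, operation: str) -> Scene:
    """Map monitored operation information to the scene the device is in afterwards."""
    if operation in FIRST_PRESET_OPERATIONS:
        return Scene.DRIVING
    if operation in SECOND_PRESET_OPERATIONS:
        return Scene.DEFAULT
    return current  # unrelated operations do not change the scene


def wake_parameters(scene: Scene) -> dict:
    """Return the preset parameters in the driving scenario, the defaults otherwise."""
    return WAKE_PARAMETERS[scene]
```

Keeping the two operation sets separate mirrors the text, where entering and exiting are triggered by paired but distinct operations.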
In some embodiments, the first wake-up module 502a includes a receiving unit, a third determining unit, and a voiceprint recognition unit, wherein: the receiving unit is configured to receive an input voice signal; the third determining unit is configured to determine the wake-up value and the voiceprint feature of the voice signal; and the voiceprint recognition unit is configured to wake up the application to be woken up when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold.
In some embodiments, the third determining unit is further configured to determine the similarity between the voice signal and a preset wake-up word and use the similarity as the wake-up value of the voice signal, and to perform voiceprint recognition on the input voice signal to obtain the voiceprint feature.
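The two-condition wake-up decision described above can be sketched as follows. The text compares a "voiceprint feature" with a threshold without saying how the feature becomes a scalar; this sketch assumes, purely for illustration, that it is a cosine similarity between an utterance embedding and an enrolled voiceprint embedding. The function names and all numbers are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_wake(
    wake_value: float,
    voiceprint_score: float,
    wake_threshold: float,
    voiceprint_threshold: float,
) -> bool:
    """Wake only if both the wake-word score and the voiceprint score exceed their thresholds."""
    return wake_value > wake_threshold and voiceprint_score > voiceprint_threshold


# Assumed scores for one noisy in-car utterance.
wake_value = 0.47
voiceprint_score = 0.30

# Hypothetical default presets reject the utterance; hypothetical driving presets accept it.
print(should_wake(wake_value, voiceprint_score, 0.50, 0.60))
print(should_wake(wake_value, voiceprint_score, 0.45, 0.10))
```

Under these assumed numbers, the same utterance fails the default thresholds but passes the lowered driving-mode thresholds, which is the effect the embodiment relies on.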
It should be noted that the description of the above apparatus embodiment is similar to that of the method embodiments and has similar beneficial effects. For technical details not disclosed in the apparatus embodiments of this application, refer to the description of the method embodiments of this application.
An embodiment of this application provides a voiceprint wake-up apparatus. The modules included in the apparatus, and the units included in each module, can be implemented by a processor in an electronic device or, alternatively, by logic circuits. In implementation, the processor may be a central processing unit, a microprocessor, a digital signal processor, a field-programmable gate array, or the like.
FIG. 5B is a schematic structural diagram of a voiceprint wake-up apparatus provided by an embodiment of this application. As shown in FIG. 5B, the apparatus 500b includes a second determining module 501b, a third determining module 502b, and a third wake-up module 503b, wherein:
the second determining module 501b is configured to determine that the electronic device has entered a driving scenario in response to receiving a preset operation for waking up the application to be woken up;
the third determining module 502b is configured to determine, in the driving scenario, a preset wake-up parameter matching the driving scenario;
the third wake-up module 503b is configured to, in response to a voiceprint wake-up operation, wake up the voice assistant of the electronic device according to the preset wake-up parameter.
The preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to the in-vehicle Bluetooth, and connecting to a specific Bluetooth device, wherein the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
It should be noted that the description of the above apparatus embodiment is similar to that of the method embodiments and has similar beneficial effects. For technical details not disclosed in the apparatus embodiments of this application, refer to the description of the method embodiments of this application.
It should be noted that, in the embodiments of this application, if the above voiceprint wake-up method is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an automatic device test line containing the storage medium to execute all or part of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
Correspondingly, an embodiment of this application provides an electronic device. FIG. 6 is a schematic diagram of the hardware entity of the electronic device provided by an embodiment of this application. As shown in FIG. 6, the hardware entity of the device 600 includes a processor 601, a communication interface 602, and a memory 603, wherein:
The processor 601 generally controls the overall operation of the device 600.
The communication interface 602 enables the device 600 to communicate with other electronic devices or servers over a network.
The memory 603 is configured to store instructions and applications executable by the processor 601, and can also cache data to be processed or already processed by the processor 601 and by the modules in the device 600 (for example, image data, audio data, voice communication data, and video communication data). It can be implemented with flash memory (FLASH) or random access memory (RAM).
Correspondingly, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the voiceprint wake-up method provided in the above embodiments.
It should be noted that the description of the above storage medium and device embodiments is similar to that of the method embodiments and has similar beneficial effects. For technical details not disclosed in the storage medium and device embodiments of this application, refer to the description of the method embodiments of this application.
It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, occurrences of "in one embodiment" or "in an embodiment" throughout the specification do not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. It should also be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic and should not limit the implementation of the embodiments of this application in any way. The sequence numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or apparatus that includes the element.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of this application. In addition, the functional units in the embodiments of this application may all be integrated into one processing unit, each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
Alternatively, if the above integrated unit of this application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an automatic device test line to execute all or part of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The methods disclosed in the several method embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments. The features disclosed in the several method or device embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are merely implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Industrial applicability
In the embodiments of this application, a preset wake-up parameter matching the current scene is first determined according to a preset condition, wherein the preset wake-up parameter includes a preset wake-up threshold and a preset voiceprint threshold. Then, in response to a voiceprint wake-up operation, the application to be woken up is woken up according to the preset wake-up parameter. In this way, the wake-up and voiceprint thresholds can be adjusted automatically according to the current scene, which raises the user's wake-up rate and improves the user experience.

Claims (21)

  1. A voiceprint wake-up method, applied to an electronic device, the method comprising:
    determining, according to a preset condition, a preset wake-up parameter matching the current scene, wherein the preset wake-up parameter characterizes a wake-up threshold and a voiceprint threshold that meet a wake-up requirement;
    in response to a voiceprint wake-up operation, waking up an application to be woken up according to the preset wake-up parameter.
  2. The method according to claim 1, wherein the determining, according to a preset condition, a preset wake-up parameter matching the current scene comprises:
    determining, according to information about the surrounding environment of the electronic device, that the electronic device has entered a preset scene mode;
    adjusting the default wake-up parameter of the electronic device to the preset wake-up parameter matching the preset scene mode.
  3. The method according to claim 1, wherein the determining, according to a preset condition, a preset wake-up parameter matching the current scene comprises:
    monitoring current operation information;
    determining the current scene of the electronic device when the current operation information satisfies a preset operation;
    adjusting the default wake-up parameter of the electronic device to the preset wake-up parameter matching the current scene.
  4. The method according to claim 3, wherein the determining the current scene of the electronic device when the current operation information satisfies a preset operation comprises:
    determining that the electronic device has entered a driving scenario when the current operation information satisfies a first preset operation;
    wherein the first preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to the in-vehicle Bluetooth, and connecting to a specific Bluetooth device, and the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
  5. The method according to claim 4, wherein the method further comprises:
    determining that the electronic device has exited the driving scenario when the current operation information satisfies a second preset operation;
    wherein the second preset operation includes at least one of the following: turning off the interface switch, inputting a second preset voice instruction, disconnecting the in-vehicle Bluetooth, and disconnecting the specific Bluetooth device, and the second preset voice instruction includes "turn off driving mode" or "exit driving mode".
  6. The method according to claims 3 to 5, wherein the method further comprises:
    waking up the application to be woken up according to the preset wake-up parameter when the electronic device has entered the driving scenario; or
    waking up the application to be woken up according to the default wake-up parameter when the electronic device has exited the driving scenario.
  7. The method according to any one of claims 1 to 5, characterized in that the preset wake-up parameter includes a preset wake-up threshold and a preset voiceprint threshold, and
    the waking up, in response to a voiceprint wake-up operation, of the application to be woken up according to the preset wake-up parameter comprises:
    receiving an input voice signal;
    determining the wake-up value and the voiceprint feature of the voice signal;
    waking up the application to be woken up when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold.
  8. The method according to claim 7, characterized in that the determining the wake-up value and the voiceprint feature of the voice signal comprises:
    determining the similarity between the voice signal and a preset wake-up word, and using the similarity as the wake-up value of the voice signal;
    performing voiceprint recognition on the input voice signal to obtain the voiceprint feature.
  9. A voiceprint wake-up method, the method comprising:
    determining, in response to a received preset operation, that an electronic device has entered a driving scenario;
    determining, in the driving scenario, a preset wake-up parameter matching the driving scenario, wherein the preset wake-up parameter characterizes a wake-up threshold and a voiceprint threshold that meet a wake-up requirement;
    in response to a voiceprint wake-up operation, waking up the voice assistant of the electronic device according to the preset wake-up parameter.
  10. The method according to claim 9, wherein the preset operation includes at least one of the following:
    turning on an interface switch, inputting a first preset voice instruction, connecting to the in-vehicle Bluetooth, and connecting to a specific Bluetooth device, wherein the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
  11. A voiceprint wake-up apparatus, the apparatus comprising a first determining module and a first wake-up module, wherein:
    the first determining module is configured to determine, according to a preset condition, a preset wake-up parameter matching the current scene, wherein the preset wake-up parameter characterizes a wake-up threshold and/or a voiceprint threshold that meets a wake-up requirement;
    the first wake-up module is configured to, in response to a voiceprint wake-up operation, wake up an application to be woken up according to the preset wake-up parameter.
  12. The apparatus according to claim 11, wherein the first determining module comprises:
    a first determining unit, configured to determine, according to information about the surrounding environment of the electronic device, that the electronic device has entered a preset scene mode;
    a first adjusting unit, configured to adjust the current wake-up parameter of the electronic device to the preset wake-up parameter matching the preset scene mode.
  13. The apparatus according to claim 11, wherein the first determining module comprises:
    a monitoring unit, configured to monitor current operation information;
    a second determining unit, configured to determine the current scene of the electronic device when the current operation information satisfies a preset operation;
    a second adjusting unit, configured to adjust the default wake-up parameter of the electronic device to the preset wake-up parameter matching the current scene.
  14. The apparatus according to claim 13, wherein the second determining unit is further configured to determine that the electronic device has entered a driving scenario when the current operation information satisfies a first preset operation;
    the first preset operation includes at least one of the following: turning on an interface switch, inputting a first preset voice instruction, connecting to the in-vehicle Bluetooth, and connecting to a specific Bluetooth device, and the first preset voice instruction includes "turn on driving mode" or "enter driving mode".
  15. The apparatus according to claim 14, wherein the second determining unit is further configured to determine that the electronic device has exited the driving scenario when the current operation information satisfies a second preset operation;
    the second preset operation includes at least one of the following: turning off the interface switch, inputting a second preset voice instruction, disconnecting the in-vehicle Bluetooth, and disconnecting the specific Bluetooth device, and the second preset voice instruction includes "turn off driving mode" or "exit driving mode".
  16. The apparatus according to claims 13 to 15, wherein the apparatus further comprises:
    a second wake-up module, configured to wake up the application to be woken up according to the preset wake-up parameter when the electronic device has entered the driving scenario, or according to the default wake-up parameter when the electronic device has exited the driving scenario.
  17. The apparatus according to any one of claims 11 to 15, wherein the first wake-up module comprises:
    a receiving unit, configured to receive an input voice signal;
    a third determining unit, configured to determine the wake-up value and the voiceprint feature of the voice signal;
    a voiceprint recognition unit, configured to wake up the application to be woken up when the wake-up value is greater than the preset wake-up threshold and the voiceprint feature is greater than the preset voiceprint threshold.
  18. The apparatus according to claim 17, wherein the third determining unit is further configured to determine the similarity between the voice signal and a preset wake-up word and use the similarity as the wake-up value of the voice signal, and to perform voiceprint recognition on the input voice signal to obtain the voiceprint feature.
  19. A voiceprint wake-up apparatus, comprising a second determining module, a third determining module, and a third wake-up module, wherein:
    the second determining module is configured to determine, in response to receiving a preset operation for waking up an application to be woken up, that an electronic device has entered a driving scenario;
    the third determining module is configured to determine, in the driving scenario, a preset wake-up parameter matching the driving scenario, wherein the preset wake-up parameter characterizes a wake-up threshold and/or a voiceprint threshold that meets the wake-up requirement;
    the third wake-up module is configured to wake up a voice assistant of the electronic device according to the preset wake-up parameter in response to a voiceprint wake-up operation.
  20. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program executable on the processor, and when the computer program is executed by the processor, the processor performs the following operations:
    acquiring a preset wake-up parameter matching the current scenario, wherein the preset wake-up parameter characterizes a wake-up threshold and/or a voiceprint threshold that meets the wake-up requirement;
    in response to a voiceprint wake-up operation, waking up the application to be woken up according to the preset wake-up parameter.
  21. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8, or implements the steps of the method according to claim 9 or 10.
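
For illustration only, the scenario-dependent parameter selection of claim 16 and the dual-threshold check of claim 17 might be sketched as follows. This is a minimal sketch, not the applicant's implementation: the names `WakeParams`, `select_params`, and `should_wake` are hypothetical, the numeric thresholds are invented, and the assumption that the driving-scenario thresholds are lower than the defaults is not stated in the claims. The wake-up value and voiceprint score are taken as precomputed inputs in the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeParams:
    """Pair of thresholds that together form one set of wake-up parameters."""
    wake_threshold: float        # compared with the wake-word similarity (wake-up value)
    voiceprint_threshold: float  # compared with the voiceprint match score


# Hypothetical values: the claims give no numbers and do not say in which
# direction the driving-scenario thresholds differ from the defaults.
DEFAULT_PARAMS = WakeParams(wake_threshold=0.80, voiceprint_threshold=0.70)
DRIVING_PARAMS = WakeParams(wake_threshold=0.60, voiceprint_threshold=0.50)


def select_params(in_driving_scenario: bool) -> WakeParams:
    """Use the preset parameters in the driving scenario, the defaults otherwise."""
    return DRIVING_PARAMS if in_driving_scenario else DEFAULT_PARAMS


def should_wake(wake_value: float, voiceprint_score: float, params: WakeParams) -> bool:
    """Wake only when both values are strictly greater than their thresholds."""
    return (wake_value > params.wake_threshold
            and voiceprint_score > params.voiceprint_threshold)
```

Under these assumed numbers, an utterance with a wake-up value of 0.7 and a voiceprint score of 0.6 would wake the application in the driving scenario but not under the default parameters, because both comparisons are strict and both must pass.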
PCT/CN2021/074833 2020-03-12 2021-02-02 Voiceprint wakeup method and apparatus, device, and storage medium WO2021179854A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010172688.7 2020-03-12
CN202010172688.7A CN111223490A (en) 2020-03-12 2020-03-12 Voiceprint awakening method and device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021179854A1 (en) 2021-09-16

Family

ID=70832665

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/074833 WO2021179854A1 (en) 2020-03-12 2021-02-02 Voiceprint wakeup method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111223490A (en)
WO (1) WO2021179854A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111223490A (en) * 2020-03-12 2020-06-02 Oppo广东移动通信有限公司 Voiceprint awakening method and device, equipment and storage medium
CN111949323A (en) * 2020-08-31 2020-11-17 深圳市欧瑞博科技股份有限公司 Intelligent device awakening optimization method and device, intelligent device and storage medium
CN112201246B (en) * 2020-11-19 2023-11-28 深圳市欧瑞博科技股份有限公司 Intelligent control method and device based on voice, electronic equipment and storage medium
CN113038048B (en) * 2021-03-02 2022-10-28 海信视像科技股份有限公司 Far-field voice awakening method and display device
CN115083390A (en) * 2021-03-10 2022-09-20 Oppo广东移动通信有限公司 Sound source distance sorting method and related product
CN113470660A (en) * 2021-05-31 2021-10-01 翱捷科技(深圳)有限公司 Voice wake-up threshold adjusting method and system based on router flow

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254551A (en) * 2010-05-20 2011-11-23 盛乐信息技术(上海)有限公司 Voiceprint authentication apparatus
CN103595869A (en) * 2013-11-15 2014-02-19 华为终端有限公司 Terminal voice control method and device and terminal
CN106339083A (en) * 2016-08-19 2017-01-18 惠州Tcl移动通信有限公司 Method and system for automatically switching to driving mode based on intelligent wearable device
CN107622770A (en) * 2017-09-30 2018-01-23 百度在线网络技术(北京)有限公司 voice awakening method and device
CN108564948A (en) * 2018-03-30 2018-09-21 联想(北京)有限公司 A kind of audio recognition method and electronic equipment
CN108924343A (en) * 2018-06-19 2018-11-30 Oppo广东移动通信有限公司 Control method of electronic device, device, storage medium and electronic equipment
US10360916B2 (en) * 2017-02-22 2019-07-23 Plantronics, Inc. Enhanced voiceprint authentication
CN110580897A (en) * 2019-08-23 2019-12-17 Oppo广东移动通信有限公司 audio verification method and device, storage medium and electronic equipment
CN111223490A (en) * 2020-03-12 2020-06-02 Oppo广东移动通信有限公司 Voiceprint awakening method and device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105246023A (en) * 2015-08-28 2016-01-13 努比亚技术有限公司 Driving assistant starting device and method
CN105329187B (en) * 2015-11-05 2018-06-22 深圳市几米软件有限公司 The intelligent vehicle-mounted system and control method of safety operation are realized in Bluetooth key triggering
CN107623896A (en) * 2017-09-30 2018-01-23 珠海市魅族科技有限公司 Connect method, apparatus, mobile terminal and the storage medium of on-vehicle Bluetooth
CN108711430B (en) * 2018-04-28 2020-08-14 广东美的制冷设备有限公司 Speech recognition method, intelligent device and storage medium
CN109346071A (en) * 2018-09-26 2019-02-15 出门问问信息科技有限公司 Wake up processing method, device and electronic equipment
CN110047487B (en) * 2019-06-05 2022-03-18 广州小鹏汽车科技有限公司 Wake-up method and device for vehicle-mounted voice equipment, vehicle and machine-readable medium
CN110473556B (en) * 2019-09-17 2022-06-21 深圳市万普拉斯科技有限公司 Voice recognition method and device and mobile terminal

Also Published As

Publication number Publication date
CN111223490A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
WO2021179854A1 (en) Voiceprint wakeup method and apparatus, device, and storage medium
KR102293063B1 (en) Customizable wake-up voice commands
KR101726945B1 (en) Reducing the need for manual start/end-pointing and trigger phrases
CN109410952B (en) Voice awakening method, device and system
EP2959474B1 (en) Hybrid performance scaling for speech recognition
KR20190022109A (en) Method for activating voice recognition servive and electronic device for the same
CN105575395A (en) Voice wake-up method and apparatus, terminal, and processing method thereof
US20060074658A1 (en) Systems and methods for hands-free voice-activated devices
CN110100447A (en) Information processing method and device, multimedia equipment and storage medium
EP3526789B1 (en) Voice capabilities for portable audio device
CN109643548A (en) System and method for content to be routed to associated output equipment
CN109101517B (en) Information processing method, information processing apparatus, and medium
KR20190109916A (en) A electronic apparatus and a server for processing received data from the apparatus
CN111328417A (en) Audio peripheral
US11626104B2 (en) User speech profile management
CN108932942A (en) A kind of interactive system and method for realization intelligent sound box
EP4059011A1 (en) Voice activation based on user recognition
TW201717192A (en) Electronic apparatus and voice trigger method therefor
US20210225363A1 (en) Information processing device and information processing method
WO2023155607A1 (en) Terminal devices and voice wake-up methods
CN110400568B (en) Awakening method of intelligent voice system, intelligent voice system and vehicle
WO2023124248A1 (en) Voiceprint recognition method and apparatus
CN114999496A (en) Audio transmission method, control equipment and terminal equipment
CN108663942B (en) Voice recognition equipment control method, voice recognition equipment and central control server
CN110083392B (en) Audio awakening pre-recording method, storage medium, terminal and Bluetooth headset thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21768574

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21768574

Country of ref document: EP

Kind code of ref document: A1