WO2020228095A1 - Real-time voice wake-up audio device, operating method, apparatus and storage medium - Google Patents

Real-time voice wake-up audio device, operating method, apparatus and storage medium

Info

Publication number
WO2020228095A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
input signal
voice input
audio device
communication connection
Prior art date
Application number
PCT/CN2019/091973
Other languages
English (en)
French (fr)
Inventor
刘涛
朱彪
王丽
Original Assignee
深圳市豪恩声学股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市豪恩声学股份有限公司
Publication of WO2020228095A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/162 Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B5/00 Near-field transmission systems, e.g. inductive or capacitive transmission systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1041 Mechanical or electronic switches, or control elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80 Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/02 Power saving arrangements
    • H04W52/0209 Power saving arrangements in terminal devices
    • H04W52/0225 Power saving arrangements in terminal devices using monitoring of external events, e.g. the presence of a signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W76/00 Connection management
    • H04W76/10 Connection setup
    • H04W76/14 Direct-mode setup
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2420/00 Details of connection covered by H04R, not provided for in its groups
    • H04R2420/05 Detection of connection of loudspeakers or headphones to amplifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2420/00 Details of connection covered by H04R, not provided for in its groups
    • H04R2420/09 Applications of special connectors, e.g. USB, XLR, in loudspeakers, microphones or headphones
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • the present invention relates to the field of communication technology, in particular to a real-time voice wake-up audio device, an operation method, a device and a storage medium.
  • With the popularity of smart speakers, wearable real-time voice assistant headsets have appeared on the market.
  • the solution adopted by the traditional technology is as follows: the real-time voice assistant headset establishes a classic Bluetooth communication connection with the terminal. After the headset is awakened by voice, it sends the voice data it has collected to the terminal through the Hands-Free Profile (HFP) or a custom serial port emulation (RFCOMM) protocol. The terminal sends the voice data to the cloud server for voice recognition, and the cloud server returns an answer corresponding to the voice data to the terminal. The terminal then sends the answer corresponding to the voice data to the headset for playback.
  • HFP Hands-free Profile
  • RFCOMM custom serial port emulation
  • the standby state of the real-time voice assistant headset has a technical problem of high power consumption.
  • a real-time voice wake-up audio device operating method includes:
  • when the decibel value of the voice input signal is greater than the preset decibel threshold, turning on the second acoustic-electric transducer, collecting the voice input signal through the second acoustic-electric transducer, performing beamforming and noise reduction processing on the voice input signal, and saving the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than the power consumption of the second acoustic-electric transducer;
  • the first Bluetooth communication connection between the audio device and the terminal is established.
  • An operating device for audio equipment comprising:
  • the voice input detection module is used to detect the voice input signal in the current environment through the first acoustic-electric transducer when the audio device is in the standby state;
  • the voice input processing module is used to turn on the second acoustic-electric transducer when the decibel value of the voice input signal is greater than the preset decibel threshold, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than the power consumption of the second acoustic-electric transducer;
  • the processing result detection module is used to detect the processing result
  • the first communication connection module is configured to establish a first Bluetooth communication connection between the audio device and the terminal when it is detected that the processing result contains a wake-up keyword.
  • a real-time voice wake-up audio device comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor.
  • the processor implements the steps of the above-mentioned operating method of the real-time voice wake-up audio device when the computer program is executed.
  • a computer-readable storage medium has a computer program stored thereon, and when the computer program is executed by a processor, the steps of the above-mentioned real-time voice wake-up audio device operation method are realized.
  • the audio device, operating method, apparatus and storage medium for real-time voice wake-up described above use the first acoustic-electric transducer, which has lower power consumption, to detect the voice input signal in the current environment. When the decibel value of the voice input signal exceeds the preset threshold, the second acoustic-electric transducer is turned on to collect and process the voice input signal, and the processing result is checked for the wake-up keyword. If the wake-up keyword is detected, a Bluetooth Low Energy communication connection between the audio device and the terminal is established. This solves the technical problem of high power consumption in the traditional technology, in which the Bluetooth headset and the terminal in the standby state always maintain a classic Bluetooth communication connection in order to achieve all-weather voice wake-up.
  • Fig. 1 is an application environment diagram of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 4 is a schematic flowchart of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 5 is a schematic flowchart of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 6 is a schematic flowchart of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 7a is a schematic diagram of the composition of an audio listening device according to one or more embodiments.
  • Fig. 7b is a timing diagram of an operation method of an audio listening device according to one or more embodiments.
  • Fig. 8 is a structural block diagram of an operating device of an audio listening device according to one or more embodiments.
  • the audio device 110 is provided with a first device-side Bluetooth communication module, a second device-side Bluetooth communication module, a first acoustic-electric transducer and a second acoustic-electric transducer, and the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer.
  • the terminal 120 is provided with a first terminal-side Bluetooth communication module and a second terminal-side Bluetooth communication module.
  • the connection between the first device-side Bluetooth communication module and the first terminal-side Bluetooth communication module is the first Bluetooth communication connection
  • the connection between the second device-side Bluetooth communication module and the second terminal-side Bluetooth communication module is the second Bluetooth communication connection
  • the power consumption of the first Bluetooth communication connection is lower than the power consumption of the second Bluetooth communication connection.
  • a Bluetooth communication connection between the terminal 120 and the audio device 110 is established.
  • the audio device 110 disconnects the first Bluetooth communication connection and the second Bluetooth communication connection with the terminal 120, the second acoustic-electric transducer is in the off state, and the first acoustic-electric transducer is in sound detection mode.
  • the voice input signal in the current environment is detected by the first acoustic-electric transducer. When the decibel value of the detected voice input signal is greater than the preset decibel threshold, the second acoustic-electric transducer is turned on, the voice input signal is collected through the second acoustic-electric transducer, noise reduction processing is performed on the voice input signal, and the processing result is saved.
  • the audio listening device is equipped with a local voice recognition engine, and the local voice recognition engine detects whether the processing result includes a wake-up keyword. If the wake-up keyword is detected, the first Bluetooth communication connection between the audio listening device and the terminal is established.
  • the audio device 110 processes the collected voice input signal, saves it in a buffer, and compresses the voice data in the buffer.
  • the audio device 110 sends the compressed voice data to the terminal 120.
  • the terminal 120 forwards the compressed voice data to the cloud server 130 for voice recognition, and the cloud server 130 returns the voice recognition result to the terminal 120.
  • the audio device 110 is widely used with the listening or playback features of many types of terminals, and may be, but is not limited to, an audio listening device such as a band earphone, a headset, or an earphone.
  • the terminal 120 may be, but is not limited to, a portable audio playback device, a portable multimedia device, a personal computer, a notebook computer, a smart phone, a tablet computer, and a portable wearable device.
  • the cloud server 130 may be implemented by an independent server or a server cluster composed of multiple servers.
  • the terms "first" and "second" used in the present invention may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another.
  • first Bluetooth communication connection may be referred to as the second Bluetooth communication connection
  • second Bluetooth communication connection may be referred to as the first Bluetooth communication connection.
  • Both the first Bluetooth communication connection and the second Bluetooth communication connection are Bluetooth communication connections, but they are Bluetooth communication connections in different ways.
  • the following description takes as an example the first Bluetooth communication connection being a Bluetooth Low Energy communication connection, the second Bluetooth communication connection being a classic Bluetooth communication connection, and the audio device awakened by real-time voice being an audio listening device.
  • the present application provides an operation method of an audio listening device.
  • the method is applied to the audio listening device in FIG. 1 as an example for description.
  • the operation method includes the following steps:
  • the audio listening device can receive user requests in the form of natural language commands, requests, inquiries, etc., and the user request can instruct the audio listening device to make informational answers or instruct to perform corresponding tasks.
  • the standby state refers to the state where the audio listening device is turned on but does not perform any substantial work (such as playing audio).
  • the first acoustic-electric transducer refers to a device used to receive a sound input signal and convert it into an electric output signal, so that certain required characteristics of the sound input signal are reflected in the output signal.
  • the voice input signal refers to the voice signal sent by the user to request the audio listening device to answer or perform a task.
  • the audio listening device is provided with a first acoustic-electric transducer. When the audio listening device is turned on but not working, the first acoustic-electric transducer detects the voice input signal in the environment to monitor whether the audio listening device needs to be woken up.
  • the power consumption of the first acoustic-electric transducer is lower than the power consumption of the second acoustic-electric transducer.
  • the second acoustic-electric transducer is turned off by default.
  • when the decibel value of the voice input signal detected by the first acoustic-electric transducer exceeds the preset decibel threshold, the audio listening device is awakened and the second acoustic-electric transducer is turned on.
  • the voice input in the current environment is recorded through the second acoustic-electric transducer, beamforming and noise reduction are performed on the voice input signal, and the processing result is stored in the buffer of the audio listening device.
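The threshold check described above can be illustrated with a minimal sketch. It assumes the first acoustic-electric transducer delivers 16-bit PCM frames and that the level is measured in dB relative to full scale (dBFS); the patent does not state the decibel reference, and the function names and the -40 dB threshold are hypothetical.

```python
import math


def frame_level_db(samples, full_scale=32768.0):
    """Return the RMS level of one PCM frame in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(math.sqrt(mean_square) / full_scale)


def should_enable_second_transducer(samples, threshold_db=-40.0):
    """True when the frame is louder than the preset threshold (assumed value)."""
    return frame_level_db(samples) > threshold_db
```

Only when this check passes would the higher-power second acoustic-electric transducer be switched on; otherwise the device stays in standby.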
  • the audio listening device is equipped with a local speech recognition engine
  • the wake-up keyword refers to a preset specific keyword used to wake the audio listening device to make it work, such as "Hello" or the name of the audio listening device.
  • Specifically, the local speech recognition engine is used to detect whether the processing result in the cache contains the wake-up keyword. If it is detected that the processing result contains the wake-up keyword, a Bluetooth low energy communication connection between the audio listening device and the terminal is established.
  • Bluetooth Low Energy (BLE for short)
  • BLE Bluetooth Low Energy
  • the voice input signal in the current environment is detected by the first acoustic-electric transducer with lower power consumption. When the decibel value of the voice input signal exceeds the preset threshold, the second acoustic-electric transducer is turned on to collect and process the voice input signal, and the processing result is checked for the wake-up keyword. If the wake-up keyword is detected, a Bluetooth Low Energy communication connection between the audio listening device and the terminal is established. This solves the technical problem of high power consumption in the traditional technology, in which, to achieve all-weather voice wake-up, the Bluetooth headset and the terminal in the standby state always maintain a classic Bluetooth communication connection and the microphone is always on.
  • the first acoustic-electric transducer is a piezoelectric wake-up microphone.
  • when the audio listening device is in the standby state, detecting the voice input signal in the current environment through the first acoustic-electric transducer includes: when the audio listening device is in the standby state, detecting the voice input signal in the current environment through the piezoelectric wake-up microphone within a preset frequency band.
  • the piezoelectric wake-up microphone refers to a microphone with a piezoelectric sensing element, and its current consumption is on the order of microamperes (μA), which is much smaller than that of an ordinary digital microphone.
  • μA microamperes
  • the working frequency band of the piezoelectric wake-up microphone can be preset according to the human voice frequency.
  • the audio listening device is equipped with a piezoelectric wake-up microphone. When the audio listening device is turned on but not working, the piezoelectric wake-up microphone detects the voice input signal in the current environment within the preset frequency band to monitor whether the audio listening device needs to be woken up.
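The idea of limiting detection to a preset voice band can be illustrated in software, although a piezoelectric wake-up microphone would presumably do this in hardware. The sketch below measures what fraction of a frame's energy falls inside a band using a naive discrete Fourier transform; the 300 to 3400 Hz band in the usage note is an assumed example, not a value from the patent, and `band_energy_ratio` is a hypothetical name.

```python
import cmath
import math


def band_energy_ratio(samples, sample_rate, low_hz, high_hz):
    """Fraction of (non-DC) spectral energy that lies within [low_hz, high_hz]."""
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):  # positive frequencies, DC excluded
        freq = k * sample_rate / n
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        total += power
        if low_hz <= freq <= high_hz:
            in_band += power
    return in_band / total if total > 0 else 0.0
```

For instance, a frame could be treated as a candidate voice input only when `band_energy_ratio(frame, 8000, 300, 3400)` is high; the naive transform is quadratic in frame length and is meant for illustration, not for a low-power device.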
  • the power consumption of the audio listening device in the standby state can be greatly reduced.
  • the second acoustic-electric transducer includes a first digital microphone and a second digital microphone. As shown in Figure 3, when the decibel value of the voice input signal is greater than the preset decibel threshold, turning on the second acoustic-electric transducer, collecting the voice input signal through the second acoustic-electric transducer, performing noise reduction processing on the voice input signal and saving the processing result includes the following steps:
  • S320 Collect the voice input signal through the first digital microphone and the second digital microphone, and perform beamforming and noise reduction processing on the voice input signal.
  • the first digital microphone and the second digital microphone are turned off by default.
  • the first digital microphone and the second digital microphone are turned on. Recording starts through the first digital microphone and the second digital microphone, and the voice input signals in the current environment are collected. Beamforming and noise reduction processing are then performed on the voice data collected through the first digital microphone and the second digital microphone. Specifically, the sound wave phase difference of the voice input signal is obtained from the first digital microphone and the second digital microphone, the weighting coefficient of each frequency band is calculated, the voice input signals collected by the first digital microphone and the second digital microphone are weighted and superimposed, and the single-channel beamformed voice data is output. To ensure the integrity of the voice data, the processed voice data is stored in a ring buffer of a preset size.
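The weighted superposition of the two microphone signals can be illustrated with a time-domain delay-and-sum sketch. This is a simplified stand-in, not the per-frequency-band weighting the text describes: it estimates the inter-microphone delay by cross-correlation, aligns the second channel, and mixes the two with fixed weights. All names, the lag range and the equal weights are assumptions.

```python
def estimate_delay(mic1, mic2, max_lag):
    """Lag (in samples) of mic2 relative to mic1 that maximizes cross-correlation."""
    n = len(mic1)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += mic1[i] * mic2[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(mic1, mic2, max_lag=8, w1=0.5, w2=0.5):
    """Align mic2 to mic1 and return a single-channel weighted sum."""
    lag = estimate_delay(mic1, mic2, max_lag)
    n = len(mic1)
    out = []
    for i in range(n):
        j = i + lag
        aligned = mic2[j] if 0 <= j < n else 0.0  # zero-pad past the frame edge
        out.append(w1 * mic1[i] + w2 * aligned)
    return out
```

Sound arriving from the aligned direction adds coherently, while sound from other directions is attenuated, which is the basic effect the beamforming step relies on.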
  • the operation method of the audio listening device further includes: compressing the voice data in the ring buffer; and sending the compressed voice data to the terminal through the Bluetooth low energy communication connection.
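A ring buffer of preset size, as referred to above, can be sketched as follows. The class name and the frame-based capacity are assumptions; the patent does not say how the buffer is sized or indexed.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer that keeps only the most recent frames."""

    def __init__(self, capacity):
        # deque with maxlen silently discards the oldest item when full
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        self._frames.append(frame)

    def snapshot(self):
        """Return the buffered frames, oldest first, for compression."""
        return list(self._frames)

    def __len__(self):
        return len(self._frames)
```

Because the oldest frames are overwritten, memory use stays bounded while the most recent speech, including the audio around the wake-up keyword, remains available for compression and transmission.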
  • the compressed speech data is used for speech recognition to obtain speech recognition results.
  • the terminal is used to send the compressed voice data to the cloud server for voice recognition and receive the voice recognition result.
  • the audio listening device uses the OPUS (voice coding format) or MSBC (Modified Sub-Band Code) protocol with a preset compression ratio.
  • the voice data in the ring buffer is compressed by a compression algorithm such as serialization to save bandwidth and to solve the technical problem of large time delay in the traditional technology.
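The bandwidth saving from a preset compression ratio is simple arithmetic, sketched below. The 16 kHz, 16-bit, mono capture format and the 4:1 ratio in the usage note are illustrative assumptions; the patent does not give these figures.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compressed_bitrate_bps(raw_bps, compression_ratio):
    """Bit rate after applying a preset compression ratio (raw : compressed)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_bps / compression_ratio
```

Under these assumptions, `pcm_bitrate_bps(16000, 16)` gives 256000 bit/s of raw audio, and a 4:1 ratio reduces this to 64000 bit/s to be carried over the Bluetooth low energy connection.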
  • the audio listening device sends the compressed voice data to the terminal through the Bluetooth low energy communication connection.
  • the terminal receives the compressed voice data.
  • the compressed voice data can include voice information such as the user requesting the audio listening device to perform a certain task or asking the audio listening device for information.
  • there is a network connection between the terminal and the cloud server. The terminal sends the received voice data to the cloud server, the cloud server performs voice recognition on the voice data and returns the voice recognition result to the terminal, and the terminal receives the voice recognition result.
  • the terminal may also be provided with a voice recognition engine, and the received voice data is voice recognized through the voice recognition engine of the terminal.
  • the operation method of the audio listening device further includes: judging whether the audio listening device is in use or not in use through an optical proximity sensor, a capacitance sensor, a pressure sensor, or a Hall sensor.
  • the optical proximity sensor is provided with a photodiode inside for detecting reflected light signals from the outside, such as infrared signals.
  • the Hall sensor is used to determine whether the two audio listeners (such as earplugs) of the audio listening device are in a magnetically attached state.
  • the pressure sensor is used to determine whether the two audio listeners (such as earplugs) of the audio listening device are under pressure.
  • the capacitive sensor is used to determine whether the two audio listeners (such as earplugs) of the audio listening device are in contact with the human ear canal. If the optical proximity sensor detects the reflected light signal or the two earplugs are separated, it can be determined that the audio listening device is in use. If the optical proximity sensor does not detect the reflected light signal or the two earplugs are in a magnetically attached state, it can be determined that the audio listening device is not in use.
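One way to read the sensor logic above is that the device counts as in use when any available sensor indicates wearing. The sketch below encodes that interpretation; the field names are hypothetical, and a value of `None` stands for a sensor that is absent or has no reading.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WearSensors:
    """Latest sensor readings; None means the sensor is absent or unread."""
    reflected_light: Optional[bool] = None   # optical proximity sensor
    earbuds_attached: Optional[bool] = None  # Hall sensor: magnetically attached
    under_pressure: Optional[bool] = None    # pressure sensor
    ear_contact: Optional[bool] = None       # capacitive sensor


def is_in_use(s: WearSensors) -> bool:
    """True if at least one available sensor indicates the device is worn."""
    votes = [
        s.reflected_light,
        # separated earbuds (not attached) count as evidence of use
        None if s.earbuds_attached is None else not s.earbuds_attached,
        s.under_pressure,
        s.ear_contact,
    ]
    return any(v is True for v in votes)
```

A stricter policy, such as requiring agreement between two sensors, would fit the text equally well; the any-sensor rule is chosen here only for brevity.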
  • the operation method of the audio listening device further includes the following steps:
  • S430 Receive the voice recognition result sent by the terminal through the classic Bluetooth communication connection.
  • the electro-acoustic transducer refers to a device used to receive electrical signals and convert them into sound signals. Specifically, if the reflected light signal is detected by the optical proximity sensor, the earplug of the audio listening device is located in the ear canal of the user, that is, the audio listening device is in a wearing state. Alternatively, when a band headset is not worn, its two earplugs are not placed in the user's ear canal and are in a magnetically attached state; if the Hall sensor detects that the two earplugs of the audio listening device are separated, the audio listening device is in a wearing state.
  • the audio listening device in use needs to establish a Bluetooth communication connection with the terminal.
  • a classic Bluetooth communication connection between the audio listening device in use and the terminal is established.
  • the terminal sends the received voice recognition result to the audio listening device.
  • the audio listening device receives the voice recognition result, and plays the voice recognition result through the electroacoustic transducer of the audio listening device.
  • a classic Bluetooth communication connection between the audio listening device and the terminal is established.
  • establishing a classic Bluetooth communication connection can not only reduce the power consumption of the audio listening device, but also transmit audio data through the classic Bluetooth communication connection to improve the sound quality and avoid the defects of playing audio data.
  • the operation method of the audio listening device further includes: when detecting that the audio listening device is in a non-wearing state through an optical sensor, a capacitance sensor, a pressure sensor, or a Hall sensor, determining that the audio listening device is in a non-use state;
  • the electro-acoustic transducer plays the result of speech recognition.
  • the earplugs of the audio listening device are not located in the user's ear canal, that is, in a non-wearing state; or the Hall sensor detects that the two earplugs of the audio listening device are in a magnetically attached state, and it is determined that the audio listening device is in a non-wearing state; or the pressure sensor detects that the two earplugs of the audio listening device are not under pressure, and it is determined that the audio listening device is in a non-wearing state.
  • the capacitive sensor detects that the two earplugs of the audio listening device are not in contact with the human ear canal, and it is determined that the audio listening device is in a non-wearing state. It can be understood that the state of the audio listening device can be detected by any one or a combination of at least two of the optical sensor, the capacitive sensor, the pressure sensor, and the Hall sensor.
  • the voice recognition result cannot be played through the audio listening device, and there is no need to establish a classic Bluetooth communication connection between the audio listening device and the terminal.
  • the voice recognition result can be played directly through the electroacoustic transducer of the terminal, which is convenient for users.
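The routing decision described above, whether to set up classic Bluetooth and where to play the result, can be summarized in a small sketch. The function name and the returned keys are hypothetical.

```python
def route_playback(device_in_use: bool) -> dict:
    """Decide where the voice recognition result is played."""
    if device_in_use:
        # Worn device: set up classic Bluetooth and play on the device itself.
        return {"establish_classic_bluetooth": True,
                "playback_target": "audio_device"}
    # Device not worn: skip classic Bluetooth and play on the terminal.
    return {"establish_classic_bluetooth": False,
            "playback_target": "terminal"}
```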
  • the audio listening device is provided with a voice wake-up button. As shown in Figure 5, before establishing the Bluetooth low energy communication connection between the audio listening device and the terminal, the method further includes:
  • the voice wake-up button refers to a button used to wake up the audio listening device, which can be a touch button or a mechanical button. Specifically, when the user triggers the voice wake-up button, it indicates that a Bluetooth low energy communication connection between the audio listening device and the terminal needs to be established. It is therefore detected whether a trigger operation occurs on the voice wake-up button, and if a trigger operation is detected, a Bluetooth low energy communication connection between the audio listening device and the terminal is established. It can be understood that after the voice wake-up button is triggered, the audio listening device can also record the voice input signal in the current environment through the second acoustic-electric transducer.
  • an operating method of an audio listening device is provided.
  • a piezoelectric wake-up microphone is used to monitor the voice input signal in the current environment, and the second acoustic-electric transducer uses a first digital microphone and a second digital microphone.
  • the method includes the following steps:
  • S606 Collect the voice input signal through the first digital microphone and the second digital microphone, and perform beamforming and noise reduction processing on the voice input signal.
  • S610. Detect the processing result, and determine whether the processing result includes a wake-up keyword.
  • S614. Determine whether the audio listening device is in use or not in use through an optical proximity sensor, a capacitive sensor, a pressure sensor, or a Hall sensor.
  • S616. When detecting that the audio listening device is in use, establish a classic Bluetooth communication connection between the audio listening device and the terminal.
  • S618. Receive the voice recognition result sent by the terminal through the classic Bluetooth communication connection.
  • S620. Play the voice recognition result through the electroacoustic transducer of the audio listening device.
  • the audio listening device is provided with a Bluetooth low energy communication module 710 and a classic Bluetooth communication module 720.
  • the audio listening device includes a piezoelectric wake-up microphone 711 connected to the Bluetooth low energy communication module 710, a first digital microphone 712, a second digital microphone 713, an optical proximity sensor 714, a Hall sensor 715, and a voice wake-up button 716.
  • the audio listening device also includes an LED indicator 721 connected to the classic Bluetooth communication module 720, a headset speaker 722, a multi-function button 723, and a volume button 724.
  • the audio listening device is in a standby state, the Bluetooth low energy communication module and the classic Bluetooth communication module are both in low power consumption mode, and both the Bluetooth low energy communication connection and the classic Bluetooth communication connection between the audio listening device and the terminal are disconnected.
  • the first digital microphone and the second digital microphone are turned off.
  • the piezoelectric wake-up microphone is in sound monitoring mode.
  • this embodiment provides a method for operating an audio listening device. The method includes the following steps 701 to 715. The specific process is as follows:
  • Step 701 Detect a voice input signal in the current environment through the piezoelectric wake-up microphone.
  • Step 702 When the decibel value of the voice input signal is greater than the preset decibel threshold, the piezoelectric wake-up microphone sends an interrupt to the Bluetooth low energy communication module, which activates the first digital microphone and the second digital microphone.
  • Step 703 Collect voice input signals through the first digital microphone and the second digital microphone.
  • Step 704 Perform beamforming and noise reduction processing on the voice input signal, and save the processing result to a preset ring buffer.
  • Step 705 Perform wake-up keyword detection on the buffered data in the ring buffer through the local speech recognition engine of the audio listening device.
  • Step 706 When the wake-up keyword is detected, establish a Bluetooth low energy communication connection between the audio listening device and the terminal.
  • Step 707 Compress the buffered data in the ring buffer.
  • Step 708 Send the compressed voice data to the terminal through the Bluetooth low energy communication connection.
  • Step 709 The terminal receives the compressed voice data and sends it to the cloud server.
  • Step 710 The cloud server performs voice recognition on the received voice data.
  • Step 711 The cloud server sends the voice recognition result to the terminal.
  • Step 712 The terminal receives the voice recognition result and sends it to the audio listening device.
  • Step 713 While the voice recognition is being performed, if the audio listening device is in use, establish a classic Bluetooth communication connection between the audio listening device and the terminal.
  • the Bluetooth low energy communication module wakes up the classic Bluetooth communication module to establish a classic Bluetooth communication connection between the audio listening device and the terminal.
  • Step 714 Receive the voice recognition result sent by the terminal through the classic Bluetooth communication connection.
  • Step 715 Play the voice recognition result through the speaker of the audio listening device.
  • a device 800 for operating an audio device with real-time voice wake-up includes:
  • the voice input detection module 810 is configured to detect the voice input signal in the current environment through the first acoustic-electric transducer when the audio device is in a standby state.
  • the voice input processing module 820 is used to turn on the second acoustic-electric transducer when the decibel value of the voice input signal is greater than the preset decibel threshold, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer.
  • the processing result detection module 830 is configured to detect the processing result.
  • the first communication connection module 840 is configured to establish a first Bluetooth communication connection between the audio device and the terminal when it is detected that the processing result contains the wake-up keyword.
  • the first acoustic-electric transducer is a piezoelectric wake-up microphone; the voice input detection module 810 is also used to detect the voice input signal in the current environment through the piezoelectric wake-up microphone, within a preset frequency band, when the audio device is in a standby state.
  • the second acoustic-electric transducer includes a first digital microphone and a second digital microphone.
  • the voice input processing module 820 is also used to turn on the first digital microphone and the second digital microphone when the decibel value of the voice input signal is greater than the preset decibel threshold; obtain the acoustic phase difference of the voice input signal collected through the first digital microphone and the second digital microphone and calculate a weighting coefficient for each frequency band; weight and superpose the voice input signals collected by the two microphones to output single-channel beamformed voice data; and perform noise reduction on the voice data and save it to the preset ring buffer.
  • the device further includes a voice data compression module and a voice data transmission module, wherein the voice data compression module is used to compress the voice data in the ring buffer; the voice data transmission module is used to send the compressed voice data to the terminal through the first Bluetooth communication connection; the compressed voice data is used for voice recognition to obtain the voice recognition result.
  • the device further includes a use state judgment module for judging whether the audio device is in use or not in use through an optical proximity sensor or a Hall sensor.
  • the use state determination module is further configured to determine that the audio device is in use if it is detected by the optical proximity sensor or the Hall sensor that the audio device is in the wearing state.
  • the device also includes a second communication connection module, a voice recognition result receiving module and a playing module, wherein:
  • the second communication connection module is used to establish a second Bluetooth communication connection between the audio device and the terminal, wherein the power consumption of the first Bluetooth communication connection is lower than that of the second Bluetooth communication connection.
  • the voice recognition result receiving module is used to receive the voice recognition result sent by the terminal through the second Bluetooth communication connection.
  • the playback module is used to play the voice recognition result through the electroacoustic transducer of the audio device.
  • the audio device is provided with a voice wake-up button; the device further includes a trigger operation detection module, which is used to detect whether a trigger operation occurs on the voice wake-up button; the first communication connection module is also used to establish the first Bluetooth communication connection between the audio device and the terminal if a trigger operation occurs.
  • the various modules in the above-mentioned audio device operating apparatus can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
  • a real-time voice wake-up audio device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above-mentioned operating method of the real-time voice wake-up audio device.
  • a computer-readable storage medium on which a computer program is stored, and the computer program is executed by a processor to realize the method steps in the above-mentioned embodiments.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephone Function (AREA)

Abstract

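An operating method for an audio device with real-time voice wake-up. The method detects a voice input signal in the current environment through a first acoustic-electric transducer with lower power consumption; when the decibel value of the voice input signal exceeds a preset threshold, it turns on a second acoustic-electric transducer to collect and process the voice input signal, and detects whether the processing result includes a wake-up keyword; if the wake-up keyword is detected, a first Bluetooth communication connection is established between the audio device and a terminal.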

Description

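Audio device with real-time voice wake-up, operating method, apparatus, and storage medium
Technical Field
The present invention relates to the field of communication technology, and in particular to an audio device with real-time voice wake-up, an operating method, an apparatus, and a storage medium.
Background
With the popularity of smart speakers, real-time voice assistant earphones that can be worn on the body have appeared on the market. The conventional solution is as follows: the real-time voice assistant earphone establishes a classic Bluetooth communication connection with a terminal; after the earphone is woken up by voice, it sends the voice data it has collected to the terminal through the Hands-Free Profile (HFP) or a custom serial-port emulation (RFCOMM) protocol; the terminal sends the voice data to a cloud server for speech recognition, and the cloud server returns an answer corresponding to the voice data to the terminal. The terminal then sends the answer to the earphone for playback.
In the conventional technology, the real-time voice assistant earphone suffers from high power consumption in the standby state.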
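Summary
In view of this, it is necessary to provide an audio device with real-time voice wake-up, an operating method, an apparatus, and a storage medium that address the high power consumption of real-time voice assistant earphones in the conventional technology.
An operating method for an audio device with real-time voice wake-up, the operating method comprising:
when the audio device is in a standby state, detecting a voice input signal in the current environment through a first acoustic-electric transducer;
when the decibel value of the voice input signal is greater than a preset decibel threshold, turning on a second acoustic-electric transducer, collecting the voice input signal through the second acoustic-electric transducer, performing beamforming and noise reduction processing on the voice input signal, and saving the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer;
detecting the processing result; and
when it is detected that the processing result contains a wake-up keyword, establishing a first Bluetooth communication connection between the audio device and a terminal.
An operating apparatus for an audio device, the operating apparatus comprising:
a voice input detection module, configured to detect a voice input signal in the current environment through a first acoustic-electric transducer when the audio device is in a standby state;
a voice input processing module, configured to, when the decibel value of the voice input signal is greater than a preset decibel threshold, turn on a second acoustic-electric transducer, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer;
a processing result detection module, configured to detect the processing result; and
a first communication connection module, configured to establish a first Bluetooth communication connection between the audio device and a terminal when it is detected that the processing result contains a wake-up keyword.
An audio device with real-time voice wake-up, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above operating method of the audio device with real-time voice wake-up.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above operating method of the audio device with real-time voice wake-up.
With the above audio device with real-time voice wake-up, operating method, apparatus, and storage medium, the voice input signal in the current environment is detected through the first acoustic-electric transducer with lower power consumption; when the decibel value of the voice input signal exceeds the preset threshold, the second acoustic-electric transducer is turned on to collect and process the voice input signal, and the processing result is checked for a wake-up keyword; if the wake-up keyword is detected, a Bluetooth Low Energy communication connection is established between the audio device and the terminal. This addresses the high power consumption in the conventional technology, where a Bluetooth earphone in standby keeps a classic Bluetooth communication connection with the terminal at all times in order to support round-the-clock voice wake-up.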
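Brief Description of the Drawings
To better describe and illustrate the embodiments and/or examples of the inventions disclosed herein, reference may be made to one or more drawings. The additional details or examples used to describe the drawings should not be regarded as limiting the scope of any of the disclosed inventions, the presently described embodiments and/or examples, or the best mode of these inventions as presently understood.
FIG. 1 is an application environment diagram of the operating method of the audio listening device according to one or more embodiments;
FIG. 2 is a schematic flowchart of the operating method of the audio listening device according to one or more embodiments;
FIG. 3 is a schematic flowchart of the operating method of the audio listening device according to one or more embodiments;
FIG. 4 is a schematic flowchart of the operating method of the audio listening device according to one or more embodiments;
FIG. 5 is a schematic flowchart of the operating method of the audio listening device according to one or more embodiments;
FIG. 6 is a schematic flowchart of the operating method of the audio listening device according to one or more embodiments;
FIG. 7a is a schematic diagram of the composition of the audio listening device according to one or more embodiments;
FIG. 7b is a sequence diagram of the operating method of the audio listening device according to one or more embodiments;
FIG. 8 is a structural block diagram of the operating apparatus of the audio listening device according to one or more embodiments.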
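Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended to explain the application and not to limit it.
The present application provides an operating method for an audio device with real-time voice wake-up, which can be applied in the application environment shown in FIG. 1. The audio device 110 is provided with a first device-side Bluetooth communication module, a second device-side Bluetooth communication module, a first acoustic-electric transducer, and a second acoustic-electric transducer, and the power consumption of the first acoustic-electric transducer is lower than that of the second. The terminal 120 is provided with a first terminal-side Bluetooth communication module and a second terminal-side Bluetooth communication module. The connection between the first device-side and first terminal-side Bluetooth communication modules is the first Bluetooth communication connection, and the connection between the second device-side and second terminal-side Bluetooth communication modules is the second Bluetooth communication connection; the power consumption of the first Bluetooth communication connection is lower than that of the second. The Bluetooth communication connection between the terminal 120 and the audio device 110 is established through either the first or the second Bluetooth communication connection. When the audio device 110 is in the standby state, both the first and the second Bluetooth communication connections to the terminal 120 are disconnected, the second acoustic-electric transducer is turned off, and the first acoustic-electric transducer is in sound detection mode. The first acoustic-electric transducer detects the voice input signal in the current environment; when the decibel value of the detected voice input signal is greater than the preset decibel threshold, the second acoustic-electric transducer is turned on and collects the voice input signal, noise reduction is applied to the voice input signal, and the processing result is saved. The audio listening device is provided with a local speech recognition engine, which checks whether the processing result includes a wake-up keyword; if the wake-up keyword is detected, the first Bluetooth communication connection between the audio listening device and the terminal is established.
Further, the audio device 110 processes the collected voice input signal, saves it in a buffer, and then compresses the voice data in the buffer. The audio device 110 sends the compressed voice data to the terminal 120 through the Bluetooth Low Energy communication connection; through the network connection between the terminal 120 and the cloud server 130, the terminal 120 forwards the compressed voice data to the cloud server 130 for speech recognition, and the cloud server 130 returns the speech recognition result to the terminal 120.
It can be understood that the audio device 110 is widely used with the listening or playback features of many types of terminals, and may be, but is not limited to, an audio listening device such as neckband earphones, over-the-head headphones, a headset, or in-ear earphones. The terminal 120 may be, but is not limited to, a portable audio player, a portable multimedia device, a personal computer, a laptop, a smartphone, a tablet, or a portable wearable device. The cloud server 130 may be implemented as a standalone server or as a server cluster composed of multiple servers.
It should be noted that the terms "first", "second", and so on used in the present invention may describe various elements herein, but these elements are not limited by these terms. The terms are only used to distinguish one element from another. For example, without departing from the scope of the present invention, the first Bluetooth communication connection could be called the second Bluetooth communication connection, and similarly the second could be called the first. Both are Bluetooth communication connections, but they are Bluetooth communication connections of different kinds.
The embodiments in this application are described using the following example: the first Bluetooth communication connection is a Bluetooth Low Energy communication connection, the second Bluetooth communication connection is a classic Bluetooth communication connection, and the audio device with real-time voice wake-up is an audio listening device.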
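In one embodiment, the present application provides an operating method for an audio listening device. Taking the method as applied to the audio listening device in FIG. 1 as an example, as shown in FIG. 2, the operating method includes the following steps:
S210. When the audio listening device is in a standby state, detect the voice input signal in the current environment through the first acoustic-electric transducer.
The audio listening device can receive user requests in the form of natural-language commands, requests, queries, and the like; a user request may ask the device to give an informational answer or to perform a corresponding task. The standby state is the state in which the device is powered on but doing no substantive work (such as playing audio). The first acoustic-electric transducer is a component that receives a sound input signal and converts it into an electrical output signal, so that certain desired characteristics of the sound input are reflected in the output. The voice input signal is the speech a user utters to request an answer or a task from the device. Specifically, the audio listening device is provided with the first acoustic-electric transducer; when the device is powered on but idle, the first acoustic-electric transducer detects the voice input signal in the environment to monitor whether the device needs to be woken up.
S220. When the decibel value of the voice input signal is greater than the preset decibel threshold, turn on the second acoustic-electric transducer, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result.
The power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer. Specifically, to reduce the standby power consumption of the audio listening device, the second acoustic-electric transducer is off by default. When the decibel value of the voice input signal detected by the first acoustic-electric transducer exceeds the preset decibel threshold, the audio listening device is woken up and the second acoustic-electric transducer is turned on. The second acoustic-electric transducer records the voice input in the current environment; beamforming and noise reduction are applied to the voice input signal, and the processing result is saved in the buffer of the audio listening device.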
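The level gate in S220 can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not say how the decibel value is measured, so the sketch assumes floating-point samples in [-1, 1] and an RMS level in dB relative to full scale (dBFS). The function names and the -40 dB default are hypothetical.

```python
import math


def level_dbfs(samples):
    """RMS level of float samples in [-1, 1], in dB relative to full scale.

    Assumption: the source only speaks of a "decibel value"; dBFS is used
    here because it needs no microphone calibration.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")


def should_open_second_transducer(samples, threshold_db=-40.0):
    """True when the frame is loud enough to power up the second transducer.

    threshold_db is a hypothetical preset; the source gives no value.
    """
    return level_dbfs(samples) > threshold_db
```

A frame of silence keeps the second transducer off, while a frame at half of full scale (about -6 dBFS) opens it.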
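S230. Detect the processing result.
S240. When it is detected that the processing result contains a wake-up keyword, establish a Bluetooth Low Energy communication connection between the audio listening device and the terminal.
The audio listening device is provided with a local speech recognition engine. The wake-up keyword is a preset, specific keyword used to wake the audio listening device and put it to work, for example a short fixed phrase such as "hello" or the name of the audio listening device. Specifically, the local speech recognition engine checks whether the processing result in the buffer contains the wake-up keyword. If it does, the Bluetooth Low Energy communication connection between the audio listening device and the terminal is established. Bluetooth Low Energy (BLE) is a personal area network technology designed by the Bluetooth Special Interest Group. Compared with classic Bluetooth, BLE is intended to significantly reduce power consumption and cost while maintaining a comparable communication range. Therefore, to reduce power consumption, the first Bluetooth communication connection between the audio listening device and the terminal is a Bluetooth Low Energy communication connection.
In this embodiment, the voice input signal in the current environment is detected through the first acoustic-electric transducer with lower power consumption; when the decibel value of the voice input signal exceeds the preset threshold, the second acoustic-electric transducer is turned on to collect and process the voice input signal, and the processing result is checked for a wake-up keyword; if the wake-up keyword is detected, the Bluetooth Low Energy communication connection between the audio listening device and the terminal is established. This addresses the high power consumption in the conventional technology, where a Bluetooth earphone in standby keeps a classic Bluetooth communication connection with the terminal at all times, with its microphone always on, in order to support round-the-clock voice wake-up.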
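In one embodiment, the first acoustic-electric transducer is a piezoelectric wake-up microphone. Detecting the voice input signal in the current environment through the first acoustic-electric transducer when the audio listening device is in a standby state includes: when the audio listening device is in a standby state, detecting the voice input signal in the current environment through the piezoelectric wake-up microphone within a preset frequency band.
A piezoelectric wake-up microphone is a microphone with a piezoelectric sensing element; its current consumption is on the order of microamperes (μA), far lower than that of an ordinary digital microphone. Specifically, the current environment may contain sound in many frequency bands, whereas human speech occupies a fixed frequency band. To improve the accuracy of sound detection, the working frequency band of the piezoelectric wake-up microphone can be preset according to the frequency range of human speech. The audio listening device is provided with the piezoelectric wake-up microphone; when the device is powered on but idle, the piezoelectric wake-up microphone detects the voice input signal in the current environment within the preset frequency band to monitor whether the device needs to be woken up.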
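Restricting detection to a preset band can be illustrated in software, although in the device described here the band limit belongs to the piezoelectric microphone itself. The sketch below is only an analogy under stated assumptions: it measures what fraction of a frame's spectral energy falls inside a band, using a naive DFT. The 300 Hz to 3400 Hz default (a common telephony voice band) and the function name are assumptions, not values from the source.

```python
import cmath
import math


def band_energy_ratio(samples, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Fraction of non-DC spectral energy that lies in [low_hz, high_hz].

    Uses a naive O(n^2) DFT, which is fine for short frames. The band
    limits are hypothetical defaults.
    """
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        freq = k * sample_rate / n
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        total += power
        if low_hz <= freq <= high_hz:
            in_band += power
    return in_band / total if total > 0.0 else 0.0
```

A 1 kHz tone yields a ratio close to 1, while a 100 Hz rumble yields a ratio close to 0, so a detector could ignore the rumble even if it is loud.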
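In this embodiment, monitoring sound in the current environment with a piezoelectric wake-up microphone whose current consumption is on the order of microamperes greatly reduces the power consumption of the audio listening device in the standby state.
In one embodiment, the second acoustic-electric transducer includes a first digital microphone and a second digital microphone. As shown in FIG. 3, turning on the second acoustic-electric transducer when the decibel value of the voice input signal is greater than the preset decibel threshold, collecting the voice input signal through the second acoustic-electric transducer, performing noise reduction on the voice input signal, and saving the processing result includes the following steps:
S310. When the decibel value of the voice input signal is greater than the preset decibel threshold, turn on the first digital microphone and the second digital microphone.
S320. Collect the voice input signal through the first digital microphone and the second digital microphone, and perform beamforming and noise reduction processing on the voice input signal.
S330. Save the processing result to a preset ring buffer.
Specifically, to save power, the first digital microphone and the second digital microphone are off by default; they are turned on when the decibel value of the detected voice input signal is greater than the preset decibel threshold. The two microphones then start recording and collect the voice input signal in the current environment, and beamforming and noise reduction are applied to the collected voice data. Specifically, the acoustic phase difference of the voice input signal between the first digital microphone and the second digital microphone is obtained and a weighting coefficient is calculated for each frequency band; the voice input signals collected by the two microphones are then weighted and superposed to output single-channel beamformed voice data. To preserve the integrity of the voice data, the processed voice data is saved in a ring buffer of a preset size.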
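The phase-difference weighting described above can be sketched as a frequency-domain sum of the two channels. This is a simplified illustration under stated assumptions, not the patented algorithm: the source does not give the weighting formula, so the sketch chooses, for each frequency bin, a unit-magnitude weight that rotates the second microphone's phase onto the first, then averages the two. A practical beamformer would smooth these estimates over time or steer toward a fixed direction. All function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def beamform(mic1, mic2):
    """Combine two equal-length frames into one phase-aligned channel."""
    out = []
    for a, b in zip(dft(mic1), dft(mic2)):
        # Phase of mic2 relative to mic1 in this frequency bin.
        phase_diff = cmath.phase(b * a.conjugate())
        # Assumed weighting: equal gain, mic2 rotated to match mic1.
        w1 = 0.5
        w2 = 0.5 * cmath.exp(-1j * phase_diff)
        out.append(w1 * a + w2 * b)
    return idft(out)
```

When the second channel is a delayed copy of the first, the aligned sum reproduces the first channel, whereas a plain average would partially cancel it.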
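In one embodiment, the operating method of the audio listening device further includes: compressing the voice data in the ring buffer; and sending the compressed voice data to the terminal through the Bluetooth Low Energy communication connection.
The compressed voice data is used for speech recognition to obtain a speech recognition result. The terminal sends the compressed voice data to the cloud server for speech recognition and receives the speech recognition result. Specifically, to address the insufficient recording buffer in the conventional technology, the audio listening device uses a codec such as Opus (an audio coding format) or mSBC (modified Sub-Band Codec) with a preset compression ratio, and compresses the voice data in the ring buffer (including serialization) to save bandwidth, which also addresses the large time delay in the conventional technology.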
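The buffer-then-compress step can be sketched with a fixed-capacity ring buffer that overwrites its oldest frames. Note the substitution: the source names Opus or mSBC, which are lossy speech codecs; the sketch uses `zlib` from the Python standard library purely as a lossless stand-in so that it runs without third-party packages. The class and function names are hypothetical.

```python
import zlib
from collections import deque


class RingBuffer:
    """Keeps only the most recent `capacity_frames` audio frames."""

    def __init__(self, capacity_frames):
        # deque with maxlen drops the oldest frame when full.
        self._frames = deque(maxlen=capacity_frames)

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def snapshot(self) -> bytes:
        """Concatenate buffered frames, oldest first."""
        return b"".join(self._frames)


def compress_for_ble(pcm: bytes) -> bytes:
    """Stand-in for the codec step; a real device would use Opus or mSBC."""
    return zlib.compress(pcm, 6)
```

With a capacity of three frames, pushing a fourth frame discards the first, so the snapshot always covers the latest audio leading up to and including the wake-up keyword.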
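Further, the audio listening device sends the compressed voice data to the terminal through the Bluetooth Low Energy communication connection. The terminal receives the compressed voice data, which may include speech in which the user asks the audio listening device to perform a task or asks it for information. A network connection exists between the terminal and the cloud server; the terminal sends the received voice data to the cloud server, the cloud server performs speech recognition on the voice data and returns the speech recognition result to the terminal, and the terminal receives the result. It can be understood that the terminal may also have its own speech recognition engine and use it to recognize the received voice data.
In one embodiment, the operating method of the audio listening device further includes: determining whether the audio listening device is in use or not in use through an optical proximity sensor, a capacitive sensor, a pressure sensor, or a Hall sensor.
The optical proximity sensor contains a photodiode that detects reflected light signals from outside, such as infrared signals. The Hall sensor is used to determine whether the two audio listening units (such as earbuds) of the audio listening device are magnetically attached to each other. The pressure sensor is used to determine whether the two audio listening units (such as earbuds) are under pressure. The capacitive sensor is used to determine whether the two audio listening units (such as earbuds) are in contact with the ear canal. If the optical proximity sensor detects a reflected light signal, or the two earbuds are separated, the audio listening device can be judged to be in use. If the optical proximity sensor detects no reflected light signal, or the two earbuds are magnetically attached, the audio listening device can be judged to be not in use.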
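In one embodiment, as shown in FIG. 4, the operating method of the audio listening device further includes the following steps:
S410. When the optical proximity sensor, capacitive sensor, pressure sensor, or Hall sensor detects that the audio listening device is being worn, determine that the audio listening device is in use.
S420. Establish a classic Bluetooth communication connection between the audio listening device and the terminal.
S430. Receive the speech recognition result sent by the terminal through the classic Bluetooth communication connection.
S440. Play the speech recognition result through the electroacoustic transducer of the audio listening device.
The electroacoustic transducer is a component that receives an electrical signal and converts it into a sound signal. Specifically, if the optical proximity sensor detects a reflected light signal, the earbuds of the audio listening device are in the user's ear canals, that is, the device is being worn. Alternatively, because the two earbuds of a neckband earphone are magnetically attached to each other when they are not placed in the user's ear canals, the device is being worn if the Hall sensor detects that the two earbuds are separated.
Specifically, if the optical proximity sensor, capacitive sensor, pressure sensor, or Hall sensor detects that the audio listening device is being worn, the device can be determined to be in use. An audio listening device that is in use needs a Bluetooth communication connection with the terminal. To ensure playback quality, a classic Bluetooth communication connection is established between the in-use audio listening device and the terminal. Through this connection, the terminal sends the speech recognition result it has received to the audio listening device, which receives the result and plays it through its electroacoustic transducer.
In this embodiment, the classic Bluetooth communication connection between the audio listening device and the terminal is established once the optical proximity sensor, capacitive sensor, pressure sensor, or Hall sensor indicates that the device is in use. Establishing the classic Bluetooth connection only at this point reduces the power consumption of the audio listening device, and transmitting audio data over classic Bluetooth improves sound quality and avoids playback artifacts.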
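In one embodiment, the operating method of the audio listening device further includes: when the optical proximity sensor, capacitive sensor, pressure sensor, or Hall sensor detects that the audio listening device is not being worn, determining that the audio listening device is not in use; and playing the speech recognition result through the electroacoustic transducer of the terminal.
If the optical proximity sensor detects no reflected light signal, the earbuds of the audio listening device are not in the user's ear canals, that is, the device is not being worn. Likewise, the device is determined to be not worn when the Hall sensor detects that the two earbuds are magnetically attached, when the pressure sensor detects that the two earbuds are not under pressure, or when the capacitive sensor detects that the two earbuds are not in contact with the ear canal. It can be understood that the state of the audio listening device can be detected by any one of the optical proximity sensor, capacitive sensor, pressure sensor, and Hall sensor, or by a combination of at least two of them.
Specifically, if the audio listening device is not being worn, it is determined to be not in use. In that case the speech recognition result cannot be played through the audio listening device, and there is no need to establish a classic Bluetooth communication connection between the audio listening device and the terminal; the result can be played directly through the electroacoustic transducer of the terminal, which is convenient for the user.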
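The wear detection and the resulting playback routing can be sketched as a small decision function. The source allows any one sensor or a combination of at least two but does not say how readings are combined, so the rule used here (every sensor that is present must indicate "worn") is an assumption, as are all names in the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorReadings:
    """One optional reading per sensor; None means the sensor is absent."""

    reflected_light: Optional[bool] = None  # optical proximity sensor
    ear_contact: Optional[bool] = None  # capacitive sensor
    under_pressure: Optional[bool] = None  # pressure sensor
    earbuds_magnetically_attached: Optional[bool] = None  # Hall sensor


def is_worn(r: SensorReadings) -> bool:
    """Assumed rule: all available sensors must agree that the device is worn."""
    votes = [
        r.reflected_light,
        r.ear_contact,
        r.under_pressure,
        # Magnetically attached earbuds mean the device is NOT worn.
        None
        if r.earbuds_magnetically_attached is None
        else not r.earbuds_magnetically_attached,
    ]
    present = [v for v in votes if v is not None]
    return bool(present) and all(present)


def playback_route(r: SensorReadings) -> str:
    """Worn: open classic Bluetooth and play on the device. Otherwise: terminal."""
    return "device_via_classic_bluetooth" if is_worn(r) else "terminal_speaker"
```

Requiring agreement errs toward playing on the terminal, which avoids opening the higher-power classic Bluetooth link on an ambiguous reading.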
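In one embodiment, the audio listening device is provided with a voice wake-up button. As shown in FIG. 5, before the Bluetooth Low Energy communication connection between the audio listening device and the terminal is established, the method further includes:
S510. Detect whether a trigger operation occurs on the voice wake-up button.
Establishing the Bluetooth Low Energy communication connection between the audio listening device and the terminal includes:
S520. If a trigger operation occurs, establish the Bluetooth Low Energy communication connection between the audio listening device and the terminal.
The voice wake-up button is a button used to wake up the audio listening device; it may be a touch button or a mechanical button. Specifically, when the user triggers the voice wake-up button, this indicates that the Bluetooth Low Energy communication connection between the audio listening device and the terminal needs to be established. The device therefore detects whether a trigger operation occurs on the voice wake-up button and, if one is detected, establishes the Bluetooth Low Energy communication connection. It can be understood that after the voice wake-up button is triggered, the audio listening device can also record the voice input signal in the current environment through the second acoustic-electric transducer.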
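In one embodiment, as shown in FIG. 6, an operating method for an audio listening device is provided, in which a piezoelectric wake-up microphone monitors the voice input signal in the current environment and the second acoustic-electric transducer consists of a first digital microphone and a second digital microphone. The method includes the following steps:
S602. When the audio listening device is in a standby state, detect the voice input signal in the current environment through the piezoelectric wake-up microphone within a preset frequency band.
S604. When the decibel value of the voice input signal is greater than the preset decibel threshold, turn on the first digital microphone and the second digital microphone.
S606. Collect the voice input signal through the first digital microphone and the second digital microphone, and perform beamforming and noise reduction processing on the voice input signal.
S608. Save the processing result to the preset ring buffer.
S610. Detect the processing result and determine whether it includes a wake-up keyword.
S612. When the wake-up keyword is detected, establish a Bluetooth Low Energy communication connection between the audio listening device and the terminal.
S614. Determine whether the audio listening device is in use or not in use through an optical proximity sensor, a capacitive sensor, a pressure sensor, or a Hall sensor.
S616. When the audio listening device is detected to be in use, establish a classic Bluetooth communication connection between the audio listening device and the terminal.
S618. Receive the speech recognition result sent by the terminal through the classic Bluetooth communication connection.
S620. Play the speech recognition result through the electroacoustic transducer of the audio listening device.
S622. When the audio listening device is detected to be not in use, play the speech recognition result through the electroacoustic transducer of the terminal.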
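The branching in steps S602 to S622 can be traced with a small function that returns the steps executed for a given situation. It is a reading aid under stated assumptions rather than device firmware: the inputs (a measured level, whether the keyword was found, whether the device is in use) are supplied by the caller, and the function name is hypothetical.

```python
def run_wake_flow(level_db, threshold_db, keyword_found, in_use):
    """Return the labels of the steps that run, in order.

    Assumption: each decision is reduced to a value passed in by the
    caller; on real hardware these come from the microphones, the local
    recognition engine, and the wear sensors.
    """
    steps = ["S602"]  # the piezoelectric microphone is always listening
    if level_db <= threshold_db:
        return steps  # too quiet: digital microphones stay off
    steps += ["S604", "S606", "S608", "S610"]
    if not keyword_found:
        return steps  # no wake-up keyword: no Bluetooth link is opened
    steps += ["S612", "S614"]
    if in_use:
        steps += ["S616", "S618", "S620"]  # play on the device
    else:
        steps.append("S622")  # play on the terminal
    return steps
```

The early returns mirror the power-saving intent: neither Bluetooth link is opened unless both the level gate and the keyword check pass.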
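In one embodiment, as shown in FIG. 7a, the audio listening device is provided with a Bluetooth Low Energy communication module 710 and a classic Bluetooth communication module 720. The audio listening device includes a piezoelectric wake-up microphone 711, a first digital microphone 712, a second digital microphone 713, an optical proximity sensor 714, a Hall sensor 715, and a voice wake-up button 716, all connected to the Bluetooth Low Energy communication module 710. The audio listening device further includes an LED indicator 721, an earphone speaker 722, a multi-function button 723, and a volume button 724, all connected to the classic Bluetooth communication module 720.
In this embodiment, the audio listening device is in a standby state, the Bluetooth Low Energy communication module and the classic Bluetooth communication module are both in low power consumption mode, and both the Bluetooth Low Energy communication connection and the classic Bluetooth communication connection between the audio listening device and the terminal are disconnected. The first digital microphone and the second digital microphone are turned off. The piezoelectric wake-up microphone is in sound monitoring mode. As shown in FIG. 7b, this embodiment provides an operating method for an audio listening device that includes the following steps 701 to 715. The specific process is as follows: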
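Step 701: detect the voice input signal in the current environment through the piezoelectric wake-up microphone.
Step 702: when the decibel value of the voice input signal is greater than the preset decibel threshold, the piezoelectric wake-up microphone sends an interrupt to the Bluetooth Low Energy communication module, which starts the first digital microphone and the second digital microphone.
Step 703: collect the voice input signal through the first digital microphone and the second digital microphone.
Step 704: perform beamforming and noise reduction processing on the voice input signal, and save the processing result to the preset ring buffer.
Step 705: perform wake-up keyword detection on the data in the ring buffer through the local speech recognition engine of the audio listening device.
Step 706: when the wake-up keyword is detected, establish a Bluetooth Low Energy communication connection between the audio listening device and the terminal.
Step 707: compress the data in the ring buffer.
Step 708: send the compressed voice data to the terminal through the Bluetooth Low Energy communication connection.
Step 709: the terminal receives the compressed voice data and sends it to the cloud server.
Step 710: the cloud server performs speech recognition on the received voice data.
Step 711: the cloud server sends the speech recognition result to the terminal.
Step 712: the terminal receives the speech recognition result and sends it to the audio listening device.
Step 713: while speech recognition is in progress, if the audio listening device is in use, establish a classic Bluetooth communication connection between the audio listening device and the terminal.
Specifically, the Bluetooth Low Energy communication module wakes up the classic Bluetooth communication module, which establishes the classic Bluetooth communication connection between the audio listening device and the terminal.
Step 714: receive the speech recognition result sent by the terminal through the classic Bluetooth communication connection.
Step 715: play the speech recognition result through the speaker of the audio listening device.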
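Step 713 overlaps the classic Bluetooth setup with the cloud round trip, so the link is ready by the time the result arrives. The overlap can be sketched with `asyncio` from the Python standard library. Everything here is a placeholder under stated assumptions: the two coroutines only sleep to simulate latency and do not call any real Bluetooth or cloud API, and all names are hypothetical.

```python
import asyncio


async def recognize_in_cloud(compressed: bytes) -> str:
    """Placeholder for steps 708 to 712 (device, terminal, cloud, terminal)."""
    await asyncio.sleep(0.02)  # simulated network latency
    return f"result for {len(compressed)} bytes"


async def connect_classic_bluetooth() -> str:
    """Placeholder for the BLE module waking the classic Bluetooth module."""
    await asyncio.sleep(0.01)  # simulated connection setup time
    return "classic-link"


async def recognize_and_connect(compressed: bytes, in_use: bool):
    """Start recognition, then set up the classic link while it runs."""
    recognition = asyncio.create_task(recognize_in_cloud(compressed))
    link = await connect_classic_bluetooth() if in_use else None
    result = await recognition
    return link, result
```

Calling `asyncio.run(recognize_and_connect(data, True))` returns both the link handle and the result; when the device is not in use, no classic link is opened and the link is `None`.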
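It should be understood that although the steps in the above flowcharts are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be executed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed sequentially, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.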
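In one embodiment, as shown in FIG. 8, an operating apparatus 800 for an audio device with real-time voice wake-up is provided. The operating apparatus includes:
a voice input detection module 810, configured to detect the voice input signal in the current environment through the first acoustic-electric transducer when the audio device is in a standby state;
a voice input processing module 820, configured to, when the decibel value of the voice input signal is greater than the preset decibel threshold, turn on the second acoustic-electric transducer, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer;
a processing result detection module 830, configured to detect the processing result; and
a first communication connection module 840, configured to establish the first Bluetooth communication connection between the audio device and the terminal when it is detected that the processing result contains a wake-up keyword.
In one embodiment, the first acoustic-electric transducer is a piezoelectric wake-up microphone; the voice input detection module 810 is further configured to detect the voice input signal in the current environment through the piezoelectric wake-up microphone, within a preset frequency band, when the audio device is in a standby state.
In one embodiment, the second acoustic-electric transducer includes a first digital microphone and a second digital microphone. The voice input processing module 820 is further configured to: turn on the first digital microphone and the second digital microphone when the decibel value of the voice input signal is greater than the preset decibel threshold; obtain the acoustic phase difference of the voice input signal collected through the first digital microphone and the second digital microphone and calculate a weighting coefficient for each frequency band; weight and superpose the voice input signals collected by the two microphones to output single-channel beamformed voice data; and perform noise reduction on the voice data and save it to the preset ring buffer.
In one embodiment, the apparatus further includes a voice data compression module and a voice data sending module. The voice data compression module is configured to compress the voice data in the ring buffer; the voice data sending module is configured to send the compressed voice data to the terminal through the first Bluetooth communication connection; the compressed voice data is used for speech recognition to obtain a speech recognition result.
In one embodiment, the apparatus further includes a use state determination module, configured to determine whether the audio device is in use or not in use through an optical proximity sensor or a Hall sensor.
In one embodiment, the use state determination module is further configured to determine that the audio device is in use if the optical proximity sensor or the Hall sensor detects that the audio device is being worn. The apparatus further includes a second communication connection module, a speech recognition result receiving module, and a playback module, wherein:
the second communication connection module is configured to establish a second Bluetooth communication connection between the audio device and the terminal, wherein the power consumption of the first Bluetooth communication connection is lower than that of the second Bluetooth communication connection;
the speech recognition result receiving module is configured to receive the speech recognition result sent by the terminal through the second Bluetooth communication connection; and
the playback module is configured to play the speech recognition result through the electroacoustic transducer of the audio device.
In one embodiment, the audio device is provided with a voice wake-up button; the apparatus further includes a trigger operation detection module, configured to detect whether a trigger operation occurs on the voice wake-up button; the first communication connection module is further configured to establish the first Bluetooth communication connection between the audio device and the terminal if a trigger operation occurs.
For specific limitations on the operating apparatus of the audio device, refer to the limitations on the operating method of the audio listening device above; they are not repeated here. The modules in the above operating apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.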
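In one embodiment, an audio device with real-time voice wake-up is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above operating method of the audio device with real-time voice wake-up.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the method steps in the above embodiments.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.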

Claims (10)

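  1. An operating method for an audio device with real-time voice wake-up, the operating method comprising:
    when the audio device is in a standby state, detecting a voice input signal in the current environment through a first acoustic-electric transducer;
    when the decibel value of the voice input signal is greater than a preset decibel threshold, turning on a second acoustic-electric transducer, collecting the voice input signal through the second acoustic-electric transducer, performing beamforming and noise reduction processing on the voice input signal, and saving the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer;
    detecting the processing result; and
    when it is detected that the processing result contains a wake-up keyword, establishing a first Bluetooth communication connection between the audio device and a terminal.
  2. The method according to claim 1, wherein the first acoustic-electric transducer comprises a piezoelectric wake-up microphone; and detecting the voice input signal in the current environment through the first acoustic-electric transducer when the audio device is in a standby state comprises:
    when the audio device is in a standby state, detecting the voice input signal in the current environment through the piezoelectric wake-up microphone within a preset frequency band.
  3. The method according to claim 1, wherein the second acoustic-electric transducer comprises a first digital microphone and a second digital microphone; and turning on the second acoustic-electric transducer when the decibel value of the voice input signal is greater than the preset decibel threshold, collecting the voice input signal through the second acoustic-electric transducer, performing beamforming and noise reduction processing on the voice input signal, and saving the processing result comprises:
    when the decibel value of the voice input signal is greater than the preset decibel threshold, turning on the first digital microphone and the second digital microphone;
    obtaining the acoustic phase difference of the voice input signal collected through the first digital microphone and the second digital microphone and calculating a weighting coefficient for each frequency band, and weighting and superposing the voice input signals respectively collected by the first digital microphone and the second digital microphone to output single-channel beamformed voice data; and
    performing noise reduction processing on the voice data and saving it to a buffer.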
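  4. The method according to claim 3, wherein the method further comprises:
    compressing the voice data in the ring buffer; and
    sending the compressed voice data to the terminal through the first Bluetooth communication connection, the compressed voice data being used for speech recognition to obtain a speech recognition result.
  5. The method according to claim 4, wherein the method further comprises:
    when an optical proximity sensor, a capacitive sensor, a pressure sensor, or a Hall sensor detects that the audio device is being worn, determining that the audio device is in use;
    establishing a second Bluetooth communication connection between the audio device and the terminal, wherein the power consumption of the first Bluetooth communication connection is lower than that of the second Bluetooth communication connection;
    receiving, through the second Bluetooth communication connection, the speech recognition result sent by the terminal; and
    playing the speech recognition result through an electroacoustic transducer of the audio device.
  6. The method according to claim 4, wherein the method further comprises:
    when an optical proximity sensor, a capacitive sensor, a pressure sensor, or a Hall sensor detects that the audio device is not being worn, determining that the audio device is not in use; and
    playing the speech recognition result through an electroacoustic transducer of the terminal.
  7. The method according to any one of claims 1 to 6, wherein the audio device is provided with a voice wake-up button; before establishing the first Bluetooth communication connection between the audio device and the terminal, the method further comprises:
    detecting whether a trigger operation occurs on the voice wake-up button;
    and establishing the first Bluetooth communication connection between the audio device and the terminal comprises:
    when the trigger operation occurs, establishing the first Bluetooth communication connection between the audio device and the terminal.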
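  8. An operating apparatus for an audio device with real-time voice wake-up, the operating apparatus comprising:
    a voice input detection module, configured to detect a voice input signal in the current environment through a first acoustic-electric transducer when the audio device is in a standby state;
    a voice input processing module, configured to, when the decibel value of the voice input signal is greater than a preset decibel threshold, turn on a second acoustic-electric transducer, collect the voice input signal through the second acoustic-electric transducer, perform beamforming and noise reduction processing on the voice input signal, and save the processing result, wherein the power consumption of the first acoustic-electric transducer is lower than that of the second acoustic-electric transducer;
    a processing result detection module, configured to detect the processing result; and
    a first communication connection module, configured to establish a first Bluetooth communication connection between the audio device and a terminal when it is detected that the processing result contains a wake-up keyword.
  9. An audio device with real-time voice wake-up, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.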
PCT/CN2019/091973 2019-05-16 2019-06-20 实时语音唤醒的音频设备、运行方法、装置及存储介质 WO2020228095A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910405965.1A CN110312235A (zh) 2019-05-16 2019-05-16 实时语音唤醒的音频设备、运行方法、装置及存储介质
CN201910405965.1 2019-05-16

Publications (1)

Publication Number Publication Date
WO2020228095A1 true WO2020228095A1 (zh) 2020-11-19

Family

ID=68074766

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/091973 WO2020228095A1 (zh) 2019-05-16 2019-06-20 实时语音唤醒的音频设备、运行方法、装置及存储介质

Country Status (2)

Country Link
CN (1) CN110312235A (zh)
WO (1) WO2020228095A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113990311A (zh) * 2021-10-15 2022-01-28 深圳市航顺芯片技术研发有限公司 语音采集装置、控制器、控制方法及语音采集控制系统
CN114173426A (zh) * 2021-11-30 2022-03-11 广州番禺巨大汽车音响设备有限公司 基于无线音频传输的无线音箱播放控制方法、装置及系统
CN114928412A (zh) * 2022-05-27 2022-08-19 深圳市智慧海洋科技有限公司 水声通信控制方法、装置、运动检测传感器及通信系统

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110675873B (zh) 2019-09-29 2023-02-07 百度在线网络技术(北京)有限公司 智能设备的数据处理方法、装置、设备及存储介质
CN110830866A (zh) * 2019-10-31 2020-02-21 歌尔科技有限公司 一种语音助手唤醒方法、装置及无线耳机和存储介质
CN111028831B (zh) * 2019-11-11 2022-02-18 云知声智能科技股份有限公司 一种语音唤醒方法及装置
CN111124511A (zh) * 2019-12-09 2020-05-08 浙江省北大信息技术高等研究院 唤醒芯片及唤醒系统
CN111429911A (zh) * 2020-03-11 2020-07-17 云知声智能科技股份有限公司 一种降低噪音场景下语音识别引擎功耗的方法及装置
CN111524513A (zh) * 2020-04-16 2020-08-11 歌尔科技有限公司 一种可穿戴设备及其语音传输的控制方法、装置及介质
CN111679861A (zh) * 2020-05-09 2020-09-18 浙江大华技术股份有限公司 电子设备的唤醒装置、方法和计算机设备和存储介质
CN112216279A (zh) * 2020-09-29 2021-01-12 星络智能科技有限公司 语音传输方法、智能终端及计算机可读存储介质
CN112399638B (zh) * 2020-11-17 2023-07-14 Oppo广东移动通信有限公司 一种通信连接建立方法、存储介质及设备
CN114816026B (zh) * 2021-01-21 2024-05-17 华为技术有限公司 一种低功耗待机方法、电子设备及计算机可读存储介质
CN113225662B (zh) * 2021-05-28 2022-04-29 杭州国芯科技股份有限公司 一种带G-sensor的TWS耳机唤醒测试方法
CN113470658A (zh) * 2021-05-31 2021-10-01 翱捷科技(深圳)有限公司 一种智能耳机及其语音唤醒阈值调整方法
CN113889107A (zh) * 2021-10-18 2022-01-04 深圳追一科技有限公司 数字人系统的唤醒方法和数字人系统
CN115022452B (zh) * 2022-06-13 2024-04-02 浙江地芯引力科技有限公司 音频设备的通信方法、装置、设备及存储介质
CN115278075A (zh) * 2022-07-26 2022-11-01 浙江大华技术股份有限公司 设备控制方法、信息处理方法及设备控制系统
CN115988380B (zh) * 2023-03-21 2023-06-20 东莞市云仕电子有限公司 一种具有促进睡眠功能的儿童无线耳机及方法
CN118175527B (zh) * 2024-05-13 2024-07-16 深圳市爱都科技有限公司 蓝牙手表连接手机端使用语音助手的方法、装置及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403621A (zh) * 2017-08-25 2017-11-28 深圳市沃特沃德股份有限公司 语音唤醒装置及方法
CN107577449A (zh) * 2017-09-04 2018-01-12 百度在线网络技术(北京)有限公司 唤醒语音的拾取方法、装置、设备及存储介质
CN108877788A (zh) * 2017-05-08 2018-11-23 瑞昱半导体股份有限公司 具有语音唤醒功能的电子装置及其操作方法
CN208227271U (zh) * 2017-12-05 2018-12-11 Tcl通力电子(惠州)有限公司 蓝牙智能音响及音响语音交互系统
US20180366115A1 (en) * 2017-06-19 2018-12-20 Lenovo (Singapore) Pte. Ltd. Systems and methods for identification of response cue at peripheral device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104349241B (zh) * 2013-08-07 2019-04-23 联想(北京)有限公司 一种耳机及信息处理方法
CN105792050A (zh) * 2016-04-20 2016-07-20 青岛歌尔声学科技有限公司 一种蓝牙耳机及基于该蓝牙耳机的通信方法
TW201824836A (zh) * 2016-12-28 2018-07-01 立創智能股份有限公司 遠端藍牙裝置通訊系統及其方法
CN206640743U (zh) * 2017-03-14 2017-11-14 潍坊歌尔电子有限公司 一种蓝牙耳机及可穿戴电子设备
CN107277754B (zh) * 2017-07-12 2020-02-28 深圳市冠旭电子股份有限公司 一种蓝牙连接的方法及蓝牙外围设备
CN108962240B (zh) * 2018-06-14 2021-09-21 百度在线网络技术(北京)有限公司 一种基于耳机的语音控制方法及系统
CN108989931B (zh) * 2018-06-19 2020-10-09 美特科技(苏州)有限公司 听力保护耳机及其听力保护方法、计算机可读存储介质
CN109493857A (zh) * 2018-09-28 2019-03-19 广州智伴人工智能科技有限公司 一种自动休眠唤醒机器人系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108877788A (zh) * 2017-05-08 2018-11-23 瑞昱半导体股份有限公司 具有语音唤醒功能的电子装置及其操作方法
US20180366115A1 (en) * 2017-06-19 2018-12-20 Lenovo (Singapore) Pte. Ltd. Systems and methods for identification of response cue at peripheral device
CN107403621A (zh) * 2017-08-25 2017-11-28 深圳市沃特沃德股份有限公司 语音唤醒装置及方法
CN107577449A (zh) * 2017-09-04 2018-01-12 百度在线网络技术(北京)有限公司 唤醒语音的拾取方法、装置、设备及存储介质
CN208227271U (zh) * 2017-12-05 2018-12-11 Tcl通力电子(惠州)有限公司 蓝牙智能音响及音响语音交互系统

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113990311A (zh) * 2021-10-15 2022-01-28 深圳市航顺芯片技术研发有限公司 语音采集装置、控制器、控制方法及语音采集控制系统
CN114173426A (zh) * 2021-11-30 2022-03-11 广州番禺巨大汽车音响设备有限公司 基于无线音频传输的无线音箱播放控制方法、装置及系统
CN114173426B (zh) * 2021-11-30 2023-09-29 广州番禺巨大汽车音响设备有限公司 基于无线音频传输的无线音箱播放控制方法、装置及系统
CN114928412A (zh) * 2022-05-27 2022-08-19 深圳市智慧海洋科技有限公司 水声通信控制方法、装置、运动检测传感器及通信系统
CN114928412B (zh) * 2022-05-27 2024-03-19 深圳市智慧海洋科技有限公司 水声通信控制方法、装置、运动检测传感器及通信系统

Also Published As

Publication number Publication date
CN110312235A (zh) 2019-10-08

Similar Documents

Publication Publication Date Title
WO2020228095A1 (zh) 实时语音唤醒的音频设备、运行方法、装置及存储介质
US11605456B2 (en) Method and device for audio recording
US11412333B2 (en) Interactive system for hearing devices
CN110493678B (zh) 耳机的控制方法、装置、耳机和存储介质
KR101622493B1 (ko) 오디오 피처 데이터의 추출 및 분석
CN108922537B (zh) 音频识别方法、装置、终端、耳机及可读存储介质
CN108521621B (zh) 信号处理方法、装置、终端、耳机及可读存储介质
CN108763901B (zh) 耳纹信息获取方法和装置、终端、耳机及可读存储介质
WO2018095035A1 (zh) 耳机及其语音识别方法
CN108540900B (zh) 音量调节方法及相关产品
US20080158000A1 (en) Autodetect of user presence using a sensor
WO2019033987A1 (zh) 提示方法、装置、存储介质及终端
US20240073577A1 (en) Audio playing method, apparatus and system for in-ear earphone
CN113630708B (zh) 耳机麦克风异常检测的方法、装置、耳机套件及存储介质
US11195518B2 (en) Hearing device user communicating with a wireless communication device
WO2021103260A1 (zh) 耳机的控制方法以及耳机
CN104754462A (zh) 音量自动调节装置及方法和耳机
CN108810787B (zh) 基于音频设备的异物检测方法和装置、终端
WO2023197474A1 (zh) 一种耳机模式对应的参数确定方法、耳机、终端和系统
CN107493376A (zh) 一种铃声音量调节方法和装置
TW202232470A (zh) 音訊信號的處理方法、裝置及電子設備
WO2020220271A1 (zh) 存储器、声学单元及其音频处理方法、装置、设备和系统
WO2018227560A1 (zh) 耳机控制方法及系统
CN112804608B (zh) 带助听功能tws耳机的使用方法、系统、主机及存储介质
US11776538B1 (en) Signal processing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19929019

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19929019

Country of ref document: EP

Kind code of ref document: A1