WO2023197997A1 - Wearable device, sound pickup method and apparatus - Google Patents

Wearable device, sound pickup method and apparatus

Info

Publication number
WO2023197997A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound signal
sound
target
microphone
pickup direction
Application number
PCT/CN2023/087315
Other languages
English (en)
French (fr)
Inventor
朱梦尧
黎椿键
石超宇
李英明
张雯
陈景东
冷欣
杨懿晨
王贤锐
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2023197997A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/20: Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32: Arrangements for obtaining desired directional characteristic only
    • H04R1/40: Arrangements for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406: Arrangements for obtaining desired directional characteristic only by combining a number of identical transducers — microphones
    • H04R2430/00: Signal processing covered by H04R, not provided for in its groups
    • H04R2430/20: Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161: Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166: Microphone arrays; Beamforming

Definitions

  • the present application relates to the field of terminal technology, and in particular to a wearable device, sound pickup method and device.
  • by adding a microphone array to a wearable device, the wearable device is given a sound pickup function.
  • at present, the microphone array of a wearable device generally includes two omnidirectional microphones, which are placed in the wearable device as nearly as possible in a straight line with the wearer's mouth; the wearer's sound signal is obtained based on the principle of sound signal superposition, and the acquired signal is then processed with a differential microphone array (DMA) algorithm to improve the quality of the wearer's sound signal picked up by the wearable device.
  • when the microphone array is not installed in an effective position in the wearable device, or when the wearer uses the wearable device in a relatively noisy environment, audio mixed from human voices and environmental noise is collected by the microphones at the same time. This easily reduces the intelligibility of the sound signals picked up by the wearable device, degrades the sound pickup quality, and lowers the signal-to-noise ratio.
  • This application provides a wearable device and a sound pickup method and apparatus, which alleviate, to a certain extent, the problems of low intelligibility of picked-up sound signals, poor sound pickup quality, and low signal-to-noise ratio.
  • this application provides a wearable device.
  • the wearable device includes a microphone array.
  • the microphone array includes at least one directional microphone; the pickup beam directions of the at least one directional microphone are orthogonal to each other.
  • a microphone array including at least one directional microphone is provided in the wearable device, and the directional microphones in the array are used to pick up sound signals. By making full use of a directional microphone's sensitivity to sound signals from a specific direction, noise mixed into the sound signal can be reduced at the source, which effectively avoids the degradation in quality caused by collecting overly complex sound signals, improves the sound quality of the acquired signal, and raises the signal-to-noise ratio.
  • when the microphone array contains at least two directional microphones, sound signals from multiple different directions can be acquired. The acquired sound signals are thereby further diversified, which improves the sound pickup performance of the microphones, the overall performance of the wearable device, and the user experience.
  • the microphone array further includes at least one omnidirectional microphone.
  • the omnidirectional microphone can be used to pick up sound from all directions in a balanced manner to obtain a rich and wide range of audio signals or noise.
  • the audio signal or noise obtained by the omnidirectional microphone can be used to denoise and enhance the audio signal collected by the directional microphone, so as to improve the sound pickup quality of the directional microphone and further improve the sound pickup performance of wearable devices.
  • the wearable device is configured to: when the wearable device detects a target sound pickup direction, turn on the microphones in the microphone array that point in the target sound pickup direction and turn off the microphones that do not point in the target sound pickup direction.
  • in the actual application process, this can, on the one hand, save the power of the wearable device, improve the user experience, and extend the service life of the wearable device; on the other hand, turning on the microphones pointing in the detected target pickup direction and turning off the other microphones prevents, as much as possible, the microphones from picking up noise from directions other than the target pickup direction, enhancing the pickup effect.
  • the wearable device is configured to: when detecting the presence of a first directional microphone that meets a preset condition in the microphone array, turn on the first directional microphone and turn off the other directional microphones;
  • the preset condition is that the signal quality of the sound signal picked up by the first directional microphone within a preset time period is higher than that of the other directional microphones.
  • in this way, the wearable device turns on the first directional microphone that meets the preset condition and turns off the other microphones, which prevents the microphones, as much as possible, from picking up sound signals in other directions that do not meet the preset condition, enhancing the sound pickup effect.
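  • as an illustration of the preset condition above, the following minimal sketch (Python, with hypothetical names; the application does not prescribe a concrete metric) scores each directional microphone over the preset time period with a simple energy-based SNR estimate and selects the one to turn on:

```python
import numpy as np

def pick_best_directional_mic(frames_by_mic, noise_floor=1e-4):
    """Select the directional microphone whose signal quality over the
    preset time period is highest; the others are to be turned off.

    frames_by_mic: dict mapping mic_id -> 1-D array of samples captured
    during the preset time period. The quality score used here (an
    energy-based SNR estimate against an assumed noise floor) is only
    an example; loudness or another metric could be substituted.
    """
    def snr_db(x):
        return 10 * np.log10(np.mean(x ** 2) / noise_floor + 1e-12)

    scores = {mic: snr_db(x) for mic, x in frames_by_mic.items()}
    best = max(scores, key=scores.get)
    return best, [m for m in scores if m != best]
```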
  • the wearable device is smart glasses.
  • the omnidirectional microphone is located in the nose bridge or nose pad of the smart glasses frame.
  • when the microphone array includes two omnidirectional microphones, the two omnidirectional microphones are respectively located on the two temples of the smart glasses; or, the two omnidirectional microphones are respectively located on the two sides of the glasses frame close to the two temples.
  • when the microphone array includes multiple omnidirectional microphones, they are distributed in the middle area and the side areas of the smart glasses; the middle area includes the nose bridge and/or nose pads of the smart glasses frame, and the side areas include the two temples of the smart glasses and/or the positions on both sides of the frame close to the two temples.
  • in the above design, the positions of the omnidirectional microphones are set according to their number, so that the omnidirectional microphones in the microphone array can pick up sound from multiple directions in as balanced a manner as possible and obtain rich, wide-ranging audio signals or noise. The audio signals or noise obtained by the omnidirectional microphones can be used to denoise and enhance the audio signals collected by the directional microphones, improving the sound pickup quality of the directional microphones and further improving the sound pickup performance of the smart glasses.
  • the directional microphone is a figure-8 microphone.
  • in this way, the utilization rate of the figure-8 microphone can be fully exploited, the production and R&D costs of the wearable device can be reduced, and the manufacturing efficiency of the wearable device can be improved.
  • this application provides a sound pickup method, which is applied to an electronic device.
  • the method includes:
  • a first interface is displayed, and the first interface is used to configure the pickup direction;
  • a target sound pickup direction is determined.
  • the electronic device can provide a sound pickup direction configuration function through the first interface, so that the user can select the target sound pickup direction according to the actual application situation. The electronic device can then pick up sound directly according to the target sound pickup direction in the subsequent pickup process, which improves the signal-to-noise ratio and intelligibility of the sound signals and enhances the user experience.
  • the method provided by the embodiment of the present application further includes:
  • the original sound signal is enhanced, and an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal is obtained.
  • the original sound signal is enhanced according to the target sound pickup direction to obtain the enhanced sound signal corresponding to the target sound pickup direction.
  • the original sound signal can be enhanced according to different actual situations.
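  • one common way to realize such direction-dependent enhancement is delay-and-sum beamforming; the sketch below is illustrative only (the application does not fix a specific enhancement algorithm) and assumes time-aligned multi-channel input:

```python
import numpy as np

def delay_and_sum(signals, mic_positions, target_dir, fs, c=343.0):
    """Enhance sound arriving from target_dir by aligning and averaging
    the microphone channels.

    signals:       (num_mics, num_samples) array of recordings
    mic_positions: (num_mics, 3) microphone coordinates in metres
    target_dir:    unit vector pointing from the array toward the target

    A microphone with positive projection onto target_dir hears the
    wavefront early; delaying its channel by that lead time makes the
    target components add coherently while off-target sounds do not.
    """
    num_mics, n = signals.shape
    lead = mic_positions @ target_dir / c          # per-mic arrival lead, s
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for m in range(num_mics):
        spec = np.fft.rfft(signals[m])
        # apply a (fractional, circular) delay of lead[m] seconds
        out += np.fft.irfft(spec * np.exp(-2j * np.pi * freqs * lead[m]), n=n)
    return out / num_mics
```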
  • in a recording scenario, obtaining the original sound signal includes: obtaining the original sound signal during recording;
  • the method further includes: saving the enhanced sound signal.
  • since the sound signal that the user listens to later is the enhanced sound signal, it is convenient for the user to repeatedly listen to a sound signal with higher sound quality. This solves the problem that, during recording, sound signals other than the one that needs to be recorded are collected and reduce the intelligibility of the sound signal; it improves the signal-to-noise ratio of the acquired sound signal and the intelligibility of the picked-up sound signal.
  • in a call scenario, obtaining the original sound signal includes: obtaining the original sound signal during a call; the method further includes: sending the enhanced sound signal to the peer call device.
  • call scenarios include voice calls, video calls, conference calls, etc.
  • in this way, both parties in the call hear the enhanced sound signal, which solves the problem that audio or noise other than the sound signals exchanged between the two parties is collected during the call and reduces intelligibility; it increases the signal-to-noise ratio of the acquired sound signal, improves the intelligibility of the picked-up sound signal, and improves the communication efficiency between the two parties.
  • the original sound signal is the sound signal in the original recorded video
  • the original sound signal is enhanced according to the target sound pickup direction to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal.
  • the method further includes: replacing the original sound signal in the original video with the enhanced sound signal.
  • in this way, the sound quality in the recorded video is greatly improved, and the problem that sound signals mixed with other audio and environmental noise are collected in the recorded video and reduce intelligibility is solved; the signal-to-noise ratio of the acquired sound signal is increased and the intelligibility of the picked-up sound signal is improved.
  • obtaining the original sound signal may also include: receiving the original sound signal sent by a sound pickup device.
  • the method provided by the embodiment of the present application further includes: sending the target sound pickup direction to the sound pickup device.
  • This not only reduces the processing burden on the electronic device's processor and effectively ensures the normal, stable operation of the electronic device; the sound pickup device can also pick up the sound signal corresponding to the target pickup direction based on the received target pickup direction, thereby obtaining sound signals with higher clarity, intelligibility and signal-to-noise ratio.
  • the electronic device includes a microphone array, the microphone array includes at least one directional microphone, and the electronic device acquires the original sound signal through the microphone array.
  • in this way, the power of the electronic device can be saved, the user experience improved, and the service life of the device extended.
  • in addition, turning on the microphones pointing in the detected target sound pickup direction and turning off the other microphones prevents, as much as possible, the microphones from picking up noise from directions other than the target pickup direction, enhancing the microphones' pickup effect.
  • the on or off state of each directional microphone can also be used to further achieve different sound pickup effects.
  • obtaining the original sound signal includes: according to the target sound pickup direction, turning on the directional microphones pointing in the target sound pickup direction, and turning off the directional microphones not pointing in the target sound pickup direction;
  • the original sound signal is enhanced, and an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal is obtained.
  • in this way, the electronic device turns on the microphones pointing in the target sound pickup direction and turns off the other microphones, which prevents the microphones from picking up noise from directions other than the target sound pickup direction, reduces the noise mixed into the acquired original sound signal, and enhances the pickup effect of the microphones. The sound signal acquired by the turned-on directional microphones is then further enhanced to obtain an enhanced sound signal corresponding to the target pickup direction.
  • This prevents the acquired sound signal from being mixed with sound signals from other directions, improves the clarity and sound quality of the enhanced sound signal, effectively improves the signal-to-noise ratio of the finally picked-up sound signal, improves the intelligibility of the sound signal, and improves the user experience.
  • the method provided by the embodiment of the present application further includes: sending the enhanced sound signal to an audio playback device. This expands the range of devices that can play the enhanced sound signal, so that the enhanced sound signal used for playback can be adapted to different usage scenarios.
  • the method provided by the embodiment of the present application further includes: playing the enhanced sound signal. This makes it possible to hear the enhanced sound signal directly.
  • the method provided by the embodiment of the present application further includes:
  • the first operation is detected on the recording interface, and the first operation is the triggering operation of the sound pickup configuration button.
  • the first operation is a recording startup operation
  • the method provided by the embodiment of the present application further includes:
  • the recording function is started.
  • the method provided by the embodiment of the present application further includes:
  • the first operation is detected on the call interface, and the first operation is the triggering operation of the sound pickup configuration button.
  • the first operation is a call connection operation
  • the method provided by the embodiment of the present application further includes:
  • the voice call or video call function is connected.
  • the method provided by the embodiment of the present application further includes:
  • the first operation is detected on the video recording interface, and the first operation is the triggering operation of the sound pickup configuration button.
  • the first operation is a recording startup operation
  • the method provided by the embodiment of the present application further includes:
  • the video recording function is started.
  • the method provided by the embodiment of the present application further includes:
  • the first operation is detected on the conference interface, and the first operation is the triggering operation of the sound pickup configuration button.
  • the first operation is a conference mode starting operation
  • the method provided by the embodiment of the present application further includes: in response to the first operation, starting the conference function.
  • the method provided by the embodiment of the present application further includes:
  • the display scene of the first interface is turned on or off, and the display scene includes at least one scene among a recording scene, a call scene, a video recording scene, and a meeting scene.
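  • a hypothetical sketch of such per-scene switches (names and storage are assumptions, not the application's design): each display scene independently controls whether the first interface is launched automatically.

```python
# Assumed scene keys; the application only requires that each display
# scene can be turned on or off for the first interface.
FIRST_INTERFACE_SCENES = {
    "recording": True,
    "call": True,
    "video_recording": False,
    "meeting": True,
}

def should_show_first_interface(scene: str) -> bool:
    """Return True if the first interface should be shown for this scene."""
    return FIRST_INTERFACE_SCENES.get(scene, False)
```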
  • this application provides a sound pickup method, which is applied to sound pickup equipment.
  • the method includes:
  • in the subsequent sound pickup process, the sound pickup device can directly pick up the target sound signal according to the target sound pickup direction, or perform signal enhancement processing on the picked-up original sound signal according to the target sound pickup direction to obtain the target sound signal located in the target pickup direction, thereby effectively improving the signal-to-noise ratio of the finally picked-up sound signal, improving the intelligibility of the sound signal, and enhancing the user experience.
  • acquiring the target sound signal in the target sound pickup direction includes:
  • the original sound signal is enhanced according to the target sound pickup direction to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal, and the enhanced sound signal is the target sound signal.
  • the original sound signal is enhanced according to the target sound pickup direction to obtain the enhanced sound signal corresponding to the target sound pickup direction.
  • the original sound signal can be enhanced according to different actual situations.
  • acquiring the target sound signal in the target sound pickup direction includes:
  • according to the target sound pickup direction, the microphones pointing in the target sound pickup direction are turned on and the other microphones are turned off. This prevents the microphones from picking up noise from directions other than the target sound pickup direction, reduces the noise mixed into the acquired original sound signal, and enhances the sound pickup effect of the microphones. In addition, it avoids the high power consumption caused by unrelated microphones working and extends the service life of the sound pickup device.
  • acquiring the sound signal in the target pickup direction includes:
  • the original sound signal is enhanced to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal, and the enhanced sound signal is the target sound signal.
  • in this way, the sound pickup device turns on the microphones pointing in the target pickup direction and turns off the other microphones according to the target pickup direction, which prevents the microphones from picking up noise from other directions, reduces the noise mixed into the acquired original sound signal, and enhances the pickup effect of the microphones. The sound signal acquired by the turned-on directional microphones is further enhanced to obtain the enhanced sound signal corresponding to the target pickup direction.
  • This prevents the acquired sound signal from being mixed with sound signals from other directions, improves the clarity and sound quality of the enhanced sound signal, effectively improves the signal-to-noise ratio of the finally picked-up sound signal, improves its intelligibility, and improves the user experience.
  • the method provided by the embodiment of the present application further includes: playing the target sound signal.
  • the method provided by the embodiment of the present application further includes: sending the target sound signal to an audio playback device.
  • this expands the range of devices that can play the target sound signal and enriches the practical application scenarios.
  • the present application provides a chip system that includes a processor that executes a computer program stored in a memory to implement the method described in any one of the second aspect or the third aspect.
  • the chip system further includes a memory, and the processor is connected to the memory through circuits or wires.
  • the present application provides an electronic device, including: a processor configured to run a computer program stored in a memory to implement the method in the second aspect or any possible implementation of the second aspect.
  • the electronic device is a wearable device as described in the first aspect or any optional manner of the first aspect.
  • the present application provides a sound pickup device, including: a processor configured to run a computer program stored in a memory to implement the method in the third aspect or any possible implementation of the third aspect.
  • the sound pickup device is a wearable device as described in the first aspect or any optional manner of the first aspect.
  • the present application provides a computer-readable storage medium that stores a computer program.
  • when the computer program is executed by a processor, it implements the method described in any one of the second aspect or the third aspect.
  • embodiments of the present application provide a computer program product.
  • when the computer program product is run on an electronic device or a sound pickup device, it causes the device to execute the method described in any one of the above second or third aspects.
  • Figure 1 is a partial structural schematic diagram of smart glasses provided by an embodiment of the present application.
  • Figure 2 is a schematic structural diagram of an earphone provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a usage scenario of smart glasses as a wearable device provided by an embodiment of the present application;
  • Figure 4 is a schematic diagram of the sensitivity of the figure-8 directional microphone to sound signals provided by the embodiment of the present application;
  • Figure 5 is a schematic diagram of the sensitivity of the omnidirectional microphone to sound signals provided by the embodiment of the present application.
  • Figure 6 is a functional block diagram of a system composed of a wearable device and an electronic device provided by an embodiment of the present application;
  • Figures 7-18 are schematic diagrams of beams formed by microphone array structures in different types of smart glasses provided by embodiments of the present application.
  • Figure 19 is a schematic diagram of a sound pickup method provided by an embodiment of the present application.
  • Figures 20-26 are schematic diagrams showing different scenarios of the first interface provided by the embodiment of the present application.
  • Figures 27 and 28 are schematic diagrams of the first interface provided by the embodiment of the present application.
  • Figure 29 is a schematic diagram of various gestures provided by embodiments of the present application.
  • Figure 30 is a schematic flow chart of a sound signal noise reduction extraction process provided by an embodiment of the present application.
  • Figure 31 is a schematic diagram of spatial feature clustering of sound signals provided by an embodiment of the present application.
  • Figures 32 and 33 are schematic flow charts of another noise reduction and extraction process for sound signals provided by embodiments of the present application.
  • Figure 34 is a schematic diagram comparing the extraction effect of the wearer's voice signal in the same noise environment provided by the embodiment of the present application;
  • Figure 35 is a schematic interactive flow diagram of a sound pickup method provided by an embodiment of the present application.
  • Figure 36 is a schematic interface diagram for connecting an electronic device to a sound pickup device according to an embodiment of the present application.
  • Figure 37 is a schematic interactive flow diagram of another sound pickup method provided by an embodiment of the present application.
  • Figure 38 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 39 is a schematic diagram of the software structure of an electronic device provided by an embodiment of the present application.
  • A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural.
  • the character "/" generally indicates an "or" relationship between the related objects.
  • by adding a microphone array to a wearable device, the wearable device is given a sound pickup function.
  • in the prior art, the microphone array of a wearable device generally includes two omnidirectional microphones, which are placed in the wearable device as nearly as possible in a straight line with the wearer's mouth; the wearer's sound signal is obtained based on the principle of sound signal superposition, and the acquired signal is then processed with a differential microphone array (DMA) algorithm to improve the quality of the wearer's sound signal picked up by the wearable device.
  • Figure 1 is a schematic diagram of a partial structure of smart glasses.
  • two omnidirectional microphones are provided on the temples of the smart glasses.
  • the positions of the two omnidirectional microphones in the smart glasses are roughly in a straight line with the mouth of the person wearing the smart glasses.
  • when the wearer's mouth emits a sound signal, the sound signal can be collected by the two omnidirectional microphones in the smart glasses.
  • Figure 2 shows a schematic structural diagram of an earphone.
  • two omnidirectional microphones are provided in the ear stem of the earphone.
  • the two omnidirectional microphones are arranged in the earphone approximately in a straight line with the wearer's mouth. When the wearer wears the earphones and the wearer's mouth emits a sound signal, the sound signal can be collected through the two omnidirectional microphones provided in the ear stem of the earphone.
  • the differential microphone array (DMA) algorithm is usually used to further process the sound signal picked up by the microphone array to obtain the processed sound signal.
  • DMA mainly uses the difference in spatial sound pressure to process the sound signal. Specifically, when N microphones are installed in the sound pickup device, differences up to order N-1 can be obtained, and the N-1 order differences are then used to process the sound signal.
  • when the microphone array of the sound pickup device includes two microphones, the first-order differential beam of the sound signal can be obtained through DMA. In other words, the sound signal is extracted by using the difference between the signals collected by the two microphones.
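  • a minimal sketch of such a first-order differential beamformer for two omnidirectional microphones (illustrative Python, not the patented processing; a practical DMA would also equalize the 6 dB/octave high-pass tilt of the difference output):

```python
import numpy as np

def first_order_dma(front, rear, spacing, fs, c=343.0):
    """First-order differential beamformer for two omnidirectional mics.

    The rear channel is delayed by the acoustic travel time between the
    capsules and subtracted from the front channel, which cancels sound
    arriving from the rear and keeps sound from the front: a first-order
    (cardioid-like) differential pattern.
    """
    n = len(front)
    delay = spacing / c                      # inter-capsule travel time, s
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # fractional delay applied in the frequency domain (circular; fine for
    # a sketch, a real implementation would process block-wise)
    rear_delayed = np.fft.irfft(
        np.fft.rfft(rear) * np.exp(-2j * np.pi * freqs * delay), n=n)
    return front - rear_delayed
```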
  • the above existing technology relies on the special arrangement of the two omnidirectional microphones in the microphone array in the wearable device and the DMA method to improve the quality of the wearer's voice signal picked up by the wearable device.
  • if the positions of the omnidirectional microphones in the microphone array are not in a straight line with the wearer's mouth, that is, if there is a large deviation, the quality of the picked-up sound signal and the signal-to-noise ratio are reduced, affecting the user experience.
  • when the above-mentioned wearable device is used in a relatively noisy sound pickup environment, sound mixed from human voices and environmental noise is collected by the microphone array at the same time, and processing the picked-up sound signal with the above method cannot filter out the noise in the collected signal, which reduces the intelligibility of the processed sound signal and affects the sound pickup quality.
  • in view of this, this application provides a wearable device in which a microphone array including at least one directional microphone is arranged. At least one directional microphone in the microphone array is used to pick up the sound signal, making full use of the directional microphone's sensitivity to sound signals from a specific direction. This reduces the noise mixed into the sound signal at its source, effectively avoids the degradation of sound signal quality caused by collecting overly complex sound signals, removes the constraints on where the microphone array can be installed in the wearable device, improves the sound quality of the acquired sound signals, and improves the signal-to-noise ratio.
  • the wearable device provided by the embodiments of the present application can be smart glasses, an augmented reality (AR) / virtual reality (VR) / mixed reality (MR) device, a smart helmet, headphones, a hearing aid, in-ear headphones, earbuds, a smart wristband, a smart watch, a pedometer, a two-way radio, a recording pen, or another device with a sound pickup function. It is not difficult to understand that the wearable device can also be another device oriented toward future technologies.
  • This wearable device can be applied to a variety of scenarios, including but not limited to video call scenarios, voice call scenarios, professional recording scenarios, radio/broadcasting/hosting scenarios, game live-streaming/live-commerce scenarios, conference scenarios, and other scenarios in which a sound pickup function is applicable.
  • call scenarios may include indoor call scenarios, outdoor call scenarios, quiet/noisy call scenarios, cycling/running/exercise call scenarios, car call scenarios, monaural call scenarios, binaural call scenarios, remote conference call scenarios, etc.
  • Figure 3 is a schematic diagram of a usage scenario of smart glasses as the wearable device provided by an embodiment of the present application. The smart glasses can be worn over the user's eyes and can communicate wirelessly with an electronic device (such as a mobile phone).
  • the smart glasses include a microphone array, and the microphone array includes at least one directional microphone.
  • the number of directional microphones in the microphone array can be flexibly set according to actual application requirements. For example, if sound signals from multiple directions need to be collected, multiple directional microphones can be set in a wearable device. When there are at least two directional microphones in the microphone array, the sound pickup performance of the microphone can be improved by further diversifying the acquired sound signals, thereby improving the overall performance of the wearable device and improving the user experience.
  • the number of directional microphones in the microphone array can be set according to actual application requirements, and this application does not impose any limitation on this.
  • the pickup beam directions of at least one directional microphone in the microphone array are orthogonal to each other.
  • the mutually orthogonal pickup beam directions of the directional microphones mean that the pickup directions corresponding to the directional microphones in the microphone array are perpendicular to each other.
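  • one useful property of orthogonal pickup axes (a sketch of a standard technique, not the application's prescribed method): two coincident figure-8 capsules with perpendicular axes register a source at angle theta as s*cos(theta) and s*sin(theta), so a weighted sum yields a virtual figure-8 beam aimed at any angle in their plane.

```python
import numpy as np

def steer_virtual_dipole(dipole_x, dipole_y, steer_angle_rad):
    """Combine two orthogonal figure-8 signals into a virtual figure-8
    beam whose axis points at steer_angle_rad; the combined pattern is
    cos(theta - steer_angle_rad). Assumes coincident, time-aligned
    capsules with perpendicular pickup axes.
    """
    return (np.cos(steer_angle_rad) * dipole_x
            + np.sin(steer_angle_rad) * dipole_y)
```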
  • the pickup direction of the directional microphones in the microphone array can be directed to a preset sound source position.
  • the pickup direction of the directional microphone in the microphone array on the smart glasses can be pointed in the direction of the mouth of the person wearing the smart glasses.
  • the pickup direction of the directional microphone in the microphone array of a hearing aid can be pointed in other directions, to better pick up the sound signals of the people in conversation with the wearer of the hearing aid.
  • Different wearable devices may have different preset sound source locations, and this application does not impose any limitations on this.
  • the directional microphone may be a figure-8 microphone.
  • Figure 4 shows a schematic diagram of the sensitivity of a figure-8 directional microphone to sound signals.
  • the figure-8 microphone is also called a bidirectional microphone; it is mainly sensitive to sound signals arriving from two opposite directions at the same time.
  • in this way, the utilization rate of the figure-8 microphone can be fully exploited, the production and R&D costs of the wearable device can be reduced, and the manufacturing efficiency of the wearable device can be increased.
  • the microphone array may also include omnidirectional microphones.
  • Figure 5 shows a schematic diagram of the sensitivity of an omnidirectional microphone to sound signals.
  • the omnidirectional microphone has the same sensitivity to sound signals from all angles, as shown by the bold line in Figure 5.
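  • the two sensitivity patterns in Figures 4 and 5 correspond to the standard first-order polar responses; a small sketch for comparing them numerically:

```python
import numpy as np

def mic_sensitivity(theta_rad, pattern="figure8"):
    """Relative sensitivity to a source at angle theta from the mic axis:
    an omnidirectional capsule responds equally at every angle, while a
    figure-8 capsule responds with |cos(theta)| - maximal on its axis
    (0 and 180 degrees) and zero from the sides (90 and 270 degrees).
    """
    theta = np.asarray(theta_rad, dtype=float)
    if pattern == "omni":
        return np.ones_like(theta)
    if pattern == "figure8":
        return np.abs(np.cos(theta))
    raise ValueError(f"unknown pattern: {pattern}")
```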
  • a microphone array including both an omnidirectional microphone and a directional microphone can pick up sound evenly from all directions through the omnidirectional microphone to obtain rich, wide-ranging audio signals or noise, depending on the practical application.
  • the audio signal or noise obtained by the omnidirectional microphone can be used to de-noise and enhance the audio signal collected by the directional microphone to improve the sound pickup quality of the directional microphone and further improve the sound pickup performance of the wearable device.
  • in some embodiments, the smart glasses may also include a speaker and a processor as shown in Figure 6. The speaker is a device placed close to the wearer's left/right ear that can play sound independently.
  • the speakers can be respectively set in the temples on both sides of the smart glasses and are used to play sound to the wearer's ears.
  • the speaker may be an external speaker, such as a loudspeaker or a stereo, or it may be a speaker that plays close to the human ear.
  • the processor is used to process the sound signal, or distribute the sound signal collected by the microphone array to the processor of the electronic device, so that the processor of the electronic device processes the sound signal.
  • smart glasses may also include a communication module and a control interface.
  • the communication module is used to implement communication between the smart glasses and other electronic devices, and the control interface is used to control the smart glasses.
  • the electronic device is also called the master control device. After the master control device and the smart glasses are successfully connected, the smart glasses can be controlled. The processor of the master control device can be used to process the sound signal sent by the processor of the smart glasses, and the communication module of the master control device can communicate interactively with the smart glasses through the communication module of the smart glasses.
  • the control interface of the smart glasses and/or the master control device can receive externally input control commands, so as to control the smart glasses and/or the master control device through the received control commands.
  • the method of receiving control commands includes but is not limited to physical buttons on the smart glasses or the main control device, or touch gestures, air gestures, etc. on the smart glasses or the main control device.
  • the control command for volume adjustment can be received through the physical buttons on the smart glasses, or the control command for volume adjustment can be received through touch gestures received by the master control device (such as a mobile phone).
  • in some embodiments, a posture measurement unit is also provided in the wearable device.
  • the posture measurement unit is used to track the wearer's posture changes after the device is put on and to distribute the tracking data to the processor.
  • the relative position or direction between the wearable device and the user will change with the movement of the user's head/wrist. For example, if the wearer wears smart glasses and person A is located directly in front of the wearer, then as long as the positions of both remain unchanged, the sound signal of A directly in front of the wearer can be enhanced and collected correctly.
  • therefore, the posture measurement unit in the wearable device can be used to obtain the wearer's change relative to the initialized position and to monitor changes in the wearer's posture, so as to adaptively adjust the direction of the picked-up sound signal following the movement of the user's head/wrist and achieve real-time tracking of the sound signal.
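  • a simplified, azimuth-only sketch of this compensation (the actual tracking in the application may use full head pose): subtracting the head rotation reported by the posture measurement unit keeps the beam locked on a world-fixed talker.

```python
def adjust_pickup_direction(target_azimuth_world_deg, head_yaw_deg):
    """Return the pickup azimuth in the device frame that keeps the beam
    aimed at a fixed talker while the wearer's head has turned by
    head_yaw_deg from its initialized orientation.
    Angles in degrees; result normalized to [0, 360).
    """
    return (target_azimuth_world_deg - head_yaw_deg) % 360.0
```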
  • in some embodiments, the smart glasses may operate independently of the master control device and use their own functional modules to implement remote calls, assistive hearing enhancement, and other functions that would otherwise require the master control device.
  • the functions implemented are not limited in this application.
  • the following takes microphone arrays with 1, 2, 3, 4, 6 and 9 directional microphones as examples to illustrate the sound signal beams formed by different numbers of directional microphones in wearable devices. It should be noted that the following schematic diagrams only show some of the directional microphones forming sound signal beams in the smart glasses; according to different actual needs, the number of directional microphones, the specific types of directional microphones, and the specific installation positions of the directional microphones in the smart glasses may change, which is not limited in this application.
  • a figure-8 microphone can be installed in the smart glasses.
  • the microphone can be installed in the frame or temples on one side of the smart glasses.
  • the microphone can form a sound signal beam pointed in the direction of the wearer's mouth, so that the microphone can receive the sound signal from the direction of the wearer's mouth.
  • the microphone can also be placed in the middle of the smart glasses frame, where the middle area refers to the bridge of the nose and/or the nose pads of the smart glasses frame; the microphone can form a sound signal beam pointing in the direction of the wearer's mouth.
  • the microphone can also be placed in the frame or temple on the other side of the smart glasses.
  • the direction of the sound signal beam formed is also directed to the human mouth.
  • the microphone array of the smart glasses includes two directional microphones, and both are figure-8 microphones.
  • one of the two figure-8 microphones is located in the middle of the frame of the smart glasses, and its sound signal beam points in the direction of the wearer's mouth; the remaining microphone is also located in the middle of the frame and likewise forms a sound signal beam pointing in the direction of the wearer's mouth.
  • in another example, one of the two figure-8 microphones is located in the middle of the frame of the smart glasses, forming a sound signal beam pointing in the direction of the wearer's mouth; the other microphone, corresponding to the installation direction of one of the microphones shown in Figure 10, is installed in the frame or temple on the other side of the smart glasses and also forms a sound signal beam pointing in the direction of the wearer's mouth.
  • in another example, the two figure-8 microphones are respectively arranged in the frames or temples on both sides of the smart glasses, and both microphones can form sound signal beams directed toward the wearer's mouth.
  • referring to Figure 13, a schematic diagram of the beam formed by the microphone array structure in the third type of smart glasses provided by the embodiment of the present application is shown.
  • the microphone array of the smart glasses is provided with three figure-8 microphones, including one microphone set in the middle of the frame whose sound signal beam is directed toward the wearer's mouth.
  • the remaining two microphones are respectively set in the frames or temples on both sides of the smart glasses, and the sound signal beams they form also point in the direction of the wearer's mouth.
  • the sound signal beams formed by the three directional microphones can have multiple shapes.
  • for example, the positions of the other two microphones on the frame of the smart glasses can be changed while the position of the microphone set in the middle remains unchanged.
  • Figure 14 is another schematic diagram of three figure-8 microphones set up in the microphone array of smart glasses to form beams. Comparing Figure 13 and Figure 14, it is easy to see that after the positions of the two microphones are changed, the directions of the sound signal beams formed by the three microphones are symmetrical to those formed in Figure 13, which has little impact on the actual collection of the wearer's sound signal.
  • the microphone array of the smart glasses can include four directional microphones, one of which is an omnidirectional microphone and three of which are figure-8 microphones.
  • the above four microphones are located in the middle of the smart glasses frame.
  • the middle area includes the nose bridge and/or nose pads of the smart glasses frame.
  • the directions of the pickup beams formed by the above three figure-8 microphones are orthogonal to each other. For example, the pickup directions formed by the three figure-8 microphones are respectively perpendicular to the frame of the smart glasses, parallel to the frame of the smart glasses, and directed toward the wearer's mouth.
  • the microphone array in the smart glasses may include 6 directional microphones, 2 of which are omnidirectional microphones and 4 of which are figure-8 microphones.
  • the omnidirectional microphones can be set on the frame or temples near the hinges of the smart glasses; two of the figure-8 microphones are located in the middle of the smart glasses frame, and the pickup directions they form are respectively perpendicular to and parallel to the lens plane of the smart glasses.
  • the other two figure-8 microphones are respectively located on the frame or temples near the hinges of the smart glasses, and the pickup directions they form point toward the mouth of the person wearing the smart glasses.
  • in another example, the microphone array of the smart glasses also includes 6 directional microphones, of which 2 are omnidirectional microphones and the other 4 are figure-8 microphones.
  • the two omnidirectional microphones are located on the frames or temples on both sides of the hinges of the smart glasses.
  • two of the four figure-8 microphones are located on the frame or temple on one side of the smart glasses, next to one of the omnidirectional microphones; the pickup directions formed by these two microphones point respectively toward the wearer's mouth and parallel to the smart glasses frame.
  • the other two figure-8 microphones are located on the frame or temple at the other end of the hinge of the smart glasses, next to the other omnidirectional microphone; the pickup directions they form likewise correspond to the direction of the wearer's mouth and the direction parallel to the smart glasses frame.
  • the microphone array of the smart glasses may include 9 directional microphones, of which 2 are omnidirectional microphones and 7 are figure-8 microphones. The 2 omnidirectional microphones are respectively set on the frames or temples on both sides of the smart glasses. One of the 7 figure-8 microphones is set in the middle of the frame of the smart glasses, and the sound signal beam it forms points in the direction of the wearer's mouth; three figure-8 microphones are set on the frame or temples on each side of the smart glasses.
  • on each side, the sound signal beams formed by the three figure-8 microphones are orthogonal to each other.
  • it should be noted that when the number of directional microphones in the microphone array is two or more, the installation locations of multiple microphones of the same type are not restricted when they are actually deployed in the smart glasses. When the microphone array contains only one directional microphone, that microphone can be installed in various positions in the smart glasses.
  • the omnidirectional microphone is located in the nose bridge or nose pad of the smart glasses frame.
  • the two omnidirectional microphones can be located in the two temples of the smart glasses respectively; or, the two omnidirectional microphones can be located on both sides of the frame of the smart glasses close to the two temples.
  • when the microphone array includes multiple omnidirectional microphones, the multiple omnidirectional microphones are distributed in the middle area and on both sides of the smart glasses.
  • the middle area includes the nose bridge and/or nose pads of the smart glasses frame; the side areas include the two temples of the smart glasses and/or the sides of the frame close to the two temples.
  • for example, when the microphone array includes three omnidirectional microphones, two of the three omnidirectional microphones can be located in the frame of the smart glasses near the two temples, or the two microphones can be located on the two temples of the smart glasses; the remaining microphone is located on the nose bridge or nose pads of the smart glasses frame.
  • the audio signal or noise obtained by the omnidirectional microphone can be used to de-noise and enhance the audio signal collected by the directional microphone, so as to improve the sound pickup quality of the directional microphone and further improve the sound pickup performance of smart glasses.
  • in one example, when the wearable device detects the target sound pickup direction, the wearable device turns on the microphones in the microphone array pointing in the target sound pickup direction and turns off the microphones in the microphone array that are not pointing in the target pickup direction.
  • the preset condition may be that the signal quality of the sound signal picked up by the first directional microphone within a preset time period is greater than that of other directional microphones.
  • the preset conditions can be set according to different actual application requirements; this application does not impose any limitation on this.
  • the signal quality parameters of the sound signal include, but are not limited to, the loudness of the sound signal and the signal-to-noise ratio of the sound signal.
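  • as a rough sketch of these two parameters (assumed formulas, not the application's definitions): loudness as an RMS level and SNR against a separately estimated noise power, e.g. one obtained from an omnidirectional microphone during speech pauses.

```python
import numpy as np

def signal_quality(x, noise_power):
    """Return (loudness_db, snr_db) for one frame of a picked-up signal.

    x:           1-D array of samples scaled to [-1, 1]
    noise_power: separately estimated mean noise power
    """
    power = np.mean(x ** 2)
    loudness_db = 10 * np.log10(power + 1e-12)            # RMS level, dBFS
    snr_db = 10 * np.log10(power / (noise_power + 1e-12))  # signal vs noise
    return loudness_db, snr_db
```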
  • directional microphones can also be installed in other sound pickup devices, such as headphones, smart helmets and other devices with sound pickup functions. This application does not have any limitation on this.
  • Embodiments of the present application also provide a sound pickup method, which can enhance the original sound signal in a specific direction by flexibly adjusting the sound pickup direction. This improves the intelligibility, sound quality and clarity of sound signals in specific directions.
  • the following is an exemplary description of the sound pickup method provided by the embodiment of the present application in conjunction with several possible scenarios.
  • Scenario 1 This sound pickup method can be applied to electronic equipment, and the electronic equipment picks up sound independently.
  • the electronic device may also be called a terminal device or a mobile device, or a terminal.
  • the electronic device is a device with pickup function and interface display, including but not limited to handheld devices, vehicle-mounted devices, computing devices, or other devices equipped with directional microphones.
  • the electronic device may include a mobile phone, a personal digital assistant (PDA), a tablet computer, a vehicle-mounted computer, a laptop computer, a smart screen, an ultra-mobile personal computer (UMPC), a wearable device, or other electronic devices with sound pickup and display functions.
  • Figure 19 is a schematic flowchart of a sound pickup method provided by an embodiment of the present application.
  • the sound pickup method includes the following steps:
  • the first operation may be a click operation, a touch operation, or a sliding operation input by the user on the display screen of the electronic device; it may also be a control operation input by the user through physical buttons on the electronic device; it may also be It is an air gesture detected by the user through the camera or other sensors of the electronic device.
  • a "voice pickup setting” button is displayed on the setting page of the electronic device or on the desktop.
  • a "Voice Pickup Settings” button is displayed on the desktop of the electronic device. After the user clicks the button, the screen display system of the electronic device directly displays the first interface for default setting. Pickup direction settings.
  • the electronic device can also display a sound pickup scene setting interface, which is used to set the scenes in which the first interface is directly launched for sound pickup settings, for example, whether to activate the sound pickup setting in the scene of answering an incoming call, in the scene of turning on recording, or in a hands-free (also called amplification or external speaker) scene, etc.
  • the screen display system of the electronic device automatically displays the first interface, where the triggering of the corresponding scene is the first operation that the electronic device responds to.
  • the sound pickup setting scenes that can be set include but are not limited to recording scenes, call scenes, video recording scenes and conference scenes, where the call scene can be a voice call scene or a video call scene.
  • the scenario can also be a conference call scenario.
  • when the electronic device detects that the user clicks the recording button and starts recording, it can jump directly to the first interface. For example, as shown in (a) in Figure 21, on the call interface, after the user clicks the recording function button, the screen display system of the electronic device enters the first interface; or, when the user clicks the recording function button displayed on the electronic device, the screen display system of the electronic device enters the first interface.
  • the recording interface is displayed, and a pickup enhancement button is displayed in the recording interface.
  • the user can click the sound pickup enhancement button; after the electronic device detects that the user has clicked the sound pickup enhancement button, the screen display system of the electronic device jumps to the first interface, and recording is then started based on the recording start operation to achieve enhanced processing of the sound signals in local recordings.
  • the screen display system of the electronic device enters the first interface.
  • referring to Figure 23, after an incoming call (also called a call) arrives, the electronic device displays the interface shown in Figure 23(a). The user can perform the sliding operation shown in Figure 23(b) on the screen display system of the electronic device to connect the incoming call. After the incoming call is connected, the first interface is directly displayed on the screen display system of the electronic device.
• for a video recording scenario, the first interface is directly displayed on the screen display system of the electronic device; or, after the user directly clicks the video recording function button corresponding to the video recording application displayed on the desktop of the electronic device, the screen display system of the electronic device enters the first interface.
  • a "sound pickup enhancement" configuration button is displayed in the conference interface as shown in Figure 26. After the user clicks the configuration button, the screen display system of the electronic device enters the first interface; or, in The user directly clicks the conference function button, and after starting the conference function, the first interface is directly displayed on the screen display system of the electronic device.
  • Figure 27 is a schematic diagram of a first interface provided by an embodiment of the present application.
• the first interface may include a first switch button 2701 for enhancing the wearer's sound signal, a manual add button 2702 for assistive hearing enhancement (and/or a positioning button 2703 on the slide bar), a sound signal direction display diagram 2704, and click buttons 2705 for switching between different viewing angles, etc.
  • the first switch button 2701 is used to turn on or off the enhancement of the wearer's sound signal;
• the manual add button 2702 for assistive hearing enhancement (or the positioning button 2703 on the slide bar) is used to add or remove the sound signal to be enhanced and the corresponding direction information of the sound signal;
  • the sound signal direction display diagram 2704 is used to display the simulated sound pickup environment, including the wearer's head and the sound pickup environment centered on the wearer's head.
• the click buttons 2705 for different viewing angles can be used to switch the sound signal direction display diagram 2704 to show the wearer from different angles.
• display content can be added to the first interface beyond the above example, or part of the display content in the above example can be removed; this application does not place any limit on the content displayed in the first interface.
• when the electronic device is a device with a display screen, such as a smartphone, a smart watch, or a tablet computer, the electronic device can respond to the first operation by displaying, on its display screen, the first interface for configuring the sound pickup direction.
• when the electronic device is, for example, an augmented reality device, a virtual reality device, or another device that displays images by screen casting, projection, or similar means, the first interface may be displayed in response to an air gesture.
  • the target pickup direction is used to enhance the original sound signal in the specified direction.
• the following describes how to determine the target pickup direction with reference to the first interface shown in Figure 27.
  • the electronic device can turn on or off the enhanced sound signal of the wearer in response to the user's click or sliding operation on the first switch button 2701 based on the first interface.
  • the state shown by the first switch button 2701 indicates turning on the enhanced wearer's voice signal
  • the state shown by the second switch button 2706 in Figure 28 indicates turning off the enhanced wearer's voice signal
• the first switch button 2701 and the second switch button 2706 may be the same switch button.
  • the user can add the target sound pickup direction through the manual add button 2702 as shown in Figure 27 or the positioning button 2703 on the slide bar.
• based on the sound signal direction display diagram 2704, the user can also switch the viewing angle of the wearer through a first gesture and then add or remove the direction of the sound signal to be enhanced through a second gesture; or the user can click the buttons 2705 to switch the sound signal direction display diagram 2704 and then add or remove the direction of the sound signal to be enhanced through the second gesture.
  • the first gesture may be a rotation gesture as shown in A in Figure 29; the second gesture may be a long press gesture as shown in E in Figure 29.
  • the above-mentioned first gesture and second gesture may be the same or different.
• the first gesture and/or the second gesture may be any possible gesture shown in A-Z1 in Figure 29, which are not enumerated one by one here.
  • the target sound pickup direction can also be determined through air gestures or other control commands, which is not limited in this application.
• there may be one or more target pickup directions.
  • the target pickup direction may include the direction of the wearer's voice, and one other direction set through assistive listening enhancement.
• the electronic device acquires the original sound signals in the environment through a built-in microphone array, where the microphone array may include at least one directional microphone, or at least one directional microphone and at least one omnidirectional microphone; in different application scenarios, the microphone array may also include at least one omnidirectional microphone.
• the electronic device can, according to the target sound pickup direction, turn on the directional microphones pointing in the target sound pickup direction, turn off the directional microphones not pointing in the target sound pickup direction, and collect the original sound signal using only the turned-on directional microphones pointing in the target sound pickup direction. This can not only save the power of the electronic device, improve the user experience, and extend the service life of the smart glasses, but also, by turning on the microphones pointing in the target pickup direction and turning off the other microphones, avoid as much as possible the microphones picking up noise from directions other than the target pickup direction, thereby enhancing the microphones' pickup effect.
  • the open or closed state of each directional microphone can be used to further achieve different types of sound pickup effects to improve the performance of electronic devices.
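• As a minimal, illustrative sketch of this on/off selection logic (the vector representation of microphone axes and the 90° threshold below are assumptions for illustration, not values from the application), each directional microphone can be kept on only when its pointing axis lies close enough to the target pickup direction:

    import numpy as np

    def select_microphones(mic_axes, target_dir, max_angle_deg=90.0):
        """Return a boolean on/off flag for each directional microphone.

        mic_axes:      (M, 3) unit vectors, the pointing axis of each microphone.
        target_dir:    (3,) unit vector of the target pickup direction.
        max_angle_deg: microphones whose axis deviates more than this from the
                       target direction are switched off (threshold is an assumption).
        """
        cos_angles = mic_axes @ target_dir   # cosine of angle between each axis and the target
        return cos_angles >= np.cos(np.deg2rad(max_angle_deg))

    # Hypothetical 4-microphone array pointing front, back, left, right:
    mic_axes = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]], dtype=float)
    target = np.array([1.0, 0.0, 0.0])              # target straight ahead of the wearer
    print(select_microphones(mic_axes, target))     # [ True False  True  True]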
  • S2104 Perform enhancement processing on the original sound signal according to the target sound pickup direction to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal.
  • the enhanced sound signal after the above enhanced processing can be used for playback, storage, and forwarding to other devices.
• for example, for hearing aid scenarios, the enhanced sound signal can be used for playback to better help people wearing hearing aids listen to the sound signal. For recording scenarios, the enhanced sound signal can be stored so that the user can listen to it repeatedly later. For call scenarios, the enhanced sound signal can be sent to the device at the other end of the call. For video recording scenarios, the enhanced sound signal can replace the original sound signal in the recorded video, so that the user hears the enhanced sound signal when viewing the recorded video later, improving the user experience. For conference scenarios, the enhanced sound signal can be sent to the conference equipment, allowing for better communication, and so on. The enhanced sound signal thus has different uses in different practical application scenarios, and this application does not impose any limitations on this.
• the enhancement processing of the first sound signal located in the target sound pickup direction in the original sound signal includes increasing the sound intensity of the sound signal and/or noise reduction processing, so as to improve the intelligibility, sound quality, and clarity of the sound signal in the specific direction.
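• A minimal sketch of such enhancement, assuming an STFT-domain input, a fixed 6 dB gain, and a noise estimate taken from the first few frames (all illustrative choices, not values fixed by the application):

    import numpy as np

    def enhance(first_sig, gain_db=6.0, noise_frames=10):
        """Raise the level of the first sound signal and apply magnitude
        spectral subtraction against a noise estimate from the first frames.

        first_sig: (F, T) complex STFT of the signal in the target direction.
        """
        noise_mag = np.abs(first_sig[:, :noise_frames]).mean(axis=1, keepdims=True)
        mag = np.maximum(np.abs(first_sig) - noise_mag, 0.05 * np.abs(first_sig))
        cleaned = mag * np.exp(1j * np.angle(first_sig))   # keep the original phase
        return cleaned * 10 ** (gain_db / 20.0)            # increase the sound intensity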
  • Figure 30 is a schematic flow chart of a sound signal noise reduction extraction process provided by an embodiment of the present application.
• the sound signal noise reduction extraction process refers to performing noise reduction and extraction on the original sound signal according to the target pickup direction, to obtain a sound signal with more of the noise filtered out.
• the noise reduction process includes the following steps. Step 1: obtain the original sound signal based on the microphone array.
• Step 2: according to the target direction, convert the acquired sound signal into a steering-vector sound signal.
• methods of converting the acquired sound signal into a steering-vector sound signal include, but are not limited to, using a beamformer and a generalized sidelobe canceller (GSC) to process the acquired sound signal to obtain a steering-vector sound signal guided by the target direction, or using blind source separation (BSS) technology to process the acquired sound signal in combination with the target direction to obtain a steering-vector sound signal corresponding to the target direction.
  • this step is essentially to preprocess the sound signals collected by the directional microphone to separate different sound signals from multiple sources, eliminate noise outside the target direction, and achieve the purpose of suppressing noise while extracting the target sound signal.
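• For orientation, the following sketch shows the simplest form of such preprocessing: a frequency-domain delay-and-sum beamformer steered at the target direction. The GSC and BSS variants named above are more elaborate; the far-field model and array geometry here are assumptions:

    import numpy as np

    C = 343.0  # speed of sound in m/s (assumption: air at roughly 20 °C)

    def steering_vector(mic_pos, target_dir, freq):
        """Far-field steering vector for one frequency bin.

        mic_pos:    (M, 3) microphone positions in metres (assumed geometry).
        target_dir: (3,) unit vector pointing toward the target.
        freq:       frequency in Hz.
        """
        delays = mic_pos @ target_dir / C              # per-microphone delay in seconds
        return np.exp(-2j * np.pi * freq * delays)     # phase-alignment terms

    def delay_and_sum(X, mic_pos, target_dir, freqs):
        """Phase-align each channel toward the target and average.

        X:     (M, F, T) complex STFT with M channels, F bins, T frames.
        freqs: length-F array of bin centre frequencies in Hz.
        Returns a (F, T) single-channel beamformed STFT.
        """
        M = X.shape[0]
        out = np.zeros(X.shape[1:], dtype=complex)
        for f, freq in enumerate(freqs):
            a = steering_vector(mic_pos, target_dir, freq)
            out[f] = (np.conj(a) @ X[:, f, :]) / M     # align, then average channels
        return out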
• Step 3: suppress diffuse field noise.
  • the diffuse field refers to a sound field in which the energy density of the sound signal is uniform and randomly distributed in various propagation directions.
  • Diffused field noise refers to sound signals from all directions in the entire sound field, such as sound signals from air conditioning, refrigeration or heating.
• the diffuse field noise suppression can be performed on the steering-vector sound signal (or on the sound signal collected by the directional microphones) according to the energy relationship of the sound signals from the different channels.
• taking an acoustic vector sensor (AVS) directional microphone array as an example, whether a sound signal is direct sound or diffuse field noise can be determined based on the energy relationship of the sound signal arriving at each channel in the same AVS.
• the ideal diffuse field refers to a sound field in which the energy of the sound signals arriving from all directions of the space is the same, while the sound signals are mutually uncorrelated.
• in the formula, X_w represents the sound signal collected by the omnidirectional (full) channel, and X_x, X_y, and X_z represent the sound signals collected by the x, y, and z axis channels, respectively.
• for example, the microphone array of the smart glasses includes 4 co-point directional microphones, where 1 of the 4 microphones is an omnidirectional microphone and 3 of the 4 microphones are figure-8 microphones, and the sound signal beam directions formed by the three figure-8 microphones are orthogonal to each other.
  • Whether the sound signal belongs to a point sound source can be determined through the above formula (4).
  • mapping conversion method includes but is not limited to Gaussian distribution or uniform distribution.
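• One plausible realisation of this energy-relationship test (the exact statistic and the mapping distribution are not fixed by the application) is a soft mask built from the coherence between the pressure channel and the acoustic intensity vector, which is high for direct sound and low for diffuse noise:

    import numpy as np

    def directness_mask(Xw, Xx, Xy, Xz, floor=0.1):
        """Soft mask in [floor, 1] that favours direct (point-source) sound.

        For a single plane wave the pressure channel Xw and the intensity
        vector Re{conj(Xw) * [Xx, Xy, Xz]} are strongly coherent; in an
        ideal diffuse field the intensity vector averages toward zero.
        All inputs are (F, T) complex STFTs of the AVS channels.
        """
        ix = np.real(np.conj(Xw) * Xx)
        iy = np.real(np.conj(Xw) * Xy)
        iz = np.real(np.conj(Xw) * Xz)
        intensity = np.sqrt(ix**2 + iy**2 + iz**2)
        energy = np.abs(Xw) * np.sqrt(np.abs(Xx)**2 + np.abs(Xy)**2 + np.abs(Xz)**2) + 1e-12
        coherence = intensity / energy         # ~1 for direct sound, ~0 for diffuse noise
        return np.clip(coherence, floor, 1.0)  # apply as Xw_clean = mask * Xw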
• Step 4: perform nonlinear beam processing to achieve directional collection of sound signals and suppress interference from sound signals in directions other than the target direction.
  • Methods such as azimuth estimation or spatial clustering estimation of sound signals can be used for nonlinear beam processing.
  • the method of azimuth estimation essentially uses the sound intensity vector collected by AVS to calculate the arrival direction of each time-frequency point to estimate the direction of the sound signal, so as to filter out sound signals that do not meet the target direction.
  • the sound intensity vector collected by each AVS can be expressed by the following formula (5).
• where (f,n) represents the time-frequency point with frequency bin f and frame number n, X_w represents the sound signal collected by the full (omnidirectional) channel, and X_x, X_y, and X_z represent the sound signals collected by the x, y, and z axis channels, respectively.
  • the orientation corresponding to this time-frequency point is determined by the following formula (6).
• R(·) denotes taking the real part.
• if the orientation of the time-frequency point is consistent with the target direction, the sound signal corresponding to that time-frequency point is the target sound signal (or has a high probability of being the target sound signal), so the corresponding filter coefficient can be set to 1, and the sound signal at that time-frequency point is retained by the filter and participates in filtering; otherwise, for example, if the orientation of the time-frequency point computed by formula (6) differs from the target direction by 180°, the orientation of the time-frequency point can be considered inconsistent with the target direction, i.e., the sound signal corresponding to that time-frequency point is more likely to be noise, so the filter coefficient mapped to that time-frequency point can be set to 0 to filter out the sound signal.
  • the comparison result between the orientation of the time-frequency point and the target direction and the corresponding coefficients mapped to the filter and other parameters can be set according to the actual application situation, which is not limited in this application.
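• A compact sketch of this azimuth-estimation filter in the spirit of formulas (5) and (6); the 30° tolerance band is an assumption, since the text only fixes the extreme cases "consistent" (coefficient 1) and "180° away" (coefficient 0):

    import numpy as np

    def tf_azimuth(Xw, Xx, Xy):
        """Azimuth per time-frequency point from the active sound intensity,
        i.e. the real part of the pressure/velocity cross-spectra.
        Inputs are (F, T) complex STFTs; the result is in radians."""
        ix = np.real(np.conj(Xw) * Xx)
        iy = np.real(np.conj(Xw) * Xy)
        return np.arctan2(iy, ix)

    def directional_mask(theta, target_theta, width_deg=30.0):
        """Binary filter coefficients: 1 where a TF point's azimuth is within
        width_deg of the target direction, 0 otherwise."""
        diff = np.angle(np.exp(1j * (theta - target_theta)))   # wrap to [-pi, pi]
        return (np.abs(diff) <= np.deg2rad(width_deg)).astype(float)

    # usage: Y = directional_mask(tf_azimuth(Xw, Xx, Xy), target_theta) * Xw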
• the spatial clustering estimation method uses the orientation information of the sound signals to simulate the pickup environment as a sphere (i.e., the pickup environment simulation ball shown in Figure 31), calculates the spatial characteristics of the sound signals (for example, the distance of each sound signal from the sphere), and filters out the sound signals that are not in the target direction, thereby extracting the sound signals in the target direction.
  • Figure 31 is a schematic diagram of spatial feature clustering of sound signals provided by the embodiment of the present application.
• a sound pickup environment simulation ball is used to simulate the sound pickup environment, and the points on the spherical surface of the simulation ball correspond to the several sound signals mapped onto the sphere.
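• One simple reading of this construction (the angular cap radius is an assumption, and a real implementation might instead learn the clusters, e.g. with k-means) maps each time-frequency point onto the unit sphere via its intensity direction and keeps only the points near the target:

    import numpy as np

    def cluster_on_sphere(Xw, Xx, Xy, Xz, target_dir, angle_deg=25.0):
        """Return a (F, T) mask keeping TF points whose mapped position on
        the simulation sphere lies within angle_deg of the target direction.
        Inputs are (F, T) complex AVS STFTs and a (3,) unit target vector."""
        v = np.stack([np.real(np.conj(Xw) * Xx),
                      np.real(np.conj(Xw) * Xy),
                      np.real(np.conj(Xw) * Xz)])                   # (3, F, T) intensity
        v = v / (np.linalg.norm(v, axis=0, keepdims=True) + 1e-12)  # project to sphere
        cos_dist = np.tensordot(target_dir, v, axes=1)              # closeness to target
        return (cos_dist >= np.cos(np.deg2rad(angle_deg))).astype(float)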
  • the output sound signal can be further processed according to the number of directional microphones in the electronic device.
• when the microphone array of the electronic device includes one directional microphone, the target sound signal can be extracted by performing steering-vector conversion, diffuse field noise suppression, and nonlinear beam processing on the sound signal according to the noise reduction process shown in Figure 30.
• when two or more directional microphones are installed in the electronic device, refer to the noise reduction process shown in Figure 32: correlation processing is performed on the sound signals after steering-vector conversion, diffuse field noise suppression, and nonlinear beam processing. The correlation processing compares the similarities between the multiple sound signals obtained, so as to determine from them the sound signal to be output. A post-filter is then set to further process the sound signal.
• in this way, a more accurate target sound signal can be extracted by obtaining the sound signals from the directional microphones in the microphone array and performing steering-vector conversion, diffuse field noise suppression, nonlinear beam processing, and post-filter processing on them.
• in this way, the sound signal obtained by the directional microphone array can be processed, and a sound signal in which the diffuse field noise and the noise from other, non-target directions are suppressed can be extracted.
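• A minimal sketch of the correlation processing step for the multi-microphone case; the zero-lag normalised correlation used as the similarity measure is an assumption, as the application does not fix the metric:

    import numpy as np

    def pick_by_correlation(candidates):
        """Given K >= 2 enhanced signals of shape (K, N) produced from
        different directional microphones, return the one most consistent
        with the others, scored by mean normalised correlation at zero lag."""
        K = candidates.shape[0]
        z = candidates - candidates.mean(axis=1, keepdims=True)
        z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
        corr = z @ z.T                               # (K, K) pairwise correlations
        scores = (corr.sum(axis=1) - 1.0) / (K - 1)  # mean correlation with the others
        return candidates[int(np.argmax(scores))]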
• in addition, methods such as voice activity detection (VAD) or speech presence probability (SPP) can also be used to identify and eliminate silent segments in the collected sound signals, so as to speed up sound signal pickup and increase the pickup rate.
• for example, the obtained sound signal is processed using VAD or SPP, and the silent segments in the sound signal obtained directly from the directional microphones are eliminated to speed up the extraction of the sound signals.
• alternatively, VAD or SPP can be applied to the processed sound signal before the extracted sound signal is finally output.
  • the step of using VAD or SPP to eliminate the silent sound signal can be flexibly adjusted according to different sound signal extraction methods or different adaptation scenarios, and this application does not impose any limitations on this.
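• For illustration, a crude frame-energy VAD of the kind that could gate silent segments before the heavier directional processing; the -40 dBFS threshold and 20 ms frames are assumptions, and a production system would use a trained VAD or an SPP estimator:

    import numpy as np

    def energy_vad(x, fs, frame_ms=20, threshold_db=-40.0):
        """Drop frames whose RMS level falls below threshold_db.

        x:  1-D float audio in [-1, 1]; fs: sample rate in Hz.
        Returns the concatenated non-silent audio."""
        n = int(fs * frame_ms / 1000)
        frames = x[: len(x) // n * n].reshape(-1, n)
        rms_db = 20 * np.log10(np.sqrt((frames**2).mean(axis=1)) + 1e-12)
        keep = rms_db > threshold_db        # True where speech is likely present
        return frames[keep].ravel()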
• in addition, at least one of methods such as beamformers and generalized sidelobe cancellers can also be used to perform noise reduction processing on the original sound signal acquired by the directional microphone array to obtain a noise-reduced sound signal; this application does not impose any limitation on this.
• Figure 34 is a schematic diagram provided by an embodiment of the present application comparing the effects of two methods of extracting the wearer's sound signal in the same noise environment. (1) in Figure 34 shows the effect of extracting the wearer's sound signal in the noisy environment using an existing method, and (2) in Figure 34 shows the effect of extracting the wearer's sound signal in the noisy environment using the noise reduction method provided by this application. It should be noted that in the above sound signal extraction effect diagrams, the abscissa represents time (not shown in the figure), the ordinate represents frequency, and the bright colors represent the intensity of the sound signal energy at each time-frequency point.
• in the embodiment of this application, the target direction is first used as the guide, and the sound signal acquired by the microphone array is converted into a steering-vector signal, thereby separating from the multi-channel sound signal collected by the directional microphones the sound signal close to the target direction and laying the foundation for the subsequent processing of the sound signal.
• then diffuse field noise suppression is performed on the sound signal, filtering out the diffuse field noise coming from all directions of the space, so that the sound signal is clearer after the diffuse field noise is suppressed.
  • the sound signal is further processed through nonlinear filtering to suppress the sound signals in other directions except the target direction, thereby achieving directional collection of sound signals.
  • VAD/SPP processing is performed on the sound signal obtained by the directional microphone, which can speed up the processing speed of noise reduction of the sound signal.
• finally, post-filtering and correlation processing are performed on the processed sound signal to further filter out the residual noise in it, ensuring the sound quality of the final sound signal and further improving the pickup signal-to-noise ratio.
• after the above processing, an enhanced sound signal is obtained, and spatial rendering processing can also be performed on the enhanced sound signal. The sound signal after spatial rendering contains the position information of the sound signal, so that the user can clearly distinguish the position of the sound through both ears.
• methods to achieve the spatial rendering effect include, but are not limited to, interaural time difference (ITD) or interaural level difference (ILD) methods.
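• A toy binaural renderer illustrating the ITD/ILD idea, using Woodworth's ITD approximation and a simple level difference (both assumptions for illustration; a production renderer would typically use measured head-related transfer functions):

    import numpy as np

    def render_itd_ild(mono, fs, azimuth_deg, head_radius=0.0875):
        """Delay and attenuate the far ear so the enhanced signal is
        perceived at roughly the given azimuth (0 deg = straight ahead;
        the positive-azimuth-to-left-ear convention is an assumption)."""
        az = np.deg2rad(azimuth_deg)
        itd = head_radius / 343.0 * (az + np.sin(az))   # Woodworth ITD in seconds
        lag = int(round(abs(itd) * fs))                 # ITD in samples
        ild = 10 ** (-6.0 * abs(np.sin(az)) / 20.0)     # up to ~6 dB level difference
        near = mono
        far = np.concatenate([np.zeros(lag), mono])[: len(mono)] * ild
        left, right = (near, far) if azimuth_deg >= 0 else (far, near)
        return np.stack([left, right], axis=1)          # (N, 2) stereo output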
• the enhanced sound can be played through the electronic device, for example, through a built-in speaker of the electronic device; or the electronic device can send the enhanced sound signal to a playback device for playback, for example, through an external speaker. Alternatively, the enhanced sound may be stored by the electronic device or the playback device.
  • this method can be applied to electronic equipment and sound pickup equipment.
  • the sound pickup equipment collects the original sound signal, and the electronic device sets the target sound pickup direction.
  • the sound-picking device may be a microphone, a walkie-talkie, etc., or may also be the wearable device mentioned in the above embodiment.
  • Electronic devices can be cell phones, personal digital assistants, tablets, in-vehicle computers, laptops, smart screens, ultra-mobile personal computers, wearables, and other devices capable of communicating with sound-picking devices.
  • the electronic device can communicate with the sound pickup device through wireless communication technology (such as Bluetooth technology, infrared radio frequency technology, 2.4G wireless technology, ultrasound) and other methods.
  • smart glasses are a sound pickup device and a mobile phone is an electronic device.
• the smart glasses can communicate with the mobile phone through wireless communication technology. After the smart glasses and the mobile phone are successfully connected, the smart glasses and the mobile phone can perform the sound pickup method provided by the embodiments of the present application.
  • FIG 35 is a schematic flowchart of another sound pickup method provided by an embodiment of the present application.
  • the sound pickup method includes the following steps:
• S1: the electronic device displays a first interface, where the first interface is used to configure the sound pickup direction.
• the first operation may be a click operation, a touch operation, or a sliding operation input by the user on the display screen of the electronic device; it may also be a control operation input by the user through physical buttons on the electronic device; it may also be an air gesture performed by the user and detected through the camera or other sensors of the electronic device.
• for example, when the electronic device is connected to the sound pickup device, the electronic device can automatically display the first interface, where the first operation is the connection operation or a configuration operation; or, when it is detected that the user clicks the sound pickup setting button, the first interface is displayed.
• S2: the electronic device determines the target sound pickup direction in response to a second operation detected on the first interface.
  • the target pickup direction is used to enhance the original sound signal in the specified direction.
• for details, refer to S2101-S2102 in scenario 1 above, which will not be described again here.
• the enhancement processing of the original sound signal can be performed by the electronic device or by the sound pickup device.
  • the process of enhancing the sound signal by the electronic device includes:
• S3: the electronic device receives the original sound signal sent by the sound pickup device.
• S4: the electronic device performs enhancement processing on the original sound signal according to the target sound pickup direction, and obtains an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal.
• the enhanced sound can be played through the electronic device, for example, through a built-in speaker of the electronic device; or the electronic device can send the enhanced sound signal to a playback device for playback, for example, through an external speaker.
  • the above-mentioned enhanced sound may be stored through the above-mentioned electronic device or playback device.
  • the sound signal can also be enhanced by a sound pickup device. See Figure 35 and Figure 37.
  • the sound pickup method may also include:
• S5: the electronic device sends the target pickup direction to the sound pickup device.
• S6: the sound pickup device acquires the target sound signal in the target pickup direction.
• the acquired target sound signal can be an enhanced sound signal obtained according to the target sound pickup direction; it can also be a sound signal picked up by a microphone that is turned on and points in the target sound pickup direction; it can also be a sound signal obtained by first picking up sound with the turned-on microphone pointing in the target sound pickup direction and then performing enhancement processing.
• after the sound pickup device receives the target sound pickup direction sent by the electronic device, in the subsequent sound pickup process the sound pickup device can directly pick up the target sound signal according to the target sound pickup direction, or it can perform enhancement processing on the picked-up original sound signal according to the target sound pickup direction to obtain the target sound signal located in the target pickup direction in the original sound signal, thereby effectively improving the signal-to-noise ratio of the finally picked-up sound signal, improving the intelligibility of the sound signal, and improving the user experience.
• in a possible implementation, step S6, in which the sound pickup device acquires the target sound signal in the target sound pickup direction, may include:
• S61: the sound pickup device collects the original sound signal.
• S62: perform enhancement processing on the original sound signal according to the target sound pickup direction to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal, where the enhanced sound signal is the target sound signal.
• in this implementation, the original sound signal is enhanced according to the target pickup direction to obtain an enhanced sound signal corresponding to the target pickup direction. In this way, the target sound pickup direction can be flexibly adjusted according to different practical application scenarios, and the enhanced sound signal corresponding to the target sound pickup direction is obtained after enhancement processing, avoiding the adulteration of sound signals from other directions in the acquired sound signal, improving the clarity of the target sound signal, and improving the sound quality of the target sound signal.
• in another possible implementation, step S6, in which the sound pickup device acquires the target sound signal in the target sound pickup direction, may also include:
• this possible implementation can, on the one hand, save the power of the electronic device, improve the user experience, and extend the service life of the smart glasses; on the other hand, by turning on the microphone pointing in the detected target sound pickup direction and turning off the other microphones, it can also avoid as much as possible the microphones picking up noise from directions other than the target pickup direction, enhancing the microphones' pickup effect. In practical applications, the on or off state of each directional microphone can also be used to further achieve different sound pickup effects.
• in yet another possible implementation, step S6, in which the sound pickup device acquires the target sound signal in the target sound pickup direction, may also include:
• S67: perform enhancement processing on the original sound signal according to the target sound pickup direction to obtain an enhanced sound signal of the first sound signal located in the target sound pickup direction in the original sound signal, where the enhanced sound signal is the target sound signal.
• in this implementation, the sound pickup device turns on the microphones pointing in the target pickup direction and turns off the other microphones according to the target pickup direction, which can prevent the microphones from picking up noise from directions other than the target pickup direction, reduce the noise in the acquired original sound signal, and enhance the microphones' pickup effect.
• the sound signal acquired by the turned-on directional microphones is then further enhanced to obtain an enhanced sound signal corresponding to the target pickup direction. This can prevent the acquired sound signal from being mixed with sound signals from other directions, improve the clarity and sound quality of the enhanced sound signal, effectively improve the signal-to-noise ratio of the finally picked-up sound signal, improve the intelligibility of the sound signal, and improve the user experience.
• S7: the sound pickup device sends the target sound signal to the electronic device.
  • the above target sound signal can be played through an electronic device, for example, the target sound is played through a built-in speaker of the electronic device; or the electronic device sends the target sound signal to a playback device for playback, for example, the target sound is played through a speaker.
  • the above-mentioned target sound signal is stored through the above-mentioned electronic device or playback device.
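• For readability, the exchange between the two devices can be sketched as the following toy message flow; the queue-based transport stands in for the wireless link, and all function names are illustrative, not APIs defined by the application:

    import threading
    from queue import Queue

    import numpy as np

    to_pickup, to_phone = Queue(), Queue()

    def electronic_device(target_direction_deg):
        to_pickup.put(target_direction_deg)   # S5: send the target pickup direction
        return to_phone.get()                 # S7: receive the target sound signal back

    def pickup_device(capture_fn, enhance_fn):
        direction = to_pickup.get()                 # receive the target direction
        raw = capture_fn(direction)                 # S6: collect with aimed microphones
        to_phone.put(enhance_fn(raw, direction))    # S6 (cont.): enhance, then return

    # Example run with stub capture/enhancement functions:
    t = threading.Thread(target=pickup_device,
                         args=(lambda d: np.zeros(160), lambda x, d: x))
    t.start()
    print(electronic_device(30.0).shape)   # (160,) dummy target signal
    t.join()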
  • the sound pickup device may also be one of the above-mentioned electronic devices.
  • the sound pickup device or electronic device may also be other devices oriented to future technologies.
  • the embodiments of this application do not place any restrictions on the specific types of sound pickup equipment and electronic equipment.
  • Figure 38 is a schematic structural diagram of a device 100 provided by this application.
  • the device 100 includes the electronic device and the sound pickup device in the above embodiment.
  • the device 100 may include a processor 110, an external memory interface 120, an internal memory 131, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, and an antenna 1 , Antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and subscriber identification module (subscriber identification module, SIM) card interface 195, etc.
• the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the device 100 .
  • the device 100 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
• when the device 100 is a mobile phone or a tablet computer, it may include all the components in the illustration, or only some of them.
  • the processor 110 may include one or more processing units.
• the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • different processing units can be independent devices or integrated in one or more processors.
  • the controller may be the nerve center and command center of the device 100 .
• the controller can generate operation control signals according to the instruction operation codes and timing signals, completing the control of instruction fetching and instruction execution.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in processor 110 is cache memory. This memory may hold instructions or data that have been recently used or recycled by processor 110 . If the processor 110 needs to use the instructions or data again, it can be called directly from the memory. Repeated access is avoided and the waiting time of the processor 110 is reduced, thus improving the efficiency of the system.
  • processor 110 may include one or more interfaces.
• Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
• the I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
  • processor 110 may include multiple sets of I2C buses.
  • the processor 110 can separately couple the touch sensor 180K, charger, flash, camera 193, etc. through different I2C bus interfaces.
• the processor 110 can be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to implement the touch function of the device 100.
• the I2S interface can be used for audio communication.
  • processor 110 may include multiple sets of I2S buses.
• the processor 110 can be coupled with the audio module 170 through the I2S bus to implement communication between the processor 110 and the audio module 170.
• the audio module 170 may transmit audio signals to the wireless communication module 160 through the I2S interface.
  • the PCM interface can also be used for audio communications to sample, quantize and encode analog signals.
  • the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
  • the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
• the bus can be a bidirectional communication bus; it converts the data to be transmitted between serial communication and parallel communication.
  • a UART interface is generally used to connect the processor 110 and the wireless communication module 160 .
  • the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function.
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface to implement the function of playing music through a Bluetooth headset.
  • the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193 .
  • MIPI interfaces include camera serial interface (CSI), display serial interface (DSI), etc.
  • the processor 110 and the camera 193 communicate through the CSI interface to implement the shooting function of the device 100.
  • the processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the device 100.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
• the GPIO interface can be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 130 is an interface that complies with the USB standard specification, and may be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
  • the USB interface 130 can be used to connect a charger to charge the device 100, and can also be used to transmit data between the device 100 and peripheral devices. It can also be used to connect headphones to play audio through them. This interface can also be used to connect other devices, such as AR devices, etc.
  • the interface connection relationships between the modules illustrated in the embodiments of the present application are only schematic illustrations and do not constitute a structural limitation on the device 100 .
  • the device 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charge management module 140 may receive wireless charging input through the wireless charging coil of the device 100 . While the charging management module 140 charges the battery 142, it can also provide power to the device through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 131, the external memory interface 120, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
  • the power management module 141 may also be provided in the processor 110 . In other embodiments, the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor, etc.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be reused as a diversity antenna for a wireless LAN. In other embodiments, antennas may be used in conjunction with tuning switches.
  • the mobile communication module 150 can provide solutions for wireless communication including 2G/3G/4G/5G applied on the device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • At least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 . In some embodiments, at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
• after being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor.
  • the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194.
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
• the wireless communication module 160 can provide wireless communication solutions applied on the device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation
  • the antenna 1 of the device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the device 100 can communicate with the network and other devices through wireless communication technology.
• Wireless communication technologies can include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
• GNSS can include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • the device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos, etc. For example, the icon, folder, folder name, etc. of the APP in the embodiment of this application.
  • Display 194 includes a display panel.
• the display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), etc.
  • device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
• the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element passes the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image, and can also optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
  • Camera 193 is used to capture still images or video.
  • the object passes through the lens to produce an optical image that is projected onto the photosensitive element.
  • the focal length of the lens can be used to indicate the viewing range of the camera. The smaller the focal length of the lens, the larger the viewing range of the lens.
  • the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other format image signals.
  • the device 100 may include cameras 193 with 2 or more focal lengths.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy.
  • Video codecs are used to compress or decompress digital video.
• Device 100 may support one or more video codecs. In this way, the device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG)1, MPEG2, MPEG3, MPEG4, etc.
  • NPU is a neural network (NN) computing processor.
  • the NPU can realize intelligent cognitive applications of the device 100, such as image recognition, face recognition, speech recognition, text understanding, etc.
  • the NPU or other processors may be used to perform operations such as analysis and processing on images in videos stored by the device 100 .
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the device 100.
• the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and videos in the external memory card.
  • Internal memory 131 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes instructions stored in the internal memory 131 to execute various functional applications and data processing of the device 100 .
  • the internal memory 131 may include a program storage area and a data storage area.
  • the stored program area can store the operating system and at least one application program required for a function (such as a sound playback function, an image playback function, etc.).
  • the storage data area may store data created during use of the device 100 (such as audio data, phone book, etc.).
  • the internal memory 131 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
  • the device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
• the audio module 170 is used to convert digital audio signals into analog audio signal outputs, and is also used to convert analog audio inputs into digital audio signals. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
• The speaker 170A, also called a "horn", is used to convert audio electrical signals into sound signals.
  • the device 100 can listen to music or listen to a hands-free call through the speaker 170A.
  • the speaker can play the comparison analysis results provided by the embodiments of the present application.
• The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the voice can be heard by bringing the receiver 170B close to the human ear.
• The microphone 170C, also called a "mike" or "mic", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C.
  • Device 100 may be provided with at least one microphone 170C. In other embodiments, the device 100 may be provided with two microphones 170C, which in addition to collecting sound signals, may also implement a noise reduction function. In other embodiments, the device 100 can also be equipped with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions, etc.
  • the headphone interface 170D is used to connect wired headphones.
  • the headphone interface 170D can be a USB interface 130, or a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals.
  • pressure sensor 180A may be disposed on display screen 194 .
• there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors.
  • a capacitive pressure sensor may include at least two parallel plates of conductive material.
  • the device 100 determines the intensity of the pressure based on changes in capacitance.
  • the device 100 detects the strength of the touch operation according to the pressure sensor 180A.
  • Device 100 may also calculate the location of the touch based on the detection signal of pressure sensor 180A.
  • touch operations acting on the same touch location but with different touch operation intensities may correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold is applied to the short message application icon, an instruction to create a new short message is executed.
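• Expressed as a tiny dispatch sketch (the threshold value and names are illustrative assumptions, not values from the application):

    FIRST_PRESSURE_THRESHOLD = 0.5   # normalised pressure units (assumption)

    def on_short_message_icon_touch(pressure: float) -> str:
        """Map touch pressure on the short message icon to an instruction."""
        if pressure < FIRST_PRESSURE_THRESHOLD:
            return "view_short_message"       # light press: view instruction
        return "create_new_short_message"     # firm press: new-message instruction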
  • the gyro sensor 180B may be used to determine the motion posture of the device 100 .
• in some embodiments, the angular velocity of the device 100 about three axes (i.e., the x, y, and z axes) can be determined through the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization. For example, when the shutter is pressed, the gyro sensor 180B detects the angle at which the device 100 shakes, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to offset the shake of the device 100 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • Air pressure sensor 180C is used to measure air pressure. In some embodiments, the device 100 calculates the altitude through the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
  • Magnetic sensor 180D includes a Hall sensor.
  • Device 100 may utilize magnetic sensor 180D to detect opening and closing of the flip holster.
• the device 100 can detect the opening and closing of the flip cover according to the magnetic sensor 180D, and then set features such as automatic unlocking of the flip cover based on the detected opening and closing state of the leather case or of the flip cover.
  • the acceleration sensor 180E can detect the acceleration of the device 100 in various directions (generally three axes). When the device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify device posture, and can be used in horizontal and vertical screen switching, pedometer and other applications.
  • Distance sensor 180F for measuring distance.
  • Device 100 can measure distance via infrared or laser. In some embodiments, when shooting a scene, the device 100 may utilize the distance sensor 180F to measure distance to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the device 100 emits infrared light through light emitting diodes.
  • Device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the device 100 . When insufficient reflected light is detected, the device 100 may determine that there is no object near the device 100 .
  • the device 100 can use the proximity light sensor 180G to detect when the user holds the device 100 close to the ear for talking, so as to automatically turn off the screen to save power.
• the proximity light sensor 180G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • Device 100 can adaptively adjust display screen 194 brightness based on perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the device 100 is in the pocket to prevent accidental touching.
  • Fingerprint sensor 180H is used to collect fingerprints.
  • the device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
  • Temperature sensor 180J is used to detect temperature.
  • the device 100 utilizes the temperature detected by the temperature sensor 180J to execute the temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the device 100 reduces the performance of a processor located near the temperature sensor 180J to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the device 100 heats the battery 142 to prevent the low temperature from causing the device 100 to shut down abnormally. In some other embodiments, when the temperature is lower than another threshold, the device 100 performs boosting on the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
  • Touch sensor 180K is also called a "touch panel".
  • the touch sensor 180K can be disposed on the display screen 194.
  • the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K.
  • the touch sensor can pass the detected touch operation to the application processor to determine the touch event type.
  • Visual output related to the touch operation may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the device 100 at a location different from that of the display screen 194 .
  • Bone conduction sensor 180M can acquire vibration signals.
  • the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone mass of the human body's vocal part.
  • the bone conduction sensor 180M can also contact the human body's pulse and receive blood pressure beating signals.
  • the bone conduction sensor 180M can also be provided in an earphone and combined into a bone conduction earphone.
  • the audio module 170 can analyze the voice signal based on the vibration signal of the vocal vibrating bone obtained by the bone conduction sensor 180M to implement the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beat signal obtained by the bone conduction sensor 180M to implement the heart rate detection function.
  • buttons 190 include a power button, a volume button, etc.
  • Key 190 may be a mechanical key. It can also be a touch button.
  • Device 100 may receive key input and generate key signal input related to user settings and function control of the device 100.
  • the motor 191 can generate vibration prompts.
  • the motor 191 can be used for vibration prompts for incoming calls and can also be used for touch vibration feedback.
  • touch operations for different applications can correspond to different vibration feedback effects.
  • the motor 191 can also correspond to different vibration feedback effects for touch operations in different areas of the display screen 194.
  • Different application scenarios (such as time reminders, receiving information, alarm clocks, games, etc.) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also be customized.
  • the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
  • the SIM card interface 195 is used to connect a SIM card.
  • the SIM card can be connected to or separated from the device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the device 100 can support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. Multiple cards can be of the same type or different types.
  • the SIM card interface 195 is also compatible with different types of SIM cards.
  • the SIM card interface 195 is also compatible with external memory cards.
  • the device 100 interacts with the network through the SIM card to implement functions such as calls and data communications.
  • the device 100 employs an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the device 100 and cannot be separated from the device 100.
  • FIG. 39 is a schematic diagram of the software structure of the device 100 according to an embodiment of the present application.
  • the operating system in the device 100 may be an Android system, a Microsoft Window System (Windows), an Apple Mobile Operating System (iOS) or a Harmony OS, etc.
  • the operating system of the device 100 is the Hongmeng system as an example for explanation.
  • the Hongmeng system can be divided into four layers, including a kernel layer, a system service layer, a framework layer, and an application layer.
  • the layers communicate through software interfaces.
  • the kernel layer includes the kernel abstract layer (KAL) and driver subsystem.
  • KAL includes multiple kernels, such as the Linux Kernel, the lightweight IoT system kernel LiteOS, etc.
  • the driver subsystem can include the Hardware Driver Foundation (HDF).
  • the hardware driver framework can provide unified peripheral access capabilities and driver development and management framework.
  • the multi-core kernel layer can select the corresponding core for processing according to the needs of the system.
  • the system service layer is the core capability set of Hongmeng system.
  • the system service layer provides services to applications through the framework layer.
  • This layer may include a system basic capability subsystem set, a basic software service subsystem set, an enhanced software service subsystem set, and a hardware service subsystem set.
  • the basic capability subsystem set of the system provides basic capabilities for the operation, scheduling, migration and other operations of distributed applications on Hongmeng system devices.
  • The Ark multi-language runtime provides C, C++ and JavaScript (JS) multi-language runtimes and basic system class libraries. It can also provide the runtime for Java programs statically compiled with the Ark compiler (that is, the parts of the application or framework layer developed in the Java language).
  • the basic software service subsystem set provides public and general software services for Hongmeng system.
  • The enhanced software service subsystem set provides the Hongmeng system with differentiated, capability-enhancing software services for different devices.
  • These services can include smart screen proprietary services, wearable proprietary services, and Internet of Things (IoT) proprietary business subsystems.
  • the hardware service subsystem set provides hardware services for Hongmeng system. It can include subsystems such as location services, biometric identification, wearable proprietary hardware services, and IoT proprietary hardware services.
  • the framework layer provides multi-language user program frameworks and Ability frameworks such as Java, C, C++, and JS for Hongmeng system application development.
  • It also provides two user interface (UI) frameworks (a Java UI framework for the Java language and a JS UI framework for the JS language), as well as the multi-language framework application programming interfaces (APIs) opened to the outside by various software and hardware services.
  • the application layer includes system applications and third-party applications (or extended applications).
  • System applications can include applications installed by default on the device, such as the desktop, control bar, settings, phone, etc.
  • Extended applications can be non-essential applications developed and designed by the device manufacturer, such as device manager, device migration, notes, weather and other applications.
  • Third-party non-system applications can be developed by other manufacturers but run in the Hongmeng system, such as games, navigation, social networking or shopping applications.
  • A PA provides the ability to run tasks in the background and a unified data access abstraction.
  • PA mainly provides support for FA, such as providing computing power as a background service, or providing data access capabilities as a data warehouse.
  • Applications developed based on FA or PA can implement specific business functions, support cross-device scheduling and distribution, and provide users with a consistent and efficient application experience.
  • Multiple devices running Hongmeng system can realize hardware mutual assistance and resource sharing through distributed soft bus, distributed device virtualization, distributed data management and distributed task scheduling.
  • the embodiments of this application also provide the following content:
  • This embodiment provides a computer program product.
  • the program product includes a program.
  • When the program is run by an electronic device and/or a sound pickup device, the electronic device and/or the sound pickup device is caused to perform the sound pickup method shown in the above embodiments.
  • Embodiments of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • When the computer program is executed by a processor, the sound pickup method shown in the above embodiments is implemented.
  • Embodiments of the present application provide a chip system.
  • the chip system includes a memory and a processor.
  • the processor executes a computer program stored in the memory to control the above-mentioned electronic device to perform the sound pickup method shown in the above-mentioned embodiments.
  • The processors mentioned in the embodiments of this application may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the memory mentioned in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM) or flash memory.
  • Volatile memory may be random access memory (random access memory, RAM), which is used as an external cache.
  • By way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM) and direct rambus random access memory (direct rambus RAM, DR RAM).
  • Division into modules means dividing the internal structure of the device into different functional units or modules to complete all or part of the functions described above.
  • Each functional unit and module in the embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above-mentioned integrated unit can be hardware-based. It can also be implemented in the form of software functional units.
  • the specific names of each functional unit and module are only for the convenience of distinguishing each other and are not used to limit the scope of protection of the present application.
  • For the specific working processes of the units and modules in the above system please refer to the corresponding processes in the foregoing method embodiments, and will not be described again here.
  • the disclosed devices and methods can be implemented in other ways.
  • the system embodiments described above are only illustrative.
  • the division of modules or units is only a logical function division.
  • There may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, indirect coupling or communication connection of devices or units, which may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • this application implements all or part of the processes in the above embodiment methods, which can be completed by instructing relevant hardware through a computer program.
  • the computer program can be stored in a computer-readable storage medium, and when executed by the processor, the computer program can implement the steps of each of the above method embodiments.
  • the computer program includes computer program code, which may be in the form of source code, object code, executable file or some intermediate form.
  • The computer-readable medium may at least include: any entity or device capable of carrying the computer program code to a large-screen device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example, a USB flash drive, a removable hard disk, a magnetic disk or an optical disc.
  • In some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunications signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Otolaryngology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

本申请提供一种穿戴设备、拾音方法及装置,涉及终端技术领域,该拾音方法应用于电子设备,该方法包括:响应于第一操作,显示第一界面,第一界面用于配置拾音方向;响应于在第一界面上检测到的第二操作,确定目标拾音方向。电子设备可以通过第一界面提供拾音方向配置功能,使得用户可以根据实际应用情况选择目标拾音方向,使得电子设备在后续拾音过程中可以直接根据目标拾音方向拾取声音信号,或者根据目标拾音方向对拾取到的原始声音信号进行信号增强处理,以获取到原始声音信号位于目标拾音方向的增强声音信号,从而有效提高最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。

Description

穿戴设备、拾音方法及装置
本申请要求于2022年04月14日提交国家知识产权局、申请号为202210393694.4、申请名称为“穿戴设备、拾音方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及终端技术领域,尤其涉及一种穿戴设备、拾音方法及装置。
背景技术
随着科技的不断进步,穿戴设备(例如耳机、智能眼镜、智能手环等)已经成为人们日常生活中不可缺少的一部分。通过在穿戴设备中增加麦克风阵列,以使穿戴设备具备拾音功能。目前,穿戴设备的麦克风阵列中一般包括两颗全向麦克风,这两颗全向麦克风在穿戴设备中的设置位置尽可能与佩戴人的人嘴处在一条直线上,从而基于声音信号叠加的原理获取佩戴人的声音信号,然后基于差分阵列算法(Differential Microphone Array,DMA)来处理获取到的佩戴人的声音信号,以提高穿戴设备拾取佩戴人的声音信号的质量。
这种情况下,当麦克风阵列未有效在穿戴设备中安装,或佩戴人在相对噪杂的环境中使用穿戴设备时,掺杂了人声和环境噪声的音频信号会被穿戴设备中的麦克风同时采集,容易降低穿戴设备拾取到的声音信号的可懂度,影响了拾音质量,降低了信噪比。
发明内容
本申请提供一种穿戴设备、拾音方法及装置,一定程度上解决了拾取的声音信号的可懂度低,拾音质量差以及信噪比低的问题。
为达到上述目的,本申请采用如下技术方案:
第一方面,本申请提供一种穿戴设备,穿戴设备包括麦克风阵列,麦克风阵列中包括至少一个指向性麦克风;至少一个指向性麦克风的拾音波束方向互相正交。
基于本申请提供的穿戴设备,在该穿戴设备中设置包括有至少一个指向性麦克风的麦克风阵列,利用麦克风阵列中的至少一个指向性麦克风来拾取声音信号,充分利用指向性麦克风对特定方向的声音信号敏感的特点来采集声音信号,能够从获取声音的源头减少声音信号中掺杂的噪声,有效避免了由于采集了过于复杂的声音信号而降低了声音信号的质量,提升了获取到的声音信号的音质,提升信噪比。
此外,当至少一个指向性麦克风的拾音波束方向互相正交时,麦克风可以获取具有多个不同方向的声音信号,基于获取的声音信号可以进一步对获取的声音信号作多元化处理,提升麦克风的拾音性能,进而提升穿戴设备的整体性能,提升用户体验。
在第一方面的一个可能的实现方式中,麦克风阵列中还包括至少一个全向麦克风。
基于该可能的实现方式,当麦克风阵列中还包括有全向麦克风时,可以通过全向麦克风从所有方向均衡的拾取声音,以获取丰富、范围较广的音频信号或噪声,根据 不同的实际应用需求,可以利用全向麦克风获取的音频信号或噪声对指向性麦克风采集的音频信号进行降噪、增强处理,以提升指向性麦克风的拾音质量,进一步提升穿戴设备的拾音性能。
在第一方面的一个可能的实现方式中,穿戴设备被配置为:当穿戴设备检测到目标拾音方向时,穿戴设备开启麦克风阵列中指向目标拾音方向的麦克风,并关闭麦克风阵列中未指向目标拾音方向的麦克风。
基于该可能的实现方式,在实际应用过程中,一方面能够节约穿戴设备的电量,提升用户体验,同时延长穿戴设备的使用寿命;另一方面穿戴设备根据检测到的目标拾音方向开启指向目标拾音方向的麦克风,并关闭其他麦克风,能够尽可能避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,增强拾音效果。
在第一方面的一个可能的实现方式中,穿戴设备被配置为:当检测到麦克风阵列中存在满足预设条件的第一指向性麦克风时,开启第一指向性麦克风,并关闭其他指向性麦克风;预设条件为第一指向性麦克风在预设时间段内拾取到的声音信号的信号质量大于其他指向性麦克风。
基于该可能的实现方式,一方面能够节约穿戴设备的电量,提升用户体验,同时延长穿戴设备的使用寿命;另一方面穿戴设备根据预设条件开启满足第一预设条件的第一指向性麦克风,并关闭其他麦克风,能够尽可能避免麦克风拾取到不满足预设条件的其他方向上的声音信号,增强拾音效果。
在第一方面的一个可能的实现方式中,穿戴设备为智能眼镜。
在第一方面的一个可能的实现方式中,当麦克风阵列中包括一个全向麦克风时,全向麦克风位于智能眼镜镜框的鼻梁或鼻托中。
在第一方面的一个可能的实现方式中,当麦克风阵列中包括两个全向麦克风时,两个全向麦克风分别位于智能眼镜的两个镜腿上;或者,两个全向麦克风分别位于智能眼镜的镜框两侧靠近两个镜腿的位置。
在第一方面的一个可能的实现方式中，当麦克风阵列中包括多个全向麦克风时，多个全向麦克风分布在智能眼镜的中间区域以及两侧区域，中间区域包括智能眼镜镜框的鼻梁和/或鼻托；两侧区域包括智能眼镜的两个镜腿和/或智能眼镜的镜框两侧靠近两个镜腿的位置。
基于上述几种可能的实现方式,根据全向麦克风的数量对全向麦克风进行位置的设置,以便于麦克风阵列中的全向麦克风能够尽可能的从多个方向均衡的拾取声音,以获取丰富、范围较广的音频信号或噪声,根据不同的实际应用需求,可以利用全向麦克风获取的音频信号或噪声对指向性麦克风采集的音频信号进行降噪、增强处理,提升指向性麦克风的拾音质量,进一步提升智能眼镜的拾音性能。
在第一方面的一个可能的实现方式中,指向性麦克风为8字型麦克风。
基于该可能的实现方式,当在穿戴设备的麦克风阵列中使用8字型指向性麦克风时,能够充分提高了8字型麦克风的利用率,降低穿戴设备的生产、制造及研发成本,提高穿戴设备的制造速率。
第二方面,本申请提供一种拾音方法,应用于电子设备,该方法包括:
响应于第一操作,显示第一界面,第一界面用于配置拾音方向;
响应于在所述第一界面上检测到的第二操作,确定目标拾音方向。
基于本申请提供的拾音方法,电子设备可以通过第一界面提供拾音方向配置功能,使得用户可以根据实际应用情况选择目标拾音方向,使得电子设备在后续拾音过程中可以直接根据目标拾音方向拾取声音信号,或者根据目标拾音方向对拾取到的原始声音信号进行信号增强处理,以获取到原始声音信号位于目标拾音方向的增强声音信号,从而有效提高最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。
在第二方面的一个可能的实施方式中,本申请实施例提供的方法还包括:
获取原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号。
基于该可能的实现方式中,在获取原始声音信号后,根据目标拾音方向对原始声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号,这样可以根据不同的实际应用场景,灵活的调整目标拾音方向,得到增强处理后的与目标拾音方向对应的增强声音信号,避免了获取到的声音信号中掺杂其他全方向声音信号,提高了用于播放的声音的清晰度,提升了声音信号的音质。
在第二方面的一个可能的实施方式中,获取原始声音信号,包括:
在录音过程中,获取原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号之后,所述方法还包括:保存所述增强声音信号。
基于该可能的实现方式中,针对录音场景,由于用户后期听取的声音信号为增强处理后的增强声音信号,因此,便于用户后期重复听取音质较高的声音信号,解决了在录音过程中由于采集了除需要录制的声音信号之外的其他声音信号而降低了声音信号可懂度的问题,提升了获取到的声音信号的信噪比,提高了拾取的声音信号的可懂度。
在第二方面的一个可能的实施方式中,获取原始声音信号,包括:
在通话过程中,获取原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号之后,所述方法还包括:将增强声音信号发送至通话端设备。
基于该可能的实现方式中,通话场景包括语音通话、视频通话、会议通话等,针对通话场景,能够使通话双方听到增强处理后的增强声音信号,解决了由于通话过程中采集了除通话双方之间的声音信号之外的其他音频或噪声而降低了声音信号可懂度的问题,提升了获取到的声音信号的信噪比,提高了拾取的声音信号的可懂度,提高了通话双方的沟通效率。
在第二方面的一个可能的实施方式中,原始声音信号为录制的原始视频中的声音信号,根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号之后,所述方法还包括:将原始视频 中的原始声音信号替换为增强声音信号。
基于该可能的实现方式中,针对录像场景,将原始视频中的原始声音信号替换为增强声音信号后,极大提高了录制的视频中的声音的音质,解决了由于录制的原始视频中采集了掺杂了不同音频信号和环境噪声的声音信号而降低了声音信号可懂度的问题,提升了获取到的声音信号的信噪比,提高了拾取的声音信号的可懂度。
在第二方面的一个可能的实施方式中,获取原始声音信号还包括:接收拾音设备发送的原始声音信号。这样利用不同设备之间的互相协作,不仅为全场景的情况下获取原始声音信号提供了可能,而且有利于延长麦克风的使用寿命。
在第二方面的一个可能的实施方式中,本申请实施例提供的方法还包括:向拾音设备发送目标拾音方向。这样不仅能够减轻电子设备处理器的处理负担,有效保障电子设备正常稳定运行;而且拾音设备可以基于接收到的目标拾音方向来拾取与目标拾音方向对应的声音信号,从而获取清晰度、可懂度以及信噪比更高的声音信号。
在第二方面的一个可能的实施方式中,电子设备包括麦克风阵列,麦克风阵列包括至少一个指向性麦克风,电子设备获取原始声音信号,包括:
根据目标拾音方向,开启指向目标拾音方向的指向性麦克风,关闭未指向目标拾音方向的指向性麦克风;
利用开启的指向目标拾音方向的指向性麦克风采集原始声音信号。
基于该可能的实现方式中,一方面可以节约电子设备的电量,提升用户体验,同时延长智能眼镜的使用寿命,另一方面根据检测到的目标拾音方向开启指向目标拾音方向的麦克风,并关闭其他麦克风,也可以尽可能避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,增强麦克风的拾音效果。在实际应用中,还可以利用各个指向性麦克风呈现的打开或关闭状态进一步实现不同的拾音效果。
在第二方面的一个可能的实施方式中,获取原始声音信号,包括:根据目标拾音方向,开启指向目标拾音方向的指向性麦克风,关闭未指向目标拾音方向的指向性麦克风;
利用开启的指向目标拾音方向的指向性麦克风采集原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号。
基于该可能的实现方式中,电子设备根据目标拾音方向,开启指向目标拾音方向的麦克风,并关闭其他麦克风,可以避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,减少获取到的原始声音信号中音质较强的杂音,增强麦克风的拾音效果,进一步对开启的指向性麦克风获取的声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号。这样可以避免获取到的声音信号中掺杂其他方向的声音信号,提高了增强处理后的声音信号的清晰度及音质,有效提高了最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。
在第二方面的一个可能的实施方式中,本申请实施例提供的方法还包括:向音频播放设备发送增强声音信号。这样扩展了增强声音信号播放的器件,使得用于播放的增强声音信号可以适应于不同的使用场景。
在第二方面的一个可能的实施方式中,本申请实施例提供的方法还包括:播放增 强声音信号。这样便于直接听取增强后的声音信号。
在第二方面的一个可能的实施方式中,响应于第一操作,显示第一界面之前,本申请实施例提供的方法还包括:
显示录音界面,录音界面上显示有拾音配置按钮;
在录音界面上检测第一操作,第一操作为拾音配置按钮的触发操作。
在第二方面的一个可能的实施方式中,第一操作为录音启动操作,本申请实施例提供的方法还包括:
响应于第一操作,启动录音功能。
在第二方面的一个可能的实施方式中,响应于第一操作,显示第一界面之前,本申请实施例提供的方法还包括:
显示通话界面,通话界面上显示有拾音配置按钮;
在通话界面上检测第一操作,第一操作为拾音配置按钮的触发操作。
在第二方面的一个可能的实施方式中,第一操作为通话接通操作,本申请实施例提供的方法还包括:
响应于第一操作,接通语音通话或者视频通话功能。
在第二方面的一个可能的实施方式中,响应于第一操作,显示第一界面之前,本申请实施例提供的方法还包括:
显示录像界面,录像界面上显示有拾音配置按钮;
在录像界面上检测第一操作,第一操作为拾音配置按钮的触发操作。
在第二方面的一个可能的实施方式中,第一操作为录像启动操作,本申请实施例提供的方法还包括:
响应于第一操作,启动录像功能。
在第二方面的一个可能的实施方式中,响应于第一操作,显示第一界面之前,本申请实施例提供的方法还包括:
显示会议界面,会议界面上显示有拾音配置按钮;
在会议界面上检测第一操作,第一操作为拾音配置按钮的触发操作。
在第二方面的一个可能的实施方式中,第一操作为会议模式启动操作,本申请实施例提供的方法还包括:响应于第一操作,启动会议功能。
响应于第一操作,显示第一界面之前,本申请实施例提供的方法还包括:
显示拾音场景设置界面;
响应于在拾音场景设置界面上检测到的第二操作,打开或者关闭第一界面的显示场景,显示场景包括录音场景、通话场景、录像场景、会议场景中的至少一个场景。
第三方面,本申请提供一种拾音方法,应用于拾音设备,该方法包括:
接收电子设备发送的目标拾音方向;
在目标拾音方向上获取目标声音信号。
基于本申请提供的拾音方法,拾音设备接收到电子设备发送的目标拾音方向后,使得拾音设备在后续拾音过程中可以直接根据目标拾音方向拾取目标声音信号,或者根据目标拾音方向对拾取到的原始声音信号进行信号增强处理,以获取到原始声音信号位于目标拾音方向的目标声音信号,从而有效提高最终拾取到的声音信号的信噪比, 提高声音信号的可懂度,提升用户体验。
在第三方面的一个可能的实施方式中,在目标拾音方向上获取目标声音信号包括:
采集原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于所述目标拾音方向上的第一声音信号的增强声音信号,增强声音信号为目标声音信号。
基于该可能的实现方式中,在获取原始声音信号后,根据目标拾音方向对原始声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号,这样可以根据不同的实际应用场景,灵活的调整目标拾音方向,得到增强处理后的与目标拾音方向对应的增强声音信号,避免了获取到的声音信号中掺杂其他全方向声音信号,提高了目标声音信号的清晰度,提升了目标声音信号的音质。
在第三方面的一个可能的实施方式中,在目标拾音方向上获取目标声音信号包括:
根据目标拾音方向,开启指向目标拾音方向的麦克风,关闭未指向目标拾音方向的麦克风;
利用开启的指向目标拾音方向的麦克风采集目标声音信号。
基于该可能的实现方式中,根据检测到的目标拾音方向开启指向目标拾音方向的麦克风,并关闭其他麦克风,可以避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,减少获取到的原始声音信号中音质较强的杂音,增强麦克风的拾音效果,此外,还可以有效避免由于无关麦克风的工作而导致功耗较大的问题,延长拾音设备的使用寿命。
在第三方面的一个可能的实施方式中,在目标拾音方向上获取声音信号包括:
根据目标拾音方向,开启指向目标拾音方向的麦克风,关闭未指向目标拾音方向的麦克风;
利用开启的指向目标拾音方向的麦克风采集原始声音信号;
根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号,增强声音信号为目标声音信号。
基于该可能的实现方式中,拾音设备根据目标拾音方向,开启指向目标拾音方向的麦克风,并关闭其他麦克风,可以避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,减少获取到的原始声音信号中音质较强的杂音,增强麦克风的拾音效果,进一步对开启的指向性麦克风获取的声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号。这样可以避免获取到的声音信号中掺杂其他方向的声音信号,提高了增强处理后的声音信号的清晰度及音质,有效提高了最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。
在第三方面的一个可能的实施方式中,本申请实施例提供的方法还包括:播放目标声音信号。
在第三方面的一个可能的实施方式中,本申请实施例提供的方法还包括:向音频播放设备发送所述目标声音信号。扩展了用于播放目标声音信号的设备,丰富了实际应用场景。
第四方面,本申请提供一种芯片系统,所述芯片系统包括处理器,所述处理器执行存储器中存储的计算机程序,以实现第二方面或第三方面中任一项所述的方法。
在第四方面的一个可能的实施方式中，所述芯片系统还包括存储器，存储器与处理器通过电路或电线连接。
第五方面,本申请提供一种电子设备,包括:处理器,所述处理器用于运行存储器中存储的计算机程序,以实现第二方面或第二方面的任一可能的实现方式中的方法。
在第五方面的一个可能的实施方式中,电子设备为如第一方面或第一方面的任一可选方式所述的穿戴设备。
第六方面,本申请提供一种拾音设备,包括:处理器,所述处理器用于运行存储器中存储的计算机程序,以实现第三方面或第三方面的任一可能的实现方式中的方法。
在第六方面的一个可能的实施方式中,拾音设备为如第一方面或第一方面的任一可选方式所述的穿戴设备。
第七方面,本申请提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现如第二方面或第三方面中任一项所述的方法。
第八方面,本申请实施例提供了一种计算机程序产品,当计算机程序产品在电子设备或拾音设备上运行时,使得电子设备执行上述第二方面或第三方面中任一所述的方法。
本申请提供的第四方面至第八方面的技术效果可以参见上述第一方面、第二方面或第三方面的各个可选方式的技术效果,此处不再赘述。
附图说明
图1为本申请实施例提供的一种智能眼镜的局部结构示意图;
图2为本申请实施例提供的一种耳机的结构示意图;
图3为本申请实施例提供的智能眼镜作为穿戴设备的使用场景的示意图;
图4为本申请实施例提供的8字指向型麦克风对声音信号的灵敏度示意图;
图5为本申请实施例提供的全向麦克风对声音信号的敏感度示意图;
图6为本申请实施例提供的一种可穿戴设备与电子设备构成的系统功能框图;
图7-图18为本申请实施例提供的不同种智能眼镜中麦克风阵列结构形成波束的示意图;
图19为本申请实施例提供的一种实现拾音方法的示意图;
图20-图26为本申请实施例提供的显示第一界面的不同场景的相关示意图;
图27和图28为本申请实施例提供的第一界面的展示示意图;
图29为本申请实施例提供的多种手势的示意图;
图30为本申请实施例提供的一种对声音信号降噪提取过程的示意性流程图;
图31为本申请实施例提供的对声音信号进行空间特征聚类的示意图;
图32和图33为本申请实施例提供的另一种对声音信号降噪提取过程的示意性流程图;
图34为本申请实施例提供的对同一噪声环境中佩戴人声音信号提取效果对比示意图;
图35为本申请实施例提供的一种拾音方法的交互流程示意图;
图36为本申请实施例提供的一种电子设备与拾音设备进行连接的界面示意图;
图37为本申请实施例提供的另一种拾音方法的交互流程示意图;
图38为本申请实施例提供的一种电子设备的结构示意图;
图39为本申请实施例提供的一种电子设备的软件结构示意图。
具体实施方式
下面结合本申请实施例中的附图以及相关实施例,对本申请实施例中的技术方案进行描述。其中,在本申请实施例的描述中,以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。如在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一种”、“所述”、“上述”、“该”和“这一”旨在也包括例如“一个或多个”这种表达形式,除非其上下文中明确地有相反指示。还应当理解,在本申请以下各实施例中,“至少一个”、“一个或多个”是指一个或两个以上(包含两个)。术语“和/或”,用于描述关联对象的关联关系,表示可以存在三种关系;例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A、B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。
在本说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此,在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。术语“连接”包括直接连接和间接连接,除非另外说明。“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。
在本申请实施例中,“示例性地”或者“例如”等词用于表示作例子、例证或说明。本申请实施例中被描述为“示例性地”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性地”或者“例如”等词旨在以具体方式呈现相关概念。
随着科技的不断进步,穿戴设备(例如耳机、智能眼镜、智能手环等)已经成为人们日常生活中不可缺少的一部分。通过在穿戴设备中增加麦克风阵列,以使穿戴设备具备拾音功能。目前,穿戴设备的麦克风阵列中一般包括两颗全向麦克风,这两颗全向麦克风在穿戴设备中的设置位置尽可能与佩戴人的人嘴处在一条直线上,从而基于声音信号叠加的原理获取佩戴人的声音信号,然后基于差分阵列算法(Differential Microphone Array,DMA)来处理获取到的佩戴人的声音信号,以提高穿戴设备拾取佩戴人的声音信号的质量。
示例性的,如图1所示为一种智能眼镜的局部结构示意图,参见图1中,在该智能眼镜的眼镜腿上设置有两颗全向麦克风,这两颗全向麦克风在智能眼镜中的设置位置大致与佩戴该智能眼镜的人的人嘴处于一条直线上,当佩戴人佩戴该智能眼镜后,在佩戴人的人嘴发出声音信号时,可以通过该智能眼镜中的两颗全向麦克风采集声音信号。
又如,如图2所示为一种耳机的结构示意图,参见图2中,在该耳机的耳柄中设置有两颗全向麦克风,这两颗全向麦克风在耳机中的设置位置大概与佩戴该耳机的佩 戴人的人嘴处于一条直线上,当佩戴人佩戴耳机后,在佩戴人的人嘴发出声音信号时,可以通过耳机的耳柄中设置的两颗全向麦克风来采集声音信号。
上述示例中,通常利用差分阵列算法(Differential Microphone Array,DMA)进一步对麦克风阵列拾取的声音信号进行处理,得到处理后的声音信号。DMA主要利用空间声压的差异性来对声音信号进行处理,具体地,当拾音设备中设置N颗麦克风时可以获取到N-1阶差分,然后利用N-1阶差分来处理声音信号。当拾音设备的麦克风阵列中包括两颗麦克风时,可以通过DMA获取声音信号的1阶差分波束,也就是说,利用麦克风采集到的声音信号与采集到的噪声音信号做差来提取声音信号。
上述现有技术需要依赖于通过麦克风阵列中的两颗全向麦克风在穿戴设备中的特殊设置方式以及DMA方法来提高穿戴设备拾取佩戴人的声音信号的质量,但是若麦克风阵列中的全向麦克风在穿戴设备中的设置位置未与佩戴人的人嘴处在一条直线上,即存在较大偏差时,则会降低拾取到的声音信号的质量,降低信噪比,影响了用户体验。
而且,若在相对噪杂的拾音环境中使用上述穿戴设备,掺杂了人声和环境噪声的声音信号会同时被穿戴设备中的麦克风阵列采集,利用上述方法对麦克风阵列拾取的声音信号进行处理,无法滤除掉麦克风采集到的声音信号中的噪声音信号,降低了处理后的声音信号的可懂度,影响了拾音质量。
因此,针对穿戴设备中拾取的声音信号的可懂度低,拾音质量较差,信噪比较低的问题,本申请提供一种穿戴设备,在该穿戴设备中设置包括有至少一个指向性麦克风的麦克风阵列,利用麦克风阵列中的至少一个指向性麦克风来拾取声音信号,充分利用指向性麦克风对特定方向的声音信号敏感的特点来采集声音信号,能够从获取声音的源头减少声音信号中掺杂的噪声,有效避免了由于采集了过于复杂的声音信号而降低了声音信号的质量,摆脱麦克风阵列在穿戴设备中的安装束缚,提升了获取到的声音信号的音质,提升信噪比。
本申请实施例提供的一种穿戴设备可以是智能眼镜(smart glasses)、增强现实(Augmented Reality,AR)/虚拟现实(virtual Reality,VR)/混合现实(Mixed Reality,MR)设备、智能头盔(smart helmet)、头戴式耳机、助听设备、入耳式耳机、耳塞式耳机、智能手环(smart wristband)、智能手表(smart watch)、计步器(pedometer)、对讲机(two way radio)、录音笔(recording pen)等具有拾音功能的设备。不难理解的，穿戴设备可以是面向未来技术的其他设备。
该穿戴设备可以适用于多种场景,例如,场景包括但不限于视频通话场景、语音通话场景、专业录音场景、电台/广播/主持场景、直播游戏/直播带货场景、会议场景以及其他能够应用拾音功能的场景中。进一步地,通话场景可以包括室内通话场景、室外通话场景、安静/嘈杂通话场景、骑行/跑步/运动通话场景、车载通话场景、单耳通话场景、双耳通话场景、远程会议通话场景等。
为了更加方便的阐述本申请实施例提供的穿戴设备,作为示例而非限定,下文将以智能眼镜作为可穿戴设备为例来详细阐述本申请的技术方案。
如图3所示为本申请实施例提供的智能眼镜作为穿戴设备的使用场景的示意图,参见图3,智能眼镜可佩戴于用户的眼部,能够实现与电子设备(例如手机)的无线 通信功能,在本申请实施例中,该智能眼镜中包括麦克风阵列,麦克风阵列中包括至少一个指向性麦克风。
根据实际应用需求,麦克风阵列中的指向性麦克风的数量可以灵活设置,例如,针对需要采集多个方向声音信号的情况下,可以在穿戴设备中设置多个指向性麦克风。在麦克风阵列中有至少两个指向性麦克风的情况下,可以通过进一步对获取的声音信号作多元化处理,提升麦克风的拾音性能,进而提升穿戴设备的整体性能,提升用户体验。麦克风阵列中指向性麦克风的数量可以根据实际应用需求而设定,本申请对此不作任何限定。
在其中一种可能的实施方式中,麦克风阵列中至少一个指向性麦克风的拾音波束方向互相正交。其中,指向性麦克风的拾音波束方向互相正交是指麦克风阵列中的指向性麦克风对应的拾音方向是两两互相垂直的。
应理解的,为了使得麦克风采集的声音信号尽可能保留较多的声音特征,麦克风阵列中的指向性麦克风的拾音方向可以指向预设的声源位置。例如,智能眼镜上麦克风阵列中的指向性麦克风的拾音方向可以指向佩戴智能眼镜的佩戴人的人嘴方向。或者针对助听设备,助听设备上麦克风阵列中的指向性麦克风的拾音方向可以指向其他方向,用于更好的拾取与佩戴该助听设备进行对话的其他人的声音信号。不同的穿戴设备,预设的声源位置可能不同,本申请对此不作任何限定。
在另一种可能的实施方式中,指向性麦克风可以为8字型麦克风。如图4所示为8字指向型麦克风对声音信号的灵敏度示意图,8字型麦克风也称双指向型麦克风,其主要对同时来自方向相反的两个声音信号敏感。当在穿戴设备的麦克风阵列中使用8字型指向性麦克风时,能够充分提高8字型麦克风的利用率,降低穿戴设备的生产、制造及研发成本,提高穿戴设备的制造速率。
可选的,麦克风阵列中还可以包括全向麦克风。如图5所示为全向麦克风对声音信号的敏感度示意图,全向麦克风对所有角度的声音信号都具有相同的灵敏度参见图5中的加粗线段所示。在本申请实施例中,包含全向麦克风和指向性麦克风的麦克风阵列,可以通过全向麦克风从所有方向均衡的拾取声音,以获取丰富、范围较广的音频信号或噪声,根据不同的实际应用需求,可以利用全向麦克风获取的音频信号或噪声对指向性麦克风采集的音频信号进行降噪、增强处理,以提升指向性麦克风的拾音质量,进一步提升穿戴设备的拾音性能。
基于本实施例,智能眼镜中除了包括麦克风阵列外,如图6所示也可以包括扬声器和处理器;进一步地,扬声器是用于贴近佩戴人左/右耳可以独立进行播放的器件,扬声器可以分别设置在智能眼镜两侧的镜腿中,用于向佩戴人的人耳播放声音。其中,扬声器可以是外放的扬声器,例如,喇叭或者音响等;也可以是贴近人耳播放的扬声器。处理器用于对声音信号进行处理,或者将麦克风阵列采集的声音信号分发至电子设备的处理器,使电子设备的处理器对声音信号进行处理。当然,在实际应用中智能眼镜中还可以包括通信模块和控制接口,通信模块用于实现智能眼镜与其他电子设备的通信,控制接口用于实现对智能眼镜的控制。
不难理解的,电子设备也称主控设备,主控设备与智能眼镜在通信连接成功后,可以实现对智能眼镜的控制。其中,主控设备的处理器可以用于对智能眼镜处理器分 发的声音信号进行处理,主控设备的通信模块可以通过智能眼镜的通信模块与智能眼镜实现交互通信。
应理解,智能眼镜和/或主控设备的控制接口可以接收外部输入的控制命令,以通过接收的控制命令实现对智能眼镜和/或主控设备的控制。其中,接收控制命令的方式包括但不限于通过智能眼镜或者主控设备上的物理按键,或者对智能眼镜或者主控设备的触控手势、隔空手势等。例如,对于智能眼镜中音视频的音量调节,可以通过智能眼镜上的物理按键接收音量调节的控制命令,还可以通过主控设备(例如手机)接收的触控手势而接收音量调节的控制命令。
为了增强用户体验,可选地,在可穿戴设备中还设置有姿态行动测量单元。姿态行动测量单元用于追踪佩戴人在佩戴设备后的不同姿态变化情况并向处理器分发追踪数据。在实际应用过程中,用户在佩戴可穿戴设备后,可穿戴设备与用户之间相对位置或者方向会随着用户头部/腕部的活动而发生变化,例如,佩戴人佩戴有智能眼镜,A位于佩戴人的正前方,在双方位置保持不变的情况下,可以将位于其正前方位置的A的声音信号进行增强,增强后的A的声音信号可以正确被采集,但当佩戴人低头或者转头后,智能眼镜中的指向性麦克风获取到的A的声音信号的方向发生变化,这时若还保持增强智能眼镜正前方的声音信号不变,则获取到的声音信号将不再是A的声音信号。因此,为避免上述情况的发生,可以利用可穿戴设备中的姿态行动测量单元获取佩戴人相对于初始化位置信息的变化量,监测佩戴人姿态变化,以随着用户头部/腕部的活动而自适应的调整拾取到的声音信号的方向,实现声音信号的实时追踪。
值得说明的是,在面向未来技术支持的情况下,智能眼镜可能脱离主控设备的控制,通过其自身的多个功能模块实现远程通话、辅听增强以及其他原本需要借助于主控设备控制才能实现的功能,本申请对此不作限定。
下面以麦克风阵列中指向性麦克风的数量分别为1颗、2颗、3颗、4颗、6颗及9颗为例,对不同数量的指向性麦克风在穿戴设备中形成的声音信号波束进行示例性的描述。需要说明的是,以下几种示意图仅是部分指向性麦克风在智能眼镜中形成声音信号波束的情况,根据不同的实际需求,在智能眼镜中设置的指向性麦克风的数量、指向性麦克风的具体类型以及指向性麦克风的具体安装位置可能会发生变化,本申请对此不作任何限定。
需要注意的是,8字型麦克风能够获取的声音信号的波束如下图7-图18中两个相邻的虚线圆所示,全向麦克风能够获取的声音信号的波束如下图7-图18中一个实线圆所示。
为了进一步减轻智能眼镜的重量,减少智能眼镜对佩戴人鼻梁或耳朵的挤压力,在一种可能的实现方式中,参见图7-图9,在智能眼镜中可以设置1颗8字型麦克风。参见图7,该麦克风可以设置在智能眼镜一侧的镜框或者镜腿中,该麦克风能够形成指向佩戴人人嘴方向的声音信号波束,便于该麦克风接收来自佩戴人人嘴方向的声音信号。参见图8,该麦克风也可以设置在智能眼镜镜框的中间位置,中间区域是指智能眼镜镜框的鼻梁和/或鼻托;该麦克风可以形成指向佩戴人人嘴方向的声音信号波束。同样的,参见图9,该麦克风还可以设置在智能眼镜另一侧的镜框或者镜腿中,与图7中麦克风的设置位置对应,形成的声音信号波束的方向也是对应指向人嘴。
参见图10-图12为本申请实施例提供的第二种智能眼镜中麦克风阵列结构形成波束的示意图,如图10-图12所示,该智能眼镜的麦克风阵列中包括2颗指向性麦克风,且2颗指向性麦克风的类型均为8字型麦克风。在一种实施例中,参见图10,2颗8字型麦克风中的1颗麦克风位于智能眼镜镜框的中间位置,形成的声音信号波束方向指向佩戴人的人嘴方向;剩余的1颗麦克风位于智能眼镜一侧的镜框、镜架或者镜腿上,该麦克风是形成指向佩戴人人嘴方向的声音信号波束。在另一种可能的实施方式种,参见图11,2颗8字型麦克风中的1颗麦克风位于智能眼镜镜框的中间位置,形成指向佩戴人人嘴方向的声音信号波束;另一颗麦克风则与图10所示的其中一颗麦克风的设置方向相对应,设置于智能眼镜另一侧的镜框、镜架或者镜腿中,也是形成指向佩戴人人嘴方向的声音信号波束。在其他的实施例中,参见图12,2颗8字型麦克风分别对应设置在智能眼镜两侧的镜框、镜框或者镜腿中,这两颗麦克风在智能眼镜中可以分别形成指向佩戴人人嘴方向的声音信号波束。
参见图13为本申请实施例提供的第三种智能眼镜中麦克风阵列结构形成波束的示意图,如图13所示,该智能眼镜的麦克风阵列中设置有3颗8字型麦克风,其中1颗麦克风设置在智能眼镜镜框的中间位置,形成的声音信号波束方向指向佩戴人的人嘴方向,剩余的2颗麦克风分别设置在智能眼镜两侧的镜框、镜架或者镜腿中,形成声音信号波束方向也对应指向佩戴人的人嘴方向。
可选地,当指向性麦克风的数量为3颗,且这3颗指向性麦克风的类型相同时,这3颗指向性麦克风形成的声音信号波束可以有多种形态。例如,可以在保持中间位置设置的麦克风的位置保持不变的情况下,变换其他两颗麦克风在智能眼镜的镜框或者镜架上的位置。如图14,为另一种在智能眼镜的麦克风阵列中设置3颗8字型麦克风形成波束的示意图,对比图13与图14不难看出,变换2颗麦克风的设置位置后,图14中的3颗麦克风形成的声音信号波束方向与图13中的3颗麦克风形成的声音信号波束方向对称,对实际采集佩戴人的声音信号影响较小。
参见图15为本申请实施例提供的第四种智能眼镜中麦克风阵列结构形成波束的示意图,如图15所示,该智能眼镜的麦克风阵列中可以包括4颗指向性麦克风,其中1颗是全向麦克风,3颗为8字型麦克风。上述4颗麦克风均位于智能眼镜框的中间位置,中间区域包括智能眼镜镜框的鼻梁和/或鼻托,上述3颗8字型麦克风形成的拾音波束方向互相正交,例如,3颗8字型麦克风形成的拾音方向分别为垂直于智能眼镜的镜框、平行于智能眼镜的镜框以及指向佩戴人的人嘴方向。
参见图16为本申请实施例提供的另一种智能眼镜中麦克风阵列结构形成波束的示意图,如图16所示,该智能眼镜中的麦克风阵列中可以包括6颗指向性麦克风,其中2颗为全向麦克风,全向麦克风可以设置于智能眼镜转轴的镜框、镜架或者镜腿上;剩余4颗为8字型麦克风,其中两颗位于智能眼镜框的中间位置,这两颗麦克风分别形成垂直于智能眼镜镜面和平行于智能眼镜镜面的拾音方向,另外两颗分别位于智能眼镜转轴的镜框、镜架或者镜腿上,形成的拾音方向分别指向佩戴智能眼镜的人的人嘴方向。
参见图17为本申请实施例提供的一种智能眼镜中麦克风阵列结构形成波束的示意图,如图17所示,该智能眼镜的麦克风阵列中也包括6颗指向性麦克风,其中2 颗是全向麦克风,4颗为8字型麦克风,上述2颗全向麦克风分别位于智能眼镜两侧转轴的镜框、镜架或者镜腿上;上述4颗8字型麦克风中的2颗麦克风位于智能眼镜一侧转轴的镜框、镜架或者镜腿上,紧邻其中1颗全向麦克风,这2颗麦克风形成的拾音方向分别指向佩戴者的人嘴方向和平行于智能眼镜镜框;另外2颗8字型麦克风位于智能眼镜另一端转轴的镜框、镜架或者镜腿上,紧邻另一颗全向麦克风,形成的拾音方向分别对应指向佩戴者的人嘴方向和平行于智能眼镜框方向。
参见图18为本申请实施例提供的一种智能眼镜中麦克风阵列结构形成的声音信号波束的示意图,如图18所示,该智能眼镜的麦克风阵列中可以包括9颗指向性麦克风,其中2颗是全向麦克风,7颗为8字型麦克风;上述2颗全向麦克风分别设置在智能眼镜两侧的镜框、镜架或者眼镜腿上;上述7颗8字型麦克风中的其中1颗麦克风设置在智能眼镜镜框的中间位置,该麦克风形成的声音信号波束方向指向佩戴人的人嘴方向;在智能眼镜的每一侧的镜框或者镜腿上分别设置3颗8字型麦克风,每一侧设置的3颗8字型麦克风形成的声音信号波束互相正交。
值得说明的是,在麦克风阵列中指向性麦克风的数量为两颗或者两颗以上的情况下,实际在智能眼镜中部署多颗指向性麦克风时,多颗类型相同的麦克风的安装位置不受限制。在麦克风阵列中指向性麦克风的数量为一颗的情况下,该麦克风在智能眼镜中的安装位置可以有多种。
应理解，当麦克风阵列中包括一颗全向麦克风时，该全向麦克风位于智能眼镜镜框的鼻梁或鼻托中。当麦克风阵列中包括两颗全向麦克风时，这两颗全向麦克风可以分别位于智能眼镜的两个镜腿中；或者，这两颗全向麦克风分别位于智能眼镜的镜框两侧靠近两个镜腿的位置。当麦克风阵列中包括多颗全向麦克风时，多颗全向麦克风分布在智能眼镜的中间区域以及两侧区域，其中，中间区域包括智能眼镜镜框的鼻梁和/或鼻托；两侧区域包括智能眼镜的两个镜腿和/或智能眼镜的镜框两侧靠近两个镜腿的位置。示例性的，当麦克风阵列中包括3颗全向麦克风时，这3颗全向麦克风中的其中2颗可以分别位于智能眼镜靠近镜腿两侧的镜框中，或者其中2颗麦克风位于智能眼镜靠近镜框两侧的镜腿中，另外1颗麦克风位于智能眼镜镜框的鼻梁或鼻托。
根据全向麦克风的数量进行位置的设置,便于麦克风阵列中的全向麦克风能够尽可能的从多个方向均衡的拾取声音,以获取丰富、范围较广的音频信号或噪声,根据不同的实际应用需求,可以利用全向麦克风获取的音频信号或噪声对指向性麦克风采集的音频信号进行降噪、增强处理,以提升指向性麦克风的拾音质量,进一步提升智能眼镜的拾音性能。
在实际应用过程中,为了节约智能眼镜的电量,提升用户体验,同时延长智能眼镜的使用寿命,当穿戴设备检测到目标拾音方向时,穿戴设备开启麦克风阵列中指向目标拾音方向的麦克风,并关闭麦克风阵列中未指向目标拾音方向的麦克风。
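A minimal sketch of the direction-based gating described above (illustrative only: the beam-axis representation, the cosine threshold and all function names are assumptions, not taken from this application):

```python
import numpy as np

def select_mics_by_direction(beam_axes, target_dir, cos_threshold=0.7):
    """Return a boolean on/off mask: keep only the microphones whose beam
    axis points close enough to the target pickup direction.

    beam_axes:  (M, 3) array, one unit vector per directional microphone.
    target_dir: (3,) vector of the configured target pickup direction.
    """
    beam_axes = np.asarray(beam_axes, dtype=float)
    target_dir = np.asarray(target_dir, dtype=float)
    target_dir = target_dir / np.linalg.norm(target_dir)
    # A figure-8 microphone is equally sensitive front and back, so the
    # absolute value of the cosine similarity is compared.
    cos_sim = np.abs(beam_axes @ target_dir)
    return cos_sim >= cos_threshold

# Example: three orthogonal figure-8 microphones along x, y, z; the target
# pickup direction points roughly toward the wearer's mouth.
mask = select_mics_by_direction(np.eye(3), np.array([0.0, 0.3, -0.95]))
print(mask)  # [False False  True]: only the z-axis microphone stays on
```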
为了能够尽可能避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,增强拾音效果,还可以当智能眼镜检测到麦克风阵列中存在满足预设条件的第一指向性麦克风时,开启第一指向性麦克风,并关闭麦克风阵列中的其他指向性麦克风。预设条件可以为第一指向性麦克风在预设时间段内拾取到的声音信号的信号质量大于其他指向性麦克风。预设条件可以根据不同的实际应用需求而设置,本申请对此不作任何 限定。
应理解,声音信号的信号质量参数包括但不限于声音信号的响度及声音信号的信噪比。
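The preset-condition check can be sketched in the same spirit. In the snippet below, signal quality over the preset window is approximated by an SNR estimate whose noise floor is a low percentile of the sample energies; both choices are illustrative stand-ins, since the application leaves the quality metric open:

```python
import numpy as np

def best_mic_by_snr(frames):
    """frames: (M, T) array holding, for each of M directional microphones,
    the samples picked up over the preset time window.  Returns the index
    of the microphone with the highest estimated SNR."""
    snrs = []
    for x in frames:
        energy = x.astype(float) ** 2
        signal = energy.mean()
        noise = np.percentile(energy, 10) + 1e-12  # crude noise-floor estimate
        snrs.append(10.0 * np.log10(signal / noise))
    return int(np.argmax(snrs))
```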
还应注意的是,随着技术的不断发展,麦克风的类型越来越多,针对各种不同的应用场景,使用到的指向性麦克风的类型可能不同,本申请对不同应用场景中指向性麦克风的具体类型不作限定。
此外,根据不同的实际应用需求,指向性麦克风还可以设置在其他拾音设备中,例如耳机、智能头盔等具有拾音功能的设备中。本申请对此没有任何限定。
本申请实施例还提供一种拾音方法,可以通过灵活调节拾音方向,以在特定方向上对原始声音信号进行增强。从而提高特定方向上声音信号的可懂度、音质和清晰度。下面结合几种可能的场景对本申请实施例提供的拾音方法进行示例性的说明。
场景一、该拾音方法可以应用于电子设备，由电子设备自主拾音的场景。其中，电子设备也可被称为终端设备或者移动设备，又或者终端。该电子设备为具有拾音功能和界面显示的设备，包括但不限于手持设备、车载设备、计算设备或者安装有指向性麦克风的其他设备，例如，电子设备可以包括手机(phone)、个人数字助理(personal digital assistant)、平板电脑、车载电脑、膝上电脑(laptop computer)、智慧屏、超级移动个人计算机(ultra-mobile personal computer,UMPC)、穿戴设备以及其他具有拾音功能和显示功能的电子设备。
如图19所示为本申请实施例提供的一种拾音方法的流程示意图,参见图19,该拾音方法包括以下步骤:
S2101,响应于第一操作,显示第一界面,第一界面用于配置拾音方向。
在本申请实施例中,第一操作可以是用户在电子设备的显示屏上输入的点击操作、触摸操作、滑动操作;也可以是用户通过在电子设备上的物理按键输入的控制操作;还可以是用户通过电子设备的摄像头或者其他传感器检测到的隔空手势等。
例如,电子设备的设置页面或者桌面上显示有“拾音设置”按钮。示例性的,如图20中(a)所示,电子设备的桌面上显示有“拾音设置”按钮,用户通过点击该按钮后,电子设备的屏幕显示系统直接显示第一界面,以进行默认拾音方向的设置。
或者,用户在点击该按钮后,电子设备也可以显示拾音场景设置界面,用于设置可以直接启动第一界面进行拾音设置的场景。例如,在来电被接通的场景中是否启动拾音设置,或者录音开启的场景中是否启动拾音设置,或者在免提(也可以称为扩音或外放)场景中是否启动拾音设置等。设置完成后,电子设备检测到对应的场景被触发后,电子设备的屏幕显示系统自动显示第一界面,其中,对应场景的触发即为电子设备响应的第一操作。
应理解,如图20中(b)所示,可以设置的拾音设置场景包括但不限于录音场景、通话场景、录像场景以及会议场景,其中通话场景可以是语音通话场景,也可以是视频通话场景,当然,还可以是会议通话场景。
例如,针对录音场景,可以是在电子设备检测到用户点击录音按钮,启动录音时,直接跳转至第一界面。例如,如图21中(a)所示,在通话界面上,用户点击录音功能按钮后,电子设备的屏幕显示系统进入第一界面;或者在当用户点击电子设备上显 示的录音功能按钮时,电子设备的屏幕显示系统进入第一界面。
或者,如图21中(b)所示,用户点击电子设备的桌面上显示的录音应用程序对应的录音功能按钮后,显示录音界面,在录音界面中显示有拾音增强按钮,当用户需要对本地录音进行指定方向增强时,用户可以点击拾音增强按钮,以使得电子设备检测到用户点击上述拾音增强按钮后,电子设备的屏幕显示系统跳转第一界面,然后基于录音启动操作启动录音功能,实现对本地录音中声音信号的增强处理。
又例如,如图22所示的通话界面中,当电子设备检测到用户点击的外放(也称扩音或者免提)等按钮后,电子设备的屏幕显示系统进入第一界面。
又例如,参见图23,电子设备在来电(也称通话)后显示如图23中(a)所示的界面,用户可以在电子设备的屏幕显示系统上执行如图23中(b)所示的滑动操作以接通来电,在来电被接通后,电子设备的屏幕显示系统上直接显示第一界面。
又例如,在如图24所示的来电被接通的场景中,用户点击如图24所示的“拾音增强”功能按钮后,电子设备的屏幕显示系统进入第一界面。
在其中一种可能的实施方式中,在如图25所示的录像界面中,用户点击“拾音增强”功能按钮后,电子设备的屏幕显示系统上直接显示第一界面;或者,在用户直接点击电子设备的桌面上显示的录像应用程序对应的录像功能按钮后,电子设备的屏幕显示系统进入第一界面。
在另一种可能的实施方式中,在如图26所示的会议界面中显示有“拾音增强”配置按钮,用户点击该配置按钮后电子设备的屏幕显示系统进入第一界面;或者,在用户直接点击会议功能按钮,启动会议功能后电子设备的屏幕显示系统上直接显示第一界面。
如图27所示为本申请实施例提供的一种第一界面的示意图,第一界面上可以包括用于增强佩戴人声音信号的第一开关按钮2701、辅听增强的手动添加按钮2702(和/或滑动条上的定位按钮2703)、声音信号方向展示图2704以及可以转换不同视角的点击按钮2705等。
其中,第一开关按钮2701用于打开或关闭增强佩戴人的声音信号;辅听增强的手动添加按钮2702(或者滑动条上的定位按钮2703)用于确定增加或者减少待增强的声音信号及对应声音信号的方向信息;声音信号方向展示图2704用于展示模拟的拾音环境,包括佩戴人的头部以及以佩戴人头为中心的拾音环境。不同视角的点击按钮2705可以用于切换声音信号方向展示图2704中佩戴人的不同角度。
应该理解的,根据不同实际应用场景,第一界面中可以增加上述示例中的显示内容,或者减少上述示例中的部分显示内容,本申请对第一界面中展示的内容不作任何限定。
不难理解的,当电子设备为例如智能手机、智能手表、平板电脑等具有显示屏的设备时,该电子设备可以响应于第一操作,在电子设备的显示屏上显示用于配置拾音方向的第一界面。当电子设备是例如增强现实、虚拟现实等以投屏、投影等方式显示图像的设备时,可以响应于隔空手势而显示第一界面。
S2102,响应于在第一界面上检测到的第二操作,确定目标拾音方向。
其中,目标拾音方向用于增强指定方向的原始声音信号。下面结合如图27所示的 第一界面,对如何确定目标拾音方向进行介绍。
当目标拾音方向为佩戴人的人声方向时,电子设备可以基于第一界面,响应于用户对第一开关按钮2701的点击或滑动操作,来打开或者关闭增强佩戴人的声音信号。参见图27中第一开关按钮2701所示的状态表示打开增强佩戴人声音信号,图28中第二开关按钮2706所示的状态表示关闭增强佩戴人声音信号,其中,第一开关按钮2701与第二开关按钮2706可以为同一个开关按钮。
当目标拾音方向不是佩戴人的人声方向时,用户可以通过如图27所示的手动添加按钮2702或者滑动条上的定位按钮2703增加目标拾音方向。用户还可以基于声音信号方向展示图2704,通过第一手势切换中佩戴人的角度,然后再通过第二手势增加或者减少待增强的声音信号的方向;或者通过点击按钮2705切换声音信号方向展示图2704中佩戴人的不同角度,再基于声音信号方向展示图2704,通过第二手势增加或者减少待增强的声音信号的方向。
例如,上述第一手势可以是如图29中A所示的旋转手势;第二手势可以是如图29中E所示的长按手势。应理解,根据不同的使用设置,上述第一手势和第二手势可以相同,也可以不同。在保证第一手势和第二手势不同的情况下,第一手势和/或第二手势可以是图29中A-Z1所示的任一种可能的手势,这里不再一一举例。
需要说明的是,除可以通过上述示例确定目标拾音方向外,还可以通过隔空手势或者其他控制命令来确定目标拾音方向,对此本申请不作任何限定。
值得说明的是,目标拾音方向可以包括一个或者一个以上。例如,目标拾音方向可以包括佩戴人的人声方向,和一个通过辅听增强设置的其他方向。
S2103,获取原始声音信号。
电子设备通过内置的麦克风阵列获取到环境中的原始声音信号,其中,麦克风阵列中可以包括至少一个指向性麦克风,也可以包括至少一个指向性麦克风和至少一个全向麦克风,在不同的应用场景下,麦克风阵列中还可以包括至少一个全向麦克风。
应理解,在实际应用中,电子设备可以根据目标拾音方向,开启指向目标拾音方向的指向性麦克风,关闭未指向目标拾音方向的指向性麦克风,利用开启且指向目标拾音方向的指向性麦克风采集原始声音信号,这样不仅可以节约电子设备的电量,提升用户体验,同时延长智能眼镜的使用寿命,而且根据目标拾音方向开启指向目标拾音方向的麦克风,并关闭其他麦克风,也可以尽可能避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,增强麦克风的拾音效果。在实际应用中,还可以利用各个指向性麦克风呈现的打开或关闭状态进一步实现不同类型的拾音效果,以提升电子设备的使用性能。
S2104,根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号。
应理解,上述增强处理后的增强声音信号可以用于播放,也可以用于存储,还可以用于转发至其他设备等。作为示例而非限定,针对拾音场景,增强处理后的增强声音信号可以用于播放,更好的帮助佩戴助听设备的人听取声音信号;针对录音场景,增强处理后的增强声音信号可以用于存储,以便于用户后期重复听取;针对通话场景,增强处理后的增强声音信号可以用于将其发送至通话端设备;针对录像场景,增强处 理后的增强声音信号可以用于替换录制的原始视频中的原始声音信号,以便于用户后期查看录制的视频时能够听到增强后的声音信号,提升用户体验;针对会议场景,增强处理后的增强声音信号可以用于将其发送至会议方设备,便于更好的交流与沟通等等。根据不同的实际应用场景,增强处理后的增强声音信号的用途不同,本申请对此不作任何限定。
本申请实施例中,对原始声音信号中位于目标拾音方向上的第一声音信号的增强处理包括对声音强度的提升和/或对声音信号的降噪处理,以提高特定方向上声音信号的可懂度、音质和清晰度。
其中,如图30所示是本申请实施例提供的一种对声音信号降噪提取过程的示意性流程图,该声音信号降噪提取过程是指根据目标拾音方向,对原始声音信号进行降噪提取,得到滤除了较多噪音后的声音信号。参见图30,该降噪过程包括:第一步:基于麦克风阵列中获取原始声音信号。
第二步:根据目标方向,将获取到的声音信号对应转换为导向矢量声音信号。将获取到的声音信号转换为导向矢量声音信号的方法包括但不限于利用波束形成器和广义旁瓣消除器(generalized sidelobe canceller,GSC)对获取的声音信号进行处理,得到以目标方向为引导的导向矢量声音信号,或者利用盲源分离(Blind source separation,BSS)技术,结合目标方向对获取的声音信号进行处理,得到与目标方向对应的导向矢量声音信号。
应理解,该步骤本质上是对指向性麦克风采集的声音信号作预处理,实现多源的不同声音信号的分离,消除目标方向外的噪声,提取到目标声音信号的同时达到抑制噪声的目的。
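As a minimal illustration of this preprocessing step, the snippet below forms a fixed first-order beam steered at the target direction for a single acoustic vector sensor (one omni channel plus three orthogonal figure-8 channels) in the STFT domain. This fixed beam would be only the upper branch of a GSC; the adaptive noise-cancelling branch and the BSS alternative named in the text are omitted, and the channel layout is an assumption:

```python
import numpy as np

def avs_steered_output(Xw, Xx, Xy, Xz, azimuth, elevation=0.0):
    """Xw, Xx, Xy, Xz: complex STFT arrays (freq x frames) of the omni
    channel and the three orthogonal figure-8 channels.  Returns a
    cardioid-like output whose main lobe follows (azimuth, elevation)."""
    ux = np.cos(elevation) * np.cos(azimuth)
    uy = np.cos(elevation) * np.sin(azimuth)
    uz = np.sin(elevation)
    # Omni channel plus the dipole projection onto the steering direction.
    return 0.5 * (Xw + ux * Xx + uy * Xy + uz * Xz)
```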
第三步:对扩散场噪声抑制。其中,扩散场是指声音信号的能量密度均匀,在各个传播方向上作无规分布的声场。扩散场噪声即来自声场全空间各个方向的声音信号,例如,空调制冷或制热发出的声音信号等。
在本申请实施例中,可以根据来自不同通道的声音信号的能量关系对导向矢量声音信号(或者指向性麦克风采集的声音信号)进行扩散场噪声的抑制。
以指向性麦克风阵列(AVS)为例,可以根据声音信号到达同一个AVS中的各个通道的能量关系来确定声音信号是直达声还是扩散场噪声,具体地,当声场空间属于理想的扩散场时,全向通道与x、y、z三个轴线通道采集的声音信号满足以下公式(1):
$X_w^2 = X_x^2 + X_y^2 + X_z^2$     (1)
其中,理想的扩散场是指采集来自声场空间各个方向的声音信号的能量相同,但声音信号互不相关的声场。上述公式(1)中,Xw表示全通道采集的声音信号,Xx、Xy、Xz分别表示x、y、z三个轴线通道采集的声音信号。
根据上述公式(1)不难看出,当声场空间中仅存在位于三个轴线通道中其中一个通道的点声源时,以x轴线通道为例,则全通道采集的声音信号与x轴线通道采集的声音信号满足以下公式(2):
$X_w = X_x$     (2)
应理解,当声场空间中仅存在位于三个轴线通道中的y轴线、z轴线或者其他三 维空间任一方向上的点声源均满足与上述公式(2)类似的条件。这样,可以根据各通道间采集声音信号的能量关系来判断每个时频点(由时间和频率共同确定的点)AVS采集到的声音信号是点声源还是扩散场噪声,即以下公式(3):
示例性的,以电子设备为上述图15所示的智能眼镜为例,该智能眼镜的麦克风阵列中包括4颗共点指向性麦克风,4颗麦克风中1颗为全向麦克风,3颗为8字型麦克风,3颗8字型麦克风形成的声音信号波束方向互相正交,在3颗8字型麦克风的单个声音信号的接收强度与全向麦克风接收该声音信号的强度相等的情况下,则全向麦克风接收声音信号的强度Xw1与8字型麦克风接收声音信号的强度Xx1满足以下公式(4):
$X_{w1}^2 = 3X_{x1}^2$     (4)
通过上述公式(4)即可确定声音信号是否属于点声源。
在实际对扩散场噪声抑制的过程中,可以进一步对上述公式(3)做映射转换,以对扩散场噪声进行滤波抑制,其中,映射转换的方法包括但不限于高斯分布或者均匀分布。
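Formula (3) is not reproduced in the text above, so the sketch below is only one plausible instantiation of the test implied by formulas (1) and (2): per time-frequency point it compares the acoustic intensity vector with the channel energies (a directness measure) and applies the Gaussian mapping mentioned above. The measure, the width parameter and the names are all assumptions:

```python
import numpy as np

def directness_gain(Xw, Xx, Xy, Xz, sigma=0.3):
    """Per-bin gain in [0, 1]: near 1 for point-source-like bins, small
    for diffuse-field bins.  Inputs are complex STFT arrays."""
    Ix = np.real(np.conj(Xw) * Xx)
    Iy = np.real(np.conj(Xw) * Xy)
    Iz = np.real(np.conj(Xw) * Xz)
    intensity = np.sqrt(Ix**2 + Iy**2 + Iz**2)
    energy = np.abs(Xw) * np.sqrt(
        np.abs(Xx)**2 + np.abs(Xy)**2 + np.abs(Xz)**2) + 1e-12
    # 0 for a single plane wave, approaching 1 for an ideal diffuse field.
    diffuseness = 1.0 - intensity / energy
    return np.exp(-(diffuseness ** 2) / (2.0 * sigma ** 2))
```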
第四步:作非线性波束处理,以实现声音信号的定向采集,抑制除目标方向之外的其他方向的声音信号的干扰。
作非线性波束处理可以采用声音信号的方位估计或者空间聚类估计等方法。其中,采用方位估计的方法本质上是通过AVS采集的声强度矢量来计算每个时频点的到达方向来估计声音信号的方向,以滤除不满足目标方向的声音信号。
具体地,方位估计方法中,每个AVS采集的声强度矢量可以用以下公式(5)表示。以下公式(5)中,(f,n)表示频点为f、帧数为n的时频点,Xw表示全通道采集的声音信号,Xx、Xy、Xz分别表示x、y、z三个轴线通道采集的声音信号。
对应至该时频点的方位则通过以下公式(6)确定。
上述公式(6)中,R(*)表示取实部。根据上述公式(6)计算得到该时频点的方位后,将该时频点的方位与目标方向进行比较,然后利用高斯函数将比较结果映射成对应滤波器的系数,以此来抑制除目标方向之外的其他方向的声音信号。
作为示例而非限定的,假设根据上述公式(6)确定该时频点的方位与目标方向之间的相差0°,则可以认为该时频点的方位与目标方向一致,也就是说该时频点对应的 声音信号为目标声音信号(或者说该时频点对应的声音信号是目标声音信号的概率较大),从而可以将对应映射至滤波器的系数确定为1,以使该时频点对应的声音信号可以保留至滤波器中参与滤波;反之,例如,根据上述公式(6)确定该时频点的方位与目标方向之间的相差180°,则可以认为该时频点的方位与目标方向不一致,或者说该时频点对应的声音信号为噪声的可能性较大,这样可以将该时频点对应映射至滤波器的系数确定为0,以滤除该声音信号。在该示例中,时频点的方位与目标方向的比较结果以及对应映射至滤波器的系数等参数可以根据实际应用情况进行设置,本申请对此不作限定。
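A compact sketch of this azimuth-based suppression, restricted to the horizontal plane for brevity. The atan2 estimator stands in for the elided formulas (5) and (6), and the Gaussian width is an illustrative parameter; as in the example above, a bin aligned with the target direction gets a coefficient near 1 and a bin 180° away gets a coefficient near 0:

```python
import numpy as np

def direction_mask(Xw, Xx, Xy, theta_target, sigma_deg=20.0):
    """Per time-frequency mask built from an azimuth estimate."""
    az = np.arctan2(np.real(np.conj(Xw) * Xy),
                    np.real(np.conj(Xw) * Xx))
    # Wrapped angular distance between the bin azimuth and the target.
    diff = np.angle(np.exp(1j * (az - theta_target)))
    sigma = np.deg2rad(sigma_deg)
    return np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
```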
而采用空间聚类估计方法是利用声音信号的方位信息,将拾音环境模拟为一个球面(即如图31所示的拾音环境模拟球),通过对声音信号进行空间特征计算(或者声音信号距离球面的距离等),滤除不是目标方向上的声音信号,从而实现对目标方向上声音信号的提取。
应该理解,如图31所示为本申请实施例提供的对声音信号进行空间特征聚类的示意图,参见图31,用拾音环境模拟球来模拟拾音环境,该拾音环境模拟球的球面上的点是对应映射在球面的若干声音信号。通过将若干声音信号对应映射至拾音环境模拟球的球面上,对不在球面上扇形面内的声音信号进行抑制,可以提取特定方向的声音信号。示例性的,根据图31所示的声音信号可得,经过空间特征聚类后的声音信号集中在X=0且Y=1的方向上,可以对不在该方向上的声音信号进行抑制,从而提取到在X=0且Y=1的方向上的声音信号。
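The sector selection on the simulated pickup sphere can be sketched as follows; the 25° sector width is an assumed parameter, and in the Fig. 31 example the target direction would be the X=0, Y=1 axis:

```python
import numpy as np

def sector_filter(directions, target, max_angle_deg=25.0):
    """directions: (N, 3) unit vectors, one per time-frequency point,
    mapped onto the simulated pickup sphere.  Returns a boolean mask of
    the points falling inside the spherical sector around the target."""
    directions = np.asarray(directions, dtype=float)
    target = np.asarray(target, dtype=float)
    target = target / np.linalg.norm(target)
    cos_angle = directions @ target
    return cos_angle >= np.cos(np.deg2rad(max_angle_deg))
```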
当电子设备中包括至少一个指向性麦克风时,可以根据电子设备中指向性麦克风的数量对输出的声音信号作进一步的处理。若电子设备的麦克风阵列中包括一个指向性麦克风时,则可以根据如图30所示的降噪过程对声音信号的导向矢量转换、扩散场噪声抑制、非线性波束处理后提取到目标声音信号。若为了增加声音信号的识别准确度,进一步丰富电子设备的功能,在电子设备中可以设置两个或者两个以上的指向性麦克风,这种情况下,可以参见图32所示的降噪过程,对经导向矢量转换、扩散场噪声抑制以及非线性波束处理后的声音信号进行相关性处理,相关性处理是对得到的多个声音信号之间的相似性进行比较,从而从多个声音信号中确定待输出的声音信号。
为了进一步滤除掉经导向矢量转换、扩散场噪声抑制以及非线性波束处理后的声音信号中的噪声,进一步降低噪声对目标方向的声音信号的影响,在一种可能的实施方式中,利用后置滤波器对声音信号作进一步处理。这样通过对麦克风阵列中的指向性麦克风获取声音信号进行声音信号的导向矢量转换、扩散场噪声抑制、非线性波束以及后置滤波器处理后可以提取到较为准确的目标声音信号。
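The post-filter itself is not specified; a single-channel Wiener rule is a common choice and is shown here purely as an assumed stand-in:

```python
import numpy as np

def wiener_postfilter(Y, noise_psd, floor=0.1):
    """Y: complex STFT of the beamformer output; noise_psd: running
    estimate of the residual-noise power per bin.  Returns the filtered
    STFT, with the gain floored to limit musical noise."""
    Syy = np.abs(Y) ** 2
    gain = np.maximum(1.0 - noise_psd / (Syy + 1e-12), floor)
    return gain * Y
```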
经过上述几个步骤即可对指向性麦克风阵列获取的声音信号进行处理,提取到抑制了扩散场噪声以及其他非目标方向上噪声的声音信号。可选地,在对指向性麦克风阵列获取的声音信号进行处理的过程中,还可以采用语音活性检测(Voice Activity Detection,VAD)或者语音存在概率(Speech Presence Probability,SPP)等方法从采集到的声音信号中识别和消除处于静音状态的声音信号,以便加快声音信号的拾取速度,提升拾音速率。
值得说明的是,为了避免滤噪处理对声音信号的影响,提升拾音的准确性,在本 申请实施例中,在第一步基于麦克风阵列中的指向性麦克风获取到声音信号后,利用VAD或SPP对获取到的声音信号进行处理,直接从指向性麦克风获取到的声音信号消除处于静音状态的声音信号,加速声音信号的提取。在另一种可能的实施方式中,也可以如图33所示在对声音信号处理完成后,再利用VAD或SPP对处理完成的声音信号进行处理,以最终输出提取完成的声音信号。当然,利用VAD或SPP以消除处于静音状态的声音信号的步骤,可以根据不同的声音信号提取方法或者不同的适应场景而灵活调整该步骤,本申请对此不作任何限定。
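A fixed-threshold energy detector is the simplest stand-in for the VAD/SPP step referred to above (practical VAD/SPP estimators are statistical; this sketch only shows where silent frames would be identified and dropped):

```python
import numpy as np

def energy_vad(frames, threshold_db=-40.0):
    """frames: (T, L) array of time-domain frames.  Returns a boolean
    voice/silence decision per frame, relative to the loudest frame."""
    power = (frames.astype(float) ** 2).mean(axis=1) + 1e-12
    level_db = 10.0 * np.log10(power / power.max())
    return level_db > threshold_db
```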
当然,在另外一种可能的实施方式中,还可以采用波束形成器和广义旁瓣消除器、盲源分离技术、扩散场噪声抑制、非线性波束、语音活性检测算法/语音存在概率算法中的至少一种方法对指向性麦克风阵列获取的原始声音信号进行降噪处理,得到降噪处理后的声音信号。本申请对此不作任何限定。
如图34所示为本申请实施例提供的利用两种方法对同一噪声环境中佩戴人声音信号提取效果对比示意图,其中,参见图34中(1)图所示为利用现有方法对噪声环境中佩戴人声音信号提取效果图,参见图34中(2)图所示为利用本申请提供的降噪方法对噪声环境中佩戴人声音信号提取效果图。需要说明的是,上述声音信号提取效果图中横坐标表示时间(未在图中示出),纵坐标表示频率,图中鲜亮的颜色表示在该时频点上的声音信号能量的强弱。颜色越鲜亮,图中背景的颜色越暗,则说明该时频点上提取的声音越好,也即对声音信号的降噪效果越明显。通过对比图34中(1)图和(2)图,不难发现通过对同一噪声环境下的佩戴人的声音信号进行提取,图34中(2)图所示的佩戴人的声音信号的谐波更明显,这也正说明了利用本申请提供的降噪方法能够有效的将佩戴人的声音信号和噪声进行分离,噪声抑制效果更好。
上述实施例提供的降噪方法中,首先以目标方向为引导,将麦克风阵列获取的声音信号转换为导向矢量信号,实现了从指向性麦克风采集的多通道的声音信号中分离与目标方向接近的声音信号,为声音信号的后续处理奠定了基础。然后对声音信号进行扩散场噪声的抑制,滤除了声音信号中来自全空间各个方向的扩散场噪声,使抑制了扩散场噪声后的声音信号更清晰。接着通过非线性滤波对声音信号作进一步处理,抑制了声音信号中除目标方向外的其他方向的声音信号,从而实现了声音信号的定向采集。然后对指向性麦克风获取的声音信号进行VAD/SPP处理,能够加快对声音信号降噪的处理速度,加之对处理后的声音信号进行后置滤波器以及相关性处理进一步滤除了处理后的声音信号中的残余噪声,保证了最终得到的声音信号的音质,进一步提高了拾音信噪比。
在一种可能的实现方式中,基于目标拾音方向,从原始有声音信号中增强与目标拾音方向对应的声音信号以后,得到增强后的声音信号,还可以对增强处理的声音信号进行空间渲染处理,空间渲染处理后的声音信号中有声音信号的方位信息,使用户能够通过双耳清晰的分辨声音的方位。其中,实现空间渲染效果的方法包括但不限于双耳时间差(Interaural Time Difference,ITD)或者双耳能级差(Interaural Level Difference,ILD)方法。
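A very coarse binauralization along the ITD/ILD lines mentioned above (a Woodworth-style delay plus an illustrative sine-law level pan, not a measured HRTF; the head radius and every name here are assumptions):

```python
import numpy as np

def render_itd_ild(mono, fs, azimuth_deg, head_radius=0.0875, c=343.0):
    """mono: 1-D float array (the enhanced signal); returns a (2, N)
    stereo array whose interaural delay and level follow the azimuth
    (0 deg = straight ahead, +90 deg = fully to the right)."""
    az = np.deg2rad(azimuth_deg)
    itd = head_radius / c * (az + np.sin(az))   # Woodworth ITD, seconds
    delay = int(round(abs(itd) * fs))           # whole-sample delay
    g = 0.5 * (1.0 + np.sin(az))                # 0: full left, 1: full right
    # Delay the ear farther from the source, attenuate per the pan law.
    left = np.pad(mono, (delay if itd > 0 else 0, 0)) * (1.0 - g)
    right = np.pad(mono, (delay if itd < 0 else 0, 0)) * g
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right])
```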
可选地,得到增强声音信号后,可以通过电子设备播放上述增强的声音,例如,通过电子设备内置的扬声器播放增强的声音;或者电子设备将增强后的声音信号发送 给播放设备进行播放,例如,可以通过音响播放增强的声音。又或者通过上述电子设备或播放设备存储上述增强的声音。
场景二,该方法可以应用于电子设备和拾音设备,由拾音设备采集原始声音信号,由电子设备进行目标拾音方向的设置。其中拾音设备可以是话筒、对讲机等等,还可以是上述实施例涉及的穿戴设备。电子设备可以是手机、个人数字助理、平板电脑、车载电脑、膝上电脑、智慧屏、超级移动个人计算机、穿戴设备以及其他能够与拾音设备通信的设备。在该场景中,该电子设备可以通过无线通信技术(例如蓝牙技术、红外射频技术、2.4G无线技术、超声波)等方式与拾音设备进行通信。例如,智能眼镜为拾音设备,手机为电子设备,智能眼镜可以通过无线通信技术与手机进行通信,在智能眼镜与手机连接成功后,智能眼镜和手机可以执行本申请实施例提供的拾音方法。
如图35所示为本申请实施例提供的另一种拾音方法的流程示意图,参见图35,该拾音方法包括以下步骤:
S1,电子设备响应于第一操作,显示第一界面,第一界面用于配置拾音方向。
在本申请实施例中,第一操作可以为用户在电子设备的显示屏上输入的点击操作、触摸操作、滑动操作;也可以是用户通过在电子设备上的物理按键输入的控制操作;还可以是用户通过电子设备的摄像头或者其他传感器检测到的隔空手势。
例如,参见图36所显示的界面,当电子设备连接到拾音设备时,电子设备可以自动显示第一界面,其中第一操作为连接操作或者配置操作。或者,在检测到用户点击拾音设置按钮时,显示第一界面。
可选的,针对第一操作其他示例性实施例,以及第一界面的相关示例说明,可以参考上述场景一中的相关描述。在此不再赘述。
S2,电子设备响应于在第一界面上检测到的第二操作,确定目标拾音方向。
应理解,目标拾音方向用于增强指定方向的原始声音信号。具体可以参见上述场景一中S2101-S2102的描述,此处不再赘述。
需要说明的是,在该场景中,对原始声音信号的增强处理可以由电子设备处理,也可以由拾音设备处理。
示例性的,如图35所示,在上述S1-S2之后,由电子设备对声音信号作增强处理的过程包括:
S3,电子设备接收拾音设备发送的原始声音信号。
S4,电子设备根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号。
具体可以参见上述场景一中S2103-S2104的描述,此处不再赘述。
得到增强的声音信号后,在另一种可选的实施方式中,可以通过电子设备播放上述增强的声音,例如,通过电子设备内置的扬声器播放增强的声音;或者电子设备将增强后的声音信号发送给播放设备进行播放,例如,可以通过音响播放增强的声音。又或者通过上述电子设备或播放设备存储上述增强的声音。
可选地,还可以由拾音设备对声音信号作增强处理,参见基于图35,如图37在上述步骤S1和S2之后,拾音方法还可以包括:
S5,电子设备向拾音设备发送目标拾音方向。
S6,拾音设备在目标拾音方向上获取目标声音信号。
应理解,获取的目标声音信号可以是根据目标拾音方向获取的增强处理后的声音信号;也可以是根据开启并指向目标拾音方向的麦克风拾取的声音信号;还可以是利用根据开启并指向目标拾音方向的麦克风拾取,且经增强处理后的声音信号。
拾音设备接收到电子设备发送的目标拾音方向后,使得拾音设备在后续拾音过程中可以直接根据目标拾音方向拾取目标声音信号,或者根据目标拾音方向对拾取到的原始声音信号进行信号增强处理,以获取到原始声音信号位于目标拾音方向的目标声音信号,从而有效提高最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。
在一种可能的实施方式中,步骤S6拾音设备在目标拾音方向上获取目标声音信号,可以包括:
S61,拾音设备采集原始声音信号。
S62,根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号,增强声音信号为目标声音信号。
基于该可能的实现方式,在获取原始声音信号后,根据目标拾音方向对原始声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号,这样可以根据不同的实际应用场景,灵活的调整目标拾音方向,得到增强处理后的与目标拾音方向对应的增强声音信号,避免了获取到的声音信号中掺杂其他全方向声音信号,提高了目标声音信号的清晰度,提升了目标声音信号的音质。
在另一种可能的实施方式中,步骤S6拾音设备在目标拾音方向上获取目标声音信号,也可以包括:
S63,根据目标拾音方向,开启指向目标拾音方向的麦克风,关闭未指向目标拾音方向的麦克风。
S64,利用开启的指向目标拾音方向的麦克风采集目标声音信号。
该可能的实施方式,一方面可以节约电子设备的电量,提升用户体验,同时延长智能眼镜的使用寿命,另一方面根据检测到的目标拾音方向开启指向目标拾音方向的麦克风,并关闭其他麦克风,也可以尽可能避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,增强麦克风的拾音效果。在实际应用中,还可以利用各个指向性麦克风呈现的打开或关闭状态进一步实现不同的拾音效果。
可选地,步骤S6拾音设备在目标拾音方向上获取目标声音信号,还可以包括:
S65,根据目标拾音方向,开启指向目标拾音方向的麦克风,关闭未指向目标拾音方向的麦克风。
S66,利用开启的指向目标拾音方向的麦克风采集原始声音信号。
S67,根据目标拾音方向,对原始声音信号进行增强处理,得到原始声音信号中位于目标拾音方向上的第一声音信号的增强声音信号,增强声音信号为目标声音信号。
基于该可能的实现方式中,拾音设备根据目标拾音方向,开启指向目标拾音方向的麦克风,并关闭其他麦克风,可以避免麦克风拾取到除目标拾音方向之外的其他方向上的噪声,减少获取到的原始声音信号中音质较强的杂音,增强麦克风的拾音效果, 进一步对开启的指向性麦克风获取的声音信号做增强处理,以得到与目标拾音方向对应的增强处理后的声音信号。这样可以避免获取到的声音信号中掺杂其他方向的声音信号,提高了增强处理后的声音信号的清晰度及音质,有效提高了最终拾取到的声音信号的信噪比,提高声音信号的可懂度,提升用户体验。
具体可以参见上述场景一中电子设备获取声音信号的实施例描述,此处不再赘述。
S7,拾音设备向电子设备发送目标声音信号。
可选地,可以通过电子设备播放上述目标声音信号,例如,通过电子设备内置的扬声器播放目标声音;或者电子设备将目标声音信号发送给播放设备进行播放,例如,通过音响播放目标声音。又或者通过上述电子设备或播放设备存储上述目标声音信号。
应理解,根据实际应用场景,拾音设备还可以是上述电子设备中的设备。当然,拾音设备或者电子设备还可以是面向未来技术的其他设备。本申请实施例对拾音设备及电子设备的具体类型不作任何限制。
下文将描述本申请提供的装置实施例。应理解,装置实施例的描述与方法实施例的描述相互对应,因此,未详细描述的内容可以参见上文方法实施例,为了简洁,这里不再赘述。
如图38为本申请提供的一种设备100的结构示意图,该设备100包括上述实施例中的电子设备以及拾音设备。参见图38,设备100可以包括处理器110,外部存储器接口120,内部存储器131,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本申请实施例示意的结构并不构成对设备100的具体限定。在本申请另一些实施例中,设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
作为举例,当设备100为手机或平板电脑时,可以包括图示中的全部部件,也可以仅包括图示中的部分部件。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
其中,控制器可以是设备100的神经中枢和指挥中心。控制器可以根据指令操作 码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
I2C接口是一种双向同步串行总线，包括一根串行数据线(serial data line,SDA)和一根串行时钟线(serial clock line,SCL)。在一些实施例中，处理器110可以包含多组I2C总线。处理器110可以通过不同的I2C总线接口分别耦合触摸传感器180K，充电器，闪光灯，摄像头193等。例如：处理器110可以通过I2C接口耦合触摸传感器180K，使处理器110与触摸传感器180K通过I2C总线接口通信，实现设备100的触摸功能。
I2S接口可以用于音频通信。在一些实施例中，处理器110可以包含多组I2S总线。处理器110可以通过I2S总线与音频模块170耦合，实现处理器110与音频模块170之间的通信。在一些实施例中，音频模块170可以通过I2S接口向无线通信模块160传递音频信号。
PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频模块170与无线通信模块160可以通过PCM总线接口耦合。
在一些实施例中,音频模块170也可以通过PCM接口向无线通信模块160传递音频信号。I2S接口和PCM接口都可以用于音频通信。
UART接口是一种通用串行数据总线，用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。
在一些实施例中,UART接口通常被用于连接处理器110与无线通信模块160。例如:处理器110通过UART接口与无线通信模块160中的蓝牙模块通信,实现蓝牙功能。在一些实施例中,音频模块170可以通过UART接口向无线通信模块160传递音频信号,实现通过蓝牙耳机播放音乐的功能。
MIPI接口可以被用于连接处理器110与显示屏194,摄像头193等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI),显示屏串行接口(display serial interface,DSI)等。在一些实施例中,处理器110和摄像头193通过CSI接口通信,实现设备100的拍摄功能。处理器110和显示屏194通过DSI接口通信,实现设备100的显示功能。
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器110与摄像头193,显示 屏194,无线通信模块160,音频模块170,传感器模块180等。GPIO接口还可以被配置为I2C接口,I2S接口,UART接口,MIPI接口等。
USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为设备100充电,也可以用于设备100与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。该接口还可以用于连接其他设备,例如AR设备等。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对设备100的结构限定。在本申请另一些实施例中,设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过设备100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为设备供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器131,外部存储器接口120,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。
在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。
在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处 理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。
无线通信模块160可以提供应用在设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
在一些实施例中,设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得设备100可以通过无线通信技术与网络以及其他设备通信。无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。例如本申请实施例中的APP的图标、文件夹、文件夹名称等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,设备100可以包括1个或N个显示屏194,N为大于1的正整数。
设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优 化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。镜头的焦段可以用于表示摄像头的取景范围,镜头的焦段越小,表示镜头的取景范围越大。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。
在本申请中,设备100可以包括2个或2个以上焦段的摄像头193。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。设备100可以支持一种或多种视频编解码器。这样，设备100可以播放或录制多种编码格式的视频，例如：动态图像专家组（moving picture experts group，MPEG）1，MPEG2，MPEG3，MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
在本申请实施例中,NPU或其他处理器可以用于对设备100存储的视频中的图像进行分析处理等操作。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器131可以用于存储计算机可执行程序代码,可执行程序代码包括指令。处理器110通过运行存储在内部存储器131的指令,从而执行设备100的各种功能应用以及数据处理。内部存储器131可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)。存储数据区可存储设备100使用过程中所创建的数据(比如音频数据,电话本等)。
此外,内部存储器131可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。
音频模块170用于将数字音频信号转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模 块设置于处理器110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。设备100可以通过扬声器170A收听音乐,或收听免提通话,例如扬声器可以播放本申请实施例提供的比对分析结果。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。设备100可以设置至少一个麦克风170C。在另一些实施例中,设备100可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,设备100还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口170D用于连接有线耳机。耳机接口170D可以是USB接口130,也可以是3.5mm的开放移动设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,设备100根据压力传感器180A检测触摸操作强度。设备100也可以根据压力传感器180A的检测信号计算触摸的位置。
在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器180B可以用于确定设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。在一些实施例中,设备100通过气压传感器180C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器180D包括霍尔传感器。设备100可以利用磁传感器180D检测翻盖皮套的开合。在一些实施例中,当设备100是翻盖机时,设备100可以根据磁传感器180D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
加速度传感器180E可检测设备100在各个方向上(一般为三轴)加速度的大小。当设备100静止时可检测出重力的大小及方向。还可以用于识别设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。设备100通过发光二极管向外发射红外光。设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定设备100附近有物体。当检测到不充分的反射光时,设备100可以确定设备100附近没有物体。设备100可以利用接近光传感器180G检测用户手持设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,设备100对电池142加热,以避免低温导致设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于设备100的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。在一些实施例中,骨传导传感器180M可以获取人体声部振动骨块的振动信号。骨传导传感器180M也可以接触人体脉搏,接收血压跳动信号。
在一些实施例中,骨传导传感器180M也可以设置于耳机中,结合成骨传导耳机。音频模块170可以基于骨传导传感器180M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于骨传导传感器180M获取的血压跳动信号解析心率信息,实现心率检测功能。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。设备100可以接收按键输入,产生与设备100的用户设置以及功能控制有关的键 信号输入。
马达191可以产生振动提示。马达191可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于显示屏194不同区域的触摸操作,马达191也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和设备100的接触和分离。设备100可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,设备100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在设备100中,不能和设备100分离。
参见图39,为本申请实施例的设备100的软件结构示意图。设备100中的操作系统可以是安卓(Android)系统,微软窗口系统(Windows),苹果移动操作系统(iOS)或者鸿蒙系统(Harmony OS)等。在此,以设备100的操作系统为鸿蒙系统为例进行说明。
在一些实施例中,可将鸿蒙系统分为四层,包括内核层、系统服务层、框架层以及应用层,层与层之间通过软件接口通信。
如图39所示,内核层包括内核抽象层(Kernel Abstract Layer,KAL)和驱动子系统。KAL下包括多个内核,如Linux系统的内核Linux Kernel、轻量级物联网系统内核LiteOS等。驱动子系统则可以包括硬件驱动框架(Hardware Driver Foundation,HDF)。硬件驱动框架能够提供统一外设访问能力和驱动开发、管理框架。多内核的内核层可以根据系统的需求选择相应的内核进行处理。
系统服务层是鸿蒙系统的核心能力集合,系统服务层通过框架层对应用程序提供服务。该层可包括系统基本能力子系统集、基础软件服务子系统集、增强软件服务子系统集以及硬件服务子系统集。
系统基本能力子系统集为分布式应用在鸿蒙系统的设备上的运行、调度、迁移等操作提供了基础能力。可包括分布式软总线、分布式数据管理、分布式任务调度、方舟多语言运行时、公共基础库、多模输入、图形、安全、人工智能(Artificial Intelligence,AI)、用户程序框架等子系统。其中,方舟多语言运行时提供了C或C++或JavaScript(JS)多语言运行时和基础的系统类库,也可以为使用方舟编译器静态化的Java程序(即应用程序或框架层中使用Java语言开发的部分)提供运行时。
基础软件服务子系统集为鸿蒙系统提供公共的、通用的软件服务。可包括事件通知、电话、多媒体、面向X设计(Design For X,DFX)、MSDP&DV等子系统。
增强软件服务子系统集为鸿蒙系统提供针对不同设备的、差异化的能力增强型软 件服务。可包括智慧屏专有业务、穿戴专有业务、物联网(Internet of Things,IoT)专有业务子系统组成。
硬件服务子系统集为鸿蒙系统提供硬件服务。可包括位置服务、生物特征识别、穿戴专有硬件服务、IoT专有硬件服务等子系统。
框架层为鸿蒙系统应用开发提供了Java、C、C++、JS等多语言的用户程序框架和能力(Ability)框架,两种用户界面(User Interface,UI)框架(包括适用于Java语言的Java UI框架、适用于JS语言的JS UI框架),以及各种软硬件服务对外开放的多语言框架应用程序接口(Application Programming Interface,API)。根据系统的组件化裁剪程度,鸿蒙系统设备支持的API也会有所不同。
应用层包括系统应用和第三方应用(或称为扩展应用)。系统应用可包括桌面、控制栏、设置、电话等设备默认安装的应用程序。扩展应用可以是由设备的制造商开发设计的、非必要的应用,如设备管家、换机迁移、便签、天气等应用程序。而第三方非系统应用则可以是由其他厂商开发,但是可以在鸿蒙系统中运行应用程序,如游戏、导航、社交或购物等应用程序。
PA提供后台运行任务的能力以及统一的数据访问抽象。PA主要为FA提供支持，例如作为后台服务提供计算能力，或作为数据仓库提供数据访问能力。基于FA或PA开发的应用，能够实现特定的业务功能，支持跨设备调度与分发，为用户提供一致、高效的应用体验。
多个运行鸿蒙系统的设备之间可以通过分布式软总线、分布式设备虚拟化、分布式数据管理和分布式任务调度实现硬件互助和资源共享。
基于上述各个实施例提供的拾音方法,本申请实施例还提供以下内容:
本实施例提供了一种计算机程序产品，该程序产品包括程序，当该程序被电子设备和/或拾音设备运行时，使得电子设备和/或拾音设备执行上述各实施例中示出的拾音方法。
本申请实施例提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,该计算机程序被处理器执行时实现上述各个实施例中示出的拾音方法。
本申请实施例提供一种芯片系统,该芯片系统包括存储器和处理器,该处理器执行存储器中存储的计算机程序,以实现控制上述电子设备执行上述各个实施例中示出的拾音方法。
应理解,本申请实施例中提及的处理器可以是中央处理单元(central processing unit,CPU),还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
还应理解,本申请实施例中提及的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器 (random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is merely given as an example. In practical applications, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or some of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only intended to distinguish them from one another, and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or described in one embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered as going beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the system embodiments described above are merely illustrative; for instance, the division of the modules or units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purposes of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may be completed by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above method embodiments may be implemented. The computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to the device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, in accordance with legislation and patent practice, computer-readable media may not be electrical carrier signals or telecommunications signals.
Finally, it should be noted that the above descriptions are only specific implementations of the present application, but the protection scope of the present application is not limited thereto; any variation or replacement within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (29)

  1. A wearable device, characterized in that the wearable device comprises a microphone array, the microphone array comprising at least one directional microphone;
    the pickup beam directions of the at least one directional microphone being mutually orthogonal.
  2. The wearable device according to claim 1, wherein the microphone array further comprises at least one omnidirectional microphone.
  3. The wearable device according to claim 1 or 2, wherein the wearable device is configured to: when the wearable device detects a target pickup direction, enable the microphones in the microphone array that point in the target pickup direction, and disable the microphones in the microphone array that do not point in the target pickup direction.
  4. The wearable device according to claim 1 or 2, wherein the wearable device is configured to: when it is detected that a first directional microphone satisfying a preset condition exists in the microphone array, enable the first directional microphone and disable the other directional microphones; the preset condition being that the signal quality of the sound signal picked up by the first directional microphone within a preset time period is higher than that of the other directional microphones.
  5. The wearable device according to any one of claims 1 to 4, wherein the wearable device is a pair of smart glasses.
  6. The wearable device according to claim 5, wherein, when the microphone array includes one omnidirectional microphone, the omnidirectional microphone is located in the nose bridge or a nose pad of the frame of the smart glasses.
  7. The wearable device according to claim 5, wherein, when the microphone array includes two omnidirectional microphones, the two omnidirectional microphones are respectively located on the two temples of the smart glasses; or the two omnidirectional microphones are respectively located at positions on the two sides of the frame of the smart glasses close to the two temples.
  8. The wearable device according to claim 5, wherein, when the microphone array includes multiple omnidirectional microphones, the multiple omnidirectional microphones are distributed over a middle region and two side regions of the smart glasses, the middle region including the nose bridge and/or nose pads of the frame of the smart glasses, and the two side regions including the two temples of the smart glasses and/or the positions on the two sides of the frame of the smart glasses close to the two temples.
  9. The wearable device according to any one of claims 1 to 8, wherein the directional microphone is a figure-eight (bidirectional) microphone.
  10. A sound pickup method, characterized in that the method is applied to an electronic device and comprises:
    in response to a first operation, displaying a first interface, the first interface being used to configure a pickup direction;
    in response to a second operation detected on the first interface, determining a target pickup direction.
  11. The sound pickup method according to claim 10, wherein the method further comprises:
    acquiring an original sound signal;
    performing enhancement processing on the original sound signal according to the target pickup direction, to obtain an enhanced sound signal of a first sound signal located in the target pickup direction in the original sound signal.
  12. The sound pickup method according to claim 11, wherein acquiring the original sound signal comprises:
    acquiring the original sound signal during recording;
    and, after the enhanced sound signal of the first sound signal located in the target pickup direction in the original sound signal is obtained by performing enhancement processing on the original sound signal according to the target pickup direction, the method further comprises:
    saving the enhanced sound signal.
  13. The sound pickup method according to claim 11, wherein acquiring the original sound signal comprises:
    acquiring the original sound signal during a call;
    and, after the enhanced sound signal of the first sound signal located in the target pickup direction in the original sound signal is obtained by performing enhancement processing on the original sound signal according to the target pickup direction, the method further comprises:
    sending the enhanced sound signal to the peer call device.
  14. The sound pickup method according to claim 11, wherein the original sound signal is the sound signal in a recorded original video, and, after the enhanced sound signal of the first sound signal located in the target pickup direction in the original sound signal is obtained by performing enhancement processing on the original sound signal according to the target pickup direction, the method further comprises:
    replacing the original sound signal in the original video with the enhanced sound signal.
  15. The sound pickup method according to any one of claims 11 to 14, wherein acquiring the original sound signal comprises:
    receiving the original sound signal sent by a sound pickup device.
  16. The sound pickup method according to any one of claims 10 to 15, wherein the method further comprises: sending the target pickup direction to a sound pickup device.
  17. The sound pickup method according to any one of claims 11 to 14, wherein the electronic device comprises a microphone array, the microphone array comprises at least one directional microphone, and the electronic device acquiring the original sound signal comprises:
    enabling, according to the target pickup direction, the directional microphone pointing in the target pickup direction, and disabling the directional microphones not pointing in the target pickup direction;
    collecting the original sound signal with the enabled directional microphone pointing in the target pickup direction.
  18. The sound pickup method according to claim 10, wherein the electronic device comprises a microphone array, the microphone array comprises at least one directional microphone, and the method further comprises:
    enabling, according to the target pickup direction, the directional microphone pointing in the target pickup direction, and disabling the directional microphones not pointing in the target pickup direction;
    collecting an original sound signal with the enabled directional microphone pointing in the target pickup direction.
  19. The sound pickup method according to any one of claims 10 to 18, wherein, before displaying the first interface in response to the first operation, the method further comprises:
    displaying a pickup scene setting interface;
    in response to a second operation detected on the pickup scene setting interface, enabling or disabling display scenes of the first interface, the display scenes including at least one of a recording scene, a call scene, a video recording scene, and a conference scene.
  20. A sound pickup method, characterized in that the method is applied to a sound pickup device and comprises:
    receiving a target pickup direction sent by an electronic device;
    acquiring a target sound signal in the target pickup direction.
  21. The method according to claim 20, wherein acquiring the target sound signal in the target pickup direction comprises:
    collecting an original sound signal;
    performing enhancement processing on the original sound signal according to the target pickup direction, to obtain an enhanced sound signal of a first sound signal located in the target pickup direction in the original sound signal, the enhanced sound signal being the target sound signal.
  22. The method according to claim 20, wherein acquiring the target sound signal in the target pickup direction comprises:
    enabling, according to the target pickup direction, the microphone pointing in the target pickup direction, and disabling the microphones not pointing in the target pickup direction;
    collecting the target sound signal with the enabled microphone pointing in the target pickup direction.
  23. The method according to claim 20, wherein acquiring the target sound signal in the target pickup direction comprises:
    enabling, according to the target pickup direction, the microphone pointing in the target pickup direction, and disabling the microphones not pointing in the target pickup direction;
    collecting an original sound signal with the enabled microphone pointing in the target pickup direction;
    performing enhancement processing on the original sound signal according to the target pickup direction, to obtain an enhanced sound signal of a first sound signal located in the target pickup direction in the original sound signal, the enhanced sound signal being the target sound signal.
  24. The method according to any one of claims 20 to 23, wherein the method further comprises: playing the target sound signal.
  25. The method according to any one of claims 20 to 24, wherein the method further comprises: sending the target sound signal to an audio playback device.
  26. A chip system, characterized in that the chip system comprises a processor, the processor executing a computer program stored in a memory to implement the method according to any one of claims 10 to 25.
  27. A device, characterized in that the device is configured to perform the method performed by the electronic device according to any one of claims 10 to 19; or is configured to perform the method performed by the sound pickup device according to any one of claims 20 to 25.
  28. The device according to claim 27, wherein the device is the wearable device according to any one of claims 1 to 9.
  29. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions which, when run on a computer device, cause the computer device to perform the method according to any one of claims 10 to 25.
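As a further non-authoritative illustration, claims 3, 4, and 22 above describe enabling the microphone pointing in the target pickup direction, or the first directional microphone whose picked-up signal quality within a preset time period is the best, and disabling the others. The Python sketch below shows one way such a selection rule could be expressed; the power-based quality metric and the enable/disable callbacks are assumptions of the sketch, not elements of the claims:

    import numpy as np

    def signal_quality_db(window, noise_floor=1e-6):
        # Crude quality proxy: mean power of the picked-up window over a fixed
        # noise floor, in decibels. A real device could use an SNR estimate instead.
        power = float(np.mean(window ** 2))
        return 10.0 * np.log10(power / noise_floor + 1e-12)

    def select_directional_mic(windows, enable, disable):
        # windows: dict mapping a microphone id to the samples it picked up
        # during the preset time period. enable/disable: callbacks that switch
        # an individual microphone on or off. Returns the id of the kept mic.
        best = max(windows, key=lambda mic: signal_quality_db(windows[mic]))
        for mic in windows:
            (enable if mic == best else disable)(mic)
        return best

A call such as select_directional_mic({'front': w1, 'left': w2, 'right': w3}, mic_on, mic_off), with hypothetical per-microphone sample windows and switching callbacks, would keep active only the microphone whose window has the highest quality, in the spirit of the preset condition of claim 4.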
PCT/CN2023/087315 2022-04-14 2023-04-10 Wearable device, sound pickup method and apparatus WO2023197997A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210393694.4 2022-04-14
CN202210393694.4A CN116962937A (zh) 2022-04-14 2022-04-14 Wearable device, sound pickup method and apparatus

Publications (1)

Publication Number Publication Date
WO2023197997A1 true WO2023197997A1 (zh) 2023-10-19

Family

ID=88328975

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087315 WO2023197997A1 (zh) 2022-04-14 2023-04-10 穿戴设备、拾音方法及装置

Country Status (2)

Country Link
CN (1) CN116962937A (zh)
WO (1) WO2023197997A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105072540A (zh) * 2015-09-01 2015-11-18 青岛小微声学科技有限公司 Stereo sound pickup device and stereo sound pickup method
US20180176679A1 (en) * 2016-12-20 2018-06-21 Verizon Patent And Licensing Inc. Beamforming optimization for receiving audio signals
CN108419168A (zh) * 2018-01-19 2018-08-17 广东小天才科技有限公司 Directional sound pickup method and apparatus for a sound pickup device, sound pickup device, and storage medium
CN111883160A (zh) * 2020-08-07 2020-11-03 上海茂声智能科技有限公司 Directional microphone array sound pickup and noise reduction method and apparatus
CN113301476A (zh) * 2021-03-31 2021-08-24 阿里巴巴新加坡控股有限公司 Sound pickup device and microphone array structure
CN113496708A (zh) * 2020-04-08 2021-10-12 华为技术有限公司 Sound pickup method and apparatus, and electronic device

Also Published As

Publication number Publication date
CN116962937A (zh) 2023-10-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 23787641
    Country of ref document: EP
    Kind code of ref document: A1