WO2021184920A1 - Sound masking method, apparatus, and terminal device - Google Patents

Sound masking method, apparatus, and terminal device

Info

Publication number
WO2021184920A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound signal
signal
sound
audio signal
masking
Prior art date
Application number
PCT/CN2020/141881
Other languages
English (en)
French (fr)
Inventor
宋贤高
吴融融
刘嘉禾
高俊平
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to MX2022011638A priority Critical patent/MX2022011638A/es
Priority to BR112022018722A priority patent/BR112022018722A2/pt
Priority to EP20926107.2A priority patent/EP4109863A4/en
Publication of WO2021184920A1 publication Critical patent/WO2021184920A1/zh
Priority to US17/947,600 priority patent/US20230008818A1/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/1752Masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/60Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6016Substation equipment, e.g. for use by subscribers including speech amplifiers in the receiver circuit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1781Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions
    • G10K11/17821Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions characterised by the analysis of the input signals only
    • G10K11/17823Reference signals, e.g. ambient acoustic environment
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1787General system configurations
    • G10K11/17873General system configurations using a reference signal without an error signal, e.g. pure feedforward
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/02Constructional features of telephone sets
    • H04M1/19Arrangements of transmitters, receivers, or complete sets to prevent eavesdropping, to attenuate local noise or to prevent undesired transmission; Mouthpieces or receivers specially adapted therefor
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/04Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K2210/00Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
    • G10K2210/10Applications
    • G10K2210/108Communication systems, e.g. where useful sound is kept and noise is cancelled
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2420/00Details of connection covered by H04R, not provided for in its groups
    • H04R2420/01Input selection or mixing for amplifiers or loudspeakers

Definitions

  • This application relates to the technical field of electronic equipment, and in particular to a method, device and terminal equipment for sound masking.
  • the present application provides a sound masking method, device and terminal equipment, which prevent a far-field observer from understanding the information in the sound leaked by the receiver, without affecting the call quality of the listener.
  • this application provides a sound masking method.
  • the method is applied to any terminal device that has a communication function and includes a receiver (earpiece) and a loudspeaker.
  • the method includes:
  • the masking sound signal is determined according to the audio signal
  • the masking sound signal is transmitted through the speaker, and the masking sound signal is used to mask, in the far field, the audio signal output by the receiver.
  • the speaker can be controlled to output the masking sound signal.
  • the masking sound signal is determined according to the audio signal, and, for the far field, the difference between the speaker-to-far-field distance and the receiver-to-far-field distance is small, so the audio signal can be effectively masked by the masking sound signal.
  • at the listener's ear, the attenuation of the audio signal is significantly smaller than that of the masking sound signal; instead, the masking sound signal is itself masked by the audio signal. Therefore, the masking sound signal emitted by the speaker will not interfere with the listener's ears, avoiding any impact on the listener's call quality.
  • the audio signal played by the receiver may include multiple types of audio signals, such as human voice signals, animal voice signals, or music.
  • the masking sound signal may be determined based on the audio signal.
  • the corresponding masking sound signal can be selected or matched from a pre-generated audio library according to the audio signal.
  • the masking sound signal may also be generated based on the real-time downlink audio signal.
  • based on spectral characteristic analysis, a pink noise signal or a white noise signal can be matched or generated as the masking sound signal.
  • determining the masking sound signal according to the audio signal may specifically include:
  • spectrum analysis is performed on the audio signal to obtain a spectral response, and the masking sound signal is generated based on the spectral response.
  • the masking sound signal is generated according to the spectrum response of the audio signal to ensure that the spectrum response of the masking sound signal and the audio signal match each other, so that the masking sound signal can efficiently mask the audio signal output by the receiver.
  • determining the masking sound signal according to the audio signal may specifically include:
  • the audio signal is intercepted according to a preset frame length to obtain sound segments, each segment is reversed in the time domain to obtain inverted sound, and the inverted sound is spliced directly, or spliced after being processed with a window function, to generate the masking sound signal.
  • because the masking sound signal is generated by time-domain reversal, it is difficult for an observer to understand; when the masking sound signal reaches the far field, the sound leaked by the receiver is masked by this hard-to-understand signal.
  • determining the masking sound signal according to the audio signal may specifically include:
  • the audio signal is intercepted according to a preset frame length to obtain sound segments; the sound segments are interpolated to obtain a supplemented sound signal, or subsequent segments are matched from a preset audio library to obtain the supplemented sound signal;
  • the corresponding masking sound signal is generated according to the supplemented sound signal.
  • the supplementary sound signal is obtained through interpolation or matching, which avoids frequent interception of sound clips, reduces the processing load of the terminal device, and improves the generation efficiency of the masked sound signal.
  • the method may further include:
  • the length of time required to generate the masking sound signal from the audio signal is obtained, and the audio signal is delayed by that length of time, so that the audio signal output by the receiver is aligned with the masking sound signal output by the speaker.
  • the audio signal and the masking sound signal are partially aligned or completely aligned.
  • the synchronization of the masking sound signal and the audio signal is ensured, and the masking effect is improved.
  • the method may further include:
  • phase inversion is performed on the masking sound signal to obtain an inverted sound signal; the inverted sound signal is subjected to amplitude reduction and then mixed with the audio signal to obtain a mixed sound signal;
  • the receiver outputs a mixed sound signal.
  • the mixed sound signal can offset the influence of the masking sound signal emitted by the speaker on the ears of the near-field listeners to a certain extent, and further improve the call quality on the basis of ensuring the privacy of the call.
  • the method transmits a masking sound signal through a speaker, which specifically includes:
  • the sound signal of the surrounding environment is detected, and when the amplitude of the surrounding sound signal is lower than a first preset threshold, the masking sound signal is emitted through the speaker. This reduces the risk that the leaked sound content is understood in a quiet environment.
  • the method transmits a masking sound signal through a speaker, which may specifically include:
  • when a downlink audio signal is detected and its amplitude is determined to be greater than a second preset threshold, the masking sound signal is emitted through the speaker. This avoids unnecessary noise interference to observers.
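  • As an illustration only (not part of the application), the two trigger conditions above could be checked on short audio frames as follows; the threshold values, the frame representation, and the combination of the two checks with a logical AND are my own assumptions, since the application presents them as alternative triggers.

```python
import numpy as np

# Illustrative thresholds (assumptions): frames are normalized PCM in [-1.0, 1.0].
AMBIENT_QUIET_THRESHOLD = 0.01    # stands in for the "first preset threshold"
DOWNLINK_ACTIVE_THRESHOLD = 0.02  # stands in for the "second preset threshold"

def rms(frame: np.ndarray) -> float:
    """Root-mean-square amplitude of one audio frame."""
    return float(np.sqrt(np.mean(np.square(frame))))

def should_emit_masking(ambient_frame: np.ndarray, downlink_frame: np.ndarray) -> bool:
    """Emit the masking signal when the environment is quiet (so leakage would be
    intelligible) and there is downlink audio worth masking."""
    quiet_environment = rms(ambient_frame) < AMBIENT_QUIET_THRESHOLD
    downlink_active = rms(downlink_frame) > DOWNLINK_ACTIVE_THRESHOLD
    return quiet_environment and downlink_active
```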
  • the time value range of the preset frame length is 10 ms-300 ms.
  • the phase range of the inversion processing is 90 degrees to 270 degrees.
  • the distance between the receiver and the speaker is greater than the width of the terminal device
  • the distance between the receiver and the speaker is greater than half the length of the terminal device
  • the distance between the receiver and the speaker is greater than 100mm;
  • the distance between the receiver and the speaker is at least 20 times the distance between the receiver and the listener's ears.
  • this application provides a sound masking device.
  • the device is applied to any terminal device that has a communication function and includes a receiver and a speaker.
  • the device includes: a judgment module, a determination module and a first control module.
  • the judging module is used to judge whether the terminal device uses the receiver as the output end of the audio signal
  • the determination module is used to determine the masking sound signal according to the audio signal when the judgment result of the judgment module is yes;
  • the first control module is used to control the speaker to emit a masking sound signal to mask the audio signal output by the receiver in the far field.
  • the determining module is configured to select or match the corresponding masking sound signal from a pre-generated audio library according to the audio signal when the audio signal is output by the receiver.
  • the determining module is configured to generate the masking sound signal in real time according to the audio signal.
  • the determining module may specifically include:
  • the spectrum analysis unit is used to perform spectrum analysis on the audio signal to obtain the spectrum response
  • the first generating unit is used to generate a masking sound signal according to the spectral response.
  • the determining module may specifically include:
  • the signal interception unit is used to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment
  • the signal reversal unit is used for time-domain reversal of the sound segment to obtain reversal sound
  • the second generating unit is configured to generate a corresponding masking sound signal according to the inverted sound.
  • the determining module may specifically include:
  • the signal interception unit is used to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment
  • the signal supplement unit is used to interpolate the sound fragments to obtain the supplemented sound signal; or to match the subsequent fragments from the preset audio library to obtain the supplemented sound signal;
  • the third generating unit is used to generate a corresponding masking sound signal according to the supplemented sound signal.
  • the device may further include:
  • the time length obtaining module is used to obtain the time length for generating the masking sound signal according to the audio signal
  • the delay module is used to delay the audio signal according to the length of time, so that the audio signal output by the receiver is compatible with the masked sound signal output by the speaker.
  • the foregoing apparatus may further include:
  • the inversion processing module is used to perform phase inversion processing on the masked sound signal to obtain an inverted sound signal
  • the mixing module is used to process the inverted sound signal by reducing the amplitude and then mix it with the audio signal to obtain the mixed sound signal;
  • the second control module is used to control the receiver to output a mixed sound signal.
  • the first control module may specifically include:
  • the first detection unit is used to detect the sound signal of the surrounding environment
  • the first judging unit is used to judge whether the amplitude of the surrounding environment sound is lower than the first preset threshold
  • the first control unit is configured to transmit a masking sound signal through the speaker when the judgment result of the first judgment unit is yes.
  • the first control module may specifically include:
  • the second detection unit is used to detect whether there is a downlink audio signal
  • the second determining unit is configured to determine whether the amplitude of the downstream audio signal is greater than the second preset threshold when the second detecting unit detects that there is a downstream audio signal;
  • the second control unit is used to transmit a masking sound signal through the speaker when the judgment result of the second judgment unit is yes.
  • the apparatus may further include:
  • the signal enhancement module is used to enhance the masking sound signal to obtain the enhanced masking sound signal, and then provide it to the speaker.
  • this application provides a terminal device.
  • the terminal device may be a mobile phone, tablet computer, personal digital assistant (PDA), point-of-sale (POS) terminal, vehicle-mounted computer, or any other terminal device that has a communication function and includes a receiver and a speaker.
  • the terminal equipment provided by the third aspect of this application includes: a receiver, a speaker, and a processor;
  • the processor is used to determine or generate a masking sound signal according to the audio signal when the audio signal is output by the receiver;
  • the loudspeaker is used to transmit the masking sound signal to mask the audio signal output by the receiver to the far field.
  • the processor is specifically configured to select or match the corresponding masking sound signal from a pre-generated audio library according to the audio signal when the audio signal is output by the receiver.
  • the processor is specifically configured to generate the masking sound signal in real time according to the audio signal.
  • the processor is specifically configured to perform spectrum analysis on the audio signal to obtain a spectrum response; and generate a masking sound signal according to the spectrum response.
  • the processor is specifically configured to intercept the audio signal according to the preset frame length to obtain sound segments; reverse the sound segments in the time domain to obtain inverted sound; and generate the corresponding masking sound signal according to the inverted sound.
  • the processor is specifically configured to intercept the audio signal according to the preset frame length to obtain sound segments; interpolate the sound segments to obtain a supplemented sound signal, or match subsequent segments from the preset audio library to obtain the supplemented sound signal; and generate the corresponding masking sound signal according to the supplemented sound signal.
  • the processor is also used to obtain the length of time required to generate the masking sound signal from the audio signal, and to delay the audio signal by that length of time, so that the audio signal output by the receiver is aligned with the masking sound signal output by the speaker.
  • the processor is also used to perform phase inversion on the masking sound signal to obtain an inverted sound signal, subject the inverted sound signal to amplitude reduction and mix it with the audio signal to obtain a mixed sound signal, and control the receiver to output the mixed sound signal.
  • the processor is specifically configured to detect the sound signal of the surrounding environment, and when the amplitude of the sound signal of the surrounding environment is lower than the first preset threshold, transmit the masking sound signal through the speaker.
  • the processor is specifically configured to, when a downlink audio signal is detected and its amplitude is determined to be greater than the second preset threshold, transmit the masking sound signal through the speaker.
  • the sound masking method provided in this application is applied to a terminal device with a receiver and a speaker.
  • when the terminal device uses the receiver as the output end of the audio signal,
  • the masking sound signal is determined based on the audio signal, and the speaker is then controlled to emit the masking sound signal. Since the masking sound signal is determined based on the audio signal, and the difference between the speaker's and the receiver's distances to the far field is small, the masking sound signal can better conceal the sound leaked by the receiver and prevent information leakage in the call voice.
  • because the masking sound signal and the audio signal are output by the speaker and the receiver respectively, and the distances from the speaker and the receiver to the listener's ear differ greatly when the listener listens with the receiver, the masking sound signal causes little interference to the listener's listening and does not affect the listener's call quality.
  • FIG. 1 is a schematic diagram of an application scenario of a sound masking method provided by an embodiment of the application
  • FIG. 2 is a schematic diagram of the distances between the first terminal device shown in FIG. 1 and the ear of the near-field listener and the ear of the far-field observer;
  • FIG. 3 is a flowchart of a sound masking method provided by an embodiment of the application.
  • FIG. 4 is a flowchart of another sound masking method provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a spectrum response obtained by performing a spectrum analysis on a sound signal according to an embodiment of the application
  • Figure 6 is a schematic diagram of masking sound signals and spectral characteristic curves of sound signals
  • FIG. 7 is a schematic diagram of signal processing provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of an intercepted sound segment provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of inverted sound of the sound segment shown in FIG. 8 after time domain inversion
  • FIG. 10 is a schematic diagram of a masked sound signal and an inverted sound signal provided by an embodiment of the application;
  • FIG. 11 is a schematic diagram of the comparison of the masked sound signal before and after enhancement provided by an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of a sound masking device provided by an embodiment of the application.
  • FIG. 13 is a schematic structural diagram of a signal generation module provided by an embodiment of the application.
  • FIG. 14 is a schematic structural diagram of another signal generation module provided by an embodiment of the application.
  • FIG. 15 is a schematic structural diagram of yet another signal generation module provided by an embodiment of the application.
  • FIG. 16 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • FIG. 17 is a hardware architecture diagram of a mobile phone provided by an embodiment of the application.
  • mobile communication equipment meets people's communication needs in a variety of scenarios with its portable characteristics. For example, people can use mobile communication devices to communicate with other people on crowded subways, in busy commercial streets, and in empty locker rooms.
  • when the receiver is used to listen to audio signals, if the listener is in a quiet room, the other party speaks too loudly, or the receiver's playback volume is high, sound leakage from the receiver is unavoidable.
  • the call may contain content that you do not want to be known by other people, and these content may involve private or confidential information. The problem of voice leakage from the receiver can easily lead to the disclosure of private or confidential information in the content of the call.
  • the way to avoid sound leakage by adjusting the receiver's playback volume is insufficiently controllable.
  • by the time the listener reduces the playback volume of the receiver, sound leakage may already have occurred.
  • although processing the audio signal to be played by the receiver can reduce its intelligibility to a certain extent, it also makes it difficult for the listener to hear the other party's speech. Therefore, this approach degrades the listener's call quality.
  • the embodiments of the present application provide a sound masking method, device, and terminal equipment.
  • when the terminal device uses the receiver as the output end of the audio signal, the masking sound signal (i.e., the masking sound) is determined according to the audio signal (i.e., the masked sound);
  • the speaker is controlled to emit the masking sound signal. Since the masking sound is determined based on the masked sound, the two are correlated, and the distance between the observer and the receiver is close to the distance between the observer and the speaker, so the masking sound can mask the masked sound from the observer.
  • the masking sound will not affect the listener's normal listening to the audio signal; instead, at the listener's ear, the masking sound signal emitted by the speaker is masked by the audio signal emitted by the receiver, ensuring that the listener's call quality is not disturbed.
  • a position whose distance from the sound source is greater than a critical value r_far (per the 1/r law) is defined as the far field, and a position whose distance from the sound source is less than or equal to r_far is defined as the near field.
  • the terminal device is used as the sound source, and the listener is set in the near field and the observer is in the far field.
  • FIG. 1 is a schematic diagram of an application scenario of a sound masking method provided in an embodiment of the application.
  • the first terminal device 101 and the second terminal device 102 establish a communication connection.
  • the sound masking method provided in the embodiment of the present application is applied to the first terminal device 101, and the second terminal device 102 serves as the opposite device of the first terminal device 101.
  • the first terminal device 101 may be any mobile communication device with communication functions, such as a mobile phone, a tablet computer, or a portable notebook computer.
  • FIG. 1 only the first terminal device 101 in the form of a mobile phone is shown as an example, and the specific type of the first terminal device 101 is not limited in the embodiment of the present application.
  • the first terminal device 101 includes a receiver 1011 and a speaker 1012.
  • the receiver 1011 and the loudspeaker 1012 of the first terminal device 101 may be arranged on two sides of the first terminal device 101, respectively.
  • the receiver 1011 is located on one side of the midline 1013 of the first terminal device 101 in the longitudinal direction
  • the speaker 1012 is located on the other side of the midline 1013.
  • Both the receiver 1011 and the speaker 1012 can output audio signals from the second terminal device 102 to the outside world.
  • the user (i.e., the listener) of the first terminal device 101 can choose to have either the receiver 1011 or the speaker 1012 output the audio signal.
  • the technical solution of the embodiment of the present application is mainly based on a scenario where the receiver 1011 is used as the audio signal output terminal.
  • the speaker 1012 outputs a masking sound signal.
  • the masking sound signal is used to mask the audio signal to the far field.
  • FIG. 2 is a schematic diagram of the distances between the first terminal device 101 shown in FIG. 1 and the ear of the near-field listener and the ear of the far-field observer.
  • the distance between the listener (the user of the first terminal device 101, that is, the target receiver of the audio signal) and the first terminal device 101 is much smaller than the distance between the observer (other people located near the listener, that is, non-target receivers of the audio signal) and the first terminal device 101. Therefore, in the embodiment of the present application, the ear 201 of the listener is located in the near field of the first terminal device 101, and the ear 202 of the observer is located in the far field of the first terminal device 101.
  • the listener's ear 201 is close to the position of the receiver 1011.
  • the first distance d1 between the ear reference point (ERP) of the listener's ear 201 and the receiver 1011 is about 5 mm, and the second distance d2 between the ERP of the listener's ear 201 and the speaker 1012 is approximately 150 mm; it can be seen that the second distance d2 is close to 30 times the first distance d1.
  • the third distance d3 between the ERP of the observer's ear 202 and the receiver 1011 is about 500 mm, and the fourth distance d4 between the ERP of the observer's ear 202 and the speaker 1012 is about 500 mm; it can be seen that the fourth distance d4 differs very little from the third distance d3, and their ratio is far less than 30.
  • from the relationship between sound pressure level and distance, the sound pressure amplitude decreases in inverse proportion to the radial distance from the source (the 1/r law).
  • if the receiver 1011 and the speaker 1012 output the audio signal and the masking sound signal at the same volume, then, since the second distance d2 is nearly 30 times the first distance d1, the attenuation of the masking sound signal at the listener's ear 201 is much greater than the attenuation of the audio signal.
  • therefore, the masking sound signal does not interfere with the listener's ear 201 listening to the audio signal. Since the fourth distance d4 differs very little from the third distance d3 (usually by less than a factor of two), the attenuation of the masking sound signal at the observer's ear 202 is close to the attenuation of the audio signal,
  • so the masking sound signal and the audio signal reach the observer's ear 202 at essentially the same loudness.
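  • As a quick numeric check of the 1/r law with the example distances above (my own arithmetic; the only figures taken from the application are d1 ≈ 5 mm, d2 ≈ 150 mm, d3 ≈ d4 ≈ 500 mm):

```python
import math

def level_difference_db(d_far_mm: float, d_near_mm: float) -> float:
    """Sound pressure level difference (dB) between two distances under the 1/r law."""
    return 20.0 * math.log10(d_far_mm / d_near_mm)

# Near field (listener's ear 201): receiver at ~5 mm, speaker at ~150 mm.
print(level_difference_db(150.0, 5.0))   # ~29.5 dB: the masking sound arrives far weaker than the audio signal
# Far field (observer's ear 202): receiver and speaker both at ~500 mm.
print(level_difference_db(500.0, 500.0)) # 0 dB: the masking sound and the audio signal arrive at similar levels
```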
  • the masking sound signal is determined based on the audio signal, the masking sound signal can mask the audio signal to the far field.
  • the numerical values of the first distance d1, the second distance d2, the third distance d3, and the fourth distance d4 shown in FIG. 2 are only examples.
  • the second distance d2 may be greater than or less than 150 mm.
  • the values of the third distance d3 and the fourth distance d4 may also change according to changes in the relative position of the observer and the first terminal device. Therefore, the third distance d3 and the fourth distance d4 may also be greater than 500 mm. Therefore, the embodiments of the present application do not limit the numerical values of d1, d2, d3, and d4.
  • FIG. 3 is a flowchart of a sound masking method provided by an embodiment of the application.
  • the sound masking methods include:
  • the audio signal output by the receiver may include human voice signals, animal voice signals, or music.
  • the type and specific content of the audio signal are not limited here.
  • the terminal device determines whether the receiver receives an audio signal output instruction.
  • the receiver receives the audio signal output instruction, it means that the receiver is used as the output terminal of the audio signal.
  • the speaker receives an audio signal output instruction, it means that the speaker is used as the output terminal of the audio signal.
  • the audio signal output command is only sent to the receiver or only to the speaker.
  • because the speaker plays sound outward when working, when the terminal device uses the speaker as the output end of the audio signal, the user of the terminal device does not need to keep the call content confidential, and there is no need to perform the subsequent operations of this embodiment to mask the audio signal.
  • when the terminal device uses the receiver as the output end of the audio signal,
  • the conversation between the user of the terminal device and the user of the opposite terminal device may involve private or confidential content. If the receiver leaks sound, an observer may learn the private or confidential information in the audio signal. For this reason, the audio signal needs to be masked.
  • the audio signal provided by the opposite terminal device has corresponding time-domain characteristics and frequency-domain characteristics.
  • the masking sound signal may be generated in real time, specifically by analyzing the time-domain characteristics and/or frequency-domain characteristics of the real-time audio signal.
  • an audio library containing multiple masked sound signals can be constructed.
  • when the terminal device again uses the receiver as the audio signal output terminal, the masking sound signal is selected or matched from the preset audio library according to the time-domain and/or frequency-domain characteristics of the current audio signal, in order to mask the current audio signal.
  • the masking sound signal determined according to the audio signal matches the audio signal closely in the time domain and/or frequency domain, and can better mask the audio signal in the far field, reducing the intelligibility of the private or confidential information to observers in the far field.
  • S303: The speaker emits a masking sound signal, where the masking sound signal is used to mask, in the far field, the audio signal output by the receiver.
  • the speaker can be controlled to emit a masking sound signal according to a variety of possible trigger conditions.
  • the terminal device controls the speaker to continuously emit the masking sound signal during the entire call in which the receiver is used as the audio signal output terminal.
  • the sound signal of the surrounding environment of the terminal device is detected.
  • the amplitude of the sound signal of the surrounding environment is lower than the first threshold, it indicates that the surrounding environment is too quiet. At this time, it is necessary to control the speaker to emit a masking sound signal.
  • otherwise, when the surrounding environment is already noisy, the masking sound signal would not play a role in masking the audio signal, and would only cause noise interference to observers.
  • the sound masking method provided in this embodiment can be implemented according to the user's choice.
  • an application (APP) can be installed in the terminal device and turned on or off according to the user's choice; when the APP is turned on,
  • the terminal device can execute the sound masking method provided in the embodiment of the present application.
  • the terminal device can also embed a function module in the call interface, and the function module can be turned on or off according to the user's choice.
  • the function module is turned on, the terminal device can execute the sound masking method provided in the embodiment of the present application.
  • the above APP or functional module can also be automatically turned on. For example, it is automatically turned on when a downlink audio signal is detected, when a call request is received, or when the terminal device is turned on.
  • the sound masking method provided by the embodiments of this application.
  • This method is applied to terminal equipment with receivers and speakers.
  • when the terminal device uses the receiver as the output terminal of the audio signal, the masking sound signal is determined according to the audio signal, and the speaker is then controlled to emit the masking sound signal. Since the masking sound signal is determined based on the audio signal, and the difference between the speaker-to-far-field and receiver-to-far-field distances is small, the masking sound signal can effectively mask the sound leaked by the receiver, reduce its intelligibility to observers, and prevent leakage of information in the call voice.
  • because the masking sound signal and the audio signal are output by the speaker and the receiver respectively, and the distances from the speaker and the receiver to the listener's ear differ greatly when the listener listens with the receiver, the masking sound signal causes little interference to the listener's listening and does not affect the listener's call quality.
  • when the second distance d2 between the speaker of the terminal device and the listener's ear is more than 10 times the first distance d1 between the receiver and the listener's ear,
  • the sound pressure level of the audio signal reaching the listener's ears is more than 20 dB higher than the sound pressure level of the masking sound signal reaching the listener's ears. In this case, the listener's ears are not disturbed by the masking sound signal when listening to the audio signal.
  • the length of the first terminal device 101 is L1
  • the width is W1
  • the distance between the speaker 1012 and the receiver 1011 is L2.
  • L2 satisfies at least one of the following inequalities (1)-(3): L2 > W1 (1); L2 > L1/2 (2); L2 > 100 mm (3).
  • in this way, the first distance d1 is much smaller than the second distance d2, thereby ensuring that the masking sound signal reaches the listener's ears at a lower sound pressure level than the audio signal and does not interfere with the listener's call quality.
  • in some embodiments, L2 satisfies the following inequality (4): L2 ≥ 20·d1 (4), where d1 is the distance between the receiver and the listener's ear.
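  • A small helper that checks the spacing conditions listed above; the function name, parameters, and the choice to report conditions (1)-(3) and condition (4) separately are illustrative assumptions, since the application presents these as alternative design conditions.

```python
def check_speaker_receiver_spacing(L2_mm: float, length_mm: float, width_mm: float,
                                   d1_mm: float) -> dict:
    """Evaluate the receiver-speaker spacing L2 against conditions (1)-(4)."""
    return {
        "L2_greater_than_width (1)": L2_mm > width_mm,
        "L2_greater_than_half_length (2)": L2_mm > length_mm / 2.0,
        "L2_greater_than_100mm (3)": L2_mm > 100.0,
        "L2_at_least_20x_d1 (4)": L2_mm >= 20.0 * d1_mm,
    }

# Example: a 150 mm x 70 mm device with speaker and receiver 140 mm apart, d1 = 5 mm.
print(check_speaker_receiver_spacing(140.0, 150.0, 70.0, 5.0))
```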
  • the sound masking method provided in the embodiments of the present application includes multiple implementation manners for generating the masking sound signal. These are described below in conjunction with the embodiments and the drawings.
  • FIG. 4 is a flowchart of another sound masking method provided by an embodiment of the application.
  • the sound masking method includes:
  • S401 is basically the same as the implementation manner of S301 in the foregoing method embodiment, and the relevant description of S401 can refer to the foregoing embodiment, which is not repeated here.
  • FIG. 5 is a schematic diagram of the spectrum response obtained by performing S402.
  • the horizontal axis represents frequency (unit: Hz)
  • the vertical axis represents signal amplitude (unit: dBFS).
  • S403 Generate a masking sound signal according to the spectrum response.
  • the spectral response curve obtained in S402 can be used as a filter to generate a masking sound signal.
  • the generated masking sound signal includes multiple possible forms.
  • the masking sound signal may be a random noise signal, such as a white noise signal or a pink noise signal whose frequency response curve is consistent with the audio signal.
  • S404 The speaker emits a masking sound signal.
  • the implementation manner of S404 is basically the same as the implementation manner of S303 in the foregoing method embodiment, and the related description of S404 can refer to the foregoing embodiment, which will not be repeated here.
  • since the masking sound signal is generated according to the spectral response of the audio signal, the masking sound signal has a similar or identical spectral characteristic curve to that of the audio signal.
  • the amplitude of the masking sound signal and the amplitude of the audio signal may be the same or different.
  • the spectral characteristic curve of the masking sound signal 601 is very similar to that of the audio signal 602, so the masking sound signal 601 is highly effective at masking the audio signal 602 in the far field.
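  • One possible realization of S402-S403 is to estimate the audio signal's average magnitude spectrum and use it to shape white noise; this is a minimal sketch assuming mono PCM input, with the FFT size, windowing, and normalization chosen by me rather than specified in the application.

```python
import numpy as np

def spectrally_matched_noise(audio: np.ndarray, n_fft: int = 1024) -> np.ndarray:
    """Generate noise whose magnitude spectrum follows the audio's average spectrum."""
    # S402: average the magnitude spectrum over windowed frames as a rough spectral response.
    frames = [audio[i:i + n_fft] for i in range(0, len(audio) - n_fft, n_fft)]
    response = np.mean([np.abs(np.fft.rfft(f * np.hanning(n_fft))) for f in frames], axis=0)

    # S403: impose that response on white noise, frame by frame, keeping the noise's random phase.
    out = np.zeros(len(audio))
    for i in range(0, len(audio) - n_fft, n_fft):
        noise_spectrum = np.fft.rfft(np.random.randn(n_fft))
        shaped = noise_spectrum / (np.abs(noise_spectrum) + 1e-12) * response
        out[i:i + n_fft] = np.fft.irfft(shaped, n=n_fft)

    return out / (np.max(np.abs(out)) + 1e-12)  # normalize to avoid clipping
```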
  • the masked sound signal may be generated in real time based on the audio signal of the current call, for example, generated based on the first n milliseconds of the audio signal of the current call (n is a positive number, and n milliseconds is less than the total duration of the downstream audio signal).
  • the masking sound signal can also be generated in advance based on the audio signal of a historical conversation between the terminal device and the opposite end. For example, when the peer device 102 had a previous call with the local device 101, it sent an audio signal to the local device 101, and that audio signal contained the spectral characteristics of the voice of user A2 of the peer device 102. Before the peer device 102 establishes a communication connection with the local device 101 again, the masking sound signal V2 corresponding to user A2 is generated according to the spectral response of the audio signal provided by the peer device 102. By analogy, a corresponding masking sound signal V3 can also be established for user A3.
  • a mapping table between each contact in the address book of the terminal device 101 and the corresponding masking sound signal can be established, and the masking sound signals V2, V3, etc. can be added to the audio library.
  • when a contact in the address book establishes a communication connection with the local device 101 through the terminal device that the contact holds, if the receiver of the terminal device 101 is used as the output end of the audio signal, the mapping table can be used directly to select or match, from the audio library, the masking sound signal corresponding to that contact, which then masks that contact's audio signal in the far field.
  • the masking sound signal is generated in advance by the above method, which improves the generation efficiency of the masking sound signal, and the masking effect is more targeted.
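  • A sketch of the per-contact mapping table described above: a pre-generated masking signal is looked up by contact and generated on first use; the dictionary structure, identifiers, and fallback behaviour are my own assumptions.

```python
from typing import Callable, Dict
import numpy as np

# Hypothetical audio library: contact identifier -> pre-generated masking signal (e.g. V2, V3).
masking_library: Dict[str, np.ndarray] = {}

def get_masking_signal(contact_id: str, downlink_audio: np.ndarray,
                       generate: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Return the cached masking signal for this contact, generating and caching it if absent.

    `generate` is any masking-signal generator, for example the spectrally
    matched noise from the earlier sketch."""
    if contact_id not in masking_library:
        masking_library[contact_id] = generate(downlink_audio)
    return masking_library[contact_id]
```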
  • if S402 and S403 are completed in advance, only S401 and S404 need to be executed each time the method of this embodiment is performed thereafter.
  • the audio signal can also be processed to weaken the influence of the masked sound signal on the listener. Description will be given below in conjunction with the drawings and embodiments.
  • FIG. 7 is a schematic diagram of signal processing provided by an embodiment of the application.
  • the audio signal in the terminal device is divided into two channels, and the content of the two channels of audio signal 701 and 702 is the same.
  • the audio signal 701 is provided to the receiver, and the audio signal 702 is provided to the speaker.
  • the audio signal 702 may be obtained by duplicating the audio signal 701.
  • the audio signal 702 is intercepted according to the preset frame length to obtain intercepted sound segments; then each sound segment is reversed in the time domain to obtain the inverted sound.
  • the preset frame length can be a fixed frame length or a floating frame length (that is, the frame length is variable). It is understandable that if the value of the preset frame length is too large, it may take too long to generate the masked sound signal, which affects the listener's call experience.
  • the value range of the preset frame length is 10ms-300ms, which ensures that the sound segment is time-domain inverted at a relatively fast frequency, so as to mask the audio signal to the far field in real time.
  • the corresponding masking sound signal 703 can be generated by using the inverted sound. For example, each frame of inverted sound is directly spliced to generate a masking sound signal, or after the inverted sound is processed by a window function, the processed sound is spliced to generate a masking sound signal.
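  • A minimal sketch of this interception-reversal-splicing path; the 50 ms frame length (within the 10 ms-300 ms range stated above), the Hann window, and the sample rate are illustrative choices.

```python
import numpy as np

def reversed_frame_masking(audio: np.ndarray, sample_rate: int = 16000,
                           frame_ms: int = 50, use_window: bool = True) -> np.ndarray:
    """Build masking signal 703 by time-reversing fixed-length frames of audio 702 and splicing them."""
    frame_len = int(sample_rate * frame_ms / 1000)
    window = np.hanning(frame_len) if use_window else np.ones(frame_len)
    out = np.zeros(len(audio), dtype=float)
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        out[start:start + frame_len] = frame[::-1] * window  # time-domain reversal, then windowing
    return out
```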
  • the generation time of the masked sound signal 703 may be delayed relative to the audio signal 701.
  • the masking sound signal 703 lags behind the audio signal 701 by several milliseconds.
  • the length of time needed to generate the masking sound signal 703 from the audio signal 702 can also be obtained, and the audio signal 701 is delayed by that length of time, so that the audio signal 701 output by the receiver
  • and the masking sound signal 703 output by the speaker are adapted to each other, for example, partially aligned or completely aligned. For example, if it takes 10 ms to generate the masking sound signal, the audio signal 701 is delayed by 10 ms.
  • the audio signal 701 can also be delayed according to a preset delay length, and the value range of the preset delay length is 10 ms-300 ms. It should be noted that in this embodiment, delaying the audio signal 701 is an optional operation, rather than a necessary operation.
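  • A sketch of this optional delay on the receiver path; the fixed 10 ms value and the zero-padding approach are illustrative.

```python
import numpy as np

def delay_audio(audio: np.ndarray, delay_ms: float, sample_rate: int = 16000) -> np.ndarray:
    """Delay the receiver-path audio (signal 701) by prepending silence."""
    n = int(sample_rate * delay_ms / 1000)
    return np.concatenate([np.zeros(n, dtype=audio.dtype), audio])

# Example: masking-signal generation took 10 ms, so delay signal 701 by 10 ms
# so that it lines up with masking signal 703 at the output.
# receiver_audio = delay_audio(receiver_audio, delay_ms=10.0)
```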
  • the masking sound signal 703 is directly provided to the speaker, so that the speaker outputs the masking sound signal.
  • the masked sound signal 703 may also be used to process the audio signal 701 in the embodiment of the present application.
  • the masking sound signal is subjected to phase inversion processing to obtain an inverted sound signal 704.
  • the phase range of the inversion processing is 90 degrees to 270 degrees, so as to ensure that the inverted sound signal 704 has a better compensation capability for the masked sound signal 703.
  • FIG. 10 is a schematic diagram of the masked sound signal 703 and the inverted sound signal 704.
  • the inverted sound signal 704 is subjected to amplitude reduction processing and then mixed with the audio signal 701 to obtain a mixed sound signal 705, and finally the mixed sound signal 705 is provided to the receiver so that the receiver plays the mixed sound signal 705.
  • the amplitude reduction processing can be achieved through an equalizer, or through gain control or filtering. The specific implementation of the amplitude reduction processing is not limited here.
  • because the mixed sound signal 705 is formed by mixing the inverted sound signal 704 and the audio signal 701, it still contains the effective conversation content.
  • after the mixed sound signal 705 is output, the inverted sound signal 704 component in it can also compensate for the masking sound signal 703 in the near field, canceling, to a certain extent, the interference of the masking sound signal played by the speaker with the listener's call quality.
  • because of the amplitude reduction, the interference of the inverted sound signal 704 component in the mixed sound signal 705 with the listener's ears is also weakened.
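  • A sketch of the mixing path in FIG. 7: masking signal 703 is phase-inverted, attenuated, and mixed into audio 701 to form mixed signal 705; the 180-degree inversion (within the 90-270 degree range stated above) and the 0.2 attenuation factor are illustrative choices.

```python
import numpy as np

def mix_with_inverted_masking(audio_701: np.ndarray, masking_703: np.ndarray,
                              attenuation: float = 0.2) -> np.ndarray:
    """Produce mixed signal 705 = audio 701 + attenuated, phase-inverted masking 703."""
    n = min(len(audio_701), len(masking_703))
    inverted_704 = -masking_703[:n]                          # 180-degree phase inversion
    mixed_705 = audio_701[:n] + attenuation * inverted_704   # amplitude reduction, then mixing
    return np.clip(mixed_705, -1.0, 1.0)                     # keep the result in playback range
```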
  • the masking sound signal 703 may also be enhanced to obtain an enhanced masking sound signal, and the enhanced masking sound signal may be provided to a speaker for transmission.
  • for example, an equalizer can be used to boost the mid-to-high frequency range of the masking sound signal, strengthening its masking of the audio signal output by the receiver in that range.
  • the enhancement processing can be implemented through an equalizer, gain control, or filtering; the specific implementation of the enhancement processing is not limited here.
  • the curve 1101 in the figure represents the masking sound signal before enhancement
  • the curve 1102 represents the masking sound signal after enhancement.
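  • The mid-to-high enhancement can be approximated by adding back a high-passed copy of the masking signal; this sketch uses scipy, and the 2 kHz cutoff and 0.5 boost factor are assumptions rather than values from the application.

```python
import numpy as np
from scipy.signal import butter, lfilter

def enhance_mid_high(masking: np.ndarray, sample_rate: int = 16000,
                     cutoff_hz: float = 2000.0, boost: float = 0.5) -> np.ndarray:
    """Boost the masking signal above cutoff_hz (curve 1102 vs. curve 1101 in FIG. 11)."""
    b, a = butter(2, cutoff_hz / (sample_rate / 2.0), btype="highpass")
    high_band = lfilter(b, a, masking)
    enhanced = masking + boost * high_band
    return enhanced / (np.max(np.abs(enhanced)) + 1e-12)  # renormalize to avoid clipping
```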
  • the audio signal 702 is intercepted according to a preset frame length, and the intercepted sound segment is obtained. Afterwards, the sound segment can be interpolated to obtain the supplemented sound information; or the subsequent segments can be matched from the preset audio library to obtain the supplemented sound information. Finally, the corresponding masking sound signal is generated according to the supplemented sound information.
  • the intercepted sound segment can be preprocessed and its characteristic parameters (such as syllables, pitch, etc.) extracted, and then the characteristic parameters and a pre-trained empirical model can be used to interpolate the sound segment and obtain the supplemented sound information.
  • an audio library is constructed in advance, and each sound segment in the audio library matches at least one other sound segment. After the intercepted sound fragment is obtained, any matched sound fragment is obtained from the preset audio library according to the sound fragment, and the matched sound fragment is called a subsequent fragment. The supplementary sound information is obtained by using the sound segment and subsequent segments.
  • the masked sound signal is generated by using the supplemented sound information, which reduces the frequency of intercepting sound segments and improves the generation efficiency of the masked sound signal.
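  • A sketch of the "match a subsequent segment from a preset audio library" variant: the intercepted segment is compared with library segments by spectral distance and the closest one is appended; the similarity metric and library format are my own assumptions.

```python
from typing import List
import numpy as np

def supplement_from_library(segment: np.ndarray, library: List[np.ndarray]) -> np.ndarray:
    """Append the library segment whose magnitude spectrum is closest to the intercepted segment."""
    seg_spec = np.abs(np.fft.rfft(segment))

    def spectral_distance(candidate: np.ndarray) -> float:
        cand_spec = np.abs(np.fft.rfft(candidate, n=len(segment)))
        return float(np.linalg.norm(seg_spec - cand_spec))

    best_follow_up = min(library, key=spectral_distance)
    return np.concatenate([segment, best_follow_up])  # the supplemented sound signal
```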
  • the audio signal 701 may also be delayed, so that the masking sound signal and the audio signal 701 are played in alignment.
  • the intelligibility of the leaked sound at 500 mm is significantly reduced from about 90% before the method of this embodiment is implemented:
  • the far-field intelligibility of individual words falls below 30%, and the intelligibility of sentences falls below 10%.
  • the noise impact on the surrounding environment is below 6dB, and there is no significant change in the loudness impact on the surrounding environment compared to before the implementation of this method.
  • the implementation of this method has almost no effect on the audio intelligibility of the listener in the near field. Therefore, the sound masking method provided in this embodiment can effectively mask the leakage of the receiver to the far field without changing the call quality of the listener.
  • the present application also provides a sound masking device.
  • the following description will be given with reference to the drawings and embodiments.
  • FIG. 12 is a schematic structural diagram of a sound masking device provided by an embodiment of the application.
  • the sound masking device 120 shown in this figure can be applied to the first terminal device 101 shown in FIGS. 1 and 2.
  • the device 120 includes:
  • the judging module 1201 is used to judge whether the terminal device uses the receiver as the output end of the audio signal
  • the determining module 1203 is configured to determine the masking sound signal according to the audio signal when the judgment result of the judgment module is yes;
  • the first control module 1202 is used to control the speaker to emit a masking sound signal to mask the audio signal output by the receiver in the far field.
  • because the masking sound signal is determined based on the audio signal, and the difference between the speaker-to-far-field and receiver-to-far-field distances is small, the masking sound signal can effectively mask the sound leaked by the receiver, reduce its intelligibility to observers, and prevent leakage of information in the call voice.
  • because the masking sound signal and the audio signal are output by the speaker and the receiver respectively, and the distances from the speaker and the receiver to the listener's ear differ greatly when the listener listens with the receiver, the masking sound signal causes little interference to the listener's listening and does not affect the listener's call quality.
  • the determining module 1203 is configured to select or match a corresponding masked sound signal from a pre-generated audio library according to the audio signal.
  • the distance between the speaker and the ear of the listener is greater than the distance between the receiver and the ear of the listener.
  • the determining module 1203 is configured to generate a masking sound signal according to the audio signal.
  • the determining module 1203 specifically includes:
  • the spectrum analysis unit 12031 is used to perform spectrum analysis on the audio signal to obtain a spectrum response
  • the first generating unit 12032 is configured to generate a masking sound signal according to the spectral response.
  • because the masking sound signal is generated according to the spectral response of the audio signal, the masking sound and the masked sound are consistent or similar in spectral characteristics; the masking sound signal can therefore better mask the sound signal played by the receiver.
  • the determining module 1203 specifically includes:
  • the signal interception unit 12033 is configured to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment
  • the signal inversion unit 12034 is used to invert the sound segment in time domain to obtain inverted sound
  • the second generating unit 12035 is configured to generate a corresponding masking sound signal according to the inverted sound.
  • the generated masking sound signal can be used to reduce the intelligibility of the far-field observers to the leakage of the sound. In turn, the security of private information or confidential information in the content of the call is ensured.
  • the determining module 1203 specifically includes:
  • the signal interception unit 12033 is configured to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment
  • the signal supplement unit 12036 is used to interpolate sound fragments to obtain a supplemented sound signal; or to match subsequent fragments from a preset audio library to obtain a supplemented sound signal;
  • the third generating unit 12037 is configured to generate a corresponding masking sound signal according to the supplemented sound signal.
  • the interception frequency of the sound signal is reduced, and the efficiency of generating the masking sound signal is improved.
  • the waiting time for the listener to listen to the sound signal is avoided, and the call experience of the listener is improved.
  • it also includes:
  • the time length obtaining module is used to obtain the time length for generating the masking sound signal according to the audio signal
  • the delay module is used to delay the audio signal according to the length of time, so that the sound signal output by the receiver is compatible with the masking sound signal output by the speaker.
  • it also includes:
  • the inversion processing module is used to perform phase inversion processing on the masked sound signal to obtain an inverted sound signal
  • the mixing module is used to perform EQ or amplitude processing on the inverted sound signal and mix it with the audio signal to obtain the mixed sound signal;
  • the second control module is used to control the receiver to output a mixed sound signal.
  • the inverted sound signal obtained through the processing of the inverted processing module can compensate for the masked sound signal, to a certain extent, offset the interference of the masked sound signal on the ears of the listener in the near field, and ensure the quality of the listener's conversation.
  • the first control module 1202 specifically includes:
  • the first detection unit is used to detect the sound signal of the surrounding environment
  • the first judging unit is used to judge whether the amplitude of the sound signal in the surrounding environment is lower than the first preset threshold
  • the first control unit is configured to transmit a masking sound signal through the speaker when the judgment result of the first judgment unit is yes.
  • Using an ambient sound amplitude below the first preset threshold as the trigger condition for transmitting the masking sound signal through the speaker prevents bystanders from hearing the receiver's sound leakage in an overly quiet environment, so the private or confidential information in the leaked sound is not disclosed.
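  • A hypothetical sketch of this quiet-environment trigger; the RMS measure and threshold value are assumptions, since the disclosure only requires comparing the ambient amplitude with a first preset threshold:

```python
import numpy as np

def should_mask_quiet_room(ambient_frame: np.ndarray, first_threshold: float = 0.01) -> bool:
    """Enable the speaker masker only when the surrounding environment is quiet,
    i.e. the ambient sound amplitude falls below the first preset threshold."""
    ambient_rms = np.sqrt(np.mean(ambient_frame ** 2))
    return ambient_rms < first_threshold
```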
  • the first control module 1202 specifically includes:
  • the second detection unit is used to detect whether there is a downlink audio signal
  • the second determining unit is configured to determine whether the amplitude of the downstream audio signal is greater than the second preset threshold when the second detecting unit detects that there is a downstream audio signal;
  • the second control unit is used to transmit the masking sound signal through the speaker when the judgment result of the second judgment unit is yes.
  • Using a downlink audio signal amplitude greater than the second preset threshold as the trigger condition for transmitting the masking sound signal through the speaker avoids unnecessary noise interference to bystanders in the surrounding environment.
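  • Likewise, a hypothetical sketch of the downlink trigger; the peak-amplitude measure and threshold value are assumptions:

```python
import numpy as np

def should_mask_downlink(downlink_frame: np.ndarray, second_threshold: float = 0.05) -> bool:
    """Emit the masker only while the downlink audio is loud enough to leak,
    i.e. its amplitude exceeds the second preset threshold."""
    return float(np.max(np.abs(downlink_frame))) > second_threshold
```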
  • it also includes:
  • the signal enhancement module is used to enhance the masking sound signal to obtain the enhanced masking sound signal.
  • the enhanced masking sound signal masks the sound signal played by the receiver more effectively, reducing bystanders' intelligibility of the leaked sound (an illustrative enhancement sketch follows).
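  • A minimal sketch of mid-high-frequency enhancement using a simple FFT-domain shelf; the cutoff frequency and gain are illustrative assumptions, and the disclosure equally allows an equalizer, gain control, or filtering to strengthen the band most tied to speech intelligibility:

```python
import numpy as np

def enhance_mid_high(masker: np.ndarray, sample_rate: int,
                     cutoff_hz: float = 1000.0, gain_db: float = 6.0) -> np.ndarray:
    """Boost the mid-high band of the masking signal with an FFT-domain shelf."""
    spectrum = np.fft.rfft(masker)
    freqs = np.fft.rfftfreq(len(masker), d=1.0 / sample_rate)
    gain = np.where(freqs >= cutoff_hz, 10 ** (gain_db / 20.0), 1.0)
    return np.fft.irfft(spectrum * gain, n=len(masker))
```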
  • the present application also provides a terminal device.
  • the terminal device may be the first terminal device 101 shown in FIG. 1 and FIG. 2.
  • For the application scenario of the terminal device, refer to FIG. 1 and FIG. 2; details are not repeated here.
  • the following describes the structural implementation of the terminal device provided by the embodiment of the present application in conjunction with the embodiment and the drawings.
  • FIG. 16 is a schematic structural diagram of a terminal device according to an embodiment of the application.
  • the terminal device 160 includes a receiver 1601, a speaker 1602, and a processor 1603.
  • the processor 1603 is configured to determine the masking sound signal according to the audio signal when the audio signal is output by the receiver 1601;
  • the speaker 1602 is used to transmit a masking sound signal to mask, in the far field, the audio signal output by the receiver 1601.
  • the speaker 1602 can output the masking sound signal under the control of the processor 1603.
  • Because the masking sound signal is determined based on the audio signal, and the difference between the distances of the speaker 1602 and the receiver 1601 from the far field is small, the masking sound signal can better mask the sound leakage of the receiver 1601, reduce bystanders' intelligibility of the leaked sound, and prevent information in the call voice from being leaked.
  • Because the masking sound signal and the audio signal are output by the speaker 1602 and the receiver 1601 respectively, when the listener uses the receiver 1601 to listen, the distances of the speaker 1602 and the receiver 1601 from the listener's ear differ greatly, and the masking sound signal at the listener's ear is instead masked by the audio signal emitted by the receiver 1601. Therefore, the masking sound signal interferes little with the listener's listening and does not affect the listener's call quality (a worked check of the level difference follows).
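  • As a worked check using the example distances given in the description (about 5 mm from the receiver and about 150 mm from the speaker to the listener's ear, and about 500 mm from both to a bystander), spherical spreading (the 1/r law) gives a level difference of 20*log10(d2/d1) = 20*log10(150/5), approximately 29.5 dB, so the masking sound signal reaches the near-field ear roughly 30 dB below the receiver's audio signal, whereas for the far-field bystander the two path lengths are nearly equal and no such difference exists.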
  • the processor 1603 is specifically configured to select or match the corresponding masking sound signal from a pre-generated audio library according to the audio signal when the audio signal is output by the receiver.
  • the distance between the speaker 1602 and the ear of the listener is greater than the distance between the receiver 1601 and the ear of the listener.
  • the processor 1603 is specifically configured to perform spectrum analysis on the audio signal to obtain a spectrum response; and generate a masking sound signal according to the spectrum response.
  • the processor 1603 is specifically configured to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment; perform time-domain inversion on the sound fragment to obtain the inverted sound; and generate the corresponding masking sound signal according to the inverted sound.
  • the processor 1603 is specifically configured to intercept the audio signal according to the preset frame length to obtain the intercepted sound fragment; interpolate the sound fragment to obtain the supplemented sound signal, or match a subsequent fragment from the preset audio library to obtain the supplemented sound signal; and generate the corresponding masking sound signal according to the supplemented sound signal.
  • the processor 1603 is further configured to obtain the length of time taken to generate the masking sound signal according to the audio signal, and delay the audio signal according to that length of time, so that the audio signal output by the receiver 1601 is aligned with the masking sound signal output by the speaker 1602.
  • the processor 1603 is also used to perform phase inversion processing on the masking sound signal to obtain an inverted sound signal; mix the inverted sound signal and the audio signal to obtain a mixed sound signal; and control the receiver 1601 to output the mixed sound signal.
  • the processor 1603 is specifically configured to detect the sound signal of the surrounding environment, and when the amplitude of the sound signal of the surrounding environment is lower than the first preset threshold, the speaker 1602 is controlled to output the masking sound signal.
  • the processor 1603 is specifically configured to control the speaker 1602 to output the masking sound signal when it determines that the amplitude of the downlink audio signal is greater than the second preset threshold when the downlink audio signal is detected.
  • the processor 1603 is further configured to perform enhancement processing on the masking sound signal to obtain an enhanced masking sound signal.
  • the processor 1603 may be used to execute part or all of the steps in the foregoing method embodiment.
  • For the functional description of the processor 1603 and the related technical effects of executing the method steps, reference may be made to the foregoing method embodiments and apparatus embodiments; details are not described herein again.
  • the terminal device 160 shown in FIG. 16 only shows the part related to the embodiment of the present application. For specific technical details that are not disclosed, please refer to the method part of the embodiment of the present application.
  • the terminal device 160 may be any terminal device such as a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a point-of-sale terminal (Point of Sales, POS), or an on-board computer.
  • FIG. 17 shows a partial structural block diagram of a mobile phone related to a terminal device provided in an embodiment of the present application.
  • the mobile phone 170 includes: a radio frequency (Radio Frequency, RF) circuit 1710, a memory 1720, an input unit 1730, a display unit 1740, a sensor 1750, an audio circuit 1760, a wireless fidelity (wireless fidelity, WiFi) module 1770, a processor 1780 (the processor 1780 can implement the functions of the processor 1603 shown in FIG. 16), and components such as a power supply Bat. The structure shown in FIG. 17 does not constitute a limitation on the mobile phone, which may include more or fewer components than shown, combine some components, or use a different component arrangement.
  • the RF circuit 1710 can be used for receiving and sending signals during information transmission or a call. In particular, downlink information from the base station is received and then delivered to the processor 1780 for processing; in addition, uplink data is sent to the base station.
  • the RF circuit 1710 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (Low Noise Amplifier, LNA), a duplexer, and the like.
  • the RF circuit 1710 can also communicate with the network and other devices through wireless communication.
  • the above-mentioned wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
  • the memory 1720 may be used to store software programs and modules.
  • the processor 1780 executes various functional applications and data processing of the mobile phone 170 by running the software programs and modules stored in the memory 1720.
  • the memory 1720 may mainly include a program storage area and a data storage area.
  • the program storage area may store an operating system, application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created according to the use of the mobile phone 170 (such as audio data and a phone book), and the like.
  • the memory 1720 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the input unit 1730 may be used to receive inputted numeric or character information, and generate key signal input related to user settings and function control of the mobile phone 170.
  • the input unit 1730 may include a touch panel 1731 and other input devices 1732.
  • the touch panel 1731, also called a touch screen, can collect touch operations performed by the user on or near it (for example, operations performed by the user on or near the touch panel 1731 using a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection apparatus according to a preset program.
  • the touch panel 1731 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 1780, and can also receive and execute commands sent by the processor 1780.
  • the touch panel 1731 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 1730 may also include other input devices 1732.
  • the other input device 1732 may include, but is not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the display unit 1740 may be used to display information input by the user or information provided to the user and various menus of the mobile phone 170.
  • the display unit 1740 may include a display panel 1741.
  • optionally, the display panel 1741 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like.
  • the touch panel 1731 can cover the display panel 1741. When the touch panel 1731 detects a touch operation on or near it, the operation is transmitted to the processor 1780 to determine the type of the touch event, and the processor 1780 then provides corresponding visual output on the display panel 1741 according to the type of the touch event.
  • although in FIG. 17 the touch panel 1731 and the display panel 1741 are used as two independent components to implement the input and output functions of the mobile phone 170, in some embodiments the touch panel 1731 and the display panel 1741 may be integrated to implement the input and output functions of the mobile phone 170.
  • the mobile phone 170 may also include at least one sensor 1750, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor.
  • the ambient light sensor can adjust the brightness of the display panel 1741 according to the brightness of the ambient light.
  • the proximity sensor can turn off the display panel 1741 and/or the backlight when the mobile phone 170 is moved to the ear.
  • as a type of motion sensor, the accelerometer can detect the magnitude of acceleration in various directions (usually three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications that recognize the posture of the mobile phone 170 (such as landscape/portrait switching, related games, and magnetometer posture calibration) and for vibration-recognition functions (such as a pedometer or tap detection); other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may also be configured on the mobile phone 170, and details are not described here.
  • the audio circuit 1760, the speaker 1761, the microphone 1762, and the receiver 1763 can provide an audio interface between the user and the mobile phone 170.
  • the audio circuit 1760 can convert received audio data into an electrical signal and transmit it to the speaker 1761 or the receiver 1763, which converts it into a sound signal for output; conversely, the microphone 1762 converts a collected sound signal into an electrical signal, which the audio circuit 1760 receives and converts into audio data; the audio data is then output to the processor 1780 for processing and sent, for example, to another mobile phone via the RF circuit 1710, or output to the memory 1720 for further processing.
  • WiFi is a short-distance wireless transmission technology.
  • through the WiFi module 1770, the mobile phone 170 can help users send and receive e-mails, browse web pages, and access streaming media, providing users with wireless broadband Internet access.
  • although FIG. 17 shows the WiFi module 1770, it is understandable that it is not a necessary component of the mobile phone 170 and may be omitted as needed without changing the essence of the invention.
  • the processor 1780 is the control center of the mobile phone 170; it uses various interfaces and lines to connect the parts of the entire mobile phone 170, and, by running or executing the software programs and/or modules stored in the memory 1720 and calling the data stored in the memory 1720, executes various functions of the mobile phone 170 and processes data, so as to monitor the mobile phone 170 as a whole.
  • the processor 1780 may include one or more processing units; preferably, the processor 1780 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1780.
  • the mobile phone 170 also includes a power Bat (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 1780 through a power management system, so that functions such as charging, discharging, and power consumption management can be managed through the power management system.
  • the mobile phone 170 may also include a camera, a Bluetooth module, etc., which will not be repeated here.
  • At least one (item) refers to one or more, and “multiple” refers to two or more.
  • “And/or” is used to describe the association relationship of associated objects, indicating that there can be three types of relationships, for example, “A and/or B” can mean: only A, only B, and both A and B , Where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects before and after are in an “or” relationship.
  • the following at least one item (a) or similar expressions refers to any combination of these items, including any combination of a single item (a) or a plurality of items (a).
  • For example, at least one of a, b, or c may mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may be singular or plural.

Abstract

This application discloses a sound masking method and apparatus, and a terminal device. When the terminal device uses the receiver as the output end of an audio signal, a masking sound signal is determined based on the audio signal, and the masking sound signal is then emitted through the speaker. Because the masking sound signal is determined according to the audio signal, and the difference between the distances of the speaker and the receiver from the far field is small, the masking sound signal can effectively mask the sound leaked by the receiver and prevent information in the call voice from being leaked. In addition, because the masking sound signal and the sound signal are output by the speaker and the receiver respectively, when the listener listens to the sound signal through the receiver, the distances of the speaker and the receiver from the listener's ear differ greatly, so the masking sound signal causes little interference to the listener's listening and does not affect the listener's call quality.

Description

一种声音的掩蔽方法、装置及终端设备
本申请要求于2020年03月20日提交中华人民共和国国家知识产权局、申请号为202010202057.5、发明名称为“一种声音的掩蔽方法、装置及终端设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及电子设备技术领域,尤其涉及一种声音的掩蔽方法、装置及终端设备。
背景技术
随着移动通讯技术的不断发展,移动通讯设备已经普遍应用于人们的生产和生活中。人们通过移动通讯设备建立通讯连接,开展通话。通常在使用移动通讯设备通话时,听者以受话器收听对方的声音,但是当听者处于静室、对方的音量较大或受话器的播放音量较大时,位于听者附近的其他人(以下简称旁听者)也可能收听到受话器传出的语音。一方面,受话器传出的语音容易对旁听者产生干扰;另一方面,这可能导致部分或全部的语音泄露到旁听者耳中,语音中的隐私信息或机密信息的安全性得不到保证。
针对上述问题,目前可以通过对受话器待播放的语音进行处理的方式以避免通话语音的隐私信息或机密信息泄露。但是该方法需要处理受话器待播放的语音,与此同时会影响听者的通话质量。
目前,在利用移动通讯设备进行通话时,如何保证通话内容的私密性并避免对听者的通话质量造成干扰,已经成为本领域急需解决的技术问题。
发明内容
本申请提供一种声音的掩蔽方法、装置及终端设备,在不影响听者的通话质量的前提下,防止旁听者听懂受话器漏音中的信息。
第一方面,本申请提供一种声音的掩蔽方法。该方法应用于任意具有通讯功能且包含受话器和扬声器的终端设备。该方法包括:
当终端设备通过受话器作为音频信号的输出端时,根据音频信号确定掩蔽声音信号;
通过扬声器发射掩蔽声音信号,掩蔽声音信号用于向远场掩蔽受话器输出的音频信号。例如,可以控制扬声器输出该掩蔽声音信号。
该方法中掩蔽声音信号是依据音频信号确定的,并且对于远场而言,扬声器与远场距离和受话器与远场距离相差较小,因此音频信号能够被掩蔽声音信号有效掩蔽。另外,对于听者耳朵,音频信号的衰减幅度明显小于掩蔽声音信号的衰减幅度,掩蔽声音信号反而会被音频信号掩蔽,因此扬声器发射的掩蔽声音信号不会对听者耳朵造成干扰,从而避免对听者通话质量造成影响。
受话器播放的音频信号可以包括多种类型的音频信号,例如人的语音信号、动物的声音信号或音乐等。
掩蔽声音信号可以是根据音频信号确定的。例如,在第一方面的第一种实现方式中,可以根据音频信号从预先生成的音频库中选择或匹配出对应的掩蔽声音信号。在第一方面的第二种实现方式中,还可以根据实时的下行音频信号生成掩蔽声音信号。
通过频谱特性分析可以匹配粉噪信号或者白噪信号作为掩蔽声音信号,还可以通过频谱特性分析生成粉噪信号或白噪信号作为掩蔽声音信号。另外,还可以基于音频信号进行具体分析,对音频信号进行特征提取后匹配针对性的掩蔽声音信号;或者基于音频信号进行针对性的处理操作,例如时域反转、增强处理等,生成掩蔽声音信号。
结合第一方面的第二种实现方式,根据音频信号确定掩蔽声音信号,具体可以包括:
对音频信号进行频谱分析,获得频谱响应;
根据频谱响应生成掩蔽声音信号。
依据音频信号的频谱响应生成掩蔽声音信号,保证掩蔽声音信号与音频信号的频谱响应相互匹配,从而使掩蔽声音信号能够高效掩蔽受话器输出的音频信号。
结合第一方面的第二种实现方式,根据音频信号确定掩蔽声音信号,具体可以包括:
按照预设帧长截取音频信号,获得截取后的声音片断;
将声音片断进行时域反转,获得反转声音;
将反转声音直接拼接生成掩蔽声音信号,或者通过窗函数之后再拼接生成掩蔽声音信号。
通过反转生成掩蔽声音信号,使掩蔽声音信号难以被旁听者听懂,从而当掩蔽声音信号向远场发射时,受话器漏音能够被不易听懂的掩蔽声音信号掩蔽。
结合第一方面的第二种实现方式,根据音频信号确定掩蔽声音信号,具体可以包括:
按照预设帧长截取音频信号,获得截取后的声音片断;
将声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;
根据补充后的声音信号生成对应的掩蔽声音信号。
通过插值或匹配获取补充后的声音信号,避免频繁截取声音片段,降低了终端设备的处理负荷,提升对掩蔽声音信号的生成效率。
可选地,结合第一方面的第二种实现方式,该方法还可以包括:
获得根据音频信号生成掩蔽声音信号的时间长度;
根据时间长度对音频信号进行延迟,以使受话器输出的音频信号与扬声器输出的掩蔽声音信号相适应。
例如,延迟后是音频信号和掩蔽声音信号局部对齐或完全对齐。通过对齐掩蔽声音信号和音频信号,保证掩蔽声音信号与音频信号的同步性,提升了掩蔽效果。
结合第一方面的任意一种实现方式,该方法还可以包括:
将掩蔽声音信号进行相位反相处理,获得反相声音信号;
将反相声音信号进行降低幅值处理后和音频信号进行混音,获得混音声音信号;
受话器输出混音声音信号。
通过反相处理,使混音声音信号能够在一定程度上抵消扬声器发射的掩蔽声音信 号对近场听者耳朵的影响,在保证通话私密性的基础上进一步提升了通话质量。
结合第一方面的任意一种实现方式,该方法通过扬声器发射掩蔽声音信号,具体包括:
检测周围环境的声音信号,当周围环境的声音信号的幅值低于第一预设阈值时,通过扬声器发射掩蔽声音信号。该方法降低漏音内容的泄露风险。
结合第一方面的任意一种实现方式,该方法通过扬声器发射掩蔽声音信号,具体可以包括:
当检测到有下行音频信号时,判断下行音频信号的幅值大于第二预设阈值时,通过扬声器发射掩蔽声音信号。该方法避免对旁听者造成不必要的噪声干扰。
可选地,预设帧长的时间取值范围为10ms-300ms。
可选地,反相处理的相位范围是90度-270度。
受话器与扬声器的距离大于终端设备的宽度;
受话器与扬声器的距离大于终端设备的长度的一半;
受话器与扬声器的距离大于100mm;
受话器与扬声器的距离至少为受话器与听者耳朵的20倍。
第二方面,本申请提供一种声音的掩蔽装置。该装置应用于任意具有通讯功能且包含受话器和扬声器的终端设备。
该装置包括:判断模块、确定模块和第一控制模块。
其中,判断模块,用于判断终端设备是否以受话器作为音频信号的输出端;
确定模块,用于当判断模块的判断结果为是时,根据音频信号确定掩蔽声音信号;
第一控制模块,用于控制扬声器发射掩蔽声音信号,以向远场掩蔽受话器输出的音频信号。
在第二方面的第一种实现方式中,确定模块用于当以所述受话器输出音频信号时,根据音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
在第二方面的第二种实现方式中,确定模块用于根据音频信号实时生成掩蔽声音信号。
结合第二方面的第二种实现方式,确定模块具体可以包括:
频谱分析单元,用于对音频信号进行频谱分析,获得频谱响应;
第一生成单元,用于根据频谱响应生成掩蔽声音信号。
结合第二方面的第二种实现方式,确定模块具体可以包括:
信号截取单元,用于按照预设帧长截取音频信号,获得截取后的声音片断;
信号反转单元,用于将声音片断进行时域反转,获得反转声音;
第二生成单元,用于根据反转声音生成对应的掩蔽声音信号。
结合第二方面的第二种实现方式,确定模块具体可以包括:
信号截取单元,用于按照预设帧长截取音频信号,获得截取后的声音片断;
信号补充单元,用于将声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;
第三生成单元,用于根据补充后的声音信号生成对应的掩蔽声音信号。
结合第二方面的第二种实现方式,该装置还可以包括:
时间长度获取模块,用于获得根据音频信号生成掩蔽声音信号的时间长度;
延迟模块,用于根据时间长度对音频信号进行延迟,以使受话器输出的音频信号与扬声器输出的掩蔽声音信号相适应。
结合第二方面的任意一种实现方式,上述装置还可以包括:
反相处理模块,用于将掩蔽声音信号进行相位反相处理,获得反相声音信号;
混音模块,用于将反相声音信号进行降低幅值处理后和音频信号进行混音,获得混音声音信号;
第二控制模块,用于控制受话器输出混音声音信号。
结合第二方面的任意一种实现方式,第一控制模块,具体可以包括:
第一检测单元,用于检测周围环境的声音信号;
第一判断单元,用于判断周围环境声音的幅值是否低于第一预设阈值;
第一控制单元,用于当第一判断单元判断结果为是时通过扬声器发射掩蔽声音信号。
结合第二方面的任意一种实现方式,第一控制模块,具体可以包括:
第二检测单元,用于检测是否有下行音频信号;
第二判断单元,用于当第二检测单元检测到有下行音频信号时,判断下行音频信号的幅值是否大于第二预设阈值;
第二控制单元,用于当第二判断单元判断结果为是时,通过扬声器发射掩蔽声音信号。
结合第二方面的任意一种实现方式,装置还可以包括:
信号增强模块,用于对掩蔽声音信号进行增强处理,获得增强后的掩蔽声音信号,再提供给扬声器。
第三方面,本申请提供一种终端设备。该终端设备可以为手机、平板电脑、个人数字助理(英文全称:Personal Digital Assistant,英文缩写:PDA)、销售终端设备(英文全称:Point of Sales,英文缩写:POS)、车载电脑等任意具有通讯功能且包含受话器和扬声器的终端设备。
本申请第三方面提供的终端设备包括:受话器、扬声器和处理器;
处理器,用于当以受话器输出音频信号时,根据音频信号确定或生成掩蔽声音信号;
扬声器,用于发射掩蔽声音信号,以向远场掩蔽受话器输出的音频信号。
在第三方面的第一种实现方式中,处理器具体用于当以所述受话器输出音频信号时,根据音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
在第三方面的第二种实现方式中,处理器具体用于根据音频信号实时地生成掩蔽声音信号。
结合第三方面的第二种实现方式,处理器,具体用于对音频信号进行频谱分析, 获得频谱响应;根据频谱响应生成掩蔽声音信号。
结合第三方面的第二种实现方式,处理器,具体用于按照预设帧长截取音频信号,获得截取后的声音片断;将声音片断进行时域反转,获得反转声音;根据反转声音生成对应的掩蔽声音信号。
结合第三方面的第二种实现方式,处理器,具体用于按照预设帧长截取音频信号,获得截取后的声音片断;将声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;根据补充后的声音信号生成对应的掩蔽声音信号。
结合第三方面的第二种实现方式,处理器,还用于获得根据音频信号生成掩蔽声音信号的时间长度;根据时间长度对音频信号进行延迟,以使受话器输出的音频信号与扬声器输出的掩蔽声音信号相适应。
结合第三方面的任意一种实现方式,处理器,还用于将掩蔽声音信号进行相位反相处理,获得反相声音信号;将反相声音信号进行降低幅值处理后和音频信号进行混音,获得混音声音信号;控制受话器输出混音声音信号。
结合第三方面的任意一种实现方式,处理器,具体用于检测周围环境的声音信号,当周围环境的声音信号的幅值低于第一预设阈值时,通过扬声器发射掩蔽声音信号。
结合第三方面的第二种实现方式,处理器,具体用于当检测到有下行音频信号时,判断下行音频信号的幅值大于第二预设阈值时,通过扬声器发射掩蔽声音信号。
本申请至少具有以下优点:
本申请提供的声音的掩蔽方法应用在具有受话器和扬声器的终端设备。当终端设备通过受话器作为音频信号的输出端时,基于音频信号确定掩蔽声音信号,再控制扬声器发射掩蔽声音信号。由于该掩蔽声音信号是依据音频信号确定的,并且扬声器与受话器相对于远场的距离差异较小,因此掩蔽声音信号能够较好地掩蔽受话器的漏音,防止通话语音中的信息泄露。此外,由于掩蔽声音信号和声音信号分别由扬声器和受话器输出,当听者以受话器收听声音信号时,扬声器与受话器相对于听者耳朵的距离差异较大,因此掩蔽声音信号对听者收听声音信号的干扰较小,不会影响听者的通话质量。
附图说明
图1为本申请实施例提供的声音的掩蔽方法的应用场景示意图;
图2为图1所示的第一终端设备与近场听者耳朵及远场旁听者耳朵的距离示意图;
图3为本申请实施例提供的一种声音的掩蔽方法流程图;
图4为本申请实施例提供的另一种声音的掩蔽方法流程图;
图5为本申请实施例提供的对声音信号进行频谱分析得到的频谱响应示意图;
图6为掩蔽声音信号和声音信号的频谱特性曲线的示意图;
图7为本申请实施例提供的信号处理示意图;
图8为本申请实施例提供的一种截取后的声音片段示意图;
图9为图8所示的声音片段经过时域反转后的反转声音示意图;
图10为本申请实施例提供的掩蔽声音信号和反相声音信号的示意图;
图11为本申请实施例提供的掩蔽声音信号在增强前后的对比示意图;
图12为本申请实施例提供的一种声音的掩蔽装置的结构示意图;
图13为本申请实施例提供的一种信号生成模块的结构示意图;
图14为本申请实施例提供的另一种信号生成模块的结构示意图;
图15为本申请实施例提供的又一种信号生成模块的结构示意图;
图16为本申请实施例提供的一种终端设备的结构示意图;
图17为本申请实施例提供的一种手机的硬件架构图。
具体实施方式
移动通讯设备作为当前通用的通讯设备,以其便携的特点满足人们在多种场景下的通讯需求。例如,在拥挤的地铁上、在人来人往的商业街、在空旷的更衣室里,人们均可以利用移动通讯设备与其他人进行通话。当以受话器接收音频信号时,如果听者处于静室、对方音量过大或者受话器的播放音量较大,受话器漏音的问题难以避免。通话中可能包含不希望被其他人获知的内容,这些内容可能涉及到隐私信息或者机密信息。受话器漏音的问题容易导致通话内容中的隐私信息或机密信息遭到泄露。
通过调节受话器播放音量来避免漏音的方式可控性不足。当听者调小受话器播放音量时,漏音现象可能已经发生。而处理受话器待播放的音频信号尽管能够在一定程度上降低旁听者对音频信号的可懂度,但是同时导致听者也难以听清对方的通话内容。因此该方法影响了听者的通话质量。
基于以上问题,经过研究,本申请实施例提供一种声音的掩蔽方法、装置及终端设备。本申请实施例中,当以受话器作为音频信号(即被掩蔽音)的输出端时,根据音频信号确定掩蔽声音信号(即掩蔽音),再控制扬声器发射掩蔽声音信号。由于掩蔽音是根据被掩蔽音确定的,二者存在相关性,并且旁听者相对于受话器的距离与旁听者相对于扬声器的距离接近,因此掩蔽音能够向旁听者掩蔽被掩蔽音。同时,由于掩蔽音到达听者耳朵的距离要远大于被掩蔽音到达听者耳朵的距离,因此掩蔽音不会影响听者正常收听音频信号,扬声器发出的掩蔽声音信号在听者耳朵处反而会被受话器发出的音频信号所掩蔽,从而保证了听者的通话质量不受干扰。通常而言,与声源的距离大于一个临界值r far(1/r-法则)的位置定义为远场,与声源的距离小于或等于一个临界值r far(1/r-法则)的位置定义为近场。为便于理解本申请实施例的技术方案,由于听者与终端设备的距离远小于旁听者与终端设备的距离,因此以终端设备作为声源,设定听者处于近场,旁听者处于远场。
为了使本领域技术人员更好地理解本申请实施例提供的技术方案,下面先介绍本申请实施例提供的声音的掩蔽方法的应用场景。
参见图1,该图为本申请实施例提供的声音的掩蔽方法的应用场景示意图。
图1中,第一终端设备101与第二终端设备102建立通讯连接。本申请实施例提供的声音的掩蔽方法应用于第一终端设备101,第二终端设备102则作为该第一终端设备101的对端设备。实际应用中,第一终端设备101可以是任意一种具有通讯功能的 移动通讯设备,例如手机、平板电脑或便携式笔记本电脑等。图1中仅以手机形式的第一终端设备101为例进行展示,本申请实施例中对于第一终端设备101的具体类型不进行限定。
第一终端设备101包括受话器1011和扬声器1012。在本申请实施例中,第一终端设备101的受话器1011和扬声器1012可以分别设置于该第一终端设备1011的两侧。如图1所示,受话器1011位于第一终端1011在长度方向的中线1013一侧,扬声器1012位于中线1013另一侧。
受话器1011和扬声器1012均可以向外界输出来自第二终端设备102的音频信号,进行通话时,第一终端设备101的用户(即听者)可以根据自身的需要选择以受话器1011输出音频信号或者以扬声器1012输出音频信号。本申请实施例的技术方案主要基于以受话器1011作为音频信号输出端的场景。当以受话器1011作为音频信号的输出端时,扬声器1012输出掩蔽声音信号。该掩蔽声音信号用于向远场掩蔽音频信号。
参见图2,该图为图1所示的第一终端设备101与近场听者耳朵及远场旁听者的距离示意图。
图2中,当以第一终端设备1011的受话器1011作为音频信号的输出端时,听者(第一终端设备1011的用户,即音频信号的目标接收者)与该第一终端设备101的距离远小于旁听者(位于听者附近的其他人,即音频信号的非目标接收者)与第一终端设备101的距离。因此,在本申请实施例中,听者的耳朵201位于第一终端设备101的近场,旁听者的耳朵202位于第一终端设备101的远场。
当以受话器1011作为音频信号的输出端时,听者耳朵201贴近受话器1011的位置。此时,听者耳朵201的耳参考点(英文全称:Ear Reference Point,英文缩写:ERP)处与受话器1011的第一距离d1大约为5mm,听者耳朵201的ERP处与扬声器1012的第二距离d2大约为150mm。可见,第二距离d2与第一距离d1相差接近30倍。旁听者耳朵202的ERP处与受话器1011的第三距离d3大约为500mm,旁听者耳朵202的ERP处与扬声器1012的第四距离d4大约为500mm。可见,第四距离d4与第三距离d3的差异非常小,二者相差倍数远小于30。
基于脉动球源在空间辐射的声压与辐射径向距离的关系,以及声压级与声压的转换关系,可以得到声压级与距离的关系:声压振幅随着径向距离的增大而反比例地减小。假设受话器1011和扬声器1012分别输出同等音量的音频信号和掩蔽声音信号,由于第二距离d2与第一距离d1相差接近30倍,因此听者耳朵201听到的掩蔽声音信号的衰减幅度大于听到的音频信号的衰减幅度。因此,掩蔽声音信号不会干扰听者耳朵201收听音频信号。而由于第四距离d4与第三距离d3的差异非常小,通常小于2倍,因此旁听者耳朵202听到的掩蔽声音信号的衰减幅度与听到的音频信号的衰减幅度接近,旁听者耳朵202收听到的掩蔽声音信号和声音信号得到的响度基本一致。此外,由于掩蔽声音信号是根据音频信号确定的,因此掩蔽声音信号能够向远场掩蔽音频信号。
需要说明的是,图2所示的第一距离d1、第二距离d2、第三距离d3和第四距离 d4的数值仅为示例。实际应用中,第一距离d1可能与听者的听力相关,或者与听者手持第一终端设备的姿势相关。例如听者听力较差,第一距离d1可能小于5mm;而如果听者听力较好,第一距离d1可能大于5mm,例如d1=10mm。随着第一终端设备101的长度变化,第二距离d2可能大于或小于150mm。另外,第三距离d3和第四距离d4的数值也可能根据旁听者与第一终端设备的相对位置变化而改变,因此第三距离d3和第四距离d4也可能大于500mm。因此,本申请实施例对d1、d2、d3和d4的数值大小不进行限定。
参见图3,该图为本申请实施例提供的一种声音的掩蔽方法流程图。
如图3所示,声音的掩蔽方法包括:
S301:判断终端设备是否通过受话器作为音频信号的输出端,如果是,则执行S302。
受话器输出的音频信号可以包括人的语音信号、动物的声音信号或音乐等。此处对音频信号的类型和具体内容不进行限定。
在一种可能的实现方式中,判断受话器是否接收到音频信号输出指令,当受话器接收到音频信号输出指令时,表示以受话器作为音频信号的输出端。而当扬声器接收到音频信号输出指令时,表示以扬声器作为音频信号的输出端。实际应用中音频信号输出指令仅发向受话器,或者仅发向扬声器。
由于扬声器工作时提供语音外放的功能,因此当终端设备以扬声器作为音频信号的输出端时,表示终端设备的用户不需要对通话内容进行保密,此时无需执行本实施例方法的后续操作来掩蔽音频信号。
而当终端设备以受话器作为音频信号的输出端时,终端设备的用户与对方终端设备的用户通话过程中可能涉及到隐私或机密。如果受话器漏音,则旁听者有可能获知音频信号中的隐私信息或机密信息。为此,需要对音频信号进行掩蔽。
S302:当终端设备通过受话器作为音频信号的输出端时,根据音频信号确定掩蔽声音信号。
例如,对方终端设备提供的音频信号具有对应的时域特性和频域特性,本申请实施例中掩蔽声音信号具体可以是依据实时的音频信号的时域特性和/或频域特性进行分析后实时生成的。
另外,还可以预先对历史音频信号的时域特性和/或频域特性进行分析后,构建包含多种掩蔽声音信号的音频库。当终端设备再次以受话器作为音频信号的输出端时,根据本次的音频信号的时域特性和/或频域特性,从预设的音频库中选择或匹配出掩蔽声音信号,用以掩蔽本次的音频信号。
依据音频信号确定的掩蔽声音信号在时域和/或频域上,与音频信号的匹配度更高,能够更好地向远场掩蔽音频信号,降低处于远场的旁听者对漏音中隐私信息或机密信息的可懂度。
S303:扬声器发射掩蔽声音信号,所述掩蔽声音信号用于向远场掩蔽受话器输出的音频信号。
实际应用中,可以根据多种可能的触发条件控制扬声器发射掩蔽声音信号。
在一种可能的实现方式中,终端设备以受话器作为音频信号输出端的整个通话过程中,控制扬声器持续发射掩蔽声音信号。
当终端设备周围环境比较嘈杂时,终端设备的受话器即便漏音,通话内容也很难被旁听者获知。而当终端设备周围环境比较安静时,受话器漏音便容易致使通话内容中的隐私信息或机密信息泄露。为避免此问题,在另一种可能的实现方式中,检测终端设备周围环境的声音信号,当周围环境的声音信号的幅值低于第一阈值时,表示周围环境过于安静。此时需要控制扬声器发射掩蔽声音信号。
在终端设备的通话过程中,音频信号可能存在空白或幅值较小的时段,此时掩蔽声音信号起不到掩蔽音频信号的作用,反而对旁听者造成噪声干扰。为了避免此问题,在又一种可能的实现方式中,当检测到有下行音频信号时,判断下行音频信号的幅值是否大于第二预设阈值。当下行音频信号的幅值大于第二预设阈值时,表示下行音频信号的音量较大,受话器有可能漏音。此时,控制扬声器发射掩蔽声音信号。
实际应用中,可以依据用户的选择实施本实施例提供的声音的掩蔽方法。例如,终端设备中安装了一个具有掩蔽音频信号的功能的应用程序(英文全称:Applicaiton,英文缩写:APP),APP运行时,用户可以操控该APP上的功能选项即可选择开启或关闭掩蔽音频信号的功能。当功能开启时,该终端设备即可执行本申请实施例提供的声音的掩蔽方法。另外,终端设备中还可以在通话界面中嵌入功能模块,该功能模块可以根据用户的选择开启或关闭,当功能模块开启时,终端设备即可执行本申请实施例提供的声音的掩蔽方法。
此外,上述APP或功能模块还可以自动开启。例如,检测到下行音频信号时自动开启,接收到通话请求时自动开启,或者终端设备开机后自动开启。
以上即为本申请实施例提供的声音的掩蔽方法。该方法应用在具有受话器和扬声器的终端设备。当终端设备通过受话器作为音频信号的输出端时,根据音频信号确定掩蔽声音信号。再控制扬声器发射掩蔽声音信号。由于该掩蔽声音信号是依据音频信号确定的,并且扬声器与受话器相对于远场的距离差异较小,因此掩蔽声音信号能够较好地掩蔽受话器的漏音,降低旁听者对漏音的可懂度,防止通话语音中的信息泄露。此外,由于掩蔽声音信号和音频信号分别由扬声器和受话器输出,当听者以受话器收听音频信号时,扬声器与受话器相对于听者耳朵的距离差异较大,因此掩蔽声音信号对听者收听音频信号的干扰较小,不会影响听者的通话质量。
实际应用中,如果两种声音信号到达听者耳朵的声压级相差15dB,便能使听者耳朵感受到明显的差异。为了避免掩蔽音干扰听者的通话质量,在一种可能的实现方式中,终端设备的扬声器与听者耳朵的第二距离d2大于受话器与听者耳朵的第一距离d1。作为示例,d2为d1的10倍以上,音频信号到达听者耳朵的声压级比掩蔽声音信号到达听者耳朵的声压级高出20dB以上,此时听者耳朵收听音频信号不会受到掩蔽声音信号的干扰。
以图1所示的第一终端设备101为例,第一终端设备101的长度为L1,宽度为W1,扬声器1012和受话器1011的距离为L2。L2满足以下不等式(1)-(3)中的至 少一个:
L2>W1                           (1)
L2>0.5*L1                       (2)
L2>100mm                        (3)
当L2满足不等式(1)-(3)中的至少一个时,可以保证第一距离d1远小于第二距离d2,从而保证掩蔽声音信号相比于音频信号以更小的声压级到达听者耳朵,避免掩蔽声音信号对听者的通话质量造成干扰。
在另一种可能的实现方式中,L2满足以下不等式(4):
L2≥20*d1                          (4)
本申请实施例提供的声音的掩蔽方法中,包括多种生成掩蔽声音信号的实现方式。下面结合实施例和附图展开说明。
参见图4,该图为本申请实施例提供的另一种声音的掩蔽方法流程图。
如图4所示,该声音的掩蔽方法包括:
S401:判断终端设备是否通过受话器作为音频信号的输出端,如果是,则执行S402。
S401的实现方式与前述方法实施例中S301的实现方式基本相同,S401的相关描述可参照前述实施例,此处不做赘述。
S402:当终端设备通过受话器作为音频信号的输出端时,对音频信号进行频谱分析,获得频谱响应。
对一段声音信号进行频谱分析得到频谱响应属于本领域比较成熟的技术,故此处对S402的具体实现方式不做赘述。为便于理解,可参见图5,该图为执行S402得到的频谱响应示意图。图5横轴表示频率(单位:Hz),纵轴表示信号幅度(单位:dBFS)。
S403:根据频谱响应生成掩蔽声音信号。
在实际应用中,可以将S402得到的频谱响应曲线作为滤波器,进而生成掩蔽声音信号。生成的掩蔽声音信号包括多种可能的形式,例如:掩蔽声音信号可以是随机噪声信号,如频率响应曲线与音频信号一致的白噪信号或粉噪信号。
S404:扬声器发射掩蔽声音信号。
S404的实现方式与前述方法实施例中S303的实现方式基本相同,S404的相关描述可参照前述实施例,此处不做赘述。
本申请实施例提供的声音的掩蔽方法中,由于掩蔽声音信号是依据音频信号的频谱响应生成的,因此掩蔽声音信号在频谱上与音频信号具备一样相似或相同的特性曲线。掩蔽声音信号的幅值与音频信号的幅值可以相同,也可以有所区别。参见图6,该图中掩蔽声音信号601的频谱特性曲线和音频信号602的频谱特性曲线非常相似,因此掩蔽声音信号601向远场遮蔽该音频信号602的效果较好。
需要说明的是,在上文介绍的实施例中,掩蔽声音信号可以是根据本次通话的音频信号实时生成的,例如根据本次通话的音频信号的前n毫秒生成(n为正数,且n毫秒小于下行音频信号的总时长)。
另外,掩蔽声音信号还可以是根据终端设备与对端的历史通话的音频信号预先生 成的。例如,对端设备102前一次与本端设备101通话时,向本端设备101发送了音频信号,该音频信号包含对端设备102的用户A2声音的频谱特征。在对端设备102再一次与本端设备101建立通讯连接之前,根据其提供的音频信号的频谱响应生成该用户A2对应的掩蔽声音信号V2。以此类推,对于用户A3,也可建立其对应的掩蔽声音信号V3。如此即可以建立该终端设备101的通讯录中每个联系人与掩蔽声音信号的映射表,并将掩蔽声音信号V2、V3等加入到音频库中。当通讯录中的联系人通过其持有的终端设备与本端设备101建立通讯连接时,若以终端设备101的受话器作为音频信号的输出端时,即可以直接利用映射表从音频库中选择或匹配,获得该联系人对应的掩蔽声音信号,进而向远场对该联系人的音频信号进行掩蔽。
通过上述方法预先生成掩蔽声音信号,提升了掩蔽声音信号的生成效率,并且掩蔽效果更加具有针对性。在该实现方式中,由于S402和S403是预先完成的,因此生成后每次实施本实施例方法时,仅执行S401和S404。
实际应用中,为了进一步防止掩蔽声音信号对听者通话造成干扰,还可以对音频信号进行处理,以此弱化掩蔽声音信号对听者的影响。下面结合附图和实施例进行说明。
参见图7,该图为本申请实施例提供的信号处理示意图。
如图7所示,终端设备中音频信号被分为两路,两路音频信号701和702的内容相同。其中,音频信号701提供给受话器,音频信号702提供给扬声器。作为一种可能的实现方式,音频信号702可以是通过复制音频信号701得到的。
下面首先介绍掩蔽声音信号的生成流程。为了降低旁听者对受话器漏音的可懂度,本申请实施例中按照预设帧长截取音频信号702,获得截取后的声音片段;其后将声音片段进行时域反转,获得反转声音。在一种可能的实现方式中,预设帧长可以是固定帧长,也可以是浮动帧长(即帧长可变)。可以理解的是,如果预设帧长的取值过大,可能需要花费过长的时间来生成掩蔽声音信号,影响听者的通话感受。本申请实施例中,预设帧长的取值范围为10ms-300ms,保证以较快的频率对声音片段进行时域反转,以便于实时向远场掩蔽音频信号。
参见图8和图9,其中,图8所示为截取后的声音片段,图9为图8所示的声音片段经过时域反转后的反转声音。可以理解的是,反转声音与反转前的声音片段相比,可懂度大大降低。利用反转声音即可生成对应的掩蔽声音信号703。例如,将每帧反转声音直接拼接生成掩蔽声音信号,或者利用窗函数对反转声音进行处理后,将处理得到的声音拼接生成掩蔽声音信号。
实际应用中,掩蔽声音信号703的生成时间相对于音频信号701可能存在迟滞。例如,掩蔽声音信号703滞后于音频信号701若干毫秒。为了进一步提升掩蔽效果,本申请实施例中还可以获取根据音频信号702生成掩蔽声音信号703的时间长度,依据该时间长度对音频信号701进行延迟,以使受话器输出的音频信号701与扬声器输出的掩蔽声音信号703相适应,例如局部对齐或完全对齐。例如,生成掩蔽声音信号花费了10ms的时间,则将音频信号701进行10ms的延迟。另外,还可以按照预设延 时长度对音频信号701进行延时,该预设延时长度的取值范围为10ms-300ms。需要说明的是,在本实施例中对音频信号701进行延迟为可选的操作,而非必需执行的操作。
掩蔽声音信号703直接提供给扬声器,以便扬声器输出该掩蔽声音信号。另外,为了减少掩蔽声音信号703对近场听者耳朵的干扰,本申请实施例中还可以利用该掩蔽声音信号703对音频信号701进行处理。具体实现时,对掩蔽声音信号进行相位反相处理,获得反相声音信号704。在一种可能的实现方式中,反相处理的相位范围是90度-270度,以此保证该反相声音信号704对掩蔽声音信号703具有较好的补偿能力。
图10为掩蔽声音信号703和反相声音信号704的示意图。本申请实施例中,将反相声音信号704进行降低幅值处理后与音频信号701进行混音,获得混音声音信号705,最后将该混音声音信号705提供给受话器,以便受话器播放混音声音信号705。降低幅值处理可以通过均衡器实现,也可以通过增益控制或滤波处理的方式实现。此处对降低幅值处理的具体实现方式不进行限定。
由于混音声音信号705是根据反相声音信号704和音频信号701混音而成,因此,该混音声音信号705包含了有效的通话内容。同时,混音声音信号705中反相声音信号704的成分在混音声音信号705输出后,也能够在近场补偿掩蔽声音信号703,抵消扬声器播放的掩蔽声音信号对听者通话质量。另外,降低幅值处理后,也削弱了混音声音信号705中的反相声音信号704对听者耳朵的干扰影响。
如图10所示,在一种可能的实现方式中,还可以对掩蔽声音信号703进行增强处理,从而获得增强后的掩蔽声音信号,增强后的掩蔽声音信号可以提供给扬声器,以进行发射。具体实现时,由于漏音的中高音频域与旁听者对漏音的可懂度紧密关联,因此可以利用均衡器对掩蔽声音信号的中高音频域进行增强,以加强对受话器输出的音频信号在中高音频域的掩蔽效果。增强处理可以通过均衡器、增益控制或滤波处理来实现,此处对增强处理的实现方式不进行限定。
参见图11,该图中曲线1101表示增强之前的掩蔽声音信号,曲线1102表示增强后的掩蔽声音信号。通过增强掩蔽声音信号703,提升掩蔽声音信号对漏音的掩蔽效果。可以理解的是,在实际应用中还可以采用其他方式对掩蔽声音信号703进行增强,本实施例中对增强掩蔽声音信号703的具体方式、实施增强的频域以及增强的幅度不加以限定。
在上文和图7中介绍了截取声音片段后,通过反转声音片段来生成掩蔽声音信号的实现方式。下面描述利用截取后的声音片段生成掩蔽声音信号的另一种实现方式。
本申请实施例中,按照预设帧长截取音频信号702,获得截取后的声音片段。其后可以将声音片段进行插值,得到补充后的声音信息;或者从预设音频库中匹配后续片段,从而得到补充后的声音信息。最后,根据补充后的声音信息生成对应的掩蔽声音信号。
作为一示例,可以对截取后的声音片段进行预处理,提取其中的特征参数(例如字节、音调等),其后利用特征参数和预先训练的经验模型对该声音片段进行插值,抽查之后得到补充后的声音信息。
作为另一示例,预先构建音频库,该音频库中每一段声音片段至少匹配一段其他的声音片段。当获得截取后的声音片段后,依据该声音片段从预设音频库中获得其匹配的任意一段声音片段,该匹配出的声音片段称为后续片段。利用声音片段和后续片段即获得补充后的声音信息。
本申请实施例中通过补充后的声音信息生成掩蔽声音信号,降低了截取声音片段的频率,提升掩蔽声音信号的生成效率。该掩蔽声音信号生成后,作为一种可选的实现方式,为了提升掩蔽效果,也可以将音频信号701延迟,以使掩蔽声音信号与音频信号701对齐播放。
经测试,通过执行以上实施例中提供的声音掩蔽方法,显著降低了500mm处旁听者对漏音的可懂度,使其对漏音的可懂度从实施本实施例方法之前的90%降低至10%以下。远场对漏音的单字可懂度小于30%,对句子的可懂度小于10%。此外,对于周边环境的噪声影响在6dB以下,相比于实施该方法之前,对周围环境的响度影响无明显变化。实施该方法对近场听者的音频可懂度几乎没有影响,因此本实施例提供的声音掩蔽方法能够在不改变听者通话质量的基础上,向远场有效地掩蔽受话器漏音。
基于前述实施例提供的声音的掩蔽方法,相应地,本申请还提供一种声音的掩蔽装置。以下结合附图和实施例进行说明。
参见图12,该图为本申请实施例提供的声音的掩蔽装置的结构示意图。该图所示的声音的掩蔽装置120可以应用在图1和图2所示的第一终端设备101中。
如图12所示,该装置120包括:
判断模块1201用于判断终端设备是否以受话器作为音频信号的输出端;
确定模块1203用于当判断模块的判断结果为是时,根据音频信号确定掩蔽声音信号;
第一控制模块1202用于控制扬声器发射掩蔽声音信号,以向远场掩蔽受话器输出的音频信号。
由于该掩蔽声音信号是依据音频信号确定的,并且扬声器与受话器相对于远场的距离差异较小,因此掩蔽声音信号能够较好地掩蔽受话器的漏音,降低旁听者对漏音的可懂度,防止通话语音中的信息泄露。此外,由于掩蔽声音信号和音频信号分别由扬声器和受话器输出,当听者以受话器收听音频信号时,扬声器与受话器相对于听者耳朵的距离差异较大,因此掩蔽声音信号对听者收听音频信号的干扰较小,不会影响听者的通话质量。
在一种可能的实现方式中,确定模块1203用于根据所述音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
在一种可能的实现方式中,扬声器与听者耳朵的距离大于受话器与听者耳朵的距离。
在一种可能的实现方式中,确定模块1203,用于根据音频信号生成掩蔽声音信号。
如图13所示的信号生成模块的结构示意图,在一种可能的实现方式中,确定模块 1203具体包括:
频谱分析单元12031用于对音频信号进行频谱分析,获得频谱响应;
第一生成单元12032用于根据频谱响应生成掩蔽声音信号。
可以理解的是,由于掩蔽声音信号是依据音频信号的频谱响应生成的,因此掩蔽音与被掩蔽音在频谱特性上存在一致性或相似性。进而,掩蔽声音信号能够较好地掩蔽受话器播放的声音信号。
如图14所示的信号生成模块的结构示意图,在另一种可能的实现方式中,确定模块1203具体包括:
信号截取单元12033用于按照预设帧长截取音频信号,获得截取后的声音片断;
信号反转单元12034用于将声音片断进行时域反转,获得反转声音;
第二生成单元12035用于根据反转声音生成对应的掩蔽声音信号。
通过反转声音片段,并依据反转声音生成对应的掩蔽声音信号,能够以该生成的掩蔽声音信号降低远场旁听者对漏音的可懂度。进而保证了通话内容中隐私信息或机密信息的安全性。
如图15所示的信号生成模块的结构示意图,在又一种可能的实现方式中,确定模块1203具体包括:
信号截取单元12033用于按照预设帧长截取音频信号,获得截取后的声音片断;
信号补充单元12036用于将声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;
第三生成单元12037用于根据补充后的声音信号生成对应的掩蔽声音信号。
通过补充声音信号,降低对声音信号的截取频率,提升生成掩蔽声音信号的效率。从而,避免听者对声音信号的收听等待时间,提升听者的通话体验。
在一种可能的实现方式中,还包括:
时间长度获取模块,用于获得根据音频信号生成掩蔽声音信号的时间长度;
延迟模块,用于根据时间长度对音频信号进行延迟,以使受话器输出的声音信号与扬声器输出的掩蔽声音信号相适应。
通过延迟受话器输出的声音信号,保证掩蔽声音信号同步掩蔽受话器输出的音频信号,提升掩蔽效果。
在一种可能的实现方式中,还包括:
反相处理模块,用于将掩蔽声音信号进行相位反相处理,获得反相声音信号;
混音模块,用于将反相声音信号进行EQ或幅值处理后和音频信号进行混音,获得混音声音信号;
第二控制模块,用于控制受话器输出混音声音信号。
通过反相处理模块的处理获得的反相声音信号能够补偿掩蔽声音信号,在一定程度上抵消掩蔽声音信号对近场听者耳朵的干扰,保证听者的通话质量。
在一种可能的实现方式中,第一控制模块1203,具体包括:
第一检测单元,用于检测周围环境的声音信号;
第一判断单元,用于判断周围环境的声音信号的幅值是否低于第一预设阈值;
第一控制单元,用于当第一判断单元的判断结果为是时,通过扬声器发射掩蔽声音信号。
将周围环境的声音信号的幅值低于第一预设阈值作为通过扬声器发射掩蔽声音信号的触发条件,防止因为周围环境过于安静以至于环境中的旁听者收听到受话器的漏音。因此,避免了漏音中的隐私信息或机密信息被泄露。
在一种可能的实现方式中,第一控制模块1203,具体包括:
第二检测单元,用于检测是否有下行音频信号;
第二判断单元,用于当第二检测单元检测到有下行音频信号时,判断下行音频信号的幅值是否大于第二预设阈值;
第二控制单元,用于当第一判断单元判断结果为是时,通过扬声器发射掩蔽声音信号。
将下行音频信号的幅值大于预设阈值作为通过扬声器发射掩蔽声音信号的触发条件,避免对周围环境中的旁听者造成不必要的噪声干扰。
在一种可能的实现方式中,还包括:
信号增强模块,用于对掩蔽声音信号进行增强处理,获得增强后的掩蔽声音信号。通过增强掩蔽声音信号,使增强后的掩蔽声音信号对受话器播放的声音信号起到更有效的掩蔽作用,降低旁听者对漏音的可懂度。
基于前述实施例提供的声音的掩蔽方法和声音的掩蔽装置,相应地,本申请还提供一种终端设备。该终端设备可以是图1和图2中所示的第一终端设备101,关于该终端设备的应用场景可以参见图1和图2,此处不加以赘述。下面结合实施例和附图描述本申请实施例提供的终端设备的结构实现。
参见图16,该图为本申请实施例提供的一种终端设备的结构示意图。
如图16所示,该终端设备160包括:受话器1601、扬声器1602和处理器1603。
其中,处理器1603用于当以受话器1601输出音频信号时,根据音频信号确定掩蔽声音信号;
扬声器1602用于发射掩蔽声音信号,以向远场掩蔽受话器1011输出的音频信号。扬声器1602可以在处理器1603的控制下输出该掩蔽声音信号。
由于该掩蔽声音信号是依据音频信号确定的,并且扬声器1602与受话器1601相对于远场的距离差异较小,因此掩蔽声音信号能够较好地掩蔽受话器1601的漏音,降低旁听者对漏音的可懂度,防止通话语音中的信息泄露。此外,由于掩蔽声音信号和声音信号分别由扬声器1602和受话器1601输出,当听者以受话器1601收听声音信号时,扬声器1602与受话器1601相对于听者耳朵的距离差异较大,扬声器1602发射的掩蔽声音信号在听者耳朵处反而会被受话器1601发出的音频信号所掩蔽,因此掩蔽声音信号对听者收听声音信号的干扰较小,不会影响听者的通话质量。
在一种实现方式中,处理器1603,具体用于当以所述受话器输出音频信号时,根据音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
在一种实现方式中,扬声器1602与听者耳朵的距离大于受话器1601与听者耳朵的距离。
在一种实现方式中,处理器1603具体用于对音频信号进行频谱分析,获得频谱响应;根据频谱响应生成掩蔽声音信号。
在一种实现方式中,处理器1603具体用于按照预设帧长截取音频信号,获得截取后的声音片断;将声音片断进行时域反转,获得反转声音;根据反转声音生成对应的掩蔽声音信号。
在一种实现方式中,处理器1603具体用于按照预设帧长截取音频信号,获得截取后的声音片断;将声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;根据补充后的声音信号生成对应的掩蔽声音信号。
在一种实现方式中,处理器1603还用于获得根据音频信号生成掩蔽声音信号的时间长度;根据时间长度对音频信号进行延迟,以使受话器1601输出的音频信号与扬声器1602输出的掩蔽声音信号相适应。
在一种实现方式中,处理器1603还用于将掩蔽声音信号进行相位反相处理,获得反相声音信号;将反相声音信号和音频信号进行混音,获得混音声音信号;控制受话器1601输出混音声音信号。
在一种实现方式中,处理器1603具体用于检测周围环境的声音信号,当周围环境的声音信号的幅值低于第一预设阈值时,控制扬声器1602输出掩蔽声音信号。
在一种实现方式中,处理器1603具体用于当检测到有下行音频信号时,判断下行音频信号的幅值大于第二预设阈值时,控制扬声器1602输出掩蔽声音信号。
在一种实现方式中,处理器1603还用于对掩蔽声音信号进行增强处理,获得增强后的掩蔽声音信号。
本申请实施例提供的终端设备160中,处理器1603可以用于执行前述方法实施例中的部分或全部步骤。关于处理器1603的功能描述以及执行方法步骤的相关技术效果可以参照前述方法实施例和装置实施例,此处不再赘述。
在图16所示的终端设备160仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端设备160可以为包括手机、平板电脑、个人数字助理(英文全称:Personal Digital Assistant,英文缩写:PDA)、销售终端设备(英文全称:Point of Sales,英文缩写:POS)、车载电脑等任意终端设备。下面以手机为例对本申请实施例提供的终端设备进行描述和说明。
图17示出的是与本申请实施例提供的终端设备相关的手机的部分结构框图。参考图17,手机170包括:射频(英文全称:Radio Frequency,英文缩写:RF)电路1710、存储器1720、输入单元1730、显示单元1740、传感器1750、音频电路1760、无线保真(英文全称:wireless fidelity,英文缩写:WiFi)模块1770、处理器1780(该处理器1780可以实现图16中所示的处理器1603的功能)、以及电源Bat等部件。本领域技术人员可以理解,图17中示出的手机结构并不构成对手机的限定,可以包括比图示 更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图17对手机的各个构成部件进行具体的介绍:
RF电路1710可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器1780处理;另外,将设计上行的数据发送给基站。通常,RF电路1710包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(英文全称:Low Noise Amplifier,英文缩写:LNA)、双工器等。此外,RF电路1710还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(英文全称:Global System of Mobile communication,英文缩写:GSM)、通用分组无线服务(英文全称:General Packet Radio Service,GPRS)、码分多址(英文全称:Code Division Multiple Access,英文缩写:CDMA)、宽带码分多址(英文全称:Wideband Code Division Multiple Access,英文缩写:WCDMA)、长期演进(英文全称:Long Term Evolution,英文缩写:LTE)、电子邮件、短消息服务(英文全称:Short Messaging Service,SMS)等。
存储器1720可用于存储软件程序以及模块,处理器1780通过运行存储在存储器1720的软件程序以及模块,从而执行手机170的各种功能应用以及数据处理。存储器1720可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机170的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器1720可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元1730可用于接收输入的数字或字符信息,以及产生与手机170的用户设置以及功能控制有关的键信号输入。具体地,输入单元1730可包括触控面板1731以及其他输入设备1732。触控面板1731,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板1731上或在触控面板1731附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板1731可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器1780,并能接收处理器1780发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板1731。除了触控面板1731,输入单元1730还可以包括其他输入设备1732。具体地,其他输入设备1732可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元1740可用于显示由用户输入的信息或提供给用户的信息以及手机170的各种菜单。显示单元1740可包括显示面板1741,可选的,可以采用液晶显示器(英文全称:Liquid Crystal Display,英文缩写:LCD)、有机发光二极管(英文全称:Organic Light-Emitting Diode,英文缩写:OLED)等形式来配置显示面板1741。进一步的,触控面板1731可覆盖显示面板1741,当触控面板1731检测到在其上或附近的触摸操作 后,传送给处理器1780以确定触摸事件的类型,随后处理器1780根据触摸事件的类型在显示面板1741上提供相应的视觉输出。虽然在图17中,触控面板1731与显示面板1741是作为两个独立的部件来实现手机170的输入和输入功能,但是在某些实施例中,可以将触控面板1731与显示面板1741集成而实现手机170的输入和输出功能。
手机170还可包括至少一种传感器1750,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板1741的亮度,接近传感器可在手机170移动到耳边时,关闭显示面板1741和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机170姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机170还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路1760、扬声器1761、传声器1762及受话器1763可提供用户与手机170之间的音频接口。音频电路1760可将接收到的音频数据转换后的电信号,传输到扬声器1761或受话器1763,由扬声器1761或受话器1763转换为声音信号输出;另一方面,传声器1762将收集的声音信号转换为电信号,由音频电路1760接收后转换为音频数据,再将音频数据输出处理器1780处理后,经RF电路1710以发送给比如另一手机170,或者将音频数据输出至存储器1720以便进一步处理。
WiFi属于短距离无线传输技术,手机170通过WiFi模块1770可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图17示出了WiFi模块1770,但是可以理解的是,其并不属于手机170的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器1780是手机170的控制中心,利用各种接口和线路连接整个手机170的各个部分,通过运行或执行存储在存储器1720内的软件程序和/或模块,以及调用存储在存储器1720内的数据,执行手机170的各种功能和处理数据,从而对手机170进行整体监控。可选的,处理器1780可包括一个或多个处理单元;优选的,处理器1780可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1780中。
手机170还包括给各个部件供电的电源Bat(比如电池),优选的,电源可以通过电源管理系统与处理器1780逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,手机170还可以包括摄像头、蓝牙模块等,在此不再赘述。
应当理解,在本申请中,“至少一个(项)”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,用于描述关联对象的关联关系,表示可以存在三种关系,例如,“A和/或B”可以表示:只存在A,只存在B以及同时存在A和B三种情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一 项(个)”或其类似表达,是指这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b或c中的至少一项(个),可以表示:a,b,c,“a和b”,“a和c”,“b和c”,或“a和b和c”,其中a,b,c可以是单个,也可以是多个。
以上所述,仅是本申请的较佳实施例而已,并非对本申请作任何形式上的限制。虽然本申请已以较佳实施例揭露如上,然而并非用以限定本申请。任何熟悉本领域的技术人员,在不脱离本申请技术方案范围情况下,都可利用上述揭示的方法和技术内容对本申请技术方案做出许多可能的变动和修饰,或修改为等同变化的等效实施例。因此,凡是未脱离本申请技术方案的内容,依据本申请的技术实质对以上实施例所做的任何简单修改、等同变化及修饰,均仍属于本申请技术方案保护的范围内。

Claims (16)

  1. 一种声音的掩蔽方法,其特征在于,应用于终端设备,所述终端设备包括受话器和扬声器;
    该方法包括:
    当所述终端设备通过所述受话器作为音频信号的输出端时,根据所述音频信号确定掩蔽声音信号;
    所述扬声器发射所述掩蔽声音信号,所述掩蔽声音信号用于向远场掩蔽所述受话器输出的音频信号。
  2. 根据权利要求1所述的掩蔽方法,其特征在于,根据所述音频信号确定所述掩蔽声音信号,具体包括:
    根据所述音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
  3. 根据权利要求1所述的方法,其特征在于,根据所述音频信号确定掩蔽声音信号,具体包括:
    对所述音频信号进行频谱分析,获得频谱响应;
    根据所述频谱响应生成所述掩蔽声音信号。
  4. 根据权利要求1所述的方法,其特征在于,根据所述音频信号确定掩蔽声音信号,具体包括:
    按照预设帧长截取所述音频信号,获得截取后的声音片断;
    将所述声音片断进行时域反转,获得反转声音;
    将所述反转声音直接拼接生成所述掩蔽声音信号,或者通过窗函数之后再拼接生成所述掩蔽声音信号。
  5. 根据权利要求1所述的方法,其特征在于,根据所述音频信号确定掩蔽声音信号,具体包括:
    按照预设帧长截取所述音频信号,获得截取后的声音片断;
    将所述声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;
    根据所述补充后的声音信号生成对应的掩蔽声音信号。
  6. 根据权利要求4或5所述的方法,其特征在于,还包括:
    获得根据所述音频信号生成所述掩蔽声音信号的时间长度;
    根据所述时间长度对所述音频信号进行延迟,以使所述受话器输出的音频信号与所述扬声器输出的掩蔽声音信号相适应。
  7. 根据权利要求1-5任一项所述的方法,其特征在于,还包括:
    将所述掩蔽声音信号进行相位反相处理,获得反相声音信号;
    将所述反相声音信号进行降低幅值处理后和所述音频信号进行混音,获得混音声音信号;
    所述受话器输出所述混音声音信号。
  8. 根据权利要求1-5任一项所述的方法,其特征在于,所述通过所述扬声器发射 所述掩蔽声音信号,具体包括:
    当检测到有下行音频信号时,判断所述下行音频信号的幅值大于第二预设阈值时,通过所述扬声器发送掩蔽声音信号。
  9. 一种终端设备,其特征在于,包括:受话器、扬声器和处理器;
    所述处理器,用于当以所述受话器输出音频信号时,根据所述音频信号确定或生成掩蔽声音信号;
    所述扬声器,用于发射所述掩蔽声音信号,以向远场掩蔽所述受话器输出的所述音频信号。
  10. 根据权利要求9所述的终端设备,其特征在于,
    所述处理器,具体用于当以所述受话器输出音频信号时,根据所述音频信号从预先生成的音频库中选择或匹配对应的掩蔽声音信号。
  11. 根据权利要求9所述的终端设备,其特征在于,
    所述处理器,具体用于对所述音频信号进行频谱分析,获得频谱响应;根据所述频谱响应生成所述掩蔽声音信号。
  12. 根据权利要求9所述的终端设备,其特征在于,
    所述处理器,具体用于按照预设帧长截取所述音频信号,获得截取后的声音片断;将所述声音片断进行时域反转,获得反转声音;根据所述反转声音生成对应的掩蔽声音信号。
  13. 根据权利要求9所述的终端设备,其特征在于,
    所述处理器,具体用于按照预设帧长截取所述音频信号,获得截取后的声音片断;将所述声音片断进行插值,获得补充后的声音信号;或者从预设音频库中匹配后续片断,获得补充后的声音信号;根据所述补充后的声音信号生成对应的掩蔽声音信号。
  14. 根据权利要求12或13所述的终端设备,其特征在于,
    所述处理器,还用于获得根据所述音频信号生成所述掩蔽声音信号的时间长度;根据所述时间长度对所述音频信号进行延迟,以使所述受话器输出的音频信号与所述扬声器输出的掩蔽声音信号相适应。
  15. 根据权利要求9-13任一项所述的终端设备,其特征在于,
    所述处理器,还用于将所述掩蔽声音信号进行相位反相处理,获得反相声音信号;将所述反相声音信号进行降低幅值处理后和所述音频信号进行混音,获得混音声音信号;控制所述受话器输出所述混音声音信号。
  16. 根据权利要求9-13任一项所述的终端设备,其特征在于,
    所述处理器,具体用于当检测到有下行音频信号时,判断所述下行音频信号的幅值大于第二预设阈值时,通过所述扬声器发射所述掩蔽声音信号。
PCT/CN2020/141881 2020-03-20 2020-12-31 一种声音的掩蔽方法、装置及终端设备 WO2021184920A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
MX2022011638A MX2022011638A (es) 2020-03-20 2020-12-31 Método y aparato de enmascaramiento de sonido y dispositivo de terminal.
BR112022018722A BR112022018722A2 (pt) 2020-03-20 2020-12-31 Método e aparelho de mascaramento de som e dispositivo terminal
EP20926107.2A EP4109863A4 (en) 2020-03-20 2020-12-31 METHOD AND APPARATUS FOR MAKING SOUND, AND TERMINAL DEVICE
US17/947,600 US20230008818A1 (en) 2020-03-20 2022-09-19 Sound masking method and apparatus, and terminal device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010202057.5 2020-03-20
CN202010202057.5A CN113497849A (zh) 2020-03-20 2020-03-20 一种声音的掩蔽方法、装置及终端设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/947,600 Continuation US20230008818A1 (en) 2020-03-20 2022-09-19 Sound masking method and apparatus, and terminal device

Publications (1)

Publication Number Publication Date
WO2021184920A1 true WO2021184920A1 (zh) 2021-09-23

Family

ID=77770283

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141881 WO2021184920A1 (zh) 2020-03-20 2020-12-31 一种声音的掩蔽方法、装置及终端设备

Country Status (6)

Country Link
US (1) US20230008818A1 (zh)
EP (1) EP4109863A4 (zh)
CN (2) CN116684514A (zh)
BR (1) BR112022018722A2 (zh)
MX (1) MX2022011638A (zh)
WO (1) WO2021184920A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405589A (zh) * 2023-06-07 2023-07-07 荣耀终端有限公司 声音处理方法及相关装置

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299990A (zh) * 2022-01-28 2022-04-08 杭州老板电器股份有限公司 吸油烟机的异音识别及音频注入的控制方法和系统
CN115278453A (zh) * 2022-04-07 2022-11-01 长城汽车股份有限公司 通话隐私保护系统、方法、存储介质及车辆
CN116320123B (zh) * 2022-08-11 2024-03-08 荣耀终端有限公司 一种语音信号的输出方法和电子设备
CN117714583A (zh) * 2022-09-08 2024-03-15 北京荣耀终端有限公司 一种电子设备、电子设备的参数确定方法及装置
CN117119092A (zh) * 2023-02-22 2023-11-24 荣耀终端有限公司 一种音频处理方法及电子设备
CN117692843B (zh) * 2024-02-02 2024-04-16 江西斐耳科技有限公司 一种声音自动调节方法、系统、存储介质及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100322430A1 (en) * 2009-06-17 2010-12-23 Sony Ericsson Mobile Communications Ab Portable communication device and a method of processing signals therein
CN108494918A (zh) * 2018-05-28 2018-09-04 维沃移动通信有限公司 一种移动终端
CN109445746A (zh) * 2018-12-28 2019-03-08 北京小米移动软件有限公司 一种电子设备
CN110602696A (zh) * 2019-10-30 2019-12-20 维沃移动通信有限公司 通话隐私保护方法和电子设备
CN111212364A (zh) * 2020-03-19 2020-05-29 锐迪科微电子(上海)有限公司 音频输出设备及其漏音消除方法

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060109983A1 (en) * 2004-11-19 2006-05-25 Young Randall K Signal masking and method thereof
KR100643310B1 (ko) * 2005-08-24 2006-11-10 삼성전자주식회사 음성 데이터의 포먼트와 유사한 교란 신호를 출력하여송화자 음성을 차폐하는 방법 및 장치
CN102110441A (zh) * 2010-12-22 2011-06-29 中国科学院声学研究所 一种基于时间反转的声掩蔽信号产生方法
CN102238452A (zh) * 2011-05-05 2011-11-09 安百特半导体有限公司 一种在免提耳机中主动抗噪声的方法及免提耳机
US9361903B2 (en) * 2013-08-22 2016-06-07 Microsoft Technology Licensing, Llc Preserving privacy of a conversation from surrounding environment using a counter signal
CN107071119B (zh) * 2017-04-26 2019-10-18 维沃移动通信有限公司 一种声音消除方法及移动终端

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100322430A1 (en) * 2009-06-17 2010-12-23 Sony Ericsson Mobile Communications Ab Portable communication device and a method of processing signals therein
CN108494918A (zh) * 2018-05-28 2018-09-04 维沃移动通信有限公司 一种移动终端
CN109445746A (zh) * 2018-12-28 2019-03-08 北京小米移动软件有限公司 一种电子设备
CN110602696A (zh) * 2019-10-30 2019-12-20 维沃移动通信有限公司 通话隐私保护方法和电子设备
CN111212364A (zh) * 2020-03-19 2020-05-29 锐迪科微电子(上海)有限公司 音频输出设备及其漏音消除方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4109863A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116405589A (zh) * 2023-06-07 2023-07-07 荣耀终端有限公司 声音处理方法及相关装置
CN116405589B (zh) * 2023-06-07 2023-10-13 荣耀终端有限公司 声音处理方法及相关装置

Also Published As

Publication number Publication date
CN116684514A (zh) 2023-09-01
BR112022018722A2 (pt) 2022-11-29
CN113497849A (zh) 2021-10-12
EP4109863A1 (en) 2022-12-28
US20230008818A1 (en) 2023-01-12
EP4109863A4 (en) 2023-08-16
MX2022011638A (es) 2022-11-10

Similar Documents

Publication Publication Date Title
WO2021184920A1 (zh) 一种声音的掩蔽方法、装置及终端设备
JP6505252B2 (ja) 音声信号を処理するための方法及び装置
WO2020215965A1 (zh) 终端控制方法及终端
CN107231473B (zh) 一种音频输出调控方法、设备及计算机可读存储介质
CN108391205B (zh) 左右声道切换方法和装置、可读存储介质、终端
JP2011514019A (ja) Fm送信機付き無線ヘッドセット
CN108540900B (zh) 音量调节方法及相关产品
JP2014520284A (ja) 電子デバイス上でのマスキング信号の生成
CN106982286B (zh) 一种录音方法、设备和计算机可读存储介质
CN109379490B (zh) 音频播放方法、装置、电子设备及计算机可读介质
US20210127352A1 (en) Method and apparatus for sending a notification to a short-range wireless communication audio output device
WO2021238844A1 (zh) 音频输出方法及电子设备
WO2019033984A1 (zh) 音量调节方法、装置、终端及存储介质
WO2020073852A1 (zh) 显示控制方法及相关产品
WO2021169869A1 (zh) 音频播放装置、音频播放方法及电子设备
WO2020107290A1 (zh) 音频输出控制方法和装置、计算机可读存储介质、电子设备
CN110602696A (zh) 通话隐私保护方法和电子设备
WO2020097927A1 (zh) 通话控制方法和装置、计算机可读存储介质、电子设备
WO2021068875A1 (zh) 上行传输控制方法及终端
CN109889660B (zh) 临时信息记录方法、存储介质和移动终端
CN108391208B (zh) 信号切换方法、装置、终端、耳机及计算机可读存储介质
WO2020118496A1 (zh) 音频通路切换方法和装置、可读存储介质、电子设备
CN111432071A (zh) 通话控制方法及电子设备
WO2018166087A1 (zh) 一种终端通话方法及终端
CN111970668B (zh) 一种蓝牙音频控制方法、设备及计算机可读存储介质

Legal Events

121: Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20926107; Country of ref document: EP; Kind code of ref document: A1)
REG: Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112022018722; Country of ref document: BR)
NENP: Non-entry into the national phase (Ref country code: DE)
ENP: Entry into the national phase (Ref document number: 2020926107; Country of ref document: EP; Effective date: 20220923)
ENP: Entry into the national phase (Ref document number: 112022018722; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20220919)