CN111161699A - Method, device and equipment for masking environmental noise - Google Patents

Method, device and equipment for masking environmental noise

Info

Publication number
CN111161699A
CN111161699A (application CN201911399710.5A)
Authority
CN
China
Prior art keywords
audio
masking
noise
sound pressure
pressure level
Prior art date
Legal status
Granted
Application number
CN201911399710.5A
Other languages
Chinese (zh)
Other versions
CN111161699B (en)
Inventor
邹煜晖 (Zou Yuhui)
刘锦 (Liu Jin)
Current Assignee
Guangzhou Xinyuchao Information Technology Co ltd
Original Assignee
Guangzhou Xinyuchao Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Xinyuchao Information Technology Co ltd filed Critical Guangzhou Xinyuchao Information Technology Co ltd
Priority to CN201911399710.5A
Publication of CN111161699A
Application granted
Publication of CN111161699B
Legal status: Active

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K - SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 - Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 - Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 - Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 - Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1781 - characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions
    • G10K11/17821 - characterised by the analysis of the input signals only
    • G10K11/17823 - Reference signals, e.g. ambient acoustic environment
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D30/00 - Reducing energy consumption in communication networks
    • Y02D30/70 - Reducing energy consumption in wireless communication networks

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a method for masking environmental noise, comprising the following steps: establishing a masking audio material library, extracting features from each audio material, and classifying it as background, melody or auxiliary audio; dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands and calculating, for each sub-band, the signal-to-mask ratio of tonal components and of non-tonal components; monitoring and analyzing the environmental noise to obtain the features of its tonal and non-tonal components; recommending background audio and melody audio from the material library according to the noise's non-tonal and tonal features respectively, and recommending auxiliary audio according to the sensitivity characteristics of the human ear; and combining the three recommended audios into the final masking audio. Any equipment with a sound-collecting unit and a playback unit suffices, so the method makes full use of existing mobile phones and smart speakers; it offers a low computational load, a good masking effect and low hardware requirements.

Description

Method, device and equipment for masking environmental noise
Technical Field
The invention belongs to the field of noise detection and control, and in particular relates to a method, a device and equipment for masking environmental noise.
Background
Noise refers to sound that causes aversion or whose volume exceeds the level the human ear normally tolerates. Common noises in daily life, such as busy urban traffic, noisy street crowds, household appliances, and even people chatting and playing nearby, disturb work and study during the day and disturb sleep at night. Long-term exposure not only harms physical health but also easily induces negative emotions such as irritability and anxiety, thereby harming mental health.
Noise pollution arises when noise propagates continuously through the environment. Together with water pollution and air pollution, it ranks among the three major pollution problems, and how to manage it effectively is a problem humanity still faces.
Currently, noise is managed in three ways: control at the sound source, control in the propagation medium, and control at the ear. The first two are widely used in industrial production and building design but are not options open to individuals, so they fall outside the scope of this comparison. Controlling noise at the ear relies on devices an individual can choose, which from a physical standpoint divide into passive and active noise reduction.
Passive noise reduction uses sound-absorbing and sound-insulating materials to keep noise from entering the ear; common solutions are earplugs, earmuffs and the like, whose effect depends mainly on the chosen material. For example, earplugs are usually made of an airtight solid material that encloses the ear, or of an insulating material such as silicone that blocks external noise. This method is very effective for high-frequency sound and generally reduces noise by about 15-20 dB (decibels). It has three disadvantages: first, the effect on low-frequency noise is limited; second, it is indiscriminate, blocking all external sound including sounds one actually needs to hear; third, the effect is limited by material and design, so performance is uneven, and products that work well are either expensive or heavy and uncomfortable to wear.
Active noise reduction exploits the characteristics of the noise itself: the noise-reduction system generates a sound wave equal in magnitude and opposite to the external noise, cancelling the noise by neutralizing it. The principle is that every sound is composed of a particular spectrum; if a sound can be found whose spectrum is identical to the noise to be eliminated and whose phase is exactly opposite (180 degrees apart), the noise is completely cancelled. A common implementation is the active noise-cancelling headphone. Active noise reduction generally works very well and eliminates targeted noise effectively, but it is technically difficult, demands capable hardware and a dedicated battery, and is therefore expensive, with usage time and wearing comfort constrained to some extent by the battery and device shape. It is typically used in heavy-noise settings such as the military, airports, shooting ranges and automobiles, and is less suited to flexible personal use while sleeping or going out.
In addition to the above two approaches, masking noise with other sounds is also common in daily life; for example, some people listen to light music while working in a noisy environment in order to ignore the ambient noise. This approach exploits the auditory masking effect: the influence of one sound on the auditory system's perception of another, a phenomenon pervasive in how humans and animals perceive and localize sound. In recent years it has mainly been applied to clinical tinnitus treatment, speech enhancement, digital audio watermarking, and environmental noise control.
When the masking effect is applied to environmental noise control, it is mainly designed for large environments such as open-plan offices, usually using continuous, low-loudness, information-free sound as the masker, because such sound becomes an easily accepted noise floor that suppresses other noises without itself feeling irritating. Harsh sounds such as squealing brakes or clattering plates can be masked with softer sounds such as a fan; a restaurant near a road, for instance, can reduce traffic noise by installing and running fans. However, this only reduces the overall impact of noise and cannot meet each person's individual anti-noise needs, because first, the environment an individual occupies changes in real time, and second, sensitivity to the type and degree of noise differs from person to person; someone may find the fan sound more annoying than the sound of cars.
Yet in daily life, whether studying, working or sleeping, individuals need a relatively quiet space and a calm mood. Many people therefore mask noise by playing sounds they like. Although this is effective to a degree, noise is complex in type and composition, sound sources vary in distance, and levels vary in size, so the music most people choose must be played loudly or held very close to the ear, and the noise usually remains clearly perceptible. On the one hand, complete masking is not achieved; on the other, long-term use can damage hearing. Moreover, audio that one likes, or that can mask the current noise, may not suit the current scenario; listening to an exciting melody, for example, easily harms sleep quality.
Therefore, a more scientific and accurate method and apparatus are needed that match the level and characteristics of everyday noise and the usage scenario (such as sleeping or working) to achieve the best masking effect accurately and conveniently.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provide a method for masking environmental noise that combines a low computational load, a good masking effect and low hardware requirements.
Another object of the present invention is to provide a device for masking environmental noise, which has the advantages of low calculation amount, good masking effect, low requirement on hardware and low cost.
Another object of the present invention is to provide equipment containing the above ambient-noise-masking device. The equipment may be a stand-alone unit used only for monitoring ambient noise, an environmental-parameter monitor that measures other parameters in addition to noise, or a terminal that integrates the monitoring functions, such as a mobile phone, computer, smart speaker or other smart hardware.
The purpose of the invention is realized by the following technical scheme: a method of masking ambient noise, comprising the steps of:
establishing a masking audio material library, extracting and classifying the characteristics of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands, and calculating for each of the 25 sub-bands the signal-to-mask ratio of tonal components and of non-tonal components; to achieve tonal or non-tonal masking, the masker must exceed the masked component by at least this ratio, in decibels;
monitoring and analyzing the environmental noise to obtain the characteristics of tonal components and non-tonal components of the environmental noise;
recommending background audio in a material library according to the non-tonal component characteristics of the noise;
recommending melody audio in the material library according to the tone component characteristics of the noise;
recommending auxiliary audio in the material library aiming at the characteristics of the human ears;
and combining the recommended background audio, melody audio and auxiliary audio, and outputting the final masking audio.
Preferably, the masking audio material library is created as follows: white noise, pink noise, brown noise, musical melodies of various instruments, ambient sounds, ambient music and the like are selected as audio materials to build the library.
Preferably, after the masking audio material library is established, feature extraction is performed on each audio material, and the method comprises the following steps:
reading the pulse-code-modulation (PCM) data of an audio material second by second;
transforming each second of PCM data into a frequency response curve via the fast Fourier transform (FFT);
dividing the 20Hz-16kHz part of the frequency response curve into 25 critical sub-bands;
analyzing the 25 critical sub-bands to obtain the maximum value of the sound pressure level in each sub-band;
analyzing the audio second by second through a window function until its full length has been processed, yielding an [L × 25] feature matrix, where L is the length of the audio in seconds and each entry is the maximum sound pressure level in the corresponding sub-band;
and computing the maximum (Max), average (Avg) and standard deviation (Sd) of the sound pressure level in each sub-band to obtain the first masking feature matrix of the audio.
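The per-second feature extraction above can be sketched as follows. This is a minimal illustration, not the patent's code: the Bark-style band edges, the sample rate, and the function names are assumptions.

```python
import numpy as np

# Approximate Bark-scale critical-band edges covering 20 Hz - 16 kHz
# (26 edges -> 25 sub-bands); the exact edges are an assumption.
BAND_EDGES = [20, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
              1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
              9500, 12000, 15500, 16000]

def band_max_spl(pcm_second, sr=44100):
    """Maximum sound pressure level (dB) in each of the 25 critical
    sub-bands for one second of PCM samples."""
    spectrum = np.abs(np.fft.rfft(pcm_second))
    freqs = np.fft.rfftfreq(len(pcm_second), d=1.0 / sr)
    spl = 20 * np.log10(np.maximum(spectrum, 1e-12))  # floor avoids log(0)
    out = np.empty(25)
    for b in range(25):
        in_band = (freqs >= BAND_EDGES[b]) & (freqs < BAND_EDGES[b + 1])
        out[b] = spl[in_band].max()
    return out

def first_masking_features(pcm, sr=44100):
    """Build the [L x 25] per-second matrix, then reduce each band to its
    maximum (Max), mean (Avg) and standard deviation (Sd)."""
    L = len(pcm) // sr
    m = np.vstack([band_max_spl(pcm[i * sr:(i + 1) * sr], sr)
                   for i in range(L)])
    return m.max(axis=0), m.mean(axis=0), m.std(axis=0)
```

A pure 1 kHz tone, for example, produces its peak in the 920-1080 Hz sub-band, while the other bands stay near the numerical floor.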
Preferably, each audio material is classified according to its characteristics by:
if the Sd values of all sub-bands are smaller than a preset first threshold and Max − Avg in every sub-band is smaller than a preset second threshold, the material is background audio; background audio is used to mask non-tonal components, and its frequency response curve is relatively uniform from second to second;
if the Sd values of all sub-bands are smaller than the preset first threshold and Max − Avg in every sub-band is greater than or equal to the preset second threshold, the material is melody audio; melody audio is used to mask the tonal components of noise and is mostly melodic music;
if the Sd values of the sub-bands are not all smaller than the preset first threshold but Max − Avg in every sub-band is greater than or equal to the preset second threshold, the material is auxiliary audio; auxiliary audio refers to audio with richer frequency content and higher sound pressure levels in the 2 kHz-4 kHz range.
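The three-way rule above can be sketched as a small classifier. The threshold values `t1`/`t2` are placeholders (the patent leaves them as preset parameters), and the reading of "not all smaller than the first threshold" as "some band's Sd reaches it" is an assumption.

```python
import numpy as np

def classify_material(mx, avg, sd, t1=3.0, t2=6.0):
    """Classify one audio material from its 25-band Max/Avg/Sd sound
    pressure level statistics. t1/t2 are placeholder thresholds."""
    mx, avg, sd = map(np.asarray, (mx, avg, sd))
    steady = np.all(sd < t1)          # loudness stable over time in every band
    peaky = np.all(mx - avg >= t2)    # pronounced peaks above the band average
    if steady and not np.any(mx - avg >= t2):
        return 'background'           # uniform spectrum: masks non-tonal noise
    if steady and peaky:
        return 'melody'               # stable but peaked: masks tonal noise
    if not steady and peaky:
        return 'auxiliary'            # fluctuating and peaked (2-4 kHz emphasis)
    return None                       # no rule matches
```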
Preferably, users can upload audio to the masking audio material library themselves; uploaded audio is processed as follows: it is converted into the prescribed format and then classified as above; if classification succeeds, the material is stored locally on the user's device, otherwise the message "this audio is not suitable as masking audio" is output.
Preferably, the signal masking ratios of tonal components and non-tonal components of the 25 critical subbands are calculated by:
First, the signal-to-mask ratio of tonal components: T_TM(i,j) = P_TM(j) − 0.275·z(j) + SF(i,j) − 6.025, where T_TM(i,j) is the masking threshold of a single tonal masker, P_TM(j) is the sound pressure level (SPL) of the tonal masker at j, and SF(i,j) is the extended (spreading) masking threshold from masker j to masked signal i. Because masking is strongest when masker and masked signal lie in the same critical band, the invention does not consider interactions between different critical bands, avoiding the extra computation of masking superposition; this both saves computing resources and improves efficiency. Within the same critical band, SF(i,j) = −17·(z(i) − z(j)) = 0, so T_TM(i,j) = P_TM(j) − 0.275·z(j) − 17·(z(i) − z(j)) − 6.025 = P_TM(j) − 0.275·z(j) − 6.025, where z(j) ∈ {0, 1, 2, …, 24}. That is, when the SPL of the tonal masker exceeds the SPL of the tonal component to be masked by 0.275·z(j) + 6.025, absolute masking is achieved; 0.275·z(j) + 6.025 is therefore the signal-to-mask ratio of the tonal component. Calculating this ratio for every sub-band yields the second masking feature matrix, for tonal components;
Second, the signal-to-mask ratio of non-tonal components: T_NM(i,j) = P_NM(j) − 0.175·z(j) + SF(i,j) − 2.025, where T_NM(i,j) is the masking threshold of a single non-tonal masker, P_NM(j) is the SPL of the non-tonal masker at j, and SF(i,j) is the extended masking threshold from masker j to masked signal i. Within the same critical band, T_NM(i,j) = P_NM(j) − 0.175·z(j) − 17·(z(i) − z(j)) − 2.025 = P_NM(j) − 0.175·z(j) − 2.025, where z(j) ∈ {0, 1, 2, …, 24}. That is, when the SPL of the non-tonal masker exceeds the SPL of the non-tonal component to be masked by 0.175·z(j) + 2.025, absolute masking is achieved, so 0.175·z(j) + 2.025 is the signal-to-mask ratio of the non-tonal component. Calculating this ratio for every sub-band yields the third masking feature matrix, for non-tonal components.
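Under the same-critical-band assumption the spreading term SF(i,j) vanishes, so the required level margin depends only on the band index z(j). A minimal sketch (function name is illustrative):

```python
import numpy as np

z = np.arange(25)                   # critical-band index z(j) = 0 .. 24
smr_tonal = 0.275 * z + 6.025       # dB margin to mask a tonal component
smr_nontonal = 0.175 * z + 2.025    # dB margin to mask a non-tonal component

def required_level(masked_spl, band, tonal=True):
    """SPL (dB) a masker needs, in the same critical band, to just mask a
    component of level masked_spl in sub-band `band`."""
    margin = smr_tonal[band] if tonal else smr_nontonal[band]
    return masked_spl + margin
```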
Preferably, the environmental noise is monitored and analyzed by:
collecting noise, and converting the noise analog signal into PCM coded data;
transforming the PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
dividing the 20Hz-16kHz part of the frequency response curve of the second into 25 critical sub-frequency bands;
taking the maximum sound pressure level of each critical sub-band as the tonal masking threshold TM of that band, and the average of the remaining values (excluding the maximum) as its non-tonal masking threshold NM;
continuously collecting noise for L seconds;
obtaining the maximum value, the average value and the standard deviation of TM and NM under each critical sub-frequency band in the L-second process through a window function to obtain a noise characteristic matrix;
adding the second masking feature matrix to all the TM values to obtain the sound pressure level threshold matrix of the masking audio required by the tonal components of the environmental noise, i.e. the fourth masking feature matrix;
and adding all the values of the NM to the third masking feature matrix to obtain a sound pressure level threshold matrix of the masking audio required by the non-tonal components of the environmental noise, namely a fifth masking feature matrix.
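The per-band noise analysis above can be sketched as follows: TM is the band's peak SPL, NM is the mean of the remaining bins, and adding the signal-to-mask margins gives the levels a masking audio must reach. The bin-to-band grouping and names are assumptions.

```python
import numpy as np

def tm_nm_per_band(band_spls):
    """band_spls: list of 25 arrays, each holding the SPL values of the FFT
    bins falling in one critical sub-band for one second of noise.
    Returns (TM, NM) per band."""
    tm, nm = np.empty(25), np.empty(25)
    for b, vals in enumerate(band_spls):
        vals = np.asarray(vals, dtype=float)
        tm[b] = vals.max()                        # tonal masking threshold
        rest = np.delete(vals, vals.argmax())     # everything but the peak
        nm[b] = rest.mean() if rest.size else tm[b]
    return tm, nm

z = np.arange(25)

def required_masker_levels(tm, nm):
    """Fourth and fifth masking feature matrices: SPL thresholds the masking
    audio must exceed for tonal and non-tonal noise components."""
    fourth = tm + (0.275 * z + 6.025)   # tonal SMR added to TM
    fifth = nm + (0.175 * z + 2.025)    # non-tonal SMR added to NM
    return fourth, fifth
```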
Preferably, the method of recommending the background class audio in the material library according to the non-tonal characteristics of the noise is as follows:
comparing the first masking feature matrixes of the background class audios in the material library one by one according to the fifth masking feature matrix of the non-tonal component of the environmental noise;
calculating the total standard deviation P of the difference values of the average value of the sound pressure level of the background class audio and the average value of the sound pressure level threshold of the non-tonal component of the environmental noise under each sub-frequency band;
counting the times N that the maximum value of the sound pressure level of the background class audio exceeds the maximum value of the sound pressure level threshold of the non-tonal component of the environmental noise;
calculating the difference value of the standard deviation of the sound pressure level of the background audio and the standard deviation of the sound pressure level threshold of the non-tonal component of the environmental noise under each sub-frequency band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all background-class audios: the audio with the larger P value ranks higher; ties are broken by the larger N value, then by the larger D value; two audios equal in P, N and D share the same rank. In this way, the numbers of the top NUM1 background-class audios are selected as the first candidate masking audio set B.
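The [P, N, D] ranking above can be sketched as follows; `thr` stands for the per-band Max/Avg/Sd statistics of the noise threshold matrix, `candidates` maps an audio id to its first masking feature matrix, and all names are illustrative.

```python
import numpy as np

def pnd(cand, thr):
    """[P, N, D] score of one candidate against the noise threshold matrix.
    cand/thr are dicts of length-25 'max', 'avg', 'sd' arrays."""
    p = float(np.std(np.asarray(cand['avg']) - np.asarray(thr['avg'])))
    n = int(np.sum(np.asarray(cand['max']) > np.asarray(thr['max'])))
    d = float(np.sum(np.asarray(cand['sd']) - np.asarray(thr['sd'])))
    return (p, n, d)

def recommend(candidates, thr, num=3):
    """Ids of the top-NUM candidates; larger P wins, ties broken by larger N,
    then larger D (tuple comparison gives exactly this order)."""
    return sorted(candidates, key=lambda k: pnd(candidates[k], thr),
                  reverse=True)[:num]
```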
Preferably, the melody-like audio in the material library is recommended according to the tone characteristics of the noise by the method comprising the following steps:
comparing the first masking feature matrixes of the melody type audios in the material library one by one according to the fourth masking feature matrix of the tone component of the environmental noise;
calculating the total standard deviation P of the difference values of the sound pressure level average value of the melody audio and the sound pressure level threshold value average value of the environment noise tone component under each sub-frequency band;
counting the times N that the maximum value of the sound pressure level of the melody audio exceeds the maximum value of the sound pressure level threshold of the tone component of the environmental noise;
calculating the difference value of the standard deviation of the sound pressure level of the melody audio and the standard deviation of the sound pressure level threshold of the tone component of the environmental noise under each sub-frequency band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all melody-class audios: the audio with the larger P value ranks higher; ties are broken by the larger N value, then by the larger D value; two audios equal in P, N and D share the same rank. In this way, the numbers of the top NUM2 melody-class audios are selected as the second candidate masking audio set M.
Preferably, the method for recommending auxiliary audio in the material library according to the characteristics of the human ears comprises the following steps:
determining the most sensitive range of human hearing, 2 kHz-4 kHz, and extracting from the 25 critical sub-bands the 5 sub-bands (the 13th-17th) that this range covers;
acquiring a signal masking ratio of a noise audio tonal component;
calculating the total standard deviation P of the difference value of the average value of the sound pressure level of the auxiliary audio and the average value of the sound pressure level threshold value of the tone component of the environmental noise under the 5 critical sub-frequency bands;
calculating the times that the maximum value of the sound pressure level of the auxiliary audio exceeds the maximum value of the sound pressure level threshold of the environmental noise tone component under the 5 critical sub-frequency bands, wherein the times are counted as N;
calculating the difference value between the standard deviation of the sound pressure level of the auxiliary audio and the standard deviation of the sound pressure level threshold of the tone component of the environmental noise under the 5 critical sub-frequency bands, and then summing the 5 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all auxiliary audios: the audio with the larger P value ranks higher; ties are broken by the larger N value, then by the larger D value; two audios equal in P, N and D share the same rank. In this way, the numbers of the top NUM3 auxiliary-class audios are selected as the third candidate masking audio set S.
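Restricting the comparison to the five ear-sensitive sub-bands can be sketched as below; the 0-based slice for the 13th-17th bands and the dict layout are assumptions for illustration.

```python
import numpy as np

SENSITIVE = slice(12, 17)   # 13th-17th critical sub-bands (2 kHz - 4 kHz)

def pnd_sensitive(cand, thr):
    """[P, N, D] score restricted to the five ear-sensitive sub-bands.
    cand/thr are dicts of length-25 'max', 'avg', 'sd' arrays."""
    c_avg = np.asarray(cand['avg'])[SENSITIVE]
    t_avg = np.asarray(thr['avg'])[SENSITIVE]
    c_max = np.asarray(cand['max'])[SENSITIVE]
    t_max = np.asarray(thr['max'])[SENSITIVE]
    c_sd = np.asarray(cand['sd'])[SENSITIVE]
    t_sd = np.asarray(thr['sd'])[SENSITIVE]
    return (float(np.std(c_avg - t_avg)),   # P over the 5 bands
            int(np.sum(c_max > t_max)),     # N: bands where peak exceeds
            float(np.sum(c_sd - t_sd)))     # D: summed sd differences
```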
An ambient noise masking device, the device comprising:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying the features of each audio material, and dividing the audio materials into background audio, melody audio or auxiliary audio;
the signal masking ratio calculating module, used to divide the 20 Hz-16 kHz frequency range into 25 critical sub-bands and calculate for each sub-band the signal-to-mask ratio of tonal components and of non-tonal components; to achieve tonal or non-tonal masking, the masker must exceed the masked component by at least this ratio, in decibels;
the environmental noise feature extraction module is used for monitoring and analyzing the environmental noise to obtain the features of tonal components and non-tonal components of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tonal component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in the material library aiming at the characteristics of the human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting the final masking audio.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. Regarding the sound masking method: most current methods on the market adopt a masking model based on MPEG-1, but the MPEG-1 algorithm was designed on the premise of preserving audio quality, so it applies a large number of complex algorithms to keep the sound wave undeformed and the audio undistorted. Masking everyday noise, however, requires no fidelity to the noise itself. Compared with existing methods, the algorithm developed here therefore has low computational complexity, high speed, a good masking effect and extremely low hardware requirements; it is hard to constrain with external conditions and is highly practical.
2. Regarding the masking effect: most current masking products generate a segment of white noise to cover the environmental noise, which solves the problem only at the physical level. White noise is often not pleasant in itself, and masking one noise with another rarely removes the psychological discomfort the first noise caused. Research shows that music can relax people; in a work environment, the melodies of certain instruments improve efficiency and memory better than white noise, and sounds such as thunderstorms and waves are more pleasant than white noise. Many people therefore use music to cover noise, yet music chosen by personal preference rarely achieves the best masking effect, often has to be played too loudly to mask at all, and damages hearing over long-term use. Drawing on current research, the invention innovatively divides the masking sound into a combination of three parts: melody audio, which masks the tonal components of the noise while helping to improve work efficiency, relax the mood or improve sleep; background audio, which masks the non-tonal components while improving pleasantness; and auxiliary audio, which further weakens the influence of noise in the ear's sensitive range (2 kHz-4 kHz). The invention thus not only masks noise at the physical level, with minimum volume and maximum efficiency, but also relieves, from the psychological angle, the discomfort noise causes.
3. The output schemes of the invention are diversified. Based on the masking audio material library, the selection algorithm places few demands on the masking materials; the system has already created or selected hundreds of materials and recommends them in combination, so diversified masking schemes can be output flexibly for all kinds of noise environments, and the effect does not fade through fatigue or familiarity after long-term use.
4. Compared with available noise-proof earplugs, noise reducers, and noise-cancelling earphones, the invention has low hardware requirements and a low cost of use: no additional equipment is needed, and the noise-masking effect can be achieved with an ordinary smartphone, a smart speaker with a microphone, a computer with a microphone, and the like, so the invention can be used flexibly in any noise environment.
5. The existing large-scale sound-masking systems on the market designed for offices are expensive and unsuitable for individual use, and their design emphasis is on masking speech within the office, so they do not suit other noise environments. The invention is low-cost, can be used flexibly by individuals, and can be applied to various noise environments.
Drawings
FIG. 1 is a schematic diagram of frequency domain sound masking;
fig. 2(a) is a flow chart of the establishment of the masking audio material library according to the embodiment;
FIG. 2(b) is a flowchart illustrating the user's own addition of audio to the library of masking audio materials according to this embodiment;
FIG. 3 is a diagram illustrating an exemplary method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a frequency response curve of a thunderstorm sound produced by the autonomous recording process of the present invention;
FIG. 5 is a graph showing a frequency response of transient ambient noise from a recorded sound test;
fig. 6 is a masking schematic of the curve of the noise data of fig. 5 for three audio combinations.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
The present invention provides a method for masking ambient noise that is based on a psychoacoustic property of human hearing (the masking effect). The method comprises at least a step of generating a masking audio material library, a step of measuring the ambient noise, and a step of recommending noise masking, so as to cover unwanted noise with other, more pleasant sounds used as masking audio, such as white noise, pink noise, brown noise, musical melodies of various instruments, atmosphere sounds, and atmosphere music. The method does not focus on processing the objective noise signal, but on changing the ear's subjective perception of the noise. The process is like a dark room into which a single beam of light shines and is therefore particularly conspicuous; if the room lights are turned on while the original beam remains, the beam is far less noticeable. However, if the room light itself is too glaring, it mitigates the beam but introduces a problem of its own, so one of the purposes of the present invention is to achieve the best masking. Sound masking follows the same principle: a more comfortable, more moderate sound is used to cover the original unpleasant sound. This masking not only reduces the ear's unpleasant perception of the noise but also relieves the irritation and annoyance of being disturbed by it.
The specific principle of the masking effect can be illustrated by an example. Suppose that in a quiet environment the hearing threshold of sound A is a sound pressure level of 40 dB. If sound B is played at the same time, the sound pressure level of A must be raised to 54 dB, i.e. 14 dB higher than before, for A to be heard over the influence of B. Here B is called the masking sound and A the masked sound. The number of decibels by which the masked sound's hearing threshold is raised is called the masking amount; in this example the masking amount is 14 dB, and 54 dB is called the masking threshold.
Masking can be divided into frequency-domain masking and time-domain masking. Frequency-domain masking, also known as simultaneous masking, means that when two sounds occur at the same time, a strong sound may mask a weak sound in a nearby frequency range, because the ear's perception of the strong sound weakens its perception of the weak one. As shown in fig. 1, a sound at about 300 Hz with an intensity of about 60 dB masks sounds at about 150 Hz and about 500 Hz. Frequency-domain masking covers sounds of adjacent frequencies, while time-domain masking covers sounds adjacent in time and is divided, by order of occurrence, into pre-masking and post-masking. This kind of masking is unrelated to the frequency selectivity of the ear: the brain needs a certain time to process information, and within that time window the sound signals it receives blend together, producing the masking effect.
For frequency-domain masking, in order to quantify the frequency range over which the "masking tone" and the "masked tone" influence each other, Fletcher and other scientists experimentally derived a set of pass bands, called critical bands. The critical bands are a characteristic of human hearing; critical bandwidth is measured in "Bark", each Bark corresponding to a "fixed length" of about 1 mm on the basilar membrane of the human ear. That is, the human ear cannot acoustically distinguish the frequencies within a critical band corresponding to this fixed 1 mm length, so the frequencies within such a band can be treated as equivalent to a single frequency component. Zwicker and other scientists calculated the relation between frequency and critical bandwidth, dividing the range 20 Hz to 16 kHz into 25 critical sub-bands (Table 1).
Table 1. Frequency ranges (lower bound, upper bound, bandwidth) of the 25 critical sub-bands
[Table 1 appears as an image in the original publication; it lists, for each of the 25 critical sub-bands, its lower bound, upper bound, and bandwidth in Hz.]
The critical-band number z(f) (in Bark) is related to the frequency f (in Hz) by:

z(f) = 13·arctan(0.00076·f) + 3.5·arctan[(f / 7500)²]
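The Bark conversion above can be sketched in Python (an illustrative sketch; the patent does not prescribe an implementation, and the function name is ours):

```python
import math

def bark(f_hz: float) -> float:
    """Critical-band rate z(f) in Bark (Zwicker's approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)
```

For example, bark(1000) is about 8.5 Bark and bark(16000) is about 24 Bark, consistent with dividing the 20 Hz-16 kHz range into 25 sub-bands.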
the closer the sounds are in frequency (within the same critical band) and the higher the sound pressure level, the larger the masking amount; the invention therefore needs to find the maximum sound pressure level of each critical band as the masking threshold of that band. In addition, high-frequency sounds are easily masked by low-frequency sounds, especially loud ones, while low-frequency sounds are difficult to mask with high-frequency sounds; the invention therefore aims to achieve the maximum possible masking in each critical band. In summary, it is necessary to analyze and calculate the masking threshold of each critical band of the noise source, and to find the audio or audio combination that achieves the maximum masking in each critical band, so as to achieve the best masking effect.
In order to accurately describe the physiological phenomenon of auditory masking, a masking model of the human ear must be established; the widely applied models are currently the Johnston masking model and the MPEG masking models under the ISO MPEG-1 standard. MPEG-1 applies two psychoacoustic models. Taking psychoacoustic model 1 as an example, the model first converts the signal to a power density spectrum and then measures the sub-band sound pressure levels. Because the masking capabilities of tonal and non-tonal components differ, the tonal and non-tonal components must be extracted separately, their masking thresholds calculated respectively, the masking threshold of each sub-band determined, and finally the global masking threshold over the whole 20 Hz-16 kHz range computed.
Specifically, the algorithm flow of the MPEG-1 model 1 is as follows:
First, the audio signal is transformed to the frequency domain by FFT; the spectrum is then mapped onto the critical bands, tonal and non-tonal components are distinguished, the individual masking thresholds and the global masking threshold are calculated according to the position and intensity of each frequency component, and finally the signal-to-mask ratio is obtained.
The algorithm is specifically executed as follows:
Let the power spectrum of the analyzed audio signal x(i) be P(k), with its maximum value normalized to the 96 dB reference sound pressure level SPL (reference sound pressure level). Since the masking models for tonal and non-tonal components are different, the tonal and non-tonal components are first distinguished from the power spectrum. A tonal component is a local maximum of the power spectrum, defined as:

S_T = { P(k) : P(k) > P(k−1), P(k) ≥ P(k+1), P(k) > P(k ± Δk) + 7 dB }
where the neighbourhood Δk depends on the spectral line index k:

Δk = 2            for 2 < k < 63
Δk ∈ {2, 3}       for 63 ≤ k < 127
Δk ∈ {2, …, 6}    for 127 ≤ k ≤ 256
according to STA medium local maximum for calculating the sound pressure level P of the tonal componentTM(k):
Figure BDA0002347185770000113
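The power-domain combination above can be sketched as follows (illustrative only; the function name is ours):

```python
import math

def tonal_masker_spl(p_prev: float, p_k: float, p_next: float) -> float:
    """Combine the local maximum P(k) with its two neighbouring spectral
    lines into a single tonal-masker sound pressure level, in dB."""
    return 10.0 * math.log10(
        10 ** (p_prev / 10) + 10 ** (p_k / 10) + 10 ** (p_next / 10)
    )
```

Three equal 60 dB lines combine to about 64.77 dB (60 + 10·log₁₀3), while a dominant line is barely changed by negligible neighbours.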
The sound pressure level P_NM(k̄) of the non-tonal components is obtained by summing the power spectrum of the residual signal (the lines not attributed to any tonal component) within each critical band:

P_NM(k̄) = 10·log₁₀( Σ_j 10^(P(j)/10) )  (dB), with the sum over the residual lines j of the band,
where k̄ is the geometric mean spectral line of the critical band, i.e.:

k̄ = ( ∏_{j=l}^{u} j )^(1/(u−l+1))
wherein l is the lower boundary of the critical band and u is the upper boundary of the critical band.
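A minimal sketch of the non-tonal masker calculation for one critical band (illustrative; the residual line levels `residual_spl_db` and the band boundaries l, u are assumed inputs):

```python
import math

def non_tonal_masker(residual_spl_db, l, u):
    """Sum the residual (non-tonal) power of one critical band into a single
    masker; returns (P_NM in dB, geometric-mean spectral line k_bar)."""
    p_nm = 10.0 * math.log10(sum(10 ** (p / 10) for p in residual_spl_db))
    n = u - l + 1
    # geometric mean of the line indices l..u, computed in the log domain
    k_bar = math.exp(sum(math.log(j) for j in range(l, u + 1)) / n)
    return p_nm, k_bar
```

For a band spanning lines 4 to 6 with three 50 dB residual lines, P_NM is about 54.77 dB and k̄ = (4·5·6)^(1/3) ≈ 4.93.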
The tonal and non-tonal maskers are then pruned according to P_TM, P_NM and the absolute hearing threshold T_q(k). The basic principles for processing the tonal or non-tonal maskers are as follows:

Only maskers satisfying P_TM,NM(k) ≥ T_q(k) are retained, where T_q(k) is the absolute hearing threshold (threshold in quiet) at spectral line k; for frequency f it can be approximated as:

T_q(f) = 3.64·(f/1000)^(−0.8) − 6.5·e^(−0.6·(f/1000 − 3.3)²) + 10⁻³·(f/1000)⁴  (dB SPL)
the most powerful masker is retained within a sliding window of 0.5Bark width.
After the tonal and non-tonal maskers have been preprocessed in this way, the masking thresholds of the individual tonal and non-tonal maskers are computed based on the frequency-domain masking effect of the Human Auditory System (HAS). The individual tonal masking threshold is defined as:
T_TM(i, j) = P_TM(j) − 0.275·z(j) + SF(i, j) − 6.025
where P_TM(j) is the sound pressure level (SPL) of the tonal masker at spectral line j and z(j) is the Bark representation of its frequency; the spread-of-masking function SF(i, j) from masker j to masked line i is defined as:

SF(i, j) =
  17·Δz − 0.4·P_TM(j) + 11,                 −3 ≤ Δz < −1
  (0.4·P_TM(j) + 6)·Δz,                     −1 ≤ Δz < 0
  −17·Δz,                                    0 ≤ Δz < 1
  (0.15·P_TM(j) − 17)·Δz − 0.15·P_TM(j),     1 ≤ Δz < 8
where Δz = z(i) − z(j) (in Bark). The threshold of a single non-tonal masker is defined as:
T_NM(i, j) = P_NM(j) − 0.175·z(j) + SF(i, j) − 2.025
where P_NM(j) is the sound pressure level (SPL) of the non-tonal masker at spectral line j, and SF(i, j) takes the same piecewise form as above with P_NM(j) in place of P_TM(j).
the global masking threshold T is obtained by adding the masking thresholds of the corresponding individual tones and non-tones to the absolute thresholdg(i):
Figure BDA0002347185770000124
where T_q(i) is the absolute hearing threshold, T_TM(i, l) is the masking threshold of the l-th of the L tonal maskers, and T_NM(i, m) is the masking threshold of the m-th of the M non-tonal maskers.
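The power-domain sum above can be sketched as follows (illustrative; the function name is ours):

```python
import math

def global_masking_threshold(t_q, t_tm, t_nm):
    """Combine the absolute threshold t_q (dB) with the lists of individual
    tonal (t_tm) and non-tonal (t_nm) masking thresholds, all in dB, by
    summing their powers and converting back to dB."""
    total = 10 ** (t_q / 10)
    total += sum(10 ** (t / 10) for t in t_tm)
    total += sum(10 ** (t / 10) for t in t_nm)
    return 10.0 * math.log10(total)
```

Two equal 0 dB contributions combine to about 3 dB, while a single dominant masker determines the result almost entirely.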
The MPEG-1 algorithm is mainly used for audio compression: on the premise that the audio and its waveform are not perceptibly distorted, it calculates the tonal and non-tonal masking thresholds, discards the tonal and non-tonal components that are masked as well as the low- and high-frequency components to which the ear is insensitive, and re-encodes and compresses the audio on that basis, which requires a large amount of computation. For masking a noise environment the key point is the masking effect, not speech enhancement; the method of the invention therefore optimizes and improves on the MPEG-1 algorithm, achieving a better masking effect while greatly reducing the computation.
Referring to fig. 1, the method for masking environmental noise of the present embodiment includes the steps of:
S1, establishing a masking audio material library, extracting and classifying the features of each audio material, and dividing the audio materials into background audio, melody audio, or auxiliary audio;

S2, dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands, and calculating, for each of the 25 sub-bands, the signal-to-mask ratio of the tonal components and the signal-to-mask ratio of the non-tonal components; to achieve tonal or non-tonal masking, the masker must exceed the masked component's sound pressure level by at least the signal-to-mask ratio in decibels;

S3, monitoring and analyzing the ambient noise to obtain the characteristics of its tonal and non-tonal components;

S4, recommending background audio from the material library according to the non-tonal characteristics of the noise, recommending melody audio according to its tonal characteristics, and recommending auxiliary audio according to the characteristics of the human ear; the recommended background, melody, and auxiliary audio are combined, and the final masking audio is output.
In step S1, to establish the masking audio material library, sounds that are commonly used for noise masking and have high psychological comfort are selected: white noise, pink noise, brown noise, musical melodies of various instruments, atmosphere sounds (e.g. birdsong, thunder, the sound of a coffee house), atmosphere music (e.g. music for a tropical-rainforest scene combined with the natural atmosphere of a tropical rainforest), and the like. According to the distribution of their frequency response curves, the masking sounds in the material library are divided into three classes of audio: background class (audio that masks the non-tonal components of noise, with a relatively uniform frequency response distribution in every second, such as flowing water, white noise, and pink noise), melody class (audio that masks the tonal components of noise, mostly melodic tunes such as piano or harp pieces), and auxiliary class (audio whose frequency content is relatively concentrated, with higher sound pressure levels, between 2 kHz and 4 kHz).
Noise is random in time, so one key to masking it is that the frequency response of the masking audio be distributed evenly in every second, ensuring the noise is masked over the entire time sequence. Conventionally, directly generated white noise is mostly used, because the sound pressure level of the frequency components of white noise is uniform across the audible range and its output is identical from second to second. White noise can cover noise at the physical level, but its own hissing, sand-like character is unpleasant and hard to listen to for long periods. Many studies have shown that natural atmosphere sounds work well for sound masking; for example, a flowing-water sound whose sound pressure level and frequency content are distributed evenly over the whole time sequence can achieve the masking effect of white noise while being far more pleasant, and such sounds have also been shown to improve mood, work efficiency, and even sleep. Therefore, in this embodiment, besides white noise, the masking material library also selects and produces a variety of optimized natural or artificial atmosphere sounds, atmosphere music, and instrument melodies, improving the flexibility of the system.
Referring to fig. 2(a), in step S1, feature extraction is required for the audio in the masking audio material library, specifically:
S101, by default the audio in this embodiment is stored as PCM-encoded WAV at a 44 kHz sampling rate, 16-bit, two-channel, ensuring the audio data is not distorted. Each WAV audio material is first read second by second as PCM-encoded data.
S102, the PCM data of each second is converted by Fast Fourier Transform (FFT) into a frequency response curve (vertical axis: sound pressure level in dB; horizontal axis: frequency in Hz), and its maximum value is normalized to the 96 dB reference sound pressure level via the formula

P(k) = PN + 10·log₁₀ | Σ_{n=0..N−1} w(n)·x(n)·e^(−j2πkn/N) |²  (dB)

where w(n) is the analysis window and PN is the normalization term that sets the maximum of P(k) to 96 dB.
Converting the time domain signal into the frequency domain signal can quickly identify different frequency components in the sound and respective sound pressure levels. For example, fig. 4 is a frequency response curve of thunderstorm sound that we recorded and then processed, where the horizontal axis is frequency and the vertical axis is sound pressure level (dB) of sound at the corresponding frequency.
S103, the main range of human hearing is 20 Hz to 16 kHz, so this embodiment mainly analyzes audio within that interval. The 20 Hz-16 kHz portion of that second's frequency response curve is divided into 25 critical sub-bands (see Table 1). The critical band is a characteristic of human hearing, acting like a band-pass filter in the ear: as long as two sounds fall within the same critical sub-band and the objective sound pressure level is unchanged, the ear hears the same loudness regardless of their exact frequencies, so the frequencies within one critical band can be treated as equivalent to a single frequency component.
S104, the frequency response curve of that second is analyzed across the 25 critical sub-bands to obtain the maximum sound pressure level within each sub-band.
S105, the analysis continues second by second through a window function until the full length of the audio has been analyzed. For an audio of length L seconds (if the audio lasts 13.45 s, then L = 14; the fractional part is processed as an extra second), the 25 per-second sound-pressure-level maxima collected over L seconds yield a feature matrix of size [L × 25].
S106, the L groups of data in each of the 25 sub-bands are compared to obtain the maximum Max (the largest of the L values in the band), the mean Avg (the average of the L values in the band), and the standard deviation Sd (a measure of the spread of the L values in the band), finally giving the [25 × 3] feature matrix (maximum, mean, standard deviation) of the masking audio.

S107, the resulting [25 × 3] data of each audio is used as the first masking feature matrix of that audio.
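Steps S105-S107 can be sketched as follows (an illustrative sketch; the input is the [L × 25] matrix of per-second band maxima, and the function name is ours):

```python
import math

def masking_feature_matrix(per_second_max):
    """Collapse an [L x 25] matrix of per-second band maxima (dB) into the
    [25 x 3] first masking feature matrix (Max, Avg, Sd per band)."""
    L = len(per_second_max)
    n_bands = len(per_second_max[0])
    features = []
    for b in range(n_bands):
        col = [row[b] for row in per_second_max]
        avg = sum(col) / L
        sd = math.sqrt(sum((v - avg) ** 2 for v in col) / L)
        features.append([max(col), avg, sd])
    return features
```

For three seconds at 60, 62, and 64 dB in every band, each band's feature row is [64, 62, √(8/3) ≈ 1.633].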
S108, the masking audio is divided into background audio, melody audio, and auxiliary audio:

Background class: in the [25 × 3] feature matrix, all 25 Sd values must be smaller than a preset first threshold (the sound is stable and uninterrupted), and each band's Max − Avg must be smaller than a preset second threshold (small variation in sound level).

Melody class: mainly used for tonal masking, so the sound must vary with sufficient amplitude while remaining stable and uninterrupted; specifically, in the [25 × 3] feature matrix, all 25 Sd values < the preset first threshold (stable and uninterrupted) and each band's Max − Avg ≥ the preset second threshold (sufficiently large variation).

Auxiliary class: used to supplement the tonal masking of the melody audio; persistence of the sound is not required, and it suffices that tonal masking is satisfied at specific frequencies, i.e. Max − Avg ≥ the preset second threshold (every melody-class audio also qualifies as auxiliary, but not vice versa).

Audio that satisfies none of these three conditions is unsuitable as masking audio. The audio (thunderstorm sound) shown in fig. 4 satisfies the background-class conditions and is classified as background audio.
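The three classification rules of S108 can be sketched as follows (an illustrative sketch: the numeric thresholds and the choice of "any band" for the auxiliary rule are our placeholder assumptions, since the patent only names preset thresholds):

```python
def classify_audio(features, sd_thresh=3.0, var_thresh=6.0):
    """Classify a [25 x 3] feature matrix (Max, Avg, Sd per band) into
    'background', 'melody', 'auxiliary', or None.
    sd_thresh / var_thresh stand in for the preset first and second
    thresholds (placeholder values, not from the patent)."""
    stable = all(sd < sd_thresh for _, _, sd in features)
    small_swing = all(mx - avg < var_thresh for mx, avg, _ in features)
    large_swing_all = all(mx - avg >= var_thresh for mx, avg, _ in features)
    large_swing_any = any(mx - avg >= var_thresh for mx, avg, _ in features)
    if stable and small_swing:
        return "background"
    if stable and large_swing_all:
        return "melody"
    if large_swing_any:  # tonal masking at specific frequencies suffices
        return "auxiliary"
    return None
```

Note the containment the patent describes: every matrix that qualifies as melody would also pass the auxiliary test, but the melody branch is checked first.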
In practical use, the user can also add favourite sounds to the masking material library, as shown in fig. 2(b). Since the invention requires that all audio in the library be PCM-encoded WAV at a 44 kHz sampling rate, 16-bit, two-channel (to guarantee the integrity of the data used in the calculations), the format of the user's audio is checked first; if it is not PCM-encoded WAV at 44 kHz / 16-bit / two-channel, the system converts it automatically, and if the conversion fails the user is prompted to supply a different audio file. Steps S101-S108 are then executed on the correctly formatted file; if it satisfies any of the three classes (background, melody, auxiliary), the audio is uploaded and stored locally for the user's use; otherwise the system returns "unsuitable as masking audio; upload failed".
Compared with the MPEG-1 algorithm, step S2 of the invention focuses on calculating the signal-to-mask ratios of the tonal and non-tonal components of the 25 critical sub-bands, as follows:
S201, first, the signal-to-mask ratio of the tonal components. From T_TM(i, j) = P_TM(j) − 0.275·z(j) + SF(i, j) − 6.025, where T_TM(i, j) is the masking threshold of a single tone, P_TM(j) is the sound pressure level of the tonal masker at j, and SF(i, j) is the spread-of-masking function from masker j to masked object i. Because masking between a masker and a masked component is strongest within the same critical band, the invention does not consider the mutual influence of different critical bands; this avoids the extra computation of masking superposition, saving computing resources and improving efficiency.

Within the same critical sub-band, Δz = z(i) − z(j) = 0, so SF(i, j) = −17·Δz = 0, and therefore

T_TM(i, j) = P_TM(j) − 0.275·z(j) − 17·(z(i) − z(j)) − 6.025 = P_TM(j) − 0.275·z(j) − 6.025, where z(j) ∈ {0, 1, 2, …, 24}.

That is, when the sound pressure level of a tonal masker is 0.275·z(j) + 6.025 dB higher than that of the tonal component to be masked, absolute masking is achieved; 0.275·z(j) + 6.025 is therefore the signal-to-mask ratio of the tonal components. Calculating this ratio for each band yields the second masking feature matrix of the tonal components. Referring to fig. 1, the calculation results for each band are shown in Table 2, generating the second masking feature matrix of the tonal components.
Table 2. Signal-to-mask ratio of the tonal components in the 25 critical sub-bands
[Table 2 appears as an image in the original; its values follow the formula above: SMR = 0.275·z(j) + 6.025 dB, rising linearly from 6.025 dB at z(j) = 0 to 12.625 dB at z(j) = 24.]
S202, next, the signal-to-mask ratio of the non-tonal components. From T_NM(i, j) = P_NM(j) − 0.175·z(j) + SF(i, j) − 2.025 and, as above, Δz = z(i) − z(j) = 0 within the same critical sub-band, SF(i, j) = −17·Δz = 0, so T_NM(i, j) = P_NM(j) − 0.175·z(j) − 17·(z(i) − z(j)) − 2.025 = P_NM(j) − 0.175·z(j) − 2.025, where z(j) ∈ {0, 1, 2, …, 24}. Here T_NM(i, j) is the masking threshold of a single non-tonal masker, P_NM(j) is the sound pressure level of the non-tonal masker at j, and SF(i, j) is the spread-of-masking function from masker j to masked object i. That is, within the same critical band, when the sound pressure level of a non-tonal masker is 0.175·z(j) + 2.025 dB higher than that of the non-tonal component to be masked, absolute masking is achieved, so the signal-to-mask ratio of the non-tonal components is 0.175·z(j) + 2.025. Calculating this ratio for each band yields the third masking feature matrix of the non-tonal components. Referring to fig. 1, the calculation results for each band are shown in Table 3, generating the third masking feature matrix of the non-tonal components.
Table 3. Signal-to-mask ratio of the non-tonal components in the 25 critical sub-bands
[Table 3 appears as an image in the original; its values follow the formula above: SMR = 0.175·z(j) + 2.025 dB, rising linearly from 2.025 dB at z(j) = 0 to 6.225 dB at z(j) = 24.]
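The two per-band signal-to-mask ratio tables reduce to one-line formulas (an illustrative sketch; names are ours):

```python
def smr_tonal(z: int) -> float:
    """Signal-to-mask ratio (dB) for tonal components in critical band z."""
    return 0.275 * z + 6.025

def smr_non_tonal(z: int) -> float:
    """Signal-to-mask ratio (dB) for non-tonal components in critical band z."""
    return 0.175 * z + 2.025

# The 25 per-band values that populate Tables 2 and 3
table2 = [smr_tonal(z) for z in range(25)]
table3 = [smr_non_tonal(z) for z in range(25)]
```

These lists need to be computed only once and can then be stored, exactly as the patent notes for the second and third masking feature matrices.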
In step S1, feature extraction and classification are performed once over all audio in the material library when the library is established; afterwards, only newly uploaded materials are processed, and the results are stored for subsequent use. Similarly, the second and third masking feature matrices of step S2 need to be computed only once, and the results saved for subsequent use.
In actual use, the first masking feature matrices (the feature parameters of all audio in the masking material library) are loaded first, together with the second masking feature matrix and the third (non-tonal) masking feature matrix of the 25 critical sub-bands. The ambient noise is then monitored and analyzed as follows:
S301, noise is collected, and the analog sound signal is converted into PCM-encoded data.
Ambient noise data can be collected by any device with a microphone, such as a smartphone, tablet, smart speaker, PC, or any noise-monitoring instrument with microphone access. In this embodiment, data was collected with a P9 Plus Android smartphone (4 GB/128 GB, pressure-sensitive screen) and an iPhone 6s (128 GB, China Unicom edition). The sampling parameters were a 44 kHz sampling rate, 16-bit sample depth, and two channels, to capture sound with maximum fidelity; the data collected per second is 44k × 16 × 2 / 8 = 176 kB. Common devices have a single microphone whose signal can be stored as two-channel data, in which case the left- and right-channel data are identical. The microphone of the Android phone can only receive signals in the 20 Hz-20 kHz range, and the high sampling rate ensures as little data loss as possible.
S302, the phone converts the collected audio signal into PCM format and transmits it for analysis and calculation; the spectrum is obtained by FFT, giving sound pressure levels accurate to 1 Hz. The FFT is performed twice per second, the results are accumulated and converted into a 0 Hz-20 kHz frequency response curve (horizontal axis: frequency/Hz; vertical axis: sound pressure level/dB), and the maximum value is normalized to the 96 dB reference sound pressure level via the formula

P(k) = PN + 10·log₁₀ | Σ_{n=0..N−1} w(n)·x(n)·e^(−j2πkn/N) |²  (dB)

where w(n) is the analysis window and PN sets the maximum of P(k) to 96 dB. Converting the time-domain signal into the frequency domain makes it possible to quickly identify the different frequency components in the sound and their respective sound pressure levels. For example, fig. 5 is the frequency response curve of instantaneous ambient noise in a recording test, with frequency on the horizontal axis and sound pressure level (dB) on the vertical axis.
S303, the 20 Hz-16 kHz portion of that second's frequency response curve is divided into the 25 critical sub-bands and analyzed in real time.
S304, since the noise may contain both tonal and non-tonal components, they must be calculated separately. The maximum sound pressure level within each critical band is taken as the tonal masking threshold TM (in dB) of that band, and the average of the remaining values (excluding the maximum) in each band is taken as its non-tonal masking threshold NM (in dB).
S305, noise is collected continuously and analyzed through the window function: each second yields the TM and NM values of each of the 25 critical sub-bands, so L seconds produce an [L × 25 × 2] feature matrix.
S306, the L groups of data in each of the 25 sub-bands are compared to obtain, for both TM and NM, the maximum (the largest of the L values in the band), the mean (the average of the L values in the band), and the standard deviation (a measure of the spread of the L values in the band), giving a noise feature matrix of size [25 × 3 × 2]. All TM values (size [25 × 3]) are then added to the tonal signal-to-mask ratios of the corresponding 25 critical bands in Table 2 (the second masking feature matrix), and all NM values (size [25 × 3]) are added to the non-tonal signal-to-mask ratios of the corresponding 25 critical bands in Table 3 (the third masking feature matrix). This finally yields the sound-pressure-level threshold matrix of the audio required to mask the tonal components of the ambient noise (the fourth masking feature matrix), of size [25 × 3], containing the maximum, mean, and standard deviation of the required masking audio's sound-pressure-level threshold in the 25 critical sub-bands, and the corresponding sound-pressure-level threshold matrix for the non-tonal components of the ambient noise (the fifth masking feature matrix), also of size [25 × 3].
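Step S306's final addition can be sketched as follows (an illustrative sketch: because the signal-to-mask ratio is a constant dB offset per band, it shifts the maximum and mean but leaves the standard deviation unchanged, which is the interpretation adopted here):

```python
def masking_requirements(tm_stats, nm_stats, smr_tonal, smr_non_tonal):
    """Add per-band signal-to-mask ratios (dB offsets) to the noise's TM/NM
    statistics ([25 x 3] each: max, mean, sd) to obtain the fourth and
    fifth masking feature matrices."""
    fourth = [[mx + smr_tonal[z], avg + smr_tonal[z], sd]
              for z, (mx, avg, sd) in enumerate(tm_stats)]
    fifth = [[mx + smr_non_tonal[z], avg + smr_non_tonal[z], sd]
             for z, (mx, avg, sd) in enumerate(nm_stats)]
    return fourth, fifth
```

For example, a band-0 tonal maximum of 60 dB plus the 6.025 dB tonal SMR gives a required masking-audio threshold of 66.025 dB.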
Step S4 recommends a masking scheme according to the characteristics of the noise and the characteristics in the audio material library. The scheme has three parts: recommending background audio from the library according to the non-tonal characteristics of the noise, screening suitable melody audio according to the tonal characteristics of the noise, and selecting suitable auxiliary audio according to the characteristics of the human ear. The respective steps are described below.
Suitable background audio is screened according to the non-tonal characteristics of the noise by the following steps:
S411, according to the fifth masking feature matrix (the non-tonal components of the noise), the first masking feature matrix of each background-class audio in the masking material library is compared one by one;
S412, in each one-to-one audio comparison, under the respective comparison of the 25 critical sub-bands, the overall standard deviation P of the per-band differences between the "mean sound pressure level of the background audio" and the "mean sound-pressure-level threshold of the ambient noise's non-tonal components" is calculated.

S413, in the same one-to-one comparison, for each of the 25 critical sub-bands, if the "maximum sound pressure level of the background audio" > the "maximum sound-pressure-level threshold of the noise's non-tonal components", 1 is counted; after all 25 sub-bands are compared, the number of bands in which the background audio's maximum exceeds the threshold maximum is summed as N.

S414, in the same one-to-one comparison, for each of the 25 critical sub-bands, the difference between the "standard deviation of the background audio's sound pressure level" and the "standard deviation of the noise's non-tonal sound-pressure-level threshold" is calculated, and the 25 differences are summed to give D.
After the above calculation, each background-type audio has a combination of characteristic values [ P, N, D ] with respect to noise S415. On the basis of considering the maximum power value of 25 critical frequency bands preferentially, unnecessary masking amount (high masking curve fitting degree P) is properly reduced, the maximum power number of 25 frequency bands is considered to achieve the covering effect (N), and the frequency fitting degree (D) of masking audio frequency and noise is achieved, so that on the basis of reducing the calculated amount, the sufficient masking effect is achieved.
The P, N, D values are then compared across all background-class audio by priority: 1) compare the P values first; the audio with the larger P value ranks higher; 2) if the P values of two audios are equal, compare the N values; the audio with the larger N value ranks higher; 3) if the N values are also equal, compare the D values; the audio with the larger D value ranks higher. If two audios have equal P, N and D values, they share the same priority. In this way, the numbers of the top 50 ranked background audios are selected as the first masking audio candidate set B.
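The [P, N, D] scoring of steps S411-S415 and the priority ranking can be sketched as follows. This is a minimal illustration, assuming each masking feature matrix has been reduced to 25 per-band (max, avg, sd) tuples of sound pressure level; the function and variable names are hypothetical, not from the patent.

```python
from statistics import pstdev


def pnd(audio_feat, noise_thresh):
    """Score one masking audio against the noise threshold matrix.

    audio_feat / noise_thresh: 25 per-band (max, avg, sd) tuples of the
    audio's SPL and of the noise masking-threshold statistics.
    """
    # S412: total standard deviation of per-band mean differences.
    p = pstdev([a[1] - t[1] for a, t in zip(audio_feat, noise_thresh)])
    # S413: number of bands where the audio's max SPL exceeds the threshold max.
    n = sum(1 for a, t in zip(audio_feat, noise_thresh) if a[0] > t[0])
    # S414: sum over bands of (audio sd - threshold sd).
    d = sum(a[2] - t[2] for a, t in zip(audio_feat, noise_thresh))
    return p, n, d


def recommend(library, noise_thresh, top=50):
    """Rank library items by (P, N, D) lexicographically, larger first,
    and keep the numbers of the top entries (the candidate set)."""
    scored = [(pnd(feat, noise_thresh), audio_id)
              for audio_id, feat in library.items()]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [audio_id for _, audio_id in scored[:top]]
```

Because Python compares tuples lexicographically, sorting on `(P, N, D)` reproduces the patent's "P first, then N, then D" priority in one pass.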
Suitable melody audio is screened according to the tonal characteristics of the noise as follows:
S421, selecting, from the melody-class audio material library, audio to mask the tonal component of the noise: comparing the fourth masking feature matrix of the tonal component of the noise audio against the first masking feature matrix of each melody-class audio in the masking material library, one by one.
S422, in each one-to-one audio comparison, over the 25 critical sub-bands, calculating the total standard deviation P of the per-band differences between the "average sound pressure level of the melody-class audio" and the "average sound pressure level threshold of the ambient noise tonal component".
S423, in the same one-to-one audio comparison, for each of the 25 critical sub-bands, counting 1 whenever the "maximum sound pressure level of the melody audio" > "maximum sound pressure level threshold of the tonal component of the ambient noise audio"; after all 25 critical sub-bands have been compared, the number of times the maximum sound pressure level of the melody audio exceeds the maximum sound pressure level threshold sums to N.
S424, in the same one-to-one audio comparison, for each of the 25 critical sub-bands, calculating the difference between the "standard deviation of the sound pressure level of the melody audio" and the "standard deviation of the sound pressure level threshold of the ambient noise tonal component", then summing the 25 differences to obtain the value D.
S425, after the above calculation, each melody-class audio has a feature value combination [P, N, D] with respect to the noise, with the same rationale as in S415: unnecessary masking is reduced (P), coverage across the 25 bands is counted (N), and the frequency fit between masking audio and noise is measured (D), achieving a sufficient masking effect while keeping the computation small.
The P, N, D values are then compared across all melody-class audio by the same priority rules as for background audio: P first, then N, then D, with larger values ranking higher; two audios with equal P, N and D values share the same priority. In this way, the numbers of the top 50 melody audios are selected as the second masking audio candidate set M.
Suitable auxiliary audio is selected for the characteristics of the human ear as follows:
S431, researchers who studied many noises that are uncomfortable to the human ear found that, after the sound in the 2kHz-4kHz range is erased, the noise becomes much more tolerable. This shows that 2kHz-4kHz is the frequency range to which the human ear is most sensitive; concentrating the masking on this region therefore optimizes the noise-masking effect of the whole system.
In this embodiment, masking audio for the characteristics of the human ear is selected from the auxiliary-class audio of the masking material library: in addition to the overall masking of the 20Hz-16kHz frequency range, masking is enhanced in the range where the human ear is most sensitive (2kHz-4kHz) for a better effect. The 2kHz-4kHz range covers 5 of the 25 critical sub-bands (critical sub-bands No. 13 to No. 17).
S432, the main purpose of the auxiliary-class audio is to enhance masking in the 2kHz-4kHz range, whether for tonal or non-tonal components of the noise. Since the signal masking ratio of tonal components is higher than that of non-tonal components, the invention uses the higher tonal signal masking ratio in the calculation to guarantee the masking effect.
S433, according to the fourth masking feature matrix of the noise tonal component and the first masking feature matrix of each auxiliary audio, extracting the sixth and seventh masking feature matrices of the noise audio and of the auxiliary audio, respectively, over critical sub-bands No. 13 to No. 17. Over these 5 critical sub-bands, the seventh masking feature matrix of each auxiliary-class audio is compared one by one with the sixth masking feature matrix of the tonal component of the noise.
S434, in each one-to-one audio comparison, over the 5 critical sub-bands, calculating the total standard deviation P of the per-band differences between the "average sound pressure level of the auxiliary audio" and the "average sound pressure level threshold of the ambient noise tonal component".
S435, in the same one-to-one audio comparison, for each of the 5 critical sub-bands, counting 1 whenever the "maximum sound pressure level of the auxiliary audio" > "maximum sound pressure level threshold of the ambient noise audio tonal component"; after all 5 critical sub-bands have been compared, the number of times the maximum sound pressure level of the auxiliary audio exceeds the maximum sound pressure level threshold sums to N.
S436, in the same one-to-one audio comparison, for each of the 5 critical sub-bands, calculating the difference between the "standard deviation of the sound pressure level of the auxiliary audio" and the "standard deviation of the sound pressure level threshold of the ambient noise tonal component", then summing the 5 differences to obtain the value D.
S437, after the above calculation, each auxiliary audio has a feature value combination [P, N, D] with respect to the noise. The P, N, D values are compared across all auxiliary-class audio by the same priority rules: P first, then N, then D, with larger values ranking higher; two audios with equal P, N and D values share the same priority. In this way, the numbers of the top 50 auxiliary audios are selected as the third masking audio candidate set S.
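The extraction in S433 of the 5 sensitive sub-bands from a 25-band feature matrix can be sketched as below. The 0-based index range assumes the text's 1-based numbering of bands No. 13 to No. 17; the names are illustrative. The same [P, N, D] ranking of S434-S437 then runs on these 5-row matrices.

```python
# Bands No. 13 to No. 17 (1-based, per the text) cover roughly 2 kHz-4 kHz;
# the corresponding 0-based list indices are 12..16.
SENSITIVE_BANDS = slice(12, 17)


def sensitive_submatrix(feature_matrix_25):
    """Keep only the 5 rows of a 25-band feature matrix that fall in the
    ear's most sensitive range (the sixth/seventh masking feature matrices)."""
    return feature_matrix_25[SENSITIVE_BANDS]
```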
After the above recommendation steps, a background audio candidate set B, a melody audio candidate set M and an auxiliary audio candidate set S are obtained; the audio combination with the highest priority is selected from the combination schemes over B, M and S, and the result is output. Besides the psychoacoustic masking effect, the combination of a background sound, a melody and an auxiliary detail audio is also pleasant and psychologically soothing: it not only masks the noise but also relaxes the mind, and the user can customize the playback duration.
Based on the above method for masking environmental noise, correspondingly, a device for masking environmental noise can be obtained, the device comprising:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying the features of each audio material, and dividing the audio materials into background audio, melody audio or auxiliary audio;
the signal masking ratio calculating module, which is used for dividing the 20Hz-16kHz frequency range into 25 critical sub-bands and calculating, for each of the 25 critical sub-bands, the signal masking ratio of tonal components and the signal masking ratio of non-tonal components respectively; to realize tonal or non-tonal masking, the masker must be higher than the masked object by the signal masking ratio in decibels;
the environmental noise feature extraction module is used for monitoring and analyzing the environmental noise to obtain the features of tonal components and non-tonal components of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tonal component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in the material library aiming at the characteristics of the human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting the final masking audio.
The masking audio material library feature extraction module and the signal masking ratio calculation module can both be loaded in advance, while the environmental noise feature extraction module and the masking audio generation module are arranged on the user's handheld terminal or a fixed device; for the specific contents of these modules, refer to the detailed description of the method, which is not repeated here.
This embodiment provides a device comprising a mobile phone/smart speaker with a microphone, with the masking sound database of the environmental-noise masking device preset inside. In use, the device is started and the characteristic parameters of all audio in the masking audio material library are loaded; at the same time, the device connects to the central processing unit of the mobile phone/smart speaker and samples the environmental noise through its microphone. Then, in the device's preset mode, the environmental noise is spectrally transformed to obtain the frequency response curve of the sound source, the maximum decibel value is displayed in the system in real time, and the characteristic data of the sound source are output and statistically analyzed, with audio data continuing to be collected during this period. Finally, suitable masking audios are screened according to the respective noise characteristics, the selected combinations are matched, and the final masking audio combination scheme is output.
Therefore, in contrast to traditional means that rely on dedicated hardware such as microphones or vehicle-mounted systems, the invention can be applied in any system that has a sound collection device and a playback device, and can operate without adding any other hardware at all, making full use of the existing resources of the mobile phone/smart speaker and extending the practicality of the invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the unit modules is only a logical division, and there may be other divisions when the actual implementation is performed, or units having the same function may be grouped into one unit, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
Each functional unit in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A method of masking ambient noise, comprising the steps of:
establishing a masking audio material library, extracting and classifying the characteristics of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
dividing the frequency part of 20Hz-16kHz into 25 critical sub-bands, and respectively calculating the signal masking ratio of tonal components and the signal masking ratio of non-tonal components of the 25 critical sub-bands;
monitoring and analyzing the environmental noise to obtain the characteristics of tonal components and non-tonal components of the environmental noise;
recommending background audio in a material library according to the non-tonal component characteristics of the noise;
recommending melody audio in the material library according to the tone component characteristics of the noise;
recommending auxiliary audio in the material library aiming at the characteristics of the human ears;
and combining the recommended background audio, melody audio and auxiliary audio, and outputting the final masking audio.
2. The method of masking ambient noise according to claim 1, wherein the library of masked audio materials is created by: white noise, pink noise, brown noise, music melody of various musical instruments, atmosphere sound and atmosphere music are selected and used as audio materials to construct a material library.
3. The method of masking ambient noise according to claim 1, wherein the step of extracting features from each of the audio materials after the library of masked audio materials is created comprises:
reading the PCM coded data of the audio material second by second;
transforming the PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
dividing the 20Hz-16kHz part of the frequency response curve into 25 critical sub-bands;
analyzing the 25 critical sub-bands to obtain the maximum value of the sound pressure level in each sub-band;
analyzing the audio second by second through a window function until the full length of the audio has been analyzed, obtaining an [L × 25] feature matrix, where L is the length of the audio in seconds and each parameter value in the matrix is the maximum sound pressure level in the corresponding sub-band;
and calculating the maximum value Max, the average value Avg and the standard deviation Sd of the sound pressure level in each band to obtain the first masking feature matrix of the audio.
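The last step above, collapsing the [L × 25] per-second matrix into the first masking feature matrix, can be sketched as follows. A minimal illustration, assuming the matrix is a list of L rows of 25 per-band SPL maxima; the function name is hypothetical.

```python
from statistics import mean, pstdev


def first_masking_feature_matrix(spl):
    """Collapse an [L x 25] per-second max-SPL matrix into the first masking
    feature matrix: one (Max, Avg, Sd) tuple of sound pressure level per band.

    spl: list of L rows, each a list of 25 per-band SPL maxima (one row per second).
    """
    features = []
    for band in range(25):
        column = [row[band] for row in spl]   # this band's value in every second
        features.append((max(column), mean(column), pstdev(column)))
    return features
```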
4. A method of masking ambient noise according to claim 3, wherein each audio material is classified according to its characteristics by:
if the standard deviation Sd values in all the frequency bands are smaller than a preset first threshold and Max − Avg in every band is smaller than a preset second threshold, the audio material is background audio;
if the standard deviation Sd values in all the frequency bands are smaller than the preset first threshold but Max − Avg is not smaller than the preset second threshold, the audio material is melody audio;
and if the standard deviation Sd values in the frequency bands are not all smaller than the preset first threshold, the audio material is auxiliary audio.
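Claim 4's classification rules can be sketched as below. This is one minimal reading of the claim: it assumes the per-band conditions are checked in every band, that the third class is the residual case (auxiliary audio), and it uses illustrative threshold values; all names are hypothetical.

```python
def classify(features, sd_threshold, gap_threshold):
    """Classify an audio material by its first masking feature matrix.

    features: 25 tuples (Max, Avg, Sd) per band; sd_threshold and
    gap_threshold stand in for claim 4's preset first/second thresholds.
    """
    steady = all(sd < sd_threshold for _, _, sd in features)          # small Sd everywhere
    small_peaks = all(mx - avg < gap_threshold for mx, avg, _ in features)
    if steady and small_peaks:
        return "background"   # flat and steady energy
    if steady:
        return "melody"       # steady energy but pronounced peaks
    return "auxiliary"        # residual class per this reading of the claim
```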
5. The method for masking environmental noise according to claim 1, wherein audio in the masked audio material library can be uploaded autonomously by the user, and audio uploaded by the user is processed as follows: the uploaded audio is converted into the prescribed format and then classified; if the classification succeeds, the audio material is stored locally on the user's device, otherwise information is output that the audio is not suitable for use as masking audio.
6. The method for masking environmental noise according to claim 3, wherein the signal masking ratios of tonal components and non-tonal components of the 25 critical sub-bands are calculated by:
first, the signal masking ratio of the tonal components: T_TM(i,j) = P_TM(j) − 0.275z(j) + SF(i,j) − 6.025, where T_TM(i,j) is the masking threshold of a single tone, P_TM(j) is the sound pressure level of the tone masker at j, and SF(i,j) is the spread masking threshold from masker j to masked object i; since the masker and the masked object achieve the best masking effect within the same critical band, the mutual influence of different critical bands is not considered, and within the same critical band SF(i,j) = −17(z(i) − z(j)) = 0, i.e. T_TM(i,j) = P_TM(j) − 0.275z(j) − 17(z(i) − z(j)) − 6.025 = P_TM(j) − 0.275z(j) − 6.025, where z(j) ∈ [0, 1, 2, …, 24]; that is, when the sound pressure level of the tone masker is 0.275z(j) + 6.025 higher than the sound pressure level of the tonal component to be masked, absolute masking is achieved, so 0.275z(j) + 6.025 is the signal masking ratio of the tonal component; the signal masking ratio of the tonal component is calculated for each frequency band to obtain the second masking feature matrix of the tonal components;
second, the signal masking ratio of the non-tonal components: T_NM(i,j) = P_NM(j) − 0.175z(j) + SF(i,j) − 2.025, where T_NM(i,j) is the masking threshold of a single non-tonal masker, P_NM(j) is the sound pressure level of the non-tonal masker at j, and SF(i,j) is the spread masking threshold from masker j to masked object i; within the same critical band, T_NM(i,j) = P_NM(j) − 0.175z(j) − 17(z(i) − z(j)) − 2.025 = P_NM(j) − 0.175z(j) − 2.025, where z(j) ∈ [0, 1, 2, …, 24]; that is, within the same critical band, when the sound pressure level of the non-tonal masker is 0.175z(j) + 2.025 higher than the sound pressure level of the non-tonal component to be masked, absolute masking is achieved, so the signal masking ratio of the non-tonal component is 0.175z(j) + 2.025; the signal masking ratio of the non-tonal component is calculated for each frequency band to obtain the third masking feature matrix of the non-tonal components.
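The two signal masking ratios reduce to simple linear functions of the critical-band index z. A minimal sketch, assuming z runs from 0 to 24 as in claim 6; the function names are hypothetical.

```python
def smr_tonal(z):
    """Signal masking ratio of a tonal component in critical band z (0..24):
    how many dB the masker must exceed the masked tone within the same band."""
    return 0.275 * z + 6.025


def smr_nontonal(z):
    """Signal masking ratio of a non-tonal component in critical band z."""
    return 0.175 * z + 2.025


# Second / third masking feature matrices: the per-band SMRs over all 25 bands.
SMR_TM = [smr_tonal(z) for z in range(25)]
SMR_NM = [smr_nontonal(z) for z in range(25)]
```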
7. The method of masking ambient noise according to claim 6, wherein the ambient noise is monitored and analyzed by:
collecting noise, and converting the noise analog signal into PCM coded data;
transforming the PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
dividing the 20Hz-16kHz part of the frequency response curve of the second into 25 critical sub-frequency bands;
taking the maximum value of the sound pressure level of each critical sub-band as the masking threshold TM of the tonal component of the sub-band, and taking the average value of other values except the maximum value of each critical sub-band as the masking threshold NM of the non-tonal component of the band;
continuously collecting noise for L seconds;
obtaining the maximum value, the average value and the standard deviation of TM and NM under each critical sub-frequency band in the L-second process through a window function to obtain a noise characteristic matrix;
adding the second masking feature matrix to all the TM values to obtain the sound pressure level threshold matrix of the masking audio required by the tonal components of the environmental noise, namely the fourth masking feature matrix;
and adding all the values of the NM to the third masking feature matrix to obtain a sound pressure level threshold matrix of the masking audio required by the non-tonal components of the environmental noise, namely a fifth masking feature matrix.
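The TM/NM extraction of claim 7 and the construction of the fourth/fifth matrices can be sketched as follows. This is an illustration under two assumptions: each second of one band is represented by a list of per-bin SPL values, and "adding ... to all the values" is read as adding the per-band signal masking ratio to each statistic; all names are hypothetical.

```python
from statistics import mean


def tm_nm(band_bins):
    """One second, one critical band: TM is the maximum bin SPL (tonal
    masking threshold); NM is the mean of the remaining bins (non-tonal)."""
    tm = max(band_bins)
    i = band_bins.index(tm)                    # drop one occurrence of the max
    rest = band_bins[:i] + band_bins[i + 1:]
    nm = mean(rest) if rest else tm
    return tm, nm


def threshold_matrix(noise_stats, smr):
    """Add the per-band signal masking ratio to the noise feature statistics,
    giving the SPL thresholds the masking audio must reach (the fourth matrix
    from TM statistics, the fifth from NM statistics). Per the claim's
    'all the values', the SMR is added to every statistic here."""
    return [(mx + s, avg + s, sd + s)
            for (mx, avg, sd), s in zip(noise_stats, smr)]
```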
8. The method of masking ambient noise according to claim 7, wherein the background-like audio in the library is recommended based on non-tonal characteristics of the noise by:
comparing the first masking feature matrixes of the background class audios in the material library one by one according to the fifth masking feature matrix of the non-tonal component of the environmental noise;
calculating the total standard deviation P of the difference values of the average value of the sound pressure level of the background class audio and the average value of the sound pressure level threshold of the non-tonal component of the environmental noise under each sub-frequency band;
counting the times N that the maximum value of the sound pressure level of the background class audio exceeds the maximum value of the sound pressure level threshold of the non-tonal component of the environmental noise;
calculating the difference value of the standard deviation of the sound pressure level of the background audio and the standard deviation of the sound pressure level threshold of the non-tonal component of the environmental noise under each sub-frequency band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all background-class audio: the audio with the larger P value ranks higher; if the P values are equal, the audio with the larger N value ranks higher; if the N values are also equal, the audio with the larger D value ranks higher; if two audios have equal P, N and D values, they have the same priority; in this way, the numbers of the top NUM1 background-class audios are selected as the first masking audio candidate set B.
9. The method for masking environmental noise according to claim 7, wherein the melody-like audio in the material library is recommended according to the pitch characteristics of the noise by:
comparing the first masking feature matrixes of the melody type audios in the material library one by one according to the fourth masking feature matrix of the tone component of the environmental noise;
calculating the total standard deviation P of the difference values of the sound pressure average value of the melody audio and the sound pressure level threshold value average value of the environment noise tone component under each sub-frequency band;
counting the times N that the maximum value of the sound pressure level of the melody audio exceeds the maximum value of the sound pressure level threshold of the tone component of the environmental noise;
calculating the difference value of the standard deviation of the sound pressure level of the melody audio and the standard deviation of the sound pressure level threshold of the tone component of the environmental noise under each sub-frequency band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all melody-class audio: the audio with the larger P value ranks higher; if the P values are equal, the audio with the larger N value ranks higher; if the N values are also equal, the audio with the larger D value ranks higher; if two audios have equal P, N and D values, they have the same priority; in this way, the numbers of the top NUM2 melody-class audios are selected as the second masking audio candidate set M.
10. The method for masking environmental noise according to claim 3, wherein the auxiliary audio in the material library is recommended for the characteristics of human ears by:
determining that the range to which the human ear is most sensitive is 2kHz-4kHz, and extracting from the 25 critical sub-bands the 5 sub-bands that this range covers, namely critical sub-bands No. 13 to No. 17;
acquiring a signal masking ratio of a noise audio tonal component;
calculating the total standard deviation P of the difference value of the average value of the sound pressure level of the auxiliary audio and the average value of the sound pressure level threshold value of the tone component of the environmental noise under the 5 critical sub-frequency bands;
calculating the times that the maximum value of the sound pressure level of the auxiliary audio exceeds the maximum value of the sound pressure level threshold of the environmental noise tone component under the 5 critical sub-frequency bands, wherein the times are counted as N;
calculating the difference value between the standard deviation of the sound pressure level of the auxiliary audio and the standard deviation of the sound pressure level threshold of the tone component of the environmental noise under the 5 critical sub-frequency bands, and then summing the 5 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all auxiliary audio: the audio with the larger P value ranks higher; if the P values are equal, the audio with the larger N value ranks higher; if the N values are also equal, the audio with the larger D value ranks higher; if two audios have equal P, N and D values, they have the same priority; in this way, the numbers of the top NUM3 auxiliary-class audios are selected as the third masking audio candidate set S.
11. An ambient noise masking device, comprising:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying the features of each audio material, and dividing the audio materials into background audio, melody audio or auxiliary audio;
the signal masking ratio calculating module is used for dividing the frequency part of 20Hz-16kHz into 25 critical sub-bands and respectively calculating the signal masking ratio of the tone component and the signal masking ratio of the non-tone component of the 25 critical sub-bands;
the environmental noise feature extraction module is used for monitoring and analyzing the environmental noise to obtain the features of tonal components and non-tonal components of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tonal component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in the material library aiming at the characteristics of the human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting the final masking audio.
12. An apparatus comprising a processor, a memory, a sound collection device, a playback device, and a computer program stored on the memory and executable on the processor, the processor implementing the method of masking ambient noise according to any one of claims 1 to 10 when executing the computer program.
CN201911399710.5A 2019-12-30 2019-12-30 Method, device and equipment for masking environmental noise Active CN111161699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911399710.5A CN111161699B (en) 2019-12-30 2019-12-30 Method, device and equipment for masking environmental noise

Publications (2)

Publication Number Publication Date
CN111161699A true CN111161699A (en) 2020-05-15
CN111161699B CN111161699B (en) 2023-04-28

Family

ID=70559465


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111956398A (en) * 2020-07-13 2020-11-20 恒大恒驰新能源汽车研究院(上海)有限公司 Vehicle collision hearing protection method, vehicle and equipment
CN112509592A (en) * 2020-11-18 2021-03-16 广东美的白色家电技术创新中心有限公司 Electrical equipment, noise processing method and readable storage medium
CN113883669A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Sleep-assisting control method and device for air conditioner, electronic equipment and storage medium
CN113883671A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Abnormal noise shielding control method for air conditioner, air conditioner and readable storage medium
CN116996807A (en) * 2023-09-28 2023-11-03 小舟科技有限公司 Brain-controlled earphone control method and device based on user emotion, earphone and medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101354885A (en) * 2007-01-16 2009-01-28 哈曼贝克自动系统股份有限公司 Active noise control system
CN102008371A (en) * 2010-10-28 2011-04-13 中国科学院声学研究所 Digital tinnitus masker
JP2012123070A (en) * 2010-12-07 2012-06-28 Yamaha Corp Masker sound generation device, masker sound output device and masker sound generation program
US20150003625A1 (en) * 2012-03-26 2015-01-01 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improving the perceived quality of sound reproduction by combining active noise cancellation and a perceptual noise compensation
JP2016052049A (en) * 2014-09-01 2016-04-11 三菱電機株式会社 Sound environment control device and sound environment control system using the same
CN105741849A (en) * 2016-03-06 2016-07-06 北京工业大学 Speech enhancement method fusing phase estimation and human auditory characteristics for digital hearing aids
CN105869652A (en) * 2015-01-21 2016-08-17 北京大学深圳研究院 Psychoacoustic model calculation method and device
CN106796782A (en) * 2014-10-16 2017-05-31 索尼公司 Information processor, information processing method and computer program
CN109238448A (en) * 2018-09-17 2019-01-18 上海市环境科学研究院 A method of acoustic environment satisfaction is improved based on sound masking

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111956398A (en) * 2020-07-13 2020-11-20 恒大恒驰新能源汽车研究院(上海)有限公司 Vehicle collision hearing protection method, vehicle and equipment
CN112509592A (en) * 2020-11-18 2021-03-16 广东美的白色家电技术创新中心有限公司 Electrical equipment, noise processing method and readable storage medium
CN112509592B (en) * 2020-11-18 2024-01-30 广东美的白色家电技术创新中心有限公司 Electrical apparatus, noise processing method, and readable storage medium
CN113883669A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Sleep-assisting control method and device for air conditioner, electronic equipment and storage medium
CN113883671A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Abnormal noise shielding control method for air conditioner, air conditioner and readable storage medium
CN116996807A (en) * 2023-09-28 2023-11-03 小舟科技有限公司 Brain-controlled earphone control method and device based on user emotion, earphone and medium
CN116996807B (en) * 2023-09-28 2024-01-30 小舟科技有限公司 Brain-controlled earphone control method and device based on user emotion, earphone and medium

Also Published As

Publication number Publication date
CN111161699B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN111161699B (en) Method, device and equipment for masking environmental noise
CN107708046B (en) Method and system for self-administered sound enhancement
CN104937954B (en) Method and system for the enhancing of Self management sound
CN104811891B (en) The method and system that the scaling of voice related channel program is avoided in multi-channel audio
CN106463107A (en) Collaboratively processing audio between headset and source
CN106464998A (en) Collaboratively processing audio between headset and source to mask distracting noise
CN101208742A (en) Adapted audio response
JPH08508626A (en) Adaptive gain and filtering circuit for audio reproduction device
CN113949956B (en) Noise reduction processing method and device, electronic equipment, earphone and storage medium
WO2018069900A1 (en) Audio-system and method for hearing-impaired
US20150005661A1 (en) Method and process for reducing tinnitus
CN112767908B (en) Active noise reduction method based on key voice recognition, electronic equipment and storage medium
CN113949955B (en) Noise reduction processing method and device, electronic equipment, earphone and storage medium
CN111385688A (en) Active noise reduction method, device and system based on deep learning
Sun et al. A supervised speech enhancement method for smartphone-based binaural hearing aids
Kates Modeling the effects of single-microphone noise-suppression
KR20050121698A (en) Method and system for increasing audio perceptual tone alerts
JP2012063614A (en) Masking sound generation device
CN116132875B (en) Multi-mode intelligent control method, system and storage medium for hearing-aid earphone
Rämö et al. Real-time perceptual model for distraction in interfering audio-on-audio scenarios
US8107660B2 (en) Hearing aid
KR20120081424A (en) Volume adjusting method
CN113963699A (en) Intelligent voice interaction method for financial equipment
CN107111921A (en) The method and apparatus set for effective audible alarm
CN201684071U (en) Digital procedure type tinnitus comprehensive therapeutic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant