CN111161699B - Method, device and equipment for masking environmental noise - Google Patents

Method, device and equipment for masking environmental noise

Info

Publication number
CN111161699B
CN111161699B (application CN201911399710.5A)
Authority
CN
China
Prior art keywords
audio
masking
noise
sound pressure
pressure level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911399710.5A
Other languages
Chinese (zh)
Other versions
CN111161699A (en)
Inventor
邹煜晖
刘锦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xinyuchao Information Technology Co ltd
Original Assignee
Guangzhou Xinyuchao Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xinyuchao Information Technology Co ltd filed Critical Guangzhou Xinyuchao Information Technology Co ltd
Priority to CN201911399710.5A priority Critical patent/CN111161699B/en
Publication of CN111161699A publication Critical patent/CN111161699A/en
Application granted granted Critical
Publication of CN111161699B publication Critical patent/CN111161699B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1781Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions
    • G10K11/17821Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions characterised by the analysis of the input signals only
    • G10K11/17823Reference signals, e.g. ambient acoustic environment
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a method for masking environmental noise, comprising the following steps: establishing a masking audio material library, extracting and classifying the features of each audio material, and dividing the materials into background, melody or auxiliary audio; dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands and calculating, for each sub-band, the signal masking ratio of the tonal component and of the non-tonal component; monitoring and analyzing the environmental noise to obtain the features of its tonal and non-tonal components; recommending background audio and melody audio from the material library according to the non-tonal and tonal features of the noise, and recommending auxiliary audio from the library according to the characteristics of the human ear; and combining the three recommended audios to output the final masking audio. The corresponding device needs only a sound-collecting unit and a playback unit, makes full use of the existing resources of a mobile phone or smart speaker, and features low computational load, good masking effect and low hardware requirements.

Description

Method, device and equipment for masking environmental noise
Technical Field
The invention belongs to the field of noise detection and control research, and particularly relates to a method, a device and equipment for masking environmental noise.
Background
Noise refers to sound that arouses aversion or whose volume exceeds the level the human ear can comfortably receive. Noise is common in daily life — heavy urban traffic, noisy streets, indoor electrical appliances, even the chatting and playing of people nearby — and it disturbs work and study during the day and interferes with sleep at night. Long-term exposure not only harms physical health but also easily causes negative emotions such as irritability and anxiety, thereby affecting mental health.
Noise pollution forms when noise propagates continuously in the environment. Together with water pollution and air pollution, it is ranked among the three major environmental problems, and effectively controlling it is a challenge faced by all of humanity.
Currently, noise is managed in three ways: controlling it at the sound source; controlling it in the propagation medium; and controlling it at the ear. The first two approaches are widely applied in industrial production and architectural design but are not available to individuals, so they are not compared here. Controlling noise at the ear relies on devices that individuals can choose, which from a physical perspective fall into passive noise reduction and active noise reduction.
Passive noise reduction absorbs and insulates sound with physical materials to reduce the noise entering the ear; common solutions include earplugs and earmuffs. The noise-reduction effect is determined mainly by the chosen material: earplugs are usually airtight solids that form an enclosed space around the ear, or block outside noise with sound-insulating materials such as silicone. The disadvantages are: 1. the effect on low-frequency noise is not obvious; 2. it is indiscriminate, isolating all external sounds, including those one actually needs to hear; 3. the effect is limited by the material and design, so it is uneven — good noise reduction is either very expensive or heavy and uncomfortable to wear.
Active noise reduction starts from the characteristics of the noise itself: a noise-reduction system generates sound waves equal in magnitude but opposite in phase to the external noise and cancels it out. The principle is that every sound consists of a certain spectrum; if a sound can be found whose spectrum is identical to the noise to be eliminated but exactly opposite in phase (180 degrees apart), the noise is completely cancelled. A common solution is the active noise-cancelling headphone. This approach generally works well and eliminates target noise effectively, but it is technically difficult, demands much of the hardware and requires an independent battery for power, so it is expensive, and the battery and device shape limit wearing comfort to some extent. It is therefore used mainly in noise-critical fields such as the military, airports, shooting ranges and automobiles, and is not convenient for individuals to use flexibly while sleeping, going out and so on.
Besides these two approaches, masking noise with other sounds is a common everyday practice: some people listen to light music while working in a noisy environment in order to ignore the ambient noise. This exploits the auditory masking effect — the influence one sound exerts on the auditory system's perception of another — which is common in how humans and animals perceive and localize sound. In recent years it has mainly been applied to clinical tinnitus treatment, speech enhancement, digital audio watermarking and environmental noise control.
When the masking effect is used for environmental noise control, the design usually targets large environments such as open-plan offices. A continuous, low-level sound without information content is used as the masking sound, because such a sound forms a noise floor that people readily accept while suppressing other noises: the screech of a brake or the clatter of plates can be masked by a softer sound such as a fan, which is why a restaurant next to a road may run fans to reduce the interference of traffic noise. However, this only reduces the overall noise impact and cannot meet each individual's anti-noise needs: firstly, the environment an individual occupies changes in real time; secondly, people differ in which noise categories they are sensitive to and how sensitive they are — some, for example, find a fan more annoying than passing cars.
Yet in daily life, individuals need relatively quiet space and a calm mood whether studying, working or sleeping. Many people therefore play sounds of their personal preference to mask noise, which works only to a limited extent. Because noise is complex in type and composition and varies with the distance of its source, the music most people choose must be played loudly or very close to the ear, and the noise often remains clearly perceptible: on the one hand complete masking is not achieved, and on the other hand long-term use at high volume damages hearing. Moreover, a habitually preferred sound — even one that masks the current noise — may not suit the current scenario; listening to an overly exciting melody to fall asleep, for instance, easily harms sleep quality.
Therefore, a more scientific and accurate method and device are needed to match the volume and characteristics of everyday noise and the usage scenario (such as sleep or work), so as to achieve the best masking effect accurately and conveniently.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of the prior art and provide a masking method for environmental noise, which has the characteristics of low calculated amount, good masking effect and low requirement on hardware.
Another object of the present invention is to provide a masking device for environmental noise, which has the advantages of low calculation amount, good masking effect, low requirement for hardware, and low cost.
Another object of the present invention is to provide equipment comprising the above masking device for environmental noise. The equipment may be a stand-alone device dedicated to environmental noise monitoring, an environmental parameter monitor that monitors other environmental parameters in addition to noise, or a terminal integrating the above functions, such as a mobile phone, computer, smart speaker or other smart hardware.
The aim of the invention is achieved by the following technical scheme: a method of masking ambient noise comprising the steps of:
Establishing a masking audio material library, extracting and classifying the characteristics of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands, and calculating the signal masking ratio of the tonal component and of the non-tonal component for each of the 25 critical sub-bands; the signal masking ratio is the number of decibels by which the masker must exceed the masked signal for tonal or non-tonal masking to be achieved;
monitoring and analyzing the environmental noise to obtain the characteristics of the tone component and the non-tone component of the environmental noise;
recommending background audio in a material library according to the non-tone component characteristics of noise;
recommending melody audio in the material library according to the tone component characteristics of the noise;
recommending auxiliary audio in a material library aiming at the characteristics of human ears;
and combining the recommended background audio, melody audio and auxiliary audio, and outputting final masking audio.
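The steps above repeatedly rely on dividing 20 Hz-16 kHz into 25 critical sub-bands, but the patent does not list the band edges. A minimal sketch, assuming (as an illustration only) that the sub-bands are spaced uniformly on the Bark scale using Zwicker's approximation:

```python
import numpy as np

def bark(f):
    """Zwicker's approximation of the critical-band rate (Bark) for frequency f in Hz."""
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def critical_band_index(freqs, f_lo=20.0, f_hi=16000.0, n_bands=25):
    """Map frequencies (Hz) to one of n_bands sub-bands spaced uniformly
    on the Bark axis between f_lo and f_hi (an assumed construction)."""
    z = bark(np.asarray(freqs, dtype=float))
    z_lo, z_hi = bark(f_lo), bark(f_hi)
    idx = np.floor((z - z_lo) / (z_hi - z_lo) * n_bands).astype(int)
    return np.clip(idx, 0, n_bands - 1)
```

Tabulated Zwicker critical-band edges would serve equally well; only the mapping from frequency to one of 25 indices matters to the later steps.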
Preferably, the masking audio material library is established as follows: white noise, pink noise, brown noise, musical melodies of various instruments, ambient sounds, ambient music and the like are selected as audio materials to construct the library.
Preferably, after the masking audio material library is established, feature extraction is performed on each audio material, and the method comprises the following steps:
the pulse-code modulation (PCM) encoded data of the audio material is read second by second;
converting the PCM coded data obtained every second into a frequency response curve through Fast Fourier Transform (FFT);
dividing a 20Hz-16kHz portion of the frequency response curve into 25 critical sub-bands;
analyzing the 25 critical sub-bands to obtain the maximum value of the sound pressure level in each sub-band;
advancing the analysis window second by second until the full length of the audio has been analyzed, obtaining an [L x 25] feature matrix, where L is the audio length in seconds and each value in the matrix is the maximum sound pressure level within a sub-band;
and computing the maximum Max, average Avg and standard deviation Sd of the sound pressure level in each frequency band to obtain the first masking feature matrix of the audio.
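The feature-extraction steps above can be sketched as follows; `band_edges_hz` (26 edges delimiting the 25 sub-bands) and the SPL reference `p_ref` are assumptions, since the patent specifies neither:

```python
import numpy as np

def band_max_spl(pcm_second, sample_rate, band_edges_hz, p_ref=1.0):
    """One second of PCM -> maximum sound pressure level (dB) in each sub-band."""
    spectrum = np.abs(np.fft.rfft(pcm_second))
    freqs = np.fft.rfftfreq(len(pcm_second), d=1.0 / sample_rate)
    spl = 20.0 * np.log10(np.maximum(spectrum, 1e-12) / p_ref)
    out = np.empty(len(band_edges_hz) - 1)
    for b in range(len(band_edges_hz) - 1):
        mask = (freqs >= band_edges_hz[b]) & (freqs < band_edges_hz[b + 1])
        out[b] = spl[mask].max() if mask.any() else -np.inf
    return out

def first_masking_features(pcm, sample_rate, band_edges_hz):
    """Analyse the audio second by second ([L x n_bands] matrix of per-band
    max SPL), then reduce to per-band Max, Avg and Sd -- the 'first masking
    feature matrix' of the material."""
    n = len(pcm) // sample_rate
    m = np.stack([band_max_spl(pcm[i * sample_rate:(i + 1) * sample_rate],
                               sample_rate, band_edges_hz) for i in range(n)])
    return m.max(axis=0), m.mean(axis=0), m.std(axis=0)
```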
Preferably, each audio material is classified according to its features as follows:
if the standard deviation Sd in every frequency band is smaller than a preset first threshold, and Max-Avg in every frequency band is smaller than a preset second threshold, the material is background audio; background audio is used to mask non-tonal components, and its per-second frequency response curve is evenly distributed;
if the standard deviation Sd in every frequency band is smaller than the preset first threshold, and Max-Avg >= the preset second threshold in every frequency band, the material is melody audio; melody audio is used to mask the tonal components of noise and mostly consists of melodious tunes;
if the standard deviation Sd is not smaller than the preset first threshold in the frequency bands, but Max-Avg >= the preset second threshold in every frequency band, the material is auxiliary audio; auxiliary audio has relatively dense frequency content and relatively high sound pressure levels in the 2 kHz-4 kHz range.
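A sketch of the three classification rules; the threshold values `t1` and `t2` are placeholders, since the patent leaves the preset first and second thresholds unspecified:

```python
import numpy as np

def classify_material(mx, avg, sd, t1=3.0, t2=6.0):
    """Classify an audio material from its per-band Max/Avg/Sd features.
    t1 (first threshold, on Sd) and t2 (second threshold, on Max-Avg)
    are illustrative placeholders, not values from the patent."""
    mx, avg, sd = map(np.asarray, (mx, avg, sd))
    stable = np.all(sd < t1)            # sound pressure steady over time
    flat = np.all(mx - avg < t2)        # no pronounced per-band peaks
    peaked = np.all(mx - avg >= t2)     # pronounced per-band peaks
    if stable and flat:
        return "background"             # masks non-tonal components
    if stable and peaked:
        return "melody"                 # masks tonal components
    if np.all(sd >= t1) and peaked:
        return "auxiliary"              # 2-4 kHz helper audio
    return "unclassified"
```

The "unclassified" branch corresponds to the case where a user upload fails classification and is rejected.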
Preferably, users may upload audio to the masking audio material library themselves; uploaded audio is processed as follows: the audio is converted into the prescribed format and then classified; if classification succeeds it is stored locally for the user, otherwise a message is output indicating that the audio is not suitable as masking audio.
Preferably, the signal masking ratios of the tone components and the signal masking ratios of the non-tone components of the 25 critical subbands are calculated, respectively, by:
first, the signal masking ratio of the tonal component: T_TM(i,j) = P_TM(j) - 0.275·z(j) + SF(i,j) - 6.025, where T_TM(i,j) is the masking threshold of a single tone, P_TM(j) is the sound pressure level (SPL) of the tonal masker in band j, and SF(i,j) is the extended (spreading) masking threshold from masker j to masked signal i. Because masking is strongest when masker and masked signal fall in the same critical band, the invention does not consider interactions between different critical bands; this avoids the extra computation of masking superposition, saves computing resources and improves efficiency. Within the same critical band, SF(i,j) = 17·(z(i) - z(j)) = 0, i.e. T_TM(i,j) = P_TM(j) - 0.275·z(j) - 17·(z(i) - z(j)) - 6.025 = P_TM(j) - 0.275·z(j) - 6.025, where z(j) = [0, 1, 2, …, 24]. In other words, the SPL of the tonal masker must exceed the SPL of the tonal component of the masked signal by 0.275·z(j) + 6.025, so the signal masking ratio of the tonal component is 0.275·z(j) + 6.025. Calculating this ratio for each frequency band yields the second masking feature matrix, for the tonal component;
second, the signal masking ratio of the non-tonal component: T_NM(i,j) = P_NM(j) - 0.175·z(j) + SF(i,j) - 2.025. Within the same critical band, T_NM(i,j) = P_NM(j) - 0.175·z(j) - 17·(z(i) - z(j)) - 2.025 = P_NM(j) - 0.175·z(j) - 2.025, where z(j) = [0, 1, 2, …, 24]. T_NM(i,j) is the masking threshold of a single non-tonal masker, P_NM(j) is the SPL of the non-tonal masker in band j, and SF(i,j) is the extended masking threshold from masker j to masked signal i. That is, within the same critical band, the SPL of the non-tonal masker must exceed the SPL of the non-tonal component of the masked signal by 0.175·z(j) + 2.025, so the signal masking ratio of the non-tonal component is 0.175·z(j) + 2.025. Calculating this ratio for each frequency band yields the third masking feature matrix, for the non-tonal component.
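Within a single critical band the two signal masking ratios reduce to simple linear functions of the band index z(j); a direct transcription:

```python
import numpy as np

def smr_tone(z):
    """Signal masking ratio of the tonal component in critical band z (0..24):
    how many dB the masker must exceed the masked tone."""
    return 0.275 * z + 6.025

def smr_noise(z):
    """Signal masking ratio of the non-tonal component in critical band z."""
    return 0.175 * z + 2.025

z = np.arange(25)
tone_smr = smr_tone(z)    # per-band values: the second masking feature matrix
noise_smr = smr_noise(z)  # per-band values: the third masking feature matrix
```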
Preferably, the environmental noise is monitored and analyzed by:
collecting noise, and converting a noise analog signal into PCM (pulse code modulation) encoded data;
converting PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
dividing the 20Hz-16kHz portion of each second's frequency response curve into 25 critical sub-bands;
taking the maximum sound pressure level of each critical sub-band as the tonal masking threshold TM of that sub-band, and the average of the remaining values in the sub-band as its non-tonal masking threshold NM;
continuously collecting noise for L seconds;
obtaining the maximum value, the average value and the standard deviation of TM and NM under each critical sub-band in the L second process through a window function, and obtaining a noise characteristic matrix;
adding all the values of TM to the second masking feature matrix to obtain a sound pressure level threshold matrix of masking audio required by the tone component of the environmental noise, namely a fourth masking feature matrix;
and adding all values of NM to the third masking feature matrix to obtain a sound pressure level threshold matrix of masking audio required by the non-tone component of the environmental noise, namely a fifth masking feature matrix.
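The per-second TM/NM extraction described above might look like the following; the band edges and SPL reference are again assumptions. Adding the per-band tonal signal masking ratio to TM then gives the fourth masking feature matrix, and adding the non-tonal ratio to NM gives the fifth:

```python
import numpy as np

def noise_band_thresholds(pcm_second, sample_rate, band_edges_hz):
    """For one second of noise, return per-band tonal threshold TM
    (max SPL in the band) and non-tonal threshold NM (mean of the
    remaining SPL values in the band)."""
    spectrum = np.abs(np.fft.rfft(pcm_second))
    freqs = np.fft.rfftfreq(len(pcm_second), d=1.0 / sample_rate)
    spl = 20.0 * np.log10(np.maximum(spectrum, 1e-12))
    n_bands = len(band_edges_hz) - 1
    tm = np.full(n_bands, -np.inf)
    nm = np.full(n_bands, -np.inf)
    for b in range(n_bands):
        vals = spl[(freqs >= band_edges_hz[b]) & (freqs < band_edges_hz[b + 1])]
        if len(vals) == 0:
            continue
        k = vals.argmax()
        tm[b] = vals[k]                       # tonal masking threshold
        rest = np.delete(vals, k)
        nm[b] = rest.mean() if len(rest) else vals[k]  # non-tonal threshold
    return tm, nm
```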
Preferably, the background audio in the material library is recommended according to the non-tone characteristic of noise, and the method is as follows:
According to a fifth masking feature matrix of the non-tone component of the environmental noise, comparing the first masking feature matrices of the background audio in the material library one by one;
calculating the total standard deviation P of the difference value of the average value of the sound pressure level of the background audio and the average value of the sound pressure level threshold value of the non-tone component of the environmental noise under each sub-band;
counting the times N of the maximum sound pressure level of the background audio exceeding the maximum sound pressure level threshold value of the non-tone component of the environmental noise;
calculating the difference value of the sound pressure level standard deviation of the background audio and the sound pressure level threshold standard deviation of the non-tonal component of the environmental noise under each sub-band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all background audios: between two audios, the one with the larger P ranks higher; with equal P, the one with the larger N ranks higher; with equal P and N, the one with the larger D ranks higher; if P, N and D are all equal, the two audios share the same priority. In this way, the numbers of the top NUM1 background audios are selected as the first masking candidate set B.
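The [P, N, D] ranking can be sketched as below. The descending lexicographic comparison (P first, then N, then D) is one plausible reading of the tie-breaking rule, and the same routine applies unchanged to the melody and auxiliary candidate sets:

```python
import numpy as np

def pnd_score(audio_max, audio_avg, audio_sd, thr_max, thr_avg, thr_sd):
    """[P, N, D] score of one candidate against the noise threshold matrix."""
    p = np.std(audio_avg - thr_avg)       # spread of per-band avg-SPL differences
    n = int(np.sum(audio_max > thr_max))  # bands where candidate max exceeds threshold max
    d = float(np.sum(audio_sd - thr_sd))  # summed per-band Sd differences
    return (p, n, d)

def recommend(candidates, thr, top=3):
    """Rank candidates (dict: name -> (max, avg, sd) arrays) against the noise
    thresholds thr = (max, avg, sd), descending lexicographically on (P, N, D)."""
    scored = {name: pnd_score(*feats, *thr) for name, feats in candidates.items()}
    return sorted(scored, key=lambda name: scored[name], reverse=True)[:top]
```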
Preferably, the melody type audio in the material library is recommended according to the tone characteristics of noise, and the method is as follows:
According to the fourth masking feature matrix of the tone component of the environmental noise, comparing the first masking feature matrix of the melody type audio in the material library one by one;
calculating the total standard deviation P of the difference value of the sound pressure level average value of the melody type audio and the sound pressure level threshold value average value of the environmental noise tone component under each sub-band;
counting the times N that the maximum sound pressure level of melody type audio exceeds the maximum sound pressure level threshold value of the environmental noise tone component;
calculating the difference value of the standard deviation of the sound pressure level of the melody type audio and the standard deviation of the threshold value of the sound pressure level of the environmental noise tone component under each sub-band, and then summing the 25 difference values to obtain a numerical value D;
comparing the [P, N, D] values of all melody audios: between two audios, the one with the larger P ranks higher; with equal P, the one with the larger N ranks higher; with equal P and N, the one with the larger D ranks higher; if P, N and D are all equal, the two audios share the same priority. In this way, the numbers of the top NUM2 melody audios are selected as the second masking candidate set M.
Preferably, auxiliary audio in a material library is recommended aiming at the characteristics of human ears, and the method comprises the following steps:
determining the most sensitive range of the human ear as 2 kHz-4 kHz, and extracting the 5 critical sub-bands (the 13th-17th) among the 25 that this range covers;
acquiring a signal masking ratio of the noise audio tone component;
calculating the total standard deviation P of the difference between the average value of the sound pressure level of the auxiliary audio and the average value of the sound pressure level threshold value of the environmental noise tone component under the 5 critical frequency sub-bands;
calculating the number of times that the maximum sound pressure level of the auxiliary audio exceeds the maximum sound pressure level threshold of the environmental noise tone component under the 5 critical sub-bands, and summing the number of times to N;
calculating the difference between the standard deviation of the sound pressure level of the auxiliary audio and the standard deviation of the sound pressure level threshold of the environmental noise tone component under the 5 critical sub-bands, and then summing the 5 differences to obtain a value D;
comparing the [P, N, D] values of all auxiliary audios: between two audios, the one with the larger P ranks higher; with equal P, the one with the larger N ranks higher; with equal P and N, the one with the larger D ranks higher; if P, N and D are all equal, the two audios share the same priority. In this way, the numbers of the top NUM3 auxiliary audios are selected as the third masking candidate set S.
A masking device for ambient noise, the device comprising:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying features of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
a signal masking ratio calculating module for dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands and calculating, for each of them, the signal masking ratio of the tonal component and of the non-tonal component; the signal masking ratio is the number of decibels by which the masker must exceed the masked signal for tonal or non-tonal masking to be achieved;
the environmental noise characteristic extraction module is used for monitoring and analyzing the environmental noise to obtain the characteristics of the tone component and the non-tone component of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tone component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in a material library aiming at the characteristics of human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting final masking audio.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. Most current sound-masking methods use a masking model based on MPEG-1, but the MPEG-1 algorithm is premised on guaranteeing audio quality, so a large number of complex algorithms are applied to keep the audio free of distortion. Masking noise in daily life, however, does not require the noise itself to be reproduced faithfully. Compared with existing methods, the algorithm developed in this invention therefore has low computational complexity, high computing speed, a good masking effect and extremely low hardware requirements; it is not easily constrained by external conditions and is highly practical.
2. In terms of masking effect, the masking products currently on the market mostly generate white noise to mask environmental noise, which addresses the problem only at the physical level. White noise itself is often not a pleasant choice, and masking one noise with another rarely removes the psychological discomfort caused by the first. Research shows that music is relaxing, that the melodies of certain instruments support efficiency and memory in a working environment better than white noise, and that sounds such as thunderstorms and ocean waves are more pleasant than white noise, so many people also use music to mask noise. However, music chosen purely by personal preference mostly fails to achieve the best masking effect, and excessive volume may be used to compensate, damaging hearing over the long term. The invention therefore combines recent research and innovatively divides the masking sound into a combination of three parts: melody audio, background audio and auxiliary audio. The melody audio masks the tonal components of the noise while also helping to improve working efficiency, relax the mood or improve sleep; the background audio masks the non-tonal components while improving pleasantness; and the auxiliary audio further weakens the influence of the noise within the sensitive range of the human ear (2 kHz-4 kHz). The invention thus not only achieves physical masking at minimum volume and maximum efficiency, but also relieves, from a psychological perspective, the discomfort that noise causes.
3. The output schemes of the invention are diversified. The method is based on a masking audio material library, and the selection algorithm places few demands on the individual masking materials; the system itself ships with hundreds of authored or curated materials and, through combined recommendation, can flexibly output diversified masking schemes for all kinds of noise environments, so the effect does not fade through fatigue or over-familiarity after long-term use.
4. Compared with the anti-noise earplugs, noise reducers and noise-cancelling headphones on the market, the invention has low hardware requirements and low usage cost. No additional equipment is needed: an ordinary smartphone, a smart speaker with a microphone, a computer with a microphone and the like suffice to achieve noise masking, and the invention can be used flexibly in any noise environment.
5. Existing large-scale sound-masking systems designed for offices are expensive, unsuitable for individuals, and mainly designed to mask the voices of people speaking in open offices, so they do not transfer well to other noise environments. The invention is low-cost, flexible enough for individual use, and applicable to a variety of noise environments.
Drawings
FIG. 1 is a schematic diagram of frequency domain sound masking;
FIG. 2 (a) is a flow chart of the establishment of a library of masking audio material of the present embodiment;
FIG. 2 (b) is a flow chart of the present embodiment for the user to autonomously add audio to a library of masking audio material;
FIG. 3 is a diagram illustrating an example of a method of implementing an embodiment of the present invention;
FIG. 4 is a graph showing the frequency response of the thunderstorm sound recorded and post-processed in the present invention;
FIG. 5 is a schematic diagram of the frequency response of the transient ambient noise of the sound recording test;
fig. 6 is a masking schematic of the curves of three audio combinations versus noise data of fig. 5.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Examples
The invention provides a method for masking environmental noise from the psychoacoustic perspective of the human ear (i.e. the masking effect). The method comprises at least the steps of generating a masking audio material library, measuring the environmental noise, and recommending a noise mask, and aims to cover unwanted noise with more pleasant sounds such as white noise, pink noise, brown noise, instrumental melodies, atmosphere sounds and atmosphere music. The focus of this approach is not on processing the objective noise signal, but on changing the ear's subjective perception of the noise. An analogy: in a dark room, a single beam of light coming in is particularly noticeable; if the lamps in the room are turned on, the beam is still there but far less obvious. However, if the lamp light is itself glaring, the beam is hidden but the situation is no better overall, so one of the aims of the invention is to achieve the best masking, not merely any masking. Sound masking follows the same principle: the originally uncomfortable sound is masked with a more comfortable, gentler sound. This not only reduces the ear's unpleasant perception of the noise, but also relieves the irritation and aversion the noise causes psychologically.
The masking effect can be illustrated by an example: suppose the hearing threshold of sound A is 40 dB sound pressure level in a quiet environment. If sound B is played at the same time, the threshold of sound A rises to 54 dB because of the influence of B, i.e. 14 dB higher than before. We then call B the masking sound and A the masked sound. The rise in the threshold of the masked sound, in decibels, is called the masking amount: in this example the masking amount is 14 dB and the masking threshold is 54 dB.
Masking can be classified into frequency-domain masking and time-domain masking. Frequency-domain masking, also known as simultaneous masking, occurs when two sounds are emitted at the same time: the strong sound masks weak sounds in the nearby frequency range, because the ear's perception of the strong sound weakens its perception of the weak ones. As shown in fig. 1, a sound around 300 Hz with an intensity of about 60 dB masks sounds around 150 Hz and 500 Hz. Frequency-domain masking masks sounds at adjacent frequencies, while time-domain masking masks sounds adjacent in time, and is divided into pre-masking and post-masking according to the temporal order. Time-domain masking arises not in the ear itself but because the brain needs a certain time to process information, so sound signals received within that time window are mixed together, producing a masking effect.
For frequency-domain masking, Fletcher and other researchers experimentally established a set of pass bands, called critical bands, in order to quantify the frequency ranges over which a "masking tone" and a "masked tone" interact. Critical bands are a feature of human hearing; the critical-band rate is measured in "Bark", and each Bark corresponds to a fixed length of about 1 mm along the basilar membrane of the inner ear. That is, frequencies falling within one critical band, corresponding to such a fixed 1 mm length, are not distinguished by the ear, so the frequencies in that band can be treated as a single frequency component. Zwicker and others calculated the relationship between the frequency and bandwidth of the critical bands and divided the range 20 Hz-16 kHz into 25 critical subbands (Table 1).
Table 1 list of frequency ranges (lower limit, upper limit, bandwidth) for 25 critical subbands
[Table 1 image not reproduced here: the lower limit, upper limit and bandwidth of each of the 25 critical subbands.]
The relation between the critical frequency band z (f) and the frequency f is:
z(f) = 13·arctan(0.00076·f) + 3.5·arctan((f/7500)^2)  (z in Bark, f in Hz)
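As an illustration (not part of the patent text), the frequency-to-Bark relation can be evaluated directly; a minimal Python sketch:

```python
import math

def bark(f_hz: float) -> float:
    """Critical-band rate z(f) in Bark (Zwicker's approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

# 1 kHz sits near 8.5 Bark; the audible 20 Hz-16 kHz span covers roughly 24 Bark.
```

The integer part of z(f) gives the index of the critical subband a frequency falls into, which is how the 25 subbands below are addressed.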
The masking amount grows with the sound pressure level of the masker and with the similarity in frequency between masker and masked sound (it is largest within the same critical band), so the invention finds the maximum sound pressure level within each critical band and uses it as the masking threshold of that band. In addition, high-frequency sounds are easily masked by low-frequency sounds, especially loud ones, while low-frequency sounds are difficult to mask with high-frequency sounds; the invention therefore aims to reach maximum masking in each critical band so as to maximize the overall masking effect. In summary, it is necessary to analyze and calculate the masking threshold of each critical band of the noise source, and to find the audio or audio combination that comes as close as possible to maximum masking in every critical band, so as to achieve the best masking effect.
In order to accurately describe the physiological phenomenon of the ear's masking of sound, a masking model of the human ear must be established; the most widely used models at present are the Johnston masking model and the MPEG masking model of the ISO MPEG-1 standard. MPEG-1 applies two psychoacoustic models; taking psychoacoustic model 1 as an example, the model first computes the signal power density spectrum and then measures the subband sound pressure levels. Because tonal and non-tonal components have different masking capabilities, the model extracts the tonal and non-tonal components, calculates their masking thresholds separately, then determines the masking threshold of each subband, and finally computes a global masking threshold over the entire 20 Hz-16 kHz range.
Specifically, the algorithm flow of MPEG-1 model 1 is:
First, the audio signal is transformed into the frequency domain by FFT; it is then divided into critical bands, tonal and non-tonal components are distinguished, individual masking thresholds and a global masking threshold are calculated according to the frequency position and intensity of the signal components, and finally the signal-to-mask ratio is obtained.
The algorithm is specifically implemented as follows:
Let the power spectrum of the analyzed audio signal x(i) be P(k), with its maximum value normalized to the 96 dB reference sound pressure level (SPL). Since the masking models for tonal and non-tonal components differ, tonal and non-tonal components are first distinguished according to the power spectrum. A tonal component is a local maximum in the power spectrum, defined as:
S_T = { P(k) : P(k) > P(k-1), P(k) > P(k+1), P(k) > P(k±Δk) + 7 dB }
wherein the neighbourhood Δk widens with the frequency index k:

Δk ∈ {2}        for 2 < k < 63,
Δk ∈ {2, 3}     for 63 ≤ k < 127,
Δk ∈ {2, …, 6}  for 127 ≤ k ≤ 250.
According to the local maxima in S_T, the sound pressure level P_TM(k) of a tonal component can be calculated:

P_TM(k) = 10·log10( 10^(P(k-1)/10) + 10^(P(k)/10) + 10^(P(k+1)/10) )  (dB)
The sound pressure level of the non-tonal components, P_NM(k̄), is the sum of the residual signal power spectrum within each critical band, namely:

P_NM(k̄) = 10·log10( Σ_j 10^(P(j)/10) )  (dB)

where the sum runs over the spectral lines j of the critical band that were not counted as tonal,
and k̄ is the geometric mean spectral line of the critical band, namely:

k̄ = ( Π_{j=l}^{u} j )^(1/(u-l+1)),
where l is the lower critical band bound and u is the upper critical band bound.
According to P_TM, P_NM and T_q(k), the basic principle for processing tonal or non-tonal maskers is: a masker is retained only if P_TM,NM ≥ T_q(k), where T_q(k) is the absolute threshold (threshold in quiet) at line k, which can be approximated as:

T_q(f) = 3.64·(f/1000)^(-0.8) - 6.5·e^(-0.6·(f/1000 - 3.3)^2) + 10^(-3)·(f/1000)^4  (dB SPL)
the highest power masker is retained within a sliding window of 0.5Bark width.
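A minimal sketch of the tonal-masker selection described above (local-maximum test, +7 dB neighbourhood test, and power-summed SPL of the three adjacent lines). The exact index limits for Δk depend on the FFT size and follow the usual MPEG-1 model-1 convention here, which is an assumption:

```python
import numpy as np

def tonal_maskers(P):
    """Pick tonal maskers from a power spectrum P (dB per FFT line).

    A line k is tonal if it is a local maximum and exceeds its wider
    neighbourhood by 7 dB; its SPL sums the power of lines k-1, k, k+1.
    Returns {line index: masker SPL in dB}.
    """
    maskers = {}
    for k in range(3, len(P) - 7):
        if not (P[k] > P[k - 1] and P[k] > P[k + 1]):
            continue
        if k < 63:                 # neighbourhood widens with frequency
            dks = (2,)
        elif k < 127:
            dks = (2, 3)
        else:
            dks = (2, 3, 4, 5, 6)
        if all(P[k] > P[k + d] + 7 and P[k] > P[k - d] + 7 for d in dks):
            maskers[k] = 10 * np.log10(sum(10 ** (P[k + o] / 10) for o in (-1, 0, 1)))
    return maskers
```

Pruning by the absolute threshold and the 0.5 Bark sliding window would then be applied to the returned dictionary.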
After preprocessing the tonal and non-tonal masking thresholds, single tonal and non-tonal masking thresholds are calculated from the frequency domain masking effect of the Human Auditory System (HAS), the single tonal masking threshold being defined as:
T_TM(i,j) = P_TM(j) - 0.275·z(j) + SF(i,j) - 6.025
wherein P_TM(j) denotes the sound pressure level (SPL) of the tonal masker at j, z(j) is the Bark value of the frequency, and the spreading masking threshold function SF(i,j) from masker j to masked object i is defined as:

SF(i,j) = 17·Δz - 0.4·P_TM(j) + 11,               -3 ≤ Δz < -1
          (0.4·P_TM(j) + 6)·Δz,                    -1 ≤ Δz < 0
          -17·Δz,                                   0 ≤ Δz < 1
          (0.15·P_TM(j) - 17)·Δz - 0.15·P_TM(j),    1 ≤ Δz < 8
where Δz = z(i) - z(j) (in Bark). The threshold for a single non-tonal masker is defined as:
T_NM(i,j) = P_NM(j) - 0.175·z(j) + SF(i,j) - 2.025
wherein P_NM(j) denotes the sound pressure level (SPL) of the non-tonal masker at j, and SF(i,j) takes the same piecewise form as above with P_NM(j) in place of P_TM(j).
The global masking threshold T_g(i) is obtained by summing, in the power domain, the individual tonal and non-tonal masking thresholds and the absolute threshold:

T_g(i) = 10·log10( 10^(T_q(i)/10) + Σ_{l=1}^{L} 10^(T_TM(i,l)/10) + Σ_{m=1}^{M} 10^(T_NM(i,m)/10) )

where T_q(i) is the absolute threshold, T_TM(i,l) are the masking thresholds of the single tonal maskers with L the number of tonal maskers, and T_NM(i,m) are the masking thresholds of the single non-tonal maskers with M the number of non-tonal maskers.
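The power-domain summation, together with the Terhardt-style approximation of the absolute threshold given earlier, can be sketched as follows; the function names are illustrative, not from the patent:

```python
import numpy as np

def absolute_threshold_db(f_hz):
    """Threshold in quiet (Terhardt approximation), dB SPL; f_hz in Hz."""
    f = np.asarray(f_hz, dtype=float) / 1000.0
    return 3.64 * f ** -0.8 - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2) + 1e-3 * f ** 4

def global_threshold(tq_db, tonal_db, nontonal_db):
    """Power-sum the absolute threshold with the individual tonal and
    non-tonal masking thresholds at one frequency line (all in dB)."""
    powers = 10 ** (np.asarray([tq_db, *tonal_db, *nontonal_db]) / 10)
    return 10 * np.log10(powers.sum())
```

Because the summation happens in the power domain, the global threshold is dominated by the strongest contributor, which matches the intuition that one loud masker sets the audibility floor.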
The MPEG-1 algorithm is mainly used for audio compression: it calculates the tonal and non-tonal masking thresholds under the constraint that the audio remains perceptually undistorted, discards the tonal and non-tonal components that are masked as well as the low- and high-frequency components the ear is insensitive to, and re-encodes and compresses the audio on that basis, which requires a large amount of computation. For masking a noise environment, the key point is the masking effect, not signal reconstruction, so the method of the invention optimizes and simplifies the MPEG-1 algorithm and achieves a better masking effect while greatly reducing computation.
Referring to fig. 1, a method for masking environmental noise in this embodiment includes the steps of:
S1, establishing a masking audio material library, extracting and classifying the characteristics of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
S2, dividing the 20 Hz-16 kHz frequency range into 25 critical subbands, and calculating the signal masking ratios of the tonal components and of the non-tonal components for the 25 critical subbands; the signal masking ratio is the number of decibels by which a masker must exceed the masked component for tonal or non-tonal masking to be achieved;
S3, monitoring and analyzing the environmental noise to obtain the characteristics of the tonal components and the non-tonal components of the environmental noise;
S4, recommending background audio from the material library according to the non-tonal component characteristics of the noise; recommending melody audio from the material library according to the tonal component characteristics of the noise; recommending auxiliary audio from the material library according to the characteristics of the human ear; and combining the recommended background, melody and auxiliary audio to output the final masking audio.
In step S1, the masking audio material library is built from sounds commonly used for noise masking that have high psychological comfort: white noise, pink noise, brown noise, instrumental melodies, atmosphere sounds (e.g. bird song, thunder, café ambience) and atmosphere music (e.g. tropical-rainforest scene music combined with natural rainforest ambience sounds). The masking sounds in the material library are divided into three audio classes according to the distribution of their frequency response curves and their category: background class (masks the non-tonal components of noise; the per-second frequency response is relatively uniform, e.g. flowing water, white noise, pink noise), melody class (masks the tonal components of noise, mostly melodies, e.g. piano or harp melodies), and auxiliary class (has relatively high sound pressure levels and relatively dense frequency content in the 2 kHz-4 kHz range).
Noise is random in time, so one key to masking it is that the frequency response of the masking audio must be distributed evenly across each second, ensuring the noise is masked over the whole time sequence. In the past, noise masking was mostly done by directly generating white noise, because the sound pressure level of white noise is uniform across the audible range and its output is the same every second. White noise can cover noise at the physical level, but it sounds like hiss, its comfort is poor, and it is difficult to use for long periods. Many studies have shown that natural atmosphere sounds such as flowing water, whose sound pressure level and frequency content are evenly distributed over the whole time sequence, can achieve the masking effect of white noise while being far more pleasant, and can additionally improve mood, working efficiency and sleep. Therefore, besides white noise, the masking material library of this embodiment also selects and produces a variety of optimized natural or artificial atmosphere sounds, atmosphere music and instrument melodies, improving the flexibility of the system.
Referring to fig. 2 (a), in step S1, feature extraction is performed for audio in a masking audio material library, specifically:
S101. The audio in this embodiment is stored by default as PCM-encoded wav at a 44 kHz sampling rate, 16-bit, two channels, ensuring the audio data is not distorted. Each wav audio material is first read second by second from its PCM-encoded data.
S102. The PCM-encoded data of each second is converted into a frequency response curve (vertical axis: sound pressure level in dB; horizontal axis: frequency in Hz) by the fast Fourier transform (FFT), with the maximum value normalized to the 96 dB reference sound pressure level. The transform

X(k) = Σ_{n=0}^{N-1} x(n)·e^(-j2πkn/N), k = 0, 1, …, N-1

converts the time-domain signal into a frequency-domain signal, so the different frequency components in the sound and their respective sound pressure levels can be quickly identified. For example, fig. 4 is the frequency response curve of a thunderstorm sound we recorded and post-processed, where the horizontal axis is frequency and the vertical axis is the sound pressure level (dB) at the corresponding frequency.
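The per-second FFT with 96 dB normalization of step S102 might look like the following sketch; the Hann window and the exact sample-rate handling are assumptions, since the patent does not specify them:

```python
import numpy as np

def spectrum_db(pcm, fs=44000, ref_db=96.0):
    """One second of mono PCM samples -> (freqs in Hz, sound pressure level
    in dB), with the spectrum's maximum normalized to the reference level."""
    x = np.asarray(pcm, dtype=float)
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    mag = np.maximum(mag, np.finfo(float).tiny)    # guard against log10(0)
    db = 20 * np.log10(mag)
    db += ref_db - db.max()                        # normalize peak to ref_db
    return np.fft.rfftfreq(len(x), d=1.0 / fs), db
```

With one second of samples at fs Hz, the bin spacing is exactly 1 Hz, matching the 1 Hz resolution the embodiment aims for.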
S103. The main range of human hearing is 20 Hz-16 kHz, so this embodiment mainly analyzes that frequency interval. The 20 Hz-16 kHz portion of the frequency response curve of each second is divided into 25 critical subbands (see Table 1). A critical band behaves like a bandpass filter of human hearing: for sounds within the same critical subband, the loudness heard by the ear is the same regardless of frequency provided the objective sound pressure level is unchanged, so the frequencies within one critical band can be treated as a single frequency component.
S104. The frequency response of each of the 25 critical subbands within the second is analyzed to obtain the maximum sound pressure level in each subband.
S105. The analysis continues second by second through a window function until the full length of the audio has been analyzed. For example, for an audio of L seconds (if the audio is 13.45 seconds, L is 14; a fractional remainder is rounded up to the next second), each second yields 25 per-band sound pressure level maxima, so L seconds yield an [L x 25] feature matrix.
S106. The L groups of data in each of the 25 subbands are compared to obtain, for each band, the maximum Max (the largest of the L values), the mean Avg (the average of the L values) and the standard deviation Sd (a measure of the spread of the L values) of the sound pressure level, finally yielding a [25 x 3] feature matrix (maximum, mean, standard deviation) for the masking audio.
S107. Each audio thus yields a [25 x 3] matrix, which is used as the first masking feature matrix of that audio.
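Steps S103-S107 can be sketched as follows. Because the exact band limits of Table 1 are not reproduced here, the subband index is approximated from the Bark formula; that mapping, and the helper names, are assumptions:

```python
import math
import numpy as np

def band_index(f_hz):
    """Approximate critical-subband index (0..24) via the Bark formula,
    standing in for the exact Table 1 limits (not reproduced here)."""
    z = 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)
    return min(24, max(0, int(z)))

def first_masking_features(per_second_db, freqs):
    """per_second_db: [L x n_bins] dB spectra, one row per second.
    Returns the [25 x 3] matrix of (Max, Avg, Sd) of the per-second
    band maxima, as in steps S104-S107."""
    bands = np.array([band_index(f) for f in freqs])
    keep = (freqs >= 20) & (freqs <= 16000)
    L = len(per_second_db)
    band_max = np.full((L, 25), -np.inf)
    for s in range(L):
        for b in range(25):
            sel = keep & (bands == b)
            if sel.any():
                band_max[s, b] = per_second_db[s][sel].max()
    return np.stack([band_max.max(0), band_max.mean(0), band_max.std(0)], axis=1)
```

The resulting [25 x 3] matrix is exactly the shape the classification rule of S108 consumes.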
S108. The invention divides masking audio into three classes: background audio, melody audio and auxiliary audio. Background audio is mainly used for continuous sound coverage and must vary little in level while being steady and uninterrupted; concretely, in the [25 x 3] feature matrix, all 25 Sd values must be below a preset first threshold (steady and uninterrupted) and every group's Max-Avg must be below a preset second threshold (small variation). Melody audio is mainly used for tonal masking and must vary sufficiently in level while remaining steady and uninterrupted; concretely, all 25 Sd values must be below the preset first threshold and every group's Max-Avg must be >= the preset second threshold (sufficiently large variation). Auxiliary audio supplements the tonal masking of the melody audio; it only needs to provide tonal masking at specific frequencies, i.e. satisfy Max-Avg >= the preset second threshold in each group (so every melody audio also qualifies as auxiliary audio, but an auxiliary audio is not necessarily a melody audio). Audio that satisfies none of the three conditions is unsuitable as masking audio. The audio shown in fig. 4 (thunderstorm sound) satisfies the background conditions and is classified as background audio.
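A sketch of the classification rule of step S108. The two preset thresholds are left unspecified in the patent, so the values below (sd_thresh, var_thresh) are purely hypothetical:

```python
import numpy as np

def classify(features, sd_thresh=6.0, var_thresh=9.0):
    """Classify a [25 x 3] (Max, Avg, Sd) feature matrix into audio classes.
    Threshold values are hypothetical; the patent calls them 'preset'.
    Returns a set drawn from {"background", "melody", "auxiliary"}."""
    mx, avg, sd = features[:, 0], features[:, 1], features[:, 2]
    steady = np.all(sd < sd_thresh)                 # stable and uninterrupted
    swing = mx - avg                                # per-band variation
    classes = set()
    if steady and np.all(swing < var_thresh):
        classes.add("background")
    if steady and np.all(swing >= var_thresh):
        classes.add("melody")
    if np.all(swing >= var_thresh):                 # melody audio also qualifies
        classes.add("auxiliary")
    return classes
```

An empty return set corresponds to the "unsuitable as masking audio" case of S108.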
In practical use, users can also add their favourite sounds to the masking material library, as shown in fig. 2(b). To guarantee the integrity of the audio data used in the calculations, the invention requires all audio in the masking audio material library to be PCM-encoded wav at a 44 kHz sampling rate, 16-bit, two channels. The format of user-added audio is therefore checked first; if it is not in this format, the system automatically converts it, and if conversion fails, the user is prompted to supply a different audio file. For a correctly formatted file, steps S101-S108 are executed; if the audio satisfies the conditions of any of the three classes (background, melody, auxiliary), it is uploaded and saved locally for the user, otherwise the message "unsuitable as masking audio, upload failed" is returned.
In contrast to the MPEG-1 algorithm, step S2 of the invention focuses on calculating the signal masking ratios of the tonal components and the non-tonal components of the 25 critical subbands, as follows:
S201. First, the signal masking ratio of the tonal components: T_TM(i,j) = P_TM(j) - 0.275·z(j) + SF(i,j) - 6.025, where T_TM(i,j) is the masking threshold of a single tone, P_TM(j) is the sound pressure level of the tonal masker at j, and SF(i,j) is the spreading masking threshold from masker j to masked object i. Because masking between masker and masked object is strongest within the same critical band, the invention ignores interactions between different critical bands, which avoids the extra computation of masking superposition, saves computing resources and improves efficiency.
Given

SF(i,j) = 17·Δz - 0.4·P(j) + 11,            -3 ≤ Δz < -1
          (0.4·P(j) + 6)·Δz,                 -1 ≤ Δz < 0
          -17·Δz,                             0 ≤ Δz < 1
          (0.15·P(j) - 17)·Δz - 0.15·P(j),    1 ≤ Δz < 8,
within the same critical subband Δz = z(i) - z(j) = 0, so SF(i,j) = -17·Δz = 0, i.e. T_TM(i,j) = P_TM(j) - 0.275·z(j) - 6.025, where z(j) = [0, 1, 2, …, 24]. In other words, masking is achieved when the sound pressure level of the tonal masker is 0.275·z(j) + 6.025 dB higher than that of the tonal component of the masked object; the signal masking ratio of the tonal component is therefore 0.275·z(j) + 6.025. Calculating this ratio for every band yields the second masking feature matrix of the tonal components. The principle is shown in fig. 1, and the per-band results are listed in table 2.
TABLE 2 Signal mask ratio of tone components at 25 critical subbands
z(j): 0, 1, 2, …, 24; signal masking ratio 0.275·z(j) + 6.025 dB, i.e. 6.025 dB at z = 0, rising linearly to 12.625 dB at z = 24.
S202. Second, the signal masking ratio of the non-tonal components: T_NM(i,j) = P_NM(j) - 0.175·z(j) + SF(i,j) - 2.025. As above, within the same critical band Δz = z(i) - z(j) = 0, so SF(i,j) = -17·Δz = 0 and T_NM(i,j) = P_NM(j) - 0.175·z(j) - 2.025, where z(j) = [0, 1, 2, …, 24]. Here T_NM(i,j) is the masking threshold of a single non-tonal masker, P_NM(j) is the sound pressure level of the non-tonal masker at j, and SF(i,j) is the spreading masking threshold from masker j to masked object i. That is, within the same critical band, absolute masking is achieved when the sound pressure level of the non-tonal masker is 0.175·z(j) + 2.025 dB higher than that of the non-tonal component of the masked object; the signal masking ratio of the non-tonal component is therefore 0.175·z(j) + 2.025. Calculating this ratio for every band yields the third masking feature matrix of the non-tonal components. The principle is shown in fig. 1, and the per-band results are listed in table 3.
TABLE 3 Signal mask ratio of non-tonal components at 25 critical subbands
z(j): 0, 1, 2, …, 24; signal masking ratio 0.175·z(j) + 2.025 dB, i.e. 2.025 dB at z = 0, rising linearly to 6.225 dB at z = 24.
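Since both ratios are linear in the band index z, tables 2 and 3 can be regenerated in a few lines; this helper is illustrative only:

```python
def signal_masking_ratios():
    """Per-band signal masking ratios (dB) for z = 0..24:
    tonal 0.275*z + 6.025 (table 2), non-tonal 0.175*z + 2.025 (table 3)."""
    z = range(25)
    tonal = [0.275 * k + 6.025 for k in z]
    non_tonal = [0.175 * k + 2.025 for k in z]
    return tonal, non_tonal
```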
In step S1, feature extraction and classification of all audio in the material library is performed once when the library is built; later, only newly uploaded material is processed, and the results are stored for subsequent use. Similarly, the second and third masking feature matrices obtained in step S2 need to be calculated only once, and the results are saved for subsequent use.
In actual use, the first masking feature matrices (the feature parameters of all audio in the masking material library) are loaded, together with the second masking feature matrix (tonal components) and the third masking feature matrix (non-tonal components) of the 25 critical subbands. The environmental noise is then monitored and analyzed as follows:
S301. Noise is collected, and the analog sound signal is converted into PCM-encoded data.
Ambient noise data can be collected by any device with a microphone: a smartphone, tablet, smart speaker, PC with a microphone, or other noise-monitoring instrument. In this embodiment, data was collected with a P9 Plus (4 GB/128 GB) Android smartphone and an iPhone 6s (128 GB). The sampling parameters were set as follows: a 44 kHz sampling rate, 16-bit sample depth and two channels, to capture the sound with maximum fidelity. The data collected per second is 44k x 16 x 2 / 8 = 176 kB. A common device has a mono microphone whose signal is stored in both channels, so the left- and right-channel data are the same. The microphone of an Android phone can only receive signals in the 20 Hz-20 kHz range, and the high sampling rate ensures that as little data as possible is lost.
S302. The phone converts the collected audio signal into PCM format and transmits it for analysis and calculation; spectrum conversion by FFT yields the sound pressure level with a resolution of 1 Hz. The FFT is performed twice per second and the data accumulated, finally producing a frequency response curve for 0 Hz-20 kHz (horizontal axis: frequency/Hz; vertical axis: sound pressure level/dB) with the maximum normalized to the 96 dB reference sound pressure level. The transform

X(k) = Σ_{n=0}^{N-1} x(n)·e^(-j2πkn/N), k = 0, 1, …, N-1

converts the time-domain signal into a frequency-domain signal, so the different frequency components in the sound and their respective sound pressure levels can be rapidly identified. For example, fig. 5 is the frequency response curve of instantaneous environmental noise during a recording test, where the horizontal axis is frequency and the vertical axis is the sound pressure level (dB) at that frequency.
S303. The 20 Hz-16 kHz portion of each second's frequency response curve is divided into 25 critical subbands and analyzed immediately.
S304. Since noise may contain both tonal and non-tonal components, they are calculated separately. The maximum sound pressure level of each critical band is taken as the tonal masking threshold TM (in dB) of that band, and the average of the remaining values in the band (excluding the maximum) is taken as its non-tonal masking threshold NM (in dB).
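Step S304 reduces each critical band to a (TM, NM) pair; a minimal sketch, assuming that ties at the maximum are resolved by dropping a single instance:

```python
import numpy as np

def band_thresholds(band_spl):
    """band_spl: dB values of the spectral lines in one critical band.
    TM = band maximum; NM = mean of the remaining values (step S304)."""
    v = np.sort(np.asarray(band_spl, dtype=float))
    tm = v[-1]
    nm = v[:-1].mean() if len(v) > 1 else tm
    return tm, nm
```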
S305. Noise collection continues and the data is analyzed through a window function: for example, after collecting L seconds, the per-second maxima, means and standard deviations of the sound pressure level across the 25 critical subbands form an [L x 25 x 3] feature matrix.
S306. The L groups of data in each of the 25 subbands are compared to obtain, for each band, the maximum (the largest of the L values), the mean (the average of the L values) and the standard deviation (a measure of the spread of the L values), giving a noise feature matrix of size [25 x 3 x 2] (one [25 x 3] block for TM, one for NM). All TM values (size [25 x 3]) are then added to the signal masking ratios of the tonal components of the corresponding 25 critical bands in table 2 (the second masking feature matrix), and all NM values (size [25 x 3]) are added to the signal masking ratios of the non-tonal components of the corresponding 25 critical bands in table 3 (the third masking feature matrix). This finally yields the sound pressure level threshold matrix that the masking audio must reach to mask the tonal components of the environmental noise (the fourth masking feature matrix, size [25 x 3]) and the sound pressure level threshold matrix required to mask the non-tonal components (the fifth masking feature matrix, size [25 x 3]).
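A sketch of the final step of S306, adding the per-band signal masking ratios to the noise statistics. Following the patent's wording, the ratio is added to all three statistics (maximum, mean and standard deviation) of each band; the function name is illustrative:

```python
import numpy as np

def required_masking_levels(noise_tm, noise_nm):
    """noise_tm, noise_nm: [25 x 3] (Max, Avg, Sd) statistics of the tonal
    and non-tonal noise thresholds. Adding the per-band signal masking
    ratios yields the sound pressure levels a masking audio must reach
    (the fourth and fifth masking feature matrices)."""
    z = np.arange(25)
    smr_t = (0.275 * z + 6.025)[:, None]       # tonal ratio, table 2
    smr_n = (0.175 * z + 2.025)[:, None]       # non-tonal ratio, table 3
    return np.asarray(noise_tm) + smr_t, np.asarray(noise_nm) + smr_n
```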
Step S4 recommends a masking scheme according to the characteristics of the noise combined with the features of the audio material library. The scheme comprises three parts: background audio is recommended from the material library according to the non-tonal characteristics of the noise, suitable melody audio is screened according to the tonal characteristics of the noise, and suitable auxiliary audio is selected according to the characteristics of the human ear. The steps are described below.
The method for screening the proper background audio according to the non-tone characteristic of the noise comprises the following steps:
S411. According to the fifth masking feature matrix of the non-tonal components of the noise, the first masking feature matrix of each background-class audio in the masking material library is compared one by one;
S412. In each 1-to-1 audio comparison, across the 25 critical subbands, the total standard deviation P of the per-band differences between the mean sound pressure level of the background audio and the mean sound pressure level threshold of the non-tonal components of the environmental noise is calculated.
S413. In the same 1-to-1 audio comparison, for each of the 25 critical subbands, a count of 1 is recorded whenever the maximum sound pressure level of the background audio exceeds the maximum sound pressure level threshold of the non-tonal component of the environmental noise; after all 25 subbands have been compared, the total count is N.
S414. In the same 1-to-1 audio comparison, for each of the 25 critical subbands, the difference between the sound pressure level standard deviation of the background audio and the standard deviation of the non-tonal noise threshold is calculated, and the 25 differences are summed to give the value D.
S415. After these calculations, every background audio has a feature-value triple [P, N, D] with respect to the noise. On the basis of giving priority to the per-band maxima of the 25 critical bands, unnecessary masking amounts are reduced appropriately (a high P indicates good fit of the masking curve), the number of bands whose maxima achieve coverage (N) is considered, and the frequency fit between masking audio and noise (D) is taken into account, so that a sufficient masking effect is achieved while the amount of computation is reduced.
The [P, N, D] values of all background-class audios are then compared by priority: 1) P is compared first, and the audio with the larger P is ranked higher; 2) if two audios have equal P, their N values are compared, and the audio with the larger N is ranked higher; 3) if the N values are also equal, the D values are compared, and the audio with the larger D is ranked higher. If two audios have identical P, N and D values, they share the same priority. In this way, the numbers of the top 50 background audios are selected as the first masking audio candidate set B.
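The priority ordering above maps directly onto lexicographic tuple comparison; a minimal sketch (the function name and dict layout are illustrative):

```python
def rank_candidates(scores, top_k=50):
    """Order audio numbers by their (P, N, D) triples, comparing P first,
    then N, then D (larger values ranked higher), and keep the top_k.
    scores: dict mapping audio number -> (P, N, D)."""
    # Python compares tuples element by element, which matches the
    # P-then-N-then-D priority rule exactly.
    return sorted(scores, key=lambda num: scores[num], reverse=True)[:top_k]
```

Candidates with identical triples keep a stable relative order, mirroring the "same priority" case.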
Suitable melody audio is screened according to the tonal features of the noise, as follows:
S421, audio to mask the tonal components of the noise is selected from the melody-class audio material library: according to the fourth masking feature matrix of the tonal component of the noise audio, compare it one by one against the first masking feature matrix of each melody-class audio in the masking material library.
S422, in each 1-to-1 audio comparison, across the 25 critical sub-bands, calculate the total standard deviation P of the per-band differences between the mean sound pressure level of the melody audio and the mean sound-pressure-level threshold of the tonal component of the environmental noise.
S423, in the corresponding 1-to-1 audio comparison, for each of the 25 critical sub-bands, add 1 to a counter whenever the maximum sound pressure level of the melody audio exceeds the maximum sound-pressure-level threshold of the tonal component of the environmental noise; after all 25 critical sub-bands have been compared, the total count is N.
S424, in the corresponding 1-to-1 audio comparison, calculate in each of the 25 critical sub-bands the difference between the sound-pressure-level standard deviation of the melody audio and the sound-pressure-level-threshold standard deviation of the tonal component of the environmental noise, then sum the 25 differences to obtain the value D.
After these calculations, each melody audio has a feature-value triple [P, N, D] with respect to the noise. As with the background audio, P reflects how closely the masking curve fits, N how many of the 25 bands achieve power coverage, and D the frequency fit between masking audio and noise, so that a sufficient masking effect is achieved while keeping the amount of computation small.
The [P, N, D] values of all melody-class audios are compared by the same priority rule: P first, then N, then D, with larger values ranked higher, and identical triples sharing the same priority. In this way, the numbers of the top 50 melody audios are selected as the second masking audio candidate set M.
Suitable auxiliary audio is selected for the characteristics of the human ear, as follows:
S431, studies of noises that the human ear finds uncomfortable have shown that once the components in the 2 kHz-4 kHz range are removed, the noise becomes far more acceptable, indicating that 2 kHz-4 kHz is a frequency range to which the human ear is especially sensitive. On this basis, the invention applies enhanced masking to this region, optimizing the noise-masking effect of the whole system.
In this embodiment, masking audio targeting the human-ear characteristics is selected from the auxiliary-class audio of the masking material library: in addition to the overall masking over 20 Hz-16 kHz, enhanced masking is applied in the most sensitive range of the human ear (2 kHz-4 kHz) to achieve a better masking effect. The 2 kHz-4 kHz range covers 5 of the 25 critical sub-bands (critical sub-bands No. 13 to No. 17).
S432, because the primary purpose of the auxiliary audio class is to enhance masking in the 2 kHz-4 kHz range, both the tonal and the non-tonal components of the noise must be covered there. Since the signal-to-mask ratio of tonal components is higher than that of non-tonal components, the invention uses the higher tonal signal-to-mask ratio in this calculation, guaranteeing the masking effect.
S433, according to the fourth masking feature matrix of the tonal component of the noise and the first masking feature matrix of each auxiliary audio, extract the sixth and seventh masking feature matrices of the noise audio and the auxiliary audio, respectively, restricted to critical sub-bands No. 13 to No. 17. The seventh masking feature matrix of each auxiliary-class audio is then compared one by one with the sixth masking feature matrix of the tonal component of the noise over these 5 critical sub-bands.
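Restricting the 25-band matrices to critical sub-bands No. 13-17 (step S433) amounts to a simple slice; this sketch assumes a list-per-band layout with bands numbered 1 to 25:

```python
def submatrix_bands(feature_matrix, first=13, last=17):
    """Extract the rows for critical sub-bands No. first..last (defaults
    cover the 2 kHz-4 kHz region) from a 25-band masking feature matrix
    stored as a list indexed 0..24 for bands 1..25."""
    return feature_matrix[first - 1:last]
```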
S434, in each 1-to-1 audio comparison, across the 5 critical sub-bands, calculate the total standard deviation P of the per-band differences between the mean sound pressure level of the auxiliary audio and the mean sound-pressure-level threshold of the tonal component of the environmental noise.
S435, in the corresponding 1-to-1 audio comparison, for each of the 5 critical sub-bands, add 1 to a counter whenever the maximum sound pressure level of the auxiliary audio exceeds the maximum sound-pressure-level threshold of the tonal component of the environmental noise; after all 5 critical sub-bands have been compared, the total count is N.
S436, in the corresponding 1-to-1 audio comparison, calculate in each of the 5 critical sub-bands the difference between the sound-pressure-level standard deviation of the auxiliary audio and the sound-pressure-level-threshold standard deviation of the tonal component of the environmental noise, then sum the 5 differences to obtain the value D.
S437, after these calculations, each auxiliary audio has a feature-value triple [P, N, D] with respect to the noise. The [P, N, D] values of all auxiliary-class audios are compared to establish a priority order: 1) P is compared first, and the audio with the larger P is ranked higher; 2) if two audios have equal P, their N values are compared, and the audio with the larger N is ranked higher; 3) if the N values are also equal, the D values are compared, and the audio with the larger D is ranked higher. If two audios have identical P, N and D values, they share the same priority. In this way, the numbers of the top 50 auxiliary audios are selected as the third masking audio candidate set S.
After the recommendation steps, the background audio candidate set B, the melody audio candidate set M and the auxiliary audio candidate set S have been obtained. The highest-priority audio combination is then selected from B, M and S, and the result is output. Beyond the psychoacoustic masking effect, the combination of one background sound, one melody and one detail-auxiliary audio is chosen so that the result is pleasant in tone and psychologically comfortable, giving both noise masking and psychological relief; the user can also customize the playing time.
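A minimal sketch of the final combination step, assuming each candidate set is already sorted by priority (the tone-pleasantness weighting mentioned above is not modeled here):

```python
def final_combination(background_set, melody_set, auxiliary_set):
    """Pick the highest-priority background, melody and auxiliary audio
    numbers from the candidate sets B, M and S to form the output
    masking combination; returns None if any set is empty."""
    if not (background_set and melody_set and auxiliary_set):
        return None
    return (background_set[0], melody_set[0], auxiliary_set[0])
```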
Based on the above-mentioned method for masking environmental noise, a corresponding device for masking environmental noise can be obtained, which comprises:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying features of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
a signal-to-mask ratio calculation module for dividing the 20 Hz-16 kHz frequency range into 25 critical sub-bands and calculating, for each of the 25 critical sub-bands, the signal-to-mask ratio of the tonal component and the signal-to-mask ratio of the non-tonal component; the signal-to-mask ratio indicates how many decibels louder than the masked object a tonal or non-tonal masker must be in order to achieve masking;
The environmental noise characteristic extraction module is used for monitoring and analyzing the environmental noise to obtain the characteristics of the tone component and the non-tone component of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tone component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in a material library aiming at the characteristics of human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting final masking audio.
The masking audio material library feature extraction module and the signal masking ratio calculation module are preloaded modules, the environmental noise feature extraction module and the masking audio generation module are arranged on a handheld terminal or a fixed device of a user, and specific contents of the modules are referred to in the detailed description of the method part and are not repeated here.
This embodiment provides a device comprising a mobile phone or smart speaker with a microphone, in which the masking audio library of the above environmental-noise masking apparatus is preset. In use, the device is started, the feature parameters of all audios in the masking audio material library are loaded, the central processing unit of the phone or smart speaker is engaged, and its microphone samples the environmental noise. Then, in the device's preset mode, the ambient noise is spectrum-transformed to obtain the frequency response curve of the sound source, the maximum decibel value is displayed in the system in real time, and the feature data of the sound source are produced by statistical analysis while audio data continue to be collected. Finally, suitable masking audio is screened according to the noise features, the selected audios are combined and matched, and the final masking audio combination is output.
As can be seen from the above, the invention can run on conventional hardware such as microphones and in-vehicle systems: any system with a sound collection device and a playback device suffices, and no additional hardware is required. The existing resources of the mobile phone or smart speaker are thereby fully utilized, broadening the practical applicability of the invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, for example, the division of the unit modules is merely a logic function division, there may be another division manner when actually implemented, or units having the same function may be integrated into one unit, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The functional units in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units may be stored in a storage medium if implemented in the form of software functional units and sold or used as stand-alone products. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or other media capable of storing program code.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made and equivalents will be apparent to those skilled in the art without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (12)

1. A method of masking ambient noise, comprising the steps of:
establishing a masking audio material library, extracting and classifying the characteristics of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
dividing a frequency part of 20Hz-16kHz into 25 critical sub-bands, and respectively calculating signal masking ratios of tone components and signal masking ratios of non-tone components of the 25 critical sub-bands;
monitoring and analyzing the environmental noise to obtain the characteristics of the tone component and the non-tone component of the environmental noise;
recommending background audio in a material library according to the non-tone component characteristics of noise;
recommending melody audio in the material library according to the tone component characteristics of the noise;
Recommending auxiliary audio in a material library aiming at the characteristics of human ears;
and combining the recommended background audio, melody audio and auxiliary audio, and outputting final masking audio.
2. The masking method of ambient noise according to claim 1, wherein the library of masking audio material is created as follows: white noise, pink noise, brown noise, musical melodies of various instruments, ambient sounds and ambient music are selected as audio materials, from which the material library is constructed.
3. The masking method of environmental noise according to claim 1, wherein after the masking audio material library is established, feature extraction is performed on each audio material, the method comprising:
the PCM encoded data of the audio material is read in seconds;
converting PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
dividing a 20Hz-16kHz portion of the frequency response curve into 25 critical sub-bands;
analyzing the 25 critical sub-bands to obtain the maximum value of the sound pressure level in each sub-band;
continuously analyzing second after second through a window function until the full length of the audio has been analyzed, obtaining a feature matrix of [L x 25], wherein L represents the length of the audio in seconds and the parameter values in the matrix are the maximum sound pressure level in each sub-band;
And solving the maximum value Max, the average value Avg and the standard deviation Sd of the sound pressure level under each frequency band to obtain a first masking feature matrix of the audio.
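As an illustrative sketch (not part of the claims), the per-band Max, Avg and Sd statistics of claim 3 could be derived from the [L x 25] matrix as follows; the list-of-rows layout is an assumption, and the FFT and critical-band split are taken as done upstream:

```python
import math

def first_masking_features(band_spl_rows):
    """Build the first masking feature matrix from an [L x 25] matrix of
    per-second maximum sound pressure levels. Returns, for each of the
    25 critical sub-bands, the (Max, Avg, Sd) of SPL over the L seconds."""
    feats = []
    for band in zip(*band_spl_rows):            # iterate columns (sub-bands)
        avg = sum(band) / len(band)
        sd = math.sqrt(sum((v - avg) ** 2 for v in band) / len(band))
        feats.append((max(band), avg, sd))
    return feats
```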
4. A method of masking ambient noise according to claim 3, wherein each audio material is classified according to its characteristics by:
if the standard deviation Sd value under all the frequency bands is smaller than a preset first threshold value and Max-Avg under each frequency band is smaller than a preset second threshold value, the audio material is background audio;
if the standard deviation Sd values under all the frequency bands are smaller than a preset first threshold value, and Max-Avg > = a preset second threshold value under each frequency band, the audio material is melody audio;
if the standard deviation Sd value in all the frequency bands is not less than the preset first threshold, but Max-Avg > =preset second threshold in each frequency band, the audio material is auxiliary audio.
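An illustrative (non-claimed) sketch of the classification rule of claim 4; the threshold values passed in are application-specific assumptions:

```python
def classify_material(feats, sd_thresh, peak_thresh):
    """Classify an audio material as background / melody / auxiliary from
    its first masking feature matrix, a list of 25 (Max, Avg, Sd) tuples.
    sd_thresh and peak_thresh are the preset first and second thresholds."""
    steady = all(sd < sd_thresh for _, _, sd in feats)          # Sd small in all bands
    flat = all(mx - avg < peak_thresh for mx, avg, _ in feats)  # Max-Avg small everywhere
    peaky = all(mx - avg >= peak_thresh for mx, avg, _ in feats)
    if steady and flat:
        return "background"
    if steady and peaky:
        return "melody"
    if not steady and peaky:
        return "auxiliary"
    return None  # no class matched
```

A material matching none of the three patterns would be rejected, corresponding to the "unsuitable as masking audio" case of claim 5.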
5. The method for masking ambient noise according to claim 1, wherein the audio in the library of masking audio material is uploaded by the user himself, and the following processing is performed for the audio uploaded by the user himself: converting the uploaded audio into a prescribed format; the audio material is then classified, and if the classification is successful, the audio material is stored locally to the user, otherwise, the information that the audio is unsuitable as masking audio is output.
6. A masking method of ambient noise according to claim 3, wherein the signal masking ratios of the tonal components and the signal masking ratios of the non-tonal components of the 25 critical subbands are calculated separately by:
first, the signal-to-mask ratio of the tonal component: T_TM(i,j) = P_TM(j) - 0.275·z(j) + SF(i,j) - 6.025, where T_TM(i,j) is the masking threshold of a single tone, P_TM(j) is the sound pressure level of the tonal masker at j, and SF(i,j) is the extended masking threshold from masker j to masked object i; since masking is strongest within the same critical band as the masked object, the mutual influence of different critical bands is ignored, and within the same critical band SF(i,j) = -17·(z(i) - z(j)) = 0, so that T_TM(i,j) = P_TM(j) - 0.275·z(j) - 17·(z(i) - z(j)) - 6.025 = P_TM(j) - 0.275·z(j) - 6.025, where z(j) = [0, 1, 2, …, 24]; that is, the sound pressure level of the tonal masker must be 0.275·z(j) + 6.025 dB higher than that of the tonal component of the masked object, so the signal-to-mask ratio of the tonal component is 0.275·z(j) + 6.025; calculating the signal-to-mask ratio of the tonal component for each band yields the second masking feature matrix of the tonal component;
second, the signal-to-mask ratio of the non-tonal component: T_NM(i,j) = P_NM(j) - 0.175·z(j) + SF(i,j) - 2.025, where T_NM(i,j) is the masking threshold of a single non-tonal masker, P_NM(j) is the sound pressure level of the non-tonal masker at j, and SF(i,j) is the extended masking threshold from masker j to masked object i; within the same critical band, T_NM(i,j) = P_NM(j) - 0.175·z(j) - 17·(z(i) - z(j)) - 2.025 = P_NM(j) - 0.175·z(j) - 2.025, where z(j) = [0, 1, 2, …, 24]; that is, within the same critical band the sound pressure level of the non-tonal masker must be 0.175·z(j) + 2.025 dB higher than that of the non-tonal component of the masked object, so the signal-to-mask ratio of the non-tonal component is 0.175·z(j) + 2.025; calculating the signal-to-mask ratio of the non-tonal component for each band yields the third masking feature matrix of the non-tonal component.
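The closed-form signal-to-mask ratios of claim 6 reduce to two linear functions of the critical-band index z(j); a direct transcription (not part of the claims; the same constants appear in MPEG-1 Psychoacoustic Model 1, from which they seem to be drawn):

```python
def tonal_smr(z):
    """Signal-to-mask ratio (dB) of the tonal component in critical band
    z = 0..24: the tonal masker must exceed the masked tonal component
    by 0.275*z + 6.025 dB within the same critical band."""
    return 0.275 * z + 6.025

def nontonal_smr(z):
    """Signal-to-mask ratio (dB) of the non-tonal component:
    0.175*z + 2.025 dB within the same critical band."""
    return 0.175 * z + 2.025
```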
7. The method of masking ambient noise according to claim 6, wherein the ambient noise is monitored and analyzed by:
collecting noise, and converting a noise analog signal into PCM (pulse code modulation) encoded data;
converting PCM coded data obtained every second into a frequency response curve through fast Fourier transform;
the 20Hz-16kHz portion of the second frequency response curve is divided into 25 critical sub-bands;
taking the maximum value of sound pressure level of each critical sub-band as a masking threshold TM of tone components of the sub-band, and taking the average value of other values except the maximum value of each critical sub-band as a masking threshold NM of non-tone components of the band;
Continuously collecting noise for L seconds;
obtaining the maximum value, the average value and the standard deviation of TM and NM under each critical sub-band in the L second process through a window function, and obtaining a noise characteristic matrix;
adding all the values of TM to the second masking feature matrix to obtain a sound pressure level threshold matrix of masking audio required by the tone component of the environmental noise, namely a fourth masking feature matrix;
and adding all values of NM to the third masking feature matrix to obtain a sound pressure level threshold matrix of masking audio required by the non-tone component of the environmental noise, namely a fifth masking feature matrix.
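An illustrative (non-claimed) sketch of the per-band TM/NM split of claim 7 for one second of noise; the input is assumed to be the SPL values of the spectrum bins falling inside one critical sub-band:

```python
def band_tm_nm(band_spls):
    """Tonal masking threshold TM (the band's maximum SPL) and non-tonal
    threshold NM (the mean of the remaining values) for one critical
    sub-band in one second of noise."""
    tm = max(band_spls)
    rest = list(band_spls)
    rest.remove(tm)                    # drop one copy of the maximum
    nm = sum(rest) / len(rest) if rest else tm
    return tm, nm
```

Repeating this per band and per second, then taking the max, average and standard deviation of TM and NM over the L seconds, yields the noise feature matrix; adding the signal-to-mask ratios then gives the fourth and fifth masking feature matrices.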
8. The method of masking ambient noise according to claim 7, wherein the background audio in the library of material is recommended based on non-tonal characteristics of the noise by:
according to a fifth masking feature matrix of the non-tone component of the environmental noise, comparing the first masking feature matrices of the background audio in the material library one by one;
calculating the total standard deviation P of the difference value of the average value of the sound pressure level of the background audio and the average value of the sound pressure level threshold value of the non-tone component of the environmental noise under each sub-band;
counting the times N of the maximum sound pressure level of the background audio exceeding the maximum sound pressure level threshold value of the non-tone component of the environmental noise;
Calculating the difference value of the sound pressure level standard deviation of the background audio and the sound pressure level threshold standard deviation of the non-tonal component of the environmental noise under each sub-band, and then summing the 25 difference values to obtain a numerical value D;
comparing [ P, N, D ] values of all background audio, wherein the larger the P value of the two audio is, the higher the audio sequence is, the larger the N value is, the higher the audio sequence is, the larger the D value is, the higher the audio sequence is, and if the P, N and D values of the two audio are equal, the priority order of the two audio is the same; in this way, the background class audio numbers of the top NUM1 are selected as the first audio candidate set B of masking.
9. The method of masking ambient noise according to claim 7, wherein the melody-like audio in the library is recommended based on the tonal characteristics of the noise by:
according to the fourth masking feature matrix of the tone component of the environmental noise, comparing the first masking feature matrix of the melody type audio in the material library one by one;
calculating the total standard deviation P of the difference value of the sound pressure average value of the melody type audio frequency and the sound pressure level threshold value average value of the environmental noise tone component under each sub-band;
counting the times N that the maximum sound pressure level of melody type audio exceeds the maximum sound pressure level threshold value of the environmental noise tone component;
Calculating the difference value of the standard deviation of the sound pressure level of the melody type audio and the standard deviation of the threshold value of the sound pressure level of the environmental noise tone component under each sub-band, and then summing the 25 difference values to obtain a numerical value D;
comparing [ P, N, D ] values of all melody audios, wherein the larger the P value of the two audios is, the higher the audio ranking is, the larger the N value is, the higher the audio ranking is, the larger the D value is, the higher the audio ranking is, and if the P, N and D values of the two audios are equal, the priority order of the two audios is the same; in this way, the melody class audio numbers of the top NUM2 are selected as the second set of audio candidates M for masking.
10. A method of masking ambient noise according to claim 3, wherein the auxiliary audio class in the library of material is recommended for human ear characteristics by:
determining the most sensitive range of human ears as 2kHz-4kHz, and extracting 5 critical sub-bands in 25 critical sub-bands covered by the range, namely, critical sub-bands from No. 13 to No. 17;
acquiring a signal masking ratio of the noise audio tone component;
calculating the total standard deviation P of the difference between the average value of the sound pressure level of the auxiliary audio and the average value of the sound pressure level threshold value of the environmental noise tone component under the 5 critical frequency sub-bands;
Calculating the number of times that the maximum sound pressure level of the auxiliary audio exceeds the maximum sound pressure level threshold of the environmental noise tone component under the 5 critical sub-bands, and summing the number of times to N;
calculating the difference between the standard deviation of the sound pressure level of the auxiliary audio and the standard deviation of the sound pressure level threshold of the environmental noise tone component under the 5 critical sub-bands, and then summing the 5 differences to obtain a value D;
comparing the [ P, N, D ] values of all auxiliary types of audios, wherein the larger the P value of the two audios is, the higher the audio ranking is, the larger the N value is, the higher the audio ranking is, the larger the D value is, the higher the audio ranking is, and if the P, N and D values of the two audios are equal, the priority order of the two audios is the same; in this way, the auxiliary class audio numbers of the top NUM3 are selected as the third set S of audio candidates for masking.
11. A masking device for ambient noise, the device comprising:
the masking audio material library feature extraction module is used for establishing a masking audio material library, extracting and classifying features of each audio material, and dividing the audio material into background audio, melody audio or auxiliary audio;
a signal masking ratio calculating module for dividing the frequency part of 20Hz-16kHz into 25 critical sub-bands, and calculating the signal masking ratio of the tone component and the signal masking ratio of the non-tone component of the 25 critical sub-bands respectively;
The environmental noise characteristic extraction module is used for monitoring and analyzing the environmental noise to obtain the characteristics of the tone component and the non-tone component of the environmental noise;
the masking audio generation module is used for recommending background audio in the material library according to the non-tone component characteristics of the noise; recommending melody audio in the material library according to the tone component characteristics of the noise; recommending auxiliary audio in a material library aiming at the characteristics of human ears; and combining the recommended background audio, melody audio and auxiliary audio, and outputting final masking audio.
12. An electronic device comprising a processor, a memory, a sound collection means, a playback means, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of masking ambient noise according to any one of claims 1-10 when executing the computer program.
CN201911399710.5A 2019-12-30 2019-12-30 Method, device and equipment for masking environmental noise Active CN111161699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911399710.5A CN111161699B (en) 2019-12-30 2019-12-30 Method, device and equipment for masking environmental noise

Publications (2)

Publication Number Publication Date
CN111161699A CN111161699A (en) 2020-05-15
CN111161699B 2023-04-28

Family

ID=70559465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911399710.5A Active CN111161699B (en) 2019-12-30 2019-12-30 Method, device and equipment for masking environmental noise

Country Status (1)

Country Link
CN (1) CN111161699B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111956398B (en) * 2020-07-13 2022-07-22 恒大恒驰新能源汽车研究院(上海)有限公司 Vehicle collision hearing protection method, vehicle and equipment
CN112509592B (en) * 2020-11-18 2024-01-30 广东美的白色家电技术创新中心有限公司 Electrical apparatus, noise processing method, and readable storage medium
CN113883671A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Abnormal noise shielding control method for air conditioner, air conditioner and readable storage medium
CN113883669A (en) * 2021-09-13 2022-01-04 Tcl空调器(中山)有限公司 Sleep-assisting control method and device for air conditioner, electronic equipment and storage medium
CN116996807B (en) * 2023-09-28 2024-01-30 小舟科技有限公司 Brain-controlled earphone control method and device based on user emotion, earphone and medium

Citations (8)

Publication number Priority date Publication date Assignee Title
CN101354885A (en) * 2007-01-16 2009-01-28 哈曼贝克自动系统股份有限公司 Active noise control system
CN102008371A (en) * 2010-10-28 2011-04-13 中国科学院声学研究所 Digital tinnitus masker
JP2012123070A (en) * 2010-12-07 2012-06-28 Yamaha Corp Masker sound generation device, masker sound output device and masker sound generation program
JP2016052049A (en) * 2014-09-01 2016-04-11 三菱電機株式会社 Sound environment control device and sound environment control system using the same
CN105741849A (en) * 2016-03-06 2016-07-06 北京工业大学 Voice enhancement method for fusing phase estimation and human ear hearing characteristics in digital hearing aid
CN105869652A (en) * 2015-01-21 2016-08-17 北京大学深圳研究院 Psychological acoustic model calculation method and device
CN106796782A (en) * 2014-10-16 2017-05-31 索尼公司 Information processor, information processing method and computer program
CN109238448A (en) * 2018-09-17 2019-01-18 上海市环境科学研究院 A method of acoustic environment satisfaction is improved based on sound masking

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
EP2645362A1 (en) * 2012-03-26 2013-10-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for improving the perceived quality of sound reproduction by combining active noise cancellation and perceptual noise compensation

Also Published As

Publication number Publication date
CN111161699A (en) 2020-05-15

Similar Documents

Publication Publication Date Title
CN111161699B (en) Method, device and equipment for masking environmental noise
CN101641968B (en) Sound enrichment for the relief of tinnitus
US11707633B2 (en) Variable sound system for audio devices
CN104811891B (en) The method and system that the scaling of voice related channel program is avoided in multi-channel audio
US6212496B1 (en) Customizing audio output to a user's hearing in a digital telephone
EP1770685A1 (en) A system for providing a reduction of audiable noise perception for a human user
WO2018069900A1 (en) Audio-system and method for hearing-impaired
CN107547983B (en) Method and hearing device for improving separability of target sound
WO2012053629A1 (en) Voice processor and voice processing method
US20150005661A1 (en) Method and process for reducing tinnitus
CN108235181A (en) The method of noise reduction in apparatus for processing audio
JP2015029342A (en) Sound enrichment system for tinnitus relief
CN113949956B (en) Noise reduction processing method and device, electronic equipment, earphone and storage medium
Kates Modeling the effects of single-microphone noise-suppression
KR20050121698A (en) Method and system for increasing audio perceptual tone alerts
Robinson et al. Psychoacoustic models and non-linear human hearing
JP3482465B2 (en) Mobile fitting system
Patel et al. Compression Fitting of Hearing Aids and Implementation
Rämö et al. Real-time perceptual model for distraction in interfering audio-on-audio scenarios
Warner et al. Thresholds of Discomfort for Complex Stimuli
KR102006250B1 (en) Tinnitus rehabilitation sound therapy device using compound sound
CN111785295A (en) Sleep aiding method utilizing white noise
US20200315498A1 (en) Systems and methods for evaluating hearing health
Neuman et al. Preferred listening levels for linear and slow-acting compression hearing aids
Dobrucki et al. Various aspects of auditory fatigue caused by listening to loud music

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant