CN112331225A - Method and device for assisting hearing in high-noise environment - Google Patents

Method and device for assisting hearing in high-noise environment

Info

Publication number
CN112331225A
Authority
CN
China
Prior art keywords
noise
voice
information
sample database
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011159182.9A
Other languages
Chinese (zh)
Other versions
CN112331225B (en)
Inventor
周宇阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202011159182.9A
Publication of CN112331225A
Application granted
Publication of CN112331225B
Current legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L21/0232: Processing in the frequency domain
    • G10L21/0272: Voice signal separating

Abstract

The invention provides a method and a device for assisting hearing in a high-noise environment. The method comprises: acquiring noise information in the environment and establishing a noise sample database; acquiring a plurality of pieces of human voice information and establishing a human voice sample database; acquiring voice information communicated by workers in a high-noise environment; processing the voice information based on the noise sample database and the human voice sample database to obtain clean voice; and outputting the clean voice.

Description

Method and device for assisting hearing in high-noise environment
Technical Field
The invention relates to the technical field of voice separation, in particular to a method and a device for assisting hearing in a high-noise environment.
Background
At present, voice is one of the most direct ways for humans to exchange information. People are often disturbed by other sounds when listening to speech, and people working in a high-noise environment are especially susceptible to noise interference, so that workers cannot exchange information effectively and working efficiency is seriously affected. Voice separation, as a preprocessing scheme, is an effective way to suppress such interference.
Speech separation refers to the task of separating the target speech from background interference. At present, speech separation mainly relies on methods such as auditory scene analysis and non-negative matrix factorization. These methods are simple to implement but have significant limitations and few applicable scenarios: their performance degrades rapidly in the presence of noise, they do not take speech characteristics into account, they can damage the speech, and they do not address high-noise speech environments.
Therefore, the invention provides a method and a device for assisting hearing in a high-noise environment to solve the problem that workers have difficulty communicating with each other in such an environment.
Disclosure of Invention
The invention provides a method and a device for assisting hearing in a high-noise environment, which are used for solving the problem that workers are difficult to communicate with each other in the high-noise environment.
A method of assisting hearing in a high noise environment, comprising:
acquiring noise information in an environment, and establishing a noise sample database;
acquiring a plurality of pieces of human voice information, and establishing a human voice sample database;
acquiring voice information communicated by workers in a high-noise environment;
processing the voice information based on a noise sample database and a human voice sample database to obtain clean voice;
and outputting the clean voice.
As an embodiment of the present invention, the acquiring a plurality of pieces of human voice information and establishing a human voice sample database includes:
performing analog-to-digital conversion on the human voice information to obtain digital signals of the human voice information;
processing the digital signals by using a fast Fourier transform technique to obtain a plurality of frequency spectra of the human voice information;
obtaining the human voice frequency information at each time point according to the frequency spectra;
and establishing the human voice sample database according to the human voice frequency information at each time point.
As an embodiment of the present invention, the processing the digital signals by using a fast Fourier transform technique to obtain a plurality of frequency spectra of the human voice information includes:
the frequency spectra of the several pieces of human voice information are calculated by FFT:
[Formula image BDA0002743701260000021]
wherein
[Formula image BDA0002743701260000022]
e is the base of the natural logarithm, p = 0, 1, …, M-1, and x(n) is an N-point sequence;
[Formula image BDA0002743701260000031]
[Formula image BDA0002743701260000032]
wherein T_p(θ) is a value in the frequency spectra of the several pieces of human voice information calculated by the FFT, and
[Formula image BDA0002743701260000034]
is a positive integer, with 0 ≤ θ ≤ α-1.
As an embodiment of the present invention, the processing the voice information based on the noise sample database and the human voice sample database to obtain a clean voice includes:
obtaining a noise frequency threshold according to the noise sample database;
according to the noise frequency threshold, performing first processing on the voice information to obtain first-time filtered voice information; the first processing is filtering out frequency signals in the voice information that are higher than the noise frequency threshold;
and matching the first-time filtered voice information against the human voice sample database, and filtering out, from the first-time filtered voice information, frequency signals that differ from a preset mean value in the human voice sample database by more than a preset difference value, to obtain clean voice information.
As an embodiment of the present invention, the obtaining a noise frequency threshold according to the noise sample database includes:
calculating a noise frequency threshold:
[Formula image BDA0002743701260000033]
where v is the noise frequency threshold, F_i is the range of the frequency information of the i-th sample in the noise sample database, N is the number of samples in the noise sample database, π is the ratio of a circle's circumference to its diameter, k is the stiffness coefficient, and m is the mass.
As an embodiment of the present invention, the noise frequency threshold is determined according to a high noise environment;
the high noise environment is determined by noise information in the high noise environment acquired within a preset time; wherein,
the high noise environment includes: traffic noise and industrial noise.
As an embodiment of the present invention, the determining of the high noise environment from noise information in the high noise environment acquired within a preset time includes:
acquiring noise information in a high-noise environment within a preset time, and obtaining a digital signal of the noise information through analog-to-digital conversion;
obtaining a noise frequency waveform according to the digital signal, and filtering an isolated waveform in the noise frequency waveform to obtain a section of continuous noise frequency waveform;
taking out the maximum value in the frequency range of the continuous noise waveform, and comparing the maximum value with a noise frequency threshold value in the noise sample database to obtain the most similar noise frequency threshold value;
determining a high noise environment according to the most similar noise frequency threshold;
wherein the noise frequency threshold corresponds to the high noise environment one to one.
An apparatus for assisting hearing in a high noise environment, comprising:
the acquisition module is used for acquiring noise information, a plurality of pieces of human voice information, and voice information in the environment;
the creating module is used for creating a noise sample database from the noise information acquired by the acquisition module to obtain a noise frequency threshold value, and creating a human voice sample database from the human voice information acquired by the acquisition module;
the comparison module is used for comparing the voice information in the environment with a noise frequency threshold value and determining the frequency information of which the voice information frequency is greater than the noise frequency threshold value in the environment;
the filtering module is used for filtering frequency information of which the frequency is greater than a noise frequency threshold value in the environment by using a filter to obtain first-time filtered voice information;
the matching module is used for matching the first-time filtered voice information with the voice sample database, and filtering frequency signals with difference larger than a preset difference value from a preset mean value in the voice sample database in the first-time filtered voice information to obtain clean voice;
and the transmission module is used for synthesizing the clean voice into voice segments and transmitting the voice segments to a receiver.
As an embodiment of the present invention, the creating module performs the following operations:
according to the acquired noise information, establishing an industrial noise sample database, a traffic noise sample database and a mixed noise sample database;
establishing at least 3 types of human voice sample databases according to the acquired human voice information; wherein the 3-type human voice sample database comprises: a voice sample database for the adult male, a voice sample database for the adult female and a voice sample database for the elderly male.
As an embodiment of the present invention, the matching module performs operations including:
if multi-person voice exists in the second-time filtered voice information, the multi-voice segments in the clean voice are separated into single-voice segments by using a multi-person voice separation technique within the voice separation technology.
The invention has the following beneficial effects: the influence of noise on voice communication is avoided, so that workers can quickly and clearly hear what the other party means even in a high-noise environment. This helps improve working efficiency, reduces irritability caused by noise, enhances hearing protection for workers in high-noise environments, and facilitates communication among workers.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
fig. 1 is a flowchart of a method and an apparatus for assisting hearing in a high noise environment according to an embodiment of the present invention;
fig. 2 is a flowchart of an apparatus and a method for assisting hearing in a high noise environment according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Example 1:
as shown in fig. 1, an embodiment of the present invention provides a method for assisting hearing in a high-noise environment, including:
step S101: acquiring noise information in an environment, and establishing a noise sample database;
step S102: acquiring a plurality of pieces of human voice information, and establishing a human voice sample database;
step S103: acquiring voice information communicated by workers in a high-noise environment;
step S104: processing the voice information based on a noise sample database and a human voice sample database to obtain clean voice;
step S105: outputting the clean voice;
the working principle of the technical scheme is as follows: noise that may occur in the working environment, such as traffic noise and industrial noise, is collected through a microphone to establish a noise sample database, and a preset noise frequency threshold is obtained by integrating the noise sample database. A plurality of pieces of human voice information, including but not limited to voice information of adult males, adult females and elderly males, are collected through the microphone to establish a human voice sample database; the frequency spectrum of the human voice sample database is produced by time-domain to frequency-domain conversion using the FFT (fast Fourier transform) algorithm, and a human voice frequency mean value is obtained by counting and integrating that frequency spectrum. Voice information in the environment is then acquired through the microphone and converted into an analog signal. Based on the voice separation technology, frequency signals higher than the preset noise frequency threshold are filtered out by a filter to obtain first-time filtered voice information. The first-time filtered voice information is matched against the human voice frequency mean value, and frequency signals that differ from the mean value by more than a preset difference value are filtered out to obtain second-time filtered voice information. If multi-person voice exists in the second-time filtered voice information, the target voice is separated by the voice separation technology, the multi-voice segments are converted into single-voice segments, and the single-voice segments are transmitted; if no multi-person voice exists, the second-time filtered voice information is transmitted directly;
the beneficial effects of the above technical scheme are: noise information is integrated and a noise frequency threshold is preset to eliminate the influence of high noise on voice communication, and matching the human voice frequency mean value against the voice information in the environment eliminates that influence a second time. The influence of noise on voice communication is thereby avoided, so that workers can quickly and clearly hear what the other party means even in a high-noise environment, which improves working efficiency, reduces irritability caused by noise, enhances hearing protection for workers in high-noise environments, and facilitates communication among workers.
Example 2:
in one embodiment, the acquiring a plurality of pieces of human voice information and establishing a human voice sample database includes:
performing analog-to-digital conversion on the human voice information to obtain digital signals of the human voice information;
processing the digital signals by using a fast Fourier transform technique to obtain a plurality of frequency spectra of the human voice information;
obtaining the human voice frequency information at each time point according to the frequency spectra;
establishing the human voice sample database according to the human voice frequency information at each time point;
the working principle of the technical scheme is as follows: the collected pieces of human voice information are converted into analog signals by a microphone, and the analog signals are converted into digital signals by an ADC (analog-to-digital converter) module. The digital signals are then converted into continuous frequency spectra of the human voice information by the FFT (fast Fourier transform), which performs the conversion from the time domain to the frequency domain. The human voice frequency information at each time point is obtained, the human voice sample database is established from it, and the human voice frequency mean value is obtained;
the beneficial effects of the above technical scheme are: the fast Fourier transform efficiently integrates the frequency information of multiple human voices and yields the human voice frequency mean value, which supports deeper filtering of the voice information in the environment and makes communication among workers more convenient.
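The steps of Example 2 can be illustrated with a minimal sketch. It assumes the recordings are already digitized, uses a plain DFT per frame in place of an optimized FFT, and takes the peak-magnitude bin of each frame as the "human voice frequency information" at that time point; the function names, the frame length and the synthetic tones are hypothetical and not taken from the patent.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Return the magnitudes of the first N/2 DFT bins of a real-valued frame."""
    n = len(frame)
    mags = []
    for p in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * p * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def dominant_frequencies(samples, fs, frame_len):
    """Split samples into frames and return the peak-bin frequency (Hz) of each frame."""
    freqs = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        mags = dft_magnitudes(samples[start:start + frame_len])
        peak_bin = max(range(1, len(mags)), key=lambda p: mags[p])  # skip the DC bin
        freqs.append(peak_bin * fs / frame_len)
    return freqs


def build_voice_database(recordings, fs, frame_len):
    """Map each recording name to its per-frame dominant frequencies."""
    return {name: dominant_frequencies(x, fs, frame_len) for name, x in recordings.items()}


def voice_frequency_mean(database):
    """Average all stored per-frame frequencies into one mean value."""
    values = [f for freqs in database.values() for f in freqs]
    return sum(values) / len(values)


def make_tone(freq_hz, fs, length):
    """Synthetic stand-in for a digitized recording (assumed data)."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) for t in range(length)]


fs, frame_len = 8000, 64  # assumed sampling rate and frame length
recordings = {
    "speaker_a": make_tone(125.0, fs, 4 * frame_len),
    "speaker_b": make_tone(250.0, fs, 4 * frame_len),
}
db = build_voice_database(recordings, fs, frame_len)
mean_hz = voice_frequency_mean(db)
```

Real speech is not a single tone, so a practical database would likely store more than one value per frame; the single peak is used here only to keep the sketch short.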
Example 3:
in one embodiment, the obtaining the frequency spectra of the human voice information by using the fast Fourier transform technique includes:
the frequency spectra of the several pieces of human voice information are calculated by FFT:
[Formula image BDA0002743701260000081]
wherein
[Formula image BDA0002743701260000082]
e is the base of the natural logarithm, p = 0, 1, …, M-1, and x(n) is an N-point sequence;
[Formula image BDA0002743701260000091]
[Formula image BDA0002743701260000092]
wherein T_p(θ) is a value in the frequency spectra of the several pieces of human voice information calculated by the FFT, and
[Formula image BDA0002743701260000096]
is a positive integer, with 0 ≤ θ ≤ α-1;
The working principle of the technical scheme is as follows: assuming that the sampling frequency of the human voice signal is fs, when an N-point FFT is performed on the human voice signal, the frequency interval between two adjacent points of the FFT result is fs/N; that is, the frequency represented by any point p (p = 0 to M-1) is p × fs/N,
[Formula image BDA0002743701260000093]
[Formula image BDA0002743701260000094]
wherein T_p(θ) is a value in the frequency spectra of the several pieces of human voice information calculated by the FFT, and
[Formula image BDA0002743701260000095]
is a positive integer, with 0 ≤ θ ≤ α-1, thereby producing the frequency spectra of the several pieces of human voice information;
the beneficial effects of the above technical scheme are: producing the frequency spectra of the several pieces of human voice information makes the fluctuation of the human voice frequency directly visible and improves the accuracy of the human voice frequency mean value.
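The bin-to-frequency relation stated in the working principle (point p of an N-point FFT represents p × fs/N) can be checked with a short sketch. The patent's own formulas are images that are not reproduced in this text, so the code below uses a standard recursive radix-2 Cooley-Tukey FFT as a stand-in, which is not necessarily the decomposition the patent describes; the sampling rate, FFT size and test tone are assumed values.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for p in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * p / n) * odd[p]
        out[p] = even[p] + twiddle
        out[p + n // 2] = even[p] - twiddle
    return out


def bin_frequency(p, fs, n):
    """Frequency in Hz represented by FFT point p, for sampling rate fs and N points."""
    return p * fs / n


fs, n = 8000, 256  # assumed sampling rate and FFT size
signal = [math.sin(2 * math.pi * 500.0 * t / fs) for t in range(n)]
spectrum = fft(signal)
peak_bin = max(range(n // 2), key=lambda p: abs(spectrum[p]))
peak_hz = bin_frequency(peak_bin, fs, n)
```

With these values the spacing between adjacent points is fs/N = 31.25 Hz, so the 500 Hz test tone falls exactly on point 16.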
Example 4:
in one embodiment, the processing the voice information based on the noise sample database and the human voice sample database to obtain clean voice includes:
obtaining a noise frequency threshold according to the noise sample database;
according to the noise frequency threshold, performing first processing on the voice information to obtain first-time filtered voice information; the first processing is filtering out frequency signals in the voice information that are higher than the noise frequency threshold;
matching the first-time filtered voice information against the human voice sample database, and filtering out, from the first-time filtered voice information, frequency signals that differ from a preset mean value in the human voice sample database by more than a preset difference value, to obtain clean voice information;
the working principle of the technical scheme is as follows: voice information in the environment is collected through a microphone. Using voice denoising within the voice separation technology, frequency signals higher than the preset noise frequency threshold are filtered out of the voice information by a filter to obtain the first-time filtered voice. The first-time filtered voice is then compared with the human voice frequency mean value, frequency signals that differ from the mean value by more than a preset difference value are filtered out, and the remaining frequency signals are converted into digital signals, namely the clean voice information;
the beneficial effects of the above technical scheme are: the voice separation technology helps reduce the noise in the voice information and improves the clarity of the target voice, so that workers communicate faster and working efficiency is improved.
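The two filtering stages of Example 4 can be sketched as follows. The sketch assumes the voice information has already been reduced to a list of (frequency in Hz, magnitude) components; that representation, the function names and all numeric values are illustrative assumptions rather than details from the patent.

```python
def first_filter(components, noise_threshold_hz):
    """Stage 1: drop components whose frequency is above the noise frequency threshold."""
    return [(f, a) for f, a in components if f <= noise_threshold_hz]


def second_filter(components, voice_mean_hz, max_diff_hz):
    """Stage 2: drop components farther than max_diff_hz from the human voice mean."""
    return [(f, a) for f, a in components if abs(f - voice_mean_hz) <= max_diff_hz]


def clean_voice(components, noise_threshold_hz, voice_mean_hz, max_diff_hz):
    """Apply the first-time filter, then the second-time filter."""
    stage_one = first_filter(components, noise_threshold_hz)
    return second_filter(stage_one, voice_mean_hz, max_diff_hz)


# Illustrative values only (assumptions, not taken from the patent).
spectrum = [(50.0, 0.9), (180.0, 1.0), (240.0, 0.7), (900.0, 0.4), (3500.0, 0.8)]
cleaned = clean_voice(
    spectrum,
    noise_threshold_hz=2000.0,
    voice_mean_hz=200.0,
    max_diff_hz=100.0,
)
```

Here the first stage removes the 3500 Hz component, and the second stage removes the 50 Hz and 900 Hz components because they lie more than 100 Hz from the assumed mean of 200 Hz.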
Example 5:
in one embodiment, said obtaining a noise frequency threshold from said noise sample database comprises:
calculating a noise frequency threshold:
[Formula image BDA0002743701260000101]
where v is the noise frequency threshold, F_i is the range of the frequency information of the i-th sample in the noise sample database, N is the number of samples in the noise sample database, π is the ratio of a circle's circumference to its diameter, k is the stiffness coefficient, and m is the mass;
the working principle of the technical scheme is as follows: the maximum frequency of each sample in the sample database is obtained, the maxima are summed and averaged, and the influence on the collected data of the sound frequency generated by the natural vibration of the device is removed to obtain the noise frequency threshold, where k is the stiffness coefficient of the device and m is the mass of the device;
the beneficial effects of the above technical scheme are: the noise filtering precision of the device is improved.
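The threshold formula itself is an image that is not reproduced in this text. One reading that is consistent with the stated working principle is: average the maximum frequency of each sample range F_i over the N samples, then subtract the natural frequency of the device, sqrt(k/m) / (2π). The sketch below implements that reading only; the exact form of the patent's formula, the function name and the numeric values are assumptions.

```python
import math


def noise_frequency_threshold(sample_ranges, k, m):
    """Mean of per-sample maximum frequencies minus the device's natural frequency.

    sample_ranges: list of (min_hz, max_hz) ranges F_i, one per noise sample.
    k, m: stiffness coefficient (N/m) and mass (kg) of the device.
    ASSUMPTION: this form is inferred from the description, not read from the formula.
    """
    n = len(sample_ranges)
    mean_of_maxima = sum(high for _, high in sample_ranges) / n
    natural_hz = math.sqrt(k / m) / (2 * math.pi)
    return mean_of_maxima - natural_hz


# Made-up sample ranges and device parameters, chosen so the natural frequency is 100 Hz.
ranges = [(100.0, 1800.0), (200.0, 2200.0), (150.0, 2000.0)]
v = noise_frequency_threshold(ranges, k=4 * math.pi ** 2 * 100.0, m=0.01)
```

With these inputs the mean of the maxima is 2000 Hz and the natural frequency is 100 Hz, so v is approximately 1900 Hz.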
Example 6:
in one embodiment, the noise frequency threshold is determined based on a high noise environment;
the high noise environment is determined by noise information in the high noise environment acquired within a preset time; wherein,
the high noise environment includes: traffic noise and industrial noise;
the working principle of the technical scheme is as follows: the noise information acquired in the high-noise environment within the preset time is used to determine the corresponding noise frequency threshold, from which the high-noise environment is determined;
the beneficial effects of the above technical scheme are: different noise frequency thresholds are selected in different high-noise environments, and the noise filtering precision is improved.
Example 7:
in one embodiment, the determining of the high noise environment by the noise information in the high noise environment acquired within the preset time includes:
acquiring noise information in a high-noise environment within a preset time, and obtaining a digital signal of the noise information through analog-to-digital conversion;
obtaining a noise frequency waveform according to the digital signal, and filtering an isolated waveform in the noise frequency waveform to obtain a section of continuous noise frequency waveform;
taking out the maximum value in the frequency range of the continuous noise waveform, and comparing the maximum value with a noise frequency threshold value in the noise sample database to obtain the most similar noise frequency threshold value;
determining a high noise environment according to the most similar noise frequency threshold;
wherein the noise frequency threshold corresponds to the high noise environment one to one;
the working principle of the technical scheme is as follows: the acquired noise information is converted into a digital signal by an analog-to-digital converter, a noise frequency waveform is obtained from the digital signal, and isolated waveforms in it are filtered out to obtain a continuous section of noise frequency waveform. An isolated waveform is a waveform that is discontinuous or fluctuates greatly. The maximum value of the continuous noise frequency waveform is selected and compared with all the noise frequency threshold values in the noise sample database; the most similar noise frequency threshold value is selected, and the current high-noise environment is determined from it;
the beneficial effects of the above technical scheme are: different high-noise environments are filtered with different noise frequency thresholds, which improves the noise filtering precision and strengthens the hearing-assistance effect.
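The environment identification in Example 7 can be sketched as below. The sketch assumes the noise frequency waveform is a list of frequency values over time and treats a point as "isolated" when it differs from every adjacent point by more than a chosen jump size; that definition, the environment names, the threshold values and the function names are all hypothetical.

```python
def drop_isolated(freqs, max_jump_hz):
    """Keep points that have at least one adjacent point within max_jump_hz."""
    kept = []
    for i, f in enumerate(freqs):
        neighbours = freqs[max(i - 1, 0):i] + freqs[i + 1:i + 2]
        if any(abs(f - g) <= max_jump_hz for g in neighbours):
            kept.append(f)
    return kept


def identify_environment(freqs, thresholds, max_jump_hz):
    """Return the environment whose stored threshold is closest to the waveform maximum."""
    continuous = drop_isolated(freqs, max_jump_hz)
    peak = max(continuous)
    name = min(thresholds, key=lambda env: abs(thresholds[env] - peak))
    return name, thresholds[name]


# Assumed one-to-one mapping between environments and noise frequency thresholds.
thresholds = {"traffic": 1000.0, "industrial": 2500.0, "mixed": 4000.0}
waveform = [1000.0, 1020.0, 5000.0, 1010.0, 990.0, 1030.0]  # 5000.0 is an isolated spike
environment, threshold = identify_environment(waveform, thresholds, max_jump_hz=100.0)
```

Without the isolated-point filter the 5000 Hz spike would become the maximum and the waveform would be matched to the "mixed" threshold; with it, the maximum is 1030 Hz and the closest threshold is the "traffic" one.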
Example 8:
as shown in fig. 2, an embodiment of the present invention provides a device for assisting hearing in a high noise environment, including:
step S201: the acquisition module is used for acquiring noise information, a plurality of pieces of human voice information, and voice information in the environment;
step S202: the creating module is used for creating a noise sample database from the noise information acquired by the acquisition module and a human voice sample database from the human voice information acquired by the acquisition module, to obtain a preset noise frequency threshold value and a human voice frequency mean value;
step S203: the comparison module is used for comparing the voice information in the environment with a preset noise frequency threshold value and determining the frequency information of which the voice information frequency in the environment is greater than the preset noise frequency threshold value;
step S204: the filtering module is used for filtering frequency information of which the frequency is greater than a preset noise frequency threshold value in the environment by using a filter to obtain first-time filtered voice information;
step S205: the matching module is used for matching the first-time filtered voice information against the human voice frequency mean value, and filtering out, from the frequency information of the first-time filtered voice information, frequency signals that differ from the human voice frequency mean value by more than a preset difference value, to obtain second-time filtered voice information;
step S206: the transmission module is used for synthesizing the second-time filtered voice information into voice sections and transmitting the voice sections to a receiver;
the working principle of the technical scheme is as follows: noise that may occur in the working environment, such as traffic noise and industrial noise, is collected through a microphone to establish a noise sample database, and a preset noise frequency threshold is obtained by integrating the noise sample database. A plurality of pieces of human voice information, including but not limited to voice information of adult males, adult females and elderly males, are collected through the microphone to establish a human voice sample database; the frequency spectrum of the human voice sample database is produced by time-domain to frequency-domain conversion using the FFT (fast Fourier transform) algorithm, and a human voice frequency mean value is obtained by counting and integrating that frequency spectrum. Voice information in the environment is then acquired through the microphone and converted into an analog signal. Based on the voice separation technology, frequency signals higher than the preset noise frequency threshold are filtered out by a filter to obtain first-time filtered voice information. The first-time filtered voice information is matched against the human voice frequency mean value, and frequency signals that differ from the mean value by more than a preset difference value are filtered out to obtain second-time filtered voice information. If multi-person voice exists in the second-time filtered voice information, the target voice is separated by the voice separation technology, the multi-voice segments are converted into single-voice segments, and the single-voice segments are transmitted; if no multi-person voice exists, the second-time filtered voice information is transmitted directly;
the beneficial effects of the above technical scheme are: noise information is integrated and a noise frequency threshold is preset to eliminate the influence of high noise on voice communication, and matching the human voice frequency mean value against the voice information in the environment eliminates that influence a second time. The influence of noise on voice communication is thereby avoided, so that workers can quickly and clearly hear what the other party means even in a high-noise environment, which improves working efficiency, reduces irritability caused by noise, enhances hearing protection for workers in high-noise environments, and facilitates communication among workers.
Example 9:
in one embodiment, the creation module performs the following operations:
according to the acquired noise information, establishing an industrial noise sample database, a traffic noise sample database and a mixed noise sample database;
establishing at least 3 types of human voice sample databases according to the acquired human voice information; wherein the 3-type human voice sample database comprises: a voice sample database for the adult male, a voice sample database for the adult female and a voice sample database for the old male;
the working principle of the technical scheme is as follows: whistling sounds, automobile rumbling, ship noise, mechanical noise, aerodynamic noise and electromagnetic noise are collected through a microphone, which converts the collected sound into an analog signal; the analog signal is converted into a digital signal by an ADC (analog-to-digital converter) module, and the digital signals are integrated through simulation to obtain a traffic noise sample database, an industrial noise sample database and a mixed noise sample database; voice information of young adult males in the working environment is collected through the microphone to establish the voice sample database for adult males, voice information of adult females in the working environment is collected through the microphone to establish the voice sample database for adult females, and voice information of elderly males in the working environment is collected through the microphone to establish the voice sample database for elderly males;
the beneficial effects of the above technical scheme are: different preset noise information thresholds are set according to different noise information, so that the noise filtering accuracy is improved; aiming at different crowds, different voice sample databases are established, and accurate assistance of the device to the audition of different crowds is realized.
Example 10:
in one embodiment, the matching module performs operations comprising:
if multi-user voice exists in the second-time filtered voice information, separating multi-voice sections in the second-time filtered voice information into single voice sections by utilizing a multi-user voice separation technology in a voice separation technology;
the working principle of the technical scheme is as follows: separating the multi-person voice speech section in the second filtered speech into a single speech section by utilizing a multi-person voice separation technology in the speech separation technology;
the beneficial effects of the above technical scheme are: the target voice definition is improved, simultaneous communication of multiple persons is facilitated, and the working efficiency is improved.
Example 11:
the invention also provides a method for assisting hearing in a high-noise environment, which comprises the following steps:
step S301: establishing a plurality of noise sample databases based on first positioning information of a high-noise environment, equipment operation information in the high-noise environment and noise information in the high-noise environment;
The main source of noise in a high-noise environment is the operation of the equipment in that environment. Different noise sample databases are therefore established for different equipment operation information; that is, each noise sample database uses the positioning information and the equipment operation information as its first calling tag, so that the correct noise sample database can be called;
step S302: acquiring a plurality of voice information, and establishing a voice sample database;
For example: the workers in a given working environment are generally fixed or change little, so a separate voice sample database can be established in advance for each worker, and a second calling tag is then established according to that worker's sound characteristics;
step S303: acquiring second positioning information of a worker, matching the second positioning information with the first positioning information, and acquiring voice information communicated by the worker in a high-noise environment when the second positioning information is matched with the first positioning information;
The second positioning information is used to confirm whether the worker is in a high-noise environment. Because a noise sample database has been established in advance for each high-noise environment where the worker is likely to be, the worker is considered not to be in a high-noise environment when the position represented by the second positioning information is not a position given by the first positioning information of any noise sample database; in that case, voice information is not acquired.
Step S304: acquiring equipment operation information of the working environment of the current worker based on the second positioning information;
The server acquires the operation information of each device by querying the devices at the position given by the second positioning information.
Step S305: calling a corresponding noise sample database based on the second positioning information and the equipment operation information of the current working environment of the working personnel;
The second positioning information is matched with the first positioning information in the first calling tag, and the equipment operation information of the worker's current working environment is matched with the equipment operation information in the first calling tag, so as to find the noise sample database corresponding to that first calling tag.
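As a rough illustration of steps S303 to S305, the first calling tag can be modelled as a lookup key built from the positioning information and the equipment operation information. The class names, field names and the exact-match rule below are assumptions made for this sketch; the source does not say how positions or equipment states are encoded or compared.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class CallingTag:
    """Hypothetical first calling tag: location plus the set of running devices."""
    location: str                 # first positioning information
    running_devices: frozenset    # equipment operation information

@dataclass
class NoiseDatabaseIndex:
    """Maps each calling tag to its noise sample database (any object)."""
    databases: dict = field(default_factory=dict)

    def register(self, location, running_devices, samples):
        # Step S301: one database per (location, equipment state) pair.
        self.databases[CallingTag(location, frozenset(running_devices))] = samples

    def lookup(self, worker_location, running_devices):
        # Steps S303-S305: returns None when the worker's position or the
        # current equipment state matches no registered calling tag.
        return self.databases.get(CallingTag(worker_location, frozenset(running_devices)))
```

A `None` result corresponds to the case in which the worker is considered not to be in a registered high-noise environment, so no voice information is acquired.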
Step S306: denoising the voice information based on a noise sample database to obtain denoised voice;
The pre-established noise sample database is used for initial denoising of the worker's current speech, and using the noise sample database that matches the environment improves the denoising effect.
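The source does not specify how the noise sample database is used for denoising in step S306. One common technique that fits the description is spectral subtraction, sketched below as a stand-in rather than as the patented method; the function name, the frame length and the non-overlapping framing are assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_sample, frame_len=256):
    """Subtract the average noise magnitude spectrum from each frame of `noisy`.

    Assumption: non-overlapping rectangular frames; trailing samples that do
    not fill a whole frame are dropped.
    """
    noisy = np.asarray(noisy, dtype=float)
    noise_sample = np.asarray(noise_sample, dtype=float)

    # Average magnitude spectrum of the stored noise sample.
    n_noise = len(noise_sample) // frame_len
    noise_frames = noise_sample[: n_noise * frame_len].reshape(n_noise, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    # Frame the noisy speech and subtract the noise magnitude per bin.
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # floor negative values at zero

    # Rebuild each frame using the phase of the noisy signal.
    cleaned = mag * np.exp(1j * np.angle(spec))
    return np.fft.irfft(cleaned, n=frame_len, axis=1).reshape(-1)
```

A practical implementation would normally use overlapping windowed frames to avoid audible artifacts at frame boundaries; the sketch omits this for brevity.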
Step S307: performing feature extraction on the de-noised voice to obtain a first sound feature; calling a corresponding human voice sample database based on the first voice characteristics;
The first sound characteristic is a sound characteristic that distinguishes the speakers to whom speech belongs, and comprises timbre, loudness, pitch and the like.
Step S308: matching the denoised voice with the called sample voice in the human voice sample database to obtain clean voice;
The matching method comprises the following steps: performing secondary feature extraction on the denoised voice to extract a second sound feature, calculating the similarity between this second sound feature and the second sound feature of each sample voice, and taking the sample voice with the maximum similarity as the clean voice. The second sound feature includes the short-time energy spectrum, formant frequency, amplitude spectrum, etc.
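The maximum-similarity selection in step S308 can be sketched as follows. The source does not name the similarity measure, so cosine similarity is assumed here; the function name is hypothetical, and the feature vectors are assumed to be non-zero and of equal length.

```python
import numpy as np

def best_matching_sample(denoised_features, sample_features):
    """Return (index, similarity) of the stored sample most similar to the denoised voice.

    denoised_features: 1-D second sound feature vector of the denoised voice.
    sample_features:   2-D array, one second sound feature vector per sample voice.
    """
    query = np.asarray(denoised_features, dtype=float)
    samples = np.asarray(sample_features, dtype=float)

    # Cosine similarity between the query and every stored sample.
    sims = samples @ query / (np.linalg.norm(samples, axis=1) * np.linalg.norm(query))

    best = int(np.argmax(sims))
    return best, float(sims[best])
```

The returned index identifies which sample voice in the called human voice sample database would be output as the clean voice.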
Step S309: outputting the clean voice;
The clean voice is played to another worker through a voice broadcast device, so that two workers can communicate and cooperate in the high-noise environment; this prevents instructions or other information from being conveyed incorrectly under high-noise interference and causing unexpected loss or accidents.
If the denoising effect is not a concern, this embodiment also admits another feasible scheme: an existing denoising method is used directly, and the denoised voice is then matched directly with the sample voices in the human voice sample database to obtain the clean voice.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method of assisting hearing in a high noise environment, comprising:
acquiring noise information in an environment, and establishing a noise sample database;
acquiring a plurality of voice information, and establishing a voice sample database;
acquiring voice information communicated by workers in a high-noise environment;
processing the voice information based on a noise sample database and a human voice sample database to obtain clean voice;
and outputting the clean voice.
2. The method for assisting hearing in a high noise environment according to claim 1, wherein the acquiring a plurality of voice information and establishing a voice sample database comprises:
performing analog-to-digital conversion on the voice information to obtain digital signals of the voice information;
processing the digital signals by utilizing a fast Fourier transform technology to obtain a plurality of frequency spectrums of the voice information;
obtaining the voice frequency information of each time point according to the frequency spectrum;
and establishing a voice sample database according to the voice frequency information of each time point.
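The steps of claim 2 can be illustrated with a short Python sketch that takes already-digitized samples, applies an FFT to successive frames, and records one frequency per time point. Treating "voice frequency information" as the dominant frequency of each frame is an assumption, as are the function name and the frame length.

```python
import numpy as np

def dominant_frequency_per_frame(samples, sample_rate, frame_len=512):
    """Return (times, frequencies): the strongest frequency in each frame.

    Assumption: non-overlapping frames; trailing samples that do not fill a
    whole frame are dropped.
    """
    samples = np.asarray(samples, dtype=float)
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)

    # One magnitude spectrum per frame.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

    # Start time of each frame and the frequency of its largest bin.
    times = np.arange(n_frames) * frame_len / sample_rate
    return times, freqs[np.argmax(magnitudes, axis=1)]
```

The resulting (time, frequency) pairs are the kind of records from which a voice sample database could be built.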
3. The method for assisting hearing in a high noise environment according to claim 2, wherein the processing the digital signal using a fast fourier transform technique to obtain a plurality of frequency spectrums of the vocal information comprises:
the frequency spectrum of several vocal information is calculated by FFT:
Figure FDA0002743701250000011
wherein ,
Figure FDA0002743701250000021
e is the base of the natural logarithm, p = 0, 1, …, M-1, and x(n) is an N-point sequence;
Figure FDA0002743701250000022
Figure FDA0002743701250000023
wherein T_p(θ) is a value in the frequency spectrum of the voice information calculated by the FFT,
Figure FDA0002743701250000024
is a positive integer, and 0 ≤ θ ≤ α-1.
4. The method of assisting in hearing in a high noise environment according to claim 1, wherein the processing the speech information based on a noise sample database and a human voice sample database to obtain clean speech comprises:
obtaining a noise frequency threshold according to the noise sample database;
according to the noise frequency threshold, performing first processing on the voice information to obtain first filtered voice information; the first processing is filtering frequency signals in the voice information, wherein the frequency signals are higher than the noise frequency threshold;
and matching the first-time filtered voice information with the voice sample database, and filtering frequency signals with a difference larger than a preset difference value from a preset mean value in the voice sample database in the first-time filtered voice information to obtain clean voice information.
5. The method of assisting a hearing in a high noise environment according to claim 4, wherein said deriving a noise frequency threshold from the noise sample database comprises:
calculating a noise frequency threshold:
Figure FDA0002743701250000031
where v is the noise frequency threshold, F_i is the range of the sample frequency information in the noise sample database, N is the number of samples in the noise sample database, π is the ratio of a circle's circumference to its diameter, k is the stiffness coefficient, and m is the mass.
6. The method for assisting a hearing in a high noise environment of claim 5,
the noise frequency threshold is determined according to a high noise environment;
the high noise environment is determined by noise information in the high noise environment acquired within a preset time; wherein,
the high noise environment includes: traffic noise and industrial noise.
7. The method of assisting a hearing in a high noise environment according to claim 6, wherein the high noise environment is determined by noise information in the high noise environment acquired within a preset time, and the method comprises:
acquiring noise information in a high-noise environment within a preset time, and obtaining a digital signal of the voice information through analog-to-digital conversion;
obtaining a noise frequency waveform according to the digital signal, and filtering an isolated waveform in the noise frequency waveform to obtain a section of continuous noise frequency waveform;
taking out the maximum value in the frequency range of the continuous noise waveform, and comparing the maximum value with a noise frequency threshold value in the noise sample database to obtain the most similar noise frequency threshold value;
determining a high noise environment according to the most similar noise frequency threshold;
wherein the noise frequency threshold corresponds to the high noise environment one to one.
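The procedure of claim 7 can be sketched in Python as below. The sketch assumes the noise frequency waveform is given as one frequency value per frame, with 0 where no noise is present, and that an "isolated waveform" is a run of fewer than `min_run` consecutive non-zero frames; these interpretations, the function names and the parameter `min_run` are all assumptions.

```python
def longest_continuous_run(values, min_run=3):
    """Return the longest run of consecutive positive values, or [] if shorter than min_run."""
    best, current = [], []
    for v in values:
        if v > 0:
            current.append(v)
        else:
            if len(current) > len(best):
                best = current
            current = []
    if len(current) > len(best):
        best = current
    return best if len(best) >= min_run else []

def classify_environment(frame_frequencies, thresholds, min_run=3):
    """Pick the environment whose noise frequency threshold is closest to the peak frequency.

    thresholds: dict mapping environment name -> noise frequency threshold (Hz),
    one threshold per environment.
    """
    run = longest_continuous_run(frame_frequencies, min_run)
    if not run:
        return None  # only isolated waveforms were found
    peak = max(run)  # maximum value in the continuous noise frequency waveform
    return min(thresholds, key=lambda env: abs(thresholds[env] - peak))
```

Because each threshold corresponds to exactly one environment, returning the nearest threshold's key is enough to identify the high-noise environment.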
8. An apparatus for assisting hearing in a high noise environment, comprising:
the acquisition module is used for acquiring noise information, a plurality of voice information and voice information in the environment;
the creating module is used for creating a noise sample database according to the noise information acquired in the acquiring module to obtain a noise frequency threshold value, and creating a voice sample database according to the voice information acquired in the acquiring module;
the comparison module is used for comparing the voice information in the environment with a noise frequency threshold value and determining the frequency information of which the voice information frequency is greater than the noise frequency threshold value in the environment;
the filtering module is used for filtering frequency information of which the frequency is greater than a noise frequency threshold value in the environment by using a filter to obtain first-time filtered voice information;
the matching module is used for matching the first-time filtered voice information with the voice sample database, and filtering frequency signals with difference larger than a preset difference value from a preset mean value in the voice sample database in the first-time filtered voice information to obtain clean voice;
and the transmission module is used for synthesizing the clean voice into voice segments and transmitting the voice segments to a receiver.
9. The device for assisting hearing in a high noise environment of claim 8, wherein the creating module performs the following operations:
according to the acquired noise information, establishing an industrial noise sample database, a traffic noise sample database and a mixed noise sample database;
establishing at least 3 types of human voice sample databases according to the acquired human voice information; wherein the 3-type human voice sample database comprises: a voice sample database for the adult male, a voice sample database for the adult female and a voice sample database for the elderly male.
10. The device for assisting hearing in a high noise environment of claim 8, wherein the matching module performs the following operations:
if multi-user voice exists in the voice information of the second filtering, multi-voice sections in the clean voice are separated into single voice sections by utilizing a multi-user voice separation technology in the voice separation technology.
CN202011159182.9A 2020-10-26 2020-10-26 Method and device for assisting hearing in high-noise environment Active CN112331225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011159182.9A CN112331225B (en) 2020-10-26 2020-10-26 Method and device for assisting hearing in high-noise environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011159182.9A CN112331225B (en) 2020-10-26 2020-10-26 Method and device for assisting hearing in high-noise environment

Publications (2)

Publication Number Publication Date
CN112331225A true CN112331225A (en) 2021-02-05
CN112331225B CN112331225B (en) 2023-09-26

Family

ID=74312377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011159182.9A Active CN112331225B (en) 2020-10-26 2020-10-26 Method and device for assisting hearing in high-noise environment

Country Status (1)

Country Link
CN (1) CN112331225B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0643892A (en) * 1992-02-18 1994-02-18 Matsushita Electric Ind Co Ltd Voice recognition method
JP2005284016A (en) * 2004-03-30 2005-10-13 Iwatsu Electric Co Ltd Method for inferring noise of speech signal and noise-removing device using the same
CN102664006A (en) * 2012-04-14 2012-09-12 中国人民解放军国防科学技术大学 Abnormal voice detecting method based on time-domain and frequency-domain analysis
CN105635453A (en) * 2015-12-28 2016-06-01 上海博泰悦臻网络技术服务有限公司 Conversation volume automatic adjusting method and system, vehicle-mounted device, and automobile
CN105723458A (en) * 2013-09-12 2016-06-29 沙特阿拉伯石油公司 Dynamic threshold methods, systems, computer readable media, and program code for filtering noise and restoring attenuated high-frequency components of acoustic signals
CN105989836A (en) * 2015-03-06 2016-10-05 腾讯科技(深圳)有限公司 Voice acquisition method, device and terminal equipment
CN107785028A (en) * 2016-08-25 2018-03-09 上海英波声学工程技术股份有限公司 Voice de-noising method and device based on signal autocorrelation
CN107945815A (en) * 2017-11-27 2018-04-20 歌尔科技有限公司 Voice signal noise-reduction method and equipment
CN111161753A (en) * 2020-01-03 2020-05-15 上海交通大学 Safe voice interaction method and system based on intelligent terminal
CN111754968A (en) * 2020-06-15 2020-10-09 中科上声(苏州)电子有限公司 Wind noise control method and device for vehicle

Also Published As

Publication number Publication date
CN112331225B (en) 2023-09-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant