WO2023226193A1 - Audio processing method and apparatus, and non-transitory computer-readable storage medium - Google Patents
- Publication number
- WO2023226193A1 (PCT/CN2022/110275)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio signal
- audio
- time
- processing method
- encoding
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/178—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
- G10K11/1781—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase characterised by the analysis of input or output signals, e.g. frequency range, modes, transfer functions
- G10K11/1785—Methods, e.g. algorithms; Devices
Definitions
- Embodiments of the present disclosure relate to an audio processing method, an audio processing device, and a non-transitory computer-readable storage medium.
- Noise reduction methods mainly include active noise reduction and passive noise reduction.
- Active noise reduction uses a noise reduction system to generate an inverted signal that is equal in magnitude to the external noise in order to cancel it, thereby achieving the noise reduction effect.
- Passive noise reduction mainly achieves the noise reduction effect by forming a closed space around the object or using sound insulation materials to block external noise.
- Active noise reduction usually uses lagging inverted audio to destructively superimpose the originally received audio (for example, noise) to achieve the effect of audio suppression.
- An active noise reduction process is as follows: first, the audio Vn generated by the sound source is received through the microphone and sent to the processor. The processor then inverts the audio Vn to generate inverted audio Vn' and outputs the inverted audio Vn' to the speaker, which emits it.
- The human ear can receive the inverted audio Vn' and the audio Vn, and the two can be destructively superimposed to achieve the effect of suppressing the audio.
- The time at which the speaker outputs the inverted audio Vn' must lag behind the time at which the microphone originally received the audio Vn. Therefore, the time at which the human ear receives the inverted audio Vn' must also lag behind the time at which it receives the audio Vn, so the silencing effect is poor and may even be impossible to achieve.
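To make the lag problem concrete, the following minimal sketch (not taken from the disclosure) computes the mean squared residual when a sine tone is summed with an inverted copy that arrives late. The function name `residual_energy`, the 48 kHz sample rate, and the pure-tone noise model are illustrative assumptions.

```python
import math


def residual_energy(freq_hz, delay_s, sample_rate=48000, duration_s=0.01):
    """Mean squared residual of a sine tone plus a delayed, inverted copy.

    Assumption: the noise is a single pure tone and the anti-noise is a
    perfect inverted replica that is only late by `delay_s` seconds.
    """
    n = round(sample_rate * duration_s)
    total = 0.0
    for i in range(n):
        t = i / sample_rate
        noise = math.sin(2 * math.pi * freq_hz * t)
        # Inverted audio, emitted delay_s seconds after the original.
        anti = -math.sin(2 * math.pi * freq_hz * (t - delay_s))
        total += (noise + anti) ** 2
    return total / n


if __name__ == "__main__":
    for delay in (0.0, 0.0001, 0.0005):
        print(delay, residual_energy(1000.0, delay))
```

Under these assumptions the residual vanishes only when the delay is zero. For a 1 kHz tone, a delay of half a period (0.5 ms) makes the inverted copy reinforce the tone instead of cancelling it.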
- At least one embodiment of the present disclosure provides an audio processing method, which includes: generating a control instruction based on a first audio signal; generating a second audio signal based on the control instruction; and outputting the second audio signal to suppress a third audio signal, wherein the sum of the phase of the second audio signal and the phase of the third audio signal is less than a phase threshold, and the first audio signal appears earlier than the third audio signal.
- Outputting the second audio signal to suppress the third audio signal includes: determining, based on the control instruction, a first moment at which to output the second audio signal; and outputting the second audio signal at the first moment, wherein the third audio signal starts to appear from a second moment, and the absolute value of the time difference between the first moment and the second moment is less than a time threshold.
- The time difference between the first moment and the second moment is 0.
- Generating a control instruction based on a first audio signal includes: acquiring the first audio signal; processing the first audio signal to predict a fourth audio signal; and generating the control instruction based on the fourth audio signal.
- The second audio signal and/or the third audio signal and/or the fourth audio signal are periodic or intermittent time-domain signals.
- Processing the first audio signal to predict a fourth audio signal includes: generating a first audio feature encoding based on the first audio signal; querying a lookup table based on the first audio feature encoding to obtain a second audio feature encoding; and predicting the fourth audio signal based on the second audio feature encoding.
- the lookup table includes at least one first encoding field.
- the lookup table further includes at least one second encoding field, and multiple first encoding fields constitute one second encoding field.
- the second audio feature encoding includes at least one of the first encoding field and/or at least one of the second encoding field.
- Obtaining the first audio signal includes: collecting an initial audio signal; and downsampling the initial audio signal to obtain the first audio signal.
- Obtaining the first audio signal includes: collecting an initial audio signal; and filtering the initial audio signal to obtain the first audio signal.
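The two ways of deriving the first audio signal from the initial audio signal can be sketched as follows. The disclosure does not specify the filter type or the downsampling factor, so the moving-average filter, the plain decimation, and the names `moving_average` and `downsample` are assumptions chosen only for illustration.

```python
def moving_average(samples, window):
    """Simple low-pass filter: average over a sliding window (assumed filter type)."""
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


def downsample(samples, factor):
    """Keep every `factor`-th sample (plain decimation, no anti-alias filter)."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return samples[::factor]


if __name__ == "__main__":
    initial = [0.0, 2.0, 4.0, 2.0, 0.0, 2.0]
    print(moving_average(initial, 2))
    print(downsample(initial, 2))
```

In practice a downsampler is usually preceded by a low-pass filter to limit aliasing; whether the disclosed embodiments combine the two steps is not stated here.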
- the phase of the second audio signal is opposite to the phase of the third audio signal.
- At least one embodiment of the present disclosure also provides an audio processing device, including: an instruction generation module configured to generate a control instruction based on a first audio signal; an audio generation module configured to generate a second audio signal based on the control instruction; and an output module configured to output the second audio signal to suppress a third audio signal, wherein the sum of the phase of the second audio signal and the phase of the third audio signal is less than a phase threshold, and the first audio signal appears earlier than the third audio signal.
- The output module includes a time determination sub-module and an output sub-module. The time determination sub-module is configured to determine, based on the control instruction, the first moment at which to output the second audio signal; the output sub-module is configured to output the second audio signal at the first moment, wherein the third audio signal begins to appear from the second moment, and the absolute value of the time difference between the first moment and the second moment is less than the time threshold.
- the time difference between the first time and the second time is 0.
- the instruction generation module includes an audio acquisition sub-module, a prediction sub-module and a generation sub-module, and the audio acquisition sub-module is configured to acquire the first audio signal; the prediction sub-module is configured to process the first audio signal to predict a fourth audio signal; the generation sub-module is configured to generate the control instruction based on the fourth audio signal.
- The second audio signal and/or the third audio signal and/or the fourth audio signal are periodic or intermittent time-domain signals.
- The prediction sub-module includes a query unit and a prediction unit. The query unit is configured to generate a first audio feature encoding based on the first audio signal and to query a lookup table based on the first audio feature encoding to obtain a second audio feature encoding; the prediction unit is configured to predict the fourth audio signal based on the second audio feature encoding.
- the lookup table includes at least one first encoding field.
- the lookup table further includes at least one second encoding field, and multiple first encoding fields constitute one second encoding field.
- the second audio feature encoding includes at least one of the first encoding field and/or at least one of the second encoding field.
- The audio acquisition sub-module includes a collection unit and a down-sampling processing unit. The collection unit is configured to collect an initial audio signal; the down-sampling processing unit is configured to downsample the initial audio signal to obtain the first audio signal.
- The audio acquisition sub-module includes an acquisition unit and a filtering unit. The acquisition unit is configured to acquire an initial audio signal; the filtering unit is configured to filter the initial audio signal to obtain the first audio signal.
- the phase of the second audio signal is opposite to the phase of the third audio signal.
- At least one embodiment of the present disclosure also provides an audio processing device, including: one or more memories non-transiently storing computer-executable instructions; and one or more processors configured to run the computer-executable instructions, wherein the computer-executable instructions, when run by the one or more processors, implement the audio processing method according to any embodiment of the present disclosure.
- At least one embodiment of the present disclosure also provides a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions, when executed by a processor, implement an audio processing method according to any embodiment of the present disclosure.
- By learning the characteristics of the current audio signal (i.e., the first audio signal), a future inverted audio signal (i.e., the second audio signal) is generated to suppress the future audio signal (i.e., the third audio signal). This avoids the inverted audio signal falling out of synchronization with the audio signal to be suppressed because of the delay between the input end and the output end, improves the noise cancellation effect, and can significantly reduce or even eliminate the impact of input-to-output delay on noise cancellation. The audio suppression effect is better than that of the lagging (backward) active noise cancellation systems commonly used in the industry.
- Figure 1 is a schematic block diagram of an audio processing system provided by at least one embodiment of the present disclosure
- Figure 2A is a schematic flow chart of an audio processing method provided by at least one embodiment of the present disclosure
- Figure 2B is a schematic flow chart of step S10 shown in Figure 2A;
- FIG. 2C is a schematic flow chart of step S102 shown in Figure 2B;
- Figure 3 is a schematic diagram of a first audio signal and a third audio signal provided by at least one embodiment of the present disclosure
- Figure 4 is a schematic diagram of a third audio signal and a fourth audio signal provided by at least one embodiment of the present disclosure
- Figure 5A is a schematic diagram of an audio signal provided by some embodiments of the present disclosure.
- Figure 5B is an enlarged schematic diagram of the audio signal in the dotted rectangular frame P1 in Figure 5A;
- Figure 6 is a schematic block diagram of an audio processing device provided by at least one embodiment of the present disclosure.
- Figure 7 is a schematic block diagram of another audio processing device provided by at least one embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure.
- At least one embodiment of the present disclosure provides an audio processing method.
- the audio processing method includes: generating a control instruction based on the first audio signal; generating a second audio signal based on the control instruction; and outputting the second audio signal to suppress the third audio signal.
- the sum of the phases of the second audio signal and the phase of the third audio signal is less than the phase threshold, and the first audio signal appears earlier than the third audio signal.
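- By learning the characteristics of the current audio signal (i.e., the first audio signal), a future inverted audio signal (i.e., the second audio signal) is generated to suppress the future audio signal (i.e., the third audio signal).
- This reduces or even eliminates the effect of the input-to-output delay on noise reduction, and the resulting suppression is better than that of the backward active noise reduction systems commonly used in the industry.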
- Embodiments of the present disclosure also provide an audio processing device and a non-transitory computer-readable storage medium.
- the audio processing method can be applied to the audio processing device provided by the embodiment of the present disclosure, and the audio processing device can be configured on an electronic device.
- the electronic device may be a personal computer, a mobile terminal, a car headrest, etc.
- the mobile terminal may be a mobile phone, a headset, a tablet computer or other hardware devices.
- Figure 1 is a schematic block diagram of an audio processing system provided by at least one embodiment of the present disclosure.
- Figure 2A is a schematic flow chart of an audio processing method provided by at least one embodiment of the present disclosure.
- Figure 2B is a schematic flow chart of step S10 shown in Figure 2A.
- Figure 2C is a schematic flow chart of step S102 shown in Figure 2B.
- Figure 3 is a schematic diagram of a first audio signal and a third audio signal provided by at least one embodiment of the present disclosure.
- the audio processing system shown in Figure 1 can be used to implement the audio processing method provided by any embodiment of the present disclosure, for example, the audio processing method shown in Figure 2A.
- the audio processing system may include an audio receiving part, an audio processing part and an audio output part.
- the audio receiving part can receive the audio signal Sn1 emitted by the sound source at time tt1, and then transmit the audio signal Sn1 to the audio processing part.
- The audio processing part processes the audio signal Sn1 to predict the future inverted audio signal Sn2; the future inverted audio signal Sn2 is then output through the audio output part.
- the future inverted audio signal Sn2 may be used to suppress the future audio signal Sn3 generated by the sound source at time tt2 later than time tt1.
- The target object (e.g., a human ear) can receive the future inverted audio signal Sn2 and the future audio signal Sn3 at the same time, so that the future inverted audio signal Sn2 and the future audio signal Sn3 can be destructively superimposed, thereby achieving noise elimination.
- the audio receiving part may include a microphone, an amplifier (for example, a microphone amplifier), an analog to digital converter (ADC), a downsampler, etc.
- The audio processing part may include an AI engine and/or a digital signal processor (DSP), etc.
- The audio output part can include an upsampler, a digital-to-analog converter (DAC), an amplifier (for example, a speaker amplifier), a speaker, etc.
- The audio processing method includes steps S10 to S12.
- In step S10, a control instruction is generated based on the first audio signal.
- In step S11, a second audio signal is generated based on the control instruction.
- In step S12, the second audio signal is output to suppress the third audio signal.
- the first audio signal may be the audio signal Sn1 shown in FIG. 1
- the second audio signal may be the inverted audio signal Sn2 shown in FIG. 1
- The third audio signal may be the future audio signal Sn3 shown in FIG. 1.
- The audio receiving part can receive the first audio signal; the audio processing part can process the first audio signal to generate a control instruction and generate a second audio signal based on the control instruction; the audio output part can output the second audio signal, thereby suppressing the third audio signal.
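Steps S10 to S12 can be outlined as a small pipeline. This is a sketch under strong assumptions, not the disclosed implementation: the predictor is a naive stand-in that assumes the noise repeats with a known period, and the names `ControlInstruction`, `generate_control_instruction`, `generate_second_signal`, and `mix_at_listener` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ControlInstruction:
    predicted: List[float]  # predicted future signal (the "fourth audio signal")
    output_index: int       # sample index of the "first moment" for output


def generate_control_instruction(first_signal, period, start_index):
    """S10: predict the upcoming samples and decide when to output.

    Assumption: the signal is periodic with a known period, so the next
    period is predicted to equal the most recently observed one.
    """
    if period < 1 or period > len(first_signal):
        raise ValueError("period must be between 1 and len(first_signal)")
    return ControlInstruction(list(first_signal[-period:]), start_index)


def generate_second_signal(instruction):
    """S11: the second audio signal is the inverted prediction."""
    return [-x for x in instruction.predicted]


def mix_at_listener(second_signal, third_signal):
    """S12: what a listener receives when both signals arrive aligned."""
    return [a + b for a, b in zip(second_signal, third_signal)]


if __name__ == "__main__":
    pattern = [0.0, 0.5, 1.0, 0.5]
    first = pattern * 3          # observed earlier
    third = pattern              # arrives later
    instr = generate_control_instruction(first, period=4, start_index=len(first))
    print(mix_at_listener(generate_second_signal(instr), third))
```

When the prediction matches the third audio signal exactly and the output is aligned with its onset, the mixed result is zero at every sample, which corresponds to the complete-silencing case described in the text.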
- the first audio signal appears earlier than the third audio signal.
- the time when the first audio signal starts to appear is t11
- the time when the third audio signal starts to appear is t21.
- time t11 is earlier than time t21.
- the time period during which the first audio signal exists may be the time period between time t11 and time t12
- the time period during which the third audio signal exists may be the time period between time t21 and time t22.
- time t12 and time t21 may not be the same time, and time t12 is earlier than time t21.
- the time period in which the audio signal exists or the time in which it appears means the time period in which the audio corresponding to the audio signal exists or the time in which it appears.
- the sum of the phases of the second audio signal and the phase of the third audio signal is less than the phase threshold.
- the phase threshold can be set according to the actual situation, and this disclosure does not specifically limit this.
- the phase of the second audio signal is opposite to the phase of the third audio signal, so that complete silence can be achieved, that is, the third audio signal is completely suppressed.
- The error energy of the audio signal received by the audio collection device is 0; if the second audio signal and the third audio signal are received by the human ear, it is equivalent to the person not hearing any sound.
- The first audio signal may be the time-domain audio signal with the maximum volume (maximum amplitude) between time t11 and time t12. Because the first audio signal is not an audio signal of a specific frequency, the audio processing method provided by the embodiments of the present disclosure does not need to extract spectral features from the audio signal to generate a spectrogram, which simplifies audio signal processing and saves processing time.
- The first audio signal and the third audio signal may be audio signals generated by the external environment, machines, etc., such as the sound of machine operation or the sound of electric drills and electric saws during renovation work.
- machines may include household appliances (air conditioners, range hoods, washing machines, etc.) and the like.
- Step S10 may include steps S101 to S103.
- In step S101, a first audio signal is obtained; in step S102, the first audio signal is processed to predict a fourth audio signal; in step S103, a control instruction is generated based on the fourth audio signal.
- An audio signal that has not yet been generated (i.e., the fourth audio signal) is predicted by learning the characteristics of the current audio signal (i.e., the first audio signal).
- the fourth audio signal is a predicted future audio signal.
- The time period in which the fourth audio signal exists is later than the time period in which the first audio signal exists. For example, the time period in which the fourth audio signal exists is the same as the time period in which the third audio signal exists, so it may also be the time period between time t21 and time t22 shown in FIG. 3.
- Figure 4 is a schematic diagram of a third audio signal and a fourth audio signal provided by at least one embodiment of the present disclosure.
- the horizontal axis represents time (Time)
- the vertical axis represents amplitude (Amplitude)
- the amplitude can be expressed as a voltage value.
- the predicted fourth audio signal is substantially the same as the third audio signal.
- the third audio signal and the fourth audio signal may be exactly the same.
- The phase of the second audio signal finally generated based on the fourth audio signal is opposite to the phase of the third audio signal, thereby achieving complete silencing.
- processing the first audio signal to predict the fourth audio signal may include processing the first audio signal through a neural network to predict the fourth audio signal.
- neural networks may include recurrent neural networks, long short-term memory networks, or generative adversarial networks.
- The characteristics of the audio signal can be learned based on artificial intelligence, thereby predicting the audio signal of a future time period that has not yet occurred and generating an inverted audio signal for that time period to suppress the audio signal of that time period.
- Step S102 may include steps S1021 to S1023.
- In step S1021, a first audio feature encoding is generated based on the first audio signal; in step S1022, a lookup table is queried based on the first audio feature encoding to obtain a second audio feature encoding; in step S1023, the fourth audio signal is predicted based on the second audio feature encoding.
- the first audio signal may be an analog signal, and the first audio signal may be processed through an analog-to-digital converter to obtain a processed first audio signal.
- The processed first audio signal may be a digital signal, and a first audio feature encoding may be generated based on the processed first audio signal.
- The first audio signal may be a digital signal, such as a PDM (pulse-density modulation) signal.
- the first audio feature code may be generated directly based on the first audio signal.
- PDM signals can be represented by binary numbers 0 and 1.
- any suitable encoding method may be used to implement the first audio feature encoding.
- The changing state of the audio signal can be used to describe the audio signal, and multiple bits can be used to represent the changing state.
- For example, two bits can be used to represent the changing state of the audio signal.
- 00 means that the audio signal becomes larger.
- 01 means that the audio signal becomes smaller.
- 10 means that there is no audio signal.
- 11 means that the audio signal remains unchanged.
- "The audio signal becomes larger" means that the amplitude of the audio signal in the unit time period (each time step) becomes larger with time; "the audio signal becomes smaller" means that the amplitude of the audio signal in the unit time period becomes smaller with time; "the audio signal remains unchanged" means that the amplitude of the audio signal in the unit time period does not change with time; and "no audio signal" means that there is no audio signal in the unit time period, that is, the amplitude of the audio signal is 0.
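The two-bit scheme above can be sketched as a small encoder. The code values follow the text (00 larger, 01 smaller, 10 no signal, 11 unchanged); the function name `encode_change_states`, the choice to compare amplitudes at the boundaries of each unit time period, and the tolerance `eps` are assumptions made for illustration.

```python
RISING, FALLING, SILENT, STEADY = "00", "01", "10", "11"


def encode_change_states(amplitudes, eps=1e-9):
    """Return one 2-bit code per unit time period.

    `amplitudes` holds the amplitude at each boundary between unit time
    periods, so N + 1 values yield N codes (an assumed sampling convention).
    """
    codes = []
    for prev, curr in zip(amplitudes, amplitudes[1:]):
        if abs(prev) <= eps and abs(curr) <= eps:
            codes.append(SILENT)    # amplitude is 0 throughout the period
        elif curr - prev > eps:
            codes.append(RISING)    # amplitude becomes larger
        elif prev - curr > eps:
            codes.append(FALLING)   # amplitude becomes smaller
        else:
            codes.append(STEADY)    # nonzero amplitude, unchanged
    return codes


if __name__ == "__main__":
    print(encode_change_states([1.0, 1.0, 2.0, 3.0, 2.0, 0.0, 0.0]))
    # ['11', '00', '00', '01', '01', '10']
```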
- Figure 5A is a schematic diagram of an audio signal provided by some embodiments of the present disclosure.
- Figure 5B is an enlarged schematic diagram of the audio signal in the dotted rectangular box P1 in Figure 5A.
- The abscissa is time (ms, milliseconds), and the ordinate is the amplitude of the audio signal (V, volts).
- the audio signal V is a periodically changing signal, and the periodic pattern of the audio signal V is the pattern shown by the dotted rectangular frame P2.
- The amplitude of the audio signal represented by waveform segment 30 does not change with time t, and waveform segment 30 lasts one unit time period, so it can be expressed as the audio feature encoding (11). Similarly, the amplitude represented by waveform segment 31 gradually increases with time t over four unit time periods, so waveform segment 31 can be expressed as the audio feature encoding (00,00,00,00). The amplitude represented by waveform segment 32 remains unchanged with time t for one unit time period, so waveform segment 32 can be expressed as the audio feature encoding (11). The amplitude represented by waveform segment 33 gradually decreases with time t; the time corresponding to waveform segment 33 is six unit time periods, and waveform segment 33 can be expressed as the audio feature encoding (01,01,01,01,01). The amplitude of the audio signal
- The audio feature encoding corresponding to the audio signal shown in Figure 5B can be expressed as {11,00,00,00,00,11,01,01,01,01,01,11,00,00,00,00,00,00,00,00,01,01,01,01,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,00,00,00,00,...}.
- a lookup table includes at least one first code field.
- the lookup table further includes at least one second encoding field, and multiple first encoding fields constitute a second encoding field, so that dimensionally reduced high-order features can be formed from combinations of low-level features.
- the coding method of the coding field (codeword, for example, the codeword may include a first coding field and a second coding field) in the lookup table may be the same as the coding method of the above-mentioned first audio feature coding.
- When two bits are used to represent the changing state of the audio signal to implement feature encoding, the first encoding field may be one of 00, 01, 10, and 11, and 00, 01, 10, and 11 can be combined to form the second encoding field.
- A second encoding field may be represented as {00,00,00,01,01,01,11,11,01,...}, which is composed of a combination of 00, 01 and 11.
- When the lookup table includes a plurality of second encoding fields, the number of first encoding fields included in each of the plurality of second encoding fields may be different.
- When more bits are used to represent the changing state of the audio signal, there can be more types of first encoding field. For example, when 3 bits are used, there can be up to eight types of first encoding field, which can be some or all of 000, 001, 010, 011, 100, 101, 110 and 111.
- One or more second encoding fields can also be combined to obtain a third encoding field, or one or more second encoding fields and one or more first encoding fields can be combined to obtain a third encoding field. Similarly, one or more third encoding fields may be combined with each other, or with first encoding fields and/or second encoding fields, to obtain a higher-order encoding field.
- low-order feature codes can be combined to obtain high-order feature codes, thereby achieving more efficient and longer predictions.
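One way to picture this hierarchy is a table in which higher-order fields are defined as sequences of lower-order fields and can be expanded back into first encoding fields. The table layout, the field names `UP3`, `DOWN2`, and `PULSE`, and the function `expand` are hypothetical and serve only to illustrate the composition idea.

```python
FIRST_FIELDS = {"00", "01", "10", "11"}


def expand(field, table):
    """Recursively expand a field into its sequence of first encoding fields."""
    if field in FIRST_FIELDS:
        return [field]
    codes = []
    for part in table[field]:   # KeyError if the field is not defined
        codes.extend(expand(part, table))
    return codes


if __name__ == "__main__":
    table = {
        "UP3": ["00", "00", "00"],        # second-order: three rising steps
        "DOWN2": ["01", "01"],            # second-order: two falling steps
        "PULSE": ["11", "UP3", "DOWN2"],  # third-order: mixes first and second order
    }
    print(expand("PULSE", table))
    # ['11', '00', '00', '00', '01', '01']
```

A single higher-order match then stands for many unit time periods at once, which is how combining low-order codes can extend the prediction horizon.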
- the second audio feature encoding includes at least one first encoding field and/or at least one second encoding field.
- the second audio feature encoding may include one or more complete second encoding fields, or the second audio feature encoding may include part of the first encoding field in one second encoding field.
- the second audio feature encoding may include at least one first encoding field and/or at least one second encoding field and/or at least one third encoding field.
- W2 {11,01,00,00,01,01,01,01,01,01,01,01,...}
- W3 {11,00,01,00,00,01,01,01,11,00,00,00,01,01,01,01,01,01,01,01,01,...}.
- the audio collection device continues to collect the first audio signal.
- when the first feature encoding field corresponding to the first audio signal collected by the audio collection device is expressed as {11}, corresponding to waveform segment 30, a query is performed on the lookup table to determine whether any encoding field in the lookup table (including the first encoding fields and the second encoding fields) includes {11}.
- the query finds that the second encoding field W1, the second encoding field W2, and the second encoding field W3 in the lookup table all include {11}.
- the second encoding field W1, the second encoding field W2, and the second encoding field W3 are all placed in the list of encoding fields to be output.
- the second encoding field W2 can be deleted from the list of encoding fields to be output.
- the second encoding field W1 and the second encoding field W3 remain as the encoding fields to be output in the list of encoding fields to be output.
- when the third feature encoding field corresponding to the first audio signal collected by the audio collection device is represented as {00}, corresponding to the second unit time period in waveform segment 31, the lookup table is queried again to determine whether any encoding field in the lookup table includes {11,00,00}.
- the query finds that the second encoding field W1 in the lookup table includes {11,00,00}. It can then be predicted that the next audio signal should follow the pattern of the second encoding field W1.
- the fields of the second coding field W1 from the fourth field (i.e., {00}) onward can be output as the predicted second audio feature encoding.
- the second audio feature coding is expressed as ⁇ 00,00,11,01,01,01,01,01,01 ,01 ,11 ,00,00,00,00,00,00,01,01,01,01,01,01,01,01,01,01,11,00,00,00 ,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,.--- ⁇ .
- how many feature coding fields must be matched before the second audio feature coding is determined can be adjusted according to the actual application scenario, design requirements and other factors. For example, in the above example, the second audio feature coding can be determined once 3 feature coding fields are matched (in actual applications, 10, 20, 50, etc. feature coding fields may be required).
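The query procedure described above (narrowing the list of candidate fields as each new feature encoding field arrives, then outputting the remainder of the single surviving field) can be sketched as follows. This is a minimal sketch under assumptions: `predict` and `min_match` are hypothetical names, the contents of W1 are invented for illustration because W1 is not listed in this part of the text, and W2 and W3 are truncated to their first six fields.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Field = Tuple[str, ...]

LOOKUP: Dict[str, Field] = {
    # W1 is a hypothetical stand-in that starts with {11,00,00,00}.
    "W1": ("11", "00", "00", "00", "01", "01"),
    # W2 and W3 are the first six fields of the fields listed above.
    "W2": ("11", "01", "00", "00", "01", "01"),
    "W3": ("11", "00", "01", "00", "00", "01"),
}


def predict(observed: Sequence[str],
            table: Dict[str, Field],
            min_match: int = 3) -> Optional[Tuple[str, Field]]:
    """Return (name, remaining fields) once exactly one field still matches."""
    candidates: List[str] = list(table)
    for i, code in enumerate(observed):
        # Drop every candidate whose i-th field differs from the observation.
        candidates = [name for name in candidates
                      if len(table[name]) > i and table[name][i] == code]
        if not candidates:
            return None
        if i + 1 >= min_match and len(candidates) == 1:
            name = candidates[0]
            return name, table[name][i + 1:]
    return None  # not enough fields observed yet


result = predict(("11", "00", "00"), LOOKUP)
```

With these contents, W2 drops out after the second observed field and W3 after the third, which mirrors the narrowing described in the text; `min_match` plays the role of the adjustable number of matched fields.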
- the first audio feature code corresponding to the first audio signal includes 3 feature code fields and is represented as ⁇ 11,00,00 ⁇ .
- the time period corresponding to the first audio signal is from time t31 to time t32.
- the system actually needs to output the second audio signal at time t33, which is later than time t32.
- the time period corresponding to the first two feature coding fields {00,00} in the second audio feature coding (that is, the time period between time t32 and time t33) has passed, so the audio feature encoding corresponding to the predicted fourth audio signal is actually expressed as {11,01,01,01,01,01,01,11,00,00,00,00,00,00,01,01,01,01,01,01,01,01,01,01,11,11,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,...}.
- the audio feature code corresponding to the third audio signal is also expressed as {11,01,01,01,01,01,11,00,00,00,00,00,00,00,00,01,01,01,01,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,01,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,...}.
- the second audio signal is a signal obtained by inverting the fourth audio signal, that is, the second audio signal can be the inverted audio signal of the pattern {11,01,01,01,01,01,11,00,00,00,00,00,00,00,00,01,01,01,01,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,01,01,01,01,01,01,01,11,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,...}.
- the duration of the second audio signal, the duration of the third audio signal, and the duration of the fourth audio signal are substantially the same, e.g., identical.
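The step of discarding the predicted fields whose time has already passed between t32 and t33 can be sketched as follows. The function name, the use of integer milliseconds, and the assumption that every feature coding field covers one fixed unit time period are illustrative choices, not details stated in the disclosure.

```python
from typing import Sequence, Tuple


def drop_elapsed(predicted: Sequence[str],
                 t32_ms: int,
                 t33_ms: int,
                 unit_ms: int) -> Tuple[str, ...]:
    """Remove the fields that fall inside the processing delay t32..t33."""
    if unit_ms <= 0 or t33_ms < t32_ms:
        raise ValueError("need unit_ms > 0 and t33_ms >= t32_ms")
    delay_ms = t33_ms - t32_ms
    # Round up: a partly elapsed unit period can no longer be cancelled whole.
    elapsed_fields = (delay_ms + unit_ms - 1) // unit_ms
    return tuple(predicted[elapsed_fields:])


# Hypothetical numbers: 20 ms of delay with 10 ms per field drops two fields.
second_code = ("00", "00", "11", "01", "01")
fourth_code = drop_elapsed(second_code, t32_ms=100, t33_ms=120, unit_ms=10)
```

Integer timestamps are used here only to avoid floating-point rounding when counting unit periods.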
- the leading feature coding field may be set for at least part of the first coding field and/or the second coding field in the lookup table.
- the leading feature coding field {11,00,00} may be set for the second coding field W1; when the leading feature coding field is detected, the second coding field W1 is output as the second audio feature coding.
- the first audio feature code corresponding to the first audio signal is {11,00,00}
- the first audio feature code corresponding to the first audio signal matches the leading feature coding field {11,00,00}, so that the second encoding field W1 can be output as the second audio feature encoding.
- the leading feature coding field {11,00,00,01,01} can be set for the second coding field W1.
- the remaining fields in the leading feature coding field and the second coding field W1 are output as the second audio feature encoding.
- the first audio feature encoding corresponding to the first audio signal matches the first three fields {11,00,00} in the leading feature encoding field, so that the remaining fields {01,01} in the leading feature encoding field and the second encoding field W1 can be output as the second audio feature encoding.
- the time corresponding to the first two feature coding fields {01,01} in the second audio feature coding (that is, the remaining fields in the leading feature coding field) can be the time for the system to process the signal, so that the audio feature encoding corresponding to the predicted fourth audio signal may be the complete second encoding field W1.
- the length of the leading feature encoding field can be adjusted according to actual conditions, and this disclosure does not limit this.
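The leading-field variant can be sketched in the same style. Here `match_leading` is a hypothetical helper and the contents of W1 are invented for illustration; only the leading fields {11,00,00} and {11,00,00,01,01} come from the text.

```python
from typing import Optional, Sequence, Tuple

Field = Tuple[str, ...]

W1: Field = ("11", "00", "00", "00", "01", "01")   # hypothetical contents


def match_leading(observed: Sequence[str],
                  leading: Sequence[str],
                  body: Sequence[str]) -> Optional[Field]:
    """If `observed` is a prefix of `leading`, return the rest of the
    leading field followed by the associated encoding field."""
    n = len(observed)
    if n == 0 or n > len(leading) or tuple(leading[:n]) != tuple(observed):
        return None
    return tuple(leading[n:]) + tuple(body)


observed = ("11", "00", "00")
exact = match_leading(observed, ("11", "00", "00"), W1)
longer = match_leading(observed, ("11", "00", "00", "01", "01"), W1)
```

In the second case the two leftover leading fields {01,01} come out first, which is what lets them absorb the processing time before the complete W1 pattern begins.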
- for look-up tables, when the memory used to store the look-up table is large enough and the content stored in the look-up table is rich enough (that is, there are enough combinations of encoding fields in the look-up table), any type of audio signal that the user wants to eliminate can be eliminated.
- when the samples used to train the neural network are rich enough and the types of samples are rich enough, any type of audio signal that the user wants to eliminate can be predicted based on the neural network.
- the lookup table may be stored in the memory in the form of a table, etc.
- the embodiments of the present disclosure do not limit the specific form of the lookup table.
- predictions in neural networks can be achieved by looking up tables.
- the second audio signal and/or the third audio signal and/or the fourth audio signal are periodic or intermittent time domain signals
- the second audio signal and/or the third audio signal and/or the fourth audio signal have the characteristics of continuous repetition or intermittence repetition, and have a fixed pattern.
- for intermittent audio signals, since there is no audio signal during the pause period of the intermittent audio signal, there is no spectral feature to be extracted during the pause period, but the pause period can serve as one of the time-domain features of the intermittent audio signal.
- step S101 may include: collecting an initial audio signal; performing downsampling on the initial audio signal to obtain a first audio signal.
- when the sampling rate of the initial audio signal collected by the audio acquisition device is high, it is not conducive to processing by the back-end audio signal processing device (for example, an artificial intelligence (AI) engine, a digital signal processor (DSP), etc.); therefore, the initial audio signal can be down-sampled to reduce its sampling rate, which facilitates processing by the audio signal processing device.
- the sampling rate can be reduced to 48 kHz or even lower.
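A very rough sketch of integer-factor downsampling is shown below. The block average is an assumption made to keep the example self-contained; it is only a crude stand-in for the anti-aliasing low-pass filter that a practical decimator would normally apply, and the disclosure does not say which method is used.

```python
from typing import List, Sequence


def downsample(samples: Sequence[float], factor: int) -> List[float]:
    """Reduce the sampling rate by `factor` by averaging each block.

    Trailing samples that do not fill a whole block are discarded.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


# Hypothetical case: halving the rate (e.g. 96 kHz input to 48 kHz output).
initial = [1.0, 3.0, 5.0, 7.0, 9.0]
first_audio = downsample(initial, 2)
```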
- step S101 may include: collecting an initial audio signal; and filtering the initial audio signal to obtain a first audio signal.
- filtering can also be performed through a bandwidth controller (Bandwidth controller) to suppress audio signals within a specific frequency range.
- for continuous and intermittent audio signals (for example, knocking or dripping noise, etc.), the bandwidth controller sets the effective bandwidth of the first audio signal to the frequency range corresponding to the audio signal that needs to be suppressed, for example, 1 kHz to 6 kHz, thereby ensuring that users can still hear more important sounds. For example, when used in the automotive field, it must be ensured that the driver can hear horns, etc., to improve driving safety.
- obtaining the first audio signal may include: collecting an initial audio signal; filtering the initial audio signal to obtain an audio signal within a predetermined frequency range; and performing downsampling processing on the audio signal within the predetermined frequency range to obtain the first audio signal; alternatively, obtaining the first audio signal may include: collecting an initial audio signal; performing downsampling processing on the initial audio signal; and performing filtering processing on the downsampled audio signal to obtain the first audio signal.
- the control instruction may include the time at which the second audio signal is output, the fourth audio signal, a control signal instructing to invert the fourth audio signal, and the like.
- step S11 may include: based on the control instruction, determining a fourth audio signal and a control signal indicating inverting the fourth audio signal; and, based on the control signal, inverting the fourth audio signal to generate a second audio signal.
- step S12 may include: determining a first moment to output the second audio signal based on the control instruction; and outputting the second audio signal at the first moment.
- the third audio signal starts to appear from the second moment, and the absolute value of the time difference between the first moment and the second moment is less than the time threshold.
- the time threshold can be specifically set according to the actual situation, and this disclosure does not limit this. The smaller the time threshold, the better the silencing effect.
- the time difference between the first moment and the second moment is 0, that is, the moment when the second audio signal starts to be output and the moment when the third audio signal starts to appear are the same.
- the time when the second audio signal starts to be output and the time when the third audio signal starts to appear are both time t21.
- the time difference between the first moment and the second moment can be set according to the actual situation.
- the first moment and the second moment can be set to ensure that the second audio signal and the third audio signal are transmitted to the target object at the same time, thereby preventing the transmission of the audio signals from causing the second audio signal and the third audio signal to be out of sync, and further improving the noise canceling effect.
- the target object can be a human ear, a microphone, etc.
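One way to think about choosing the first moment is sketched below, assuming a single speaker at a known distance from the target object and sound travelling at roughly 343 m/s in air. The function names, the distance, and the idea of working backwards from the predicted arrival time of the third audio signal at the target are all illustrative assumptions rather than details given in the disclosure.

```python
SPEED_OF_SOUND_M_S = 343.0   # approximate value for air at about 20 degrees C


def output_moment(noise_arrival_s: float, speaker_to_target_m: float) -> float:
    """Moment at which to start output so that both sounds reach the
    target together (assumed geometry: one speaker, straight-line path)."""
    return noise_arrival_s - speaker_to_target_m / SPEED_OF_SOUND_M_S


def within_threshold(first_s: float, second_s: float, threshold_s: float) -> bool:
    """Check |first moment - second moment| < time threshold."""
    return abs(first_s - second_s) < threshold_s


# Hypothetical numbers: noise reaches the ear at t = 1.0 s, speaker 0.343 m away.
first_moment = output_moment(1.0, 0.343)
```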
- the second audio signal can be output through a device such as a speaker that can convert an electrical signal into a sound signal for output.
- when the audio collection device has not collected an audio signal, the audio processing method provided by the present disclosure need not be executed until the audio collection device collects an audio signal, thereby saving power consumption.
- the audio processing method can reduce or eliminate periodic audio signals (for example, noise) in environmental audio signals.
- the sound of construction at a nearby construction site, etc., can be eliminated.
- This type of scenario does not require special knowledge of the audio signals that are to be kept. It simply reduces the target sounds in the environment that need to be silenced.
- These target sounds to be silenced usually have the characteristics of continuous repetition or intermittent repetition, so they can be predicted.
- the "target sound to be silenced" can be determined according to the actual situation. For example, for an application scenario such as a library, when there is a construction site around the library, the external environment audio signal can include two audio signals.
- the first audio signal can be the sound of drilling at the construction site
- the second audio signal can be the sound of discussions by nearby people.
- the sound of construction site drilling has periodic characteristics and usually has a fixed pattern.
- the discussion sound most likely does not have a fixed pattern and does not have periodic characteristics.
- the target sound to be silenced is the construction site drilling sound.
- the audio processing method provided by embodiments of the present disclosure can be applied to automobile driver headrests to create a silent zone near the driver's ears, so that unnecessary external audio signals (such as engine noise, road noise, wind noise, tire noise and other noise signals produced while the car is driving) do not interfere with the driver.
- this audio processing method can also be applied to hair dryers, range hoods, vacuum cleaners, non-inverter air conditioners and other equipment to reduce the operating sound emitted by such equipment, allowing users to stay in noisy environments without being affected by the surrounding environmental noise.
- This audio processing method can also be applied to headphones, etc., to reduce or eliminate external sounds, so that users can better receive the sounds from the headphones (music or phone calls, etc.).
- FIG. 6 is a schematic block diagram of an audio processing device provided by at least one embodiment of the present disclosure.
- the audio processing device 600 includes an instruction generation module 601 , an audio generation module 602 and an output module 603 .
- the components and structures of the audio processing device 600 shown in FIG. 6 are only exemplary and not restrictive.
- the audio processing device 600 may also include other components and structures as needed.
- the instruction generation module 601 is configured to generate a control instruction based on the first audio signal.
- the instruction generation module 601 is used to execute step S10 shown in Figure 2A.
- the audio generation module 602 is configured to generate a second audio signal based on the control instruction.
- the audio generation module 602 is used to perform step S11 shown in Figure 2A.
- the output module 603 is configured to output the second audio signal to suppress the third audio signal.
- the output module 603 is used to perform step S12 shown in Figure 2A.
- for the functions of the instruction generation module 601, the audio generation module 602 and the output module 603, reference may be made to the descriptions of step S10, step S11 and step S12 shown in FIG. 2A, respectively, in the embodiments of the above audio processing method.
- the audio processing device can achieve similar or identical technical effects to the foregoing audio processing method, which will not be described again here.
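The division of work between the three modules can be sketched as three small functions. This is a sketch under assumptions: the `ControlInstruction` fields follow the items the text says a control instruction may include, while the predictor and the `play` callback are hypothetical placeholders for the prediction sub-module and the speaker.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ControlInstruction:
    output_time: float          # first moment, when output should start
    fourth_audio: List[float]   # predicted future audio signal
    invert: bool = True         # control signal: invert the fourth signal


def generate_instruction(first_audio: List[float],
                         predictor: Callable[[List[float]], List[float]],
                         output_time: float) -> ControlInstruction:
    """Step S10 (module 601): predict and wrap the result in an instruction."""
    return ControlInstruction(output_time, predictor(first_audio))


def generate_second_audio(instr: ControlInstruction) -> List[float]:
    """Step S11 (module 602): invert the predicted signal if instructed."""
    if instr.invert:
        return [-s for s in instr.fourth_audio]
    return list(instr.fourth_audio)


def output(instr: ControlInstruction,
           second_audio: List[float],
           play: Callable[[float, List[float]], None]) -> None:
    """Step S12 (module 603): hand the signal to the output device."""
    play(instr.output_time, second_audio)


played = []
# Trivial placeholder predictor: assume the pattern simply repeats.
instr = generate_instruction([0.5, -0.25], lambda x: list(x), output_time=0.02)
output(instr, generate_second_audio(instr), lambda t, s: played.append((t, s)))
```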
- the first audio signal appears earlier than the third audio signal.
- the sum of the phases of the second audio signal and the third audio signal is less than the phase threshold.
- the phase of the second audio signal is opposite to the phase of the third audio signal, so that the third audio signal can be completely suppressed.
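The effect of opposite phase, and of a timing error between the two signals, can be illustrated numerically. The tone frequency, sampling rate and delay below are arbitrary illustrative values, and a pure sine is used only because its residual is easy to reason about.

```python
import math
from typing import List, Sequence


def tone(freq_hz: float, rate_hz: int, n: int, delay_samples: int = 0) -> List[float]:
    """Generate n samples of a unit sine, optionally delayed."""
    return [math.sin(2.0 * math.pi * freq_hz * (i - delay_samples) / rate_hz)
            for i in range(n)]


def residual_peak(noise: Sequence[float], anti: Sequence[float]) -> float:
    """Largest absolute value of what remains after the two signals add."""
    return max(abs(a + b) for a, b in zip(noise, anti))


noise = tone(200.0, 48000, 480)                     # stands in for the third signal
aligned = [-s for s in noise]                       # inverted and perfectly timed
late = [-s for s in tone(200.0, 48000, 480, 12)]    # inverted but 12 samples late

peak_aligned = residual_peak(noise, aligned)
peak_late = residual_peak(noise, late)
```

The perfectly aligned inverse leaves no residual, while the delayed one leaves a residual of roughly 0.31 of the original amplitude, which illustrates why a smaller time difference gives better silencing.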
- the instruction generation module 601 may include an audio acquisition sub-module, a prediction sub-module and a generation sub-module.
- the audio acquisition sub-module is configured to acquire the first audio signal;
- the prediction sub-module is configured to process the first audio signal to predict a fourth audio signal;
- the generation sub-module is configured to generate a control instruction based on the fourth audio signal.
- the second audio signal and/or the third audio signal and/or the fourth audio signal are periodic or intermittent time domain signals.
- the third audio signal and the fourth audio signal may be exactly the same.
- the prediction sub-module may process the first audio signal based on a neural network to predict the fourth audio signal.
- the prediction sub-module may include the AI engine and/or digital signal processor in the audio processing part shown in Figure 1.
- the AI engine may include a neural network.
- the AI engine may include at least one neural network among a recurrent neural network, a long short-term memory network, a generative adversarial network, and the like.
- the prediction sub-module includes a query unit and a prediction unit.
- the query unit is configured to generate a first audio feature code based on the first audio signal and query the lookup table based on the first audio feature code to obtain a second audio feature code.
- the prediction unit is configured to predict the fourth audio signal based on the second audio feature encoding.
- the query unit may include a memory for storing the lookup table.
- the lookup table may include at least one first encoding field.
- the lookup table further includes at least one second encoding field, and multiple first encoding fields constitute one second encoding field.
- the second audio feature encoding includes at least one first encoding field and/or at least one second encoding field.
- the audio acquisition sub-module includes an acquisition unit and a downsampling processing unit.
- the acquisition unit is configured to collect the initial audio signal;
- the down-sampling processing unit is configured to perform down-sampling processing on the initial audio signal to obtain the first audio signal.
- the audio acquisition sub-module includes an acquisition unit and a filtering unit.
- the acquisition unit is configured to acquire an initial audio signal; and the filtering unit is configured to filter the initial audio signal to obtain a first audio signal.
- the audio acquisition sub-module can be implemented as the audio receiving part shown in Figure 1.
- the collection unit may include an audio collection device, such as a microphone in the audio receiving part shown in FIG. 1 , or the like.
- the acquisition unit may also include an amplifier, an analog-to-digital converter, etc.
- the output module 603 may include a moment determination sub-module and an output sub-module.
- the time determination sub-module is configured to determine a first time to output the second audio signal based on the control instruction; the output sub-module is configured to output the second audio signal at the first time.
- the output module 603 may be implemented as the audio output part shown in FIG. 1 .
- the third audio signal starts to appear from the second moment, and the absolute value of the time difference between the first moment and the second moment is less than the time threshold.
- the time difference between the first time and the second time may be zero.
- the output sub-module may include audio output devices such as speakers.
- the output sub-module may also include a digital-to-analog converter, etc.
- the instruction generation module 601, the audio generation module 602, and/or the output module 603 may be hardware, software, firmware, or any feasible combination thereof.
- the instruction generation module 601, the audio generation module 602 and/or the output module 603 can be a dedicated or general-purpose circuit, chip or device, or a combination of a processor and a memory.
- the embodiments of the present disclosure do not limit the specific implementation forms of each of the above modules, sub-modules and units.
- FIG. 7 is a schematic block diagram of another audio processing device provided by at least one embodiment of the present disclosure.
- the audio processing device 700 includes one or more memories 701 and one or more processors 702 .
- One or more memories 701 are configured to store non-transitory computer-executable instructions; one or more processors 702 are configured to execute the computer-executable instructions.
- the computer-executable instructions when executed by one or more processors 702, implement the audio processing method according to any of the above embodiments.
- for a detailed description of each step of the audio processing method, please refer to the description of the above embodiments of the audio processing method, which will not be repeated here.
- the audio processing device 700 may further include a communication interface and a communication bus.
- the memory 701, the processor 702 and the communication interface can communicate with each other through the communication bus, and the memory 701, the processor 702, the communication interface and other components can also communicate through a network connection. This disclosure does not limit the type and function of the network.
- the communication bus may be a Peripheral Component Interconnect Standard (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, or the like.
- the communication bus can be divided into address bus, data bus, control bus, etc.
- the communication interface is used to implement communication between the audio processing device 700 and other devices.
- the communication interface may be a Universal Serial Bus (USB) interface, etc.
- the processor 702 and the memory 701 can be provided on the server side (or cloud).
- processor 702 may control other components in audio processing device 700 to perform desired functions.
- the processor 702 may be a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the central processing unit (CPU) can be X86 or ARM architecture, etc.
- memory 701 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
- Volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), etc.
- Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disk read-only memory (CD-ROM), USB memory, flash memory, and the like.
- One or more computer-executable instructions may be stored on the computer-readable storage medium, and the processor 702 may execute the computer-executable instructions to implement various functions of the audio processing device 700 .
- Various applications and various data can also be stored in the storage medium.
- the audio processing device 700 may be embodied in the form of a chip, a small device, or the like.
- FIG. 8 is a schematic diagram of a non-transitory computer-readable storage medium provided by at least one embodiment of the present disclosure.
- one or more computer-executable instructions 1001 may be stored non-transitorily on a non-transitory computer-readable storage medium 1000.
- one or more steps in the audio processing method described above may be performed when the computer-executable instructions 1001 are executed by a processor.
- the non-transitory computer-readable storage medium 1000 can be applied in the above-mentioned audio processing device 700, and for example, it can include the memory 701 in the audio processing device 700.
- for a description of the non-transitory computer-readable storage medium 1000, reference may be made to the description of the memory 701 in the embodiment of the audio processing device 700 shown in FIG. 7, which will not be repeated here.
- At least one embodiment of the present disclosure provides an audio processing method, an audio processing device and a non-transitory computer-readable storage medium.
- in the audio processing method, an audio signal that has not yet been generated (i.e., the fourth audio signal) is predicted based on the first audio signal.
- a future inverted audio signal is generated from the predicted audio signal to suppress the future audio signal, which avoids the problem of the inverted audio signal being out of sync with the audio signal that needs to be suppressed due to the delay between the input end and the output end, and improves the noise reduction effect.
- the impact of the input-to-output delay on noise reduction can be significantly reduced or even eliminated, and the audio suppression effect is better than that of the backward active noise reduction systems commonly used in the industry; because the first audio signal is a time-domain signal rather than an audio signal of a specific frequency, the audio processing method provided by the embodiments of the present disclosure does not need to extract spectral features from the audio signal to generate a spectrogram, thereby simplifying the audio signal processing and saving processing time.
- low-order feature codes can be combined to obtain high-order feature codes, thereby achieving more efficient and longer predictions; in this audio processing method, filtering can also be performed through the bandwidth controller to suppress audio signals within a specific frequency range, so that users can still hear more important sounds (for example, when used in the automotive field, it must be ensured that the driver can hear horns, etc., to improve driving safety); in addition, when no audio signal is collected, the audio processing method provided by the present disclosure need not be executed until an audio signal is collected, thereby saving power consumption.
Abstract
Audio processing method, audio processing apparatus and non-transitory computer-readable storage medium. The audio processing method comprises: generating a control instruction according to a first audio signal; generating a second audio signal according to the control instruction; and outputting the second audio signal to suppress a third audio signal, wherein the sum of the phase of the second audio signal and the phase of the third audio signal is less than a phase threshold, and the moment at which the first audio signal appears is earlier than the moment at which the third audio signal appears.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/117526 WO2023226234A1 (fr) | 2022-05-23 | 2022-09-07 | Procédé et appareil d'apprentissage de modèle et support de stockage non transitoire lisible par ordinateur |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263344642P | 2022-05-23 | 2022-05-23 | |
US63/344,642 | 2022-05-23 | ||
US202263351439P | 2022-06-13 | 2022-06-13 | |
US63/351,439 | 2022-06-13 | ||
US202263352213P | 2022-06-14 | 2022-06-14 | |
US63/352,213 | 2022-06-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023226193A1 true WO2023226193A1 (fr) | 2023-11-30 |
Family
ID=83825587
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/110275 WO2023226193A1 (fr) | 2022-05-23 | 2022-08-04 | Procédé et appareil de traitement audio, et support de stockage lisible par ordinateur non transitoire |
PCT/CN2022/117526 WO2023226234A1 (fr) | 2022-05-23 | 2022-09-07 | Procédé et appareil d'apprentissage de modèle et support de stockage non transitoire lisible par ordinateur |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/117526 WO2023226234A1 (fr) | 2022-05-23 | 2022-09-07 | Procédé et appareil d'apprentissage de modèle et support de stockage non transitoire lisible par ordinateur |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN115294952A (fr) |
TW (2) | TWI837756B (fr) |
WO (2) | WO2023226193A1 (fr) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB0820653D0 (en) * | 2008-11-11 | 2008-12-17 | Isis Innovation | Acoustic noise reduction during magnetic resonance imaging |
CN102110438A (zh) * | 2010-12-15 | 2011-06-29 | 方正国际软件有限公司 | 一种基于语音的身份认证方法及系统 |
CN104900237A (zh) * | 2015-04-24 | 2015-09-09 | 上海聚力传媒技术有限公司 | 一种用于对音频信息进行降噪处理的方法、装置和系统 |
CN110970010A (zh) * | 2019-12-03 | 2020-04-07 | 广州酷狗计算机科技有限公司 | 噪音消除方法、装置、存储介质及设备 |
CN113470684A (zh) * | 2021-07-23 | 2021-10-01 | 平安科技(深圳)有限公司 | 音频降噪方法、装置、设备及存储介质 |
CN113903322A (zh) * | 2021-10-16 | 2022-01-07 | 艾普科模具材料(上海)有限公司 | 基于移动端的汽车主动降噪系统、方法及可编程逻辑器件 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9053697B2 (en) * | 2010-06-01 | 2015-06-09 | Qualcomm Incorporated | Systems, methods, devices, apparatus, and computer program products for audio equalization |
US9208771B2 (en) * | 2013-03-15 | 2015-12-08 | Cirrus Logic, Inc. | Ambient noise-based adaptation of secondary path adaptive response in noise-canceling personal audio devices |
US9449594B2 (en) * | 2013-09-17 | 2016-09-20 | Intel Corporation | Adaptive phase difference based noise reduction for automatic speech recognition (ASR) |
CN106328154B (zh) * | 2015-06-30 | 2019-09-17 | 芋头科技(杭州)有限公司 | 一种前端音频处理系统 |
CN109671440B (zh) * | 2019-01-09 | 2020-08-14 | 四川虹微技术有限公司 | 一种模拟音频失真方法、装置、服务器及存储介质 |
CN112889109B (zh) * | 2019-09-30 | 2023-09-29 | 深圳市韶音科技有限公司 | 使用子带降噪技术降噪的系统和方法 |
CN112235674B (zh) * | 2020-09-24 | 2022-11-04 | 头领科技(昆山)有限公司 | 一种基于噪声分析的主动降噪处理方法、系统及芯片 |
CN112634923B (zh) * | 2020-12-14 | 2021-11-19 | 广州智讯通信系统有限公司 | 基于指挥调度系统的音频回声消除方法、设备、存储介质 |
CN113707167A (zh) * | 2021-08-31 | 2021-11-26 | 北京地平线信息技术有限公司 | 残留回声抑制模型的训练方法和训练装置 |
-
2022
- 2022-08-04 CN CN202210931490.1A patent/CN115294952A/zh active Pending
- 2022-08-04 WO PCT/CN2022/110275 patent/WO2023226193A1/fr unknown
- 2022-08-04 TW TW111129321A patent/TWI837756B/zh active
- 2022-09-07 WO PCT/CN2022/117526 patent/WO2023226234A1/fr unknown
- 2022-09-07 TW TW111133851A patent/TW202347318A/zh unknown
Also Published As
Publication number | Publication date |
---|---|
WO2023226234A1 (fr) | 2023-11-30 |
TW202347318A (zh) | 2023-12-01 |
TW202347319A (zh) | 2023-12-01 |
CN115294952A (zh) | 2022-11-04 |
TWI837756B (zh) | 2024-04-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9294834B2 (en) | | Method and apparatus for reducing noise in voices of mobile terminal |
CN110164451B (zh) | | Speech recognition |
JP5085556B2 (ja) | | Echo cancellation configuration |
US6801889B2 (en) | | Time-domain noise suppression |
US8972251B2 (en) | | Generating a masking signal on an electronic device |
JP2011511571A (ja) | | Improving sound quality by intelligently selecting between signals from a plurality of microphones |
JP2004511823A (ja) | | Dynamically reconfigurable speech recognition system and method |
AU2017405291B2 (en) | | Method and apparatus for processing speech signal adaptive to noise environment |
US20200045166A1 (en) | | Acoustic signal processing device, acoustic signal processing method, and hands-free communication device |
CN106251856B (zh) | | Environmental noise cancellation system and method based on a mobile terminal |
WO2020002448A1 (fr) | | Adaptive comfort noise parameter determination |
CN112309416B (zh) | | In-vehicle voice echo cancellation method, system, vehicle, and storage medium |
WO2019239977A1 (fr) | | Echo suppression device, echo suppression method, and echo suppression program |
WO2023226193A1 (fr) | | Audio processing method and apparatus, and non-transitory computer-readable storage medium |
US10811029B1 (en) | | Cascade echo cancellation for asymmetric references |
CN116612778A (zh) | | Echo and noise suppression method, related apparatus, and medium |
CN109429125B (zh) | | Electronic device and control method for an earphone device |
CN115188390A (zh) | | Audio noise reduction method and related apparatus |
CN115457930A (zh) | | Model training method and apparatus, and non-transitory computer-readable storage medium |
DK1585362T3 (en) | | Method, system and computer program product for reducing audible side effects of dynamic power consumption |
CN117392994B (zh) | | Audio signal processing method, apparatus, device, and storage medium |
CN115278456B (en) | | Sound equipment and audio signal processing method |
CN115051991B (zh) | | Audio processing method, apparatus, storage medium, and electronic device |
WO2023220918A1 (fr) | | Audio signal processing method and apparatus, storage medium, and vehicle |
CN109841222B (zh) | | Audio communication method, communication device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22943384; Country of ref document: EP; Kind code of ref document: A1 |