US10643635B2 - Electronic device and method for filtering anti-voice interference - Google Patents

Electronic device and method for filtering anti-voice interference

Info

Publication number: US10643635B2
Application number: US15/665,965
Other versions: US20180350386A1 (en)
Authority: US (United States)
Prior art keywords: audio signal, background audio, sequence, interval, background
Inventor: Yen-Hsin Lin
Current assignee: Nanning Fulian Fugui Precision Industrial Co., Ltd.
Original assignee: Nanning Fugui Precision Industrial Co., Ltd.
Priority date: 2017-05-31
Filing date: 2017-08-01
Publication date: 2020-05-05
Legal status: Active, expires (adjusted) 2038-04-21
Assignment: NANNING FUGUI PRECISION INDUSTRIAL CO., LTD.; assignor: LIN, YEN-HSIN

Classifications

    • G — Physics
    • G10L — Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
    • G10L15/26 — Speech to text systems
    • G10L21/0208 — Speech enhancement: noise filtering
    • G10L21/0224 — Noise filtering characterised by the method used for estimating noise: processing in the time domain
    • G10L21/0232 — Noise filtering characterised by the method used for estimating noise: processing in the frequency domain
    • G10L21/0264 — Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • G10L25/21 — Speech or voice analysis techniques characterised by the type of extracted parameters, the extracted parameters being power information
    • G10L25/51 — Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/84 — Detection of presence or absence of voice signals for discriminating voice from noise


Abstract

An interference filtering method for the voice commands of a device user includes acquiring, through an audio acquisition unit of the device, a first audio signal from the environment that includes the user's voice, and acquiring a second audio signal from an audio output unit of the device that creates the competing noise. A first background audio signal is obtained by filtering a speech sound region in the first audio signal, and a second background audio signal is obtained by filtering the speech sound region in the second audio signal. A time difference T and a sound amplification parameter X are obtained by comparing the two background audio signals. A third audio signal is obtained by performing time compensation, amplification, and an inverting operation on the second audio signal. The first audio signal and the third audio signal are synthesized to produce a fourth audio signal for feeding to a voice recognition unit of the device.

Description

FIELD
The subject matter herein generally relates to device control technologies.
BACKGROUND
Electronic devices with a playback function (such as smart TV, computers, mobile phones, etc.) have various functions and complex options. Traditional control methods (such as remote control, touch control, mouse and keyboard control) cannot satisfy demands of users to conveniently operate the above electronic devices. Therefore, voice controls are developed.
However, voice commands can fail to control a target device, because the voice commands are seriously interfered with by noises, such as audio currently playing on the target device.
BRIEF DESCRIPTION OF THE DRAWINGS
Implementations of the present technology will now be described, by way of example only, with reference to the attached figures, wherein:
FIG. 1 is a diagram of an exemplary embodiment of an electronic device.
FIG. 2 is a block diagram of an exemplary embodiment of a filtering system for anti-voice interference.
FIG. 3 is a flowchart of an exemplary embodiment of a voice interference filtering method.
DETAILED DESCRIPTION
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the exemplary embodiments described herein. However, it will be understood by those of ordinary skill in the art that the exemplary embodiments described herein can be practiced without these specific details. In other instances, methods, procedures, and components have not been described in detail so as not to obscure the relevant features being described. Also, the description is not to be considered as limiting the scope of the exemplary embodiments described herein. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features of the present disclosure.
References to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean “at least one.”
In general, the word “module” as used hereinafter refers to logic embodied in hardware or firmware, or to a collection of software instructions written in a programming language such as Java, C, or assembly. One or more software instructions in the modules may be embedded in firmware, such as in an erasable programmable read-only memory (EPROM). The modules described herein may be implemented as software and/or hardware modules and may be stored in any type of non-transitory computer-readable medium or other storage device. Some non-limiting examples of non-transitory computer-readable media include CDs, DVDs, BLU-RAY discs, flash memory, and hard disk drives. The term “comprising”, when utilized, means “including, but not necessarily limited to”; it specifically indicates open-ended inclusion or membership in the so-described combination, group, series, and the like.
Referring to FIG. 1, an exemplary embodiment of an electronic device 2 includes an anti-voice interference filtering system 10, a memory 20, a processor 30, an audio acquisition unit 40, and an audio output unit 50. In the present embodiment, the electronic device 2 may be a smart appliance, a smart phone, a computer, or the like.
The memory 20 includes at least one type of readable storage medium including flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a random access memory (RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like. The processor 30 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or other data processing chip.
FIG. 2 shows an exemplary embodiment of the system 10. The system 10 includes an acquisition module 100, a filtering module 200, a comparison module 300, a modification module 400, and a synthesis module 500. The modules are configured to be executed by one or more processors (the processor 30 in this embodiment). The memory 20 is used to store data such as program code of the system 10. The processor 30 is used to execute the program code stored in the memory 20.
The acquisition module 100 acquires, through the audio acquisition unit 40, a first audio signal from the environment, the first audio signal including a user voice signal.
The acquisition module 100 also acquires a second audio signal output from the audio output unit 50. In an embodiment, the second audio signal is taken from inside the electronic device 2, not from the surrounding environment.
The filtering module 200 filters a speech sound region in the first audio signal to obtain a first background audio signal, and filters the speech sound region in the second audio signal to obtain a second background audio signal. In this embodiment, the speech sound region refers to a sound region corresponding to normal human voice frequencies, for example, the 80-1000 Hz region.
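The patent does not prescribe a particular filter for removing the speech sound region. As one possible realization, a band-stop filter over 80-1000 Hz can be applied to each signal; the following is a minimal Python sketch, assuming SciPy and a sampled signal at rate fs (function and parameter names are illustrative, not from the patent):

```python
# Minimal sketch of the speech-sound-region filtering step. A 4th-order
# Butterworth band-stop over 80-1000 Hz is one plausible realization;
# the patent does not fix the filter type or order.
import numpy as np
from scipy.signal import butter, filtfilt

def filter_speech_region(signal: np.ndarray, fs: float,
                         speech_band=(80.0, 1000.0)) -> np.ndarray:
    """Suppress the speech sound region, leaving the background audio."""
    nyq = fs / 2.0
    low, high = speech_band[0] / nyq, speech_band[1] / nyq
    b, a = butter(4, [low, high], btype='bandstop')
    # Zero-phase filtering so the background audio is not time-shifted,
    # which matters for the later time-difference comparison.
    return filtfilt(b, a, signal)
```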
The comparison module 300 compares the first background audio signal with the second background audio signal to obtain a time difference T and a sound amplification parameter X between the first background audio signal and the second background audio signal.
In this embodiment, the comparison module 300 samples the first background audio signal to extract a first eigenvalue sequence of a plurality of sampling points in the first background audio signal, and samples the second background audio signal to extract a second eigenvalue sequence of a plurality of sampling points in the second background audio signal.
A method of calculating the first eigenvalue sequence and the second eigenvalue sequence comprises:
Setting a fixed interval of length t as the time interval for calculating an energy value.
Continuously setting n fixed intervals of length t, starting from the same time point, in the first background audio signal and in the second background audio signal. In this embodiment, n = 10 is taken as an example.
Obtaining a first interval energy sequence, E1[10] = {E1_1, E1_2, …, E1_10}, by calculating the energy values of the 10 fixed intervals set in the first background audio signal. E1_1 is the energy value of the first fixed interval, E1_2 is the energy value of the second fixed interval, and so on.
Obtaining a second interval energy sequence, E2[10] = {E2_1, E2_2, …, E2_10}, by calculating the energy values of the 10 fixed intervals set in the second background audio signal. E2_1 is the energy value of the first fixed interval, E2_2 is the energy value of the second fixed interval, and so on.
For the first background audio signal and the second background audio signal, each energy value in the fixed interval is compared with the energy value in the next fixed interval to obtain a first eigenvalue sequence C1[m] and a second eigenvalue sequence C2[m].
The eigenvalues are calculated as follows:
$$C_m = \begin{cases} 1 & \text{if } E_{m+1}/E_m > 1.10 \\ 0 & \text{if } 0.90 \le E_{m+1}/E_m \le 1.10 \\ -1 & \text{if } E_{m+1}/E_m < 0.90 \end{cases}$$
wherein E_m is the energy value of the m-th fixed interval.
In this embodiment, the first eigenvalue sequence C1[9] and the second eigenvalue sequence C2[9] are calculated.
The comparison module 300 compares the first eigenvalue sequence C1[9] with the second eigenvalue sequence C2[9] to obtain a value k such that C1_{m+k} = C2_m. For example, if C1[9] = {0, 1, 0, −1, 1, 1, 1, 0, 0} and C2[9] = {0, −1, 1, 1, 1, 0, 0, 1, 0}, it can be seen that C1_3 = C2_1 = 0, C1_4 = C2_2 = −1, …, C1_9 = C2_7 = 0, so the value k is 2.
The time difference T is equal to the product of the interval length t and the value k.
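To make the comparison concrete, the following sketch computes the interval energies, the +1/0/−1 eigenvalue sequences, and the offset k under the embodiment's parameters (n = 10, thresholds 1.10 and 0.90). Mean power stands in for the patent's unspecified "energy value", and the helper names are illustrative:

```python
# Hedged sketch of the comparison step: interval energies, eigenvalue
# sequences, and the offset k with C1[m+k] = C2[m].
import numpy as np

def interval_energies(sig: np.ndarray, n: int = 10) -> np.ndarray:
    """Energy value of each of n equal fixed intervals (the sequence E[n]).
    Mean power is an assumed definition of 'energy value'."""
    return np.array([np.mean(chunk ** 2) for chunk in np.array_split(sig, n)])

def eigenvalue_sequence(E: np.ndarray) -> np.ndarray:
    """C_m = +1, 0, or -1 from the ratio E_{m+1} / E_m."""
    ratio = E[1:] / E[:-1]
    C = np.zeros(len(ratio), dtype=int)
    C[ratio > 1.10] = 1
    C[ratio < 0.90] = -1
    return C

def find_offset(C1: np.ndarray, C2: np.ndarray) -> int:
    """Smallest k such that C1[m+k] == C2[m] over the overlapping range."""
    for k in range(len(C1)):
        if np.array_equal(C1[k:], C2[:len(C1) - k]):
            return k
    raise ValueError("eigenvalue sequences do not align")

# The embodiment's example sequences give k = 2, hence T = 2 * t:
C1 = np.array([0, 1, 0, -1, 1, 1, 1, 0, 0])
C2 = np.array([0, -1, 1, 1, 1, 0, 0, 1, 0])
assert find_offset(C1, C2) == 2
```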
The comparison module 300 also calculates the sound amplification parameter X based on the value k.
The calculation of the sound amplification parameter X is as follows:
$$X = \frac{\sum_{n=k+2}^{10} E1_n}{\sum_{n=2}^{10-k} E2_n}$$
wherein E1_n is the energy value of the n-th fixed interval in the first background audio signal, and E2_n is the energy value of the n-th fixed interval in the second background audio signal.
In an embodiment, E1[10] = {3.7, 3.8, 6.0, 5.9, 3.8, 5.0, 5.6, 6.5, 7.1, 7.4}, E2[10] = {5.0, 4.9, 3.2, 4.2, 4.7, 5.4, 5.9, 6.2, 6.8, 7.3}, and k = 2.
$$X = \frac{\sum_{n=4}^{10} E1_n}{\sum_{n=2}^{8} E2_n}$$
At this time, the sound amplification parameter X=1.1971.
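As a worked check of the formula with the embodiment's example values (taking the fourth value of E2 as 4.2, an assumption that reproduces the stated X = 1.1971 exactly):

```python
# Worked check of the amplification-parameter formula, a 0-based translation
# of X = sum(E1_n, n = k+2..10) / sum(E2_n, n = 2..10-k).
import numpy as np

def amplification_parameter(E1: np.ndarray, E2: np.ndarray, k: int) -> float:
    n = len(E1)  # n = 10 in the embodiment
    # 1-based n = k+2..10 maps to 0-based indices k+1..n-1;
    # 1-based n = 2..10-k maps to 0-based indices 1..n-k-1.
    return float(np.sum(E1[k + 1:n]) / np.sum(E2[1:n - k]))

E1 = np.array([3.7, 3.8, 6.0, 5.9, 3.8, 5.0, 5.6, 6.5, 7.1, 7.4])
E2 = np.array([5.0, 4.9, 3.2, 4.2, 4.7, 5.4, 5.9, 6.2, 6.8, 7.3])
print(round(amplification_parameter(E1, E2, k=2), 4))  # -> 1.1971
```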
The modification module 400 performs a time compensation operation, an amplification operation, and an inverting operation on the second audio signal, to obtain a third audio signal. The third audio signal is calculated as:
$$S_3(t) = -X \cdot S_2(t - T)$$
wherein S_3(t) is the third audio signal and S_2(t) is the second audio signal.
The synthesis module 500 synthesizes the first audio signal and the third audio signal to obtain a fourth audio signal.
$$S_4(t) = S_1(t) + S_3(t)$$
wherein S_4(t) is the fourth audio signal, S_1(t) is the first audio signal, and S_3(t) is the third audio signal. In an embodiment, the fourth audio signal is the user voice from which the background noise has been filtered, and the fourth audio signal can be directly input to a voice recognition system of the electronic device 2.
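A minimal sketch of the last two steps together, assuming discrete-time signals at a common sample rate fs; it applies the delay T, the gain X, the inversion, and the summation as in the two formulas above (names are illustrative):

```python
# Sketch of the modification and synthesis steps:
# S3(t) = -X * S2(t - T), then S4(t) = S1(t) + S3(t).
import numpy as np

def cancel_playback(s1: np.ndarray, s2: np.ndarray,
                    T: float, X: float, fs: float) -> np.ndarray:
    """Return the fourth audio signal: s1 plus s2 delayed by T, scaled
    by X, and inverted."""
    delay = int(round(T * fs))            # time compensation, in samples
    s2_delayed = np.zeros(len(s1))
    n = min(len(s1) - delay, len(s2))
    if n > 0:
        s2_delayed[delay:delay + n] = s2[:n]
    s3 = -X * s2_delayed                  # amplification and inversion
    return s1 + s3                        # synthesis: the fourth signal
```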
FIG. 3 is a flowchart of an exemplary embodiment of a voice interference filtering method.
At block 302, a first audio signal is acquired from the environment through an audio acquisition unit, the first audio signal including a user voice signal.
At block 304, a second audio signal is acquired from an audio output unit.
At block 306, a first background audio signal is obtained by filtering a speech sound region in the first audio signal and a second background audio signal is obtained by filtering a speech sound region in the second audio signal.
At block 308, a time difference T and a sound amplification parameter X are obtained by comparing the first background audio signal with the second background audio signal.
At block 310, a third audio signal is obtained by performing a time compensation operation, an amplification operation, and an inverting operation on the second audio signal in accordance with the time difference T and the sound amplification parameter X.
At block 312, a fourth audio signal is obtained by synthesizing the first audio signal and the third audio signal.
It should be emphasized that the above-described embodiments of the present disclosure, including any particular embodiments, are merely possible examples of implementations, set forth for a clear understanding of the principles of the disclosure. Many variations and modifications can be made to the above-described embodiment(s) of the disclosure without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included within the scope of this disclosure and protected by the following claims.

Claims (12)

What is claimed is:
1. An electronic device, comprising:
at least one processor;
a non-transitory storage medium coupled to the processor and configured to store one or more programs that are executed by the processor, the one or more programs comprising instructions for:
acquiring, from the environment, a first audio signal including a user voice signal;
acquiring a second audio signal output from an audio output unit;
filtering a speech sound region in the first audio signal to obtain a first background audio signal, and filtering the speech sound region in the second audio signal to obtain a second background audio signal;
comparing the first background audio signal with the second background audio signal to obtain a time difference T and a sound amplification parameter X between the first background audio signal and the second background audio signal;
performing a time compensation operation, an amplification operation and an inverting operation on the second audio signal to obtain a third audio signal according to the time difference T and the sound amplification parameter X; and
synthesizing the first audio signal and the third audio signal to obtain a fourth audio signal;
extracting a first eigenvalue sequence consisting of multiple first eigenvalues corresponding to multiple sampling points in the first background audio signal, and extracting a second eigenvalue sequence consisting of multiple second eigenvalues corresponding to multiple sampling points in the second background audio signal;
calculating the time difference T between the first background audio signal and the second background audio signal based on the first eigenvalue sequence and the second eigenvalue sequence;
compensating the second background audio signal based on the time difference T; and
comparing the compensated second background audio signal with the first background audio signal to obtain the sound amplification parameter X.
2. The electronic device as claimed in claim 1, wherein the one or more programs further comprise instructions for:
setting a time interval t for calculating an energy value;
setting, based on a same starting point, n consecutive time intervals in the first background audio signal and in the second background audio signal;
obtaining a first interval energy sequence E1[n] by calculating energy values of the n consecutive time intervals in the first background audio signal;
obtaining a second interval energy sequence E2[n] by calculating energy values of the n consecutive time intervals in the second background audio signal;
obtaining a first eigenvalue sequence C1[m] by comparing each energy value in the first interval energy sequence with a next adjacent energy value in the first interval energy sequence; and
obtaining a second eigenvalue sequence C2[m] by comparing each energy value in the second interval energy sequence with a next adjacent energy value in the second interval energy sequence.
3. The electronic device as claimed in claim 2, wherein an eigenvalue C_m is calculated through the following formula:
$$C_m = \begin{cases} 1 & \text{if } E_{m+1}/E_m > 1.10 \\ 0 & \text{if } 0.90 \le E_{m+1}/E_m \le 1.10 \\ -1 & \text{if } E_{m+1}/E_m < 0.90 \end{cases}$$
wherein E_m is the energy value of the m-th fixed interval.
4. The electronic device as claimed in claim 2, wherein the one or more programs further comprise instructions for:
comparing the first eigenvalue sequence C1[m] with the second eigenvalue sequence C2[m] to obtain a value k, wherein C1_{m+k} = C2_m;
the time difference T is equal to a product of the time interval t and the value k.
5. The electronic device as claimed in claim 4, wherein the sound amplification parameter X is calculated through the following formula:
$$X = \frac{\sum_{n=k+2}^{10} E1_n}{\sum_{n=2}^{10-k} E2_n}$$
wherein E1_n is an energy value of the n-th time interval in the first background audio signal, and E2_n is an energy value of the n-th time interval in the second background audio signal.
6. The electronic device as claimed in claim 1, wherein the third audio signal is calculated through the following formula:
$$S_3(t) = -X \cdot S_2(t - T)$$
wherein S_3(t) is the third audio signal and S_2(t) is the second audio signal.
7. A voice interference filtering method, the method comprising:
acquiring, from the environment, a first audio signal including a user voice signal;
acquiring a second audio signal output from an audio output unit;
filtering a speech sound region in the first audio signal to obtain a first background audio signal, and filtering the speech sound region in the second audio signal to obtain a second background audio signal;
comparing the first background audio signal with the second background audio signal to obtain a time difference T and a sound amplification parameter X between the first background audio signal and the second background audio signal;
performing a time compensation operation, an amplification operation and an inverting operation on the second audio signal to obtain a third audio signal according to the time difference T and the sound amplification parameter X; and
synthesizing the first audio signal and the third audio signal to obtain a fourth audio signal;
extracting a first eigenvalue sequence consisting of multiple first eigenvalues corresponding to multiple sampling points in the first background audio signal, and extracting a second eigenvalue sequence consisting of multiple second eigenvalues corresponding to multiple sampling points in the second background audio signal;
calculating the time difference T between the first background audio signal and the second background audio signal based on the first eigenvalue sequence and the second eigenvalue sequence;
compensating the second background audio signal based on the time difference T; and
comparing the compensated second background audio signal with the first background audio signal to obtain the sound amplification parameter X.
8. The voice interference filtering method as claimed in claim 7, the method further comprising:
setting a time interval t for calculating an energy value;
setting, based on a same starting point, n consecutive time intervals in the first background audio signal and in the second background audio signal;
obtaining a first interval energy sequence E1[n] by calculating energy values of the n consecutive time intervals in the first background audio signal;
obtaining a second interval energy sequence E2[n] by calculating energy values of the n consecutive time intervals in the second background audio signal;
obtaining a first eigenvalue sequence C1[m] by comparing each energy value in the first interval energy sequence with a next adjacent energy value in the first interval energy sequence; and
obtaining a second eigenvalue sequence C2[m] by comparing each energy value in the second interval energy sequence with a next adjacent energy value in the second interval energy sequence.
9. The voice interference filtering method as claimed in claim 8, wherein an eigenvalue C_m is calculated through the following formula:
$$C_m = \begin{cases} 1 & \text{if } E_{m+1}/E_m > 1.10 \\ 0 & \text{if } 0.90 \le E_{m+1}/E_m \le 1.10 \\ -1 & \text{if } E_{m+1}/E_m < 0.90 \end{cases}$$
wherein E_m is the energy value of the m-th fixed interval.
10. The voice interference filtering method as claimed in claim 8, the method further comprising:
comparing the first eigenvalue sequence C1[m] with the second eigenvalue sequence C2[m] to obtain a value k, wherein C1_{m+k} = C2_m;
the time difference T is equal to a product of the time interval t and the value k.
11. The voice interference filtering method as claimed in claim 10, wherein the sound amplification parameter X is calculated through the following formula:
$$X = \frac{\sum_{n=k+2}^{10} E1_n}{\sum_{n=2}^{10-k} E2_n}$$
wherein E1_n is an energy value of the n-th time interval in the first background audio signal, and E2_n is an energy value of the n-th time interval in the second background audio signal.
12. The voice interference filtering method as claimed in claim 7, wherein the third audio signal is calculated through the following formula:
$$S_3(t) = -X \cdot S_2(t - T)$$
wherein S_3(t) is the third audio signal and S_2(t) is the second audio signal.
US15/665,965 2017-05-31 2017-08-01 Electronic device and method for filtering anti-voice interference Active 2038-04-21 US10643635B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201710396430 2017-05-31
CN201710396430.3A CN108986831B (en) 2017-05-31 2017-05-31 Method for filtering voice interference, electronic device and computer readable storage medium
CN201710396430.3 2017-05-31

Publications (2)

Publication Number and Publication Date:
US20180350386A1 (en): 2018-12-06
US10643635B2 (en): 2020-05-05

Family

ID=64460723

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/665,965 Active 2038-04-21 US10643635B2 (en) 2017-05-31 2017-08-01 Electronic device and method for filtering anti-voice interference

Country Status (3)

Country Link
US (1) US10643635B2 (en)
CN (1) CN108986831B (en)
TW (1) TWI663595B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658930B (en) * 2018-12-19 2021-05-18 Oppo广东移动通信有限公司 Voice signal processing method, electronic device and computer readable storage medium
CN111210833A (en) * 2019-12-30 2020-05-29 联想(北京)有限公司 Audio processing method, electronic device, and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020094043A1 (en) * 2001-01-17 2002-07-18 Fred Chu Apparatus, method and system for correlated noise reduction in a trellis coded environment
CN1397062A (en) 2000-12-29 2003-02-12 祖美和 Voice-controlled television set and control method thereof
US20040161121A1 (en) * 2003-01-17 2004-08-19 Samsung Electronics Co., Ltd Adaptive beamforming method and apparatus using feedback structure
CN102025852A (en) 2009-09-23 2011-04-20 宝利通公司 Detection and suppression of returned audio at near-end
US20110150257A1 (en) * 2009-04-02 2011-06-23 Oticon A/S Adaptive feedback cancellation based on inserted and/or intrinsic characteristics and matched retrieval
US8538052B2 (en) * 2007-07-10 2013-09-17 Oticon A/S Generation of probe noise in a feedback cancellation system
US9455847B1 (en) * 2015-07-27 2016-09-27 Sanguoon Chung Wireless communication apparatus with phase noise mitigation

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761638A (en) * 1995-03-17 1998-06-02 Us West Inc Telephone network apparatus and method using echo delay and attenuation
US6515976B1 (en) * 1998-04-06 2003-02-04 Ericsson Inc. Demodulation method and apparatus in high-speed time division multiplexed packet data transmission
WO2002052546A1 (en) * 2000-12-27 2002-07-04 Intel Corporation Voice barge-in in telephony speech recognition
JP4940588B2 (en) * 2005-07-27 2012-05-30 ソニー株式会社 Beat extraction apparatus and method, music synchronization image display apparatus and method, tempo value detection apparatus and method, rhythm tracking apparatus and method, music synchronization display apparatus and method
DK2237573T3 (en) * 2009-04-02 2021-05-03 Oticon As Adaptive feedback suppression method and device therefor
CN102314868A (en) * 2010-06-30 2012-01-11 中兴通讯股份有限公司 Fan noise inhibition method and device
CN102044253B (en) * 2010-10-29 2012-05-30 深圳创维-Rgb电子有限公司 Echo signal processing method and system as well as television
US9589580B2 (en) * 2011-03-14 2017-03-07 Cochlear Limited Sound processing based on a confidence measure
EP2568695B1 (en) * 2011-07-08 2016-08-03 Goertek Inc. Method and device for suppressing residual echo
CN102385862A (en) * 2011-09-07 2012-03-21 武汉大学 Voice frequency digital watermarking method transmitting towards air channel
CN102543060B (en) * 2011-12-27 2014-03-12 瑞声声学科技(深圳)有限公司 Active noise control system and design method thereof
WO2014132102A1 (en) * 2013-02-28 2014-09-04 Nokia Corporation Audio signal analysis
US9185199B2 (en) * 2013-03-12 2015-11-10 Google Technology Holdings LLC Method and apparatus for acoustically characterizing an environment in which an electronic device resides
CN104050969A (en) * 2013-03-14 2014-09-17 杜比实验室特许公司 Space comfortable noise
EP2922058A1 (en) * 2014-03-20 2015-09-23 Nederlandse Organisatie voor toegepast- natuurwetenschappelijk onderzoek TNO Method of and apparatus for evaluating quality of a degraded speech signal
TWI569263B (en) * 2015-04-30 2017-02-01 智原科技股份有限公司 Method and apparatus for signal extraction of audio signal
CN105654962B (en) * 2015-05-18 2020-01-10 宇龙计算机通信科技(深圳)有限公司 Signal processing method and device and electronic equipment
CN105989846B (en) * 2015-06-12 2020-01-17 乐融致新电子科技(天津)有限公司 Multichannel voice signal synchronization method and device
JP6404780B2 (en) * 2015-07-14 2018-10-17 日本電信電話株式会社 Wiener filter design apparatus, sound enhancement apparatus, acoustic feature quantity selection apparatus, method and program thereof
TWI671737B (en) * 2015-08-07 2019-09-11 圓剛科技股份有限公司 Echo-cancelling apparatus and echo-cancelling method
CN105681513A (en) * 2016-02-29 2016-06-15 上海游密信息科技有限公司 Call voice signal transmission method and system as well as a call terminal
CN106303119A (en) * 2016-09-26 2017-01-04 维沃移动通信有限公司 Echo cancel method in a kind of communication process and mobile terminal
CN106653046B (en) * 2016-09-27 2020-07-14 北京云知声信息技术有限公司 Device and method for loop denoising in voice acquisition


Also Published As

Publication number Publication date
TW201903756A (en) 2019-01-16
CN108986831B (en) 2021-04-20
CN108986831A (en) 2018-12-11
US20180350386A1 (en) 2018-12-06
TWI663595B (en) 2019-06-21


Legal Events

Date Code Title Description
AS Assignment

Owner name: NANNING FUGUI PRECISION INDUSTRIAL CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LIN, YEN-HSIN;REEL/FRAME:043401/0162

Effective date: 20170704


STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4