KR101748039B1 - Sampling rate conversion method and system for efficient voice call
- Publication number: KR101748039B1
- Application number: KR1020150154083A
- Authority: KR (South Korea)
- Prior art keywords: electronic device, aliasing, voice, filter, selecting
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephone Function (AREA)
Abstract
A sampling rate conversion method and system for efficient voice calls are disclosed. A computer program stored on a medium may be provided to execute the sampling rate conversion method in combination with a computer implementing an electronic device. The sampling rate conversion method may include: conducting a voice call by transmitting and receiving data packets through a network in the electronic device; analyzing, for each of a plurality of bands, the frequency characteristics of a voice signal input during the voice call and calculating the energy of each band; selecting one of a plurality of anti-aliasing filters based on the energy of each of the plurality of bands; and down-sampling the input voice signal using the selected anti-aliasing filter.
Description
The following description relates to a sampling rate conversion method and system for efficient voice communication.
A sampling rate converter (SRC) is used to change the sampling frequency of a digital signal. For example, Korean Patent Laid-Open No. 10-2008-0098530 discloses a digital domain sampling rate converter. In a conventional SRC, the anti-aliasing filter and the anti-imaging filter have a narrowband (NB) pass band of 300 to 3400 Hz, so sound quality does not deteriorate even with a small amount of calculation. Because such an SRC always runs as a module at the end stage of a communication terminal during voice communication, it is preferable that it reduce the amount of calculation so as not to burden the system. However, recent voice call services are provided in a wideband (WB) of 70-7000 Hz. Because a wideband service uses a higher frequency band and a higher sampling rate, a filter requiring a large amount of calculation is needed to avoid deterioration of sound quality.
Conventionally, reducing the amount of computation has been prioritized, so sound quality degradation occurs in a narrowband voice call service because of erroneous selection of the SRC filters (anti-aliasing filters and anti-imaging filters). In addition, among the voice processing stages used in Voice over Internet Protocol (VoIP), the SRC filter is the dominant cause of sound quality damage.
For example, the anti-aliasing filter of the transmission-side (Tx) SRC changes the frequency of the input voice signal, and is therefore the first point in a call at which sound quality deterioration occurs. Generally, there are N kinds of anti-aliasing filters, which differ in computation amount and performance. For example, the N anti-aliasing filters may range from a filter covering a band of 200-5000 Hz (hereinafter referred to as the first filter) to a filter covering a band of 70-7000 Hz (hereinafter referred to as the second filter), and such filters may be implemented in a plurality of communication terminals. More specifically, using the first filter requires only about 10% of the calculation amount of the second filter, but the difference in sound quality is perceptible (for example, a Mean Opinion Score (MOS) difference of 0.3 or more). Using the second filter, in contrast, increases the system load of the communication terminal because of its large computation amount and increases the battery consumption of the communication terminal. In particular, in the prior art, the SRC filters (anti-aliasing filter and anti-imaging filter), once selected, are used until the end of the call, so the problems of sound quality degradation, increased system load, and increased battery consumption persist throughout the call.
References: PCT/KR/2014/010167, US20140019540A1, US20130332543A1, US20130260893
Provided are a sampling rate conversion method and system that control the anti-aliasing filter of the transmission-side (Tx) sampling rate converter (SRC) and the anti-imaging filter of the reception-side (Rx) SRC so that each operates efficiently (for example, at the optimum point between performance and computation amount).
A computer program stored in a medium for executing a sampling rate conversion method in combination with a computer embodying an electronic device is provided, the sampling rate conversion method comprising: transmitting and receiving data packets through a network in the electronic device to conduct a voice call; analyzing, for each of a plurality of bands, a frequency characteristic of a voice signal input during the voice call in the electronic device and calculating the energy of each of the plurality of bands; selecting one of a plurality of anti-aliasing filters based on the band-specific energies in the electronic device; and down-sampling the input voice signal using the selected anti-aliasing filter.
A sampling rate conversion method of an electronic device is provided, the method comprising: transmitting and receiving data packets through a network in the electronic device to conduct a voice call; analyzing, for each of a plurality of bands, a frequency characteristic of a voice signal input during the voice call in the electronic device and calculating the energy of each of the plurality of bands; selecting one of a plurality of anti-aliasing filters based on the band-specific energies in the electronic device; and down-sampling the input voice signal using the selected anti-aliasing filter.
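The band-energy step of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the band edges, and the use of a naive DFT (a practical system would use an FFT) are all hypothetical choices made for readability.

```python
import cmath
import math


def band_energies(frame, sample_rate, band_edges_hz):
    """Return the spectral energy of one frame in each [lo, hi) band.

    band_edges_hz is an ascending list of edges; band b spans
    band_edges_hz[b] <= f < band_edges_hz[b + 1]. (Hypothetical layout.)
    """
    n = len(frame)
    energies = [0.0] * (len(band_edges_hz) - 1)
    # Naive DFT over the non-negative frequency bins, for illustration only.
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = abs(coeff) ** 2
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += power
                break
    return energies


def accumulate_band_energies(frames, sample_rate, band_edges_hz):
    """Accumulate per-band energy over a sequence of frames."""
    totals = [0.0] * (len(band_edges_hz) - 1)
    for frame in frames:
        for b, energy in enumerate(
                band_energies(frame, sample_rate, band_edges_hz)):
            totals[b] += energy
    return totals
```

The index of the largest accumulated value identifies the band on which the filter selection would then be based.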
A sampling rate conversion method of an electronic device is also provided, the method comprising: transmitting and receiving data packets through a network in the electronic device to conduct a voice call; and selecting one of a plurality of anti-aliasing filters at predetermined time intervals during the voice call using the frequency characteristics of the voice signal input during the voice call.
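Reselection at predetermined time intervals can be sketched as a loop that re-runs the selection each time a fixed number of frames has been collected. The sketch below is illustrative only: `filter_schedule`, `frames_per_interval`, and the `choose_filter` callback are hypothetical names, and the assumption that a newly selected filter takes effect from the next frame is a design choice made here, not something the text specifies.

```python
def filter_schedule(frames, frames_per_interval, choose_filter, initial_filter):
    """Return the filter in force for each frame of a call.

    choose_filter is a caller-supplied function (hypothetical) that maps
    the frames of the most recent interval to a filter identifier.
    """
    active = initial_filter
    window = []
    schedule = []
    for frame in frames:
        # The current frame is processed with the filter already in force.
        schedule.append(active)
        window.append(frame)
        if len(window) == frames_per_interval:
            # Interval elapsed: reselect, effective from the next frame.
            active = choose_filter(window)
            window = []
    return schedule
```

In use, `choose_filter` would wrap the band-energy analysis and filter selection, so the computation cost of the analysis is paid once per interval rather than once per frame.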
The anti-aliasing filter of the transmission-side (Tx) sampling rate converter (SRC) and the anti-imaging filter of the reception-side (Rx) SRC can each be controlled to operate efficiently (for example, at the optimum point between performance and computation amount).
FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an internal configuration of an electronic device and a server according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of components that a processor of an electronic device according to an embodiment of the present invention may include.
FIG. 4 is a flowchart illustrating an example of a sampling rate conversion method that can be performed by an electronic device according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of components that a processor of an electronic device according to an embodiment of the present invention may further include.
FIG. 6 is a flowchart illustrating an example of steps that the sampling rate conversion method performed by an electronic device according to an embodiment of the present invention may further include.
FIG. 7 is a block diagram of logical components that an electronic device according to an embodiment of the present invention may include.
FIG. 8 is a graph showing an example of the accumulated energy value for each band in an embodiment of the present invention.
Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.
FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention. FIG. 1 shows an example in which a plurality of
The plurality of
The communication method is not limited, and may include a communication method using a communication network (for example, a mobile communication network, a wired Internet, a wireless Internet, a broadcasting network) that the
Each of the
In one example, the
FIG. 2 is a block diagram illustrating an internal configuration of an electronic device and a server according to an embodiment of the present invention. In FIG. 2, an internal configuration of the electronic device 1 (110) as an example of one electronic device and the
The electronic device 1 110 and the
The
The input /
Also, in other embodiments, electronic device 1 110 and
In this embodiment, the
FIG. 3 is a diagram illustrating an example of components that a processor of an electronic device according to an embodiment of the present invention may include, and FIG. 4 is a flowchart illustrating an example of a sampling rate conversion method that can be performed by an electronic device according to an embodiment of the present invention.
3, the
In
At this time, the voice call
In
In
In
More specifically, the anti-aliasing
At this time, there may be a plurality of anti-aliasing filters capable of processing the band having the largest accumulated energy value without deteriorating the sound quality. Here, being able to process that band without deteriorating the sound quality can mean that the filters guarantee similar performance (prevention of sound quality deterioration). Therefore, among these filters, the anti-aliasing filter requiring the smallest amount of calculation to change the frequency of the input voice signal may be selected.
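The selection rule described here (keep the filters that can handle the dominant band, then prefer the one with the smallest calculation amount) can be sketched as below. The sketch rests on labeled assumptions: "capable of processing the band without deterioration" is modeled simply as the filter pass band covering the band, the `AntiAliasingFilter` type and `select_filter` function are hypothetical, and the fallback to the widest filter when none qualifies is an added convenience not stated in the text. The example figures (200-5000 Hz at about 10% of the cost of 70-7000 Hz) come from the background section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntiAliasingFilter:
    name: str
    pass_lo_hz: float
    pass_hi_hz: float
    relative_cost: float  # calculation amount relative to the costliest filter


def select_filter(filters, dominant_band):
    """Pick the cheapest filter whose pass band covers dominant_band."""
    lo, hi = dominant_band
    # Assumption: "no deterioration" means the pass band covers the band.
    capable = [f for f in filters
               if f.pass_lo_hz <= lo and hi <= f.pass_hi_hz]
    if not capable:
        # Assumed fallback: use the filter with the widest pass band.
        return max(filters, key=lambda f: f.pass_hi_hz - f.pass_lo_hz)
    return min(capable, key=lambda f: f.relative_cost)


FILTERS = [
    AntiAliasingFilter("first", 200.0, 5000.0, 0.1),
    AntiAliasingFilter("second", 70.0, 7000.0, 1.0),
]
```

With these two filters, a dominant band of 300-3400 Hz is covered by both, so the cheaper first filter is chosen; a dominant band of 5000-7000 Hz is covered only by the second filter.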
In
As described above, according to this embodiment, degradation of sound quality can be prevented by down-sampling a voice signal whose energy is largely distributed in the high band with a high-performance anti-aliasing filter, while the amount of calculation for converting the sampling rate can be reduced by down-sampling a voice signal without such a distribution with a low-performance anti-aliasing filter.
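The trade-off above can be made concrete with a small decimation sketch. It assumes, purely for illustration, that the anti-aliasing filters are Hamming-windowed sinc FIR low-pass filters and that the number of taps stands in for both "performance" and calculation amount; the source does not specify the filter design, and `design_lowpass` and `decimate` are hypothetical names.

```python
import math


def design_lowpass(num_taps, cutoff_hz, sample_rate):
    """Hamming-windowed sinc low-pass FIR, normalized to unity DC gain."""
    if num_taps < 2:
        raise ValueError("num_taps must be at least 2")
    fc = cutoff_hz / sample_rate  # normalized cutoff (cycles per sample)
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        t = i - mid
        if t == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]


def decimate(signal, taps, factor):
    """Filter and keep every factor-th sample.

    Only the retained outputs are computed, so the work per output
    sample is proportional to len(taps).
    """
    out = []
    for n in range(0, len(signal), factor):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k < 0:
                break  # samples before the start are treated as zero
            acc += c * signal[n - k]
        out.append(acc)
    return out
```

Under these assumptions, a longer filter gives a sharper transition at the cost of more multiply-accumulate operations per output sample, which is the quantity the filter selection tries to minimize.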
At this time, the anti-aliasing filter selected for processing the down-sampling of the input voice signal may be reselected by the
Hereinabove, the process of down-sampling the voice signal input during voice communication for converting the sampling rate of the transmitting side has been described. On the other hand, as described above, since the electronic device 1 (110) can be both the transmitting side (Tx) and the receiving side (Rx), the receiving side must also be able to process the sampling rate conversion.
FIG. 5 is a diagram illustrating an example of components that a processor of an electronic device according to an embodiment of the present invention may further include, and FIG. 6 is a flowchart illustrating an example of steps that the sampling rate conversion method may further include.
5, the
In
In addition, the call mode may mean a hands-free call mode, a handset call mode, or the like. For example, in a hands-free call, the frequency characteristics of the voice reproduced through a loudspeaker may be very poor due to mechanical effects. In a handset call, by contrast, the frequency characteristics of the voice reproduced through the receiver of the communication terminal are managed within a strict range, so a call of good quality is possible. Therefore, the voice output apparatus
In
For example, a step (not shown) for managing a matching table in which one of a plurality of anti-imaging filters is matched for each type of sound output device, for each frequency reproduction power range of the sound output device, or for each communication mode, May be performed by the
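A matching table of this kind can be sketched as plain lookup tables. Everything concrete below is hypothetical: the filter identifiers, the device types, the frequency thresholds, and the order in which the three criteria are consulted are assumptions made for illustration, since the text only states that a filter is matched per device type, per frequency reproduction range, or per call mode.

```python
# Hypothetical matching tables; the filter names are placeholders.
TYPE_TABLE = {"loudspeaker": "aif_light", "receiver": "aif_full"}
MODE_TABLE = {"handsfree": "aif_light", "handset": "aif_full"}
# (upper bound of reproducible frequency in Hz, matched filter), ascending.
RANGE_TABLE = [(4000.0, "aif_light"), (8000.0, "aif_full")]


def select_anti_imaging_filter(device_type=None, max_reproducible_hz=None,
                               call_mode=None, default="aif_full"):
    """Look up an anti-imaging filter from whichever criterion is known.

    Assumed priority: device type, then frequency range, then call mode.
    """
    if device_type in TYPE_TABLE:
        return TYPE_TABLE[device_type]
    if max_reproducible_hz is not None:
        for upper_hz, name in RANGE_TABLE:
            if max_reproducible_hz <= upper_hz:
                return name
    if call_mode in MODE_TABLE:
        return MODE_TABLE[call_mode]
    return default
```

For example, a hands-free call through a loudspeaker maps to the lighter filter in this sketch, on the assumption that a costly filter brings little benefit when the output device itself limits the reproduced frequency range.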
In
In
FIG. 7 is a block diagram of logical components that an electronic device according to an embodiment of the present invention may include. The electronic device 1 110 may include a transmission side sampling rate converter (Tx-SRC) 710 and a reception side sampling rate converter Rx-
In the transmission side
In the receiving side
FIG. 8 is a graph showing an example of the accumulated energy value for each band in an embodiment of the present invention. The
As described above, according to the embodiments of the present invention, the anti-aliasing filter of the transmission-side (Tx) sampling rate converter (SRC) and the anti-imaging filter of the reception-side (Rx) SRC can each be controlled to operate efficiently (for example, at the optimum point between performance and computation amount).
The system or apparatus described above may be implemented as a hardware component, a software component, or a combination of hardware components and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device may be described as being used singly, but those skilled in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as a parallel processor, are also possible.
The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium, or device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.
The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments, or may be known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. For example, appropriate results may be achieved even if the described techniques are performed in a different order than the described methods, and/or components of the described systems, structures, devices, circuits, and the like are combined in a different form, or are replaced or substituted by other components or equivalents.
Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.
Claims (14)
The sampling rate conversion method includes:
Transmitting and receiving a data packet through the network in the electronic device and proceeding with a voice call;
Analyzing a frequency characteristic of a voice signal input during the voice communication in the electronic device for each of a plurality of bands and calculating energy for each of the plurality of bands;
Selecting one of a plurality of anti-aliasing filters during the voice call based on the plurality of band-specific energies in the electronic device; And
Down-sampling the input voice signal using the selected anti-aliasing filter while the voice call is in progress
Wherein the anti-aliasing filter selected for processing the down-sampling of the input speech signal is reselected at predetermined time intervals during the voice call.
Wherein the calculating the energy for each of the plurality of bands comprises:
The spectrum of the speech signal is divided into a plurality of bands, energy is extracted and accumulated for each band for each frame,
Wherein the step of selecting one of the plurality of anti-aliasing filters comprises:
Identifying, among the plurality of anti-aliasing filters, an anti-aliasing filter capable of processing the band having the largest accumulated energy value without deterioration of sound quality.
Wherein the step of selecting one of the plurality of anti-aliasing filters comprises:
When a plurality of anti-aliasing filters are identified as capable of processing the band having the largest accumulated energy value without deteriorating the sound quality, selecting, from among the identified anti-aliasing filters, the anti-aliasing filter with the smallest calculation amount for changing the frequency of the input voice signal.
The sampling rate conversion method includes:
Confirming a type of a sound output apparatus included in or connected to the electronic apparatus, a frequency regeneration power or a communication mode of the sound output apparatus;
Selecting one of a plurality of imaging prevention filters based on the identified kind, the identified frequency regenerating power or the identified calling mode;
Decoding a data packet received through the network to generate a voice signal; And
Processing the up-sampling for the generated speech signal
The sampling rate conversion method includes:
Managing a matching table in which one of the plurality of imaging prevention filters is matched for each type of the sound output apparatus, for each frequency reproduction power range of the sound output apparatus or for each communication mode
Further comprising:
Wherein the step of selecting one of the plurality of imaging prevention filters comprises:
Selecting, from among the plurality of imaging prevention filters, the imaging prevention filter matched in the matching table to the identified type, the imaging prevention filter matched to the range containing the identified frequency regenerating power, or the imaging prevention filter matched to the identified call mode.
Transmitting and receiving a data packet through the network in the electronic device and proceeding with a voice call;
Analyzing a frequency characteristic of a voice signal input during the voice communication in the electronic device for each of a plurality of bands and calculating energy for each of the plurality of bands;
Selecting one of a plurality of anti-aliasing filters during the voice call based on the plurality of band-specific energies in the electronic device; And
Down-sampling the input voice signal using the selected anti-aliasing filter while the voice call is in progress
Wherein the anti-aliasing filter selected for processing the down-sampling of the input speech signal is reselected at predetermined time intervals in the voice call.
Wherein the calculating the energy for each of the plurality of bands comprises:
The spectrum of the speech signal is divided into a plurality of bands, energy is extracted and accumulated for each band for each frame,
Wherein the step of selecting one of the plurality of anti-aliasing filters comprises:
Wherein an anti-aliasing filter capable of processing the band having the largest accumulated energy value without deterioration of sound quality is identified among the plurality of anti-aliasing filters.
Wherein the step of selecting one of the plurality of anti-aliasing filters comprises:
When a plurality of anti-aliasing filters are identified as capable of processing the band having the largest accumulated energy value without deteriorating the sound quality, the anti-aliasing filter with the smallest calculation amount for changing the frequency of the input voice signal is selected from among the identified anti-aliasing filters.
Confirming a type of a sound output apparatus included in or connected to the electronic apparatus, a frequency regeneration power or a communication mode of the sound output apparatus;
Selecting one of a plurality of imaging prevention filters based on the identified kind, the identified frequency regenerating power or the identified calling mode;
Decoding a data packet received through the network to generate a voice signal; And
Processing the up-sampling for the generated speech signal
Managing a matching table in which one of the plurality of imaging prevention filters is matched for each type of the sound output apparatus, for each frequency reproduction power range of the sound output apparatus or for each communication mode
Further comprising:
Wherein the step of selecting one of the plurality of imaging prevention filters comprises:
Wherein the imaging prevention filter matched in the matching table to the identified type, the imaging prevention filter matched to the range containing the identified frequency regenerating power, or the imaging prevention filter matched to the identified call mode is selected from among the plurality of imaging prevention filters.
Transmitting and receiving a data packet through the network in the electronic device and proceeding with a voice call;
Selecting one of a plurality of anti-aliasing filters at predetermined time intervals in the voice call using a frequency characteristic of the voice signal input during the voice call; And
Changing an anti-aliasing filter for processing the down-sampling of the input voice signal during the voice call to the anti-aliasing filter selected at each time interval
Wherein the sampling rate conversion step comprises:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150154083A KR101748039B1 (en) | 2015-11-03 | 2015-11-03 | Sampling rate conversion method and system for efficient voice call |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170052090A (en) | 2017-05-12 |
KR101748039B1 (en) | 2017-06-15 |
Family
ID=58740427
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150154083A KR101748039B1 (en) | 2015-11-03 | 2015-11-03 | Sampling rate conversion method and system for efficient voice call |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101748039B1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102423977B1 (en) * | 2019-12-27 | 2022-07-22 | 삼성전자 주식회사 | Method and apparatus for transceiving voice signal based on neural network |
KR20210111603A (en) * | 2020-03-03 | 2021-09-13 | 삼성전자주식회사 | Apparatus and method for improving sound quality |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130211827A1 (en) | 2012-02-15 | 2013-08-15 | Microsoft Corporation | Sample rate converter with automatic anti-aliasing filter |
- 2015-11-03: KR application KR1020150154083A (patent KR101748039B1), status: active IP Right Grant
Non-Patent Citations (2)
Title |
---|
Bruno Bessette, et al. The adaptive multirate wideband speech codec (AMR-WB). IEEE transactions on speech and audio processing. 2002.11. Vol.10, No.8, pp.620-636.* |
Ronald E. Crochiere, et al. Optimum FIR digital filter implementations for decimation, interpolation, and narrow-band filtering. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1975.10.* |
Also Published As
Publication number | Publication date |
---|---|
KR20170052090A (en) | 2017-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10559313B2 (en) | Speech/audio signal processing method and apparatus | |
EP3252767B1 (en) | Voice signal processing method, related apparatus, and system | |
JP6545815B2 (en) | Audio decoder, method of operating the same and computer readable storage device storing the method | |
US9294834B2 (en) | Method and apparatus for reducing noise in voices of mobile terminal | |
KR101668401B1 (en) | Method and apparatus for encoding an audio signal | |
JP2011516901A (en) | System, method, and apparatus for context suppression using a receiver | |
JP2000305599A (en) | Speech synthesizing device and method, telephone device, and program providing media | |
KR20200123395A (en) | Method and apparatus for processing audio data | |
KR101748039B1 (en) | Sampling rate conversion method and system for efficient voice call | |
CA2945791A1 (en) | Systems, methods and devices for electronic communications having decreased information loss | |
EP2786373A1 (en) | Quality enhancement in multimedia capturing | |
CN105761724B (en) | Voice frequency signal processing method and device | |
CN111145776B (en) | Audio processing method and device | |
CN115631758B (en) | Audio signal processing method, apparatus, device and storage medium | |
KR102526699B1 (en) | Apparatus and method for providing call quality information | |
CN117157705A (en) | Data processing method and device | |
CN115602183A (en) | Audio enhancement method and device, electronic equipment and storage medium | |
JP2010160496A (en) | Signal processing device and signal processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |