EP4131264A1 - Digital audio signal processing

Info

Publication number: EP4131264A1
Application number: EP21190169.9A
Authority: EP
Inventors: Christoph Musialik; Friedemann TISCHMEYER
Original assignee: Maat Labs GmbH
Current assignee: Maat Labs GmbH
Priority date: 2021-08-06
Filing date: 2021-08-06
Publication date: 2023-02-08

Abstract

The embodiments show a method and an apparatus for processing a digital audio signal (40), wherein in a channel or in several channels of a multiple of channels for processing the digital audio signal or a part of the digital audio signal, the channel is working or several channels are working on a higher working sampling frequency (k*Fs), compared to the base sampling frequency of the channel or the channels in a predefinable section of the channel or the several channels.

Description

Embodiments of the invention relate to a method for processing a digital audio signal, a non-transitory computer readable medium, a computer program product and an apparatus for processing or generating a digital audio signal. The apparatus can be a digital audio workstation or an audio network or any other digital signal processing system e.g. a video digital processing system.
The majority of today's commercial digital audio processing systems operate at one of two standard sampling frequencies: 44.1 kHz and 48 kHz. In the audio industry, these are assumed to be high enough for the necessary audio bandwidth, which for current audio processing and distribution is defined as extending up to 20 kHz. The CD and most music reproduction or distribution systems work with these sampling frequencies.
However, there are reasons to use higher sampling frequencies for intermediate audio processing or for audio content production processes (e.g., mastering), even when the sampling frequency of the final product has one of the standard values.
The most important reasons for using higher intermediate sampling frequencies, typically two or four times the standard sampling frequencies (often called oversampling), are: (1) reduction of the amount of alias frequencies produced when any non-linear processing of the audio signal is used (e.g., any kind of dynamics processing, soft clipping, saturation, and purposeful distortion generation); (2) reduction of the shape distortion of digital filters when the center or cut-off frequency is set close to the Nyquist frequency (the squeezing of filter characteristics that results when digital filters are calculated using the bilinear transformation).
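Reason (1) can be illustrated with a small worked calculation. The following is a minimal sketch and not part of the disclosed embodiments; the function name alias_frequency and the 15 kHz test tone are assumptions chosen for illustration only. It computes where the third harmonic of the tone, as a non-linear stage such as a soft clipper might generate it, appears after sampling at Fs = 48 kHz and at 4*Fs.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a component at f_hz appears after sampling at fs_hz.

    Components above the Nyquist frequency (fs_hz / 2) fold back into the
    range 0 .. fs_hz / 2; components below it are returned unchanged.
    """
    f_mod = f_hz % fs_hz
    return min(f_mod, fs_hz - f_mod)


base_fs = 48_000.0           # base sampling frequency Fs (one of the standard values)
tone = 15_000.0              # hypothetical test tone inside the audio band
third_harmonic = 3 * tone    # 45 kHz, e.g. generated by a soft clipper

for k in (1, 4):             # k = 1: no oversampling, k = 4: 4-times oversampling
    fs = k * base_fs
    apparent = alias_frequency(third_harmonic, fs)
    folded = third_harmonic > fs / 2
    print(f"k={k}: harmonic appears at {apparent:.0f} Hz, aliased={folded}")
```

At the base sampling frequency the 45 kHz harmonic folds back to 3 kHz, well inside the audible band. At 4*Fs it stays at 45 kHz, above the 20 kHz audio bandwidth, where the low-pass filter of a decimator can remove it before the sampling frequency is reduced again.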
In current digital audio systems, some measures have been established to reduce the above-mentioned problems. Processing modules, or so-called plugins, in today's audio processing systems that can significantly benefit from oversampling incorporate local oversampling, i.e., at the input of the module or plugin a so-called interpolator is implemented, and at the output a so-called decimator. The interpolator multiplies the sampling frequency of the input signal, the input signal is processed at this increased sampling frequency, and after processing the decimator decreases the sampling frequency back to the original input sampling frequency. However, this solution has various disadvantages: each module or plugin needs its own interpolator and decimator filters, which are normally developed as a compromise between signal quality and the additional latency caused by such filters. The better the quality in terms of alias frequency suppression, the longer the FIR (finite impulse response) filters and thus their latency. Cascading several such plugins in series in one audio processing channel, which is often the typical application case, may cause latencies of up to seconds, which disturbs some audio processes or makes them impossible. In addition, multiple up-sampling and down-sampling filters in the chain may significantly degrade the overall audio quality in the chain.
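The way latency accumulates across cascaded plugins can be sketched with simple arithmetic. The example below is an illustration under stated assumptions, not a measurement: it assumes linear-phase FIR filters (an N-tap filter then delays the signal by (N - 1) / 2 samples at the rate it runs at), assumes that both filters of a pair run at the oversampled rate, and uses a hypothetical filter length of 511 taps. The function names are likewise chosen for illustration.

```python
def pair_latency_seconds(interp_taps: int, decim_taps: int,
                         base_fs: float, k: int) -> float:
    """Latency of one interpolator/decimator pair of linear-phase FIR filters.

    Assumption: both filters run at k * base_fs, and each N-tap filter
    contributes a delay of (N - 1) / 2 samples at that rate.
    """
    delay_samples = (interp_taps - 1) / 2 + (decim_taps - 1) / 2
    return delay_samples / (k * base_fs)


def chain_latency_seconds(num_pairs: int, interp_taps: int, decim_taps: int,
                          base_fs: float, k: int) -> float:
    """Total filter latency when num_pairs such pairs are cascaded in series."""
    return num_pairs * pair_latency_seconds(interp_taps, decim_taps, base_fs, k)


# Hypothetical figures: four plugins in series, 511-tap filters, 4-times oversampling.
local = chain_latency_seconds(4, 511, 511, 48_000.0, 4)   # one pair per plugin
single = chain_latency_seconds(1, 511, 511, 48_000.0, 4)  # a single pair in total
print(f"pair per plugin: {local * 1e3:.2f} ms, single pair: {single * 1e3:.2f} ms")
```

Under these assumptions the filter latency grows linearly with the number of interpolator/decimator pairs in the chain, so longer, higher-quality filters multiply their cost with every additional locally oversampled plugin.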
The other solution is to switch the whole processing system (e.g., a digital audio workstation) to work entirely at a higher sampling frequency and let the plugins work at that frequency. However, this approach has extensive consequences. For example, at 4-times oversampling, a system needs approximately four times more computational power and four times more memory.
Assuming multitrack or contemporary multichannel systems (e.g., with 128 channels or often more), the additional demand on resources can be enormous. In addition, already recorded tracks or signals coming with standard sampling frequencies from other units or over networks have to be converted to the respective higher sampling frequency of the processing system and, after processing, converted down again to a standard sampling frequency, e.g., 44.1 kHz for CD production or streaming services.
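The scale of the resource demand can be made concrete with a rough estimate. The sketch below is only an approximation under stated assumptions: processing load is taken to scale linearly with the oversampling factor (as in the "approximately four times" figure above), the cost of the interpolators and decimators themselves is ignored, and the choice of four oversampled channels is hypothetical. The function name relative_load is an assumption for illustration.

```python
def relative_load(total_channels: int, oversampled_channels: int, k: int) -> int:
    """Processing load in units of 'one channel at the base sampling frequency'.

    Assumptions: load scales linearly with the oversampling factor k, and the
    cost of the sample-rate conversion filters is not counted.
    """
    if not 0 <= oversampled_channels <= total_channels:
        raise ValueError("oversampled_channels must lie between 0 and total_channels")
    base_rate_channels = total_channels - oversampled_channels
    return base_rate_channels + k * oversampled_channels


whole_system = relative_load(128, 128, 4)  # every channel runs at 4 * Fs
selected_only = relative_load(128, 4, 4)   # hypothetical: only 4 channels at 4 * Fs
print(whole_system, selected_only)
```

With these figures, switching the whole 128-channel system to 4-times oversampling corresponds to 512 load units, whereas oversampling only four selected channels corresponds to 140.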
In view of this prior art, it is an object of the invention to provide a method for processing a digital audio signal, a non-transitory computer readable medium, a computer program product and an apparatus for processing or generating a digital audio signal, which lead to an improvement with regard to the sound quality of the audio signal with only a low overall channel latency as well as a low computational power and a low amount of memory, compared to a system working permanently at a higher sampling rate and/or using plugins using their individual oversampling rates or filters.
According to an embodiment, a method for processing a digital audio signal comprises the following method steps:
providing a channel for processing the digital audio signal or a part of the digital audio signal, wherein the channel is working on a predefined or predefinable base sampling frequency, characterized in that the base sampling frequency is increased to a working sampling frequency in a predefinable section of the channel and at the end of the section the working sampling frequency is decreased back to the base sampling frequency.
Within the context of the present description, the method for processing a digital audio signal can be performed on a digital audio workstation (in the following: DAW), in virtual sub-mixers within virtual instruments, in virtual instruments as well as on digital hardware mixing desks or digital mixers.
Within an embodiment of the invention, the method can also be used in other multichannel digital processing systems working generally with single sampling frequencies, but using up-sampling or oversampling for dedicated modules or plugins e.g., video processing systems.
Within the context of the present description or invention, a channel is one of multiple channels being used for processing a digital audio signal. For example, in a DAW multiple channels are used for processing different aspects of audio signals. For example, one channel can be used for voice recording and processing. Another channel can be used for an electric guitar, a further channel for a bass. Another channel can be a virtual instrument. There can also be further channels, for example, for a drum kit. In the drum kit, the bass drum can be processed in one channel, a snare can be processed in another channel, the hi-hat can be processed in a further channel and so on.
Regarding the idea, which is realized in the invention, only some of the channels, which are to be processed, are up-sampled or the sampling frequency is increased in a predefined or pre-definable section of the channel or the channels. Up-sampling of only a selected number of audio processing channels leads to more efficient performance in terms of audio quality and required computational power. For example, in some audio productions only the voice channel or the voice channels, which are situated in the foreground of the audio signal to be processed, need to be processed in a higher quality with respect to some plugins that do produce nonlinearities or disturbing frequencies in an audible range. These channels need to be processed on a higher sampling rate to get rid of these audible sound problems. Thus, an embodiment is to up-sample just a section of a channel or some few channels of a multiple of channels in the processing of a digital audio signal.
Within the context of the present invention, the base sampling frequency is the base sampling frequency of the channel. The increase of the base sampling frequency to the higher working sampling frequency or the up-sampling is performed at the beginning of the section in which the up-sampled frequency is used, such that at least two inserted plugins work on this working sampling frequency or the at least two inserted plugins are triggered or driven by the working sampling frequency.
In an embodiment, the section of the channel includes at least a part of an insert area, wherein in the insert area plugins are inserted or are insertable into the channel.
In a further preferred embodiment, the section of the channel includes a cluster of at least two successive insert interfaces, in which plugins are inserted or insertable.
Very good results are achieved if, according to an embodiment, the cluster includes three or more insert interfaces. The more insert interfaces are used within an oversampled channel, the better the efficiency gained by the invention. In a typical setup of a voice processing channel strip, plugins like equalizer, compressor, limiter, and soft-clip or tape emulation are used simultaneously in series. In this case, the 4 local interpolators and decimators of these plugins can be saved.
In an embodiment, the cluster includes all insert interfaces of the insert area.
In the embodiments, the plugins inserted in such an oversampled channel work with the increased working sampling frequency without using their own interpolators and decimators. Most current plugins can automatically work with a higher-than-standard (i.e., higher than 44.1 kHz or 48 kHz) sampling frequency, especially if they incorporate internal oversampling filters.
According to an embodiment of the invention, the section starts at the beginning of the channel or starts after a signal input control. Thus, at the beginning of the channel or after the signal input control, the sampling frequency is increased. In the signal input control, for example, the gain and/or phase of the signal is adjusted or adjustable.
According to an embodiment of the invention, the section in which oversampling is performed starts prior to an insert interface of the channel. The insert interface is used to insert plugins or processing modules (which can be software or hardware) into the channel for processing the audio signal. For example, a plugin can be a distortion effect, a compressor, an equalizer and/or any other audio processing functionality.
According to an embodiment of the invention, the section ends prior to the output. In a complex channel topology the oversampled sections can be mixed with other system specific modules like gains, faders, internal equalizers, etc.
If the section ends after the pre-fader send interface or the post-fader send interface, it is preferred to decrease the sampling frequency of the send signal to the base sampling frequency. This means an extra down-sampling compared to the down-sampling that is performed in the channel.
In an embodiment, the working sampling frequency is increased by a factor of k compared to the base sampling frequency, wherein k is an integer that is usually a multiple of 2, but generally can be any factor. According to an embodiment, a non-transitory computer readable medium, including a computer program executable by a processor or a gate array for performing a method which is described above, is provided.
Furthermore, according to an embodiment a computer program product for performing a method as shown above is provided.
The computer program can run on a stand-alone computer or on any kind of dedicated embedded computer, like a DSP (digital signal processing) engine included in a digital mixing console or a DAW using a specific DSP platform.
Furthermore, there is provided in an embodiment an apparatus for processing a digital audio signal, wherein the apparatus comprises multiple audio processing channels, the multiple audio processing channels are working on a predefined or predefinable base sampling frequency, wherein in a predefinable number of channels the base sampling frequency is increased to a working sampling frequency in a predefinable section of the number of channels and after the section or the end of the section the working sampling frequency is decreased back to the base sampling frequency.
The apparatus can be a digital audio workstation, a virtual sub-mixer, within a virtual instrument, a digital hardware mixer or a digital mixer. The methods according to embodiments of the invention and the apparatus according to embodiments of the invention can be used for audio processing or for video processing.
According to an embodiment, the apparatus is configured to perform a method, which is described above.
The embodiments of the invention use dedicated channels or paths of the audio signal in the audio processing system or the video signal in a video processing system, which allow oversampling and thus the use of plugins or processing modules without switching the whole system to a higher sampling frequency and/or without using local oversampling filters in each plugin.
In the framework of this description a plugin can be called a processing module.
Within the framework of the description, a selected channel or selected channels within an apparatus for processing a digital audio signal are used, which can be switched to higher sampling frequencies by using, for example, only one interpolator at the input or somewhere in the beginning of the channel and only one decimator somewhere later in the channel or at the output of the channel.
Within such sections, which use a higher sampling frequency, which is called a working sampling frequency within this description, plugins or processing modules set up to a desired or predefinable working sampling frequency or oversampling rate can be placed or inserted without using the own local interpolators and decimators of the plugins.
Within this description, an increase of the sampling frequency has the same meaning as an up-sampling or oversampling of the frequency. The same is the case with respect to a decrease of the sampling frequency and a down-sampling of the frequency.
Furthermore, a single up-sampling algorithm or module and a single down-sampling algorithm or module can be used in a channel, which have a better quality than the up-sampling and down-sampling algorithms normally used in single plugins. This further increases the quality of the processed audio signal. Furthermore, the latency in the channel is much lower than that caused by a series of local oversampling filters, and the computational power and the amount of memory are lower compared to a system working entirely at a higher sampling frequency.
According to an embodiment of the invention, the length of the section, in which the working sampling frequency is used, can be different in different channels. The length can also be the same for two or more channels.
The increase of the sampling frequency or the oversampling factor k can be different for different channels. The oversampling factors can be selected differently in different channels.
According to an embodiment of the invention, the section of the channel includes at least a part of an insert area, wherein in the insert area plugins are inserted or are insertable into the channel. The insert area normally has several inserts or insert interfaces, in which plugins can be inserted.
Within this description, an insert interface can be an insert, an insert slot or an insert point.
According to an embodiment of the invention, the section starts prior to an insert interface.
According to a further embodiment of the invention, the section starts at the beginning of the channel or after a signal input control.
According to an embodiment, the section of the channel includes a cluster of at least two successive insert interfaces, in which plugins are inserted or insertable.
An embodiment includes the feature that at a first insert interface a plugin is inserted or insertable that is enabled to increase the base sampling frequency to the working sampling frequency and to keep the working sampling frequency at the output of the plugin, and that at the end of the section, at another insert interface, another plugin is inserted or insertable that is enabled to decrease the working sampling frequency back to the base sampling frequency.
In this embodiment, it is possible to insert an interpolator and a decimator as a plugin. For the embodiment, the plugin standard can implement that in the section that is selected to be in an up-sampling mode the plugins are not forced to reduce the sampling frequency to the base sampling frequency of the channel.
The further method features, which are described above, can also be implemented into the apparatus.
Further characteristics of the invention will become apparent from the description of the embodiments according to the invention together with the claims and the included drawings. Embodiments according to the invention can fulfill individual characteristics or a combination of several characteristics.
The invention is described below, without restricting the general intent of the invention, based on exemplary embodiments, wherein reference is made expressly to the drawings with regard to the disclosure of all details according to the invention that are not explained in greater detail in the text. The drawings show in:

Fig. 1: schematically a channel according to an embodiment of the invention,
Fig. 2: a schematic flowchart according to an embodiment of the invention,
Fig. 3: another schematic flowchart according to an embodiment of the invention.

In the drawings, the same or similar types of elements or respectively corresponding parts are provided with the same reference numbers in order to prevent the item from needing to be reintroduced.
Fig. 1 shows a channel 18 according to a digital audio processing system or an apparatus for processing a digital audio signal in a schematic view.
The audio channel 18 can be a mono-channel format, a stereo-channel format, a surround-channel format or any multi-channel format configuration for, especially immersive, audio. It can be a part of a digital audio workstation, a virtual instrument plugin or a digital mixing desk.
At 10, the input signal is shown. The input signal 10 can have any mono or multi-channel configuration and can come from audio inputs, for example, microphones or line sources or from the hard drive of the audio computer. In case of a hardware mixing console, the signal can also come from an internal or external recording device. In case of a virtual instrument, the signal comes from the instrument itself.
In the flow of the signal after the input signal 10, the signal comes to the input control 11, which usually consists of or comprises an input trim (gain) and a phase reverse button.
At the position 30, which is in the signal flow after the input control 11 and prior to the first insert interface 21, the up-sampling to a working sampling frequency is performed. This is a good point in the signal chain to apply up-sampling. This may vary in different situations. The up-sampling can also be performed prior to the input control 11. The up-sampling could also start after a few initial insert interfaces, which then would work at the base or system or apparatus sampling frequency, for example, 44.1 or 48 kHz. Thus, the up-sampling 30 could also take place, for example, between the first insert interface 21 and the second insert interface 22 or after another insert interface. In Fig. 1 not all insert interfaces are shown, but only the first two and the last two of, for example, ten insert interfaces.
The insert interfaces 21, 22, 23, 24 are dedicated to insert audio processing tools such as equalizers, dynamic processing tools and so forth. The total number of insert interfaces varies between systems or apparatuses.
According to an embodiment of the invention, the down-sampling to the base sampling frequency can take place at different points in the audio signal processing line or the channel 18. The down-sampling is shown by the reference number 31 and in the example of Fig. 1, there are shown different places for down-sampling 31.
A good point at which the down-sampling to the base sampling frequency can take place can be within the insert area 20 between two insert interfaces, for example, between the insert interfaces 22 and 23.
Between the insert interfaces 22 and 23, several other insert interfaces can be positioned. A good point for the down-sampling is after the insert interfaces in which plugins are inserted that profit from working at a higher sampling rate or frequency. Other plugins that work satisfactorily at the base sampling frequency, e.g. that have a good sound quality when working at the base sampling frequency, can be inserted into insert interfaces which are positioned after the down-sampling 31.
The down-sampling can take place at other positions, as shown in Fig. 1.
In case of many available insert interfaces 21-24, it is a preferred embodiment of the invention to allow the user to define how many insert interfaces 21-24 are integrated into the up-sampling cluster.
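A user-defined up-sampling cluster of this kind can be sketched as a small data structure. The following is a minimal illustration and not the disclosed implementation; the class name OversampledCluster, its fields and the example values (ten insert interfaces, a cluster spanning the first three, k = 4) are assumptions. It reports the sampling frequency at which each insert interface of a channel would operate, given which insert interfaces the user has placed inside the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OversampledCluster:
    """Hypothetical description of the up-sampled section of one channel."""
    base_fs: int        # base sampling frequency Fs in Hz
    k: int              # oversampling factor; may differ from channel to channel
    first_insert: int   # 1-based number of the first insert interface in the cluster
    last_insert: int    # 1-based number of the last insert interface in the cluster

    def __post_init__(self) -> None:
        if self.k < 1:
            raise ValueError("k must be a positive integer")
        if not 1 <= self.first_insert <= self.last_insert:
            raise ValueError("the cluster must span at least one insert interface")

    def rate_at_insert(self, number: int) -> int:
        """Sampling frequency seen by the plugin at the given insert interface."""
        inside = self.first_insert <= number <= self.last_insert
        return self.k * self.base_fs if inside else self.base_fs


# Hypothetical voice channel: inserts 1 to 3 run at 4 * Fs, the rest at Fs.
voice = OversampledCluster(base_fs=48_000, k=4, first_insert=1, last_insert=3)
rates = [voice.rate_at_insert(n) for n in range(1, 11)]
print(rates)
```

Keeping the cluster boundaries and the factor k per channel, rather than per system, reflects the idea that the section length and the oversampling factor can be chosen differently for different channels.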
If internal digital signal processing tools are available, it also makes sense to integrate those into the up-sampling cluster.
After the fader 13, a further insert interface or insert interfaces 25 are shown, for example, insert interfaces 11 and 12. After this insert interface 25, in which, for example, plugins with reverb effects or delay effects or other effects can be implemented, the post-fader sends 17 are implemented. The signal is then transferred to the panorama 14.
The post-fader sends 17 are usually used to send signals to FX devices such as reverbs and delays. In case the post-fader send is within the up-sampling cluster, which means within the section in which the working sampling frequency is used, an extra down-sampling of the post-fader send signal is required in order to communicate with other processes at the base sampling frequency of the system or apparatus.
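The condition for the extra down-sampling of a send signal can be expressed as a simple position check. The sketch below is illustrative only: the stage names and their order in STAGES are a simplified, hypothetical channel layout, and the function name send_needs_decimator is an assumption. It decides whether a send tap lies inside the oversampled section and therefore needs its own decimator.

```python
# Hypothetical, simplified stage order of one channel; real systems may differ.
STAGES = [
    "input_control", "insert_1", "insert_2", "pre_fader_send",
    "fader", "insert_11_12", "post_fader_send", "panorama", "output",
]


def send_needs_decimator(stages: list, first_stage: str, last_stage: str,
                         send_stage: str) -> bool:
    """Return True if the send tap lies inside the oversampled section.

    The section is taken to run from first_stage to last_stage inclusive;
    up-sampling happens before first_stage and down-sampling after last_stage.
    """
    first = stages.index(first_stage)
    last = stages.index(last_stage)
    if first > last:
        raise ValueError("the section must not end before it starts")
    tap = stages.index(send_stage)
    return first <= tap <= last


# Section ends after the post-fader send: the send signal is still at k * Fs.
print(send_needs_decimator(STAGES, "insert_1", "post_fader_send", "post_fader_send"))
# Section ends within the insert area: the send already runs at Fs.
print(send_needs_decimator(STAGES, "insert_1", "insert_2", "post_fader_send"))
```

In the first case the send signal would have to be down-sampled separately before it reaches processes running at the base sampling frequency; in the second case no extra conversion is needed.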
The output signal 15 may be mono, stereo or any multi-channel configuration for surround or immersive audio.
Fig. 1 shows one embodiment of a channel. However different configurations of channels can be used according to the invention.
Fig. 2 shows a schematic flowchart of an embodiment of the invention.
At 40, the signal is shown, which is put into an interpolator 41, which leads to an up-sampling of the base sampling frequency Fs of the signal 40 to a working sampling frequency k*Fs. After that, there is a section 43 or cluster 43, which is processed with the working sampling frequency k*Fs. In the section 43, several plugins P1, P2 and so on up to Pn are inserted into this section. The plugins P1, P2 ... Pn are inserted into insert interfaces.
After the section 43, a decimator 42 is used to down-sample the working sampling frequency back to the base sampling frequency.
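The signal flow of Fig. 2 can be sketched in a few lines of code. The following is a minimal, unoptimized illustration and not the disclosed implementation: the Hann-windowed sinc filter, the filter length of 63 taps, the direct-form convolution and all function names are assumptions made for this sketch, and the two example plugins are hypothetical. One interpolator raises the rate from Fs to k*Fs, the plugins P1 ... Pn process the signal at k*Fs without any converters of their own, and one decimator returns to Fs.

```python
import math


def lowpass_taps(num_taps: int, cutoff: float) -> list:
    """Hann-windowed sinc low-pass; cutoff is a fraction of the sampling rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def fir(signal: list, taps: list) -> list:
    """Direct-form FIR filter; the output has the same length as the input."""
    return [sum(t * signal[n - i] for i, t in enumerate(taps) if n - i >= 0)
            for n in range(len(signal))]


def interpolate(signal: list, k: int, taps: list) -> list:
    """Interpolator: insert k - 1 zeros per sample, low-pass, restore the level."""
    stuffed = [0.0] * (len(signal) * k)
    stuffed[::k] = signal
    return [k * y for y in fir(stuffed, taps)]


def decimate(signal: list, k: int, taps: list) -> list:
    """Decimator: low-pass below the base Nyquist frequency, keep every k-th sample."""
    return fir(signal, taps)[::k]


def process_section(signal: list, k: int, plugins: list, num_taps: int = 63) -> list:
    """Run all plugins at k * Fs between one interpolator and one decimator."""
    taps = lowpass_taps(num_taps, 0.5 / k)  # cutoff at the base Nyquist frequency
    block = interpolate(signal, k, taps)
    for plugin in plugins:                  # P1, P2, ... Pn work at k * Fs
        block = plugin(block)
    return decimate(block, k, taps)


def soft_clip(block: list) -> list:         # hypothetical non-linear plugin
    return [math.tanh(2.0 * s) for s in block]


def trim(block: list) -> list:              # hypothetical gain plugin
    return [0.5 * s for s in block]


tone = [math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(256)]
out = process_section(tone, k=4, plugins=[soft_clip, trim])
print(len(tone), len(out))
```

The output block has the same length and sampling frequency as the input block, so the oversampled section is transparent to the rest of the channel apart from the delay of the two filters. A production system would typically use a polyphase filter structure instead of filtering the zero-stuffed signal directly.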
This embodiment shows schematically a channel or a part of a channel of a system or an apparatus, e.g. a digital audio workstation, a digital mixing console or an audio matrix processor, which incorporates its own interpolator at the beginning of or prior to the oversampled section and its own decimator at the end of or after the oversampled section. The interpolator and decimator are thus provided by the system manufacturer as a part of the system for channels which can be switched to oversampling mode.
Fig. 3 shows another schematic flowchart according to an embodiment of the invention. In this embodiment it is possible to insert an interpolator 41 and a decimator 42 as a plugin. It would be an advantageous solution because of the possibility that different plugin manufacturers can deliver their own solutions for interpolators and decimators. It would only be necessary to use a plugin standard or change existing plugin standards, e.g. the VST standard, in a manner that in the section that is selected to be in an up-sampling mode the plugins are not forced to reduce the sampling frequency to the base sampling frequency of the channel.
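How a host might keep track of the sampling frequency when the interpolator and decimator are themselves plugins can be sketched as follows. This is a hypothetical model and does not reflect the API of VST or of any other existing plugin standard: the class PluginSlot, its rate_change field and the function rates_along_chain are assumptions made for illustration. Each plugin declares the ratio of its output to its input sampling frequency, and the host checks that the chain of insert interfaces returns to the base sampling frequency.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class PluginSlot:
    """Hypothetical plugin descriptor as a host might see it."""
    name: str
    rate_change: Fraction = Fraction(1)  # output sampling rate / input sampling rate


def rates_along_chain(base_fs: int, slots: list) -> list:
    """Return (plugin name, output sampling frequency) for each insert interface.

    Raises ValueError if the chain does not end at the base sampling frequency,
    i.e. if an interpolator plugin is not matched by a decimator plugin.
    """
    fs = Fraction(base_fs)
    rates = []
    for slot in slots:
        fs *= slot.rate_change
        rates.append((slot.name, int(fs)))
    if fs != base_fs:
        raise ValueError("the chain must return to the base sampling frequency")
    return rates


k = 4
chain = [
    PluginSlot("interpolator", Fraction(k)),     # raises Fs to k * Fs and keeps it
    PluginSlot("equalizer"),                     # ordinary plugins leave the rate as is
    PluginSlot("compressor"),
    PluginSlot("decimator", Fraction(1, k)),     # returns from k * Fs to Fs
]
for name, fs in rates_along_chain(48_000, chain):
    print(f"{name}: {fs} Hz")
```

The point of the sketch is that the plugins between the interpolator and the decimator simply inherit the working sampling frequency, which is what a plugin standard would have to permit within a section selected to be in up-sampling mode.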
All named characteristics, including those taken from the drawings alone, and individual characteristics, which are disclosed in combination with other characteristics, are considered alone and in combination as important to the invention. Embodiments according to the invention can be fulfilled through individual characteristics or a combination of several characteristics. Features which are combined with the wording "in particular" or "especially" are to be treated as preferred embodiments.

List of References

10: input signal
11: input control
12: input equalizer and dynamics
13: fader
14: panorama
15: output signal
16: pre-fader sends
17: post-fader sends
18: channel
20: insert area
21: insert interface 1
22: insert interface 2
23: insert interface 9
24: insert interface 10
25: insert interface 11 & 12
30: up-sampling
31: down-sampling
40: signal
41: interpolator
42: decimator
43: section

P1, P2, Pn: plugin
Fs: base sampling frequency
k*Fs: working sampling frequency
k: multiplication factor

Claims

1. A method for processing a digital audio signal (40) comprising the following method steps:
providing a channel (18) for processing the digital audio signal (40) or a part of the digital audio signal (40), wherein the channel (18) is working on a predefined or predefinable base sampling frequency (Fs), characterized in that the base sampling frequency (Fs) is increased to a working sampling frequency (k*Fs) in a predefinable section (43) of the channel (18) and at the end of the section (43) the working sampling frequency (k*Fs) is decreased back to the base sampling frequency (Fs).
2. The method according to claim 1, characterized in that the section (43) of the channel (18) includes at least a part of an insert area (20), wherein in the insert area (20) plugins (P1, P2, Pn) are inserted or are insertable into the channel (18).
3. The method according to claim 2, characterized in that the section (43) of the channel (18) includes a cluster of at least two successive insert interfaces (21, 22, 23, 24), in which plugins (P1, P2, Pn) are inserted or insertable.
4. The method according to claim 3, characterized in that the cluster includes all inserts (21, 22, 23, 24) of the insert area (20).
5. The method according to any one of the claims 1 to 4, characterized in that the section (43) starts at the beginning of the channel (18) or after a signal input control (11).
6. The method according to any one of the claims 3 to 5, characterized in that the section (43) ends after the cluster of at least two successive insert interfaces (21, 22, 23, 24).
7. The method according to any one of the claims 1 to 6, characterized in that the working sampling frequency (k*Fs) is increased by a factor of k compared to the base sampling frequency (Fs).
8. The method according to claim 7, characterized in that k is an integer that is a multiple of 2.
9. A non-transitory computer readable medium, including a computer program, executable by a processor or gate array for performing a method according to any one of the claims 1 to 8.
10. A computer program product for performing a method according to any one of the claims 1 to 8.
11. An apparatus for processing a digital audio signal (40), wherein the apparatus comprises multiple audio processing channels (18), the multiple audio processing channels (18) are working on a predefined or predefinable base sampling frequency (Fs), characterized in that in a predefinable number of channels (18) the base sampling frequency (Fs) is increased to a working sampling frequency (k*Fs) in a predefinable section (43) of the number of channels (18) and after the section (43) or the end of the section (43) the working sampling frequency (k*Fs) is decreased back to the base sampling frequency (Fs).
12. The apparatus according to claim 11, characterized in that the section (43) of the channel (18) includes at least a part of an insert area (20), wherein in the insert area (20) plugins (P1, P2, Pn) are inserted or are insertable into the channel (18).
13. The apparatus according to claim 11 or 12, characterized in that the section (43) starts at the beginning of the channel (18) or after a signal input control (11).
14. The apparatus according to any one of the claims 11 to 13, characterized in that the section (43) of the channel (18) includes a cluster of at least two successive insert interfaces (21, 22, 23, 24), in which plugins (P1, P2, Pn) are inserted or insertable.
15. The apparatus according to any one of the claims 11 to 14, characterized in that at a first insert interface a plugin is inserted or insertable that is enabled to increase the sampling frequency (Fs) to the working sampling frequency (k*Fs) and to keep the working sampling frequency (k*Fs) at the output of the plugin and at the end of the section (43) at another insert interface another plugin is inserted or insertable that is enabled to decrease the working sampling frequency (k*Fs) back to the base sampling frequency (Fs).