KR101078379B1 - Method and Apparatus for Processing Audio Data - Google Patents
- Publication number
- KR101078379B1 (granted from application KR1020090018624A)
- Authority
- KR
- South Korea
- Prior art keywords
- data
- channel data
- channel
- downmix
- loaded
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Stereophonic System (AREA)
Abstract
Disclosed are a method and apparatus for processing audio data. The audio data processing method loads any one channel data of a plurality of channel data, together with previous cumulative data, from a memory; calculates the loaded channel data using a downmix coefficient corresponding to that channel data; determines whether the loaded channel data is the last input channel data for constituting output channel data; and, if it is, calculates the calculated channel data and the previous cumulative data using window coefficients. The downmix and windowing processes can thus be integrated.
Downmix, audio, window, processor, channel
Description
The present invention relates to a method of processing audio data, and more particularly, to a technology related to downmixing and windowing that can integrate downmixing and windowing of input channel data.
In recent years, in order to provide users with higher-quality, more realistic audio services, multi-channel digital surround audio formats have been developed beyond the existing one-channel (e.g., mono) and two-channel (e.g., stereo) formats. For example, Dolby AC-3 (Dolby Audio Coding 3) and DTS (Digital Theater System) are representative multi-channel digital surround audio formats of 5.1 channels or more.
Multi-channel surround audio formats such as Dolby AC-3 and DTS were originally developed to present movies realistically in theaters, but they are now also widely used through various media such as DVDs, CDs, laser discs, and digital broadcasting. Accordingly, anyone whose home includes an audio system capable of supporting a multi-channel surround audio format, such as a home theater or a DVD/DivX player, can listen to high-quality surround sound.
On the other hand, in many cases a personal audio system cannot support a multi-channel surround audio format. In particular, the use of portable devices such as mobile communication terminals has become common in recent years, and these devices are not suited to outputting surround sound because of their mechanical characteristics. In such cases, a sound-quality improvement technique is required for audio systems that do not support multiple channels.
Downmixing refers to a technique of mixing audio data of a plurality of channels and converting it into audio data of fewer channels. For example, a downmixing apparatus may downmix 5.1-channel audio data into two-channel (or one-channel) audio data. Because the two-channel (or one-channel) audio data then includes components of the 5.1-channel audio data, a two-channel (or one-channel) audio system can reproduce higher-quality sound.
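As an illustration of this idea, here is a minimal 5.1-to-stereo downmix sketch. The coefficient value (about 0.707 for center and surround) follows the common ITU-R BS.775-style convention and is an assumption; this document does not specify any coefficient values, and the function and variable names are illustrative:

```python
import math

def downmix_51_to_stereo(L, R, C, Ls, Rs, a=1 / math.sqrt(2)):
    """Mix one sample of 5.1-channel audio (LFE omitted) down to stereo.

    The coefficient a = 1/sqrt(2) (~0.707) is a common convention, not a
    value taken from this patent; real formats define their own tables.
    """
    left = L + a * C + a * Ls    # left output keeps L, shares C, takes Ls
    right = R + a * C + a * Rs   # right output keeps R, shares C, takes Rs
    return left, right

left, right = downmix_51_to_stereo(0.5, -0.5, 0.2, 0.1, -0.1)
```

The stereo result still carries components of all five input channels, which is why a two-channel system can reproduce the multi-channel content.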
In general, a conventional audio system with such a downmix function decodes an externally input bitstream for each channel, converts the decoded frequency-domain data into time-domain data, and downmixes the converted time-domain data to match the output channels. A windowing process, which multiplies the downmixed data by window coefficients, is then performed to generate audio data, for example PCM data.
As mentioned above, such a conventional audio system performs downmixing and windowing as separate processes. In the downmix process, data of a plurality of input channels is stored in a memory, and the stored data is sequentially loaded, processed channel by channel, and stored back in the memory. Only after the downmix process is complete does the windowing process load the downmixed data from the memory, multiply it by the window coefficients, and output the result.
Such conventional technology, however, accesses the memory too many times during the downmix and windowing processes, and its computation amount is excessive, because both processes continually repeat the cycle of loading data from memory, processing it, and storing it again. A new technique that can simplify the downmix and windowing processes is therefore needed.
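The two-pass flow criticized above can be sketched as follows. The memory dict, names, and values are illustrative assumptions, not from the patent; each read or write of the dict stands for one memory access over a unit of k channel samples, to make the repeated load/store traffic visible:

```python
def conventional_process(memory, channel_names, coeffs):
    """Conventional two-pass flow: a complete downmix pass, then a
    separate windowing pass that must reload the downmixed data from
    memory. Accesses to the 'memory' dict (a stand-in for RAM) are
    counted, one count per unit of k channel samples."""
    loads = stores = 0

    # Pass 1: downmix. Every channel is loaded, scaled by its downmix
    # coefficient, accumulated, and the running sum stored back each time.
    for i, name in enumerate(channel_names):
        data = memory[name]; loads += 1                 # load channel data
        scaled = [coeffs[name] * x for x in data]
        if i > 0:
            prev = memory["mix"]; loads += 1            # reload running sum
            scaled = [s + p for s, p in zip(scaled, prev)]
        memory["mix"] = scaled; stores += 1             # store running sum

    # Pass 2: windowing. The finished downmix must be loaded again.
    mixed = memory["mix"]; loads += 1
    window = memory["window"]; loads += 1               # load window coefficients
    out = [w * x for w, x in zip(window, mixed)]
    memory["out"] = out; stores += 1                    # store PCM output
    return out, loads, stores

k = 4
mem = {"L": [1.0] * k, "C": [2.0] * k, "Ls": [3.0] * k,
       "Rs": [4.0] * k, "window": [0.5] * k}
co = {"L": 1.0, "C": 0.5, "Ls": 0.5, "Rs": 0.5}
pcm, n_loads, n_stores = conventional_process(mem, ["L", "C", "Ls", "Rs"], co)
```

By this counting, the separate passes cost an extra store and an extra load per output channel: the finished downmix is stored at the end of pass 1 only to be reloaded at the start of pass 2.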
The technical problem to be solved by the present invention is to provide an audio data processing method and apparatus that, by integrating downmixing and windowing, minimize memory accesses and significantly reduce the amount of computation when processing audio data.
In order to solve this technical problem, the present invention provides, in one aspect, an audio data processing method. The method includes: loading any one channel data of a plurality of channel data, together with previous cumulative data, from a memory; calculating the loaded channel data using a downmix coefficient corresponding to the channel data; determining whether the loaded channel data is the last input channel data for constituting output channel data; and, when the loaded channel data is the last input channel data, calculating the calculated channel data and the previous cumulative data using window coefficients.
The audio data processing method may further include: when the loaded channel data is not the last input channel data, adding the calculated channel data to the previous cumulative data; and storing the added data in the memory. The method may also further include storing the data calculated using the window coefficients in a specific buffer.
Calculating using the window coefficients may include: adding the calculated channel data to the previous cumulative data; extracting the window coefficients from the memory; and multiplying the added data by the window coefficients. The loading may include storing a plurality of channel samples included in the channel data in a plurality of registers.
Calculating using the downmix coefficient may include: extracting the downmix coefficient corresponding to the loaded channel data from the memory; and multiplying each of the plurality of channel samples stored in the registers by the extracted downmix coefficient.
Meanwhile, in order to solve the above technical problem, the present invention provides, in another aspect, an audio data processing apparatus. The apparatus includes a memory for storing information, and a calculator that loads one channel data of a plurality of channel data, together with previous cumulative data, from the memory; calculates the loaded channel data using a downmix coefficient; determines whether the loaded channel data is the last input channel data for constituting an output channel; and, when the loaded channel data is the last input channel data, calculates the calculated channel data and the loaded previous cumulative data using window coefficients.
When the loaded channel data is not the last input channel data, the calculator may add the calculated channel data to the previous cumulative data and store the calculated channel data in the memory.
When the loaded channel data is the last input channel data, the calculator may add the calculated channel data to the previous accumulated data and multiply the added data by a window coefficient. In addition, the calculator may multiply the loaded channel data by a downmix coefficient corresponding to the loaded channel data.
The memory may include a downmix coefficient table including downmix coefficients for each channel for downmixing the plurality of channel data; And a window coefficient table including window coefficients for windowing the downmixed data.
The loaded channel data may include a plurality of channel samples. In this case, the operation unit may include a plurality of registers for storing the plurality of channel samples.
As described above, according to the present invention, windowing is performed together with the downmix operation on the last input channel data for constituting an output channel, so that downmixing and windowing are integrated. The integrated downmix and windowing process therefore minimizes memory accesses and greatly reduces the amount of computation.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art may easily implement the present invention. In the preferred embodiment of the present invention described below, specific technical terms are used for clarity of content. However, the invention is not limited to the particular term selected, and it is to be understood that each specific term includes all technical synonyms that operate in a similar manner to achieve a similar purpose.
FIG. 1 is a block diagram showing the configuration of an audio system including an audio data processing apparatus according to a preferred embodiment of the present invention.
As shown in FIG. 1, the audio system includes a decoder (10), a domain conversion unit (20), a data processing unit (30), and an audio output unit (40). The decoder (10) decodes an externally input bitstream for each channel. The domain conversion unit (20) converts the decoded frequency-domain data into time-domain data. The data processing unit (30) downmixes and windows the converted time-domain data to generate output audio data, for example PCM data. The audio output unit (40) outputs the generated audio data.
FIG. 2 is a block diagram showing the configuration of the data processing unit (30) shown in FIG. 1. Referring to FIGS. 1 and 2, the data processing unit (30) includes a calculation unit (31) and a memory (36). The calculation unit (31) may include an arithmetic logic unit (ALU) (33) and a register unit (34). The memory (36) stores the plurality of channel data and the cumulative data, and may include a downmix coefficient table containing per-channel downmix coefficients and a window coefficient table containing window coefficients. The calculation unit (31) loads any one channel data of the plurality of channel data, together with previous cumulative data, from the memory (36), and calculates the loaded channel data using the downmix coefficient corresponding to that channel data. The calculation unit (31) then determines whether the loaded channel data is the last input channel data for constituting the output channel data. In this case, if the loaded channel data is the last input channel data, the calculation unit (31) adds the calculated channel data to the previous cumulative data and calculates the added data using the window coefficients. On the other hand, if the loaded channel data is not the last input channel data, the calculation unit (31) adds the calculated channel data to the previous cumulative data and stores the added data in the memory (36).
FIG. 3 is a flowchart illustrating the operation flow of the calculation unit (31) included in the data processing unit (30). Referring to FIGS. 1 to 3, first, the calculation unit loads the channel data to be processed, together with the previous cumulative data, from the memory. Subsequently, the calculation unit extracts the downmix coefficient corresponding to the loaded channel data from the memory and calculates the loaded channel data using the extracted downmix coefficient. Next, the calculation unit determines whether the loaded channel data is the last input channel data for constituting the output channel data.
For example, suppose the input N channel data includes five channel data — L (Left), C (Center), R (Right), Ls (Left Surround), and Rs (Right Surround) channel data — and the output channel data is two channel data, e.g., L output channel data and R output channel data. The downmix and windowing algorithm may then instruct that, to form the L output channel data, the operation using the downmix coefficients be performed in the order of L channel data, C channel data, Ls channel data, and Rs channel data. In this case, the last input channel data for constituting the L output channel data is the Rs channel data.
If the determination shows that the loaded channel data is the last input channel data for constituting the output channel, the calculation unit adds the calculated channel data to the previous cumulative data and calculates the added data using the window coefficients, thereby generating the output channel data. On the other hand, if the loaded channel data is not the last input channel data for constituting the output channel, the calculation unit adds the calculated channel data to the previous cumulative data, stores the added data in the memory, and repeats the process from the loading step for the next channel data.
As described above, according to the audio data processing method according to the preferred embodiment of the present invention, a windowing operation is performed when downmixing the last input channel data for configuring an output channel. Therefore, the number of accesses to the memory can be greatly reduced, and the amount of computation can be reduced.
Hereinafter, a more specific example is described as another preferred embodiment of the present invention. In the following embodiment, five channel data — L channel data, C channel data, R channel data, Ls channel data, and Rs channel data — are input, downmixed, and windowed to output two channel data, e.g., L output channel data and R output channel data.
FIG. 4 is a flowchart illustrating an audio data processing method according to another exemplary embodiment of the present invention, showing the procedure of downmixing and windowing four of the five externally input channel data — all except the R channel data — to output the L output channel data. It is assumed that the downmixing and windowing algorithm determines the processing order of the four channel data as L channel data, C channel data, Ls channel data, and Rs channel data.
As shown in FIG. 4, first, the calculator loads L channel data, which is channel data to be processed first, from a memory (step S11). For example, the calculator may take channel samples included in the L channel data stored in the memory and store the channel samples in the register. In this case, since the L channel data is the channel data to be processed first, previous cumulative data does not exist.
Next, the operation unit extracts the down mix coefficients corresponding to the L channel data loaded from the memory from the down mix coefficient table, and multiplies the extracted down mix coefficients by the L channel data (step: S12). For example, the calculator may multiply the downmix coefficients by the channel samples stored in the register unit.
When the calculation is completed, the operation unit determines whether the L channel data is the last input channel data for constituting the L output channel data, i.e., the Rs channel data. Since the L channel data is not the last input channel data, the calculated L channel data is stored in the memory as first cumulative data (step S13).
The process of steps S11 to S13 is expressed by Equation 1 below:

P1[0, 1, 2, …, k] = a × L[0, 1, 2, …, k] … (Equation 1)

Here, L[0, 1, 2, …, k] denotes the L channel data having k (k being an integer of 2 or more) channel samples, a denotes the downmix coefficient corresponding to the L channel data, and P1[0, 1, 2, …, k] denotes the L channel data calculated using the downmix coefficient, that is, the first cumulative data.
In steps S11 to S13, one loading operation from the memory and one storing operation into the memory occur. Here, one operation means a unit that processes all k channel samples.
Next, the operation unit loads the C channel data, which is the channel data to be processed second, and the previous accumulated data (ie, the first accumulated data) from the memory (step: S14). For example, the calculator may take channel samples included in the C channel data stored in the memory and store the channel samples in the register.
Next, the operation unit extracts the down mix coefficients corresponding to the C channel data loaded from the memory from the down mix coefficient table, and multiplies the extracted down mix coefficients to the C channel data (step: S15). For example, the calculator may multiply the downmix coefficients by the channel samples stored in the register unit.
When the calculation is completed, the operation unit determines whether the C channel data is the last input channel data for constituting the L output channel data, i.e., the Rs channel data. Since the C channel data is not the last input channel data, the calculated C channel data is added to the loaded first cumulative data, and the added data is stored in the memory as second cumulative data (step S16).
The process of steps S14 to S16 is expressed by Equation 2 below:

P2[0, 1, 2, …, k] = b × C[0, 1, 2, …, k] + P1[0, 1, 2, …, k] … (Equation 2)

Here, C[0, 1, 2, …, k] denotes the C channel data having k channel samples, b denotes the downmix coefficient corresponding to the C channel data, P1[0, 1, 2, …, k] denotes the first cumulative data, and P2[0, 1, 2, …, k] denotes the data obtained by adding the first cumulative data to the C channel data calculated using the downmix coefficient, that is, the second cumulative data.
In these steps S14 to S16, the loading operation from the memory occurs twice and the storage operation into the memory occurs once.
Next, the calculator loads the Ls channel data, which is the third channel data to be processed, and the previous cumulative data (ie, the second cumulative data) from the memory (step S17). For example, the operation unit may take channel samples included in the Ls channel data stored in the memory and store the channel samples in the register unit.
Next, the operation unit extracts the down mix coefficients corresponding to the Ls channel data loaded from the memory from the down mix coefficient table, and multiplies the extracted down mix coefficients to the Ls channel data (step: S18). For example, the calculator may multiply the downmix coefficients by the channel samples stored in the register unit.
When the calculation is completed, the operation unit determines whether the Ls channel data is the last input channel data for constituting the L output channel data, i.e., the Rs channel data. Since the Ls channel data is not the last input channel data, the calculated Ls channel data is added to the loaded second cumulative data, and the added data is stored in the memory as third cumulative data (step S19).
The process of steps S17 to S19 is expressed by Equation 3 below:

P3[0, 1, 2, …, k] = c × Ls[0, 1, 2, …, k] + P2[0, 1, 2, …, k] … (Equation 3)

Here, Ls[0, 1, 2, …, k] denotes the Ls channel data having k channel samples, c denotes the downmix coefficient corresponding to the Ls channel data, P2[0, 1, 2, …, k] denotes the second cumulative data, and P3[0, 1, 2, …, k] denotes the data obtained by adding the second cumulative data to the Ls channel data calculated using the downmix coefficient, that is, the third cumulative data.
In these steps S17 to S19, the loading operation from the memory occurs twice and the storage operation into the memory occurs once.
Next, the calculator loads Rs channel data, which is the fourth channel data to be processed, and previous cumulative data (ie, third cumulative data) from the memory (step S20). For example, the calculator may take channel samples included in the Rs channel data stored in the memory and store the channel samples in the register.
Next, the calculating unit extracts the down mix coefficients corresponding to the Rs channel data loaded from the memory from the down mix coefficient table, and multiplies the extracted down mix coefficients by the Rs channel data (step: S21). For example, the calculator may multiply the downmix coefficients by the channel samples stored in the register unit.
When the calculation is completed, the operation unit determines whether the Rs channel data is the last input channel data for constituting the L output channel data, i.e., the Rs channel data. Since the Rs channel data currently being processed is the last input channel data, the calculated Rs channel data is added to the loaded third cumulative data, and the added data is calculated using the window coefficients (step S22). For example, the calculator may multiply the window coefficients by the value obtained by adding the third cumulative data to the Rs channel data calculated using the downmix coefficient.
The process of steps S20 to S22 is expressed by Equation 4 below:

Pout[0, 1, 2, …, k] = W[0, 1, 2, …, k] × (d × Rs[0, 1, 2, …, k] + P3[0, 1, 2, …, k]) … (Equation 4)

Here, Rs[0, 1, 2, …, k] denotes the Rs channel data having k channel samples, d denotes the downmix coefficient corresponding to the Rs channel data, P3[0, 1, 2, …, k] denotes the third cumulative data, W[0, 1, 2, …, k] denotes the window coefficients, and Pout[0, 1, 2, …, k] denotes the data obtained by multiplying the window coefficients by the sum of the third cumulative data and the Rs channel data calculated using the downmix coefficient, that is, the L output channel data.
In these steps S20 to S22, the loading operation from the memory occurs three times and the storage operation into the memory occurs once.
Referring to FIG. 4, the process of downmixing and windowing four channel data and outputting the L output channel data has been described. In this process, a total of eight loading operations from the memory and four storing operations into the memory occurred. In the conventional case, since the downmix procedure and the windowing procedure are completely separate, many more memory accesses are required: when the downmix is completed, the finished data is stored in the memory and must then be reloaded for windowing, causing unnecessary accesses to the memory.
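The memory-access accounting above (eight loads, four stores for the L output channel) can be reproduced with a small sketch of steps S11 to S22. The memory dict stands in for RAM, each dict access counts as one unit operation over all k channel samples, and the coefficient and window values are arbitrary placeholders, not values from the patent:

```python
def produce_l_output(memory, coeffs):
    """Steps S11-S22 in one pass: each channel is loaded, multiplied by
    its downmix coefficient, and accumulated; only on the last input
    channel (Rs) are the window coefficients applied, so the finished
    downmix is never stored and reloaded."""
    loads = stores = 0
    order = ["L", "C", "Ls", "Rs"]   # processing order fixed by the algorithm
    for i, name in enumerate(order):
        data = memory[name]; loads += 1                 # S11/S14/S17/S20: load channel data
        scaled = [coeffs[name] * x for x in data]       # S12/S15/S18/S21: downmix multiply
        if i > 0:
            prev = memory["acc"]; loads += 1            # load previous cumulative data
            scaled = [s + p for s, p in zip(scaled, prev)]
        if name == "Rs":                                # last input channel data
            w = memory["window"]; loads += 1            # extract window coefficients
            out = [wi * x for wi, x in zip(w, scaled)]  # S22: windowing in the same pass
            memory["Lout"] = out; stores += 1
            return out, loads, stores
        memory["acc"] = scaled; stores += 1             # S13/S16/S19: store cumulative data

k = 4
mem = {"L": [1.0] * k, "C": [2.0] * k, "Ls": [3.0] * k,
       "Rs": [4.0] * k, "window": [0.5] * k}
co = {"L": 1.0, "C": 0.5, "Ls": 0.5, "Rs": 0.5}
l_out, n_loads, n_stores = produce_l_output(mem, co)
```

This reproduces the stated totals (eight loads, four stores) while avoiding the store-and-reload of the finished downmix that the conventional two-pass flow requires.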
FIG. 5 is a flowchart illustrating an audio data processing method according to another exemplary embodiment of the present invention, showing the procedure of downmixing and windowing the four channel data other than the L channel data among the five externally input channel data to output the R output channel data. It is assumed that the downmixing and windowing algorithm determines the processing order of the four channel data as R channel data, C channel data, Ls channel data, and Rs channel data.
As illustrated in FIG. 5, the procedure of downmixing and windowing four channel data and outputting the R output channel data may be performed in the same concept as the procedure of outputting the L output channel data described above.
The calculation unit loads the R channel data (step S31) and performs a calculation using the downmix coefficient (step S32); since the R channel data is not the last input channel data, it stores the result in the memory as first cumulative data (step S33). Subsequently, the calculation unit loads the C channel data (step S34) and performs a calculation using the downmix coefficient (step S35); since the C channel data is not the last input channel data, it stores the added result in the memory as second cumulative data (step S36).
Next, the calculation unit loads the Ls channel data (step S37) and performs a calculation using the downmix coefficient (step S38); since the Ls channel data is not the last input channel data, it stores the added result in the memory as third cumulative data (step S39). The calculation unit then loads the Rs channel data (step S40), performs a calculation using the downmix coefficient (step S41), confirms that the Rs channel data is the last input channel data, adds the calculated Rs channel data to the third cumulative data, and calculates the added data using the window coefficients (step S42).
Although the present invention has been described above with reference to its preferred embodiments, those skilled in the art can variously modify and change the present invention without departing from the spirit and scope of the invention set forth in the claims below. Accordingly, such modifications of the embodiments do not depart from the scope of the present invention.
FIG. 1 is a block diagram showing the configuration of an audio system including an audio data processing apparatus according to a preferred embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the data processing unit shown in FIG. 1.
FIG. 3 is a flowchart illustrating an operation flow of the calculation unit included in the data processing unit.
FIG. 4 is a flowchart illustrating an audio data processing method according to another exemplary embodiment of the present invention.
FIG. 5 is a flowchart illustrating an audio data processing method according to another exemplary embodiment of the present invention.
Description of Reference Numerals
10: decoder
20: domain conversion unit
30: data processing unit
31: calculation unit
33: Arithmetic Logic Unit (ALU)
34: register
36: memory
40: audio output
Claims (12)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020090018624A KR101078379B1 (en) | 2009-03-04 | 2009-03-04 | Method and Apparatus for Processing Audio Data |
PCT/KR2010/001276 WO2010101381A2 (en) | 2009-03-04 | 2010-03-02 | Method and apparatus for processing audio data |
JP2011552879A JP2012519310A (en) | 2009-03-04 | 2010-03-02 | Audio data processing method and apparatus |
CN2010800103099A CN102341845B (en) | 2009-03-04 | 2010-03-02 | Method and apparatus for processing audio data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020090018624A KR101078379B1 (en) | 2009-03-04 | 2009-03-04 | Method and Apparatus for Processing Audio Data |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20100099998A KR20100099998A (en) | 2010-09-15 |
KR101078379B1 true KR101078379B1 (en) | 2011-10-31 |
Family
ID=42710096
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020090018624A KR101078379B1 (en) | 2009-03-04 | 2009-03-04 | Method and Apparatus for Processing Audio Data |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP2012519310A (en) |
KR (1) | KR101078379B1 (en) |
CN (1) | CN102341845B (en) |
WO (1) | WO2010101381A2 (en) |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02294200A (en) * | 1989-05-09 | 1990-12-05 | Mitsubishi Electric Corp | Sound receiver |
JPH06165079A (en) * | 1992-11-25 | 1994-06-10 | Matsushita Electric Ind Co Ltd | Down mixing device for multichannel stereo use |
JP3761639B2 (en) * | 1995-09-29 | 2006-03-29 | ユナイテッド・モジュール・コーポレーション | Audio decoding device |
US6128597A (en) * | 1996-05-03 | 2000-10-03 | Lsi Logic Corporation | Audio decoder with a reconfigurable downmixing/windowing pipeline and method therefor |
KR20000073030A (en) * | 1999-05-04 | 2000-12-05 | 김영환 | Device for overlapping and adding window in AC-3 decoder |
JP2001298680A (en) * | 2000-04-17 | 2001-10-26 | Matsushita Electric Ind Co Ltd | Specification of digital broadcasting signal and its receiving device |
JP2004109362A (en) * | 2002-09-17 | 2004-04-08 | Pioneer Electronic Corp | Apparatus, method, and program for noise removal of frame structure |
WO2010038318A1 (en) * | 2008-10-01 | 2010-04-08 | Thomson Licensing | Decoding apparatus, decoding method, encoding apparatus, encoding method, and editing apparatus |
-
2009
- 2009-03-04 KR KR1020090018624A patent/KR101078379B1/en not_active IP Right Cessation
-
2010
- 2010-03-02 CN CN2010800103099A patent/CN102341845B/en not_active Expired - Fee Related
- 2010-03-02 WO PCT/KR2010/001276 patent/WO2010101381A2/en active Application Filing
- 2010-03-02 JP JP2011552879A patent/JP2012519310A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
KR20100099998A (en) | 2010-09-15 |
WO2010101381A2 (en) | 2010-09-10 |
CN102341845A (en) | 2012-02-01 |
JP2012519310A (en) | 2012-08-23 |
WO2010101381A3 (en) | 2010-11-18 |
CN102341845B (en) | 2013-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11343631B2 (en) | Compatible multi-channel coding/decoding | |
US7853343B2 (en) | Acoustic device and reproduction mode setting method | |
AU2005281937B2 (en) | Generation of a multichannel encoded signal and decoding of a multichannel encoded signal | |
JP4601669B2 (en) | Apparatus and method for generating a multi-channel signal or parameter data set | |
CN101228575B (en) | Sound channel reconfiguration with side information | |
RU2420814C2 (en) | Audio decoding | |
KR102226071B1 (en) | Binaural rendering method and apparatus for decoding multi channel audio | |
JP2014089467A (en) | Encoding/decoding system for multi-channel audio signal, recording medium and method | |
CN1922657B (en) | Decoding scheme for variable block length signals | |
JP2008519301A (en) | Stereo compatible multi-channel audio coding | |
WO2005122639A1 (en) | Acoustic signal encoding device and acoustic signal decoding device | |
CN101379554A (en) | Apparatus and method for encoding/decoding signal | |
KR101078379B1 (en) | Method and Apparatus for Processing Audio Data | |
KR101464977B1 (en) | Method of managing a memory and Method and apparatus of decoding multi channel data | |
US9466302B2 (en) | Coding of spherical harmonic coefficients | |
RU2795865C2 (en) | Audio coder and audio decoder | |
JP2009151183A (en) | Multi-channel voice sound signal coding device and method, and multi-channel voice sound signal decoding device and method | |
US20070019521A1 (en) | Method and apparatus for flexibly setting channels of multi-channel audio data | |
JP2015186143A (en) | Channel number converter |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
FPAY | Annual fee payment (payment date: 20141007; year of fee payment: 4)
FPAY | Annual fee payment (payment date: 20160602; year of fee payment: 5)
R401 | Registration of restoration
FPAY | Annual fee payment (payment date: 20161019; year of fee payment: 6)
LAPS | Lapse due to unpaid annual fee |