KR20170139988A - Video conference server

Info

Publication number: KR20170139988A
Application number: KR1020160072705A
Authority: KR
Inventors: 강진아; 장종현
Original assignee: 한국전자통신연구원
Priority date: 2016-06-10
Filing date: 2016-06-10
Publication date: 2017-12-20

Abstract

A video conference server according to an embodiment of the present invention includes: an audio decoding unit for decoding each of audio signals received from a plurality of terminal devices participating in the same video conference; an audio mixing unit for determining a mixing mode according to the number of the plurality of terminal devices participating in the same video conference and mixing each of the audio signals according to the determined mixing mode; an audio encoding unit for encoding the mixed audio signals; and an audio transmitting unit for transmitting the encoded mixed audio signals to the plurality of terminal devices, respectively. Accordingly, the present invention can provide improved audio quality.

Description

VIDEO CONFERENCE SERVER

The present invention relates to a video conference server and, more particularly, to a technique for mixing the audio data generated by terminal devices participating in a video conference.

In a multipoint conference service, each participant's voice data, encoded by a speech encoder, is transmitted to a multipoint conference server. The server sends each participant voice data in which the voices of all the other participants are mixed.

To mix the voice data, the server first decodes each participant's voice data and adds all of the decoded voice signals to obtain the voice signal of all participants. Then, for each participant, it subtracts that participant's own voice from the all-participant signal, encodes the result, and transmits the resulting voice data.
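The sum-then-subtract procedure described above can be sketched as follows. This is a minimal illustration that assumes the decoded signals are equal-length lists of samples; the function name `mix_minus` and the sample values are hypothetical and do not come from the patent.

```python
def mix_minus(decoded):
    """Return one mixed signal per participant, excluding that participant's own voice.

    decoded: list of equal-length sample lists, one per participant (assumed layout).
    """
    # Step 1: add all decoded signals once to get the all-participant signal.
    total = [sum(samples) for samples in zip(*decoded)]
    # Step 2: for each participant, subtract that participant's own samples.
    return [[t - s for t, s in zip(total, own)] for own in decoded]


streams = [[1, 2], [10, 20], [100, 200]]
outputs = mix_minus(streams)
print(outputs)
```

Summing once and subtracting per participant needs one pass for the total plus one subtraction per output, rather than re-adding the other N-1 signals for every participant.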

In a video conference system, audio is mixed on a separate server so that the media data (video and sound) generated by each terminal device can be processed (compressed, transmitted, output, and so on) efficiently.

For audio signals, the magnitude of the summed signal grows as the number of mixed signals, that is, the number of connected terminal devices, increases. Because computer arithmetic uses a limited number of bits, the sound saturates and becomes unclear as the number of mixed signals increases. Video conference participants may therefore find it hard to communicate.
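A short worked example may make the saturation problem concrete. The patent only says that a limited number of bits is used; the 16-bit sample width and the peak amplitude below are assumptions chosen for illustration.

```latex
% Assumption: signed B-bit samples, so every value must lie in
%   [-2^{B-1},\; 2^{B-1}-1].
% If each of the N-1 mixed signals peaks at amplitude A, the plain sum can reach
\[
  |y_k| \;\le\; \sum_{i \ne k} |x_i| \;\le\; (N-1)\,A .
\]
% With B = 16 (limit 32767) and A = 10000:
\[
  (N-1) = 3:\; 30000 \le 32767 \quad\text{(fits)}, \qquad
  (N-1) = 4:\; 40000 > 32767 \quad\text{(saturates)} .
\]
```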

Patent Publication No. KR 10-2009-0035728

An embodiment of the present invention aims to provide a video conference server that can offer video conference users improved sound quality by intelligently adjusting signal levels according to the number of connected terminals when mixing the audio data transmitted from video conference terminal devices.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned here will be clearly understood by those skilled in the art from the following description.

A video conference server according to an embodiment of the present invention may include: an audio decoding unit that decodes each of the audio signals received from a plurality of terminal devices participating in the same video conference; an audio mixing unit that determines a mixing mode according to the number of terminal devices participating in the same video conference and mixes the audio signals according to the determined mixing mode; an audio encoding unit that encodes the mixed audio signals; and an audio transmission unit that transmits the encoded mixed audio signals to each of the plurality of terminal devices.

With this technique, the video conference system performs audio mixing appropriate to the number of conference participants and can thereby provide users with improved audio quality.

FIG. 1 is an overall configuration diagram of a video conference system according to an embodiment of the present invention.
FIG. 2 is a detailed configuration diagram of an audio mixing unit in a video conference server according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating an audio mixing method for a video conference in a video conference system according to an embodiment of the present invention.
FIG. 4 is a configuration diagram of a computer system to which a distributed resource management system according to an embodiment of the present invention is applied.

Hereinafter, some embodiments of the present invention are described in detail with reference to the exemplary drawings. In assigning reference numerals to the components of each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments of the present invention, a detailed description of a related known configuration or function is omitted where it is judged that it would hinder understanding of the embodiments.

In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless explicitly defined in this application, are not to be interpreted in an idealized or overly formal sense.

Hereinafter, embodiments of the present invention are described in detail with reference to FIGS. 1 to 4.

FIG. 1 is an overall configuration diagram of a video conference system according to an embodiment of the present invention.

The video conference system according to an embodiment of the present invention includes a plurality of terminal devices 100 that take part in a video conference and a video conference server 200 that communicates with the terminal devices 100 over a network.

The video conference server 200 receives encoded audio signals ($x_1$ to $x_N$) from the terminal devices (101, 102, ..., 100N) participating in the same video conference. The video conference server 200 decodes the received audio signals ($x_1$ to $x_N$) and, for each terminal device, adds the audio signals of all the other terminal devices, excluding that terminal device's own signal, to generate the mixed audio signals ($y_1$ to $y_N$). The video conference server 200 then encodes the mixed audio signals ($y_1$ to $y_N$) and transmits them to the corresponding terminal devices (101, 102, ..., 100N).

To this end, the video conference server 200 includes an audio receiving unit 210, an audio decoding unit 220, an audio mixing unit 230, an audio encoding unit 240, and an audio transmission unit 250.

The audio receiving unit 210 receives the encoded audio signals ($x_1$ to $x_N$) from the terminal devices (101, 102, ..., 100N) participating in the same video conference.

The audio decoding unit 220 decodes the received audio signals ($x_1$ to $x_N$).

The audio mixing unit 230 determines a mixing mode for the decoded audio signals and mixes them according to the determined mode. For each terminal device, the audio mixing unit 230 adds the audio signals of all the other terminal devices, excluding that terminal device's own signal, to generate the mixed audio signals ($y_1$ to $y_N$).

The audio encoding unit 240 encodes the mixed audio signals ($y_1$ to $y_N$).

The audio transmission unit 250 transmits the encoded audio signals to the corresponding terminal devices (101, 102, ..., 100N).

FIG. 2 is a detailed configuration diagram of the audio mixing unit 230 in the video conference server according to an embodiment of the present invention.

The audio mixing unit 230 includes an audio mixing preprocessing unit 231, an audio mixing mode determination unit 232, a mixing mode unit 233, and an audio mixing post-processing unit 234.

When N audio signals are input, the audio mixing preprocessing unit 231 performs pre-mixing processing such as energy normalization.
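The patent names energy normalization as one preprocessing step but does not say how it is done. As one possible reading, the sketch below scales each signal to a common root-mean-square (RMS) level. The function names and the `target_rms` parameter are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_energy(samples, target_rms):
    """Scale samples so that their RMS level equals target_rms (assumed scheme)."""
    current = rms(samples)
    if current == 0.0:
        # A silent signal cannot be scaled to a non-zero level; pass it through.
        return list(samples)
    scale = target_rms / current
    return [s * scale for s in samples]


loud = [3.0, -3.0, 3.0, -3.0]
print(normalize_energy(loud, target_rms=1.0))
```

Bringing all inputs to a similar level before mixing keeps one loud terminal from dominating the sum, which is presumably the motivation for this stage.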

The audio mixing mode determination unit 232 determines the mixing mode used to mix the audio signals. Three mixing modes are defined: true mode, uniform mode, and non-uniform mode.

The true mixing mode adds the input audio signals ($x_k$) as they are:

$$y_k = x_1 + x_2 + \dots + x_{k-1} + x_{k+1} + \dots + x_N$$

The uniform mixing mode multiplies every input audio signal by the same gain ($g$) before adding:

$$y_k = g\,x_1 + g\,x_2 + \dots + g\,x_{k-1} + g\,x_{k+1} + \dots + g\,x_N$$

The non-uniform mixing mode multiplies each audio signal by its own gain before adding, for a specific purpose such as emphasizing the audio of the main speaker:

$$y_k = g_1 x_1 + g_2 x_2 + \dots + g_{k-1} x_{k-1} + g_{k+1} x_{k+1} + \dots + g_N x_N$$
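All three modes are the same weighted sum with different gain vectors, which the following sketch illustrates. The function name `mix_for`, the zero-based index `k`, and the example gains are assumptions for illustration only.

```python
def mix_for(k, signals, gains=None):
    """Mix every signal except signals[k] into one output signal.

    gains[i] is the gain applied to signals[i]; when omitted, every gain is 1.0,
    which corresponds to the true mixing mode.
    """
    n = len(signals)
    if gains is None:
        gains = [1.0] * n
    out = [0.0] * len(signals[0])
    for i in range(n):
        if i == k:
            continue  # the receiving terminal's own signal is left out
        for t, sample in enumerate(signals[i]):
            out[t] += gains[i] * sample
    return out


signals = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
true_mix = mix_for(0, signals)                               # gains all 1.0
uniform_mix = mix_for(0, signals, gains=[0.5, 0.5, 0.5])     # one shared gain
non_uniform_mix = mix_for(0, signals, gains=[1.0, 1.0, 0.5]) # per-signal gains
print(true_mix, uniform_mix, non_uniform_mix)
```

Treating the modes as gain vectors means a single mixing routine can serve all three, with the mode decision reduced to choosing the gains.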

The audio mixing mode determination unit 232 may determine the mixing mode using Equation 1 below.

Here, $M$ is the mixing mode, $N$ is the number of signals being mixed, and $\theta_1$ and $\theta_2$ are thresholds. That is, the mode is true mode when $N$ is less than $\theta_1$, uniform mode when $N$ is greater than or equal to $\theta_1$ and less than $\theta_2$, and non-uniform mode otherwise.
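The equation itself did not survive in this text. Based only on the verbal description above, it can be reconstructed as the following piecewise definition; the layout is an assumption, while the conditions are taken directly from the prose.

```latex
\[
  M =
  \begin{cases}
    \text{true},        & N < \theta_1 \\
    \text{uniform},     & \theta_1 \le N < \theta_2 \\
    \text{non-uniform}, & \text{otherwise}
  \end{cases}
\]
```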

In the uniform mixing mode, the gain ($g$) can be calculated as in Equation 2 below.
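Equation 2 is also missing from this text. The description of the method later gives the uniform gain as a value inversely proportional to the number of connected terminal devices, written 1/N. Assuming Equation 2 takes that form, and assuming every input peaks at some amplitude $A$, the mixed output stays below the peak of a single input:

```latex
% Assumed form of the uniform gain (taken from the later "1/N" remark):
\[
  g = \frac{1}{N}
\]
% Resulting bound on the mixed output for receiver k:
\[
  |y_k| \;\le\; g \sum_{i \ne k} |x_i| \;\le\; \frac{N-1}{N}\,A \;<\; A .
\]
```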

In the non-uniform mixing mode, the gain is a specific value $g'_1$ for a terminal designated as the main speaker and $g'_2$ otherwise. When a terminal's gain switches between $g'_1$ and $g'_2$, the gain can be increased or decreased gradually in proportion to the number of samples ($S$) of the input audio signal so that the output sounds smooth. That is, if terminal device $k$ was determined to be the main speaker in the previous mixing pass, so that gain $g'_1$ was applied, and is determined to be a non-main speaker in the current pass, so that gain $g'_2$ must be applied, the gain $g'_2$ can be calculated as in Equation 3 below.

Alternatively, a sigmoid function or the like can be used so that the gain increases or decreases smoothly from sample to sample.
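Equation 3 is not reproduced in this text, so its exact form is unknown. The sketch below shows two plausible per-sample gain transitions: a linear ramp, which is one reading of "in proportion to the number of samples", and a sigmoid ramp. The function names, the `steepness` parameter, and the example gains are all assumptions.

```python
import math


def linear_ramp(g_from, g_to, num_samples):
    """Per-sample gains that move from g_from to g_to in equal steps."""
    if num_samples == 1:
        return [g_to]
    step = (g_to - g_from) / (num_samples - 1)
    return [g_from + step * s for s in range(num_samples)]


def sigmoid_ramp(g_from, g_to, num_samples, steepness=10.0):
    """Per-sample gains that follow a sigmoid curve from near g_from to near g_to.

    The sigmoid only approaches its limits, so the first and last gains are
    close to, but not exactly, g_from and g_to.
    """
    gains = []
    for s in range(num_samples):
        pos = s / (num_samples - 1) if num_samples > 1 else 1.0
        weight = 1.0 / (1.0 + math.exp(-steepness * (pos - 0.5)))
        gains.append(g_from + (g_to - g_from) * weight)
    return gains


# Terminal k changes from main speaker (gain 1.0) to non-main speaker (gain 0.5).
print(linear_ramp(1.0, 0.5, 5))
print(sigmoid_ramp(1.0, 0.5, 5))
```

Multiplying each input sample by the matching entry of the ramp avoids the audible click that an abrupt gain change at a frame boundary could cause.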

The mixing mode unit 233 mixes the received audio signals according to the mixing mode determined by the audio mixing mode determination unit 232.

The audio mixing post-processing unit 234 performs post-processing such as clipping on the mixed audio signal.
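Clipping as a post-processing step can be sketched as a clamp to the representable sample range. The patent does not state the sample format; signed 16-bit PCM is assumed here, and the function name is hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed sample format: signed 16-bit PCM


def clip_to_int16(samples):
    """Round each mixed sample and clamp it to the signed 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, round(s))) for s in samples]


mixed = [1200.4, 40000.0, -50000.0]
print(clip_to_int16(mixed))
```

Hard clipping of this kind keeps values in range but distorts any sample that exceeded it, which is why the earlier gain selection matters: the less the mix overshoots, the less the clipper has to do.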

Hereinafter, an audio mixing method for a video conference in a video conference system according to an embodiment of the present invention is described in detail with reference to FIG. 3.

First, the terminal devices (101, 102, 103) participating in one video conference transmit their audio signals to the video conference server 200 over the network (S101, S102, S103).

The video conference server 200 decodes each received audio signal (S104) and determines the mixing mode to use for mixing the signals (S105). As described above, the mixing mode is determined using Equation 1 to be one of the true mixing mode, the uniform mixing mode, and the non-uniform mixing mode. That is, the true mixing mode is selected when the number of signals being mixed is less than threshold 1, the uniform mode when the number is greater than or equal to threshold 1 and less than threshold 2, and the non-uniform mode otherwise.
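Step S105 can be sketched as a small decision function. The patent does not give values for the thresholds, so the values 4 and 8 used in the example are purely hypothetical, as are the names `MixingMode` and `choose_mode`.

```python
from enum import Enum


class MixingMode(Enum):
    TRUE = "true"
    UNIFORM = "uniform"
    NON_UNIFORM = "non-uniform"


def choose_mode(n, theta1, theta2):
    """Pick the mixing mode from the number of mixed signals n and two thresholds."""
    if n < theta1:
        return MixingMode.TRUE
    if n < theta2:
        return MixingMode.UNIFORM
    return MixingMode.NON_UNIFORM


# Hypothetical thresholds: theta1 = 4, theta2 = 8.
for n in (3, 4, 7, 8):
    print(n, choose_mode(n, theta1=4, theta2=8).value)
```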

The video conference server 200 then mixes the audio signals according to the determined mixing mode (S106). For each terminal device, the video conference server 200 adds the audio signals of all the other terminal devices, excluding that terminal device's own signal, to generate the mixed audio signals ($y_1$ to $y_N$). For example, the audio signal to be sent to terminal device 1 (101) is produced by mixing $x_2 + x_3 + \dots + x_N$, which excludes the audio signal $x_1$ received from terminal device 1 (101), to generate the mixed audio signal $y_1$. The audio signal to be sent to terminal device 2 (102) is produced by mixing $x_1 + x_3 + \dots + x_N$, which excludes the audio signal $x_2$ received from terminal device 2 (102), to generate the mixed audio signal $y_2$.

In this way, audio mixing takes as input the audio signals received from all terminal devices other than the specific terminal device that will receive the mixed output. In the true mixing mode the input audio signals are added as they are. In the uniform mixing mode they are multiplied by a gain ($1/N$) that is inversely proportional to the number of connected terminal devices and then added. In the non-uniform mixing mode, gain value 1 is applied to a terminal designated as the main speaker and gain value 2 is applied otherwise.
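The gain assignment for each mode described above can be written as a single helper. The mode strings, the function name `gains_for`, and the default values for gain value 1 (`g_main`) and gain value 2 (`g_other`) are illustrative assumptions; the patent does not specify numeric gains.

```python
def gains_for(mode, n, main_speaker=None, g_main=1.0, g_other=0.5):
    """Return one gain per terminal (index 0..n-1) for the given mixing mode."""
    if mode == "true":
        return [1.0] * n            # signals are added as they are
    if mode == "uniform":
        return [1.0 / n] * n        # gain inversely proportional to terminal count
    if mode == "non-uniform":
        # gain value 1 for the main speaker, gain value 2 for everyone else
        return [g_main if i == main_speaker else g_other for i in range(n)]
    raise ValueError(f"unknown mixing mode: {mode!r}")


print(gains_for("true", 3))
print(gains_for("uniform", 4))
print(gains_for("non-uniform", 3, main_speaker=1))
```

The gain for the receiving terminal's own index is simply unused when its signal is excluded from the sum.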

In particular, in the non-uniform mixing mode, suppose terminal device $k$ was determined to be the main speaker in the previous mixing pass, so that gain value 1 was applied, and is determined to be a non-main speaker in the current pass, so that gain value 2 must be applied. In that case the gain can be decreased or increased gradually between the previously applied gain value 1 and the gain value 2 to be applied now, with the gain applied to each input audio signal sample by sample.

The video conference server 200 then encodes the mixed audio signals ($y_1$ to $y_N$) (S107) and transmits the encoded audio signals to the corresponding terminal devices (101, 102, ..., 100N) (S108, S109, S110).

In this way, the present invention performs audio mixing appropriate to the number of video conference participants and can thereby provide users with improved audio quality.

FIG. 4 is a configuration diagram of a computer system to which the audio mixing method for a video conference according to an embodiment of the present invention is applied.

Referring to FIG. 4, the computing system 1000 may include at least one processor 1100, a memory 1300, a user interface input device 1400, a user interface output device 1500, a storage 1600, and a network interface 1700, which are connected through a bus 1200.

The processor 1100 may be a central processing unit (CPU) or a semiconductor device that processes instructions stored in the memory 1300 and/or the storage 1600. The memory 1300 and the storage 1600 may include various types of volatile or non-volatile storage media. For example, the memory 1300 may include ROM (Read Only Memory) and RAM (Random Access Memory).

Accordingly, the steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by the processor 1100, or in a combination of the two. The software module may reside in a storage medium (that is, the memory 1300 and/or the storage 1600) such as RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, or a CD-ROM.

An exemplary storage medium is coupled to the processor 1100, and the processor 1100 can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor 1100. The processor and the storage medium may reside within an application-specific integrated circuit (ASIC). The ASIC may reside within a user terminal. Alternatively, the processor and the storage medium may reside as discrete components within a user terminal.

The foregoing description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention belongs can make various modifications and variations without departing from its essential characteristics.

Accordingly, the embodiments disclosed in the present invention are intended to describe, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be interpreted as falling within the scope of rights of the present invention.

100: Terminal device
200: Video conference server
210: Audio receiving unit
220: Audio decoding unit
230: Audio mixing unit
240: Audio encoding unit
250: Audio transmission unit
231: Audio mixing preprocessing unit
232: Audio mixing mode determination unit
233: Mixing mode unit
234: Audio mixing post-processing unit
235: True mixing unit
236: Uniform mixing unit
237: Non-uniform mixing unit

Claims

A video conference server comprising:
an audio decoding unit for decoding each of the audio signals received from a plurality of terminal devices participating in the same video conference;
an audio mixing unit for determining a mixing mode according to the number of terminal devices participating in the same video conference and mixing the audio signals according to the determined mixing mode;
an audio encoding unit for encoding the mixed audio signals; and
an audio transmission unit for transmitting the encoded mixed audio signals to each of the plurality of terminal devices.