KR20040044849A - Method and system for providing media services

Info

Publication number: KR20040044849A
Application number: KR10-2003-7017098A
Authority: KR
Inventors: 라우센아더이르빈; 이스라엘데이비드; 매크나이트토마스; 도스트써칸리셉; 스텐위크도날드에이
Original assignee: 아이피 유니티
Priority date: 2001-06-29
Filing date: 2002-06-28
Publication date: 2004-05-31

Abstract

The present invention provides a method and system for providing media services in Voice over Internet Protocol (VoIP) telephony. A switch is coupled between one or more audio sources and a network interface controller. The switch can be a packet switch or a cell switch. The invention also provides a method and system for distributed conference bridge processing in VoIP telephony. The distributed conference bridge multicasts the mixed audio content of a conference call in a way that relieves the mixing device of replication work. The invention further provides a method and system for noiseless switching between independent audio streams. Such noiseless switching preserves valid RTP information at switchover.

Description

METHOD AND SYSTEM FOR PROVIDING MEDIA SERVICES

Audio has long been carried in telephone calls over networks. Conventional circuit-switched time division multiplexing (TDM) networks, including the public switched telephone network (PSTN) and plain old telephone service (POTS), have been used for this purpose. These circuit-switched networks establish a circuit across the network for each call. Audio is carried over that circuit in real time in analog and/or digital form.

The emergence of packet-switched networks, such as local area networks (LANs) and the Internet, now requires that audio be carried digitally in packets. Audio can include, but is not limited to, voice, music, or other types of audio data. Voice over Internet Protocol (VoIP, also called Voice over IP) systems carry the digital audio data belonging to a telephone call in packets over a packet-switched network rather than over a conventional circuit-switched network. In one example, a VoIP system forms two or more connections using TCP/IP (Transmission Control Protocol/Internet Protocol) addresses to establish a telephone call. Devices that connect to a VoIP network must follow standard TCP/IP packet protocols in order to interoperate with other devices in the VoIP network. Examples of such devices are IP telephones, integrated access devices, media gateways, and media servers.

A media server is often an endpoint in a VoIP telephone call. The media server is responsible for ingress and egress audio streams, that is, audio streams that enter and leave the media server, respectively. The type of audio produced by a media server is constrained by the application that corresponds to the telephone call, such as voice mail, conference bridge, interactive voice response (IVR), or speech recognition. In many applications the audio produced is unpredictable and naturally varies with end-user responses. Whole audio segments, such as words, sentences, and music, must be assembled dynamically in real time as they are played out in an audio stream.

Packet-switched networks, however, can introduce delay and jitter into the audio stream carried in a telephone call. The Real-time Transport Protocol (RTP) is often used to control delay, packet loss, and latency in an audio stream played out of a media server. The audio stream can be sent over a network link using RTP to a real-time device (such as a telephone) or a non-real-time device (such as an email client in unified messaging). RTP operates on top of a protocol such as UDP (User Datagram Protocol), which is part of the IP family. Among other things, RTP packets include a sequence number and a time stamp. The sequence number allows a destination application using RTP to detect the occurrence of lost packets and to ensure that packets are presented to the user in the correct order. The time stamp corresponds to the time at which the packet was assembled. The time stamp allows a destination application to ensure synchronized play-out to the destination user and to calculate delay and jitter. See D. Collins, Carrier Grade Voice over IP, McGraw-Hill: United States, Copyright 2001, pp. 52-72, the entire book of which is incorporated herein by reference.
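The role of the sequence number can be illustrated with a minimal Python sketch. It assumes the 12-byte fixed RTP header layout of RFC 3550; the function names `parse_rtp_header` and `count_lost` are hypothetical and are not taken from this specification.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (layout per RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    # Network byte order: 2 single bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "payload_type": b1 & 0x7F,   # low seven bits of the second byte
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

def count_lost(sequence_numbers):
    """Count missing packets in arrival order, allowing 16-bit wraparound."""
    lost = 0
    for prev, cur in zip(sequence_numbers, sequence_numbers[1:]):
        gap = (cur - prev) & 0xFFFF
        # A forward jump larger than 1 means packets are missing; gaps of
        # 0x8000 or more are treated as reordering and ignored here.
        if 1 < gap < 0x8000:
            lost += gap - 1
    return lost
```

A receiver could call `count_lost` on the sequence numbers it has seen to estimate loss; delay and jitter estimation from the time stamp is omitted from this sketch.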

A media server at an endpoint in a VoIP telephone call uses a protocol such as RTP to improve communication quality for a single audio stream. Such media servers, however, have been limited to outputting a single audio stream of RTP packets for a given telephone call.

A conference call links multiple parties over a network in a common call. Conference calls were originally carried over circuit-switched networks such as POTS or the PSTN. Conference calls are now also carried over packet-switched networks such as LANs and the Internet. Indeed, the emergence of VoIP systems has increased the demand for conference calls over networks.

A conference bridge connects the participants of a conference call. Different types of conference bridges have been used, depending in part on the type of network and on how voice is carried over the network to the conference bridge. One type of conference bridge is described in U.S. Patent No. 5,436,896 (see the entire patent). This conference bridge 10 operates in an environment in which voice signals are digitally encoded into 64 Kbps data streams (FIG. 1; col. 1, lines 21-26). Conference bridge 10 has a plurality of inputs 12 and outputs 14. Inputs 12 are connected through respective speech detectors 16 and switches 18 to a common summing amplifier 20. Speech detector 16 detects speech by sampling an input data stream and determining the amount of energy present over time (col. 1, lines 36-39). Each speech detector 16 controls a switch 18. When no speech is present, switch 18 is held open to reduce noise. During a conference call, the inputs of all participants who are speaking are coupled through summing amplifier 20 to each of the outputs 14. Subtractors 24 remove each participant's own voice data stream. The participants 1-n can then speak to and hear one another over the connections made through conference bridge 10. See U.S. Patent No. 5,436,896, col. 1, line 12 to col. 2, line 16.
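The sum-then-subtract behavior described for the summing amplifier and subtractors can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: one integer sample per participant per tick, and a boolean flag standing in for each speech detector and switch. The name `bridge_outputs` is hypothetical.

```python
def bridge_outputs(samples, speaking):
    """Return one output sample per participant.

    samples  -- one input sample per participant for the current tick
    speaking -- True where the speech detector has closed that input's switch
    """
    # Summing amplifier: add the inputs of all participants who are speaking.
    total = sum(s for s, on in zip(samples, speaking) if on)
    # Subtractor: remove each participant's own contribution, if any.
    return [total - (s if on else 0) for s, on in zip(samples, speaking)]
```

With inputs `[10, -4, 7]` and only the first and third participants speaking, each speaker hears the other and the silent participant hears both.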

Digitized voice is now also carried in packets over packet-switched networks. U.S. Patent No. 5,436,896 describes one example using asynchronous transfer mode (ATM) packets (also called cells). To support a conference call in this networking environment, conference bridge 10 converts incoming ATM cells into network packets. The digitized voice is extracted from the packets and processed in conference bridge 10 as described above. The summed output of digitized voice is converted back from network packets into ATM cells before being sent to participants 1-n. See U.S. Patent No. 5,436,896, col. 2, lines 17-36.

U.S. Patent No. 5,436,896 also describes a conference bridge 238, shown in FIGS. 2 and 3, that processes ATM cells without converting them to network packets and back as conference bridge 10 does. Conference bridge 238 has inputs 302-306, one from each participant, and outputs 308-312, one to each participant. Speech detectors 314-318 analyze the input data collected in sample-and-hold buffers 322-326. Speech detectors 314-318 report detected speech and/or the volume of detected speech to controller 320. See U.S. Patent No. 5,436,896, col. 4, lines 16-39.

Controller 320 is coupled to selector 328, gain control 329, and replicator 330. Controller 320 determines which participants are speaking based on the outputs of speech detectors 314-318. When a single speaker (for example, participant 1) is talking, controller 320 sets selector 328 to read data from buffer 322. The data passes through automatic gain control 329 to replicator 330. The replicator copies the data in the ATM cell selected by selector 328 for every participant except the speaker. See U.S. Patent No. 5,436,896, col. 4, line 40 to col. 5, line 5. When two or more speakers are talking, the loudest speaker is selected for a given selection period. The next loudest speaker is then selected in the following selection period. Simultaneous speech can be tracked by scanning speech detectors 314-318 and reconfiguring selector 328 at suitable intervals, such as every 6 milliseconds. See U.S. Patent No. 5,436,896, col. 5, lines 6-65.
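A rough Python sketch of one selection period follows, assuming the speech detectors report a volume per participant (or `None` when silent). It models only the choice of the single loudest speaker and the replication to everyone else; the alternation to the next loudest speaker in later periods, and the gain control stage, are left out. The function name and data shapes are hypothetical.

```python
def select_and_replicate(volumes, cells):
    """Pick the loudest talker and copy that talker's cell to all others.

    volumes -- {participant: reported volume, or None if no speech detected}
    cells   -- {participant: buffered cell data for this selection period}
    Returns {recipient: cell data}; empty when nobody is speaking.
    """
    active = {p: v for p, v in volumes.items() if v is not None}
    if not active:
        return {}
    loudest = max(active, key=active.get)  # selector setting for this period
    # Replicator: every participant except the selected speaker gets a copy.
    return {p: cells[loudest] for p in volumes if p != loudest}
```

Calling this once per scan interval would mimic the periodic reconfiguration of the selector described above.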

Another type of conference bridge is described in U.S. Patent No. 5,983,192 (see the entire patent). In one embodiment, conference bridge 12 receives compressed audio packets over a real-time protocol (RTP/RTCP). See U.S. Patent No. 5,983,192, col. 3, line 66 to col. 4, line 40. Conference bridge 12 includes audio processors 14a-14d. An exemplary audio processor 14c associated with site C (that is, participant C) includes a switch 22 and a selector 26. Selector 26 includes a speech detector that determines which of the other sites A, B, or D has the highest likelihood of speech. See U.S. Patent No. 5,983,192, col. 4, lines 40-67. Alternatives include selecting one or more sites and using an acoustic energy detector. See U.S. Patent No. 5,983,192, col. 5, lines 1-7. In another embodiment described in U.S. Patent No. 5,983,192, selector 26/switch 22 outputs several of the loudest speakers in separate streams to a local mixing end-point site. The loudest streams are sent to multiple sites. See U.S. Patent No. 5,983,192, col. 5, lines 8-67. Configurations of mixers/encoders for handling multiple simultaneous speakers, also referred to as "double-talk" and "triple-talk", are described as well. See U.S. Patent No. 5,983,192, col. 7, line 20 to col. 9, line 29.

VoIP systems continue to require improved conference bridges. For example, a softswitch VoIP architecture can use one or more media servers that support a media gateway control protocol such as MGCP (RFC 2705). See D. Collins, Carrier Grade Voice over IP, McGraw-Hill: United States, Copyright 2001, pp. 234-244, the entire book of which is incorporated herein by reference. Such media servers are often used to process the audio streams in VoIP calls. These media servers are often the endpoints at which audio streams are mixed in a conference call. These endpoints are also referred to as "conference bridge access points" because the media server is the endpoint at which media streams from multiple callers are mixed and provided back to all of the callers. See Collins, p. 242.

As the popularity of and demand for IP telephony and VoIP calls grow, media servers are expected to handle conference call processing with carrier-grade quality. A conference bridge in a media server needs to scale to handle different numbers of participants. Audio in packet streams, such as RTP/RTCP packets, needs to be processed efficiently in real time.

The present invention relates generally to audio communication over a network.

FIG. 1 is a diagram of a media server in an exemplary VoIP environment according to the present invention.

FIG. 2 is a diagram of an exemplary media server, including media services and resources, according to the present invention.

FIGS. 3A and 3B are diagrams of an audio processing platform according to an embodiment of the present invention.

FIG. 4 is a diagram of the audio processing platform of FIGS. 3A and 3B according to an exemplary implementation of the present invention.

FIG. 5A is a flowchart showing call setup and ingress packet processing according to an embodiment of the present invention.

FIG. 5B is a flowchart showing egress packet processing and call termination according to an embodiment of the present invention.

FIGS. 6A to 6F are diagrams of a noiseless switchover system according to an embodiment of the present invention, in which:

FIG. 6A is a diagram of a noiseless switchover system that performs cell switching of independent egress audio streams generated by internal audio sources according to an embodiment of the present invention.

FIG. 6B is a diagram of audio data flow in a noiseless switchover system that performs cell switching of independent egress audio streams generated by internal audio sources according to an embodiment of the present invention.

FIG. 6C is a diagram of a noiseless switchover system that performs cell switching between independent egress audio streams generated by internal and/or external audio sources according to an embodiment of the present invention.

FIG. 6D is a diagram of audio data flow in a noiseless switchover system that performs cell switching between independent egress audio streams generated by internal and/or external audio sources according to an embodiment of the present invention.

FIG. 6E is a diagram of audio data flow in a noiseless switchover system that performs packet switching between independent egress audio streams generated by internal and/or external audio sources according to an embodiment of the present invention.

FIG. 6F is a diagram of a noiseless switchover system that performs switching between independent egress audio streams generated by external audio sources according to an embodiment of the present invention.

FIG. 7A is a schematic diagram of an IP packet carrying RTP information.

FIG. 7B is a schematic diagram of an internal packet according to an embodiment of the present invention.

FIG. 8 is a flowchart showing a switching function according to an embodiment of the present invention.

FIGS. 9A, 9B, and 9C are flowcharts showing call event processing for audio stream switching according to an embodiment of the present invention.

FIG. 10 is a block diagram of a distributed conference bridge according to an embodiment of the present invention.

FIG. 11 is a diagram of an exemplary lookup table used in the distributed conference bridge of FIG. 10.

FIG. 12 is a flowchart of the operation of the distributed conference bridge of FIG. 10 in setting up a conference call.

FIGS. 13A, 13B, and 13C are flowcharts of the operation of the distributed conference bridge of FIG. 10 in processing a conference call.

FIG. 14A is a diagram of an exemplary internal packet generated by an audio source during a conference call according to an embodiment of the present invention.

FIG. 14B is a diagram of exemplary packet content in a fully mixed audio stream and a set of partially mixed audio streams according to the present invention.

FIG. 15 is a diagram of exemplary packet content, in a 64-participant conference call according to the present invention, after the packets of FIG. 14 have been multicast and after they have been processed into IP packets to be sent to the appropriate participants.

The present invention provides a method and system for providing media services in VoIP telephony. In one embodiment, a switch is coupled between multiple audio sources and a network interface controller. The switch can be a packet switch or a cell switch. Internal and/or external audio sources generate streams of audio packets. Any type of packet can be used. In one embodiment, an internal packet includes a packet header and a payload.

In one embodiment, the packet header carries information identifying the active speakers whose audio has been mixed. The payload carries the digitized mixed audio. According to one feature of the invention, one fully mixed audio packet stream and multiple partially mixed audio packet streams are generated by an audio source (for example, a DSP). The fully mixed audio stream contains the audio content of a group of identified active speakers. The packet header information identifies each active speaker in the fully mixed stream. In one example, the audio source inserts the conference identification number (CID) associated with each active speaker into a header field of the packet. The audio source inserts the mixed digital audio from the active speakers into the packet payload. The mixed digital audio corresponds to the voice or other audio input by the active speakers in the conference call.

Each partially mixed audio stream contains the audio content of the group of identified active speakers minus the audio content of a respective recipient active speaker. The recipient active speaker is the active speaker in the group to whom that partially mixed audio stream is directed. The audio source inserts into the packet payload the digital audio from the group of identified active speakers minus the audio content of the recipient active speaker. In this way, the recipient active speaker does not receive audio corresponding to his or her own voice or audio input. The packet header information identifies the active speakers whose audio content is included in the respective partially mixed audio stream. In one example, the audio source inserts one or more CIDs into TAS and IAS header fields of the packet. The TAS (Total Active Speakers) field lists the CIDs of all current active speakers in the conference call. The IAS (Included Active Speakers) field lists the CIDs of the active speakers whose audio content is in the respective partially mixed stream. In one embodiment, the audio source (or mixer, since it mixes the audio) dynamically generates the appropriate fully mixed and partially mixed audio packet streams, with CID information and mixed audio, during the conference call. The audio source retrieves the relevant CID information of the conference call participants from a relatively static lookup table that is created and stored when the conference call is initiated.

For example, in a conference call with 64 participants, three of whom are identified as active speakers (1-3), one fully mixed audio stream contains the audio from all three active speakers. This fully mixed stream is ultimately sent to each of the 61 passive participants. Three partially mixed audio streams are also generated. The first partially mixed stream 1 contains the audio from speakers 2 and 3 but not from speaker 1. The second partially mixed stream 2 contains the audio from speakers 1 and 3 but not from speaker 2. The third partially mixed stream 3 contains the audio from speakers 1 and 2 but not from speaker 3. The first through third partially mixed audio streams are ultimately sent to speakers 1 through 3, respectively. In this way, the audio source needs to generate only four mixed audio streams.
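The stream generation in this example can be sketched in Python as follows. The sketch assumes that mixing is plain sample addition and uses a single integer sample in place of a real audio payload; the `InternalPacket` class and `mix_streams` function are hypothetical names, and the actual internal packet layout is not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InternalPacket:
    tas: frozenset  # Total Active Speakers: CIDs of all current active speakers
    ias: frozenset  # Included Active Speakers: CIDs whose audio is in the payload
    payload: int    # one mixed sample stands in for the mixed audio payload

def mix_streams(active_audio):
    """Build c + 1 packets from {cid: sample} for the c active speakers."""
    tas = frozenset(active_audio)
    total = sum(active_audio.values())
    # Fully mixed packet: every active speaker is included.
    packets = [InternalPacket(tas, tas, total)]
    # One partially mixed packet per active speaker, with that speaker removed.
    for cid, sample in active_audio.items():
        packets.append(InternalPacket(tas, tas - {cid}, total - sample))
    return packets
```

For three active speakers this yields four packets regardless of how many passive participants are on the call, which matches the count given in the example above.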

The fully mixed audio stream and the multiple partially mixed audio streams are sent from the audio source (for example, a DSP) to a packet switch. A cell layer can also be used. The packet switch multicasts the fully mixed audio stream and each of the partially mixed audio streams to a network interface controller (NIC). The NIC then processes each packet to determine whether to forward a packet of the fully mixed or a partially mixed audio stream to a participant. This determination can be made in real time based on a lookup table at the NIC and the packet header information in the multicast audio streams.

In one embodiment, during initialization of a conference call, each participant in the call is assigned a CID. A switched virtual circuit (SVC) is also associated with each conference call participant. A lookup table containing an entry for each conference call participant is created and stored. Each entry includes network address information (for example, IP and UDP address information) and the CID of the respective conference call participant. The lookup table can be stored for access by both the NIC, which processes packets, and the audio source, which mixes audio, during the conference call.

The packet switch multicasts the fully mixed audio stream and each of the partially mixed audio streams to the NIC over all of the SVCs assigned to the conference call. The NIC processes each packet arriving on an SVC and, in particular, examines the packet header to determine whether to forward the packet of the fully mixed or partially mixed audio stream to a participant or to discard it. One advantage of the present invention is that this packet processing determination can be made quickly and in real time during the conference call, based on the packet header information and the CID information obtained from the lookup table. In one embodiment, a forwarded network packet includes the participant's network address information (IP/UDP) obtained from the lookup table, RTP packet header information (time stamp and sequence information), and the audio data.
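One plausible forward-or-discard rule is sketched below in Python. It is an assumption made for illustration, not the decision logic stated in this specification: a participant receives the packet whose IAS equals TAS with that participant's own CID removed, so passive participants match the fully mixed packet and active speakers match their own partially mixed packet. The tuple shapes, addresses, and the name `nic_forward` are hypothetical.

```python
def nic_forward(packets, lookup_table):
    """Choose, per participant, which multicast packet to forward.

    packets      -- list of (tas, ias, payload), with tas and ias as frozensets of CIDs
    lookup_table -- {cid: (ip_address, udp_port)}, built at conference setup
    Returns a list of ((ip_address, udp_port), payload) to transmit.
    """
    outgoing = []
    for cid, address in lookup_table.items():
        for tas, ias, payload in packets:
            # Assumed rule: forward the mix that holds every active speaker
            # except this participant; all other packets are discarded.
            if ias == tas - {cid}:
                outgoing.append((address, payload))
                break
    return outgoing
```

In a fuller implementation the NIC would also attach RTP time stamp and sequence information before transmitting, as the preceding paragraph notes.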

In summary, advantages of the present invention include providing conference bridge processing with fewer resources, and with less bandwidth and processing, than is generally required of the mixing devices in other conference bridges. The conference bridge system and method of the present invention multicast in a way that relieves the mixing device of replication work. For a conference call with N participants and c active speakers, the audio source needs to generate only c+1 mixed audio streams (one fully mixed audio stream and c partially mixed audio streams). The work is distributed to a multicaster in the switch, which performs the replication and multicasts the mixed audio streams. A further advantage is that the conference bridge according to the present invention can scale to accommodate large numbers of participants. For example, with N=1000 participants and c=3 active speakers, the audio source needs to generate only c+1=4 mixed audio streams. The packets in the multicast audio streams are processed in real time at the NIC to determine the appropriate packets to output to the participants in the conference call. In one example, internal egress packets having a header and a payload are used in the conference bridge to further reduce the processing work at the audio source that mixes audio for the conference call.

In addition, as the use of audio networking grows and the number of users and applications rises, the need for multiple audio streams, even within a single telephone call, is increasing. The inventors have recognized that multiple audio streams need to be switched dynamically, without causing RTP errors, in calls made in audio networking environments such as VoIP networks. Such RTP errors can cause unwanted noise such as clicks and pops.

The present invention provides a method and system for noiseless switching between independent audio streams. This noiseless switching preserves valid RTP information at switchover. For an established VoIP call, the present invention can switch noiselessly from one audio source to another. The switching system is dynamic and can scale to handle a large number of calls.

In embodiments of the present invention, a switch is used to send audio data from multiple audio sources to a network interface controller. The switch may be a cell switch or a packet switch. The audio sources may be internal audio sources and/or external audio sources. The network interface controller (NIC) can be any interface to an IP network and includes one or more packet processors. An outgoing audio controller controls the operation of the internal audio sources, the switch, and the network interface controller to perform noiseless switching in accordance with the present invention.

In one aspect of the invention, the network interface controller uses priority information to determine which of the audio streams from internal or external audio sources to transmit in an established VoIP telephone call. Consider the case of two internal audio sources. Each of these audio sources generates its own audio stream of internal outgoing packets for a single destination outgoing audio channel. In one embodiment, each internal outgoing packet includes a payload carrying audio, together with control header information. The control header information includes priority information. The network interface controller then uses this priority information to determine which audio stream is delivered, because only one RTP stream can be output for each VoIP call at any given time.
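A minimal sketch of such priority-based selection follows. The field names, the convention that a larger number means higher priority, and the rule that the current owner of a channel keeps it until a higher-priority source appears are all assumptions made for illustration; the specification only states that priority information in the control header decides which stream is forwarded.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class InternalOutgoingPacket:
    channel: int    # destination outgoing audio channel (one per VoIP call)
    source: str     # hypothetical label for the originating audio source
    priority: int   # from the control header; larger wins in this sketch
    payload: bytes


class StreamSelector:
    """Tracks, per outgoing channel, which source currently owns the RTP stream."""

    def __init__(self) -> None:
        self._owner: Dict[int, Tuple[str, int]] = {}

    def accept(self, pkt: InternalOutgoingPacket) -> bool:
        """Return True if the packet should be forwarded, False if discarded."""
        owner = self._owner.get(pkt.channel)
        if owner is None or pkt.source == owner[0] or pkt.priority > owner[1]:
            self._owner[pkt.channel] = (pkt.source, pkt.priority)
            return True
        return False


selector = StreamSelector()
music = InternalOutgoingPacket(channel=7, source="dsp-music", priority=1, payload=b"\x00" * 160)
alert = InternalOutgoingPacket(channel=7, source="dsp-alert", priority=5, payload=b"\x01" * 160)
decisions = [selector.accept(p) for p in (music, alert, music, alert)]
```

This sketch never hands the channel back to a lower-priority source once the higher-priority one stops; a fuller version would need some release or timeout rule, which this section does not describe.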

In one aspect of the invention, an internal outgoing packet is smaller than an IP packet and consists only of a payload and control header information. In this way, the processing required to build a complete IP packet need not be performed by an internal audio source such as a DSP; it is instead distributed to the packet processor in the network interface controller.
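The division of labor can be sketched as follows. The 4-byte control header layout (channel, priority, one reserved byte) is purely hypothetical, since the specification does not give a layout here. The 12-byte fixed RTP header follows RFC 3550, and payload type 0 is G.711 mu-law (PCMU) per RFC 3551. IP and UDP headers, which the NIC would also add, are only counted, not built.

```python
import struct

RTP_VERSION = 2
IP_UDP_HEADER_BYTES = 20 + 8  # IPv4 without options, plus UDP


def build_internal_packet(channel: int, priority: int, audio: bytes) -> bytes:
    """What the audio source emits: a small hypothetical control header + audio."""
    return struct.pack("!HBB", channel, priority, 0) + audio


def build_rtp_packet(audio: bytes, payload_type: int, seq: int,
                     timestamp: int, ssrc: int, marker: bool = False) -> bytes:
    """What the NIC's packet processor adds: the fixed RTP header (RFC 3550)."""
    byte0 = RTP_VERSION << 6  # no padding, no extension, zero CSRC entries
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + audio


audio = bytes(160)  # 20 ms of G.711 at 8000 samples/s (assumed packetization)
internal = build_internal_packet(channel=7, priority=1, audio=audio)
rtp = build_rtp_packet(audio, payload_type=0, seq=1, timestamp=160, ssrc=0x1234ABCD)
on_the_wire = IP_UDP_HEADER_BYTES + len(rtp)
```

Under these assumptions the audio source emits 164 bytes per packet, while the finished IP packet is 200 bytes; the header construction in between happens at the NIC.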

According to a further feature, the switch is a cell switch, namely a fully meshed cell switch such as an ATM cell switch with a large amount of available bandwidth. Internal outgoing packets for the different audio streams are converted into cells. The cell switch merges the cells from the different sources and delivers them over an SVC to the NIC. The SVC is associated with a single outgoing audio channel serving the established telephone call.

In one embodiment, an outgoing audio controller is used to control the noiseless switching of audio in a VoIP telephone call. This noiseless switching according to the present invention is also referred to herein as "noiseless switchover". In one embodiment, noiseless switchover to additional audio is performed only for calls that are entitled to this service. In this way, an additional fee can be charged for providing the noiseless switchover service. In other embodiments, noiseless switchover is performed for any call.

Any call event that involves additional audio triggers a noiseless switchover. The noiseless switchover is performed using the noiseless switching system and method of the present invention. Examples of call events include, but are not limited to, an emergency, a call signaling condition, a call event based on caller or callee information, or a request for different audio information. A request for audio information can be any audio request, such as a request for advertising, news, sports, financial information, music, or other audio content.

An audio source can generate any type of audio. For example, an audio stream of outgoing packets may include audio payloads representing voice, music, tones, and/or any other sound.

The outgoing audio controller may be a standalone device or part of a call control and audio feature manager within an audio processing platform. The present invention may be implemented in a media server, audio processor, router, packet switch, or audio processing platform.

Another embodiment involves switching audio streams that include an audio stream from an external audio source. In this case, the NIC receives IP packets containing the audio stream and converts them into internal outgoing packets. From that point on, the internal outgoing packets are handled as if they had been generated by an internal audio source. The internal outgoing packets may include priority information. They may be sent, as packets or as cells, over an SVC through the switch to the NIC. When the external audio stream has a relatively high priority and a switchover is to take place, the packet processor in the NIC generates IP packets with synchronized header information (for example, RTP information) and transmits these IP packets to the destination device.

In one embodiment, the noiseless switchover system according to the present invention switches audio streams that come only from internal audio sources such as DSPs. In another embodiment, the noiseless switchover system switches audio streams from both internal and external audio sources. In yet another embodiment, the noiseless switchover system switches audio streams that come only from external audio sources; in this case the switchover system acts as a general-purpose switch for audio streams and does not require an internal DSP.

Further embodiments, features, and advantages of the present invention, as well as the structure and operation of the various embodiments of the present invention, are described in detail below with reference to the accompanying drawings.

The accompanying drawings, which are incorporated in and form a part of this specification, illustrate the present invention and, together with the description, serve to explain the principles of the invention and to enable a person skilled in the art to make and use the invention.

The present invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. In addition, the leftmost digit(s) of a reference number identify the drawing in which the reference number first appears.

I. Overview and Description

The present invention provides a method and system for distributed conference bridge processing in VoIP telephony. The work of a mixing device such as a DSP is distributed and thereby relieved. Specifically, the distributed conference bridge according to the present invention uses internal multicasting and packet processing at the network interface to reduce the work at the audio mixing device. A conference call agent is used to set up and tear down conference calls. An audio source, such as a DSP, mixes the audio of the active conference call participants. Only one fully mixed audio stream and a set of partially mixed audio streams need to be generated. A switch is connected between the audio source that mixes the audio content and the network interface controller. The switch includes a multicaster. The multicaster replicates the packets in the single fully mixed audio stream and in the set of partially mixed audio streams, and multicasts the replicated packets onto the links (for example, SVCs) associated with each call participant. The network interface controller processes each packet to determine whether a packet of a fully mixed or partially mixed audio stream should be forwarded to a participant or discarded. This determination can be made in real time based on a lookup table at the NIC and the packet header information in the multicast audio streams.
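The forward-or-discard decision at the NIC might look like the following sketch. The table layout, the idea that each packet header carries an identifier of the mix it belongs to, and all names are assumptions for illustration; the addresses are taken from the 192.0.2.0/24 documentation range.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Participant:
    ip: str             # participant's IP address from the lookup table
    udp_port: int       # participant's UDP port from the lookup table
    wanted_stream: str  # which mix this participant should hear


def handle_multicast_packet(table: Dict[int, Participant], svc_id: int,
                            stream_id: str, audio: bytes
                            ) -> Optional[Tuple[str, int, bytes]]:
    """Forward the packet if it belongs to the mix the participant on this
    SVC should hear; otherwise discard it by returning None."""
    entry = table.get(svc_id)
    if entry is None or entry.wanted_stream != stream_id:
        return None
    return (entry.ip, entry.udp_port, audio)


# One SVC per participant: alice is an active speaker, dave only listens.
lookup = {
    1: Participant("192.0.2.10", 40000, "minus:alice"),
    2: Participant("192.0.2.20", 40002, "full"),
}
kept = handle_multicast_packet(lookup, 2, "full", b"\x55" * 160)
dropped = handle_multicast_packet(lookup, 1, "full", b"\x55" * 160)
```

Because every mix is multicast onto every SVC of the call, each participant's entry sees all c+1 streams and keeps exactly one; the cost is a dictionary lookup and a comparison per packet.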

In one embodiment, the conference bridge according to the present invention is implemented in a media server. According to embodiments of the present invention, the media server may include a call control and audio feature manager that manages the operation of the conference bridge.

The present invention is described in terms of a typical VoIP environment. Description in these terms is provided for convenience only. It is not intended that the invention be limited to application in these typical environments. In fact, after reading the following description, it will be apparent to a person skilled in the art how to implement the invention in other environments now known or developed in the future.

II. Terminology

To describe the present invention more clearly, an effort has been made to adhere to the following term definitions as consistently as possible throughout the specification.

According to the present invention, the term "noiseless" means that packet sequence information is preserved when switching between independent audio streams. The term "synchronized header information" means that packet sequence information is preserved in packets that have headers. Packet sequence information can include, but is not limited to, valid RTP information.
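One way to picture "preserved packet sequence information" is a per-call counter at the NIC that does not care which source supplied the payload. The sketch below is only an illustration under that assumption; the class name, the 160-sample step (20 ms of 8 kHz audio), and the source labels are hypothetical. The 16-bit sequence number and 32-bit timestamp widths are those of the RTP header in RFC 3550.

```python
from typing import List, Tuple


class RtpContinuity:
    """Per-call RTP sequence/timestamp state kept independently of the source."""

    def __init__(self, first_seq: int = 0, first_timestamp: int = 0,
                 samples_per_packet: int = 160) -> None:
        self._seq = first_seq & 0xFFFF
        self._ts = first_timestamp & 0xFFFFFFFF
        self._step = samples_per_packet

    def next_header(self) -> Tuple[int, int]:
        """Return (sequence number, timestamp) for the next outgoing packet."""
        seq, ts = self._seq, self._ts
        self._seq = (self._seq + 1) & 0xFFFF          # 16-bit wraparound
        self._ts = (self._ts + self._step) & 0xFFFFFFFF  # 32-bit wraparound
        return seq, ts


call = RtpContinuity(first_seq=65534)
sent: List[Tuple[str, int, int]] = []
# Two packets from one source, then a switchover to another source.
for source in ("dsp-music", "dsp-music", "dsp-alert", "dsp-alert"):
    seq, ts = call.next_header()
    sent.append((source, seq, ts))
```

The receiver sees one unbroken run of sequence numbers and evenly spaced timestamps across the switchover, which is what keeps it from treating the change of source as loss or reordering.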

The term "digital signal processor" (DSP) includes, but is not limited to, a device used to code or decode digitized voice samples according to a program or application service.

The term "digitized voice" or "voice" includes, but is not limited to, audio byte samples produced in pulse code modulation (PCM) format by a standard telephone circuit compressor/decompressor (CODEC).

The term "packet processor" refers to any type of packet processor that generates packets for a packet-switched network. In one example, a packet processor is a specialized microprocessor designed to examine and modify Ethernet packets according to a program or application service.

The term "packetized voice" refers to digitized voice samples carried within packets.

The term "Real-time Transport Protocol (RTP) audio stream" refers to the sequence of RTP packets associated with one channel of packetized voice.

The term "switched virtual circuit" (SVC) refers to a temporary virtual circuit that is set up and used only while data is being transmitted. Once the communication between the two hosts ends, the SVC disappears. In contrast, a permanent virtual circuit (PVC) remains available at all times.

III. Audio Networking Environment

The present invention can be used in any audio networking environment. Such audio networking environments can include, but are not limited to, wide area network (WAN) and/or local area network (LAN) environments. In typical embodiments, the present invention is incorporated within an audio networking environment as a standalone device or as part of a media server, packet router, packet switch, or other network component. For brevity, the present invention is described with respect to embodiments incorporated in a media server.

A media server delivers audio over network links, across one or more circuit-switched and/or packet-switched networks, to local or remote clients. A client can be any type of device that handles audio, including, but not limited to, a telephone, cellular phone, personal computer, personal data assistant (PDA), set-top box, console, or audio player. FIG. 1 is a diagram of a media server 140 in a typical VoIP environment according to the present invention. This example includes a telephone client 105, a public-switched telephone network (PSTN) 110, a softswitch 120, a gateway 130, the media server 140, packet-switched network(s) 150, and a computer client 155. Telephone client 105 can be any type of telephone (wired or wireless) capable of sending and receiving audio over PSTN 110. PSTN 110 is any type of circuit-switched network(s). Computer client 155 can be a personal computer.

Telephone client 105 is connected to media server 140 through PSTN 110, gateway 130, and network 150. In this example, call signaling and control are separated from the media paths or links that carry the audio. Softswitch 120 is provided between PSTN 110 and media server 140. Softswitch 120 supports call signaling and control to establish and tear down voice calls between telephone client 105 and media server 140. In one example, softswitch 120 follows the Session Initiation Protocol (SIP). Gateway 130 is responsible for converting audio and passing it between PSTN 110 and network 150. This can include a variety of well-known functions, such as translating a circuit-switched telephone number into an Internet Protocol (IP) address and vice versa.

Computer client 155 is connected to media server 140 over network 150. A media gateway controller (not shown) can also use SIP to support call signaling and control in order to establish and tear down links, such as voice calls, between computer client 155 and media server 140. An application server (not shown) can also be connected to media server 140 to support VoIP services and applications.

The present invention is described in terms of these typical embodiments. Description in these terms is provided for convenience only. It is not intended that the invention be limited to application in these typical environments, in which a media server, router, switch, network component, or standalone device is included within a network. In fact, after reading the following description, it will be apparent to a person skilled in the art how to implement the invention in other environments now known or developed in the future.

IV. Media Server, Services, and Resources

FIG. 2 is a diagram of a typical media platform 200 according to one embodiment of the present invention. Media platform 200 provides scalable VoIP telephony. Media platform 200 includes a media server 202, which is connected to resource(s) 210, media service(s) 212, and interface(s) 208. Media server 202 includes one or more applications 210, a resource manager 220, and an audio processing platform 230. Media server 202 provides resources 210 and services 212. Resources 210 include, but are not limited to, modules 211a-f as shown in FIG. 2. Resource modules 211a-f include conventional resources such as a play announcement/collect digits IVR resource 211a, a tone/digit voice scanning resource 211b, a transcoding resource 211c, an audio record/play resource 211d, a text-to-speech resource 211e, and a speech recognition resource 211f. Media services 212 include, but are not limited to, modules 213a-e as shown in FIG. 2. Media service modules 213a-e include conventional services such as telebrowsing 213a, a voice mail service 213b, a conference bridge service 213c, video streaming 213d, and a VoIP gateway 213e.

Media server 202 includes an application central processing unit (CPU) 210, a resource manager CPU 220, and an audio processing platform 230. Application CPU 210 is any processor that supports and executes program interfaces for applications and applets. Application CPU 210 enables media platform 200 to provide one or more of the media services 212. Resource manager CPU 220 is any processor that controls the connections between resources 210 and application CPU 210 and/or audio processing platform 230. Audio processing platform 230 provides communication connectivity with one or more of the network interfaces 208. Media platform 200 receives and transmits information over network interfaces 208 through audio processing platform 230. Interfaces 208 can include, but are not limited to, Asynchronous Transfer Mode (ATM) 209a, LAN Ethernet 209b, digital subscriber line (DSL) 209c, cable modem 209d, and channelized T1-T3 lines 209e.

V. Audio Processing Platform with a Packet/Cell Switch for Noiseless Switching of Independent Audio Streams

In one embodiment of the present invention, audio processing platform 230 includes a dynamic fully-meshed cell switch 304 and other components for receiving and processing packets, such as IP packets. Audio processing platform 230 is shown in FIG. 3A with respect to audio processing, including noiseless switching according to the present invention.

As shown, audio processing platform 230 includes a call control and audio feature manager 302, a cell switch 304 (also referred to as a packet/cell switch to indicate that cell switch 304 can be a cell switch or a packet switch), network connections 305, a network interface controller 306, and audio channel processors 308. Network interface controller 306 further includes packet processors 307. Call control and audio feature manager 302 is connected to cell switch 304, network interface controller 306, and audio channel processors 308. In one configuration, call control and audio feature manager 302 is connected directly to network interface controller 306. Network interface controller 306 then controls the operation of packet processors 307 based on the control commands sent by call control and audio feature manager 302.

In one embodiment, call control and audio feature manager 302 controls cell switch 304, network interface controller 306 (including packet processors 307), and audio channel processors 308 to provide noiseless switching of independent audio streams according to the present invention. This noiseless switching is described further below with reference to FIGS. 6 to 9. An embodiment of call control and audio feature manager 302 according to the present invention is described further below with reference to FIG. 3B.

Network connections 305 are connected to packet processors 307. Packet processors 307 are also connected to cell switch 304. Cell switch 304 is in turn connected to audio channel processors 308. In one embodiment, audio channel processors 308 include four channels capable of handling four calls; that is, there are four audio processing sections. In alternative embodiments, there are more or fewer audio channel processors 308.

Data packets, such as IP packets that include a payload of audio data, arrive at network connections 305. In one embodiment, packet processors 307 comprise one or more, or eight, 100Base-TX full-duplex Ethernet links that support high-speed network traffic in the range of 300,000 packets per second per link. In another embodiment, packet processors 307 support 1,000 G.711 voice ports per link and/or 8,000 G.711 voice channels per system.
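A back-of-envelope check shows how the per-link figures above fit together. The 20 ms packetization interval is an assumption (the specification does not state one); the header sizes are the standard ones for RTP, UDP, IPv4 without options, and Ethernet including preamble and inter-frame gap.

```python
# G.711 produces 8000 one-byte samples per second (64 kbit/s).
PACKET_INTERVAL_MS = 20                          # assumed packetization interval
PAYLOAD_BYTES = 8000 * PACKET_INTERVAL_MS // 1000  # 160 bytes of audio per packet
PACKETS_PER_SECOND = 1000 // PACKET_INTERVAL_MS    # 50 packets/s per direction

RTP_UDP_IP_BYTES = 12 + 8 + 20    # RTP + UDP + IPv4 headers
ETHERNET_BYTES = 14 + 4 + 8 + 12  # MAC header + FCS + preamble + inter-frame gap

PORTS_PER_LINK = 1000
LINKS_PER_SYSTEM = 8

wire_bytes = PAYLOAD_BYTES + RTP_UDP_IP_BYTES + ETHERNET_BYTES
bits_per_port = wire_bytes * 8 * PACKETS_PER_SECOND   # one direction
link_mbps = bits_per_port * PORTS_PER_LINK / 1e6
link_pps = PACKETS_PER_SECOND * PORTS_PER_LINK        # one direction
system_channels = PORTS_PER_LINK * LINKS_PER_SYSTEM
```

Under these assumptions, 1,000 G.711 ports occupy about 95.2 Mbit/s in each direction, just inside a 100Base-TX link, and generate 50,000 packets per second per direction, comfortably within the stated 300,000 packets per second per link. Eight such links give the 8,000 channels per system.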

In yet another embodiment, packet processors 307 recognize the IP header of each packet and handle all RTP routing decisions with minimal packet delay or jitter.

In one embodiment of the present invention, packet/cell switch 304 is a non-blocking switch with a total bandwidth of 2.5 Gbps. In another embodiment, packet/cell switch 304 has a total bandwidth of 5 Gbps.

In one embodiment, audio channel processor 308 includes any audio source, such as the digital signal processors described in more detail below with reference to FIG. 4. Audio channel processor 308 can perform audio-related services, including one or more of services 211a-f.

VI. Typical Audio Processing Platform Implementation

FIG. 4 shows one typical implementation, which is illustrative and not intended to limit the present invention. As shown in FIG. 4, audio processing platform 230 can be a shelf controller card (SCC). System 400 is one implementation of such an SCC. System 400 includes cell switch 304, call control and audio feature manager 302, network interface controller 306, interface circuitry 410, and audio channel processors 308a-d.

In more detail, system 400 receives packets at network connections 424 and 426. Network connections 424 and 426 are connected to network interface controller 306. Network interface controller 306 includes packet processors 307a-b. Packet processors 307a-b include controllers 420, 422, forwarding tables 412, 416, and forwarding processors (EPIF) 414, 418. As shown in FIG. 4, packet processor 307a is connected to network connection 424. Network connection 424 is connected to controller 420. Controller 420 is connected to forwarding table 412 and EPIF 414. Packet processor 307b is connected to network connection 426. Network connection 426 is connected to controller 422. Controller 422 is connected to forwarding table 416 and EPIF 418.

In one embodiment, packet processors 307 can be implemented on one or more LAN daughtercard modules. In another embodiment, each network connection 424, 426 can be a 100Base-TX or 1000Base-T link.

IP packets received by packet processors 307 are processed into internal packets. When a cell layer is used, the internal packets are converted to cells (for example, ATM cells produced by a conventional segmentation and reassembly (SAR) module). Packet processors 307 forward the cells to cell switch 304. Packet processors 307 are coupled to cell switch 304 via cell buses 428, 430, 432, 434. Cell switch 304 forwards the cells to interface circuit 410 via cell buses 454, 456, 458, 460. Cell switch 304 examines each cell and forwards it to the appropriate one of cell buses 454, 456, 458, 460 based on the audio channel for which the cell is destined. Cell switch 304 is a dynamic, fully meshed switch.

In one embodiment, interface circuit 410 is a backplane connector.

The resources and services available for processing and switching packets and cells in system 400 are provided by call control and audio feature manager 302. Call control and audio feature manager 302 is coupled to cell switch 304 via a processor interface (PIF) 436, a SAR, and a local bus 437. Local bus 437 is also coupled to a buffer 438. Buffer 438 stores and queues instructions passing between call control and audio feature manager 302 and cell switch 304.

Call control and audio feature manager 302 is also coupled to a memory module 442 and a configuration module 440 via bus connection 444. In one embodiment, configuration module 440 provides control logic for the boot-up, initial diagnostics, and operational parameters of call control and audio feature manager 302. In one embodiment, memory module 442 comprises dual in-line memory modules (DIMMs) for the RAM operations of call control and audio feature manager 302.

Call control and audio feature manager 302 is also coupled to interface circuit 410. Network circuit 408 connects resource manager CPU 220 and/or application CPU 210 to interface circuit 410. In one embodiment, call control and audio feature manager 302 monitors the status of interface circuit 410 and of the additional components coupled to it. In another embodiment, call control and audio feature manager 302 controls the operation of the components coupled to interface circuit 410 in order to provide the resources 210 and services 212 of media platform 200.

A console port 470 is also coupled to call control and audio feature manager 302. Console port 470 provides direct access to the operations of call control and audio feature manager 302. For example, console port 470 can be used to administer the operations of call control and audio feature manager 302, and thus of system 400, to reboot its media processors, or otherwise to affect its performance.

Reference clock 460 is coupled to interface circuit 410 and to other components of system 400 to provide a consistent means of time-stamping the packets, cells, and instructions of system 400.

Interface circuit 410 is coupled to each of audio channel processors 308a-308d. Each processor 308 comprises a PIF 476, a group 478 of one or more card processors (also referred to as "bank" processors), and a group 480 of one or more digital signal processors (DSPs) and SDRAM buffers. In one embodiment, there are four card processors in group 478 and 32 DSPs in group 480. In such an embodiment, each card processor of group 478 accesses and operates with eight DSPs of group 480.
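The 4-by-8 arrangement in this embodiment can be illustrated with a short sketch. The text states only the counts (four card processors, 32 DSPs, eight DSPs per card processor); the contiguous block assignment and the function names below are assumptions made for illustration.

```python
CARD_PROCESSORS = 4           # card processors in group 478 (from the text)
DSPS_PER_CARD_PROCESSOR = 8   # DSPs served by each card processor (from the text)
TOTAL_DSPS = CARD_PROCESSORS * DSPS_PER_CARD_PROCESSOR  # 32 DSPs in group 480


def card_processor_for_dsp(dsp_index: int) -> int:
    """Return the card processor (0-3) assumed to manage DSP dsp_index (0-31)."""
    if not 0 <= dsp_index < TOTAL_DSPS:
        raise ValueError(f"DSP index out of range: {dsp_index}")
    # Assumption: DSPs are assigned to card processors in contiguous blocks of 8.
    return dsp_index // DSPS_PER_CARD_PROCESSOR


def dsps_for_card_processor(card_index: int) -> range:
    """Return the DSP indices assumed to belong to one card processor."""
    if not 0 <= card_index < CARD_PROCESSORS:
        raise ValueError(f"card processor index out of range: {card_index}")
    start = card_index * DSPS_PER_CARD_PROCESSOR
    return range(start, start + DSPS_PER_CARD_PROCESSOR)
```

Any other assignment that gives each card processor exactly eight DSPs would satisfy the description equally well.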

VII. Call Control and Audio Feature Manager

FIG. 3B is a block diagram of call control and audio feature manager 302 according to one embodiment of the present invention. Call control and audio feature manager 302 is described functionally as processor 302. Processor 302 comprises a call signaling manager 352, a system manager 354, a connection manager 356, and a feature controller 358.

Call signaling manager 352 manages call signaling operations such as call establishment and teardown, interfacing with a softswitch, and handling signaling protocols such as SIP.

System manager 354 performs bootstrap and diagnostic operations on the components of system 230. System manager 354 also monitors system 230 and controls various hot-swapping and redundancy operations.

Connection manager 356 manages EPIF forwarding tables, such as tables 412 and 416, and provides routing protocols (for example, Routing Information Protocol (RIP), Open Shortest Path First (OSPF), and the like). Connection manager 356 also establishes internal ATM permanent virtual circuits (PVCs) and/or SVCs. In one embodiment, connection manager 356 establishes bidirectional connections between network connections, such as network connections 424 and 426, and DSP channels, such as DSPs 480a-d, so that data flows can be sourced or processed by a DSP or another type of channel processor.

In another embodiment, connection manager 356 compiles a summary of the details of the EPIF and ATM hardware. Call signaling manager 352 and resource manager CPU 220 can access these details so that their operations are based on the proper service set and performance parameters.

Feature controller 358 provides communication interfaces and protocols such as H.323 and MGCP (Media Gateway Control Protocol).

In one embodiment, card processors 478a-d function as controllers with local managers for handling instructions from call control and audio feature manager 302 and any of its modules: call signaling manager 352, system manager 354, connection manager 356, and feature controller 358. Card processors 478a-d in turn manage the DSP banks, the network interfaces, and media streams such as audio streams.

In one embodiment, DSPs 480a-d provide the resources 210 and services 212 of media platform 200.

In one embodiment, call control and audio feature manager 302 of the present invention exercises control over the EPIF of the present invention through the use of applets. In such an embodiment, commands for configuring parameters (for example, port MAC address, port IP address, and the like), managing lookup tables, uploading statistics, and so forth are issued indirectly through the applets.

The EPIF provides a search engine that handles the functions related to creating, deleting, and looking up entries. Because media platform 200 operates on the source and destination of packets, the EPIF provides search functions for both sources and destinations. The sources and destinations of packets are stored in lookup tables for incoming and outgoing addresses. The EPIF also manages RTP header information and evaluates the relative priorities of the outgoing audio streams to be transmitted, as described in more detail below.
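The create, delete, and look-up functions of such a lookup table can be sketched as follows. This is a minimal illustration only: the class name, the keying by an (IP address, UDP port) pair, and the integer SVC identifier are assumptions, and the text does not describe how the EPIF stores its entries internally.

```python
from typing import Dict, Optional, Tuple

FlowKey = Tuple[str, int]  # assumed key: (IP address, UDP port)


class LookupTable:
    """Hypothetical address table mapping a flow key to an SVC identifier."""

    def __init__(self) -> None:
        self._entries: Dict[FlowKey, int] = {}

    def create(self, ip: str, udp_port: int, svc_id: int) -> None:
        """Add or overwrite the entry for one flow."""
        self._entries[(ip, udp_port)] = svc_id

    def delete(self, ip: str, udp_port: int) -> bool:
        """Remove an entry; return True if it existed."""
        return self._entries.pop((ip, udp_port), None) is not None

    def lookup(self, ip: str, udp_port: int) -> Optional[int]:
        """Return the SVC identifier for a flow, or None if unknown."""
        return self._entries.get((ip, udp_port))
```

In this sketch one table instance would serve incoming addresses and a second instance would serve outgoing addresses.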

VIII. Audio Processing Platform Operation

The operation of audio processing platform 230 is illustrated in the flow diagrams of FIGS. 5A and 5B. FIG. 5A is a flow diagram of call establishment and incoming packet processing according to one embodiment of the present invention. FIG. 5B is a flow diagram of outgoing packet processing and call completion according to one embodiment of the present invention.

A. Incoming Audio Streams

In FIG. 5A, the process for an incoming (also called inbound) audio stream starts at step 502 and immediately proceeds to step 504.

In step 504, call control and audio feature manager 302 establishes a call with a client communicating via network connections 305. In one embodiment, call control and audio feature manager 302 negotiates and authorizes access for the client. Once client access is authorized, call control and audio feature manager 302 provides the client with the IP and UDP address information for the call. Once the call is established, the process immediately proceeds to step 506.

In step 506, packet processors 307 receive IP packets carrying audio via network connections 305. Any type of packet can be used, including but not limited to IP packets, AppleTalk packets, IPX packets, or other types of Ethernet packets. Once a packet is received, the process proceeds to step 508.

In step 508, packet processors 307 check the IP and UDP header addresses against a lookup table to find the associated SVC, and then convert the VoIP packets into internal packets. Such an internal packet can be made up, for example, of a payload and a control header, as described further below with respect to FIG. 7B. Packet processors 307 then build the packets using at least some of the data and routing information and assign a switched virtual circuit (SVC). The SVC is associated with one of audio channel processors 308, and in particular with the respective DSP that will process the audio payload.

When a cell layer is used, the internal packets are further converted or merged into cells, such as ATM cells. In this way, the audio payload of the internal packets is converted to audio payload in a stream of one or more ATM cells. A conventional segmentation and reassembly (SAR) module can be used to convert the internal packets to ATM cells. Once the packets are converted into cells, the process proceeds to step 510.
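The segmentation and reassembly step can be sketched as below. The 48-byte cell payload is the standard ATM value; everything else is a simplifying assumption. In particular, the two-byte length prefix is a hypothetical stand-in for a real ATM adaptation layer trailer, and cell headers are omitted entirely.

```python
from typing import List

CELL_PAYLOAD_SIZE = 48   # payload bytes in a standard 53-byte ATM cell
LENGTH_PREFIX_SIZE = 2   # hypothetical framing, not a real AAL trailer


def segment(packet: bytes) -> List[bytes]:
    """Split one internal packet into fixed-size cell payloads."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for the 2-byte length prefix")
    framed = len(packet).to_bytes(LENGTH_PREFIX_SIZE, "big") + packet
    # Pad with zero bytes so the last cell is completely filled.
    framed += bytes((-len(framed)) % CELL_PAYLOAD_SIZE)
    return [framed[i:i + CELL_PAYLOAD_SIZE]
            for i in range(0, len(framed), CELL_PAYLOAD_SIZE)]


def reassemble(cells: List[bytes]) -> bytes:
    """Rebuild the original packet from its cell payloads, dropping padding."""
    framed = b"".join(cells)
    length = int.from_bytes(framed[:LENGTH_PREFIX_SIZE], "big")
    return framed[LENGTH_PREFIX_SIZE:LENGTH_PREFIX_SIZE + length]
```

The same pair of functions covers both directions described in this section: segmentation before the cell switch and reassembly after it.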

In step 510, cell switch 304 switches the cells to the proper audio channel of audio channel processors 308 based on the SVC. The process proceeds to step 512.

In step 512, audio channel processor 308 converts the cells to packets. The audio payload in the ATM cells arriving for each channel is converted to audio payload in a stream of one or more packets. A conventional SAR module can be used to convert the ATM cells to packets. The packets can be internal outgoing packets carrying an audio payload, or IP packets. Once the cells are converted into internal packets, the process proceeds to step 514.

In step 514, audio channel processor 308 processes the audio data of the packets in the respective audio channel. In one embodiment, the audio channel is related to one or more of media services 213a-e. For example, these media services can be telebrowsing, voice mail, conference bridging (also called conference calling), video streaming, VoIP gateway services, telephony, or a media service for any other audio content.

B. Outgoing Audio Streams

In FIG. 5B, the process for an outgoing (also called outbound) audio stream starts at step 522 and immediately proceeds to step 524.

In step 524, call control and audio feature manager 302 identifies an audio source for noiseless switchover. This audio source can be associated with an established call or with another media service. Once the audio source is identified, the process immediately proceeds to step 526.

In step 526, the audio source creates packets. In one embodiment, a DSP in audio channel processor 308 is the audio source. Audio data can be stored in an SDRAM associated with the DSP. This audio data is then packetized by the DSP. Any type of packet can be used, including but not limited to internal packets or IP packets such as Ethernet packets. In one preferred embodiment, the packets are internal outgoing packets generated as described with respect to FIG. 7B.

In step 528, audio channel processor 308 converts the packets into cells, such as ATM cells. The audio payload in the packets is converted to audio payload in a stream of one or more ATM cells. In brief, the packets are parsed and the data and routing information are analyzed. Audio channel processor 308 then builds cells using at least some of the data and routing information and assigns a switched virtual circuit (SVC). A conventional SAR module can be used to convert the packets to ATM cells. The SVC is associated with one of audio channel processors 308, and in particular with a circuit connecting the respective DSP of the audio source to a destination port 305 of NIC 306. Once the packets are converted into cells, the process proceeds to step 530.

In step 530, cell switch 304 switches the cells of an audio channel of audio channel processor 308 to a destination network connection 305 based on the SVC. The process proceeds to step 532.

In step 532, packet processor 307 converts the cells to IP packets. The audio payload in the ATM cells arriving for each channel is converted to audio payload in a stream of one or more internal packets. A conventional SAR module can be used to convert the ATM cells to internal packets. Any type of packet can be used, including but not limited to IP packets such as Ethernet packets. Once the cells are converted into packets, the process proceeds to step 534.

In step 534, each packet processor 307 further adds RTP, IP, and UDP header information. A lookup table is checked to find the IP and UDP header address information associated with the SVC. The IP packets carrying audio are then sent through network connections 305 over a network to a destination device (a telephone, computer, palm device, PDA, or the like). Packet processors 307 process the audio data of the packets in the respective audio channel. In one embodiment, the audio channel is related to one or more of media services 213a-e. For example, these media services can be telebrowsing, voice mail, conference bridging (also called conference calling), video streaming, VoIP gateway services, telephony, or a media service for any other audio content.

IX. Noiseless Switching of Outgoing Audio Streams

According to one aspect of the present invention, audio processing platform 230 noiselessly switches between independent outgoing audio streams. Audio processing platform 230 is illustrative. The noiseless switching of outgoing audio streams described here can be used in any media server, router, switch, or audio processor and is not intended to be limited to audio processing platform 230.

A. Cell Switch - Internal Audio Sources

FIG. 6A is a diagram of a noiseless switchover system that carries out cell switching of independent outgoing audio streams generated by internal audio sources according to one embodiment of the present invention. FIG. 6A shows one embodiment of a system 600A for switching outgoing audio streams from internal audio sources. System 600A includes the components of audio processing platform 230 configured for an outgoing audio stream switching mode of operation. Specifically, as shown in FIG. 6A, system 600A includes call control and audio feature manager 302 coupled to a number n of internal audio sources 604n, cell switch 304, and network interface controller 306. Internal audio sources 604a-604n can be two or more audio sources. Any type of audio source can be used, including but not limited to DSPs. In one example, DSPs 480 can be the audio sources. To generate audio, audio sources 604 can create audio internally and/or convert audio received from external sources.

Call control and audio feature manager 302 further includes an outgoing audio controller 610. Outgoing audio controller 610 is control logic that issues control signals to audio sources 604n, cell switch 304, and/or network interface controller 306 to carry out noiseless switching between independent outgoing audio streams according to the present invention. The control logic can be implemented in software, firmware, microcode, hardware, or any combination thereof.

A cell layer including SARs 630, 632, 634 is also provided. SARs 630, 632 are coupled between cell switch 304 and each audio source 604a-n. SAR 634 is coupled between cell switch 304 and NIC 306.

In one embodiment, the independent outgoing audio streams comprise streams of IP packets with RTP information and internal outgoing packets. It is therefore helpful to describe IP packets and internal outgoing packets first (FIGS. 7A-7B). System 600A and its operation are then described in detail with respect to independent outgoing audio streams (FIGS. 8-9).

B. Packets

In one embodiment, the present invention uses two types of packets: (1) IP packets with RTP information and (2) internal outgoing packets. Both types of packets are shown in, and described with reference to, the examples of FIGS. 7A and 7B. IP packets 700A are sent and received over an external packet-switched network by packet processors 307 in NIC 306. Internal outgoing packets 700B are generated by audio sources (for example, DSPs) 604a-604n.

1. IP Packets with RTP Information

A standard IP packet 700A is shown in FIG. 7A. IP packet 700A is shown with several components: a media access control (MAC) field 704, an IP field 706, a user datagram protocol (UDP) field 708, an RTP field 710, a payload 712 containing digital data, and a cyclic redundancy check (CRC) field 714. The Real-time Transport Protocol (RTP) is a standardized protocol for carrying periodic data, such as digitized audio, from a source device to a destination device. A companion protocol, the RTP Control Protocol (RTCP), can be used with RTP to provide information on the quality of a session.

More specifically, MAC field 704 and IP field 706 contain the addressing information that allows each packet to traverse an IP network interconnecting two devices (the origin and the destination).

UDP field 708 contains a 2-byte port number that identifies the RTP/audio stream channel number, so that a packet received from the network interface can be routed internally to its audio processor destination. In one embodiment of the present invention, the audio processor is a DSP as described herein.

RTP field 710 contains a packet sequence number and a timestamp. Payload 712 contains the digitized audio byte samples and can be decoded by the endpoint audio processor. As will be apparent to a person skilled in the art given this description, any payload type and encoding scheme can be used for audio and/or video media that is compatible with RTP. CRC field 714 provides a way to verify the integrity of the entire packet. See the description of RTP packets and payload types in D. Collins, Carrier Grade Voice over IP, pp. 52-72 (which is incorporated herein by reference in its entirety).
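For reference, the sequence number and timestamp sit in the fixed 12-byte RTP header defined by RFC 3550. The sketch below packs and unpacks that fixed header with Python's standard `struct` module. It assumes no padding, no header extension, and no CSRC entries; the `RtpHeader` type and function names are illustrative and are not taken from the patent.

```python
import struct
from typing import NamedTuple

# Fixed RTP header: V/P/X/CC byte, M/PT byte, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order (12 bytes).
RTP_HEADER = struct.Struct("!BBHII")


class RtpHeader(NamedTuple):
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    marker: bool = False


def pack_rtp_header(h: RtpHeader) -> bytes:
    byte0 = 2 << 6  # version 2; padding, extension, and CSRC count all zero
    byte1 = (int(h.marker) << 7) | (h.payload_type & 0x7F)
    return RTP_HEADER.pack(byte0, byte1, h.sequence_number & 0xFFFF,
                           h.timestamp & 0xFFFFFFFF, h.ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data: bytes) -> RtpHeader:
    byte0, byte1, seq, ts, ssrc = RTP_HEADER.unpack_from(data)
    if byte0 >> 6 != 2:
        raise ValueError("not an RTP version 2 header")
    return RtpHeader(byte1 & 0x7F, seq, ts, ssrc, bool(byte1 >> 7))
```

A receiver uses the sequence number to detect loss and reordering and the timestamp to schedule playout, which is why both must stay consistent when the stream content changes.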

2. Internal Outgoing Packets

FIG. 7B shows an example internal outgoing packet of the present invention in more detail. Packet 700B includes a control (CTRL) header 720 and a payload 722. The advantage of internal outgoing packet 700B is that it is simpler to create and smaller than IP packet 700A. This reduces the burden and work required of the audio sources and of the other components that handle internal outgoing packets.

In one embodiment, audio sources 604a-604n are DSPs. Each DSP adds a CTRL header 720 in front of the payload 722 that it creates for its respective audio stream. CTRL header 720 is then used to relay control information downstream. This control information can be, for example, priority information associated with a particular outgoing audio stream.

Packet 700B is converted to one or more cells, such as ATM cells, and sent internally through cell switch 304 to a packet processor 307 in network interface controller 306. After the cells are converted back to internal outgoing packets, packet processor 307 decodes and removes internal CTRL header 720. The rest of the IP packet information is added before payload 722 is transmitted onto an IP network as IP packet 700A. This is advantageous because the processing work at the DSPs is reduced: a DSP only needs to add a relatively short control header to the payload. The remaining processing work of adding the information needed to create valid IP packets with RTP header information can be distributed to packet processors 307.
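The division of work between the DSP and the packet processor can be sketched as follows. The layout of CTRL header 720 is not given in this section, so the 3-byte layout used here (a 1-byte priority and a 2-byte channel number) is purely hypothetical, as are the function names. The UDP, IP, and MAC headers and the CRC are left out to keep the sketch short.

```python
import struct
from typing import Tuple

CTRL = struct.Struct("!BH")    # hypothetical CTRL layout: priority, channel
RTP = struct.Struct("!BBHII")  # fixed 12-byte RTP header (RFC 3550)


def dsp_build_internal_packet(priority: int, channel: int, payload: bytes) -> bytes:
    """DSP side: prepend only the short control header to the audio payload."""
    return CTRL.pack(priority, channel) + payload


def nic_to_rtp(internal: bytes, seq: int, timestamp: int, ssrc: int,
               payload_type: int = 0) -> Tuple[int, int, bytes]:
    """Packet processor side: decode and strip CTRL, then add the RTP header.

    Returns (priority, channel, rtp_packet). The UDP, IP, and MAC headers that
    a real packet processor would add next are omitted from this sketch.
    """
    priority, channel = CTRL.unpack_from(internal)
    payload = internal[CTRL.size:]
    header = RTP.pack(0x80, payload_type & 0x7F, seq & 0xFFFF,
                      timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return priority, channel, header + payload
```

With this hypothetical layout the DSP adds 3 bytes per packet, while the 12-byte RTP header and all outer headers are produced once, at the network interface.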

C. Priority Levels

Network interface controller (NIC) 306 processes all internal outgoing packets as well as all outgoing IP packets destined for the external network. NIC 306 can therefore make the final forwarding decision about each packet sent to it based on the content of that packet. In some embodiments, NIC 306 manages the forwarding of outgoing IP packets based on priority information. This can include switching over to an audio stream of outgoing IP packets that has a higher priority, and buffering or not sending another audio stream of outgoing IP packets that has a lower priority.

In one embodiment, internal audio sources 604a-604n determine priority levels. Alternatively, NIC 306 can determine a priority for audio received at NIC 306 from an external source. Any number of priority levels can be used. The priority levels distinguish the relative priority of audio sources and their respective audio streams. Priority levels can be based on any criteria selected by a user, including but not limited to the time of day, the identity or group of the caller or callee, or other similar factors relevant to audio processing and media services. Components of system 600 can filter and forward the priority level information within the audio stream. In one embodiment, a resource manager in system 600 can interact with external systems to alter the priority levels of audio streams. For example, an external system can be an operator informing the system to queue a billing notice or an advertisement on a call. The resource manager can thus intervene in audio streams. This noiseless switchover can be triggered by a user, or automatically based on certain predefined events, such as signaling conditions like an off-hook condition, an emergency event, or a timed event.

D. Noiseless Fully Meshed Cell Switch

System 600A can be thought of as a "free pool" of multiple input (incoming) and output (outgoing) audio channels, because fully meshed packet/cell switch 304 is used to switch outgoing audio channels into any given call. Any outgoing audio channel can be called on to participate in a telephone call at any time. Both during initial call setup and while a call is established, any outgoing audio channel can be switched into or out of the call. The fully meshed switching capability of system 600A provides precise noiseless switching that does not drop or corrupt IP packets or cells. In addition, a two-stage outgoing switching technique is used.

E. Two-Stage Outgoing Switching

System 600A includes at least two stages of switching. For outgoing switching, the first stage is cell switch 304. The first stage is cell-based and uses switched virtual circuits (SVCs) to switch audio streams from separate physical sources (audio sources 604a-604n) to a single destination outgoing network interface controller (NIC 306). Priority information is provided in CTRL header 720 of the cells generated by the audio sources. The second stage is contained within outgoing NIC 306, which selects which of the audio streams from the multiple audio sources 604a-604n to process and send over a packet network, such as a packet-switched IP network. NIC 306 can make this selection based on the priority information provided in CTRL header 720. In this way, a second audio stream with a higher priority can be forwarded by NIC 306 on the same channel as a first audio stream. From the perspective of the destination device receiving the audio, the insertion of the second audio stream on the channel is perceived as a noiseless switchover between independent audio streams.

More specifically, in one embodiment, outgoing audio switching can occur during a telephone call. A call is first established using audio source 604a by negotiating with the destination device's MAC, IP, and UDP information, as described above. First audio source 604a begins generating a first audio stream during the call. The first audio stream is made up of internal outgoing packets having an audio payload and CTRL header 720 information, as described above with respect to packet format 700B. The internal outgoing packets are sent out on the channel established for the call. Any type of audio payload can be used, including voice, music, tones, or other audio data. SAR 630 converts the internal packets into cells for transmission through cell switch 304 to SAR 634. SAR 634 then converts the cells back into internal outgoing packets before they are delivered to NIC 306.
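The segmentation and reassembly performed by the SARs can be illustrated with a minimal sketch. It assumes 48-byte cell payloads (the payload size of an ATM cell), in-order arrival, and a hypothetical 2-byte length prefix standing in for a real adaptation-layer trailer. The names `segment`, `reassemble`, and `vc_id` are illustrative and do not come from the description.

```python
import struct
from typing import List, Tuple

CELL_PAYLOAD = 48  # payload bytes carried by one ATM cell


def segment(packet: bytes, vc_id: int) -> List[Tuple[int, bytes]]:
    """Split one internal packet into fixed-size cells on a virtual circuit."""
    # Hypothetical framing: a 2-byte length prefix lets the receiver drop padding.
    framed = struct.pack("!H", len(packet)) + packet
    framed += b"\x00" * (-len(framed) % CELL_PAYLOAD)  # pad to a whole cell
    return [(vc_id, framed[i:i + CELL_PAYLOAD])
            for i in range(0, len(framed), CELL_PAYLOAD)]


def reassemble(cells: List[Tuple[int, bytes]]) -> bytes:
    """Rebuild the internal packet from cells that arrived in order."""
    framed = b"".join(payload for _, payload in cells)
    (length,) = struct.unpack("!H", framed[:2])
    return framed[2:2 + length]
```

In this sketch, `segment` plays the role of SAR 630 or 632 and `reassemble` the role of SAR 634; the switch itself only needs `vc_id` to route each cell.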

During the flow from audio source 604a, NIC 306 decodes and removes CTRL header 720 and adds the appropriate RTP, UDP, IP, MAC, and CRC fields, as described above. CTRL header 720 includes a priority field that NIC 306 uses to process the packet and send the corresponding RTP packet. NIC 306 evaluates the priority field. Because the priority field is relatively high (first audio source 604a is the only transmitting source), NIC 306 sends IP packets with synchronized RTP header information that carry the first audio stream over the network to the destination device associated with the call. (CTRL header 720 can also include RTP or other synchronized header information, which NIC 306 can use or ignore when NIC 306 generates and adds the RTP header information.)
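One possible reading of "synchronized RTP header information" is that the NIC keeps a single running sequence number and timestamp per outgoing channel and stamps every payload it forwards, whichever internal source produced it. The sketch below follows that reading, which is an assumption rather than something the description spells out. The 12-byte fixed header layout is the standard one from RFC 3550; the class name `RtpChannel` and the default of 160 samples per packet (20 ms of 8 kHz audio) are illustrative choices.

```python
import struct


class RtpChannel:
    """Per-call RTP state kept at the outgoing NIC (illustrative only)."""

    def __init__(self, ssrc: int, payload_type: int = 0,
                 samples_per_packet: int = 160) -> None:
        self.ssrc = ssrc
        self.payload_type = payload_type  # 0 is PCMU in the standard RTP profile
        self.step = samples_per_packet
        # Start at 0 for readability; RFC 3550 recommends random initial values.
        self.seq = 0
        self.timestamp = 0

    def wrap(self, audio_payload: bytes) -> bytes:
        """Prefix a payload with a 12-byte RTP header and advance the state."""
        byte0 = 2 << 6                    # version 2, no padding/extension/CSRC
        byte1 = self.payload_type & 0x7F  # marker bit clear
        header = struct.pack("!BBHII", byte0, byte1,
                             self.seq, self.timestamp, self.ssrc)
        self.seq = (self.seq + 1) & 0xFFFF
        self.timestamp = (self.timestamp + self.step) & 0xFFFFFFFF
        return header + audio_payload
```

Because the counters live in the channel rather than in the audio source, a receiver would see one unbroken RTP sequence even when the payload starts coming from a different source.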

When outgoing audio controller 610 determines a call event at which a noiseless switchover is to occur, second audio source 604n begins generating a second audio stream. The audio can be generated directly by audio source 604n or by converting audio originally generated by an external device. The second audio stream is made up of internal outgoing packets having an audio payload and CTRL header 720 information, as described with reference to packet format 700B. Any type of audio payload can be used, including voice, music, or other audio data. Assume that the second audio stream is given a higher priority field than the first audio stream. For example, the second audio stream can represent an advertisement, an emergency public service message, or other audio data that is to be noiselessly inserted into the first channel established with the destination device.

The internal outgoing packets of the second audio stream are then converted into cells by SAR 632. Cell switch 304 switches the cells onto an SVC destined for the same destination NIC 306 as the first audio stream. SAR 634 converts the cells back into internal packets. NIC 306 now receives internal packets for both the first and second audio streams. NIC 306 evaluates the priority field in each stream. The second audio stream, whose internal packets have the higher priority, is converted into IP packets with synchronized RTP header information and forwarded to the destination device. The first audio stream, whose internal packets have the lower priority, is either stored in a buffer as is, or converted into IP packets with synchronized RTP header information and then stored in a buffer. NIC 306 can resume forwarding the first audio stream when the second audio stream is complete, after a predetermined time has elapsed, or when a manual or automatic control signal to resume is received.

F. Call Events Triggering Noiseless Switchover

The function of the priority field in one embodiment of noiseless switching according to the present invention is now described with reference to FIGS. 8, 9A, and 9B.

FIG. 8 shows a flow diagram of a noiseless switching routine 800 according to one embodiment of the present invention. For brevity, noiseless switching routine 800 is described with reference to system 600.

Routine 800 begins at step 802 and proceeds immediately to step 804.

In step 804, call control and audio feature manager 302 establishes a call from first audio source 604a to a destination device. Call control and audio feature manager 302 negotiates with the destination device to determine the MAC, IP, and UDP port information to use in a first audio stream of IP packets sent over the network.

Audio source 604a delivers the first audio stream on one channel for the established call. In one embodiment, a DSP delivers the first audio stream of internal outgoing packets on one channel to cell switch 304 and then to NIC 306. The process proceeds to step 806.

In step 806, outgoing audio controller 610 sets a priority field for the first audio source. In one embodiment, outgoing audio controller 610 sets the priority field to a value of 1. In another embodiment, the priority field is stored in the CTRL header of the internally routed internal outgoing packets. The process proceeds immediately to step 808.

In step 808, outgoing audio controller 610 determines the status of the call. In one embodiment, outgoing audio controller 610 determines whether the call is allowed or configured to interact with call events. In one embodiment of the present invention, a call can be configured so that only emergency call events interrupt it. In another embodiment, a call can be configured to receive certain call events based on the caller(s) or called party(ies) (that is, one or more of the parties to the call). The process proceeds immediately to step 810.

In step 810, outgoing audio controller 610 monitors for call events. In one embodiment, call events such as time announcements, weather, advertisements, or billing notices ("please insert more coins" or "you have five minutes remaining") can be generated within system 600. In another embodiment, call events such as requests for news or sports information can be sent to system 600. Outgoing audio controller 610 monitors for call events both internally and externally. The process proceeds immediately to step 812.

In step 812, outgoing audio controller 610 determines whether a call event has been received. If not, outgoing audio controller 610 continues to monitor as described for step 810. If so, the process proceeds immediately to step 814.

In step 814, outgoing audio controller 610 determines the call event and performs the actions that the call event entails. The process then proceeds to step 816, where it either ends or returns to step 802. In one embodiment, process 800 repeats for as long as the call continues.
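The control flow of routine 800 can be sketched as a simple loop. This is an illustration under stated assumptions, not the described implementation: call events are modeled as plain strings, the call configuration checked in step 808 is modeled as a set of permitted event kinds, and the names `Call`, `allowed_events`, and `routine_800` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Call:
    priority: int = 0
    # Step 808: which kinds of call event this call is configured to accept.
    allowed_events: Set[str] = field(default_factory=lambda: {"emergency"})


def routine_800(call: Call, events: Iterable[str]) -> List[str]:
    """Return the call events that were acted on, in arrival order."""
    call.priority = 1                          # step 806: first stream gets 1
    handled: List[str] = []
    for event in events:                       # steps 810 and 812: monitor, receive
        if event not in call.allowed_events:   # call is not configured for it
            continue
        handled.append(event)                  # step 814: act on the event
    return handled
```

With the default configuration only emergency events get through, matching the embodiment in which only emergency call events may interrupt a call.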

FIGS. 9A-9C show a flow diagram 900 of call event processing for priority-based audio stream switching according to one embodiment of the present invention. In one embodiment, flow 900 shows in more detail the actions performed in step 814 of FIG. 8.

Process 900 begins at step 902 and proceeds immediately to step 904.

In step 904, outgoing audio controller 610 reads a call event for an established call. At this point, a first audio stream from audio source 604a is already being sent from NIC 306 as part of the established call. The process proceeds to step 906.

In step 906, outgoing audio controller 610 determines whether the call event involves a second audio source. If so, the process proceeds to step 908. If not, the process proceeds to step 930.

In step 908, outgoing audio controller 610 determines the priority of the second audio source. In one embodiment, outgoing audio controller 610 issues a command to second audio source 604n instructing it to generate a second audio stream of internal outgoing packets. Priority information for the second audio stream can be generated automatically by second audio source 604n or generated based on a command from outgoing audio controller 610. The process then proceeds to step 910.

In step 910, second audio source 604n begins generating the second audio stream. The second audio stream is made up of internal outgoing packets having an audio payload and CTRL header 720 information, as described with respect to packet format 700B. Any type of audio payload can be used, including voice, music, or other audio data. "Audio payload" is meant broadly and also covers audio data included as part of video data. The process then proceeds to step 912.

In step 912, the outgoing packets of the second audio stream are converted into cells. In one example, the cells are ATM cells. The process then proceeds to step 914.

In step 914, cell switch 304 switches the cells onto an SVC destined for the same destination NIC 306, on the same outgoing channel as the first audio stream. The process then proceeds to step 915.

As shown in step 915 of FIG. 9B, SAR 634 now receives cells for the first and second audio streams. The cells are converted back into streams of internal outgoing packets, which have control headers containing the respective priority information for the two audio streams.

In step 916, NIC 306 compares the priorities of the two audio streams. If the second audio stream has the higher priority, the process proceeds to step 918. Otherwise, the process proceeds to step 930.

In step 918, transmission of the first audio stream is suspended. For example, NIC 306 buffers the first audio stream, or even issues a control command to audio source 604a to suspend transmission from the first audio source. The process proceeds immediately to step 920.

In step 920, transmission of the second audio stream begins. NIC 306 instructs packet processor(s) 307 to generate IP packets carrying the audio payload of the internal outgoing packets of the second audio stream. Packet processor(s) 307 add synchronized RTP header information (RTP packet information) and other header information (MAC, IP, and UDP fields) to the audio payload of the internal outgoing packets of the second audio stream.

NIC 306 then sends the IP packets with synchronized RTP header information on the same outgoing channel as the first audio stream. In this way, the destination device receives the second audio stream in place of the first audio stream. Moreover, from the perspective of the destination device, the second audio stream is received noiselessly in real time, without delay or interruption. Steps 918 and 920 can of course be performed simultaneously or in either order. The process proceeds immediately to step 922.

As shown in FIG. 9C, NIC 306 monitors for the end of the second audio stream (step 922). The process proceeds immediately to step 924.

In step 924, NIC 306 determines whether the second audio stream has ended. In one example, NIC 306 reads a last packet of the second audio stream that has a lower priority level than the preceding packets. If the stream has ended, the process proceeds immediately to step 930. If not, the process returns to step 922.

In step 930, NIC 306 either continues forwarding the first audio stream (after step 906) or returns to forwarding the first audio stream (after step 916 or step 924). The process proceeds to step 932.

In one embodiment, NIC 306 maintains a priority level threshold. NIC 306 increments and sets the threshold based on the priority information in the audio streams. When NIC 306 encounters multiple audio streams, it forwards the audio stream whose priority information is greater than or equal to the priority level threshold. For example, if the first audio stream has a priority value of 1, the priority level threshold is set to 1 and the first audio stream is sent (before step 904). When a second audio stream with a higher priority is received at NIC 306, NIC 306 increments the priority threshold to 2. The second audio stream is then sent as described above for step 920. When the last packet of the second audio stream, whose priority field is set to 0 (or null or another special value), is read, the priority level threshold is lowered back to 1 as part of step 924. The first audio stream, with priority information 1, is then sent by NIC 306 as described above with respect to step 930.
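The threshold rule in the preceding paragraph can be sketched in a few lines. The sketch makes several assumptions that the description leaves open: the end-of-stream marker is the priority value 0, the marker packet itself is still forwarded, the marker always comes from the stream currently holding the channel, and earlier thresholds are remembered on a stack so that nested interruptions unwind in order. The class name `PriorityGate` is hypothetical.

```python
from typing import List

END_OF_STREAM = 0  # special priority value carried by a stream's last packet


class PriorityGate:
    """Decides, packet by packet, what the NIC forwards on one channel."""

    def __init__(self) -> None:
        self.threshold = 0
        self._previous: List[int] = []  # thresholds to restore when a stream ends

    def accept(self, priority: int) -> bool:
        """Return True if a packet with this priority should be forwarded."""
        if priority == END_OF_STREAM:
            # Last packet of the interrupting stream: forward it, then fall back.
            self.threshold = self._previous.pop() if self._previous else 0
            return True
        if priority > self.threshold:
            # A higher-priority stream appeared: raise the threshold to match.
            self._previous.append(self.threshold)
            self.threshold = priority
        return priority >= self.threshold
```

Feeding the gate the priorities 1, 1, 2, 1, 2, 0, 1 reproduces the example: the first stream is forwarded, the priority-2 stream takes over, the interleaved priority-1 packet is held back, and after the 0 marker the threshold returns to 1 so the first stream resumes. Packets for which `accept` returns False would be buffered or dropped by the caller.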

In step 932, outgoing audio controller 610 processes any remaining call events. The process then proceeds to step 934, where it ends until it is restarted. In one embodiment, the steps of the process described above occur nearly simultaneously, so the process can run in parallel or in an overlapping manner on one or more processors within system 600.

G. Audio Data Flow

FIG. 6B is a diagram of audio data flow 615 in the noiseless switchover system of FIG. 6A in one embodiment. In particular, FIG. 6B shows the flow of internal packets from audio sources 604a-n to SARs 630, 632, the flow of cells through cell switch 304 to SAR 634, the flow of internal packets between SAR 634 and packet processor 307, and the flow of IP packets from NIC 306 over the network.

H. Other Embodiments

The present invention is not limited to internal audio sources or a cell layer. Noiseless switchover can also be carried out in other embodiments that use internal audio sources only, internal and external audio sources, external audio sources only, a cell switch, or a packet switch. For example, FIG. 6C is a diagram of a noiseless switchover system 600C that performs cell switching between independent outgoing audio streams generated by internal audio sources 604a-n and/or external audio sources (not shown) according to one embodiment of the present invention. Noiseless switchover system 600C operates similarly to system 600A described in detail above, except that the noiseless switchover is performed on audio received from an external audio source. As shown in FIG. 6C, this audio is received in IP packets and buffered at NIC 306. NIC 306 strips off the IP information (storing it in a forward table entry associated with the external audio source and the destination device) and generates internal packets assigned to an SVC. SAR 634 converts the internal packets into cells and routes the cells on the SVC over link 662 through switch 304 and back over link 664 to SAR 634, which converts them back into internal packets. As described above, the internal packets are then processed by packet processor 307 to generate IP packets with synchronized header information. NIC 306 sends the IP packets to the destination device. In this way, the user at the destination device is noiselessly switched over to receive audio from the external audio source. FIG. 6D is a diagram of audio data flow 625 for an outgoing audio stream received from an external audio source in the noiseless switchover system of FIG. 6C. Specifically, FIG. 6D shows the flow of IP packets from an external audio source (not shown) to NIC 306, the flow of internal packets from NIC 306 to SAR 634, the flow of cells through cell switch 304 and back to SAR 634, the flow of internal packets between SAR 634 and packet processor 307, and the flow of IP packets from NIC 306 over the network to a destination device (not shown).
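The ingest step for external audio, in which the NIC strips the IP information, records it in a forward table entry, and emits an internal packet, can be sketched as follows. Everything about the data layout here is an assumption made for illustration: the table is keyed by an SVC identifier, the stripped information is reduced to address and port pairs, and the CTRL header is a hypothetical two bytes (SVC identifier and priority) rather than the actual format 700B.

```python
from typing import Dict, NamedTuple, Tuple


class IpInfo(NamedTuple):
    src: Tuple[str, int]  # (IP address, UDP port) of the external audio source
    dst: Tuple[str, int]  # (IP address, UDP port) of the destination device


class ExternalIngest:
    """Turns received external audio into internal packets (illustrative)."""

    def __init__(self) -> None:
        self.forward_table: Dict[int, IpInfo] = {}

    def to_internal(self, svc_id: int, info: IpInfo,
                    priority: int, audio: bytes) -> bytes:
        """Store the stripped IP information and return an internal packet."""
        self.forward_table[svc_id] = info
        ctrl = bytes([svc_id & 0xFF, priority & 0xFF])  # hypothetical CTRL header
        return ctrl + audio
```

The stored entry is what would later let the packet processor rebuild outgoing headers for the destination device once the internal packet has come back through the switch.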

FIG. 6E is a diagram of audio data flows 635, 645 in a noiseless switchover system 600E that performs packet switching between independent outgoing audio streams generated by internal and/or external audio sources according to one embodiment of the present invention. Noiseless switchover system 600E operates similarly to systems 600A and 600C described in detail above, except that a packet switch 694 is used in place of cell switch 304. In this embodiment, the cell layer including SARs 630, 632, 634 is omitted. In audio data flow 635, internal packets travel from internal audio sources 604a-n through packet switch 694 to packet processor 307, and IP packets travel out to the network. In audio data flow 645, IP packets from an external audio source (not shown) are received at NIC 306. As shown in FIG. 6E, the audio is received in packet form and buffered at NIC 306. NIC 306 strips off the IP information (storing it in a forward table entry associated with the external audio source and the destination device) and generates internal packets assigned to an SVC (or other type of path) associated with the destination device. The internal packets are routed on the SVC through packet switch 694 to NIC 306. As described above, the internal packets are then processed by packet processor 307 to generate IP packets with synchronized header information. NIC 306 then sends the IP packets to the destination device. In this way, the user at the destination device is noiselessly switched over to receive audio from the external audio source.

FIG. 6F is a diagram of a noiseless switchover system 600F that performs switching between independent outgoing audio streams generated by external audio sources only, according to one embodiment of the present invention. No switch or internal audio source is required. NIC 306 strips off the IP information (storing it in a forward table entry associated with the external audio source and the destination device) and generates internal packets assigned to an SVC (or other type of path) associated with the destination device. The internal packets are routed on the SVC to NIC 306. (NIC 306 can be the common source and destination point.) As described above, the internal packets are then processed by packet processor 307 to generate IP packets with synchronized header information. NIC 306 then sends the IP packets to the destination device. In this way, the user at the destination device is noiselessly switched over to receive audio from the external audio source.

The functionality described above with respect to the operation of outgoing audio switching system 600 can be implemented in control logic. Such control logic can be implemented in software, firmware, hardware, or any combination thereof.

X. 컨퍼런스 콜 처리X. Conference Call Processing

A. 분산 컨퍼런스 브리지A. Distributed Conference Bridge

도 10은 본 발명의 일 실시예에 따른 분산 컨퍼런스 브리지(1000)을 나타낸 도면이다. 분산 컨퍼런스 브리지(1000)는 네트워크(1005)에 연결되어 있다. 네트워크(1005)는 인터넷 등의 임의의 유형의 네트워크 또는 네트워크의 조합일 수 있다. 예를 들어, 네트워크(1005)는 패킷 교환 네트워크, 또는 회선 교환 네트워크와 결합된 패킷 교환 네트워크를 포함할 수 있다. 다수의 컨퍼런스 콜 참가자(C1-CN)가 네트워크(1005)를 통해 분산 컨퍼런스 브리지(1000)에 연결할 수 있다. 예를 들어, 컨퍼런스 콜 참가자(C1-CN)는 VoIP 호를 네트워크(1005)를 통해 보내어 분산 컨퍼런스 브리지(1000)에 접촉할 수 있다. 분산 컨퍼런스 브리지(1000)는 확장성이 있으며, 임의의 수의 컨퍼런스 콜 참가자를 처리할 수 있다. 예를 들어, 분산 컨퍼런스 브리지(1000)는 2명의 컨퍼런스 콜 참가자 내지 최대 1000명 이상의 컨퍼런스 콜 참가자들 간의 컨퍼런스 콜을 처리할 수 있다.10 illustrates a distributed conference bridge 1000 in accordance with an embodiment of the present invention. Distributed conference bridge 1000 is coupled to network 1005. The network 1005 may be any type of network or combination of networks, such as the Internet. For example, network 1005 may comprise a packet switched network, or a packet switched network coupled with a circuit switched network. Multiple conference call participants C1-CN may connect to distributed conference bridge 1000 via network 1005. For example, conference call participants C1-CN may send a VoIP call over network 1005 to contact distributed conference bridge 1000. Distributed conference bridge 1000 is scalable and can handle any number of conference call participants. For example, distributed conference bridge 1000 may handle conference calls between two conference call participants and up to 1000 or more conference call participants.

도 10에 도시한 바와 같이, 분산 컨퍼런스 브리지(1000)는 컨퍼런스 콜 에이전트(1010), 네트워크 인터페이스 제어기(NIC)(1020), 스위치(1030), 및 오디오 소스(1040)를 포함한다. 컨퍼런스 콜 에이전트(1010)는 NIC(1020), 스위치(1030) 및오디오 소스(1040)에 연결되어 있다. NIC(1020)는 네트워크(1005)와 스위치(1030) 사이에 연결되어 있다. 스위치(1030)는 NIC(1020)와 오디오 소스(1040) 사이에 연결되어 있다. 룩업 테이블(1025)은 NIC(1020)에 연결되어 있다. 룩업 테이블 (1025)(또는 도시되지 않은 별도의 룩업 테이블)은 또한 오디오 소스(1040)에도 연결될 수 있다. 스위치(1030)는 멀티캐스터(1050)를 포함한다. NIC(1020)는 패킷 프로세서(1070)를 포함한다.As shown in FIG. 10, distributed conference bridge 1000 includes a conference call agent 1010, a network interface controller (NIC) 1020, a switch 1030, and an audio source 1040. Conference call agent 1010 is coupled to NIC 1020, switch 1030, and audio source 1040. The NIC 1020 is connected between the network 1005 and the switch 1030. The switch 1030 is connected between the NIC 1020 and the audio source 1040. The lookup table 1025 is connected to the NIC 1020. Lookup table 1025 (or a separate lookup table, not shown) may also be coupled to audio source 1040. The switch 1030 includes a multicaster 1050. The NIC 1020 includes a packet processor 1070.

Conference call agent 1010 establishes a conference call for a number of participants. During the conference call, packets carrying audio, such as digitized voice, travel from conference call participants C1-CN to conference bridge 1000. These packets can be IP packets, including but not limited to RTP/RTCP packets. NIC 1020 receives the packets and forwards them along link 1028 to switch 1030. Link 1028 can be any type of logical and/or physical link, such as a PVC or an SVC. In one embodiment, NIC 1020 converts each IP packet (as described above with reference to FIG. 7A) into an internal packet that has only a header and a payload (as described above with reference to FIG. 7B). Using internal packets further reduces the processing work at audio source 1040. Incoming packets processed by NIC 1020 can also be assembled by a SAR into cells, such as ATM cells, and sent over link 1028 to switch 1030. Switch 1030 passes the incoming packets (or cells) from NIC 1020 over link 1035 to audio source 1040. Link 1035 can likewise be any type of logical and/or physical link, including but not limited to a PVC or an SVC.

Audio provided over link 1035 is referred to as "external audio" in this conference bridge processing, because it originates from conference call participants and arrives over network 1005. Audio can also be provided internally through one or more links 1036, as shown in FIG. 10. Such "internal audio" can be voice, music, advertising, news, or other audio content to be mixed into the conference call. Internal audio can be provided by any audio source or accessed from a storage device coupled to conference bridge 1000.

Audio source 1040 mixes the audio for the conference call. Audio source 1040 generates outbound packets containing the mixed audio and sends them over link 1045 to switch 1030. In particular, audio source 1040 generates one fully mixed audio packet stream and a set of partially mixed audio streams. In one embodiment, audio source 1040 (also called a mixer, because it mixes audio) dynamically generates the appropriate fully mixed and partially mixed audio packet streams during the conference call, each carrying conference identification (CID) information and mixed audio. The audio source retrieves the relevant CID information of the conference call participants from a relatively static lookup table (for example, table 1025 or a separate table closer to audio source 1040) that is generated and stored when the conference call is initiated.

Multicaster 1050 multicasts the packets in the fully mixed audio stream and in the set of partially mixed audio streams. In one embodiment, multicaster 1050 replicates each packet in the fully mixed audio stream and in each partially mixed audio stream N times, corresponding to the N conference call participants. The N replicated packets are then sent over N switched virtual circuits (SVC1-SVCN), respectively, to endpoints at NIC 1020. One advantage of distributed conference bridge 1000 is that it relieves audio source 1040 (that is, the mixing device) of the replication work. The replication work is instead distributed to multicaster 1050 and switch 1030.

NIC 1020 then processes the outbound packets arriving on each of SVC1-SVCN to determine whether packets of the fully mixed audio stream and the partially mixed audio streams are to be forwarded to conference call participants C1-CN or discarded. This determination is made in real time during the conference call, based on packet header information. For each packet arriving on an SVC, NIC 1020 determines, based on packet header information such as the TAS and IAS fields, whether the packet is appropriate for transmission to the participant associated with that SVC. If so, the packet is forwarded for further packet processing: it is processed into a network packet and forwarded to the participant. Otherwise, the packet is discarded. In one embodiment, the network packet is an IP packet that includes network address information (IP/UDP address) of the destination call participant obtained from lookup table 1025, RTP/RTCP packet header information (time stamp/sequence information), and audio data. The audio data is the mixed audio data appropriate for the particular conference call participant. The operation of distributed conference bridge 1000 is described further below with reference to the example lookup table 1025 shown in FIG. 11, to FIGS. 12 and 13A-13C, and to the example packet diagrams shown in FIGS. 14A, 14B and 15.

B. Distributed Conference Bridge Operation

FIG. 12 shows a routine 1200 (steps 1200 to 1280) for setting up the conference bridge processing of the present invention. In step 1220, a conference call is initiated. A number of conference call participants C1-CN dial distributed conference bridge 1000. Each participant can use any VoIP terminal, including but not limited to a telephone, computer, PDA, set-top box, or network appliance. Conference call agent 1010 performs conventional IVR processing to acknowledge that a conference call participant wishes to join the conference call, and obtains the network address of each conference call participant. For example, the network address information can include, but is not limited to, IP and/or UDP address information.

In step 1240, lookup table 1025 is generated. Conference call agent 1010 can generate the lookup table itself or instruct NIC 1020 to generate it. As shown in the example of FIG. 11, lookup table 1025 contains N entries corresponding to the N conference call participants in the conference call initiated in step 1220. Each entry in lookup table 1025 includes an SVC identifier, a conference ID (CID), and network address information. The SVC identifier is any number or tag that identifies a particular SVC. In one example, the SVC identifier is a Virtual Path Identifier and Virtual Channel Identifier (VPI/VCI). Alternatively, the SVC identifier or tag information can be omitted from lookup table 1025 and instead be implicitly associated with the position of an entry in the table. For example, the first SVC can be associated with the first entry in the table, the second SVC with the second entry, and so on. The CID is any number or tag assigned by conference call agent 1010 to a conference call participant C1-CN. The network address information is the network address information collected by conference call agent 1010 for each of the N conference call participants.
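The lookup table described above can be sketched as a small keyed structure. This is only an illustration under stated assumptions: the names `LookupEntry` and `build_lookup_table`, the VPI/VCI numbering scheme, and the use of sequential integers as CIDs are all hypothetical and do not come from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LookupEntry:
    svc_id: tuple[int, int]   # (VPI, VCI) pair identifying the SVC
    cid: int                  # conference ID assigned by the call agent
    address: tuple[str, int]  # (IP address, UDP port) of the participant


def build_lookup_table(addresses: list[tuple[str, int]]) -> dict[tuple[int, int], LookupEntry]:
    """Create one entry per participant, keyed by SVC identifier.

    The VPI/VCI values (VPI 0, VCI 100 + n) and the CID values (1..N)
    are hypothetical choices made only for this sketch.
    """
    table = {}
    for n, address in enumerate(addresses, start=1):
        svc_id = (0, 100 + n)
        table[svc_id] = LookupEntry(svc_id=svc_id, cid=n, address=address)
    return table


table = build_lookup_table([("10.0.0.1", 5004), ("10.0.0.2", 5004)])
```

Keying the table by SVC identifier mirrors the way the NIC later looks up a CID for the SVC on which a packet arrived. The positional variant mentioned in the text would simply use a list index in place of the key.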

In step 1260, NIC 1020 assigns a respective SVC to each participant. For N conference call participants, N SVCs are assigned. Conference call agent 1010 instructs NIC 1020 to assign the N SVCs. NIC 1020 then establishes N SVC connections between NIC 1020 and switch 1030. In step 1280, the conference call begins. Conference call agent 1010 sends a signal to NIC 1020, switch 1030, and audio source 1040 to begin conference call processing. Although FIG. 12 is described in terms of SVCs and SVC identifiers, the present invention is not so limited, and any type of link (physical and/or logical) and link identifier can be used. In addition, in embodiments that include an internal audio source, conference call agent 1010 adds the internal audio source as one of the N potential audio participants, and its input is mixed at audio source 1040.

The operation of distributed conference bridge 1000 during conference call processing is shown in FIGS. 13A-13C (steps 1300 to 1398). Control begins at step 1300 and proceeds to step 1310. In step 1310, audio source 1040 monitors the energy of the incoming audio streams from conference call participants C1-CN. Audio source 1040 can be any type of audio source, including but not limited to a digital signal processor (DSP). Any conventional technique for monitoring the energy of digitized audio samples can be used. In step 1320, audio source 1040 determines the active speakers based on the energy monitored in step 1310. Any number of active speakers can be selected. In one embodiment, the conference call is limited to three active speakers at a given time. In this case, up to three active speakers are determined, corresponding to the up to three audio streams that had the most energy during the monitoring in step 1310.
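As a rough illustration of steps 1310 and 1320, the sketch below ranks participants by the mean-square energy of their most recent audio frame and keeps at most three. The specification allows any conventional energy-monitoring technique, so the mean-square measure, the `threshold` parameter used to ignore silent streams, and the function names are assumptions made for this example only.

```python
def frame_energy(samples):
    """Mean-square energy of one frame of PCM samples (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def select_active_speakers(frames, max_speakers=3, threshold=0.0):
    """Return the CIDs of up to max_speakers streams with the most energy.

    frames maps CID -> list of samples for the current frame.
    Streams whose energy does not exceed threshold are never selected
    (a hypothetical silence cutoff, not taken from the specification).
    """
    energies = {cid: frame_energy(samples) for cid, samples in frames.items()}
    candidates = [cid for cid, energy in energies.items() if energy > threshold]
    candidates.sort(key=lambda cid: energies[cid], reverse=True)
    return sorted(candidates[:max_speakers])


frames = {1: [100, -100], 2: [50, 50], 3: [10, -10], 4: [1, 1], 5: [0, 0]}
active = select_active_speakers(frames)
```

With the sample frames shown, participants 1, 2 and 3 carry the most energy and are selected; participant 4 is audible but outranked, and participant 5 is silent.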

Next, audio source 1040 generates and sends the fully mixed audio stream and the partially mixed audio streams (steps 1330 to 1360). In step 1330, one fully mixed audio stream is generated. The fully mixed audio stream contains the audio content of the active speakers determined in step 1320. In one embodiment, the fully mixed audio stream is an audio stream of packets, each having a packet header and a payload. The packet header information identifies the active speakers whose audio content is included in the fully mixed audio stream. In one example, as shown in FIG. 14A, audio source 1040 generates an outbound internal packet 1400 having a packet header 1401, with TAS, IAS, and sequence fields, and a payload 1403. The TAS field lists the CIDs of all current active speakers in the conference call. The IAS field lists the CIDs of the active speakers whose audio content is in the mixed stream. The sequence information can be a time stamp, a numeric sequence value, or another type of sequence information. Other fields (not shown) can include a checksum or other packet information, depending on the particular application. For the fully mixed audio stream, the TAS field and the IAS field are identical. Payload 1403 contains a portion of the digitized mixed audio in the fully mixed audio stream.

In step 1340, audio source 1040 sends the fully mixed audio stream generated in step 1330 to switch 1030. Ultimately, the passive participants in the conference call (that is, participants who are not among the active speakers determined in step 1320) hear the mixed audio from the fully mixed audio stream.

In step 1350, audio source 1040 generates a set of partially mixed audio streams. The set of partially mixed audio streams is then sent to switch 1030 (step 1360). Each partially mixed audio stream generated in step 1350 and sent in step 1360 contains the mixed audio content of the group of active speakers identified in step 1320, minus the audio content of the respective recipient active speaker. A recipient active speaker is an active speaker within the group determined in step 1320 toward whom a given partially mixed audio stream is directed.

In one embodiment, audio source 1040 inserts into the packet payload the digital audio from the group of identified active speakers, minus the audio content of the recipient active speaker. In this way, the recipient active speaker does not receive audio corresponding to his or her own voice or audio input, but does hear the voice or audio input of the other active speakers. In one embodiment, packet header information is included in each partially mixed audio stream to identify the active speakers whose audio content is included in that stream. In one example, audio source 1040 uses the packet format of FIG. 14A and inserts one or more CIDs into the TAS and IAS fields of the packet. The TAS field lists the CIDs of all current active speakers in the conference call. The IAS field lists the CIDs of the active speakers whose audio content is in the respective partially mixed stream. For a partially mixed audio stream, the TAS and IAS fields are not identical, because the IAS field has one fewer CID. In one example, to build the packets in steps 1330 and 1350, audio source 1040 retrieves the relevant CID information of the conference call participants from a relatively static lookup table (for example, table 1025 or a separate table) that is generated and stored when the conference call is initiated.

For example, in a conference call with 64 participants (N=64), three of whom are identified as active speakers (1-3), one fully mixed audio stream contains the audio from all three active speakers. This fully mixed stream is ultimately sent to each of the 61 passive participants. Three partially mixed audio streams are then generated in step 1350. The first partially mixed stream 1 contains the audio from speakers 2 and 3 but not the audio from speaker 1. The second partially mixed stream 2 contains the audio from speakers 1 and 3 but not the audio from speaker 2. The third partially mixed stream 3 contains the audio from speakers 1 and 2 but not the audio from speaker 3. The first through third partially mixed audio streams are ultimately sent to speakers 1 through 3, respectively. In this way, the audio source needs to generate only four mixed audio streams (one fully mixed and three partially mixed) rather than one per participant, which reduces the work of audio source 1040.
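The mixing arithmetic in this example can be sketched as follows. The code is a minimal illustration, assuming that mixing is plain sample-wise addition without clipping or gain control and that a packet is represented as a dictionary with `tas`, `ias`, `seq` and `payload` keys; neither the representation nor the function names come from the specification.

```python
def mix(streams):
    """Sample-wise sum of equally long sample lists (no clipping applied)."""
    return [sum(column) for column in zip(*streams)]


def build_mixes(active_audio, seq):
    """Build one fully mixed packet plus one partial mix per active speaker.

    active_audio maps the CID of each active speaker to that speaker's samples.
    Returns a list whose first element is the full mix; the remaining
    elements are the partial mixes in ascending order of recipient CID.
    """
    tas = tuple(sorted(active_audio))
    packets = [{"tas": tas, "ias": tas, "seq": seq,
                "payload": mix(active_audio.values())}]
    for recipient in tas:
        # The recipient's own CID and audio are left out of the partial mix.
        ias = tuple(cid for cid in tas if cid != recipient)
        packets.append({"tas": tas, "ias": ias, "seq": seq,
                        "payload": mix([active_audio[cid] for cid in ias])})
    return packets


active_audio = {1: [1, 1], 2: [10, 10], 3: [100, 100]}
packets = build_mixes(active_audio, seq=0)
```

For three active speakers the function returns four packets regardless of how many passive participants are on the call, which is the saving the paragraph above describes.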

As shown in FIG. 13B, in step 1370, multicaster 1050 replicates the fully mixed audio stream and the set of partially mixed audio streams and multicasts the replicated packet copies over all of the SVCs (SVC1-SVCN) assigned to the conference call. NIC 1020 then processes each packet received over an SVC (step 1380). For clarity, each packet processed internally within distributed conference bridge 1000 (including the packets received by NIC 1020 on the SVCs) is referred to as an internal packet. An internal packet can have any type of packet format, including but not limited to an IP packet and/or the internal outbound packets described above with reference to FIGS. 7A and 7B and the example internal outbound packet described above with reference to FIG. 14A.

For each SVC, NIC 1020 determines whether a received internal packet is to be forwarded for further packet processing and eventual transmission to the corresponding conference call participant, or discarded (step 1381). The received internal packet can come from the fully mixed audio stream or from a partially mixed audio stream. If the packet is to be forwarded, control proceeds to step 1390. Otherwise, the packet is not forwarded, and control returns to step 1380 to process the next packet. In step 1390, the packet is processed into a network IP packet. In one embodiment, packet processor 1070 generates a packet header that has at least the participant's network address information (IP and/or UDP address) obtained from lookup table 1025. Packet processor 1070 also adds sequence information, such as RTP/RTCP packet header information (for example, a time stamp and/or another type of sequence information). Packet processor 1070 can generate this sequence information based on the order of the received packets and/or on sequence information (for example, a sequence field) provided in the packets generated by audio source 1040 (or by multicaster 1050). Packet processor 1070 also adds to each network packet a payload containing the audio from the received internal packet that is being forwarded to the participant. NIC 1020 (or packet processor 1070) then sends the generated IP packet to the participant (step 1395).
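One way to picture the header generation in step 1390 is to prefix the forwarded payload with a fixed 12-byte RTP header whose sequence number is taken from the internal packet's sequence field. The sketch below follows the fixed-header layout defined in RFC 3550. The payload type 0, the 160-sample timestamp step, the SSRC value and the function name are assumptions for illustration; the specification does not prescribe them, and the IP/UDP addressing from the lookup table is left to the socket layer here.

```python
import struct


def build_rtp_packet(seq, payload, ssrc=0x1234, payload_type=0, samples_per_packet=160):
    """Prefix payload with a 12-byte RTP fixed header.

    seq is the sequence value carried in the internal packet. The timestamp
    is derived from it on the assumption that every packet carries the same
    number of samples (160 is a hypothetical value for this sketch).
    """
    first_byte = 2 << 6                    # version 2, no padding, no extension, no CSRCs
    second_byte = payload_type & 0x7F      # marker bit clear
    timestamp = (seq * samples_per_packet) & 0xFFFFFFFF
    header = struct.pack("!BBHII", first_byte, second_byte,
                         seq & 0xFFFF, timestamp, ssrc)
    return header + payload


packet = build_rtp_packet(seq=1, payload=b"\x00" * 160)
```

The resulting bytes could then be handed to a UDP socket addressed with the participant's IP address and port from the lookup table.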

One feature of the present invention is that the packet processing determination in step 1381 can be performed quickly, in real time, during the conference call. FIG. 13C shows an example routine (steps 1382 to 1389) for performing packet processing determination step 1381 according to the present invention. This routine is performed for each outbound packet arriving over each SVC. NIC 1020 acts as a filter or selector in determining which packets to discard and which packets to convert into IP packets and send to a call participant.

When an internal packet arrives over an SVC, NIC 1020 looks up the entry in lookup table 1025 that corresponds to that particular SVC and obtains a CID value (step 1382). NIC 1020 then determines whether the obtained CID value matches any CID value in the TAS field of the internal packet (step 1383). If it matches, control proceeds to step 1384. If it does not match, control proceeds to step 1386. In step 1384, NIC 1020 determines whether the obtained CID value matches any CID value in the IAS field of the internal packet. If it matches, control proceeds to step 1385. If it does not match, control proceeds to step 1387. In step 1385, the packet is discarded. Control then proceeds to step 1389, which returns to step 1380 to process the next packet. In step 1387, control jumps to step 1390 to generate an IP packet from the internal packet.

In step 1386, the TAS field and the IAS field are compared. If the two fields are identical (as in the case of a fully mixed audio stream packet), control proceeds to step 1387, and from there jumps to step 1390. If the TAS and IAS fields are not identical, control proceeds to step 1385 and the packet is discarded.
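The decision routine of steps 1382 to 1389 reduces to a short predicate. The sketch below assumes the TAS and IAS fields are available as collections of CIDs and that the CID for the SVC has already been looked up (step 1382); the function name `should_forward` is hypothetical.

```python
def should_forward(participant_cid, tas, ias):
    """Return True if the packet should become an IP packet for this participant.

    participant_cid: CID obtained from the lookup table for the SVC (step 1382).
    tas: CIDs of all current active speakers.
    ias: CIDs of the speakers whose audio is in this packet's payload.
    """
    if participant_cid in tas:                 # step 1383: participant is an active speaker
        # Step 1384: discard a mix that contains the participant's own audio (1385),
        # forward the one partial mix that leaves it out (1387).
        return participant_cid not in ias
    # Step 1386: a passive participant only receives the full mix, where TAS equals IAS.
    return set(tas) == set(ias)
```

For an active speaker, only the partial mix that omits that speaker's CID passes the filter; for a passive participant, only the fully mixed packet passes. Both checks need nothing beyond the packet header and one table lookup, which is consistent with the real-time claim made above.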

C. Outbound Packet Flow Through the Distributed Conference Bridge

The outbound packet flow in distributed conference bridge 1000 is described further with reference to the example packets of a 64-participant conference call shown in FIGS. 14 and 15. In FIGS. 14 and 15, the mixed audio content in a packet payload is denoted by braces enclosing the participants whose audio is mixed (for example, {C1, C2, C3}). The CID information in a packet header is denoted by underlining the respective active-speaker participants. Sequence numbers are shown simply as sequence number 0, 1, and so on.

In this example, there are 64 participants (C1-C64) in the conference call, three of whom are identified as active speakers at a given time (C1-C3). Audio participants C4-C64 are considered passive participants, and their audio is not mixed. Audio source 1040 generates one fully mixed audio stream FM containing the audio from all three active speakers (C1-C3). FIG. 14B shows two example internal packets 1402, 1404 generated by audio source 1040 during this conference call. Packets 1402, 1404 in stream FM each have a packet header and a payload. The payload in each of packets 1402, 1404 contains mixed audio from each of the three active speakers (C1-C3). Packets 1402, 1404 each include a packet header having TAS and IAS fields. The TAS field contains the CIDs of all three active speakers (C1-C3). The IAS field contains the CIDs of the active speakers (C1-C3) whose content is actually mixed into the payload of the packet. Packets 1402, 1404 further include sequence information 0 and 1, respectively, to indicate that packet 1402 precedes packet 1404. The mixed audio from the fully mixed stream FM is ultimately sent to each of the 61 current passive participants (C4-C64).

Three partially mixed audio streams (PM1-PM3) are generated by audio source 1040. FIG. 14B shows two packets 1412, 1414 of the first partially mixed stream PM1. The payloads in packets 1412, 1414 contain mixed audio from speakers C2 and C3, but not from speaker C1. Packets 1412, 1414 each include a packet header. The TAS field contains the CIDs of all three active speakers (C1-C3). The IAS field contains the CIDs of the two active speakers (C2, C3) whose content is actually mixed into the payload of the packet. Packets 1412, 1414 carry sequence information 0 and 1, respectively, to indicate that packet 1412 precedes packet 1414.

FIG. 14B shows two packets 1422, 1424 of the second partially mixed stream PM2. The payloads in packets 1422, 1424 contain mixed audio from speakers C1 and C3, but not from speaker C2. Packets 1422, 1424 each include a packet header. The TAS field contains the CIDs of all three active speakers (C1-C3). The IAS field contains the CIDs of the two active speakers (C1, C3) whose content is actually mixed into the payload of the packet. Packets 1422, 1424 carry sequence information 0 and 1, respectively, to indicate that packet 1422 precedes packet 1424.

FIG. 14C shows two packets 1432, 1434 of the third partially mixed stream PM3. The payloads in packets 1432, 1434 contain mixed audio from speakers C1 and C2, but not from speaker C3. Packets 1432, 1434 each include a packet header. The TAS field contains the CIDs of all three active speakers (C1-C3). The IAS field contains the CIDs of the two active speakers (C1, C2) whose content is actually mixed into the payload of the packet. Packets 1432, 1434 carry sequence information 0 and 1, respectively, to indicate that packet 1432 precedes packet 1434.

FIG. 15 is a diagram showing example packet contents after the packets of FIG. 14 have been multicast and processed into IP packets to be sent to the appropriate conference call participants according to the present invention. In particular, packets 1412, 1422, 1432, 1402, and 1414 are shown being multicast over each of SVC1-SVC64 and arriving at NIC 1020. As described above with respect to step 1381, NIC 1020 determines, for each of SVC1-SVC64, which of packets 1412, 1422, 1432, 1402, and 1414 are appropriate to forward to the respective conference call participant C1-C64. Network packets (for example, IP packets) are then generated by packet processor 1070 and sent to the respective conference call participants C1-C64.

As shown in FIG. 15, for SVC1, packets 1412 and 1414 are determined to be delivered to C1 based on their packet headers. Packets 1412 and 1414 have the CID of C1 in the TAS field but not in the IAS field. Packets 1412 and 1414 are converted into network packets 1512 and 1514. Network packets 1512 and 1514 include the IP address of C1 (C1ADDR) and mixed audio from speakers C2 and C3, but no mixed audio from speaker C1. Packets 1512 and 1514 have order information 0 and 1, respectively, to indicate that packet 1512 precedes packet 1514.

For SVC2 (corresponding to conference call participant C2), packet 1422 is determined to be delivered to C2. Packet 1422 has the CID of C2 in the TAS field but not in the IAS field. Packet 1422 is converted into network packet 1522. Network packet 1522 includes the IP address of C2 (C2ADDR), order information 0, and mixed audio from speakers C1 and C3, but no mixed audio from speaker C2.

For SVC3 (corresponding to conference call participant C3), packet 1432 is determined to be delivered to C3. Packet 1432 has the CID of C3 in the TAS field but not in the IAS field. Packet 1432 is converted into network packet 1532. Network packet 1532 includes the IP address of C3 (C3ADDR), order information 0, and mixed audio from speakers C1 and C2, but no mixed audio from speaker C3.

For SVC4 (corresponding to conference call participant C4), packet 1402 is determined to be delivered to C4. Packet 1402 does not have the CID of C4 in the TAS field, and its TAS and IAS fields are identical, indicating a fully mixed stream. Packet 1402 is converted into network packet 1502. Network packet 1502 includes the IP address of C4 (C4ADDR), order information 0, and mixed audio from speakers C1, C2, and C3. Each of the other passive participants C5-C64 receives a similar packet. For example, for SVC64 (corresponding to conference call participant C64), packet 1402 is determined to be delivered to C64. Packet 1402 is converted into network packet 1503. Network packet 1503 includes the IP address of C64 (C64ADDR), order information 0, and mixed audio from all of the active speakers C1, C2, and C3.
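The per-SVC delivery decision walked through for FIG. 15 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `should_deliver` and `deliver`, the tuple and dictionary layouts, and the use of strings such as `"C1ADDR"` as addresses are hypothetical, and the rule encoded is the one implied by the examples above (deliver to an active speaker only the packet that omits that speaker's own audio, and deliver to a passive participant only a fully mixed packet).

```python
ACTIVE = frozenset({"C1", "C2", "C3"})

# Internal packets of FIG. 15: (packet number, TAS, IAS, order information)
PACKETS = [
    (1412, ACTIVE, frozenset({"C2", "C3"}), 0),
    (1422, ACTIVE, frozenset({"C1", "C3"}), 0),
    (1432, ACTIVE, frozenset({"C1", "C2"}), 0),
    (1402, ACTIVE, ACTIVE, 0),
    (1414, ACTIVE, frozenset({"C2", "C3"}), 1),
]


def should_deliver(cid, tas, ias):
    """Decide whether a multicast packet is delivered on the SVC of `cid`."""
    if cid in tas:
        # Active speaker: accept only the mix that leaves out own audio.
        return cid not in ias
    # Passive participant: accept only the fully mixed stream (TAS == IAS).
    return tas == ias


def deliver(svc_table, packets):
    """Map each participant CID to the network packets it would be sent."""
    out = {}
    for svc, (cid, addr) in svc_table.items():
        out[cid] = [
            {"dst": addr, "seq": seq, "mixed_from": sorted(ias),
             "source_packet": number}
            for number, tas, ias, seq in packets
            if should_deliver(cid, tas, ias)
        ]
    return out


# Lookup from SVC to (CID, network address), as stored when the call is set up.
SVC_TABLE = {f"SVC{i}": (f"C{i}", f"C{i}ADDR") for i in range(1, 65)}
result = deliver(SVC_TABLE, PACKETS)
```

With these inputs, `result["C1"]` is built from packets 1412 and 1414, `result["C2"]` from packet 1422, `result["C3"]` from packet 1432, and every passive participant from C4 through C64 from packet 1402, which matches the walkthrough above.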

D. Control Logic and Additional Embodiments

The functions described above with respect to the operation of conference bridge 1000 (including conference call agent 1010, NIC 1020, switch 1030, audio source 1040, and multicaster 1050) can be implemented in control logic. Such control logic can be implemented in software, firmware, hardware, or any combination thereof.

In one embodiment, distributed conference bridge 1000 is implemented in a media server such as media server 202. In one embodiment, distributed conference bridge 1000 is implemented in audio processing platform 230. Conference call agent 1010 is part of call control and audio feature manager 302. NIC 306 performs the network interface functions of NIC 1020, and packet processor 307 performs the functions of packet processor 1070. Switch 304 is replaced by switch 1030 and multicaster 1050. Any of the audio sources 308 can perform the functions of audio source 1040.

XI. Conclusion

While specific embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. Those skilled in the art will recognize that various changes in form and detail can be made without departing from the spirit and scope of the invention as defined in the appended claims. Accordingly, the scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

A media platform that provides media services over a network.

A resource manager for managing a resource used to support the media service, and

An audio processing platform that manages the call and the media services provided by this call,

The audio processing platform,

A network interface having a set of packet processors for processing packets of audio data entering and leaving the media platform in the call being processed;

A set of audio processors for processing the audio data in accordance with the media service provided in the call, and

And a switch for switching packets of audio data transmitted between the audio processor and the packet processor.

The media platform of claim 1, wherein the audio processing platform further comprises a call control and audio feature manager that controls media services and resources provided in the call processed by the audio processor.

The method of claim 2, wherein the call control and audio feature manager,

Call signaling manager,

system administrator,

Connection manager, and

And a feature controller.

3. The media platform of claim 2, wherein the audio processing platform comprises a shelf controller card.

The method of claim 1, further comprising a set of ports connected to the network,

Wherein said network interface further comprises a respective controller and forwarding information table for each packet processor.

The media platform of claim 1, wherein the switch comprises a packet switch.

The media platform of claim 1, further comprising a cell layer that combines the packets of audio data into cells of audio, wherein the switch is a cell switch that switches the cells.

The media platform of claim 1, wherein each audio processor comprises a digital signal processor.

The media platform of claim 1, wherein each audio processor comprises a plurality of card processors coupled to a plurality of digital signal processors.

The media platform of claim 1, wherein for at least one incoming audio stream, each packet processor receives an IP packet having RTP information from the network and converts the IP packet into an internal packet having a payload and a header.

The media platform of claim 10, wherein each audio processor processes internal packets.

The media platform of claim 1, wherein for an outgoing audio stream, each packet processor receives an internal packet and generates an IP packet having RTP information to be transmitted over the network.

A media platform that provides media services over a network.

Means for managing a resource used to support the media service, and

Interface means including means for processing packets of audio data entering and leaving the media platform in a call being processed as interface means for interfacing with a network,

Means for processing the audio data in accordance with the media service provided in the call, and

Means for switching a packet of audio data transmitted between the audio processor and a packet processor.

A scalable audio processing platform for managing VoIP calls and media services provided by the calls,

A network interface having a set of packet processors for processing packets of audio data entering and leaving the platform in the call being processed;

And a switch coupled between the network interface and the set of audio processors to switch packets of audio data.

As a method of providing media services over a network,

Managing resources used to support at least one media service provided in the VoIP call,

An IP packet processing step of processing IP packets of audio data in the incoming and outgoing audio streams in the call, including converting IP packets into internal packets in the incoming audio stream and converting internal packets into IP packets in the outgoing audio stream,

Switching the internal packets of audio data in the incoming audio stream and the outgoing audio stream in the call being processed, and

Processing the internal packets of audio data in the incoming audio stream and the outgoing audio stream to provide at least one media service in the call.

A method of handling audio in conference calls between participants,

(a) generating a fully mixed audio packet stream (each packet having a packet header and a payload),

(b) generating a set of partially mixed audio packet streams, each packet having a packet header and a payload,

(c) multicasting each packet in the fully mixed audio stream and the set of partially mixed audio streams, and

(d) determining which of the multicast packets are to be delivered based on packet header information in the respective packet.

The method of claim 16, wherein prior to steps (a) and (b),

Initiating the conference call between the participants, and

Storing conference identifier information (CID) and network address information associated with each participant in the initiated conference call.

18. The method of claim 17, further comprising assigning a switched virtual circuit (SVC) to respective participants in the conference call,

And the storing step further comprises storing the CID and network address information so that the CID and network address information for each participant can be retrieved based on their assigned SVC.

17. The method of claim 16, further comprising: monitoring energy of the inbound audio streams of the participants, and

Determining the number of active speakers based on the monitored energy.

The method of claim 19, wherein the generating steps (a) and (b) generate a packet header based on the determined number of active speakers, and the determining step (d) comprises determining which of the multicast packets are to be delivered to the participants based on active speaker information in the respective packet header.

The method of claim 20, wherein the active speaker information comprises a TAS field and an IAS field, the generating step (a) generates a fully mixed audio packet stream having packet headers comprising the TAS and IAS fields, and the generating step (b) generates a set of partially mixed audio packet streams having packet headers comprising the TAS and IAS fields.

The method of claim 21, wherein the determining step (d) determines which of the multicast packets are to be delivered to participants based on information in the TAS and IAS fields in the respective packet header.

The method of claim 21, wherein the determining step (d) comprises, for each packet being processed in an SVC,

Obtaining a CID value for the SVC, and

Determining whether the obtained CID value matches any CID value in the TAS field of the packet and, if so, determining whether the obtained CID value matches any CID value in the IAS field of the packet, and determining that the packet is to be discarded when the obtained CID value matches a CID value in the TAS field and also matches a CID value in the IAS field.

The method of claim 23, further comprising, when the obtained CID value does not match any CID value in the TAS field of the packet, a comparing step of comparing the TAS field and the IAS field, wherein the packet is converted into a network packet when the compared fields are equal and may be discarded when the compared fields are not equal.

The method of claim 21, wherein the packet header further comprises sequence information,

The generating step (a) generates a fully mixed audio packet stream having a packet header including the sequence information,

And said generating step (b) generates a set of partially mixed audio packet streams having a packet header comprising said sequence information.

The method of claim 16, wherein the generating step (a) generates a fully mixed audio packet stream, each packet having a packet header and a payload, wherein the payload includes mixed audio from at least three active speakers, and

The generating step (b) generates a set of partially mixed audio packet streams, each packet having a packet header and a payload, wherein for each partially mixed audio packet stream the payload includes the mixed audio from the at least three active speakers minus the audio of the respective receiving active speaker.

17. The method of claim 16, further comprising: processing the packet determined to be delivered in the determining step (d) to produce a network packet having the network address of the participants of the conference call, and

Sending the network packet to the participants.

The method of claim 16, further comprising mixing audio received over a network from participants of the conference call that are active speakers.

The method of claim 16, further comprising mixing audio received from an internal audio source with audio received over a network from participants of the conference call that are active speakers.

A conference bridge that handles audio in conference calls between participants.

An audio source that produces a fully mixed audio packet stream and a set of partially mixed audio packet streams, each packet having a packet header and payload,

Switch, and

A network interface controller,

The switch is connected between the network interface controller and the audio source,

The switch further comprises a multicaster,

The multicaster multicasts each packet of the fully mixed audio stream and the set of partially mixed audio streams to the network interface controller,

And the network interface controller determines which of the multicast packets is to be delivered based on packet header information in the respective packet.

31. The conference bridge of claim 30 further comprising a conference call agent initiating a conference call between the participants.

32. The conference bridge of claim 31 further comprising a storage device for storing conference identifier information (CID) and network address information associated with each participant of the established conference call.

33. The conference bridge of claim 32 wherein the storage device comprises a lookup table.

33. The system of claim 32, wherein the network interface controller assigns a switched virtual circuit (SVC) to respective participants of the initiated conference call,

And the storage device stores the CID and network address information so that the CID and network address information for each participant can be retrieved based on their assigned SVC.

31. The conference bridge of claim 30 wherein the audio source monitors the energy of the participants' inbound audio stream and determines the number of active speakers based on the monitored energy.

36. The apparatus of claim 35, wherein the packet header generated by the audio source has active speaker information based on the determined number of active speakers,

The network interface controller determining which of the multicast packets are to be delivered to participants based on the active speaker information in the respective packet header.

37. The method of claim 36, wherein the active speaker information includes a TAS field and an IAS field.

Wherein the network interface controller determines which of the multicast packets are to be delivered to participants based on the information in the TAS field and the IAS field in the respective packet header.

38. The conference bridge of claim 37 wherein the packet header generated by the audio source further includes sequence information.

31. The system of claim 30, wherein the fully mixed audio packet stream has a payload comprising mixed audio from at least three active speakers,

Wherein each stream in the set of partially mixed audio packet streams has a payload comprising subtracting audio from a respective receiving side active speaker from mixed audio from the at least three active speakers.

33. The conference bridge of claim 30 further comprising a packet processor for processing the packets determined to be delivered to a network packet having network addresses of participants of the conference call.

The conference bridge of claim 30, wherein the audio source mixes audio received over a network from participants in the conference call that are active speakers.

The conference bridge of claim 30, wherein the audio source mixes audio received from an internal audio source with audio received over a network from participants in the conference call that are active speakers.

A system for processing audio in conference calls between participants,

(a) means for generating a fully mixed audio packet stream (each packet having a packet header and a payload),

(b) means for generating a set of partially mixed audio packet streams, each packet having a packet header and a payload,

(c) means for multicasting each packet in the fully mixed audio stream and the set of partially mixed audio streams, and

(d) means for determining which of the multicast packets are to be delivered based on packet header information in the respective packet.

As a media server for use in a VoIP network,

A distributed conference bridge that processes audio in conference calls between participants, the distributed conference bridge comprising:

Switch, and

A network interface controller,

The switch further comprises a multicaster,

The multicaster multicasts each packet in the fully mixed audio stream and the set of partially mixed audio streams to the network interface controller,

And the network interface controller determines which of the multicast packets is to be delivered based on packet header information in the respective packets.

A method of noiselessly switching audio provided over a network through an outgoing audio channel,

(a) generating a first audio stream of outgoing packets for the outgoing audio channel, each outgoing packet comprising a payload carrying audio and control header information;

(b) switching and delivering the first audio stream to a first network interface controller associated with the outgoing audio channel,

(c) generating a second audio stream of outgoing packets, each outgoing packet comprising a payload carrying audio and control header information;

(d) switching and forwarding the second audio stream to a first network interface controller associated with the outgoing audio channel, and

(e) determining the relative priority of the first and second audio streams based on priority information in the control header information of the outgoing packets, in order to determine which of the first and second audio streams is the higher priority audio stream to be transmitted over the network via the outgoing audio channel.

46. The method of claim 45, further comprising: packetizing the higher priority audio stream to produce an output outgoing audio stream of packets with synchronized header information, and

Transmitting said output outgoing audio packet stream over said outgoing audio channel over said network.

The method of claim 45, further comprising packetizing the lower priority audio stream to produce an output outgoing audio stream of packets having synchronized header information, whereby synchronized header information is noiselessly preserved in the IP packets transmitted over the network via the outgoing audio channel for audio from both the first and second audio streams.

46. The method of claim 45, further comprising: converting the first audio stream of outgoing packets into a first cell, and

Converting the second audio stream of outgoing packets into a second cell,

The switching step (b) comprises switching the converted first cell to an SVC associated with the outgoing audio channel,

And said switching step (d) comprises switching said converted second cell to an SVC associated with said outgoing audio channel.

47. The method of claim 46, wherein the synchronized header information includes valid RTP information.

The method of claim 45, further comprising: (f) determining synchronized RTP header information for each IP packet having audio payloads of the respective first and second audio streams, prior to transmitting the IP packet over the network via the outgoing audio channel.

A method for noiselessly switching audio from a second audio source to an outgoing audio channel that is already carrying audio from a first audio source.

Generating an audio stream of outgoing packets at the second audio source,

Converting the audio stream of outgoing packets into cells,

Switching the converted cells to a switched virtual circuit (SVC) associated with the outgoing audio channel,

Converting the switched cells back into the audio stream of outgoing packets;

Packetizing the audio stream to produce an output outgoing audio stream of packets with synchronized header information, and

And transmitting said output outgoing audio packet stream over a network over said outgoing audio channel in place of the audio from said first audio source.

53. The method of claim 51, wherein the generating step generates an audio stream of outgoing packets at the second audio source in response to a call event.

The method of claim 51, wherein the generating step generates an audio stream of outgoing packets at the second audio source in response to a call event, and the audio stream of outgoing packets comprises at least one type of audio selected from voice, music, tone, or sound.

The method of claim 53, further comprising generating the call event based on at least one of the following situations: an emergency, a call signaling situation, a call event based on caller or callee information, or a request for audio information.

The method of claim 53, further comprising generating the call event based on a request for audio information, wherein the request for audio information includes at least one of a request for advertising, news, sports, finance, music, or other audio content.

A method of introducing noiseless switchover audio for VoIP phone calls,

Establishing a VoIP phone call between the destination device and the media server,

Setting priority information for the first audio source,

Delivering a first audio stream of outgoing packets including the set priority information;

Determining a call state with respect to the availability of reception of noiseless switchover audio, and

Processing a call event including noiseless switchover audio when the determined call state indicates that the established VoIP telephone call is a candidate for reception of noiseless switchover audio.

The method of claim 56, wherein the processing step comprises:

Determining priority information for the noiseless switchover audio, and

Transmitting the noiseless switchover audio in the output audio packet stream of the established VoIP telephone call when the determined priority information for the noiseless switchover audio is higher than the set priority information of the first audio stream.

58. The method of claim 57, further comprising: generating a second audio stream of outgoing packets at a second audio source, the audio stream having the noiseless switchover audio in a payload;

Converting a second audio stream of the outgoing packets into cells,

Switching the converted cells to an SVC associated with an outgoing audio channel of the established VoIP telephone call;

Converting the switched cells back to a second audio stream of the outgoing packets;

Packetizing the second audio stream with synchronized header information to generate the output audio packet stream in the established VoIP telephone call, and

Transmitting the output audio packet stream over a network over the outgoing audio channel in the established VoIP telephone call instead of the audio from the first audio source.

A system for noiseless switching of audio provided over a network through an outgoing audio channel,

First and second audio sources,

A switch connected to the first and second audio sources, and

A network interface controller coupled to the switch,

The first audio source generates a first audio stream of outgoing packets for the outgoing audio channel, each outgoing packet comprising a payload carrying audio and control header information,

The second audio source generates a second audio stream of outgoing packets, each outgoing packet including a payload that carries audio and control header information, and the switch switches the first and second audio streams to the network interface controller for transmission.

60. The system of claim 59, further comprising an outgoing audio controller coupled to the second audio source,

And the outgoing audio controller sends a control signal to the second audio source to initiate generation of the second audio stream.

61. The apparatus of claim 60, wherein the outgoing audio controller is also connected to the first audio source, the switch and the network interface controller.

The outgoing audio controller sends a control signal to the first audio source to initiate generation of the first audio stream when a VoIP telephone call is established, sends a control signal to the switch identifying the network interface controller as being associated with the outgoing audio channel of the established VoIP telephone call, and sends a control signal to the network interface controller associated with the outgoing audio channel of the established VoIP telephone call.

62. The system of claim 61 wherein the outgoing audio controller is also connected to the first audio source,

And the outgoing audio controller sends a control signal to the first and second audio sources to set priority information in the first and second audio streams.

60. The system of claim 59, further comprising at least one packet processor for generating an IP packet having synchronized header information and audio payload,

And the audio payload comprises an audio payload carried in the first and second audio streams.

64. The apparatus of claim 63, wherein the network interface controller dynamically selects which of the IP packets to send based on the relative priority of the first and second audio streams.

And the switch comprises a packet switch or a cell switch.

60. The noiseless switching system of claim 59, wherein at least one of the first audio source and the second audio source internally generates the audio for the respective first and second audio streams.

The noiseless switching system of claim 59, wherein at least one of the first audio source and the second audio source converts audio from an external source to produce the audio for the respective first and second audio streams.

A system for noiseless switching of audio from a second audio source to an outgoing audio channel that is already carrying audio from a first audio source.

Means for generating an audio stream of outgoing packets at the second audio source,

Means for converting an audio stream of the outgoing packets into cells;

Means for switching the converted cells to an SVC associated with the outgoing audio channel;

Means for converting the switched cells back to an audio stream of the outgoing packets;

Means for packetizing the audio stream to produce an output outgoing audio packet stream, and

Means for transmitting the output outgoing audio packet stream over a network over the outgoing audio channel instead of the audio from the first audio source.

A system for introducing noiseless switchover audio for VoIP phone calls.

Means for establishing a VoIP phone call between the destination device and the media server,

Means for setting priority information for the first audio source,

Means for delivering a first audio stream of outgoing packets containing the set priority information;

Means for determining a call state with respect to the availability of reception of noiseless switchover audio, and

Means for processing a call event including noiseless switchover audio when the determined call state indicates that the established VoIP telephone call is a candidate for reception of noiseless switchover audio.

The method of claim 68, wherein the processing means,

Means for determining priority information for the noiseless switchover audio, and

Means for transmitting the noiseless switchover audio in an output audio packet stream with synchronized header information in the established VoIP telephone call when the determined priority information for the noiseless switchover audio is higher than the set priority information of the first audio stream.

70. The apparatus of claim 69, further comprising: means for generating a second audio stream of outgoing packets at a second audio source, the audio stream having the noiseless switchover audio in a payload;

Means for converting a second audio stream of the outgoing packets into cells;

Means for switching the converted cells to an SVC associated with an outgoing audio channel of the established VoIP telephone call;

Means for converting the switched cells back to a second audio stream of the outgoing packets;

Means for packetizing the second audio stream to produce the output audio packet stream in the established VoIP telephone call, and

And means for transmitting the output audio packet stream over a network over the outgoing audio channel in the established VoIP telephone call instead of the audio from the first audio source.

A method of introducing noiseless switchover audio for a VoIP phone call,

Establishing a VoIP phone call, and

And transmitting the noiseless switchover audio into an output audio packet stream having synchronized header information in the set VoIP telephone call.

A method of noiseless switching between audio sources in a VoIP network,

(A) selecting one audio source,

(B) inserting audio from the selected one audio source into an output audio packet stream having synchronized header information and transmitting it to a destination device over an outgoing audio channel;

(C) selecting another audio source, and

(D) inserting audio from the selected another audio source into an output audio packet stream with synchronized header information and transmitting it to the destination device over the same outgoing audio channel.

73. The apparatus of claim 72, wherein the another audio source comprises an internal audio source,

Generating an audio payload of the output audio packet stream prior to the transmitting step (B).

73. The apparatus of claim 72, wherein the another audio source comprises an external audio source,

Extracting an audio payload for the output audio packet stream from an IP packet generated at the external audio source prior to the transmitting step (B).

(A) inserting audio from one audio source into an output audio packet stream with synchronized header information and transmitting it to a destination device over an outgoing audio channel;

(B) inserting audio from another independent audio source into an output audio packet stream with synchronized header information and transmitting it to the destination device over the same outgoing audio channel,

Whereby a user at the destination device perceives a noiseless switchover between the audio transmitted from the independent audio sources in the VoIP network.

(A) means for inserting audio from one audio source into an output audio packet stream with synchronized header information and transmitting it to a destination device over an outgoing audio channel;

(B) means for putting audio from another independent audio source into an output audio packet stream with synchronized header information and transmitting it to the destination device over the same outgoing audio channel,