KR101074069B1

KR101074069B1 - System of Private Branch Exchange and Method of Speech Synthesis

Info

Publication number: KR101074069B1
Application number: KR1020090093529A
Authority: KR
Inventors: 류창선; 구명완; 김재인
Original assignee: 주식회사 케이티
Priority date: 2009-09-30
Filing date: 2009-09-30
Publication date: 2011-10-17
Also published as: KR20110035713A

Abstract

According to an embodiment of the present invention, a method by which a speech synthesis resource server, operating in conjunction with a resource management server, synthesizes speech for an announcement includes: exchanging state information between the resource management server and the speech synthesis resource server; when available-channel information is requested by a call box, identifying, by the resource management server, an available channel of the speech synthesis resource server and transmitting server IP information of the identified channel to the call box; and, when a speech synthesis request containing the server IP information and a sentence corresponding to the announcement is received from the call box, generating, by the speech synthesis resource server, a speech synthesis file for the sentence and transmitting a result of the speech synthesis request to the call box.

Private branch exchange, speech synthesis, resource

Description

System of Private Branch Exchange and Method of Speech Synthesis

The present invention relates to a private branch exchange system and a speech synthesis method.

A private branch exchange (PBX) service enables both internal calls made within the premises of a company or school and ordinary calls carried over the public telephone network. As technology advances, PBX services are evolving from a public telephone network base, such as the existing public switched telephone network (PSTN), to an Internet protocol (IP) network base.

Recently, PBX services have come to provide a variety of functions beyond the conventional key-phone function, that is, connecting a caller to a specific telephone on the premises through a representative number. Accordingly, each business site increasingly wants to produce announcements suited to its own characteristics and to change them as needed.

Announcements, however, are generally produced by recording a voice actor. That is, the sentences to be recorded are decided in advance, a voice actor is hired, the actor's reading of the sentences is recorded, and the recording is applied to the system. Producing or changing an announcement therefore takes considerable time and money.

The technical problem to be solved by the present invention is to provide a private branch exchange system and a speech synthesis method.

A method according to an embodiment of the present invention, by which a speech synthesis resource server operating in conjunction with a resource management server synthesizes speech for an announcement, includes: exchanging state information between the resource management server and the speech synthesis resource server, the state information including the available channels of the speech synthesis resource server and the server IP information of those channels; when available-channel information is requested by a call box, identifying, by the resource management server, an available channel of the speech synthesis resource server and transmitting the server IP information of the identified channel to the call box; and, when a speech synthesis request containing the server IP information and a sentence corresponding to the announcement is received from the call box, generating, by the speech synthesis resource server, a speech synthesis file for the sentence using a synthesizer engine and transmitting a result of the speech synthesis request to the call box, the synthesizer engine synthesizing speech using a synthesized-speech DB that stores synthesized speech for specific characters or sentences.

A private branch exchange apparatus according to an embodiment of the present invention includes: a business site system including a call box that serves an announcement when a call setup request is received from an external user; a resource management server that, when available-channel information is requested by the call box, identifies an available channel of a speech synthesis resource server and transmits the server IP information of the identified channel to the call box; and the speech synthesis resource server, which, when a speech synthesis request containing the server IP information and a sentence corresponding to the announcement is received from the call box, generates a speech synthesis file for the sentence using a synthesizer engine and transmits the speech synthesis file to the call box, the synthesizer engine synthesizing speech using a synthesized-speech DB that stores synthesized speech for specific characters or sentences.

According to an embodiment of the present invention, a private branch exchange system and a speech synthesis method can be provided.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and like reference numerals denote like parts throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise.

A private branch exchange system and a speech synthesis method according to an embodiment of the present invention will now be described in detail with reference to the drawings.

FIG. 1 is a conceptual diagram of a private branch exchange system according to an embodiment of the present invention. The private branch exchange system 100 of FIG. 1 is built at a single business site and includes a speech synthesis server 102 that synthesizes speech corresponding to an announcement at the request of an operator.

Referring to FIG. 1, the private branch exchange system 100 according to an embodiment of the present invention includes a call box 101, a speech synthesis server 102, and a plurality of on-premises telephones 103_1, 103_2, …, 103_n.

The call box 101 is connected to the Internet and, based on a call setup request received over the Internet, performs a call setup procedure with one of the on-premises telephones 103_1, 103_2, …, 103_n. The call box 101 also performs call setup between the on-premises telephones 103_1, 103_2, …, 103_n.

When the call box 101 receives from the operator a speech synthesis request to synthesize speech corresponding to an announcement, it requests speech synthesis from the speech synthesis server 102 and then receives and stores the speech synthesis file produced by the speech synthesis server 102. Thereafter, when a call setup request arrives from the Internet, the call box 101 plays the stored speech synthesis file to serve the announcement.

The speech synthesis server 102 is connected to the call box 101. When it receives a speech synthesis request from the operator through the call box 101, it synthesizes speech corresponding to the announcement to generate a speech synthesis file and transmits the generated file to the call box 101.

The on-premises telephones 103_1, 103_2, …, 103_n are the telephones installed at one business site, and may include fax machines and the like as well as ordinary telephones. The call box 101, the speech synthesis server 102, and the on-premises telephones 103_1, 103_2, …, 103_n may be connected using xDSL (x digital subscriber line), a digital subscriber line technology that enables high-speed data communication, or the like.

With the private branch exchange system of FIG. 1, however, a speech synthesis server 102 must be built at every business site where a call box 101 is located, which increases the construction cost. It is therefore necessary to build the server for speech synthesis separately from the business site.

FIG. 2 is a conceptual diagram of a private branch exchange system according to another embodiment of the present invention.

Referring to FIG. 2, the private branch exchange system according to another embodiment of the present invention includes a speech synthesis resource server 200, a resource management server 300, and a business site system 400.

The speech synthesis resource server 200 includes an initialization manager 201, a resource manager 202, a web manager 203, a timer manager 204, an engine manager 205, a plurality of synthesizer engines 206_1, 206_2, …, 206_n, and a plurality of DBs 207 to 212 and 213_1, 213_2, …, 213_m.

On receiving an initialization request from the resource manager 202, the initialization manager 201 initializes the synthesizer engines 206_1, 206_2, …, 206_n based on the information stored in the initialization DB 207. The initialization manager 201 also manages the initialization information that the engine manager 205 can use when starting each of the synthesizer engines 206_1, 206_2, …, 206_n.

The initialization DB 207 may hold the following information for initializing the synthesizer engines 206_1, 206_2, …, 206_n: the storage directory used when speech synthesis files are stored on the speech synthesis resource server 200; the maximum number of synthesizer engines that can be created (MaxChannel, n); the number of log files (MaxLog); the level of detail of the log files (LogLevel); the coding format used for synthesis, such as Vox or Wave (CodingFormat); the voice actor ID (SpeakerID); the speed (Speed), pitch (Pitch), and volume (Volume) of the speech synthesis file; the maximum time allowed to produce a file for a speech synthesis request (MaxResponseTime); the maximum CPU utilization (MaxCPU); and the maximum disk utilization (MaxDISK).
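
As an illustration only, the initialization parameters listed above could be held in a single configuration record. The following Python sketch is not part of the disclosure: the class name InitConfig, the snake_case field names, the default values, and the validation rules are all assumptions, and only the parameter names in the comments come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitConfig:
    """Hypothetical in-memory view of the initialization DB (207)."""
    storage_dir: str = "/var/tts/out"   # storage directory (assumed path)
    max_channel: int = 30               # MaxChannel (n)
    max_log: int = 10                   # MaxLog
    log_level: int = 1                  # LogLevel
    coding_format: str = "Wave"         # CodingFormat, e.g. "Vox" or "Wave"
    speaker_id: str = "speaker01"       # SpeakerID (assumed value)
    speed: int = 100                    # Speed
    pitch: int = 100                    # Pitch
    volume: int = 100                   # Volume
    max_response_time: float = 3.0      # MaxResponseTime, in seconds (assumed unit)
    max_cpu: float = 90.0               # MaxCPU, in percent (assumed unit)
    max_disk: float = 90.0              # MaxDISK, in percent (assumed unit)

    def validate(self) -> None:
        """Reject values that cannot describe a usable server (assumed rules)."""
        if self.max_channel < 1:
            raise ValueError("MaxChannel must be at least 1")
        if self.max_response_time <= 0:
            raise ValueError("MaxResponseTime must be positive")
        for name in ("max_cpu", "max_disk"):
            value = getattr(self, name)
            if not 0 < value <= 100:
                raise ValueError(f"{name} must be in (0, 100]")
```

Freezing the record reflects the idea that engines are created from a consistent snapshot of these values; how the values are actually persisted is not specified in the description.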

When CPU utilization exceeds the maximum (MaxCPU) stored in the initialization DB 207, the speech synthesis resource server 200 cannot synthesize speech for a new announcement and sets the state of every channel in the channel DB 208 to "busy".

The channel DB 208 stores the state of each channel. For example, if the initialization DB 207 sets the maximum number of channels to 30, the channel DB 208 stores how many of the 30 channels are currently in use (busy) and how many are idle.
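
The busy/idle bookkeeping and the MaxCPU rule can be sketched as follows. This is a minimal illustration under stated assumptions: the class ChannelDB, its method names, and the list-based storage are hypothetical, and the sketch does not model how channels return to idle after the CPU load falls, which the description does not specify.

```python
from enum import Enum


class ChannelState(Enum):
    IDLE = "idle"
    BUSY = "busy"


class ChannelDB:
    """Hypothetical model of the channel DB (208)."""

    def __init__(self, max_channel: int, max_cpu: float):
        self.max_cpu = max_cpu
        self.states = [ChannelState.IDLE] * max_channel

    def acquire(self):
        """Mark the first idle channel busy and return its index, or None."""
        for index, state in enumerate(self.states):
            if state is ChannelState.IDLE:
                self.states[index] = ChannelState.BUSY
                return index
        return None

    def release(self, index: int) -> None:
        """Return a channel to the idle state once synthesis has finished."""
        self.states[index] = ChannelState.IDLE

    def apply_cpu_load(self, cpu_percent: float) -> None:
        """If CPU utilization exceeds MaxCPU, mark every channel busy."""
        if cpu_percent > self.max_cpu:
            self.states = [ChannelState.BUSY] * len(self.states)

    def summary(self) -> dict:
        """Count busy and idle channels, as reported to the management server."""
        busy = sum(state is ChannelState.BUSY for state in self.states)
        return {"busy": busy, "idle": len(self.states) - busy}
```

With 30 channels, one acquisition leaves 29 idle; once the load exceeds MaxCPU, all 30 are reported busy and further acquisitions fail.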

The resource manager 202 controls the overall operation of the speech synthesis resource server 200. Specifically, it asks the initialization manager 201 to perform initialization. It also transmits the contents of the channel DB 208 to the resource management server 300, periodically or aperiodically, so that the resource management server 300 can track the channel state of the speech synthesis resource server 200 in real time.

When the resource manager 202 receives a speech synthesis request from the call box 401, it requests speech synthesis from the engine manager 205 and asks the timer manager 204 to start a timer.

If the resource manager 202 has not received a synthesis-complete response from the engine manager 205 by the time the timer manager 204 reports that the set time has elapsed, it sends an incomplete message to the call box 401 that requested the synthesis.
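
The interplay of the synthesis request and the timer can be approximated with a future that is awaited for a bounded time. The sketch below is an assumption-laden illustration, not the disclosed implementation: handle_request, slow_engine, and the dictionary-shaped replies are hypothetical, and a thread pool stands in for the engine manager and timer manager.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def handle_request(synthesize, sentence: str, timeout_s: float = 3.0) -> dict:
    """Run synthesis and report "incomplete" if it outlasts the timer."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(synthesize, sentence)
    try:
        audio = future.result(timeout=timeout_s)  # plays the role of the timer
        return {"status": "complete", "audio": audio}
    except FutureTimeout:
        return {"status": "incomplete"}
    finally:
        # Do not block the reply on a synthesis that is still running.
        pool.shutdown(wait=False)


def slow_engine(sentence: str) -> bytes:
    """Stand-in engine that takes 0.3 seconds, for demonstrating a timeout."""
    time.sleep(0.3)
    return sentence.encode("utf-8")
```

Calling handle_request(slow_engine, "Welcome.", timeout_s=0.05) yields the incomplete reply, while a generous timeout yields the synthesized bytes. The 3-second default mirrors the example value given for the timer DB.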

A generated speech synthesis file can be stored in one of two ways: a buffering method, in which the file is stored in the call box 401 that requested the synthesis, and a filing method, in which the file is stored on the speech synthesis resource server 200.

In the buffering method, when the resource manager 202 receives a synthesis-complete response from the engine manager 205, it transmits the generated speech synthesis file stored in the resource DB 210 to the call box 401.

In the filing method, by contrast, when the resource manager 202 receives a synthesis-complete response from the engine manager 205, it checks the generated speech synthesis file stored in the resource DB 210 and transmits information indicating that synthesis is complete to the call box 401.
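
The difference between the two methods comes down to what the reply to the call box carries. The following sketch assumes a dictionary-based resource DB keyed by a request ID; DeliveryMode, build_response, and the reply fields (including the "path" field returned in the filing case) are hypothetical names introduced for illustration.

```python
from enum import Enum


class DeliveryMode(Enum):
    BUFFERING = "buffer"  # file is sent to and stored in the call box
    FILING = "file"       # file stays on the speech synthesis resource server


def build_response(mode: DeliveryMode, resource_db: dict, request_id: str) -> dict:
    """Build the reply sent to the call box after synthesis completes."""
    entry = resource_db.get(request_id)
    if entry is None:
        # No generated file was found for this request.
        return {"result": "incomplete"}
    if mode is DeliveryMode.BUFFERING:
        # Buffering: the audio itself travels back to the call box.
        return {"result": "complete", "audio": entry["audio"]}
    # Filing: only completion information is returned (path is an assumption).
    return {"result": "complete", "path": entry["path"]}
```

Buffering moves the storage burden to each call box, whereas filing keeps files central and sends back only a small completion message.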

The web manager 203 lets a user access and manage, over the web, the operation DB 211 in which the operating information of the speech synthesis resource server 200 is stored. That is, the user can connect over the web, without a dedicated terminal, to view and modify the contents of the operation DB 211.

When the resource manager 202 asks it to start a timer, the timer manager 204 starts the timer and, when the time set in the timer DB 212 (for example, 3 seconds) is reached, notifies the resource manager 202 that the set time has elapsed.

The engine manager 205 manages the creation and destruction of the synthesizer engines 206_1, 206_2, …, 206_n and reflects their state information in the various DBs 213_1, 213_2, …, 213_m.

The synthesizer engines 206_1, 206_2, …, 206_n are implemented so that, based on the information stored in the initialization DB 207, an engine is created when speech is to be synthesized and destroyed when synthesis is complete. Because hardware performance limits capacity, engines are created only up to the maximum number of engines (n) stored in the initialization DB 207, and the number of engines created is reflected in the channel DB 208.

When the resource manager 202 requests speech synthesis, the engine manager 205 decides whether a synthesizer engine 206_1, 206_2, …, 206_n can be created based on the information stored in the channel DB 208 and the initialization DB 207. If the call box 401 has supplied initialization data with its request, the synthesizer engine 206_i is created based on the initialization data received from the call box 401.
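
The creation decision and the choice of initialization data can be sketched as two small functions. Both function names are hypothetical, and merging call-box values over server defaults is an assumption: the description states only that the engine is created from the call box's initialization data when such data is supplied.

```python
def can_create_engine(busy_channels: int, max_channel: int) -> bool:
    """Allow a new engine only while fewer than MaxChannel engines exist."""
    return busy_channels < max_channel


def engine_params(server_defaults: dict, callbox_overrides=None) -> dict:
    """Pick the parameters used to create the engine.

    Values supplied by the call box take precedence; any parameter the
    call box omits falls back to the server default (an assumption).
    """
    params = dict(server_defaults)
    if callbox_overrides:
        params.update(callbox_overrides)
    return params
```

For example, with defaults {"Speed": 100, "Pitch": 100} and a call-box value {"Speed": 120}, the engine would be created with Speed 120 and Pitch 100.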

The synthesizer engines 206_1, 206_2, …, 206_n synthesize speech for the announcement using the synthesized-speech DBs 213_1, 213_2, …, 213_m, which store synthesized speech for specific characters or sentences, and thereby generate a speech synthesis file. The generated speech synthesis file is reflected in the resource DB 210.

The cache DB 209 buffers the speech synthesis files generated by the engine manager 205. When another call box requests speech synthesis under the same conditions, the speech synthesis file stored in the cache DB 209 is used without running a synthesizer engine.
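
A cache of this kind can be keyed on the sentence together with the synthesis parameters. The sketch below is illustrative only: the class SynthesisCache, the engine_runs counter, and the reading of "the same conditions" as an identical sentence plus identical parameters are assumptions.

```python
class SynthesisCache:
    """Hypothetical model of the cache DB (209)."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable that runs a synthesizer engine
        self._store = {}
        self.engine_runs = 0           # how often an engine was actually run

    def get(self, sentence: str, **params) -> bytes:
        """Return cached audio for identical conditions, else synthesize it."""
        key = (sentence, tuple(sorted(params.items())))
        if key not in self._store:
            self.engine_runs += 1
            self._store[key] = self._synthesize(sentence, **params)
        return self._store[key]
```

Two call boxes requesting the same sentence with the same SpeakerID and Speed trigger a single engine run; changing any parameter produces a separate cache entry.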

The resource DB 210 stores every state that follows the speech synthesis request from the call box 401 to the speech synthesis resource server 200. Specifically, it stores the request channel ID received from the call box 401 and the contents of the Info DB 403 of the call box 401. It also stores the speech synthesis file, information on whether the synthesis result is normal, and timer information.

The resource management server 300 includes a server manager 301 and a web IF 302 and, at the request of the call box 401, allocates the resources of the speech synthesis resource server 200 to be used for speech synthesis.

The server manager 301 receives a request for available-channel information for speech synthesis from the call box 401, checks the available channels of the speech synthesis resource server 200 based on the contents of the server DB 303, and provides the call box 401 with the server IP of an available channel.

The server DB 303 stores the state information of the speech synthesis resource server 200; specifically, it may store the same information as the channel DB 208 of the speech synthesis resource server 200.
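
The lookup performed by the server manager can be sketched as a table of reported idle-channel counts keyed by server IP. Everything named here is hypothetical: the class ServerManager, its methods, the policy of preferring the server with the most idle channels, and the example addresses, which are taken from the 192.0.2.0/24 documentation range.

```python
class ServerManager:
    """Hypothetical model of the server manager (301) and server DB (303)."""

    def __init__(self):
        # server IP -> idle channel count last reported by that server
        self.server_db = {}

    def update_status(self, server_ip: str, idle_channels: int) -> None:
        """Record a status report from a speech synthesis resource server."""
        self.server_db[server_ip] = idle_channels

    def allocate(self):
        """Return the IP of a server with an idle channel, or None.

        Preferring the server with the most idle channels is an assumed
        policy; the description only requires an available channel.
        """
        candidates = [(idle, ip) for ip, idle in self.server_db.items() if idle > 0]
        if not candidates:
            return None
        _, server_ip = max(candidates)
        return server_ip
```

The sketch does not decrement the count on allocation, because in the description the channel state is reported by the speech synthesis resource server itself.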

Like the web manager 203 of the speech synthesis resource server 200, the web IF 302 lets a user access and manage, over the web, the operation DB 304 in which the operating information of the speech synthesis resource server 200 is stored. The operation DB 304 contains the information of the various DBs 207 to 212 and 213_1, 213_2, …, 213_m in the speech synthesis resource server 200.

When the operator modifies the contents of the operation DB 304, the web IF 302 transmits the modification to the speech synthesis resource server 200, which applies it to the various DBs 207 to 212 and 213_1, 213_2, …, 213_m in the speech synthesis resource server 200. As a result, the speech synthesis resource server 200 and the resource management server 300 always hold the same information.

The business site system 400 includes a call box 401, a plurality of DBs 402 to 405, and a plurality of on-premises telephones 103_1, 103_2, …, 103_n.

The call box 401 is connected to the Internet and, based on a call setup request received over the Internet, performs a call setup procedure with one of the on-premises telephones 103_1, 103_2, …, 103_n. The call box 401 also performs call setup between the on-premises telephones 103_1, 103_2, …, 103_n.

When the call box 401 receives a speech synthesis request from the operator, it requests available-channel information from the resource management server 300 and receives the server IP information of an available channel from the resource management server 300. The call box 401 then requests speech synthesis by transmitting the server IP information of the available channel, the Info DB 403, and the sentence corresponding to the announcement to the speech synthesis resource server 200.

The speech synthesis request from the operator may be implemented as follows. When the operator indicates that speech synthesis is to be requested, the call box 401 displays a speech synthesis request screen for the announcement to the user and asks the user to fill in each area of the screen.

Thereafter, in the buffering method, the call box 401 receives the speech synthesis file from the speech synthesis resource server 200 and stores it in the TTS Wave DB 402. When a call setup request arrives from the Internet, it plays the speech synthesis file stored in the TTS Wave DB 402 to serve the announcement.

The speech synthesis request screen that the call box 401 displays to the user, asking for the contents of each area, may be implemented as shown in FIG. 3.

FIG. 3 shows a speech synthesis request screen according to an embodiment of the present invention. Referring to FIG. 3, the speech synthesis request screen 30 includes a voice title area 31, a voice file name area 32, a voice file location area 33, a voice production text area 34, a voice file format area 35, a voice actor selection area 36, a comment area 39, and a producer name area 40, as well as a voice production icon 37 and a preview icon 38.

The operator enters a title for the announcement in the voice title area 31 and the name to be given to the generated speech synthesis file in the voice file name area 32. The operator also enters the location where the generated speech synthesis file is to be stored in the voice file location area 33, and the text of the announcement to be synthesized in the voice production text area 34.

The operator further enters the coding format (Vox, Wav, …) of the speech synthesis file to be generated in the voice file format area 35, and the voice actor whose voice is to be used for synthesis in the voice actor selection area 36.

After filling in the areas 31 to 36, the operator clicks the voice production icon 37 to request speech synthesis from the call box 401. The call box 401 then requests speech synthesis by transmitting the contents entered in the areas 31 to 36, the Info DB 403, and the server IP information received from the resource management server 300 to the speech synthesis resource server 200.

Thereafter, in the buffering method, the call box 401, having received the speech synthesis file for the announcement from the speech synthesis resource server 200, stores the received file in the TTS Wave DB 402.

When the operator then clicks the preview icon 38, the call box 401 plays the speech synthesis file stored in the TTS Wave DB 402.

If, after listening to the playback, the operator has an opinion about the synthesized file, such as something that needs improvement, the operator enters it in the comment area 39 of the speech synthesis request screen 30, and the entry is stored in the comment DB 404 as a comment on that speech synthesis file. The operator may also enter his or her name in the producer name area 40.

Referring again to FIG. 2, the TTS Wave DB 402 stores the speech synthesis files for announcements synthesized by the speech synthesis resource server 200.

The Info DB 403 stores the communication protocol information used between the call box 401 and the speech synthesis resource server 200. Specifically, it stores: a flag distinguishing the buffering method from the filing method (BufferORFile); the total length of the sentence (MessageLength); the result of the speech synthesis processing (nResult); the coding format used when generating the speech synthesis file (nFormat); the received server IP (ServerIP); the synthesized-speech DB information needed to generate the speech synthesis file (SpeakerID); the volume of the synthesized speech (Volume); the speed (Speed); the pitch (Pitch); and the length of the speech synthesis file (TTSLength).
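
The Info DB fields map naturally onto a record that accompanies each request. The sketch below keeps the field names from the description, but the Python types, the default values, the byte-based MessageLength, the build_request helper, and the JSON encoding are all assumptions; the description does not define a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class InfoRecord:
    """Fields of the Info DB (403); the types are assumed."""
    BufferORFile: str   # "buffer" or "file"
    MessageLength: int  # total length of the sentence
    nResult: int        # synthesis result, filled in by the server
    nFormat: str        # coding format, e.g. "Vox" or "Wave"
    ServerIP: str       # server IP received from the resource management server
    SpeakerID: str      # selects the synthesized-speech DB
    Volume: int
    Speed: int
    Pitch: int
    TTSLength: int      # length of the generated file, filled in by the server


def build_request(sentence: str, server_ip: str, speaker_id: str = "speaker01") -> str:
    """Serialize a synthesis request as JSON (hypothetical encoding)."""
    info = InfoRecord(
        BufferORFile="buffer",
        MessageLength=len(sentence.encode("utf-8")),  # bytes, by assumption
        nResult=0,
        nFormat="Wave",
        ServerIP=server_ip,
        SpeakerID=speaker_id,
        Volume=100,
        Speed=100,
        Pitch=100,
        TTSLength=0,
    )
    return json.dumps({"info": asdict(info), "sentence": sentence}, ensure_ascii=False)
```

A call such as build_request("Welcome.", "192.0.2.11") produces a single message carrying both the protocol record and the announcement text, matching the three items the call box is described as sending.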

The comment DB 404 stores the operator's comments on the speech synthesis file generated by the speech synthesis resource server 200, specifically the content the operator entered in the comment area 69 of the speech synthesis request screen 60 of FIG. 3. The content stored in the comment DB 404 may later be used by the speech synthesis resource server 200 to tune the synthesized sound DBs 213_1, 213_2, …, 213_m.

The scenario DB 405 stores the format used when the operator requests speech synthesis. Specifically, this format may be the screen format of FIG. 3.

A speech synthesis method according to an embodiment of the present invention will now be described in detail with reference to the drawings.

FIG. 4 is a flowchart illustrating a speech synthesis method according to an embodiment of the present invention.

Referring to FIG. 4, the resource management server 300 and the speech synthesis resource server 200 reflect each other's state information (S401). During operation, the speech synthesis resource server 200 transmits the number of available channels to the resource management server 300, either periodically or irregularly. When information in the operation DB 304 changes, the resource management server 300 transmits the change to the speech synthesis resource server 200, so that state information is reflected on both sides.

For speech synthesis, the call box 401 first requests the available channel information of the speech synthesis resource server 200 from the resource management server 300 (S402). The resource management server 300 identifies an available channel based on the information stored in the operation DB 304 and transmits the server IP information of the available channel to the call box 401 (S403).
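Steps S401 to S403 can be pictured with a small in-memory sketch. It is not the patented implementation: the class and method names, the dictionary standing in for the operation DB 304, the policy of choosing the server with the most free channels, and the decrement on allocation are all assumptions made for illustration.

```python
from typing import Dict, Optional


class ResourceManagementServer:
    """Toy stand-in for the resource management server 300."""

    def __init__(self) -> None:
        # Stands in for the operation DB 304: server IP -> free channel count.
        self._free_channels: Dict[str, int] = {}

    def report_channels(self, server_ip: str, free: int) -> None:
        """S401: a speech synthesis resource server reports its free channels."""
        self._free_channels[server_ip] = free

    def request_available_channel(self) -> Optional[str]:
        """S402-S403: return the IP of a server with a free channel, or None.

        Picking the server with the most free channels and reserving one
        channel on it are assumptions, not taken from the source.
        """
        candidates = {ip: n for ip, n in self._free_channels.items() if n > 0}
        if not candidates:
            return None
        ip = max(candidates, key=candidates.get)
        self._free_channels[ip] -= 1
        return ip
```

In this sketch the call box would call `request_available_channel()` and then address its speech synthesis request to the returned IP.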

On receiving the server IP information, the call box 401 requests speech synthesis by transmitting the received server IP information, the Info DB 403, and the sentence corresponding to the announcement to the speech synthesis resource server 200 (S404).

On receiving the speech synthesis request, the speech synthesis resource server 200 produces a speech synthesis file based on the received information (S405) and, under the buffering scheme, transmits the generated speech synthesis file to the call box 401 (S406). The call box 401 stores the received speech synthesis file in the TTS Wave DB 402 and, when a call setup request later arrives from the Internet network, plays the speech synthesis file stored in the TTS Wave DB 402 to provide the announcement.

The above describes the case where the speech synthesis file is stored using the buffering scheme. The buffering scheme, however, has the problem that errors can occur during data transmission, and there are two ways to address this.

The first method is for the speech synthesis resource server 200 to have the same data structure as the Info DB 403 of the call box 401. The second method is to first check, based on the information in the Info DB 403, the amount of data to be received, and then receive it.

Looking at the second method in detail, when the call box 401 requests speech synthesis from the speech synthesis resource server 200, it first transmits the Info DB 403 and then transmits the sentence corresponding to the announcement. That is, the speech synthesis resource server 200 checks the length of the announcement sentence to be received from MessageLength in the Info DB 403, and then stays in reception mode, receiving the sentence, until the length stored in MessageLength has arrived.

Then, once all the data corresponding to the sentence length stored in MessageLength has been received from the call box 401, the speech synthesis resource server 200 synthesizes speech for the received announcement.

In addition, after the speech synthesis file for the announcement has been generated, the speech synthesis resource server 200 determines the size of the speech synthesis file stored in the cache DB 209, writes that size into TTSLength of the Info DB 403, and transmits the Info DB 403 to the call box 401.

The call box 401 then checks the value stored in TTSLength of the received Info DB 403 and stays in reception mode until it has received a speech synthesis file of that size. The second method can thus reduce data errors caused by buffering.
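The second method amounts to announcing a length first and then reading until exactly that much data has arrived. The following is a minimal sketch of that idea over a stream socket. As a simplification, a bare 4-byte length prefix stands in for the MessageLength or TTSLength value that the description carries inside the Info DB 403; the function names and the prefix format are assumptions.

```python
import socket
import struct


def recv_exact(sock: socket.socket, length: int) -> bytes:
    """Stay in 'reception mode' until exactly `length` bytes have arrived."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            # The peer closed the connection before sending everything.
            raise ConnectionError(f"expected {remaining} more bytes")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_with_length(sock: socket.socket, payload: bytes) -> None:
    """Send the announced length first, then the payload itself."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_with_length(sock: socket.socket) -> bytes:
    """Read the announced length, then read exactly that many bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

The same pair of helpers would serve both directions: the call box sending the sentence to the server, and the server returning the speech synthesis file.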

The above describes only the method of storing the speech synthesis file using the buffering scheme. However, by using BufferORFile of the Info DB 403, the buffering scheme and the filing scheme can both be implemented together.

FIG. 5 is a flowchart in which the buffering scheme and the filing scheme are implemented together according to an embodiment of the present invention.

Referring to FIG. 5, the call box 401 assigns either the buffering scheme or the filing scheme to BufferORFile of the Info DB 403 as the scheme to be performed (S501), and requests speech synthesis by transmitting the server IP information received from the resource management server 300, the Info DB 403, and the sentence corresponding to the announcement to the speech synthesis resource server 200 (S502).

On receiving the speech synthesis request, the speech synthesis resource server 200 determines, based on the value stored in BufferORFile of the Info DB 403, whether the buffering scheme or the filing scheme applies (S503).

In the case of the buffering scheme, the speech synthesis resource server 200 generates a speech synthesis file for the announcement and uses the cache DB 209 for the speech synthesis file (S504, S505). It then determines the size of the speech synthesis file and assigns that size to TTSLength of the Info DB 403 (S506, S507).

The speech synthesis resource server 200 then transmits the Info DB 403 to the call box 401, followed by the generated speech synthesis file (S508, S509). After the speech synthesis file has been transmitted to the call box 401, the synthesis engine that synthesized the speech for the announcement in the speech synthesis resource server 200 is destroyed (S510).

In the case of the filing scheme, the speech synthesis resource server 200 generates a speech synthesis file for the announcement and uses the cache DB 209 for the speech synthesis file (S511, S512). It then stores the speech synthesis file at the location held in SavedDirectory of the Info DB 403 and transmits the Info DB 403 to the call box 401 (S513, S514). After that, the synthesis engine that synthesized the speech for the announcement in the speech synthesis resource server 200 is destroyed (S515).
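The branch on BufferORFile in FIG. 5 can be sketched as follows. This is an illustrative sketch only: the numeric values of `BUFFERING` and `FILING`, the dictionary used for the Info DB, the placeholder `SynthEngine`, and the output file name `announcement.wav` are hypothetical. The sketch also releases the engine as soon as synthesis finishes, whereas the description destroys it after transmission (S510, S515).

```python
import os
from typing import Optional, Tuple

BUFFERING, FILING = 0, 1  # hypothetical encodings of BufferORFile


class SynthEngine:
    """Placeholder engine; a real one would consult a synthesized sound DB."""

    def synthesize(self, sentence: str) -> bytes:
        return b"RIFF" + sentence.encode("utf-8")  # dummy audio bytes


def handle_request(info: dict, sentence: str) -> Tuple[dict, Optional[bytes]]:
    """Return (Info DB reply, audio to transmit or None), per S503-S515."""
    engine = SynthEngine()                   # engine created for this request
    try:
        audio = engine.synthesize(sentence)  # S504 / S511
    finally:
        del engine                           # engine released (cf. S510 / S515)

    reply = dict(info)
    if info["BufferORFile"] == BUFFERING:
        reply["TTSLength"] = len(audio)      # S506-S507
        return reply, audio                  # S508-S509: header, then file
    if info["BufferORFile"] == FILING:
        path = os.path.join(info["SavedDirectory"], "announcement.wav")
        with open(path, "wb") as f:          # S513: keep the file server-side
            f.write(audio)
        return reply, None                   # S514: only the header goes back
    raise ValueError("unknown BufferORFile value")
```

Under the buffering scheme the caller transmits the reply header and then the audio; under the filing scheme only the header is returned and the file stays at the configured location.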

The embodiments of the present invention described above are not implemented only through an apparatus and a method. They may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, or through a recording medium on which such a program is recorded. Such an implementation can readily be carried out by a person skilled in the art to which the present invention pertains from the description of the embodiments above.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

FIG. 1 is a diagram conceptually illustrating a private branch exchange system according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a speech synthesis request screen according to an embodiment of the present invention.

Claims

A speech synthesis method in which a speech synthesis resource server, operating in conjunction with a resource management server, synthesizes speech for an announcement, the method comprising:

exchanging state information between the resource management server and the speech synthesis resource server, wherein the state information includes an available channel of the speech synthesis resource server and server IP information of the available channel;

when available channel information is requested by a call box, identifying, by the resource management server, an available channel of the speech synthesis resource server and transmitting server IP information of the available channel to the call box; and

when a speech synthesis request including the server IP information and a sentence corresponding to the announcement is received from the call box, generating, by the speech synthesis resource server, a speech synthesis file for the sentence using a synthesizer engine, wherein the synthesizer engine synthesizes speech using a synthesized sound DB that stores synthesized sounds for specific characters or sentences, and transmitting a result of the speech synthesis request to the call box.

The method of claim 1,

wherein the result of the speech synthesis request is the speech synthesis file, the method further comprising:

storing, by the call box, the speech synthesis file; and

when the call box receives a call setup request, playing the speech synthesis file to provide the announcement.

The method of claim 1,

wherein exchanging the state information comprises:

transmitting, by the speech synthesis resource server, information on available channels to the resource management server; and

when information stored in the resource management server changes, transmitting the change to the speech synthesis resource server.

The method of claim 1,

wherein transmitting the result of the speech synthesis request to the call box comprises:

receiving, by the speech synthesis resource server, information on the total length of the sentence from the call box; and

operating, by the speech synthesis resource server, in a reception mode until a sentence of the length indicated by the information on the total length has been received and, once all data corresponding to the total length of the sentence has been received, synthesizing speech for the data.

The method of claim 1,

wherein transmitting the result of the speech synthesis request to the call box comprises:

determining, by the speech synthesis resource server, the length of the speech synthesis file and transmitting the determined length of the speech synthesis file to the call box; and

operating the call box in a reception mode until a speech synthesis file of that length has been received.

The method of claim 1,

wherein transmitting the result of the speech synthesis request to the call box comprises:

receiving, by the speech synthesis resource server from the call box, as information on the manner of storing the speech synthesis file, one of a buffering scheme, in which the speech synthesis file is stored in the call box, and a filing scheme, in which the speech synthesis file is stored in the speech synthesis resource server.

The method of claim 1,

wherein transmitting the result of the speech synthesis request to the call box comprises:

creating, by the speech synthesis resource server, a synthesizer engine to generate the speech synthesis file for the sentence, and destroying the synthesizer engine after the speech synthesis file has been generated.

The method of claim 1,

wherein transmitting the result of the speech synthesis request to the call box further comprises:

when the speech synthesis resource server receives a filing scheme from the call box as information on the manner of storing the speech synthesis file, storing the speech synthesis file at a location preset in the speech synthesis resource server.

A private branch exchange apparatus comprising:

a workplace system including a call box that provides an announcement when a call setup request is received from an external user;

a resource management server that, when a request for available channel information is received from the call box, identifies an available channel of a speech synthesis resource server and transmits server IP information of the available channel to the call box; and

the speech synthesis resource server, which, when a speech synthesis request including the server IP information and a sentence corresponding to the announcement is received from the call box, generates a speech synthesis file for the sentence using a synthesizer engine, wherein the synthesizer engine synthesizes speech using a synthesized sound DB that stores synthesized sounds for specific characters or sentences, and transmits the speech synthesis file to the call box.

The private branch exchange apparatus of claim 9,

wherein the workplace system

further comprises a TTS Wave DB that stores the speech synthesis file.

The private branch exchange apparatus of claim 9,

wherein the speech synthesis resource server comprises:

a resource manager that transmits information on available channels to the resource management server; and

an engine manager that creates a synthesizer engine to generate the speech synthesis file for the sentence and destroys the synthesizer engine after the speech synthesis file has been generated.

The private branch exchange apparatus of claim 9,

wherein the resource management server comprises:

a server manager that receives from the call box a request for available channel information for speech synthesis and provides server IP information of the available channel to the call box; and

a WebIF that, when information stored in the resource management server changes, transmits the change to the speech synthesis resource server.