KR20050086229A

KR20050086229A - Sound effects inserting method and system using functional character

Info

Publication number: KR20050086229A
Application number: KR1020040012627A
Authority: KR
Inventors: 문장원; 문정훈
Original assignee: 엔에이치엔(주)
Priority date: 2004-02-25
Filing date: 2004-02-25
Publication date: 2005-08-30

Abstract

The present invention relates to a method and system for inserting sound effects using function characters. The sound effect insertion system according to the present invention is connected to a text terminal and a voice terminal through a network and includes: a data receiving unit that receives, from the text terminal, data containing function characters and dialogue text; a function character extraction unit that separates the function characters from the dialogue text and outputs them; a speech synthesis unit that converts the dialogue text into voice; a database that stores sounds corresponding to the function characters; and a function character sound synthesis unit that retrieves the sound corresponding to a function character from the database, combines it with the voice produced by the speech synthesis unit, and outputs the result.

Description

SOUND EFFECTS INSERTING METHOD AND SYSTEM USING FUNCTIONAL CHARACTER

The present invention relates to a method and system for inserting sound effects and, more particularly, to a method and system capable of inserting a sound effect corresponding to a function character into voice or text.

One of the greatest obstacles to using speech synthesis technology in practical applications has been the expression of emotion. Although emotion accounts for a considerable part of what a conversation conveys, voice produced by a speech synthesis engine generally has a monotonous or constant tone and is colorless, so it conveys no emotion at all to the listener.

There have also been efforts to use the meaning of words in a sentence, or specific function characters, during speech conversion, but these too have not been turned into practical technology because of technical limitations.

Meanwhile, on the Internet, various methods have been devised for expressing emotion in text-based conversations that consist only of plain text, the most representative being the use of emoticons. Adding emoticons before and after a sentence lets users express their current mood better, and technology has even appeared that uses emoticons to change an avatar's conversational appearance automatically.

However, in such text messaging, expressive power is limited to plain text or to the characters or pictures represented by emoticons, so the speaker's intent is not conveyed fully.

One technical objective of the present invention is, when text is delivered as voice, to insert the sound effect corresponding to a function character into the voice stream produced by mechanical speech conversion, thereby enriching the converted voice and improving how well it communicates.

A further objective is, in text messaging, to add function characters representing sound effects to a sentence before it is delivered, thereby enriching the expressiveness of the speaker's conversation and improving how well it communicates.

To achieve these objectives, a sound effect insertion system according to one aspect of the present invention is connected to a text terminal and a voice terminal through a network and includes: a data receiving unit that receives, from the text terminal, data containing a function character and dialogue text; a function character extraction unit that separates the function character and the dialogue text contained in the data and outputs them; a speech synthesis unit that converts the dialogue text into voice; a database that stores a sound corresponding to the function character; and a function character sound synthesis unit that retrieves the sound corresponding to the function character from the database, combines it with the voice converted by the speech synthesis unit, and outputs the result.

In the sound effect insertion system according to one aspect of the present invention, the function character includes information about the start position, type, ID, and duration of the sound.

In the sound effect insertion system according to one aspect of the present invention, the function character sound synthesis unit determines the start time at which the sound is to be synthesized by an arithmetic calculation on the duration of the converted voice and the start position of the sound.

A sound effect insertion method using function characters according to one aspect of the present invention includes: receiving data containing a function character and dialogue text; separating the function character and the dialogue text contained in the data and outputting them; displaying the dialogue text; retrieving a sound corresponding to the function character; and playing the sound.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings.

In the drawings, parts unrelated to the present invention are omitted to keep the description clear. The description below first covers a first embodiment, in which the sound effect insertion method according to an embodiment of the present invention is applied to a speech conversion system that converts text into voice for delivery, and then a second embodiment, in which the method is applied to text messaging.

FIG. 1 shows a sound effect insertion system using function characters according to the first embodiment of the present invention.

As shown in FIG. 1, the sound effect insertion system (300) using function characters according to an embodiment of the present invention is connected through a network to a dialogue text input device (100) and a voice output device (200).

The dialogue text input device (100) transmits the text that is to be delivered as voice to the sound effect insertion system (300), and includes a function for inserting function characters into the text manually or automatically.

The dialogue text input device (100) is a terminal on which a messenger program is installed or which can connect to the Internet and carry out web chat. FIG. 1 shows the device as a computer, but depending on the embodiment various kinds of text messaging tools may be used, such as a mobile terminal or a dedicated client chat tool.

The voice output device (200) is a terminal capable of outputting the voice transmitted from the sound effect insertion system (300), for example a computer equipped with a speaker or a mobile terminal.

For convenience, the dialogue text input device (100) is hereinafter called the "text terminal" and the voice output device (200) the "voice terminal". Likewise, dialogue text that has been converted into audio is called "voice", and the audio corresponding to a function character is called "sound".

The sound effect insertion system (300) according to an embodiment of the present invention includes a text receiving unit (310), a function character extraction unit (320), a TTS (text-to-speech) processing unit (330), a speech synthesis unit (340), a function character sound synthesis unit (350), a function character sound database (360), and an interface unit (370).

When the text receiving unit (310) receives text from the text terminal (100) through the network, it forwards the received text to the function character extraction unit (320).

The function character extraction unit (320) determines whether the received text contains a function character and, if so, extracts the function character and sends it to the function character sound synthesis unit (350). The dialogue text that remains after the function character is removed is sent to the TTS processing unit (330).
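The separation performed by the function character extraction unit can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expression follows the `<FT ...>` notation of the examples in Table 1 with the attributes in the order type, id, duration, and the names `FT_PATTERN`, `FunctionCharacter`, and `extract` are hypothetical, not taken from the patent. The start position is recorded here as a character offset into the dialogue text, which is one possible reading of "start position".

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for the <FT type="..." id=N duration=Ns> ... </FT> notation.
FT_PATTERN = re.compile(
    r'<FT\s+type="(?P<type>\w+)"\s+id=(?P<id>\d+)\s+duration=(?P<dur>\d+)s>'
    r'(?P<body>.*?)</FT>',
    re.DOTALL,
)


@dataclass
class FunctionCharacter:
    start: int       # assumed: character offset into the dialogue text
    kind: str        # e.g. "VOICE", "MUSIC", "SOUND"
    sound_id: int
    duration_s: int


def extract(message):
    """Return (dialogue_text, function_characters) for one received message."""
    parts = []
    found = []
    pos = 0          # read position in the raw message
    out_len = 0      # length of the dialogue text built so far
    for m in FT_PATTERN.finditer(message):
        before = message[pos:m.start()]
        parts.append(before)
        out_len += len(before)
        found.append(
            FunctionCharacter(out_len, m["type"], int(m["id"]), int(m["dur"]))
        )
        # Text enclosed by the tag is still dialogue text and is kept.
        parts.append(m["body"])
        out_len += len(m["body"])
        pos = m.end()
    parts.append(message[pos:])
    return "".join(parts), found


text, chars = extract('Ah.. <FT type="SOUND" id=10 duration=5s> I am hungry </FT> Let us eat.')
```

In this sketch the dialogue text would go on to the TTS stage and the list of `FunctionCharacter` records to the sound synthesis stage.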

The TTS processing unit (330) sends the incoming dialogue text to the speech synthesis unit (340) so that the text is converted into the corresponding voice. It also receives the converted voice and its duration from the speech synthesis unit (340) and forwards them to the function character sound synthesis unit (350).

The speech synthesis unit (340) converts input text into voice. Its internal configuration and operation are well known in the art, so a detailed description is omitted here.

The function character sound synthesis unit (350) retrieves the sound corresponding to the input function character from the function character sound database (360) and combines the sound with the voice. In doing so, it computes the temporal position at which the sound should be synthesized and mixes the sound into the voice at that position.
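Combining a sound with the voice at a given temporal position can be illustrated with a simple additive mix over sample lists. The patent does not specify the mixing technique, so the sketch below is an assumption: the function name `overlay`, the use of plain Python lists of float samples, and the additive mixing without gain control are all hypothetical choices.

```python
def overlay(voice, sound, start_s, duration_s, sample_rate):
    """Mix `sound` into `voice` starting at `start_s` seconds.

    At most `duration_s` seconds of the sound are used. If the sound runs
    past the end of the voice, the output is padded with silence.
    """
    start = max(0, int(round(start_s * sample_rate)))
    count = min(len(sound), int(round(duration_s * sample_rate)))
    mixed = list(voice)
    shortfall = start + count - len(mixed)
    if shortfall > 0:
        mixed.extend([0.0] * shortfall)
    for i in range(count):
        mixed[start + i] += sound[i]  # simple additive mix, no clipping control
    return mixed


# Two samples per second keeps the example easy to follow by hand.
mixed = overlay([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0], start_s=0.5, duration_s=1.0, sample_rate=2)
```

A production mixer would also need to handle differing sample rates and prevent clipping; those concerns are outside this sketch.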

The function character sound database (360) stores the sounds corresponding to function characters. Depending on the embodiment, the sounds may be stored in separate databases according to the type of function character, or in a single database divided into separate regions.
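The single-database variant, with one region per function character type, might look like the following in-memory sketch. The class name `FunctionSoundDatabase`, its methods, and the file name used in the example are hypothetical; an actual system would presumably use persistent storage.

```python
class FunctionSoundDatabase:
    """In-memory stand-in: one region (dict) per function character type."""

    def __init__(self):
        self._regions = {}  # type name -> {sound id -> stored clip}

    def store(self, kind, sound_id, clip):
        # Create the region for this type on first use.
        self._regions.setdefault(kind, {})[sound_id] = clip

    def find(self, kind, sound_id):
        # Returns None when either the type or the ID is unknown.
        return self._regions.get(kind, {}).get(sound_id)


db = FunctionSoundDatabase()
db.store("SOUND", 10, "stomach_growl.wav")  # hypothetical clip name
clip = db.find("SOUND", 10)
```

Keying first by type means the same numeric ID can be reused across types without collision, which is one reason to partition the storage this way.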

When the voice terminal (200) and the text terminal (100) use different networks, the interface unit (370) performs the protocol conversion and interface matching needed for the two terminals to connect to each other. For example, the interface unit (370) may be used when the text terminal (100) is a computer connected to the Internet and the voice terminal (200) is a mobile terminal on a wireless communication network.

In FIG. 1, the network connecting the text terminal (100), the voice terminal (200), and the sound effect insertion system (300) includes wired and wireless networks.

The function character, and the sound synthesis method that uses it, according to an embodiment of the present invention are described in more detail below.

Table 1 shows an example of the structure of function characters according to an embodiment of the present invention.

1) If you mess with me, <FT type="VOICE" id=100 duration=3s> you're dead!! </FT>
2) Hee-sook, I love you <FT type="MUSIC" id=99 duration=20s> </FT>
3) Ah.. <FT type="SOUND" id=10 duration=5s> I'm hungry </FT> Let's eat.

According to an embodiment of the present invention, a function character may consist of the start position, type, ID, and duration of a sound. Table 1 shows the case in which the sound types are divided into "VOICE", a voice function sound that produces a specific voice effect; "MUSIC", a music function sound that outputs ordinary music; and "SOUND", an effect function sound that outputs a specific sound effect (for example, a door closing, a chicken, a growling stomach, or a stream).

In Table 1, the first example causes the sound corresponding to the function character to play during a specific span of text. It is a voice function sound with ID 100 and a duration of 3 seconds.

The second example causes the sound corresponding to the function character to play after "Hee-sook, I love you". It is a music function sound with ID 99, and the music lasts for 20 seconds.

The third example causes the sound corresponding to the function character, such as a growling stomach, to play after "Ah..". It is an effect function sound with ID 10, and the sound lasts for 5 seconds.

For convenience, Table 1 shows function characters expressed in XML, but they may also be expressed in various other ways, using special symbols such as emoticons or other languages.

Specifically, the user may type the XML representing a function character manually, or the text terminal (100) may provide a GUI that lets the user enter function characters using a combination of mouse and keyboard.

FIG. 2 is a flowchart of the sound effect insertion method using function characters according to the first embodiment of the present invention.

As shown in FIG. 2, when text is received (S201), the function character extraction unit (320) extracts the function character (S202). It then sends the extracted function character to the function character sound synthesis unit (350) and the dialogue text, with the function character removed, to the TTS processing unit (330).

Next, the TTS processing unit (330) sends the dialogue text to the speech synthesis unit (340) (S203) so that the text is converted into voice. It then receives the converted voice and its duration (S204) and forwards them to the function character sound synthesis unit (350) (S205).

Meanwhile, the function character sound synthesis unit (350) retrieves the sound corresponding to the input function character from the function character sound database (360) (S206) and receives the converted voice and its duration from the TTS processing unit (330) (S207).

The function character sound synthesis unit (350) then performs an arithmetic calculation on the duration of the voice and the start position of the sound (S208) to obtain the start time of the sound corresponding to the function character, and mixes the sound into the converted voice for the sound's duration (S209).
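The patent says only that the start time is obtained arithmetically from the voice duration and the start position of the sound; it does not give the formula. One plausible reading, shown below purely as an assumption, is a linear proportion: if the function character sits at character offset `start_offset` in dialogue text of length `text_length`, the sound starts at the same fraction of the voice duration. The function name and parameters are hypothetical.

```python
def sound_start_time(start_offset, text_length, voice_duration_s):
    """Estimate when the sound should start within the converted voice.

    Assumed rule (not stated in the patent): the start time is proportional
    to the character offset of the function character in the dialogue text.
    """
    if text_length <= 0:
        return 0.0
    # Clamp so a malformed offset cannot place the sound outside the voice.
    ratio = min(max(start_offset / text_length, 0.0), 1.0)
    return voice_duration_s * ratio


# A function character a quarter of the way through 20 characters of text,
# spoken in 4 seconds, would start 1 second in under this rule.
t = sound_start_time(5, 20, 4.0)
```

A real speech engine does not speak every character at the same rate, so a finer-grained alignment (for example, per-word timings from the synthesizer, where available) would give better placement than this linear estimate.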

As a result, voice with the sound effect corresponding to the function character inserted is output, and in real-time chat, such as between a text user and a voice call, the person entering the text can express their emotions appropriately.

FIG. 3 shows a sound effect insertion system using function characters in text messaging according to the second embodiment of the present invention.

According to the second embodiment of the present invention, the sound effect insertion system (600) using function characters is connected to user terminals (400, 500) through a network.

User terminal (400) is a terminal at which dialogue text and function characters can be entered and transmitted over the network. User terminal (500) is an information system capable of receiving and processing text that also includes a text viewer function and a sound player function. Besides computers, mobile terminals such as PDAs and mobile phones may serve as the user terminals (400, 500).

The sound effect insertion system (600) according to the second embodiment of the present invention includes a function character server (610) and a function character sound database (620).

When user terminal (500) requests transmission of the sound corresponding to a function character, the function character server (610) searches the function character sound database (620) and transmits the corresponding sound to user terminal (500).
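The request handling of the function character server can be sketched without any networking as a lookup that returns either the stored sound or a not-found result. The class name `FunctionCharacterServer`, the dictionary-shaped response, and the `(type, id)` key are hypothetical; the patent does not describe a wire protocol.

```python
class FunctionCharacterServer:
    """Answers sound requests from a backing store keyed by (type, id)."""

    def __init__(self, database):
        self._database = database  # assumed: mapping (type, id) -> audio bytes

    def handle_request(self, kind, sound_id):
        clip = self._database.get((kind, sound_id))
        if clip is None:
            # The terminal can then display the text without playing a sound.
            return {"status": "not_found"}
        return {"status": "ok", "sound": clip}


server = FunctionCharacterServer({("MUSIC", 99): b"placeholder-audio-bytes"})
reply = server.handle_request("MUSIC", 99)
```

Returning an explicit not-found result rather than raising lets the receiving terminal degrade gracefully when a sender uses an ID the server does not know.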

According to an embodiment of the present invention, the function character sound database (620) may, as described above, comprise a plurality of databases according to the type of function character, or may store the sounds by type in separate regions of a single database.

The sound effect insertion system (600) according to an embodiment of the present invention may further include an interface unit for protocol conversion and interface matching when user terminal (400) and user terminal (500) use different networks.

In that case, the function character server (610) transmits the sound retrieved from the function character sound database (620) to user terminal (500) through the interface unit.

FIG. 4 shows the internal configuration of user terminal (500) according to an embodiment of the present invention.

As shown in FIG. 4, user terminal (500) includes a text input unit (510), a function character extraction unit (520), a sound extraction unit (530), a text viewer (540), and a sound player (550).

The text input unit (510) takes in the received text and passes it to the function character extraction unit (520). The function character extraction unit (520) extracts the function character from the received text, sends the function character to the sound extraction unit (530), and sends the dialogue text, with the function character removed, to the text viewer (540).

The sound extraction unit (530) connects to the function character server (610) through the network, requests transmission of the sound corresponding to the function character, receives the sound from the function character server (610), and passes it to the sound player (550).

As a result, the dialogue text is shown in the text viewer (540) of user terminal (500), and the sound corresponding to the function character is played through the sound player (550).

According to an embodiment of the present invention, to emphasize the sound effect, the text viewer (540) may display the text at the speed of human speech instead of displaying all of the received text at once as an ordinary messenger or web chat does. The sound player (550) then plays the sound corresponding to the function character at the point in the text where the function character appears, which improves how well the message communicates.
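Paced display with sounds triggered at the function character positions can be sketched as a timeline of events. The function `build_schedule`, the fixed `chars_per_second` rate, and the `(offset, sound_id)` cue format are assumptions made for illustration; the patent states only that text is shown at speaking speed and the sound plays where the function character occurs.

```python
def build_schedule(text, cues, chars_per_second=5.0):
    """Build a time-ordered list of display and playback events.

    text: dialogue text with function characters already removed.
    cues: list of (char_offset, sound_id) pairs for the function characters.
    Returns tuples of (time_s, "char" or "sound", value).
    """
    events = []
    for index, ch in enumerate(text):
        events.append((index / chars_per_second, 1, "char", ch))
    for offset, sound_id in cues:
        # Priority 0 so a sound fires just before the character at its offset.
        events.append((offset / chars_per_second, 0, "sound", sound_id))
    events.sort(key=lambda e: (e[0], e[1]))
    return [(t, kind, value) for t, _, kind, value in events]


schedule = build_schedule("Hi!", [(2, 10)], chars_per_second=1.0)
```

A viewer loop would walk this list, waiting until each event's time before either appending the character to the display or handing the sound ID to the player.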

The description above has the sound effect insertion system (600) connected between user terminal (400) and user terminal (500) through the network and transmitting the sound corresponding to the function character to user terminal (500). However, the function character sound database (620) may instead be built into user terminal (500), without a separate sound effect insertion system (600).

When the sound synthesis method according to an embodiment of the present invention is used in text messaging, as in the second embodiment, the types of function character may further include a multimedia effect (stage) that combines various sound effects, such as voice, music, and other sounds, with animation effects.

For example, if a stage-type function character like the one below is used in a messenger chat, the stage effect with ID '01', an item related to birthday greetings, is played in the other party's messenger window or on their desktop.

Example) Happy birthday <FT type="STAGE" id=01 duration=15s></FT>

Examples of stage effects include interactive multimedia effects built from a combination of entertaining animations, such as firecrackers going off while birthday music plays, or the screen darkening while candles appear on a cake for the recipient to blow out.
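On the receiving terminal, the function character type decides which component handles the effect. The sketch below routes audio-only types to a sound player and the stage type to a combined sound and animation player. The function `plan_playback`, the returned dictionary, and the `"animation"` flag are hypothetical illustrations, not part of the patent.

```python
AUDIO_TYPES = {"VOICE", "MUSIC", "SOUND"}


def plan_playback(kind, effect_id, duration_s):
    """Decide which (hypothetical) player should handle a function character."""
    if kind in AUDIO_TYPES:
        return {"player": "sound", "id": effect_id, "duration_s": duration_s}
    if kind == "STAGE":
        # A stage effect combines sound with animation in the chat window.
        return {
            "player": "stage",
            "id": effect_id,
            "duration_s": duration_s,
            "animation": True,
        }
    raise ValueError(f"unknown function character type: {kind}")


plan = plan_playback("STAGE", 1, 15)
```

Rejecting unknown types explicitly makes it easier to add further effect types later without silently mishandling them.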

In this way, by giving users function characters as symbols that can express voice or multimedia effects, such as voice effects, sound effects, music effects, and stage effects, in text-to-text conversation, the invention lets people chatting by text express their emotions appropriately and so raises the quality of the chat service.

The sound synthesis system and method using function characters according to embodiments of the present invention have been described above. The embodiments described are examples of applying the present invention; the scope of the invention is not limited to them, and many different embodiments may be formed using the same concept.

Each of the steps according to the present invention may be implemented in software or hardware in various ways using general programming techniques. Some steps of the present invention may also be implemented as computer-readable code on a computer-readable recording medium.

A computer-readable recording medium includes every kind of recording device in which data readable by a computer system is stored. Examples include ROM, RAM, CD-ROM, CD-RW, magnetic tape, floppy disks, HDDs, optical disks, and magneto-optical storage devices, as well as implementations in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed across computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner.

According to the present invention, in a system that delivers text as voice, when the text contains a function character, the sound effect corresponding to the function character is inserted into the voice stream produced by mechanical speech conversion, which enriches the converted voice and improves how well it communicates.

In addition, in text messaging, adding function characters representing sound effects to a sentence before it is delivered enriches the expressiveness of the speaker's conversation and improves how well it communicates, thereby improving the quality of the service.

FIG. 4 shows the internal configuration of a user terminal according to an embodiment of the present invention.

Claims

A sound effect insertion system using a function character, connected to a text terminal and a voice terminal through a network, the system comprising:

a data receiving unit that receives, from the text terminal, data including a function character and dialogue text;

a function character extraction unit that separates the function character and the dialogue text included in the data and outputs them;

a speech synthesis unit that converts the dialogue text into voice;

a database that stores a sound corresponding to the function character; and

a function character sound synthesis unit that extracts the sound corresponding to the function character from the database, combines it with the voice converted by the speech synthesis unit, and outputs the result.

The system of claim 1,

wherein the function character includes information about the start position, type, ID, and duration of the sound.

The system of claim 1,

wherein the function character sound synthesis unit determines the start time at which the sound is to be synthesized by an arithmetic calculation on the duration of the converted voice and the start position of the sound.

A sound effect insertion method using a function character, the method comprising:

receiving data including the function character and dialogue text;

separating the function character and the dialogue text included in the data and outputting them;

displaying the dialogue text;

extracting a sound corresponding to the function character; and

playing the sound.

The method of claim 1,

wherein displaying the dialogue text includes outputting the dialogue text at a constant speed.