KR100533217B1

KR100533217B1 - A headphone apparatus with gentle function using signal processing for prosody control of speech signals

Info

Publication number: KR100533217B1
Application number: KR10-2002-0082961A
Authority: KR
Inventors: 배명진
Original assignee: 배명진
Priority date: 2002-12-23
Filing date: 2002-12-23
Publication date: 2005-12-05
Also published as: KR20030009263A

Abstract

The present invention relates to a kind headphone apparatus that improves on the function of headphones, which are very widely used for listening to audio in everyday life, by applying signal processing for speech conversion of voice signals.

To this end, where a listener uses headphones to talk with a speaker or to listen to sound from an audio device, the invention provides a voice conversion function unit as part of an internal or external signal-processing computer processor of existing headphones; when the speaker's voice sounds unkind, pressing a kind key button causes this unit to convert the voice so that it sounds kind. A phonological meaning information and prosodic individuality information extracting unit is connected to a phonological information sustaining unit and to a prosodic information preserving unit, and these two units are connected to an earphone through a sound synthesis unit.

Description

Kind headphone device by signal processing for sound signal conversion {A HEADPHONE APPARATUS WITH GENTLE FUNCTION USING SIGNAL PROCESSING FOR PROSODY CONTROL OF SPEECH SIGNALS}

The present invention relates to a kind headphone apparatus using signal processing for speech conversion of voice signals. It newly improves the way sound is heard over headphones during Internet telephone calls, landline calls, mobile phone calls, audio listening, and the like, and can be classified as speech processing technology within the field of voice or audio signal processing.

The sounds heard from currently available mobile phones and audio devices vary widely. Hurried speech, swearing, speech mixed with dialect, unclear voices, and the like can upset the listener. Existing audio headphones deliver the voice coming from the audio device to the ear through the receiver exactly as it is.

In particular, hearing-impaired or elderly listeners have reduced hearing and may not follow speech well even when it is delivered at an average speaking rate. With the conventional approach, the character or unkindness carried in the sound is passed on to the listener unchanged, so the listener sometimes feels considerable discomfort or stress.

The present invention was devised in view of these circumstances. Its object is to provide a kind headphone apparatus using signal processing for speech conversion of voice signals, which proposes a new listening method: by means of a kind key button and a voice conversion function unit, digital speech processing is applied to the speaker's voice heard through the headphone receiver so that it sounds slow and kind.

To achieve this object, where a listener uses headphones to talk with a speaker or to listen to sound from an audio device, the invention provides a voice conversion function unit as part of an internal or external signal-processing computer processor of existing headphones; when the speaker's voice sounds unkind, pressing a kind key button causes this unit to convert the voice so that it sounds kind. A phonological meaning information and prosodic individuality information extracting unit is connected to a phonological information sustaining unit and to a prosodic information preserving unit, and these two units are connected to an earphone through a sound synthesis unit. Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram for explaining the principle of the kind headphone apparatus of the present invention. When the listener presses the kind key button 101 (or a designated key button) attached to the voice conversion function unit 104, the speaker's voice is converted into a kind, clearly articulated voice and is heard slowly.

On the input side, the headphone plug 105 and the microphone of the headset 103 are each connected to the existing headphone function unit 102. On the output side, the kind key button 101 and the earphone of the headset 103 are each connected through the voice conversion function unit 104.

The kind headphone apparatus of the present invention analyzes the sound information arriving through the headphone plug 105, leaves the speaker's individuality information as it is, and lengthens the phonological information that carries meaning. Just as slow motion is realized in video, this realizes a slow-listening function for sound.

When the speaker talks quickly or in a strongly accented dialect, the listener can select the kind headphone function according to preference and hear the speaker's words as a clear, clean sound. The stress caused by hurried and unkind conversation can therefore be avoided.

Because the kind headphone function plays speech slowly and distinctly, it can serve as an essential function in the field of welfare communication. In addition, service workers who receive an unspecified number of customers over wired or wireless telephone lines, such as staff at local government offices, fire stations, and company information desks, are under considerable stress because of the many different ways customers speak. In such cases too, the kind headphone apparatus of the present invention has the advantage of presenting the customer's voice kindly and calmly.

In the present invention configured as above, the underlying principle is as follows. The human voice is produced when air from the lungs makes the vocal cords vibrate, generating a vibrating sound, and this vibration resonates as it passes through the vocal tract. Within the voice, the period of the vocal cord vibration and the speaker's speaking habits express the speaker's individuality, while the resonance characteristics of the vocal tract mainly express the phonological information that conveys the meaning of the message. When the vocal tract resonance characteristics that carry the meaning of the message are repeated and thereby emphasized along the time axis, the voice is heard slowly and distinctly. The invention applies this principle to headphone listening.

When we speak into an audio device such as a mobile phone or a recorder, the sound wave signal is compressed by digital processing and delivered to the receiver over a medium or a transmission channel. The receiver decompresses the digital audio signal, converts it into sound waves, and delivers the sound to our ears through a speaker or headphones. Most headphones thus either act simply as a receiver that converts the electrical signal of the receiving audio device into sound waves, or contain a high-performance computer chip and additionally provide services such as text display and audio control.

The kind headphone apparatus of the present invention adds to the existing headphone function unit 102 a voice conversion function unit 104 that performs the kind function on a signal-processing computer chip. When the headphone plug 105 is connected to the existing headphone function unit 102, sound is heard through the receiver; if the sound seems hurried or unclear, the listener presses the kind key button 101 of the voice conversion function unit 104. The kind key button 101 may be fitted as a separate key button, or an arbitrarily designated existing key button may be used.

When the kind key button 101 is pressed, the computer chip of the voice conversion function unit 104 processes the speaker's voice so that it is heard, for example, about twice as slowly, allowing the speech to be heard clearly and distinctly. If the listener then presses the kind key button 101 once more, the listening speed returns to the original speed.

FIG. 2 is a block diagram for explaining the synthesis of the kind sound in the kind headphone apparatus of the present invention. A computer chip attached internally or externally to existing headphones analyzes the voice and synthesizes it with a longer speech duration while keeping the speaker's voice characteristics intact. Specifically, in the voice conversion function unit shown in FIG. 1, the phonological meaning information and prosodic individuality information extracting unit 202 is connected to the phonological information sustaining unit 205 and to the prosodic information preserving unit 206, and these two units are connected to the earphone 208 through the sound synthesis unit 207.

The voice is produced by the vibration of the vocal cords and resonance in the throat. Using this principle of voice production, the characteristics of the voice are left as they are in the phonological information sustaining unit 205, and only the meaning information is extracted and repeated in the prosodic information preserving unit 206. When the result is synthesized by the sound synthesis unit 207, the voice heard through the earphone 208 becomes slow, clear, and kind.

The core technique of the kind headphone apparatus is that the phonological meaning information and prosodic individuality information extracting unit 202 automatically separates a person's voice into phonological information, which expresses the meaning of what is said, and prosodic information, which expresses individuality. By preserving the individuality while sustaining the phonological information, the apparatus increases the kindness of the voice heard through the earphone 208.

FIG. 3 shows an apparatus that receives an analog voice signal from a mobile phone, audio device, or the like and converts it into a kind voice. A computer processor 304, which is a digital signal processor (DSP) or a general-purpose CPU with a memory 305 and a peripheral device 309 connected to it, sits at the center. An amplifier 301, a low-pass filter (LPF) 302, and an analog-to-digital converter (ADC) 303 form the path for the transmitter-side analog input S(t). A digital-to-analog converter (DAC) 308, a low-pass filter 307, and an amplifier 306 form the path for the receiver-side analog output U(t). Volume controllers 311 and 312 are fitted to the amplifiers 301 and 306.

The voice waveform input in analog form is amplified to the desired level by the amplifier 301 and then passed through the low-pass filter 302 to remove aliasing. It then passes through the analog-to-digital converter 303, which performs quantization and coding, and becomes a digital signal in linear pulse code modulation (PCM) form. This signal is processed by software or firmware in the computer processor 304, which is a general-purpose CPU or a digital signal processor (DSP).

During signal processing, the computer processor 304 may refer to the peripheral device 309 installed inside or outside it, and may use the memory 305 to store the input digital signal or the processing results.

The digital signal that the software on the CPU has converted into a kind sound is converted by the digital-to-analog converter 308 into a sampled analog signal. Passing this signal through the low-pass filter 307 yields an analog signal from which quantization noise has been removed, and suitable amplification by the amplifier 306 yields an analog signal that can be heard through the headphone receiver, a speaker, or the like. The overall sound level is adjusted with the volume controllers 311 and 312 attached to the amplifiers 301 and 306.

FIG. 4 is a processing flowchart for explaining the kind headphone apparatus of the present invention. The apparatus equips existing headphones with a hardware system as shown in FIG. 3 and loads onto its CPU chip software or firmware that implements the kind function with the processing procedure shown in FIG. 4. Within the existing headphone function, the software checks whether the kind key button (or an arbitrarily designated key button) has been pressed. If it has not been pressed, the input sound is simply passed to the output as in existing headphones, or, for digital control, sound transfer communication is performed (step 402). The kind key button is implemented as a software toggle switch: pressing it once turns the function on, and pressing it again turns the function off.
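The toggle and pass-through behavior described for step 402 can be sketched as follows. This is only an illustration under stated assumptions: the names `KindToggle` and `route_frame` and the idea of passing the kind processing in as a callable are hypothetical and do not appear in the patent.

```python
class KindToggle:
    """Software toggle switch: each key press flips the kind function on or off."""

    def __init__(self) -> None:
        self.enabled = False  # kind function is off until the key is pressed

    def press(self) -> bool:
        """Register one press of the kind key button and return the new state."""
        self.enabled = not self.enabled
        return self.enabled


def route_frame(frame, toggle, kind_process):
    """Return the frame unchanged when the toggle is off (pass-through),
    otherwise return the result of the supplied kind processing callable."""
    if toggle.enabled:
        return kind_process(frame)
    return list(frame)


toggle = KindToggle()
unchanged = route_frame([1, 2, 3], toggle, lambda f: f + f)  # toggle off
toggle.press()
processed = route_frame([1, 2, 3], toggle, lambda f: f + f)  # toggle on
```

Keeping the toggle state separate from the processing function mirrors the text: the key only selects between the existing headphone path and the kind function path.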

When the kind key button has been pressed and the kind headphone function starts, the data samples input from the analog-to-digital converter (ADC) (step 401) are processed together one frame at a time. First, it is determined whether the data in the current frame belongs to a voiced interval. If it does not (step 404), the occupancy rate of the ring buffer (buffer rate, BR) is calculated. For the speaker's voice to sound kind, it must be played more slowly than the speaker's actual speaking rate; the memory buffer needed to hold the processed data while it waits is called the ring buffer (step 410).

The ring-buffer occupancy rate (BR) represents the proportion of time that data processed by the kind function waits in the ring buffer. If the current frame is an unvoiced interval and the waiting time in the ring buffer has exceeded a set value (for example, BT = 1.5 or more), the speech duration is shortened so as to speed up playback (step 408). This removes the speech delay that accumulates while the kind function is running. In other words, in voiced intervals the data is output slowly so that the speech sounds kind and distinct, while in unvoiced intervals it is output faster so that the overall time delay is made up.
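The catch-up rule for unvoiced frames (steps 404, 408, and 410) can be sketched as below. The patent does not define BR precisely, so this sketch assumes BR is the number of queued samples divided by a reference length; the class `RingBuffer`, the function `enqueue_unvoiced`, and the `keep_ratio` parameter are hypothetical names introduced only for illustration.

```python
from collections import deque


class RingBuffer:
    """Queue of processed samples waiting to be sent to the DAC."""

    def __init__(self, reference_len):
        self.samples = deque()
        self.reference_len = reference_len  # assumed normalizer for BR

    def push(self, frame):
        self.samples.extend(frame)

    def pop(self, count):
        """Remove and return up to `count` samples in stored order."""
        count = min(count, len(self.samples))
        return [self.samples.popleft() for _ in range(count)]

    def occupancy(self):
        """Assumed BR: queued samples relative to the reference length."""
        return len(self.samples) / self.reference_len


def enqueue_unvoiced(buf, frame, bt=1.5, keep_ratio=0.5):
    """Queue an unvoiced frame, shortening it first when BR has reached BT.

    Returns the number of samples actually queued."""
    if buf.occupancy() >= bt:
        keep = max(1, int(len(frame) * keep_ratio))
        frame = frame[:keep]  # crude duration reduction: keep the leading part
    buf.push(frame)
    return len(frame)
```

Truncating the frame is the simplest possible duration reduction; a real implementation would more likely shorten the unvoiced stretch with some smoothing at the cut.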

Methods for determining whether the current frame is a voiced or an unvoiced interval (step 403) are widely described in speech processing textbooks; as one example, this can be determined easily by measuring the energy level. That is, if the average energy of the current frame is at or below a set threshold, the frame is an unvoiced interval. If the data of the current frame belongs to a voiced interval, kind function processing is applied to it.
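The energy-based voiced/unvoiced decision of step 403 can be sketched as follows. The frame length, the test signals, and the threshold value are assumptions chosen for illustration; the patent gives no numeric values.

```python
import math


def mean_energy(frame):
    """Average energy of a frame: mean of the squared sample values."""
    return sum(x * x for x in frame) / len(frame)


def is_voiced(frame, threshold):
    """A frame at or below the threshold is treated as unvoiced."""
    return mean_energy(frame) > threshold


# Assumed example: 160-sample frames, one loud tone and one very quiet tone.
loud = [0.5 * math.sin(2 * math.pi * 8 * n / 160) for n in range(160)]
quiet = [0.001 * math.sin(2 * math.pi * 8 * n / 160) for n in range(160)]
THRESHOLD = 0.01  # hypothetical value; would be tuned per device

loud_is_voiced = is_voiced(loud, THRESHOLD)
quiet_is_voiced = is_voiced(quiet, THRESHOLD)
```

Energy alone cannot separate voiced speech from loud unvoiced sounds, which is why the text presents it only as one example among the textbook methods.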

To slow down the speaking rate of this data, the kind function extends its duration (for example, by about 1.5 to 3.0 times) (step 406). The duration of voiced data is changed in units of the pitch period, so the pitch period must be detected accurately. Numerous methods for detecting the pitch period of a speech signal have been proposed over the past 40 years. As one example, the autocorrelation function method is widely used; it detects the period of a repeating waveform by computing the correlation between neighboring speech waveforms. Once the pitch period has been detected in a voiced interval, the speech duration is adjusted by repeating the waveform in units of the pitch period.
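A minimal sketch of step 406 follows, assuming a plain time-domain autocorrelation search and an integer stretch factor. The function names, the lag range, and the synthetic test tone are assumptions, and simple cycle repetition stands in for whatever smoothing the actual device applies at cycle boundaries.

```python
import math


def detect_pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) with the highest length-normalized
    autocorrelation, searched over min_lag..max_lag inclusive."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(frame) - lag
        if n <= 0:
            break
        score = sum(frame[i] * frame[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def stretch_by_repetition(frame, period, factor=2):
    """Lengthen a voiced frame by emitting each whole pitch cycle `factor` times.

    Any leftover samples shorter than one period are appended unchanged."""
    out = []
    whole = len(frame) - len(frame) % period
    for start in range(0, whole, period):
        out.extend(frame[start:start + period] * factor)
    out.extend(frame[whole:])
    return out


# Assumed test signal: a pure tone with a 40-sample period.
frame = [math.sin(2 * math.pi * n / 40) for n in range(160)]
period = detect_pitch_period(frame, 20, 60)
slow = stretch_by_repetition(frame, period, factor=2)
```

Because whole cycles are repeated, the pitch of the tone is unchanged while its duration doubles, which is the slow-listening effect the text describes.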

In addition, to limit the variation of intonation within a voiced interval to a certain degree (for example, within 1.5 times), the pitch periods of consecutive voiced frames are detected and the change per frame is computed; if the change is large, the pitch period is modified to stabilize the voice (step 407). The pitch period is modified on the basis of a reliably detected pitch period. Many methods for modifying the pitch period have also been proposed. One example is the PSOLA (Pitch Synchronous Overlap and Add) pitch modification method, which segments the speech waveform in the time domain into wide segments, one per pitch period, and then reconstructs the waveform by overlapping the segments at the modified pitch period.

Data processed in this way has an unnatural waveform amplitude, so an energy amplitude adjustment (step 409) must be performed so that the amplitude changes smoothly. As one example, the energy amplitude is adjusted in units of the pitch period by multiplying by the average energy amplitude of one pitch period.
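Steps 407 and 409 can be sketched roughly as below. Two interpretations are assumed here and are not stated in the patent: the "within 1.5 times" limit is read as a bound on the ratio between consecutive pitch periods, and the amplitude adjustment is read as rescaling each pitch cycle to a target RMS level. Both function names are hypothetical, and this is not an implementation of PSOLA itself.

```python
import math


def limit_pitch_change(periods, max_ratio=1.5):
    """Clamp each pitch period to within max_ratio of the previous (already
    limited) period, so intonation cannot jump abruptly between frames."""
    if not periods:
        return []
    limited = [periods[0]]
    for p in periods[1:]:
        prev = limited[-1]
        low, high = prev / max_ratio, prev * max_ratio
        limited.append(min(max(p, low), high))
    return limited


def match_cycle_energy(cycle, target_rms):
    """Scale one pitch cycle so its RMS amplitude equals target_rms.

    A silent cycle is returned unchanged to avoid dividing by zero."""
    rms = math.sqrt(sum(x * x for x in cycle) / len(cycle))
    if rms == 0.0:
        return list(cycle)
    gain = target_rms / rms
    return [x * gain for x in cycle]


smoothed_periods = limit_pitch_change([80, 200, 40])
leveled = match_cycle_energy([1.0, -1.0, 1.0, -1.0], 0.5)
```

In practice the target level for each cycle would come from neighboring cycles (for instance an interpolated average), so that repeated or shifted cycles blend into the surrounding speech.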

The speech data processed in this way is stored in the ring buffer (step 410) and, in the order in which it was stored, is output sample by sample through the digital-to-analog converter (DAC) to the receiver or a speakerphone (step 411). The kind headphone function runs in real time. That is, the kind headphone processing (through step 410) must finish between the moment one frame of data is received from the analog-to-digital converter (ADC) (step 401) and the moment the next frame of data arrives. To achieve this, either the longest processing path of the program in FIG. 4 must be shortened or a high-speed computer chip must be used.
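The real-time constraint amounts to a simple per-frame time budget. The sketch below assumes an 8000 Hz sampling rate and 160-sample frames purely as an example; the patent specifies neither value, and the function names are hypothetical.

```python
def frame_budget_ms(frame_len, sample_rate_hz):
    """Time between two consecutive frames arriving from the ADC, in ms."""
    return 1000.0 * frame_len / sample_rate_hz


def meets_real_time(processing_ms, frame_len, sample_rate_hz):
    """True when the worst-case processing time fits within one frame period."""
    return processing_ms <= frame_budget_ms(frame_len, sample_rate_hz)


# Assumed example: 160 samples per frame at 8000 Hz gives a 20 ms budget.
budget = frame_budget_ms(160, 8000)
fits = meets_real_time(12.0, 160, 8000)
too_slow = meets_real_time(25.0, 160, 8000)
```

The check uses the longest processing path, because a single late frame is enough to interrupt continuous output.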

As described above, the present invention extracts features of the voice signal and changes the meaning-bearing information so that it sounds kind, while keeping the speaker's characteristic information intact. It implements speech conversion techniques in the headphones: adjusting the duration of the speaker's speech to realize a slow-listening function, monitoring changes in intonation so that they stay within a set range, and handling the delay in speech duration differently for voiced and unvoiced intervals. The result is a headphone system with an added kind function that makes the speaker's voice sound kind.

Technically, the present invention is one of the technologies that assist the five human senses. As people age, their sensory functions deteriorate and gradually become dull; the kind headphones are notable as a practical application of welfare technology that compensates for this. The invention is therefore distinctive in its applicability as a headphone technology for a welfare state, which can be provided to elderly or disabled people with reduced hearing.

The kind headphone apparatus of the present invention is a core technology needed to build a kind society. As society becomes highly advanced and people become increasingly isolated, conversations that disregard the other party have become common even in everyday telephone calls. Kind headphones can change this atmosphere: by converting a speaker's hurried and one-sided voice so that it sounds slow and kind, they can calm a situation in which tempers might otherwise flare.

The speech conversion technique applied to the kind headphone apparatus can be used as a listening aid for professional stenographers who record conversations, can be applied to language learning devices for improving English listening skills, and can be used when urgent calls or distress calls must be slowed down for listening or recording. As a practical invention, its potential impact is considerable.

FIG. 1 is a configuration diagram for explaining the principle of the kind headphone apparatus of the present invention.

FIG. 2 is a block diagram for explaining the synthesis of the kind sound in the kind headphone apparatus of the present invention.

FIG. 3 is a configuration diagram showing the speech processing and conversion system in the kind headphone apparatus of the present invention.

FIG. 4 is a processing flowchart for explaining the kind headphone apparatus of the present invention.

* Description of reference numerals for the main parts of the drawings
101: kind key button
102: existing headphone function unit
103: headset
104: voice conversion function unit
105: headphone plug

Claims

In a headphone having an existing headphone function for talking with a speaker or listening to sound from an audio device,

a voice conversion function unit 104, to which a kind key button 101 is attached and which converts the voice so that it sounds kind, is attached to the headphone function unit 102;

a phonological meaning information and prosodic individuality information extracting unit 202, which classifies phonological information representing meaning in the voice and prosodic information representing individuality, is connected to a phonological information sustaining unit 205, which sustains the phonological information, and to a prosodic information preserving unit 206, which extracts only the meaning information; and the outputs of the phonological information sustaining unit 205 and the prosodic information preserving unit 206, with individuality preserved and phonological information sustained, are connected to an earphone 208 so as to be delivered to the outside; the kind headphone apparatus using signal processing for speech conversion of voice signals being characterized thereby.