KR100542976B1

KR100542976B1 - A headphone apparatus with soft-sound function using prosody control of speech signal

Info

Publication number: KR100542976B1
Application number: KR1020020088556A
Authority: KR
Inventors: 배명진
Original assignee: 배명진
Priority date: 2002-12-31
Filing date: 2002-12-31
Publication date: 2006-01-20
Also published as: KR20030016199A

Abstract

The present invention relates to a soft-sound headphone device that uses voice conversion processing of the speech signal to improve the function of headphones, which are very widely used for audio listening in everyday life.

To this end, for a listener who uses headphones to talk with a speaker or to listen to the sound of an audio device, the invention provides an intonation conversion function unit as part of the internal or external signal-processing computer processor of an existing headphone. When the speaker's voice sounds hurried and blunt, pressing a soft-sound key button causes this unit to convert the voice into a soft and polite sound. A phonological/semantic information separation and prosodic intonation information extraction unit is connected to both a retain-as-is unit and an intonation information changing unit, and these two units are connected to the earphone through a voice synthesis processing unit.

Pitch intonation conversion, soft-sound headphones, intonation information, prosody information, soft-sound function, pitch intonation signal processing, advanced headphones

Description

Soft-sound headphone device using voice conversion processing of speech signals {A HEADPHONE APPARATUS WITH SOFT-SOUND FUNCTION USING PROSODY CONTROL OF SPEECH SIGNAL}

FIG. 1 is a configuration diagram for explaining the principle of the soft-sound headphone device of the present invention;

FIG. 2 is a block diagram for explaining the analysis and synthesis processing scheme of the soft-sound headphone device of the present invention;

FIG. 3 is a hardware configuration diagram for intonation conversion processing in the soft-sound headphone device of the present invention;

FIG. 4 is a flowchart for explaining the functional processing of the soft-sound headphone device of the present invention.
* Explanation of reference numerals for the main parts of the drawings
101: soft-sound key button 102: existing headphone function unit
103: headset 104: intonation conversion function unit
105: headphone plug

The present invention relates to a soft-sound headphone device using voice conversion processing of speech signals. It proposes a new way of listening to sound through headphones in applications such as Internet telephony, landline calls, mobile phone calls, and audio listening, and within the speech and audio field it can be classified as voice conversion signal processing technology.
The sounds heard from today's mobile phones and audio devices vary widely. Hurried speech, swearing, speech mixed with dialect, and unclear voices can upset the listener. Conventional audio headphones deliver the voice from the audio device to the ear unchanged, through an earphone or speaker.

In particular, hearing-impaired and elderly listeners have reduced hearing and often cannot follow speech whose intonation changes abruptly. Moreover, because an earphone or speaker conveys the character and harshness of the sound to the listener unchanged, the listener sometimes experiences considerable discomfort or stress.

The present invention was made in view of the circumstances described above. Its object is to provide a soft-sound headphone device using voice conversion processing of speech signals, which proposes a new listening method: a soft-sound key button and an intonation conversion processing unit apply digital voice conversion processing to the speaker's voice heard in the earphone of the headphone so that it sounds soft.

To achieve this object, the present invention is a soft-sound headphone device using voice conversion processing of speech signals, for a listener who uses headphones to talk with a speaker or to listen to the sound of an audio device. It includes, as part of the internal or external signal-processing computer processor of an existing headphone, an intonation conversion function unit that converts the voice into a soft and polite sound when the soft-sound key button is pressed because the speaker's voice sounds hurried and blunt. A phonological/semantic information separation and prosodic intonation information extraction unit is connected to both a retain-as-is unit and an intonation information changing unit, and these two units are connected to the earphone through a voice synthesis processing unit.
Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram for explaining the principle of the soft-sound headphone device of the present invention. When the listener presses the soft-sound key button 101 (or a designated key button) attached to the intonation conversion function unit 104, the device applies voice conversion processing to the speaker's voice so that it sounds soft and friendly.
The headphone plug 105 and the microphone of the headset 103 are connected to the input side of the existing headphone function unit 102. On the output side, the soft-sound key button 101 and the earphone of the headset 103 are connected through the intonation conversion function unit 104.

The soft-sound headphone device of the present invention analyzes the sound arriving through the headphone plug 105, leaves the speaker's individuality information and the semantic information untouched, and adjusts only the pitch intonation information that carries the intonation. Abrupt intonation changes are thereby made to change slowly, which gives the sound a warmer feel.

When speech sounds hurried or is spoken in a strongly accented dialect, the listener can switch the soft-sound function on as desired and hear it through the headphones as a soft, warm sound. This helps the listener avoid the stress caused by rushed and blunt everyday conversation.

In this case too, the soft-sound headphone device adjusts the intonation changes of the speech so that it is heard softly and clearly, which makes it an essential function for welfare telecommunications. In addition, service workers who receive an unspecified number of callers over wired or wireless telephone lines, such as staff at local government offices, fire stations, and company reception desks, are under considerable stress because of the wide variety of callers' voices. In such cases as well, the device has the advantage of presenting the caller's voice softly and calmly.

In the present invention configured as described above, the human voice is produced when air from the lungs makes the vocal cords vibrate, and this vibration resonates as it passes through the vocal tract. Within the voice, the period of the vocal cord vibration and the speaker's vocal habits convey the speaker's individuality, while the resonance characteristics of the vocal tract mainly convey the phonological information that carries the meaning of the message. The invention passes on the prosodic information that conveys the individuality of the voice unchanged, detects only the pitch intonation information, and limits its large variations. This soft-sound voice conversion function, which makes the voice sound calm and soft, is applied to headphone listening.

Output signals from devices such as wired and cordless telephones, mobile phones, recorders, and audio components reach our ears as sound waves through speakers or headphones. Conventional headphones either act as a simple receiver that converts the electrical signal of the audio device into sound waves, or are functional headphone devices with a built-in high-performance computer chip that also provide services such as text display and audio control.
The soft-sound headphone device of the present invention is a functional headphone device in which an intonation conversion function unit 104, which performs the soft-sound function on a signal-processing computer chip, is added to the existing headphone function unit 102. When the headphone plug 105 is connected to the existing headphone function unit 102, sound is heard through the receiver. If the speech is hurried, or its intonation changes so abruptly that its meaning is unclear, the listener presses the soft-sound key button 101 of the intonation conversion function unit 104. The soft-sound key button 101 may be a dedicated key button or any key button assigned to this function.

When the soft-sound key button 101 is pressed, the computer chip, under the software or firmware of the intonation conversion function unit 104, moderates very abrupt pitch intonation changes in the speaker's voice so that the speech sounds soft. If the listener presses the soft-sound key button 101 once more, the original intonation is restored.

FIG. 2 is a block diagram for explaining the analysis and synthesis processing scheme of the soft-sound headphone device of the present invention. A computer chip attached to an existing headphone, internally or externally, analyzes the voice and synthesizes it with the soft-sound function applied while preserving the speaker's voice characteristics. That is, in the voice conversion function unit shown in FIG. 1, the phonological/semantic information separation and prosodic intonation information extraction unit 202 is connected to both the retain-as-is unit 205 and the intonation information changing unit 206, and these two units are connected to the earphone 208 through the voice synthesis processing unit 207.

The voice is produced by the vibration of the vocal cords and resonance in the throat. Using this principle of voice production, the retain-as-is unit 205 leaves the voice characteristics and semantic information unchanged, while the intonation information changing unit 206 extracts and adjusts only the pitch intonation information. When the two are combined by the voice synthesis processing unit 207, the voice heard through the earphone 208 becomes a friendly, gentle soft-sound voice. The core technique of the device is that the phonological/semantic information separation and prosodic intonation information extraction unit 202 automatically separates the phonological information that carries the meaning of the speech from the prosodic information that carries the speaker's individuality. The retain-as-is unit 205 preserves the meaning and individuality while the intonation information changing unit 206 changes the pitch intonation information, which increases the softness of the voice heard through the earphone 208.

FIG. 3 shows a device that receives an analog voice signal output from a mobile phone, audio device, or the like and converts it into a soft-sound voice. A computer processor 304, which is a digital signal processor (DSP) or general-purpose CPU connected to a memory 305 and a peripheral device 309, receives the transmitter analog input S(t) through an amplifier 301, a low-pass filter (LPF) 302, and an analog-to-digital converter (ADC) 303, and produces the receiver analog output U(t) through a digital-to-analog converter (DAC) 308, a low-pass filter 307, and an amplifier 306. Volume controllers 311 and 312 are fitted to the amplifiers 301 and 306.
The voice waveform input in analog form is amplified to the desired level by the amplifier 301 and passed through the low-pass filter 302 to remove aliasing effects. It then passes through the analog-to-digital converter 303, which performs quantization and coding, and becomes a digital signal in linear pulse code modulation (PCM) form. This signal is processed by software or firmware on the computer processor 304 (a general-purpose CPU or DSP).
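The quantization step performed by the analog-to-digital converter 303 can be illustrated with a minimal sketch. The 16-bit word length and the function name `quantize_pcm16` are assumptions made for illustration; the description only states that the output is linear PCM.

```python
def quantize_pcm16(samples):
    """Map float samples in [-1.0, 1.0] to signed 16-bit linear PCM codes.

    The 16-bit resolution is an assumption; the source only says "linear PCM".
    """
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))          # clip to the converter's input range
        codes.append(int(round(x * 32767)))  # uniform (linear) quantization
    return codes


if __name__ == "__main__":
    print(quantize_pcm16([0.0, 0.5, -1.0]))
```

Because the quantizer is uniform, the step size is the same at every signal level, which is what distinguishes linear PCM from companded formats.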

During signal processing, the computer processor 304 may consult the peripheral device 309, installed internally or externally, and may use the peripheral memory 305 to store the input digital signal or the processing results.

The digital signal that has been converted to soft sound by software or firmware on the CPU is turned into a sampled analog signal by the digital-to-analog converter 308. Passing this signal through the low-pass filter 307 yields an analog signal with the quantization noise removed, and suitable amplification by the amplifier 306 yields an analog signal that can be heard through the receiver of the headphone, a speaker, or the like. The overall sound level is set with the volume controllers 311 and 312 attached to the amplifiers 301 and 306.

FIG. 4 is a flowchart for explaining the functional processing of the soft-sound headphone device of the present invention. The device is an existing headphone fitted with the hardware system of FIG. 3, whose CPU chip carries software or firmware that implements the soft-sound function with the processing flow of FIG. 4. The existing headphone function checks whether the soft-sound key button (or any assigned key button) has been pressed. If it has not, the input sound is passed straight to the output as in a conventional headphone or, under digital control, a sound pass-through process is performed (step 402). The soft-sound key button is implemented as a software toggle switch: one press turns it on, and the next press turns it off.
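The toggle behavior of the soft-sound key button and the pass-through branch of step 402 can be sketched as follows. The class and function names (`SoftSoundToggle`, `route_frame`) and the `soften` callback are hypothetical; they stand in for whatever the firmware actually uses.

```python
class SoftSoundToggle:
    """Software toggle switch: one press turns soft sound on, the next turns it off."""

    def __init__(self):
        self.enabled = False

    def press(self):
        self.enabled = not self.enabled
        return self.enabled


def route_frame(frame, toggle, soften):
    """Return the frame unchanged (pass-through) unless soft sound is switched on.

    `soften` is a placeholder for the voice conversion processing.
    """
    return soften(frame) if toggle.enabled else list(frame)


if __name__ == "__main__":
    toggle = SoftSoundToggle()
    halve = lambda f: [x * 0.5 for x in f]   # dummy stand-in for the conversion
    print(route_frame([1.0, 2.0], toggle, halve))  # pass-through
    toggle.press()
    print(route_frame([1.0, 2.0], toggle, halve))  # converted
```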

When the soft-sound function starts, the data samples input from the analog-to-digital converter (ADC) (step 401) are processed together one frame at a time. First it is determined whether the data in the current frame belong to a voiced section. If they do not (step 404), the occupancy of the ring buffer (buffer rate, BR) is calculated. To make the other party's voice sound soft, the speech has to be processed at a rate different from the actual speaking rate, and the memory buffer that holds processed data while they wait for output is called the ring buffer (step 409).

The ring buffer occupancy (BR) expresses, as a ratio, how long data processed by the soft-sound function wait in the ring buffer. If the current frame is an unvoiced section and the waiting time has left the prescribed occupancy range (for example, 0.8 < BR < 1.2), duration adjustment is performed to change the speaking rate (step 416). This removes the drift in speech timing that arises while the soft-sound function is running. In other words, only the data in voiced sections are processed to sound soft in step 418, but because the resulting speaking rate may differ from that of the original speech, the speaking rate is adjusted in unvoiced sections to cancel the overall time delay.
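The occupancy check described above can be sketched as follows. This is an illustration under stated assumptions: BR is taken to be the number of queued samples divided by a nominal target, and the direction of the correction (shorten unvoiced sections when the buffer is too full, lengthen them when it is too empty) is an interpretation, since the description gives only the example range 0.8 < BR < 1.2. The names `RingBuffer` and `duration_action` are hypothetical.

```python
from collections import deque


class RingBuffer:
    """Queue of processed samples waiting for the DAC."""

    def __init__(self, target_samples):
        self.target = target_samples   # nominal backlog (assumed definition)
        self.data = deque()

    def push(self, frame):
        self.data.extend(frame)

    def pop(self, n):
        count = min(n, len(self.data))
        return [self.data.popleft() for _ in range(count)]

    def occupancy(self):
        """Buffer rate BR: queued samples relative to the nominal backlog."""
        return len(self.data) / self.target


def duration_action(br, low=0.8, high=1.2):
    """Decide how to adjust an unvoiced frame so that BR returns to (low, high)."""
    if br >= high:
        return "shorten"    # output is lagging: drop part of the unvoiced section
    if br <= low:
        return "lengthen"   # output is running ahead: stretch the unvoiced section
    return "keep"


if __name__ == "__main__":
    rb = RingBuffer(target_samples=160)
    rb.push([0.0] * 200)
    print(rb.occupancy(), duration_action(rb.occupancy()))
```

Restricting the correction to unvoiced frames keeps the timing fix out of the vowels, where a change of duration would be most audible.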

Many methods for deciding whether the current frame is a voiced or an unvoiced section (step 403) are given in speech processing textbooks. As one example, the decision can easily be made by measuring the energy level.

That is, if the average energy of the frame exceeds a set threshold and stays above it for, say, 5 frames (100 ms) or more, the section is a voiced section.
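The energy-based decision can be sketched as below. The function names and the threshold value are assumptions; the 5-frame minimum run follows the example in the text. For simplicity this version labels a whole recording after the fact, whereas a real-time implementation would have to decide with a few frames of look-ahead or delay.

```python
def frame_energy(frame):
    """Average energy (mean square) of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def voiced_flags(frames, threshold, min_run=5):
    """Mark frames as voiced when energy stays above threshold for >= min_run frames."""
    above = [frame_energy(f) > threshold for f in frames]
    flags = [False] * len(frames)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last frame.
    for i, is_above in enumerate(above + [False]):
        if is_above and start is None:
            start = i
        elif not is_above and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags


if __name__ == "__main__":
    loud, quiet = [1.0] * 4, [0.0] * 4
    frames = [quiet] * 2 + [loud] * 5 + [quiet] + [loud] * 3
    print(voiced_flags(frames, threshold=0.1))
```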

If the data of the current frame belong to a voiced section (step 418), soft-sound processing is applied to them. The soft-sound function detects the pitch intonation in the data of the current frame (step 406) and examines the pitch accent contour (PAC) frame by frame. If the PAC leaves the prescribed range of variation (for example, a factor of 1.5), the pitch intonation is changed (step 407). The pitch intonation is changed one voiced block at a time, where a block is a run of consecutively detected voiced frames.
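One way to read the PAC rule is as a limit on how far each frame's pitch period may deviate from a reference value for the voiced block. The sketch below takes that reading. The choice of the block median as reference and the name `limit_pitch_contour` are assumptions, since the description does not define the PAC numerically beyond the example factor of 1.5.

```python
from statistics import median


def limit_pitch_contour(periods, max_ratio=1.5):
    """Clamp per-frame pitch periods of one voiced block to ref/max_ratio .. ref*max_ratio.

    `ref` is the block median (an assumed reference, not stated in the source).
    """
    ref = median(periods)
    low, high = ref / max_ratio, ref * max_ratio
    return [min(high, max(low, p)) for p in periods]


if __name__ == "__main__":
    # Pitch periods in samples; the 160 and 40 are abrupt excursions.
    print(limit_pitch_contour([80, 82, 160, 78, 40]))
```

Frames that already lie inside the allowed range are returned unchanged, so only the abrupt excursions are altered.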

For one voiced block, the pitch intonation change (step 407) is carried out pitch period by pitch period. For example, if the PAC has exceeded the prescribed range of variation, the pitch is changed so that the pitch period stays within the given maximum range. Many methods of changing the pitch period have been proposed. One example is the PSOLA (Pitch Synchronous Overlap and Add) pitch modification method, which cuts the speech waveform in the time domain into wide segments at each pitch period and then reconstructs the waveform by overlapping the segments at the modified pitch period.
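A heavily simplified time-domain overlap-add sketch in the spirit of PSOLA is given below. It assumes a constant original pitch period, pitch marks supplied by the caller, and an unchanged block length; none of these simplifications come from the source, and the function name `psola_shift` is hypothetical. Each grain is two original periods long, Hann-windowed, and re-placed at the new spacing.

```python
import math


def psola_shift(signal, marks, period, new_period):
    """Re-space pitch-synchronous grains from `period` to `new_period` samples.

    Simplified sketch: constant period, caller-supplied pitch marks, and the
    output keeps the input length.
    """
    half = period  # each grain spans two original pitch periods
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / (2 * half))
              for k in range(2 * half + 1)]
    n = len(signal)
    out = [0.0] * n
    norm = [0.0] * n
    t = marks[0]
    while t < n:
        # Use the analysis pitch mark closest to the synthesis position t.
        m = min(marks, key=lambda a: abs(a - t))
        for k in range(-half, half + 1):
            src, dst = m + k, t + k
            if 0 <= src < n and 0 <= dst < n:
                w = window[k + half]
                out[dst] += w * signal[src]
                norm[dst] += w
        t += new_period
    # Divide by the summed window weights so overlapping grains keep their level.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


if __name__ == "__main__":
    sig = [1.0 if i % 80 == 0 else 0.0 for i in range(800)]
    shifted = psola_shift(sig, list(range(0, 800, 80)), period=80, new_period=100)
    print([i for i, v in enumerate(shifted) if v > 0.99])
```

On the impulse-train input in the usage example, the pulses that were 80 samples apart come out 100 samples apart, which corresponds to a lower pitch.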

The pitch period must be detected accurately for this purpose, and a great many pitch period detection methods for speech signals have been proposed over the past 40 years. As one example, the autocorrelation method is widely used for pitch detection: it computes the correlation between neighboring stretches of the speech waveform and finds the period at which the waveform repeats.
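A minimal sketch of autocorrelation pitch detection follows. The lag search range and the function name `autocorr_pitch_period` are assumptions; practical detectors add refinements such as center clipping and voicing checks that are omitted here.

```python
import math


def autocorr_pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation, or None if silent."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return None
    best_lag, best_val = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        r /= energy
        if r > best_val:
            best_val, best_lag = r, lag
    return best_lag


if __name__ == "__main__":
    # A sine wave with a period of 50 samples as a stand-in for voiced speech.
    frame = [math.sin(2 * math.pi * n / 50) for n in range(400)]
    print(autocorr_pitch_period(frame, min_lag=20, max_lag=120))
```

Bounding the lag search matters: without a lower bound the maximum always sits at lag 0, and without an upper bound multiples of the true period become candidates.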

The amplitude of waveforms processed in this way becomes unnatural, so an energy amplitude adjustment (step 408) must be performed to make the amplitude vary smoothly. As one example, the energy amplitude is adjusted pitch period by pitch period, by multiplying the waveform by the average energy amplitude of one pitch period.
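The description of step 408 is brief, so the sketch below adopts one plausible interpretation: each pitch period of the processed waveform is scaled so that its RMS amplitude matches a target value, for instance the RMS of the corresponding period in the original speech. The target list, the fixed period length, and the function names are all assumptions.

```python
import math


def rms(segment):
    """Root-mean-square amplitude of a segment (0.0 for an empty segment)."""
    if not segment:
        return 0.0
    return math.sqrt(sum(x * x for x in segment) / len(segment))


def match_period_energy(processed, target_rms, period):
    """Scale each pitch period of `processed` to the corresponding target RMS."""
    out = []
    for i, start in enumerate(range(0, len(processed), period)):
        segment = processed[start:start + period]
        current = rms(segment)
        target = target_rms[min(i, len(target_rms) - 1)]
        gain = target / current if current > 0 else 1.0  # leave silence untouched
        out.extend(gain * x for x in segment)
    return out


if __name__ == "__main__":
    wave = [1.0, -1.0] * 4
    print(match_period_energy(wave, target_rms=[0.5, 2.0], period=4))
```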

The fully processed speech data are stored in the ring buffer (step 409) and, in the order in which they were stored, are output sample by sample through the digital-to-analog converter (DAC) to the receiver of the headphone or a speaker (step 410). The soft-sound headphone function runs in real time. That is, the soft-sound processing (through step 410) must finish between the time one frame of data is received from the analog-to-digital converter (ADC) (step 401) and the time the next frame of data arrives.

As described above, the present invention extracts the features of the voice signal and gently adjusts the speaker's pitch intonation information while keeping the speaker's characteristic information and the semantic information of the message unchanged. It is a headphone scheme equipped with a soft-sound function that makes the speaker's voice sound soft and friendly by implementing the following voice conversion processing in the headphone: adjusting the pitch intonation among the speaker's vocal characteristics to limit the range over which the tone of the voice varies; monitoring changes in the amplitude of the speech energy so that they stay within a set range; and compensating for delay in speech duration by processing voiced and unvoiced sections differently.

Technically, the present invention is one of the technologies that assist the five human senses. As people age, their senses deteriorate and gradually become less acute, and the soft-sound headphone is notable as a practical application of welfare technology that compensates for this decline. As a headphone technology for the welfare state, which can present the other party's tone of voice gently to elderly or disabled listeners with reduced hearing, its range of application is distinctive.

The soft-sound headphone device of the present invention is also a core technology for building a warm-hearted society. As society becomes highly advanced and people grow increasingly isolated, conversation that disregards the other party is becoming common even in everyday spoken communication. Soft-sound headphones can change this social atmosphere: by altering a speaker's hurried, strongly accented voice so that it sounds softly spoken, they can calm a listening situation that might otherwise provoke an emotional reaction.

The voice conversion technology applied in the soft-sound headphone device can serve as a listening aid for professional stenographers who record conversations, can be applied to language learning devices for improving English listening skills, and can be used where urgent calls or distress and rescue requests must be heard or recorded in a soft and courteous form. As a practical invention, its potential impact is considerable.

Claims

A soft-sound headphone device using voice conversion processing of speech signals, characterized in that an intonation conversion function unit 104 having a soft-sound key button 101 is installed in an existing headphone function unit 102 through which the voice of a talker or the sound of an audio device can be heard,

and in that a phonological/semantic information separation and prosodic intonation information extraction unit 202 is connected both to a retain-as-is unit that keeps the voice characteristics and semantic information unchanged and to an intonation information changing unit 206 that extracts and adjusts only the pitch intonation information, these units being connected in sequence to a known voice synthesis processing unit 207 and a known earphone 208 so that the voice is converted into a gentle soft sound.