KR102126204B1

KR102126204B1 - Voice Recognition Sensor having Multi Frequency Channels with Curved type

Info

Publication number: KR102126204B1
Application number: KR1020180076936A
Authority: KR
Inventors: 이건재; 홍성광; 한재현; 왕희승
Original assignee: 한국과학기술원
Priority date: 2018-05-30
Filing date: 2018-07-03
Publication date: 2020-07-07
Also published as: KR20190136878A

Abstract

The voice recognition sensor according to the present invention comprises a plurality of frequency channels of different lengths. The frequency channels sense mutually overlapping frequency regions, and the outline that continuously connects the outer ends of the frequency channels forms a curved surface. As a result, the frequency channels sense frequencies evenly across the audible range of 200 Hz to 4 kHz while exhibiting a high-sensitivity response characteristic.

Description

Voice Recognition Sensor having Multi Frequency Channels with Curved type

The present invention relates to a low-power flexible piezoelectric voice recognition sensor having a plurality of frequency channels in a curved arrangement for Internet of Things (IoT) applications. More specifically, it relates to a flexible-piezoelectric, low-power voice recognition sensor in which a plurality of frequency channels are arranged in a curved form, so that the human voice band is sensed with high sensitivity throughout and the frequency bands can be set evenly across the whole range.

A voice recognition sensor is a sensor that extracts linguistic information from the acoustic information contained in human speech so that a machine can recognize and respond to it. Now that an easy and convenient natural user interface (Natural UI) has become necessary, talking by voice is regarded as the most natural and simple of the many media for exchanging information between humans and machines in the coming IoT era. To communicate with a machine by voice, however, human speech must be converted into a format the machine can process, and this process is voice recognition.

Voice recognition, as represented by Apple's Siri, consists of a combination of a microphone, an analog-to-digital converter (ADC), and digital signal processing (DSP). Its power consumption is too high for always-on standby on mobile devices, so the user operates it by pressing start and end buttons. This is one of the biggest obstacles to realizing a true voice-recognition-based Internet of Things (IoT), and developing a low-power, always-on voice recognition system is expected to open up a vast range of IoT applications.

A voice recognition system that can be used easily without separate learning or training is a promising technology that will lead future industries in the IoT era, in which demand for developing and building UIs for innovative next-generation IT products has grown. It allows information to be entered even when the hands are not free or while moving, and because voice input is faster than typing, it has the advantage of enabling high-speed or real-time information processing.

Recent advances in smartphone performance, progress in artificial intelligence and knowledge retrieval, and large-scale data processing through cloud-based voice recognition systems allow an intelligent agent to find the answer a user wants accurately and quickly. Despite these advantages and possibilities, voice recognition technology still has the following limitations.

First, from a hardware standpoint, existing voice recognition technology based on a combination of a microphone, ADC, and DSP consumes so much power that always-on standby voice recognition is practically impossible without a separate power source, and its application to mobile voice recognition sensors is severely limited by energy constraints. It also requires a preliminary action such as pressing a voice recognition start button, and its accuracy, reliability, and speed are poor. To be applied to IoT-based smartphones, TVs, automobiles, and other wearable devices, high sensitivity is essential, and the sensor must remain in always-on standby without large power consumption, even in a sleep state, so that it can recognize the user's voice at ultra-low power.

Next, from an acoustic and linguistic standpoint, current voice recognition based on the microphone, ADC, and DSP combination relies on complex algorithms and is therefore limited in recognizing natural conversational speech.

In contrast, the human cochlea separates complex speech by frequency and then processes the signals efficiently with simple algorithms. Various devices have used this cochlear principle, and there are precedents that mimic it for cochlear implants, but it has not yet been used as a low-power voice recognition sensor for IoT.

Additionally, an existing voice sensor generally consists of a sensor unit and a readout integrated circuit (ROIC), and a capacitive (cap-type) voice sensor must always supply a bias to the sensor unit.

An application of a flexible piezoelectric thin film to a cochlear implant is described in H. Lee et al., Advanced Functional Materials, Vol. 24, No. 44, p. 6914, 2014. Three piezoelectric elements were attached to a thin trapezoidal silicon membrane to separate voice signals in the audible band by frequency. In that document, the three individual piezoelectric elements on the silicon membrane separate frequencies for a cochlear implant, but algorithms and circuit design for use as a low-power voice sensor for IoT were not considered.

Meanwhile, Registered Patent No. 10-1718214, previously registered by the present applicant, also discloses separating sensed voice by frequency through a plurality of frequency separation channels arranged in a trapezoidal shape while converting the separated voice signals from mechanical vibration signals into electrical signals through a piezoelectric element for recognition. However, it has the limitation that the response performance sensed through the frequency channels is not achieved smoothly in certain regions.

A prior document presenting a piezoelectric device that outputs haptic feedback effects using a plurality of resonant frequencies is Patent Publication No. 10-2012-0099036 (2012.09.06). That document provides haptic feedback technology based on touch, force, and kinesthetic sensation, but it does not disclose a way to recognize voice after separating it into a plurality of frequencies.

(Paper) H. Lee et al., Advanced Functional Materials, 24(44), 6914, 2014

(Patent Document 1) KR10-1718214 B

(Patent Document 2) KR10-2012-0099036 A

The present invention aims to solve the above problems of the prior art. Using a flexible piezoelectric thin film implemented as a single device, it separates sensed voice by frequency through a plurality of frequency separation channels arranged in a curved form and, at the same time, converts the separated voice signals from mechanical vibration signals into electrical signals through a flexible inorganic piezoelectric element for recognition. The object is thereby to provide a low-power piezoelectric voice recognition sensor for IoT that reduces power consumption by simplifying the voice recognition circuit.

That is, in the existing structure in which the frequency separation channels are arranged in a trapezoidal shape, the response sensitivity drops markedly in a particular frequency band. The object is to remedy this problem with a curved arrangement and to provide a voice recognition sensor that maintains high sensitivity across the entire human audible frequency band.

An object of the present invention is to provide a piezoelectric voice recognition sensor that senses and detects acoustic signals already separated by frequency, before digital sampling and acoustic signal processing are performed on the spectrum of human speech. By simplifying the voice recognition circuit in this way, it consumes far less power than existing high-power voice recognition sensors based on microphone, ADC, and DSP circuits.

The present invention also provides a next-generation low-power voice recognition sensor that uses a flexible, highly sensitive inorganic piezoelectric material and can replace the microphone sensor unit of a conventional voice sensor composed of a microphone, ADC, and DSP.

To solve the above problems, a voice recognition sensor according to one aspect of the present invention comprises a plurality of frequency channels of different lengths. The frequency channels sense mutually overlapping frequency regions, and the outline that continuously connects the outer ends of the frequency channels forms a curved surface. As a result, the frequency channels sense frequencies evenly across the audible range of 200 Hz to 4 kHz while exhibiting a high-sensitivity response characteristic.

The plurality of frequency channels sense voice with higher intensity toward lower frequency bands.

Among the plurality of frequency channels, the length ratio between the shortest channel, which senses the highest frequency region, and the longest channel, which senses the lowest frequency region, is in the range of 1 : 1.5 to 6.5.
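
As a rough illustration of how the stated length ratio can relate to the 200 Hz to 4 kHz band, the sketch below assumes a simple power-law scaling between channel length and resonant frequency, f ∝ L^(-n). This scaling and its exponent are assumptions that do not appear in the patent text: n = 2 would correspond to a bending-dominated beam and n = 1 to a tension-dominated string. The function name and parameters are hypothetical.

```python
def length_ratio_for_band(f_low_hz: float, f_high_hz: float, exponent: float) -> float:
    """Longest-to-shortest channel length ratio under f proportional to L**(-exponent).

    The power-law scaling is an illustrative assumption, not a relation
    given in the patent text.
    """
    if f_low_hz <= 0 or f_high_hz <= f_low_hz or exponent <= 0:
        raise ValueError("need 0 < f_low_hz < f_high_hz and exponent > 0")
    # f_high / f_low = (L_long / L_short) ** exponent, solved for the length ratio.
    return (f_high_hz / f_low_hz) ** (1.0 / exponent)


if __name__ == "__main__":
    ratio = length_ratio_for_band(200.0, 4000.0, exponent=2.0)
    print(f"assumed beam-like scaling: 1 : {ratio:.2f}")
    print("inside the stated 1 : 1.5 to 6.5 range:", 1.5 <= ratio <= 6.5)
```

Under the assumed beam-like exponent, the band edges imply a ratio of about 1 : 4.47, which falls inside the stated range. A different scaling law would give a different figure, so this is a plausibility check rather than the design rule.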

The slope of the curved surface tends to become progressively steeper from the shortest channel toward the longest channel.

The voice recognition sensor is of a resonant type, and the quality factor of the resonance is 35 or less.
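
To make the quality-factor bound concrete, the following minimal sketch uses the standard definition Q = f0 / Δf, where Δf is the half-power (-3 dB) bandwidth of a resonance at f0. The sample centre frequencies are illustrative picks from the 200 Hz to 4 kHz range and are not channel frequencies given in the patent; the function name is hypothetical.

```python
def half_power_bandwidth_hz(center_hz: float, q_factor: float) -> float:
    """Return the -3 dB bandwidth of a resonance using Q = f0 / bandwidth."""
    if center_hz <= 0 or q_factor <= 0:
        raise ValueError("center_hz and q_factor must be positive")
    return center_hz / q_factor


if __name__ == "__main__":
    q_max = 35.0  # upper bound on Q stated in the text
    for f0 in (200.0, 1000.0, 4000.0):  # illustrative centre frequencies
        bw = half_power_bandwidth_hz(f0, q_max)
        print(f"f0 = {f0:6.0f} Hz -> bandwidth of at least {bw:6.1f} Hz at Q = {q_max:.0f}")
```

Because 35 is an upper bound, each resonance is at least f0 / 35 wide. A lower Q gives a broader peak, which is consistent with the text's aim of having adjacent channels sense overlapping frequency regions.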

A voice recognition sensor according to another aspect of the present invention comprises: a flexible thin film; a piezoelectric material layer stacked on the flexible thin film; and an electrode stacked on the piezoelectric material layer, wherein the electrode comprises the plurality of frequency separation channels arranged in a row.

The voice recognition sensor further comprises a passivation layer stacked so as to cover the electrode entirely.

The present invention provides a compact voice sensor system for the voice-recognition-based Internet of Things (IoT) that includes the voice recognition sensor.

The present invention provides a smart home appliance including the compact voice sensor system.

The present invention provides a wearable electronic device including the voice sensor system.

The voice recognition sensor according to the present invention separates sensed voice by frequency through a plurality of frequency separation channels arranged in a curved form, thereby maintaining high sensitivity across the entire human audible frequency band.

In the present invention, the frequency response through the frequency separation channels decreases linearly and steadily from the low-frequency region toward the high-frequency region, so that the response exceeds the reference value of a typical acoustic diagnostic microphone across the audible frequency range.

The present invention converts voice signals, separated using a plurality of frequency separation channels arranged in a curved form, from mechanical vibration signals into electrical signals through a piezoelectric element for recognition. By adopting the sound transmission mechanism of the human cochlea, it produces a flexible piezoelectric voice recognition sensor capable of frequency separation, together with a compatible sensor module, and thereby realizes a low-power voice UI for an always-on Internet of Things.

In addition, a flexible piezoelectric material is fabricated in a multi-channel structure so that voice recognition is performed on frequencies separated by the channels. This allows a machine to identify language and speaker in a standby state with minimal power consumption, and makes it possible to implement an embedded voice recognition sensor and module capable of two-way communication and response.

The present invention enables faster and more accurate acoustic signal processing and high-sensitivity recognition through frequency-by-frequency separation of the voice spectrum and digital sampling, and the simplified acoustic analysis module reduces cost. It thereby enables speaker identification despite variability such as ambient noise.

In addition, the present invention consumes almost no power even in a sleep state, enabling always-on standby voice recognition.

The present invention allows the speaker and basic commands to be recognized easily and conveniently without preliminary actions such as operating voice recognition start and end buttons.

FIG. 1 is a comparison diagram showing the differences between a conventional voice recognition system and the present invention.
FIGS. 2 to 10 are step-by-step cross-sectional views illustrating a method of manufacturing a piezoelectric voice recognition sensor according to an embodiment of the present invention.
FIG. 11 is a schematic diagram of a piezoelectric voice recognition sensor according to an embodiment of the present invention.
FIGS. 12 and 13 plot the frequency regions set through the plurality of frequency separation channels on the horizontal axis and the sensitivity, as relative response (dB), on the vertical axis.
FIG. 14 shows the sensitivity of a conventional microphone and the sensitivity characteristic of the voice sensor according to the present invention.
FIGS. 15 and 16 are diagrams illustrating in detail the voice recognition sensor having a plurality of frequency channels according to the present invention.
FIGS. 17 and 18 are diagrams illustrating in detail a conventional voice recognition sensor having a plurality of frequency channels arranged in a trapezoidal shape.
FIG. 19 is a graph comparing simulations of the present invention, which has a curved surface, and a conventional voice sensor, which has a trapezoid shape.
FIG. 20 is an image showing a fabricated voice recognition sensor according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in more detail with reference to the accompanying drawings. The present invention is not limited to the embodiments disclosed below and may be implemented in various different forms; these embodiments are provided only so that the disclosure is complete and fully conveys the scope of the invention to those of ordinary skill in the art. The same reference numerals in the drawings denote the same elements.

FIG. 1 is a comparison diagram showing the differences between an existing voice recognition system and the present invention. The existing voice recognition system shown at the top of FIG. 1 receives the voice signal in analog form through a microphone, converts it into a digital signal through an analog-to-digital converter (ADC), and then processes the digital signal through digital signal processing (DSP) to separate the frequencies, which has the drawback of consuming high power.

Specifically, conventional voice recognition technology is based on a non-resonant scheme, and its sensitivity remains uniformly low across the frequency range. It conventionally uses a capacitive scheme that requires an external DC power supply, provides only a single output signal, and relies on a dielectric film, all of which limit it.

In contrast, the low-power voice recognition sensor with a plurality of frequency channels according to the present invention is a resonant piezoelectric sensor that can perform voice recognition directly, which has the advantage of enabling low-power operation. The voice signal is first separated by frequency across the plurality of electrode channels; at the same time, mechanical motion in the thin film made of the piezoelectric element is converted into electrical signals, so that an electrical signal is detected in each frequency band.
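
The idea of separating a signal by frequency with a bank of resonant channels can be sketched as follows. This is a minimal model, not the patent's device physics: each channel is assumed to behave like a second-order resonator, the centre frequencies are assumed to be log-spaced between 200 Hz and 4 kHz, and the channel count of 7 is hypothetical. Only the band edges and the Q bound of 35 come from the text.

```python
import math


def resonator_gain(tone_hz: float, center_hz: float, q_factor: float) -> float:
    """Magnitude response of a second-order resonator, normalised to 1 at DC."""
    r = tone_hz / center_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q_factor) ** 2)


def log_spaced_centers(f_low_hz: float, f_high_hz: float, count: int) -> list[float]:
    """Centre frequencies spaced by a constant ratio from f_low_hz to f_high_hz."""
    step = (f_high_hz / f_low_hz) ** (1.0 / (count - 1))
    return [f_low_hz * step ** k for k in range(count)]


def dominant_channel(tone_hz: float, centers_hz: list[float], q_factor: float) -> int:
    """Index of the channel that responds most strongly to a pure tone."""
    gains = [resonator_gain(tone_hz, c, q_factor) for c in centers_hz]
    return max(range(len(gains)), key=gains.__getitem__)


if __name__ == "__main__":
    centers = log_spaced_centers(200.0, 4000.0, count=7)  # hypothetical channel count
    tone = 1000.0
    idx = dominant_channel(tone, centers, q_factor=35.0)
    print(f"{tone:.0f} Hz tone excites channel {idx} (centre {centers[idx]:.0f} Hz) most strongly")
```

In this model the channel whose resonance lies nearest the tone produces the largest output, so the frequency content is available per channel without a digital filter bank.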

That is, a conventional microphone consumes high power because a frequency band filter, ADC, and DSP are used, whereas the present invention uses a piezoelectric element that generates current separately for each frequency, so the power required for the band filter, ADC, and DSP can be reduced. In addition, because the microphone's sensitivity is high, the voltage gain needed in the circuit can be lowered, which reduces power consumption and improves circuit stability.

In addition, the present invention arranges the frequency separation channels in a row but in an overall curved form, so that the response characteristic decreases linearly and steadily across the audible frequency range from the low-frequency region toward the high-frequency region. In this way, high sensitivity is maintained overall from the low-frequency 200 Hz band to the high-frequency 4 kHz band.

The present invention uses a piezoelectric scheme in which the sensor unit is self-powered, and generates a plurality of output signals for different regions on a single chip. It also uses a distinctive technology based on a flexible inorganic piezoelectric material.

FIGS. 2 to 10 are step-by-step cross-sectional views illustrating a method of manufacturing a piezoelectric voice recognition sensor according to an embodiment of the present invention.

Referring to FIG. 2, a silicon substrate 100 serving as a sacrificial substrate is provided. In the present invention, the sacrificial substrate 100 provides a stress difference relative to a metal layer stacked later, but it is not directly bonded to the nanogenerator device. In one embodiment, the compressive stress of the silicon substrate 100 is mismatched with the tensile stress of the metal layer bonded on top of the device, and externally applied energy subsequently cracks a separate buffer layer (a silicon oxide layer in one embodiment) bonded on the silicon substrate 100. The horizontal cracking of the buffer layer is described in more detail below. In particular, the present invention can adjust and control the cracked region according to the stress difference between the metal layer and the sacrificial substrate.

A buffer layer 200, such as silicon oxide, is stacked on the silicon substrate 100. In the present invention, the buffer layer 200 is bonded to the nanogenerator device weakly enough that it can be detached by the physical force arising from the stress difference. In one embodiment, a silicon oxide layer is used as the buffer layer 200, and the bonding strength between the silicon oxide layer and the nanogenerator is such that the nanogenerator device can be effectively separated by the stress difference between the underlying substrate and the metal layer.

Referring to FIG. 3, a PZT thin film 300, the piezoelectric material layer, is deposited on the buffer layer 200 by the well-known sol-gel process. A 0.4 M PZT sol-gel solution (Zr:Ti molar ratio of 52:48, with 10 mol% excess PbO) is spin-cast onto the wafer at 2500 rpm, followed by pyrolysis in air at 450 °C for 10 minutes to remove organic components from the sol-gel film.

The deposition and pyrolysis steps are repeated several times to form a 2 µm thick PZT thin film. The PZT thin film is crystallized in air at 650 °C for 45 minutes. Rapid thermal annealing (RTA) is used for the pyrolysis and crystallization steps.

Referring to FIG. 4, a nickel layer 400, which is a metal layer, is stacked on the upper surface of the PZT thin film 300. According to one embodiment, the nickel layer 400 may be deposited by a conventional semiconductor process such as sputtering or PVD, or by any other conventional metal coating method. This yields the nickel layer 400 bonded onto the PZT thin film 300.

Referring to FIG. 5, mechanical energy (for example, a physical impact) or thermal energy is applied to the nickel layer 400, the metal layer having residual tensile stress. As a result, residual tensile stress arises in the nickel, and a mismatch (asymmetry) develops between this residual tensile stress and the residual compressive stress of the silicon substrate, which is indirectly bonded to the nanogenerator device through the buffer layer. Consequently, the bond between the two layers fails at the interface between the silicon oxide buffer layer 200 and the PZT thin film 300. In this way, the present invention stacks the desired device and substrate with a metal layer whose tensile stress differs from the residual compressive stress of the silicon substrate, and then applies external energy to separate the device at the weak bonding interface. Because the separation plane is set at the interface between the PZT thin film 300 and the buffer layer, where the bonding is weakest, a device fabricated on the silicon substrate can be separated and transferred intact. The separation position can also be controlled according to the stress difference between the metal layer and the sacrificial substrate.

Referring to FIG. 6, the PZT thin film 300, debonded owing to the residual tensile stress mismatch of the metal layer with respect to the silicon substrate, is separated from the silicon oxide buffer layer 200 (see FIG. 7).

Alternatively, the PZT thin film 300 may be separated from the silicon oxide buffer layer 200 by a laser lift-off (LLO) process. In this case, the back side of the silicon oxide buffer layer 200 is irradiated with a XeCl pulsed excimer laser to separate the PZT thin film 300 from the buffer layer 200. Because the photon energy of the XeCl laser (4.03 eV) is smaller than the band-gap energy of the buffer layer 200 and larger than that of the PZT thin film 300, the PZT thin film can be transferred to a flexible plastic substrate. The laser beam passes through the silicon oxide buffer layer, and local melting and dissociation of the PZT then occur at the interface with the buffer layer.
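
The 4.03 eV figure can be checked from E = hc/λ, taking the 308 nm emission wavelength of a XeCl excimer laser. The sketch below also applies the selective-absorption condition described above. The two band-gap values are assumed, approximate numbers chosen only for illustration and are not given in the patent; the function names are hypothetical.

```python
PLANCK_J_S = 6.62607015e-34         # Planck constant (exact SI value)
LIGHT_SPEED_M_S = 299_792_458.0     # speed of light in vacuum (exact SI value)
ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge (exact SI value)


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in electronvolts for a vacuum wavelength in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength_nm must be positive")
    energy_joule = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    return energy_joule / ELECTRON_CHARGE_C


def is_absorbed(photon_ev: float, band_gap_ev: float) -> bool:
    """Band-to-band absorption requires the photon energy to exceed the gap."""
    return photon_ev > band_gap_ev


if __name__ == "__main__":
    energy = photon_energy_ev(308.0)  # XeCl excimer wavelength
    # Assumed, approximate band gaps for illustration only (not from the patent).
    buffer_gap_ev, pzt_gap_ev = 9.0, 3.4
    print(f"photon energy: {energy:.2f} eV")
    print("absorbed in buffer layer:", is_absorbed(energy, buffer_gap_ev))
    print("absorbed in PZT film:", is_absorbed(energy, pzt_gap_ev))
```

With these assumed gaps the beam passes through the buffer layer and is absorbed in the PZT at the interface, which matches the mechanism described in the text.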

In this way, the laser lift-off (LLO) process for transferring the PZT thin film to a plastic substrate is carried out.
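The band-gap condition described for the LLO process can be checked numerically. The following is a minimal sketch, not the patent's own procedure: it assumes the XeCl excimer line lies at 308 nm, and the two band-gap values are illustrative placeholders that do not come from the text. The function names are hypothetical.

```python
# Physical constants (exact SI values).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458.0
ELECTRON_VOLT_J = 1.602176634e-19


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a vacuum wavelength in nm (E = h * c / lambda)."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / ELECTRON_VOLT_J


def llo_condition_met(photon_ev: float, buffer_gap_ev: float, film_gap_ev: float) -> bool:
    """True when the buffer layer transmits the laser and the film absorbs it."""
    return film_gap_ev < photon_ev < buffer_gap_ev


# Assumption: XeCl excimer wavelength taken as 308 nm.
xecl_ev = photon_energy_ev(308.0)

# Assumption: illustrative band gaps only, not values stated in the text.
BUFFER_GAP_EV = 9.0  # silicon oxide buffer layer (placeholder)
FILM_GAP_EV = 3.4    # PZT thin film (placeholder)

print(f"XeCl photon energy: {xecl_ev:.2f} eV")
print("LLO condition met:", llo_condition_met(xecl_ev, BUFFER_GAP_EV, FILM_GAP_EV))
```

Under these assumptions the computed photon energy rounds to the 4.03 eV quoted in the text, and the condition holds because the photon energy falls between the two band gaps.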

Referring to FIG. 8, the separated stack of the PZT thin film 300 and the nickel layer 400 is physically moved onto a flexible plastic substrate 600 and bonded to it. This completes the flexible nanogenerator transferred onto the flexible plastic substrate 600.

Referring to FIG. 9, the nickel layer 400 is removed by etching, a conventional chemical process. For example, the nickel layer 400 may be removed by immersing the upper portion of the device bonded to the plastic substrate 600 in an etchant specific to the nickel layer 400. The nickel layer 400 may also be selectively removed by any of various other conventional metal-layer removal methods, and these also fall within the scope of the present invention.

Next, referring to FIG. 10, an electrode 500 is stacked on the PZT thin film 300, giving a stack that consists, from the bottom, of the plastic substrate 600 (a flexible thin film), the PZT thin film 300, and the electrode 500. Here, the electrode 500 forms the plurality of frequency separation channels.

The electrode 500 may be any one of electrode materials including Ti/Au, Ti/Pt, Cr/Au, and Cr/Pt.

The plastic substrate 600 may be any one of substrate materials including PET, PEN, Parylene, and Kapton.

Meanwhile, referring to FIG. 11, the piezoelectric voice recognition sensor of the present invention may optionally include a passivation layer that covers the electrode 500 entirely. The passivation layer may be Parylene or SU-8.

An adhesive layer is disposed between the plastic substrate 600 and the PZT thin film 300; the adhesive layer may be Norland, PU, or the like.

The basic concept of the present invention is described with reference to FIGS. 12 and 13.

In FIGS. 12 and 13, the horizontal axis shows the frequency regions set by the plurality of frequency separation channels, and the vertical axis shows sensitivity as a relative response (dB).

As an example, the four frequency channels shown in FIG. 12 sense mutually overlapping frequency regions. Channel 1, at the far left of the graph, shows a high response, while channel 4, at the far right, shows a relatively low response.

By using resonant elements, the present invention can achieve higher sensitivity in specific frequency regions. Compared with the conventional microphone (Ref. Mic), whose sensitivity stays constant at about -100 dB, the overlapping regions of the frequency channels all lie well above the sensitivity of the reference microphone.

A conventional microphone consists of a sensor and a read-out IC (ROIC). When sound pressure makes the membrane vibrate, an electrical signal produced by the resulting potential difference enters the ROIC. The ROIC consists of an amplifier and an impedance converter, and the sensitivity is determined by the gain of the amplifier.

FIG. 12 shows the sensitivity of the sensor portion of the reference microphone (Ref. Mic), excluding the ROIC, as the response at each frequency when white noise from 20 Hz to 20 kHz is applied at 94 dB SPL (sound pressure level).

For example, for resonant frequency channel 1 (Ch1), the 3 dB bandwidth is widened to more than 50 Hz so that the Q value falls below 35, which allows the range from 200 Hz to 4 kHz to be sensed evenly.

Next, in FIG. 13, the horizontal axis covers the audible frequency range from 100 Hz to 4000 Hz, and the vertical axis shows sensitivity as a relative response (dB).

The frequency regions set by the plurality of frequency separation channels are plotted, with sensitivity as a relative response (dB) on the vertical axis.

FIG. 13 shows the sensitivity obtained through seven frequency separation channels.

Compared with the conventional microphone (Ref. Mic), whose sensitivity stays constant over the entire frequency band, the response obtained through the plurality of frequency separation channels lies well above the sensitivity of the reference microphone across the range from 125 Hz to 4000 Hz.

The plurality of frequency channels sense mutually overlapping frequency regions, and they sense voice with higher intensity toward lower frequency bands.

Meanwhile, referring to FIG. 14, the sensitivity of a conventional microphone is expressed as the output signal obtained at 1 kHz and 94 dB SPL (sound pressure level). Under the same measurement conditions, the voice sensor according to the present invention produced about 30 mV peak-to-peak, which corresponds to about 10.6 mV rms. Converting 10.6 mV rms gives -39.5 dBV, so the sensitivity of the device according to the present invention is -39.5 dBV. Allowing a margin, the invention provides a high-sensitivity sensor with a sensitivity of -45 dBV or higher.

The above sensitivity conversion can be carried out with the calculator at "http://www.sengpielaudio.com/calculator-db-volt.htm".
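The conversion from the measured peak-to-peak voltage to dBV can also be reproduced directly. The sketch below assumes a sinusoidal output, so that the rms value is the peak-to-peak value divided by 2√2, and uses the usual 1 V rms reference for dBV. The function names are hypothetical.

```python
import math


def vpp_to_vrms(v_pp: float) -> float:
    """Convert peak-to-peak volts to rms volts, assuming a sine wave."""
    return v_pp / (2.0 * math.sqrt(2.0))


def vrms_to_dbv(v_rms: float) -> float:
    """Convert rms volts to dBV (decibels relative to 1 V rms)."""
    return 20.0 * math.log10(v_rms / 1.0)


v_pp = 30e-3                      # 30 mV peak-to-peak, as stated in the text
v_rms = vpp_to_vrms(v_pp)         # about 10.6 mV rms
sensitivity_dbv = vrms_to_dbv(v_rms)

print(f"{v_rms * 1e3:.1f} mV rms, {sensitivity_dbv:.1f} dBV")
print("meets -45 dBV margin:", sensitivity_dbv >= -45.0)
```

With these inputs the sketch gives about 10.6 mV rms and about -39.5 dBV, matching the figures in the text and clearing the -45 dBV margin.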

As described above, the present invention uses multiple channels to realize a plurality of high-sensitivity regions and achieves high sensitivity over the range of 200 Hz to 4 kHz, which corresponds to the human voice band.

Hereinafter, the voice recognition sensor having a plurality of frequency channels according to the present invention is described in detail with reference to FIGS. 15 and 16.

The voice recognition sensor has a plurality of frequency channels of different lengths. The frequency channels may take the form of six electrode channels arranged at predetermined intervals on a single chip. In the upper photograph of FIG. 14, the frequency channels are arranged side by side from left to right in the order ch1 to ch6.

When the side ends at both edges of the plurality of frequency channels are connected continuously, the overall outline forms a curved surface.

Specifically, the slope increases quite gently from ch1 through ch2 to ch3, whereas it increases with a markedly larger value from ch3 through ch4 and ch5 to ch6.

That is, the slope of the curved surface tends to increase gradually from the shortest channel, ch1, to the longest channel, ch6.

The lower graph of FIG. 15 shows that the frequency sensing regions are distributed uniformly overall from the shortest channel, ch1, to the longest channel, ch6.

FIGS. 15 and 16 show the trend obtained by taking the highest peak in each frequency channel as the reference. Here, the vibration signal and the electrical signal agree in almost all frequency channels.

Among the six electrode channels, the frequency channel on one side, which senses low frequencies, is designated the first sensing unit, and the frequency channel on the other side, which senses higher frequencies than the first sensing unit, is designated the second sensing unit.

The length ratio between the second sensing unit, which corresponds to the shortest channel and senses the highest frequency region, and the first sensing unit, which corresponds to the longest channel and senses the lowest frequency region, is in the range of 1:1.5 to 1:6.5.

Preferably, the ratio between the shortest channel and the longest channel is 1:4. This is because at 1:3 the frequency coverage may be biased toward high frequencies, and at 1:5 it may be biased toward low frequencies.

Let a denote the rate at which the electrode channel length decreases in the first sensing unit, and let b denote the rate at which the electrode channel length decreases in the second sensing unit. The ratio b/a is then preferably less than 1.

As a result, the low-frequency sensing performance of the first sensing unit, in which the electrode channel length decreases at rate a, can be further improved.
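The two geometric constraints above (a shortest-to-longest length ratio between 1:1.5 and 1:6.5, and b/a below 1) can be checked for a candidate layout. This is only a sketch under stated assumptions: the channel lengths are hypothetical values in millimetres, the six channels are split evenly into the two sensing units, and the "rate" is interpreted as the mean length decrease per channel step. None of these specifics come from the text.

```python
def length_ratio(lengths):
    """Ratio of the longest channel to the shortest channel."""
    return max(lengths) / min(lengths)


def mean_step_decrease(lengths):
    """Mean decrease in length between neighbouring channels (long to short)."""
    steps = [a - b for a, b in zip(lengths, lengths[1:])]
    return sum(steps) / len(steps)


# Hypothetical channel lengths in mm, ordered from longest (ch6) to shortest (ch1).
lengths_mm = [8.0, 6.0, 4.5, 3.2, 2.5, 2.0]

# Assumption: the three longest channels form the first (low-frequency) sensing
# unit and the three shortest form the second (high-frequency) sensing unit.
first_unit = lengths_mm[:3]
second_unit = lengths_mm[3:]

ratio = length_ratio(lengths_mm)
a = mean_step_decrease(first_unit)
b = mean_step_decrease(second_unit)

print(f"shortest:longest = 1:{ratio:.1f}")
print("ratio in 1:1.5 to 1:6.5:", 1.5 <= ratio <= 6.5)
print(f"b/a = {b / a:.2f}, below 1: {b / a < 1}")
```

For this hypothetical layout the length ratio is 1:4 and b/a is well below 1, so the lengths shrink quickly among the long channels and slowly among the short ones.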

Hereinafter, a conventional voice recognition sensor whose plurality of frequency channels follow a trapezoidal shape is described with reference to FIGS. 17 and 18.

When the side ends at both edges of the frequency channels of different lengths are connected continuously, the overall outline is trapezoidal.

Specifically, the slope is constant from ch1 to ch6.

That is, the channel length tends to increase at a constant rate from the shortest channel, ch1, to the longest channel, ch6.

The lower graph of FIG. 17 shows that, from the longest channel, ch6, to ch2, the sensed frequency regions are laid out without clear boundaries between them. The maximum peak frequency decreases slightly from ch6 to ch4, increases markedly from ch4 to ch3, and then increases slightly from ch3 to ch2. In contrast, the shortest channel, ch1, abruptly senses a markedly higher frequency region, so a frequency region that is not covered appears between ch2 and ch1.

As described above, the conventional trapezoidal sensor has the problem that frequency regions left uncovered by the plurality of frequency channels are clearly apparent.

FIG. 19 shows a simulated comparison between the present invention, which has a curved surface, and the conventional voice sensor, which has a trapezoid shape.

The horizontal axis shows the plurality of frequency channels arranged by distance, and the vertical axis shows the resonant frequency of each separated frequency channel.

The conventional voice sensor, which follows the resonant frequency equation for a trapezoidal shape, can be seen to have a parabolic arrangement of resonant frequencies.

In contrast, the present invention can be seen to separate the resonant frequencies along a nearly straight line by means of the curved surface.
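The contrast between the two layouts can be illustrated with a small numerical sketch. The text does not give the resonant frequency equation, so the code assumes a beam-like scaling in which the resonant frequency is inversely proportional to the square of the channel length (f = k / L²). The constant k, the lengths in arbitrary units, and the function names are all hypothetical.

```python
import math


def resonant_freqs(lengths, k):
    """Resonant frequency of each channel under the assumed scaling f = k / L**2."""
    return [k / length**2 for length in lengths]


def linear_lengths(l_min, l_max, n):
    """Trapezoid-style layout: channel length grows by a constant step."""
    step = (l_max - l_min) / (n - 1)
    return [l_min + i * step for i in range(n)]


def lengths_for_linear_freqs(f_max, f_min, n, k):
    """Curved-style layout: lengths chosen so resonant frequencies are evenly spaced."""
    step = (f_max - f_min) / (n - 1)
    return [math.sqrt(k / (f_max - i * step)) for i in range(n)]


def gaps(freqs):
    """Absolute spacing between neighbouring resonant frequencies."""
    return [abs(a - b) for a, b in zip(freqs, freqs[1:])]


K = 4000.0  # hypothetical constant: a channel of length 1 resonates at 4000 Hz
N = 6       # six channels, ordered from shortest (ch1) to longest (ch6)

trapezoid = linear_lengths(1.0, 4.0, N)
curved = lengths_for_linear_freqs(4000.0, 250.0, N, K)

trapezoid_gaps = gaps(resonant_freqs(trapezoid, K))
curved_gaps = gaps(resonant_freqs(curved, K))

print("trapezoid gaps (Hz):", [round(g) for g in trapezoid_gaps])
print("curved gaps (Hz):   ", [round(g) for g in curved_gaps])
print("curved lengths:     ", [round(length, 2) for length in curved])
```

Under this assumed scaling, the constant-step layout bunches the long channels at low frequencies and leaves its largest gap between the two shortest channels, while the evenly spaced frequencies require lengths that grow slowly at first and steeply toward the longest channel.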

Because the voice sensor according to the present invention is a piezoelectric resonant element, the quality factor is an important characteristic.

The Q (quality factor) at resonance expresses the quality of the frequency selectivity. The difference between the two frequencies on either side of the resonant frequency at which the response is 3 dB down, that is, attenuated to half in power, is called the 3 dB bandwidth, and the Q value is the resonant frequency divided by the 3 dB bandwidth. The sharper the resonance, the narrower the 3 dB bandwidth and the larger the Q value. Conversely, a low Q means a wide band.
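The definition above amounts to Q = f0 / BW, where f0 is the resonant frequency and BW is the 3 dB bandwidth. A minimal sketch follows; the resonant frequencies used are hypothetical example values and the function names are not from the text.

```python
def quality_factor(resonant_hz: float, bandwidth_3db_hz: float) -> float:
    """Q value: resonant frequency divided by the 3 dB bandwidth."""
    return resonant_hz / bandwidth_3db_hz


def bandwidth_for_q(resonant_hz: float, q: float) -> float:
    """3 dB bandwidth implied by a resonant frequency and a Q value."""
    return resonant_hz / q


# Hypothetical channel: 1 kHz resonance with a 50 Hz 3 dB bandwidth.
q_example = quality_factor(1000.0, 50.0)

# Widening the bandwidth at the same resonance lowers Q.
q_wider = quality_factor(1000.0, 100.0)

# Bandwidth needed for Q = 35 at a hypothetical 1750 Hz resonance.
bw_example = bandwidth_for_q(1750.0, 35.0)

print(q_example, q_wider, bw_example)
```

The example shows the trade-off stated in the text: at a fixed resonant frequency, doubling the 3 dB bandwidth halves the Q value.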

In the present invention, the 3 dB bandwidth of each channel is widened to 50 Hz or more so as to lower the Q value, so that the sensor senses evenly over the range of 200 Hz to 4 kHz with higher sensitivity than a conventional MEMS-based microphone. The figure for the conventional MEMS-based microphone refers to the performance of the microphone sensor alone, excluding the effect of the ROIC: -100 dB under white noise from 20 Hz to 20 kHz at about 94 dB SPL. The flexible piezoelectric voice recognition sensor according to the present invention performs above this -100 dB level over 90% or more of the range from 200 Hz to 4 kHz.
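The 90% coverage criterion can be evaluated from sampled response curves. The sketch below is one possible reading, not the patent's measurement procedure: it takes the best response among all channels at each frequency and counts the share of sample points in the 200 Hz to 4 kHz band that exceed -100 dB. The response values are hypothetical, and counting sample points rather than weighting by bandwidth is an assumption.

```python
def envelope(channel_responses):
    """Best (highest) response across channels at each sampled frequency."""
    return [max(values) for values in zip(*channel_responses)]


def coverage_fraction(freqs_hz, responses_db, f_lo, f_hi, threshold_db):
    """Share of sample points within [f_lo, f_hi] whose response exceeds the threshold."""
    in_band = [r for f, r in zip(freqs_hz, responses_db) if f_lo <= f <= f_hi]
    above = [r for r in in_band if r > threshold_db]
    return len(above) / len(in_band)


# Hypothetical sample frequencies (Hz) and per-channel responses (dB).
freqs_hz = [200, 400, 800, 1200, 1600, 2000, 2500, 3000, 3500, 4000]
channel_a = [-80, -85, -95, -105, -110, -112, -115, -118, -120, -122]
channel_b = [-120, -115, -108, -98, -90, -88, -92, -96, -99, -103]

best = envelope([channel_a, channel_b])
fraction = coverage_fraction(freqs_hz, best, 200, 4000, -100.0)

print(f"coverage above -100 dB: {fraction:.0%}")
print("meets 90% criterion:", fraction >= 0.9)
```

In this made-up data set, nine of the ten sample points exceed -100 dB, so the combined channels just meet the 90% criterion even though neither channel does so alone.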

Two quantities indicate performance more effectively than the piezoelectric coefficient: the electromechanical coupling factor K and the electromechanical quality factor Q.

The quality factor comprises the electrical quality factor (Qe) and the mechanical quality factor (Qm).

Qe is the reciprocal of the electrical loss (tan δ), whereas Qm expresses how strongly displacement is concentrated in response to stress, as determined by the mechanical damping of the vibrating body. Piezoelectric materials for resonators require a Qm of 1500 or more and those for filters a Qm of about 400 to 600, while material development for piezoelectric speakers focuses on a Qm of 80 or less together with a high dielectric constant.

The voice sensor according to the present invention may have a total of seven channels as its multiple channels, with Q values between 18 and 28. Based on these results, it may be preferable to keep the Q value at 35 or less.

The mechanical quality factor expresses the ratio of energy stored during the exchange between electrical energy and mechanical energy. It arises from the phase difference with respect to the applied voltage that occurs when permanent dipoles move. The loss is dissipated mostly as heat, and the factor determines the sharpness of the resonance that the piezoelectric body exhibits at its resonant frequency.

When the mechanical quality factor is low, degradation generally sets in quickly. The mechanical quality factor also takes a different value from the electrical quality factor.

FIG. 20 is an image showing how a voice recognition sensor according to an embodiment of the present invention is fabricated.

Referring to FIG. 20, PZT, the piezoelectric material, is laminated using PET onto a PCB in which an empty space is formed. The piezoelectric material forms the plurality of curved frequency channels and is positioned over the empty space. The Au electrode is formed on the upper surface of the PCB in a process that electrically connects it to both ends of the piezoelectric material.

The PZT and the PU adhesive sit on PET, a rectangular transparent plastic substrate, and the electrical energy generated by the PZT is collected through the Au electrode. A passivation layer is additionally deposited over these to protect the device.

The PET, the UV-sensitive PU adhesive, the PZT, and the passivation layer may be made of transparent materials. The Au electrode may appear gold-colored to the naked eye.

The present invention is inspired by mimicking the cochlea, the human auditory organ, to implement voice recognition. Instead of the conventional combination of microphone, ADC, and DSP for frequency separation, it uses a simple circuit based on a flexible piezoelectric voice sensor, which can greatly reduce power consumption. If an efficient recognition algorithm compatible with this sensor is implemented, natural human language can be recognized with high selectivity, sensitivity, detection speed, and stability.

This technology can be applied in everyday life. For example, it allows a vehicle information system to be operated safely by voice alone from an always-on standby state while driving, and it allows TVs, vacuum cleaners, washing machines, air conditioners, and similar appliances to be controlled remotely at low power using only the human voice. In particular, it can support care for people with disabilities and patients who have limited use of their hands and feet, and registering a voice can make facilities such as elevators more convenient to use.

This technology spans IT, NT, BT, and materials, and is a convergent technology that draws inspiration from nature to enrich human life. From a speaker's voice it can determine identity, psychological state, health condition, language ability, and the like with little power in an always-on standby state, which enables personalized services and allows use across all sensor fields, including security, finance, medicine, and education.

In particular, applications in mobile healthcare become possible, for example by detecting voice patterns and then analyzing and storing them as big data to assess emotional state and promote psychological stability through a feedback system. Security systems strengthened by voice authentication and speaker identification are also expected to help protect personal information and privacy.

Through the above features, the present invention can implement an ultra-compact voice sensor system for voice-recognition-based Internet of Things (IoT) and mobile applications.

The voice sensor according to the present invention can be used in smart home appliances, including TVs and refrigerators, as well as in voice assistants and voice security applications.

In the present invention, a voice recognition sensor made of a high-efficiency inorganic piezoelectric material on a flexible substrate uses the piezoelectric material to separate the mechanical vibration energy of the voice by frequency at different positions, before any digital sampling or acoustic signal processing of the human voice spectrum. It then converts the separated vibration into electrical signals and processes the voice signal in parallel for each frequency.

In the present invention, the plurality of frequency separation channels form an artificial cochlea shaped like a xylophone. Because the channels differ in size, high-frequency and low-frequency sounds resonate at different positions, which physically separates the human voice. Each separated sound is amplified by an analog circuit for its frequency, filtered, and then converted into a digital signal for processing. This process consumes far less power than the conventional approach using a combination of microphone, ADC, and DSP.

The present invention provides a piezoelectric voice recognition sensor bonded onto a flexible thin film, which can be used even when attached to clothing or the like. Attached to clothing, it can also be applied as a technology that harvests the physical energy of sound waves and ultrasound readily generated in the surroundings and converts it into electrical energy.

In general, realizing a ubiquitous network that exists everywhere requires a ubiquitous power source that exists and operates everywhere. The power sources of ubiquitous network components must be self-sufficient and require no recharging; that is, they must provide both power generation and power storage.

As described above, the piezoelectric voice recognition sensor according to the present invention uses a plurality of frequency separation channels formed in a trapezoidal shape to separate the sensed voice by frequency across the channels and, at the same time, converts the separated voice signals from mechanical vibration signals into electrical signals through the piezoelectric element so that they can be recognized.

Preferred embodiments of the present invention have been described above, but the present invention is not limited to these specific embodiments. A person of ordinary skill in the art to which the present invention pertains can make numerous changes and modifications to the present invention without departing from the spirit and scope of the appended claims, and all such appropriate changes and modifications, and their equivalents, should be regarded as falling within the scope of the present invention.

Claims

A flexible piezoelectric voice recognition sensor,
wherein the voice recognition sensor has a plurality of frequency channels of different lengths that are spaced apart from one another,
the plurality of frequency channels sense mutually overlapping frequency regions, and
the overall shape formed by continuously connecting the upper and lower ends of the plurality of frequency channels is a curved surface, whereby the sensor has a high-sensitivity response characteristic of -45 dBV or higher at 1 kHz and 94 dB SPL.

The voice recognition sensor according to claim 1,
wherein the plurality of frequency channels sense evenly over the audible frequency range of 200 Hz to 4 kHz and, at the same time, sense voice with higher intensity toward lower frequency bands.

delete

The voice recognition sensor according to claim 1,
wherein the length ratio between the shortest channel, which senses the highest frequency region, and the longest channel, which senses the lowest frequency region, among the plurality of frequency channels is in the range of 1:1.5 to 1:6.5.

The voice recognition sensor according to claim 4,
wherein the slope of the curved surface changes gradually from the shortest channel to the longest channel.

A flexible piezoelectric voice recognition sensor,
wherein the voice recognition sensor has a plurality of frequency channels of different lengths,
the plurality of frequency channels sense mutually overlapping frequency regions,
the voice recognition sensor is of a resonant type, and the quality factor of each channel is 35 or less,
the quality factor is obtained by widening the 3 dB bandwidth of each channel to 50 Hz or more, and
over 90% or more of the 200 Hz to 4 kHz frequency range, the sensor, which uses a piezoelectric material with high-sensitivity characteristics, senses more sensitively and more evenly than a conventional MEMS-based microphone that senses sound pressure through a change in the capacitance of its microphone sensor.

delete

The voice recognition sensor according to claim 6,
wherein the conventional commercially available MEMS-based microphone has a sensitivity of the microphone sensor itself, excluding the voltage gain of the ROIC, of -100 dB in a white-noise environment of 20 Hz to 20 kHz at about 94 dB SPL, whereas the flexible piezoelectric sensor has a sensitivity of -100 dB or more within 90% of the 200 Hz to 4 kHz range.

The voice recognition sensor according to claim 1, comprising:
a flexible thin film;
a piezoelectric material layer stacked on the flexible thin film; and
an electrode stacked on the piezoelectric material layer,
wherein the electrode comprises the plurality of frequency separation channels arranged in a row.

The voice recognition sensor according to claim 9,
further comprising a passivation layer stacked so as to cover the electrode entirely.

A compact voice sensor system for the voice-recognition-based Internet of Things (IoT), comprising the voice recognition sensor according to any one of claims 1, 2, 4 to 6, and 8 to 10.

A smart home appliance comprising the voice sensor system according to claim 11.

A wearable electronic device comprising the voice sensor system according to claim 11.