KR20010007842A

KR20010007842A - Interactive voice and multi-sensory recognition system and method for a toy

Info

Publication number: KR20010007842A
Application number: KR1020000058952A
Authority: KR
Inventors: 남호원
Original assignee: 남호원
Priority date: 2000-10-06
Filing date: 2000-10-06
Publication date: 2001-02-05
Also published as: KR100423788B1

Abstract

PURPOSE: An interactive voice and multi-sensory recognition system for a toy holds children's curiosity and interest by giving varied voice responses to the same question from the user, and supports children's emotional development and language learning through varied voice and motion expressions that mirror children's senses. CONSTITUTION: The method comprises: a first step of extracting learning data from selected voice samples on a computer through a speech recognition training process using an IDMLP (Input Driven Multi Layer Perceptron) neural network algorithm; a second step of storing the learning data and the corresponding voice scenarios in the memory unit (20) through the PC interface unit (12) of the toy; a third step in which the control unit (10) preprocesses an incoming voice signal and then extracts the corresponding learning data using the IDMLP neural network algorithm; a fourth step in which the control unit (10) selects a voice expression and a motion expression from the scenario through the voice and operation synthesis unit (11) and the memory unit (20); a fifth step of sending a control signal so that the selected voice and motion expressions are output through the output unit (40); and a sixth step of outputting the voice and motion expressions according to the control signal of the control unit (10).

Description

Interactive voice and multi-sensory recognition system and method for a toy

The present invention relates to an interactive voice and multi-sensory recognition system for a toy and a method thereof. More specifically, it relates to a toy, such as a doll or other plaything, that embeds a voice recognition and multi-sensory recognition system, implemented in software and hardware, so that the toy responds appropriately to the user's various utterances, actions, and surroundings and can thereby communicate with the user.

In general, toys have long been closely tied to children and have been steadily developed in many forms that satisfy children's curiosity while also taking educational aspects such as creativity and sensibility into account. Early toys were developed mainly around the popular characters of the day in order to satisfy children's curiosity, but demand for educational value has increasingly been reflected, and toys of many kinds are now produced.

In particular, a toy that produces speech and motion suited to the situation in response to a child's various utterances and actions both satisfies the child's curiosity and has a strong educational effect, so most toy manufacturers are developing in this direction. In the current state of the art, a storage medium is embedded in the toy, sounds and voice expressions that appeal to children's curiosity are stored on it, and they are played back when a touch sensor is triggered. However, because of cost burdens and technical limitations, such toys struggle to hold the interest of young children, who are sensitive to trends and fickle.

Recently, to solve these problems, toy makers worldwide have been actively producing toys that satisfy children's curiosity while also taking educational aspects into account. In particular, speech recognition systems are being embedded in toys so that they output speech suited to the situation. This work is still at a rudimentary stage, however: in most cases stored speech is simply played back. In somewhat improved designs, speech following a fixed scenario is output, but the output is still uniform, so after a while the child loses interest and the educational effect weakens.

Some toys such as dolls and robots also embed drive means so that they perform certain motions to arouse children's interest. Because of cost burdens and technical limitations, however, they only repeat simple motions and fall short of satisfying children's curiosity.

Accordingly, there is a need for a toy that can communicate with a child by producing speech and motion suited to the child's various utterances, actions, and surroundings, thereby satisfying the child's interest while also providing an educational function.

The present invention was devised to solve the above problems of the prior art. It uses an IDMLP neural network algorithm to recognize the user's speech received through an input device, randomly selects one voice expression from among the various voice expressions of the voice scenario corresponding to the recognized speech, and outputs it through the corresponding output device. As a result, even when the user asks the same question, the toy can give varied and new answers depending on the situation, which satisfies the child's interest and adds an educational function through varied language learning.

A further object is to embed a tactile sensor, an ultrasonic sensor, and an infrared sensor in the toy so that it senses the user's touch, senses a certain distance to the user, and senses light and darkness, and then produces the corresponding voice and motion expressions. By mirroring a child's own senses in this way, the toy gives the impression of talking with a living creature and helps the child's emotional development.

To achieve these objects, the present invention stores two kinds of data in a storage medium embedded in the toy: learning data extracted through a speech recognition training process that uses the IDMLP neural network algorithm, so that varied speech signals can be recognized without restriction on the user, and scenarios for voice output and motion output that correspond to the learning data. The control unit detects the speech signal input by the user and the signals supplied by the sensors. Under the control signal of the control unit, one of the various voice expressions of the voice scenario stored in the storage medium is selected at random, and the selected voice expression and a predetermined motion expression are output through the corresponding output devices. The invention thus aims to realize an interactive voice and multi-sensory recognition toy that can hold varied conversations with the user.

FIG. 1 is a block diagram of a configuration according to an embodiment of the present invention.

FIG. 2 is a flowchart of the speech recognition training process according to the present invention.

FIG. 3 is a flowchart of the speech recognition process according to the present invention.

FIG. 4 is an example of the table of learning data stored in the memory unit according to the present invention.

FIG. 5 is an example of the table of voice scenarios stored in the memory unit according to the present invention.

* Description of reference numerals for the main parts of the drawings *

1: PC
10: Control unit
11: Voice and operation synthesis unit
12: PC interface unit
13: Power supply unit
20: Memory unit
21: First memory unit
22: Second memory unit
23: Third memory unit
30: Input unit
31: Microphone
32: Tactile sensor
33: Infrared sensor
34: Ultrasonic sensor
35: Filter unit
36: First amplifier unit
40: Output unit
41: Second amplifier unit
42: Drive unit
43: Speaker
44: LED

The configuration of the present invention is described in detail below with reference to the accompanying drawings. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or practice of those working in the field, so their definitions should be based on the content of this specification as a whole.

The present invention uses a speech recognition technique based on an IDMLP (Input Driven Multi Layer Perceptron) neural network to provide speaker-independent speech recognition with no restriction on the user, so that appropriate voice and motion expressions can be produced according to a scenario suited to the situation. It is also fitted with a tactile sensor, a distance sensor, and a light sensor so that it detects the user's various actions and responds appropriately.

The microprocessor (10) recognizes the various speech signals input through the microphone (31) using speech recognition based on the IDMLP algorithm, identifies the corresponding learning data that was produced by the learning process and stored in the first memory unit (21), and sends a control signal to the voice and operation synthesis unit (11). The voice and operation synthesis unit (11) selects the matching scenario from the voice output scenarios stored in the second memory unit (22), randomly picks one of that scenario's voice expression data items, and outputs it through the speaker (43). Implemented this way in software and hardware, the toy converses with the user. Because the process is repeated for every utterance, a toy such as a doll built according to the invention can answer differently each time even when the user asks the same question, which arouses the child's interest and adds an educational function.

In addition, a tactile sensor (32), a light sensor (33), and a distance sensor (34) are connected to the microprocessor (10). Signals that detect user actions such as touching or approaching the toy, and signals that detect light and darkness, are read by the microprocessor (10) through an ADC (analog-to-digital converter), and the microprocessor then sends a control signal to the voice and operation synthesis unit (11). The voice and operation synthesis unit (11) extracts the corresponding motion expression data stored in the second memory unit (22) and has the drive unit (42) perform the motion. Implemented this way in software and hardware, the toy responds to the user's various actions in a way that suits the situation, which both arouses the child's interest and serves an educational function.

For a toy such as a doll according to the present invention to respond appropriately to the user's various utterances and actions, it must first go through a training process in which it memorizes the features of the predictable utterances it may hear in each scenario so that it can recognize them. In this training process, the voices of about 30 children are recorded on a general-purpose PC (1) for given situations and identical questions. The recordings then pass through preprocessing, which is a normalization process based on the signal difference accumulation method and interval detection. The IDMLP neural network algorithm classifies each utterance by phoneme, assigns a weight to each phoneme, and extracts the corresponding data. This training is repeated until a predetermined result is obtained, and the result is stored in the memory unit (20) through the PC interface unit (12) of the invention. When the speech signals of various children are then input through the microphone (31), the microprocessor (10) can look up the corresponding data and recognize them.

The recognition process of a toy whose learning has been completed through the above training is as follows. When the user's voice is delivered to the microprocessor (10) through the microphone (31), the microprocessor (10) receives the speech signal and applies the preprocessing, that is, normalization based on the signal difference accumulation method and interval detection. The IDMLP neural network algorithm then extracts the learning data corresponding to the input speech from the memory unit (20). To select the voice and motion scenario corresponding to the recognized learning data, the control unit (10) sends a control signal to the voice and operation synthesis unit (11), which selects the scenario. One of the scenario's various voice and motion expressions is then chosen at random; the voice expression is output through the speaker (43), and the motion expression is carried out through the drive unit (42).

In addition, the sensors detect the user's touch, the distance to the user, and the amount of light, and send the corresponding signals to the microprocessor (10). The microprocessor (10) determines whether each signal is ON or OFF and, through the voice and operation synthesis unit (11), outputs the corresponding voice expression through the speaker (43) and the corresponding motion expression through the drive unit (42).
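The sensor path described above can be sketched in Python as a lookup from ON/OFF flags to responses. This is only an illustration under stated assumptions: the sensor names, the `Response` type, and every phrase and motion in the table are hypothetical, since the patent does not list the actual sensor responses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    speech: str  # voice expression intended for the speaker (43)
    motion: str  # motion expression intended for the drive unit (42)


# Hypothetical response table; the patent does not specify its contents.
SENSOR_RESPONSES = {
    "tactile": Response("That tickles!", "wiggle"),
    "ultrasonic": Response("Come a little closer.", "wave"),
    "infrared": Response("It got dark in here.", "close_eyes"),
}


def handle_sensors(states):
    """Return one response for every known sensor that reports ON (True)."""
    return [
        SENSOR_RESPONSES[name]
        for name, is_on in states.items()
        if is_on and name in SENSOR_RESPONSES
    ]


if __name__ == "__main__":
    for response in handle_sensors({"tactile": True, "infrared": False}):
        print(response.speech, "/", response.motion)
```

Keeping the responses in a table rather than in branching code mirrors the description, where the expressions live in the second memory unit (22) and the control unit only decides which entry to fetch.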

In this way the toy is implemented in software and hardware so that it produces voice and motion expressions suited to the situation in response to the user's various utterances and actions. It is aimed at preschool children and is designed to hold their interest while adding an educational element that develops a range of language-learning abilities.

The interactive voice and multi-sensory recognition system for a toy according to an embodiment of the present invention, and the method thereof, are described in detail below with reference to the drawings.

FIG. 1 is a block diagram of a configuration according to an embodiment of the present invention, FIG. 2 is a flowchart of the speech recognition training process according to the present invention, FIG. 3 is a flowchart of the speech recognition process according to the present invention, FIG. 4 is an example of the table of learning data stored in the memory unit according to the present invention, and FIG. 5 is an example of the table of voice scenarios stored in the memory unit according to the present invention.

FIG. 1 is a block diagram of a configuration according to an embodiment of the present invention.

As shown, the present invention comprises an input unit (30), a control unit (10), a memory unit (20), a voice and operation synthesis unit (11), an output unit (40), a power supply unit (13), and a PC interface unit (12). The input unit (30) consists of a microphone (31), a tactile sensor (32), an ultrasonic sensor (34), an infrared sensor (33), a filter unit (35), and a first amplifier unit (36). The memory unit (20) consists of a first memory unit (21), a second memory unit (22), and a third memory unit (23). The output unit (40) consists of a second amplifier unit (41), a drive unit (42), a speaker (43), and an LED (light-emitting diode) (44).

The control unit (10) uses an Intel 80C196KC, a single-chip microprocessor, to control the system. It converts the analog signal received through the input unit (30) into a digital signal and processes it into a form that can be compared with the voice data extracted by the learning process, so that the user's speech can be recognized. It decodes the voice data stored in compressed form in the first memory unit (21) and compares it with the input voice data in the third memory unit (23) to find the matching voice data. It also receives the signals from the sensors and determines whether each is ON or OFF. Finally, it sends control signals to the surrounding components, such as the memory unit (20), the voice and operation synthesis unit (11), the input unit (30), the output unit (40), and the power supply unit (13), so that the control unit (10) governs them centrally.

The memory unit (20) consists of the first memory unit (21), the second memory unit (22), and the third memory unit (23). The first memory unit (21) is a 32 KB EPROM. It stores the operating software that runs the control unit (10), and it stores, in compressed form, the learning data obtained through the learning process using the IDMLP neural network algorithm. When a speech signal arrives at the control unit (10), the matching voice data is extracted by reference to the learning data in the first memory unit (21).

The second memory unit (22) is a 256 KB EPROM. It stores, in compressed form, the voice scenario data for each situation so that the toy can produce the corresponding voice and motion expressions. The control unit (10) receives either a speech signal through the microphone (31) or an ON/OFF signal from a sensor. For a speech signal, the matching voice data is extracted by reference to the first memory unit (21), and the control unit (10) randomly selects one item from the corresponding voice scenario data stored in the second memory unit (22) and outputs it through the voice and operation synthesis unit (11). For an ON/OFF signal from a sensor, the control unit (10) selects the predetermined voice and motion expressions stored in the second memory unit (22) and outputs them through the voice and operation synthesis unit (11).

The third memory unit (23) is a 32 KB RAM that handles internal data signal processing. It serves as the working space for recognizing the speech signal input through the microphone (31).

The voice and operation synthesis unit (11) receives a control signal from the control unit (10), selects the voice and motion expression scenario stored in the second memory unit (22), extracts from that scenario the voice and motion expressions chosen by the control unit (10), and sends them to the speaker (43) and the drive unit (42) so that the corresponding speech and motion are output.

The input unit (30) consists of the microphone (31), the tactile sensor (32), the infrared sensor (33), the ultrasonic sensor (34), the filter unit (35), and the first amplifier unit (36). The microphone (31) receives the user's voice, converts the sound pressure (acoustic energy) into electrical energy, and supplies an analog speech signal to the control unit (10). The tactile sensor (32) senses the touch when the user handles the toy and supplies an ON/OFF signal to the control unit (10). The ultrasonic sensor (34) senses whether the user is within a certain distance and supplies an ON/OFF signal to the control unit (10). The infrared sensor (33) senses light and darkness when the surroundings darken or brighten and supplies an ON/OFF signal to the control unit (10). The filter unit (35) uses a low-pass filter (LPF) to pass only the low-frequency part of the signals coming from the microphone (31) and the sensors. This filters out noise mixed with the speech signal, so that the control unit (10) can recognize it more easily, and removes noise caused by the user's contact or by ambient sound. The first amplifier unit (36) uses an operational amplifier (op-amp) to amplify the weak analog signal supplied by the filter unit (35) to a level that the control unit (10) can recognize and work with.
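The filter and amplifier stages are analog circuits in the patent, but their effect can be illustrated with a digital stand-in in Python. The sketch below assumes a single-pole low-pass filter and a clipping gain stage; the smoothing factor `alpha`, the `gain`, and the clipping `limit` are illustrative values, not figures from the patent.

```python
def low_pass(samples, alpha=0.2):
    """Single-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    A smaller alpha gives a lower cutoff and stronger smoothing.
    """
    output = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        output.append(y)
    return output


def amplify(samples, gain=10.0, limit=1.0):
    """Scale each sample by gain and clip to [-limit, +limit]."""
    return [max(-limit, min(limit, gain * s)) for s in samples]


if __name__ == "__main__":
    # Rapidly alternating input stands in for high-frequency noise.
    noisy = [0.05, -0.05] * 20
    print(amplify(low_pass(noisy))[-4:])
```

A rapidly alternating input is strongly attenuated while a steady input passes almost unchanged, which is the behavior the description attributes to the filter unit (35) before the first amplifier unit (36) raises the signal level.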

The output unit (40) consists of the second amplifier unit (41), the drive unit (42), the speaker (43), and the LED (light-emitting diode) (44). Like the first amplifier unit (36) of the input unit (30), the second amplifier unit (41) uses an operational amplifier (op-amp). It amplifies the weak analog voice and motion signals supplied through the voice and operation synthesis unit (11) to a level at which the drive unit (42), the speaker (43), and the LED (44) can operate. The speaker (43) converts the electrical signal carrying the voice expression data, supplied after the speech recognition process in the control unit (10), into sound vibrations and outputs it as speech the user can hear. The LED (44) emits light at regular intervals under the control signal of the control unit (10). The drive unit (42) consists of various actuators which, on receiving the control signal of the control unit (10), perform the motions that the signal specifies.

The PC interface unit (12) is a device with input and output ports. It gives the toy access to the general-purpose PC that carried out the learning process and produced the learning data, and when that data is being received it prevents bus contention with the signals sent by the control unit (10) so that the transfer proceeds smoothly.

The power supply unit (13) holds a fluctuating input voltage at the level the toy needs to operate, so that each circuit and component receives a steady supply of power. Given the nature of the invention, it is reasonable to power the toy from batteries so that it is easy to carry.

FIG. 2 is a flowchart of the speech recognition training process according to the present invention.

For a toy according to the present invention to recognize the voices of various users, learning data must be generated by assigning a weight to each phoneme extracted through a training process. This series of learning steps is carried out on a general-purpose PC (1).

To make speaker-independent speech recognition possible, the voices of about 30 local children are typically sampled (100) and recorded through an input device. Each speech signal then passes through preprocessing, which is normalization based on the signal difference accumulation method and interval detection (110). The IDMLP neural network algorithm is used to classify each phoneme of the speech and extract the feature points of each phoneme (120), weights are assigned (130), and the result is stored in a storage medium (140). This series of training steps is repeated (150) to extract common result values that allow the voices of various users to be recognized.
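The patent names the preprocessing steps but does not define them. The Python sketch below shows one plausible reading, stated as an assumption: "signal difference accumulation" is taken to mean summing the absolute differences between neighboring samples within each frame, "interval detection" is taken to mean keeping the span of frames whose accumulated difference exceeds a threshold, and normalization is taken to mean peak scaling. The frame length and threshold are illustrative.

```python
def frame_activity(samples, frame_len):
    """Accumulate absolute sample-to-sample differences for each full frame."""
    activity = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        activity.append(sum(abs(b - a) for a, b in zip(frame, frame[1:])))
    return activity


def detect_interval(samples, frame_len=4, threshold=0.5):
    """Return (start, end) sample indices of the active span, or None."""
    activity = frame_activity(samples, frame_len)
    active = [i for i, value in enumerate(activity) if value > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


def normalize(samples):
    """Scale samples so that the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    return [s / peak for s in samples] if peak > 0 else list(samples)


if __name__ == "__main__":
    signal = [0.0] * 8 + [0.5, -0.5, 0.5, -0.5] * 2 + [0.0] * 8
    start, end = detect_interval(signal)
    print(start, end, normalize(signal[start:end]))
```

Trimming silence before classification keeps the network input focused on the utterance itself, and peak scaling reduces the effect of how loudly each child speaks.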

Once the result values for speech recognition have been extracted on the general-purpose PC (1) by the above process, the data is stored in the first memory unit (21) through the PC interface unit (12) of the system according to the present invention, which makes speech recognition possible.

FIG. 3 is a flowchart of the speech recognition process according to the present invention.

After the learning data has been stored in the first memory unit (21) through the training process for speech recognition shown in FIG. 2, recognition proceeds as follows. When the user's speech is input (200) through the microphone (31) of the input unit (30) of the toy, the filter unit (35) filters out noise (210), and the first amplifier unit (36) amplifies the noise-filtered speech signal to a level that the control unit (10) can process easily (220) and passes it to the control unit (10).

The control unit (10) converts the analog speech signal received through the first amplifier unit (36) into a digital speech signal and then, in order to recognize it, applies the preprocessing, that is, normalization based on the signal difference accumulation method and interval detection (230). Using the IDMLP neural network algorithm, the preprocessed speech signal is compared against the learning data stored in the first memory unit (21) (240), and the matching learning data is extracted (250). The voice and motion scenario corresponding to that learning data is then loaded, through the voice and operation synthesis unit (11), from the scenario data stored in compressed form in the second memory unit (22). From this scenario data the control unit (10) randomly selects one voice expression data item (260) and selects the corresponding motion expression data. It then outputs control signals to the relevant devices of the output unit (40), and each device produces the voice or motion expression that its control signal specifies (270).
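The comparison against stored learning data in steps (240) and (250) can be pictured as a forward pass through a trained network followed by picking the best-scoring entry. The patent does not describe the internals of IDMLP, so the Python sketch below swaps in an ordinary two-layer perceptron; the weights, the feature vector, and the labels are all hypothetical placeholders rather than anything taken from the patent.

```python
import math


def mlp_forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a plain two-layer perceptron with tanh hidden units."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + bias
        for row, bias in zip(w_out, b_out)
    ]


def recognize(features, weights, labels):
    """Return the label whose output unit has the highest score."""
    scores = mlp_forward(features, *weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


if __name__ == "__main__":
    # Hypothetical stand-in for the learning data held in the first memory unit (21).
    weights = (
        [[1.0, 0.0], [0.0, 1.0]],    # hidden-layer weights
        [0.0, 0.0],                  # hidden-layer biases
        [[1.0, -1.0], [-1.0, 1.0]],  # output-layer weights
        [0.0, 0.0],                  # output-layer biases
    )
    labels = ["what is your name", "how old are you"]
    print(recognize([1.0, 0.0], weights, labels))
```

In this picture the trained weights play the role of the learning data, and the returned label is the key used to look up the matching scenario in the second memory unit (22).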

For example, when a user says 'What is your name?' to a toy implemented according to the present invention, the control unit (10) recognizes the utterance through the process shown in FIG. 3. As shown in FIG. 4, the control unit (10) compares and searches the learning data stored in the first memory unit (21) and extracts the learning data for 'What is your name?'. Through the voice and operation synthesis unit (11), it then extracts the corresponding voice scenario from the voice scenario table shown in FIG. 5, which is stored in the second memory unit (22). From the various voice expressions in that scenario, such as 'My name is Saojeong', 'You tell me first', ... 'Why do you want to know?', the control unit (10) randomly selects one, designates a motion expression for it, and supplies control signals to the corresponding devices of the output unit (40) so that the selected voice expression and motion expression are produced.
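The random selection in this example can be sketched as a lookup in a scenario table followed by a random choice. This is only an illustrative model: the table layout, the motion names, and the function `choose_response` are hypothetical, and the actual table of FIG. 5 and its compressed storage format are not reproduced here.

```python
import random

# Hypothetical scenario table: recognized phrase -> list of
# (voice expression, motion expression) pairs. The motion names are made up.
SCENARIOS = {
    "what is your name": [
        ("My name is Saojeong", "wave_arm"),
        ("You tell me first", "tilt_head"),
        ("Why do you want to know?", "blink_led"),
    ],
}


def choose_response(recognized, scenarios=SCENARIOS, rng=random):
    """Pick one (voice, motion) pair at random for the recognized phrase.

    Returns None when no scenario is stored for the phrase.
    """
    options = scenarios.get(recognized)
    if not options:
        return None
    return rng.choice(options)


# The same question can yield a different answer on each call.
reply = choose_response("what is your name")
```

Storing each voice expression together with its motion expression keeps the two outputs in step, which matches the description of designating a motion expression for the selected voice expression.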

As described above, a toy implemented according to the present invention, having undergone the speech recognition training process using the IDMLP neural network algorithm, is implemented in software and hardware so that it recognizes the user's various utterances and can give different voice responses even to the same question. It is therefore a useful toy with an added educational function, since it satisfies children's interest and supports language learning through its varied voice expressions.

The present invention described above is not limited to the foregoing embodiments and the accompanying drawings, since a person of ordinary skill in the art to which the invention pertains can make various substitutions, modifications, and changes without departing from its technical spirit.

The invention recognizes the user's voice using the IDMLP neural network algorithm, arbitrarily selects one voice expression from among the various voice expressions of the voice scenario corresponding to the recognized voice, and outputs it through a predetermined output device. Through various sensors it also senses the user's touch, detects the user approaching within a certain distance, and senses light and darkness, and it produces the corresponding voice and motion expressions. Because the same question from the user is always met with varied and new voice expressions, children's curiosity and interest can be satisfied. In addition, the varied voice and motion expressions, which directly reflect children's sensibilities, have the advantage of fostering children's emotional development and improving their language learning.

Claims

In an interactive voice and multi-sensory recognition method for a toy,

the method comprising: a first step of extracting predetermined learning data from selected speech samples through a speech recognition training process using an IDMLP neural network algorithm on a predetermined computer; a second step of storing the learning data extracted in the first step, together with voice scenarios corresponding to the learning data, in a predetermined memory unit through a PC interface unit of the toy; a third step in which, when a predetermined voice signal is input through an input unit after the second step, a control unit performs preprocessing and then extracts the learning data corresponding to the input voice signal using the IDMLP neural network algorithm; a fourth step in which the control unit arbitrarily selects, through a voice and operation synthesis unit and the predetermined memory unit, a predetermined voice expression and motion expression from the voice output and motion expression scenario corresponding to the learning data; a fifth step of sending a control signal to the corresponding device of an output unit so that the voice expression and motion expression selected in the fourth step are output; and a sixth step in which the corresponding device of the output unit outputs the predetermined voice expression and motion expression in response to the control signal of the control unit.

The method of claim 1,

further comprising: a first step in which, when a predetermined signal is input through a predetermined sensor of the input unit after the second step, the control unit recognizes the signal and arbitrarily selects, through the voice and operation synthesis unit, a predetermined voice expression and motion expression from among the voice expression and motion expression scenarios stored in the predetermined memory unit; a second step of sending a control signal to the corresponding device of the output unit so that the voice expression and motion expression selected in the first step are output; and a third step in which the corresponding device of the output unit outputs the predetermined voice expression and motion expression according to the control signal of the control unit.

An interactive voice and multi-sensory recognition system for a toy, comprising: a control unit that recognizes a voice signal supplied through an input unit using an IDMLP neural network algorithm, recognizes a signal supplied through a predetermined sensor, and controls each device so that a predetermined voice expression and motion expression are output; a memory unit comprising a first memory unit that stores the operating software of the control unit and the learning data obtained through a learning process using the IDMLP neural network algorithm, a second memory unit that stores voice expression and motion expression scenarios, and a third memory unit responsible for internal data signal processing; a voice and operation synthesis unit that, in conjunction with the control unit, supplies predetermined voice expression and motion expression data stored in the second memory unit to an output unit; the input unit, comprising a microphone that receives the user's voice and converts it into an electrical analog voice signal, a tactile sensor that senses touch and supplies a predetermined signal to the control unit, an ultrasonic sensor that senses a predetermined distance and supplies a predetermined signal to the control unit, an infrared sensor that senses light and darkness and supplies a predetermined signal to the control unit, a filter unit that removes noise and ambient sound from the user's voice signal, and a first amplifier unit that amplifies the analog signal to a predetermined magnitude so that the control unit can recognize and process it; the output unit, comprising a second amplifier unit that amplifies the weak analog signal supplied through the voice and operation synthesis unit to a predetermined magnitude so as to drive the corresponding device of the output unit, a speaker that converts an electrical signal carrying predetermined voice expression data into sound the user can perceive and outputs it, an LED that emits light at predetermined intervals according to a control signal of the control unit, and a drive unit that outputs a motion corresponding to predetermined motion expression data; a PC interface unit that allows easy connection to a computer that performs a predetermined learning process and extracts the learning data; and a power supply unit that supplies constant power to each circuit and element while maintaining a predetermined voltage.

The system or method according to any one of claims 1, 2, or 3,

wherein the sensor installed at a predetermined portion of the toy is a tactile sensor that senses the user's touch so that a predetermined voice expression and motion expression are output.

The system or method according to any one of claims 1, 2, or 3,

wherein the sensor installed at a predetermined portion of the toy is an ultrasonic sensor that detects the user approaching within a certain distance so that a predetermined voice expression and motion expression are output.

The system or method according to any one of claims 1, 2, or 3,

wherein the sensor installed at a predetermined portion of the toy is an infrared sensor that senses light and darkness so that a predetermined voice expression and motion expression are output.