WO2019172735A3 - Communication robot and method for operating the same - Google Patents
Communication robot and method for operating the same
- Publication number
- WO2019172735A3 (application PCT/KR2019/007989)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- communication robot
- robot
- driving method
- communication
- user
- Prior art date
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/008—Manipulators for service tasks
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J13/00—Controls for manipulators
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J13/00—Controls for manipulators
- B25J13/003—Controls for manipulators by means of an audio-responsive input
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J13/00—Controls for manipulators
- B25J13/08—Controls for manipulators by means of sensing devices, e.g. viewing or touching devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1679—Programme controls characterised by the tasks executed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/005—Language recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Manipulator (AREA)
Abstract
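Disclosed are a communication robot, and a method for operating the same, that performs speech recognition by executing an on-board artificial intelligence (AI) algorithm and/or machine learning algorithm and that can communicate with other electronic devices and an external server in a 5G communication environment. A method for operating a communication robot according to one embodiment of the present invention may include receiving a speech utterance spoken by a user who has approached within a preset distance of the communication robot, and selecting, from among a plurality of automatic speech recognition (ASR) modules, one ASR module capable of processing the speech utterance as the optimized ASR module. According to the present invention, user satisfaction with the communication robot can be improved by reducing the inconvenience of the user having to manually set a first language as a preprocessing step in order to receive service from the communication robot.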
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2019/007989 WO2019172735A2 (ko) | 2019-07-02 | 2019-07-02 | Communication robot and method for operating the same |
KR1020190085392A KR20190090745A (ko) | 2019-07-02 | 2019-07-15 | Communication robot and method for operating the same |
US16/569,233 US11437042B2 (en) | 2019-07-02 | 2019-09-12 | Communication robot and method for operating the same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2019/007989 WO2019172735A2 (ko) | 2019-07-02 | 2019-07-02 | Communication robot and method for operating the same |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2019172735A2 (ko) | 2019-09-12 |
WO2019172735A3 (ko) | 2020-05-14 |
Family
ID=67613972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2019/007989 WO2019172735A2 (ko) | Communication robot and method for operating the same | 2019-07-02 | 2019-07-02 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11437042B2 (ko) |
KR (1) | KR20190090745A (ko) |
WO (1) | WO2019172735A2 (ko) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11322136B2 (en) * | 2019-01-09 | 2022-05-03 | Samsung Electronics Co., Ltd. | System and method for multi-spoken language detection |
US11947644B2 (en) | 2019-10-08 | 2024-04-02 | UiPath, Inc. | Facial recognition framework using deep learning for attended robots |
US11495210B2 (en) * | 2019-10-18 | 2022-11-08 | Microsoft Technology Licensing, Llc | Acoustic based speech analysis using deep learning models |
US11687778B2 (en) | 2020-01-06 | 2023-06-27 | The Research Foundation For The State University Of New York | Fakecatcher: detection of synthetic portrait videos using biological signals |
CN113873205B (zh) * | 2021-10-18 | 2023-12-22 | 中国联合网络通信集团有限公司 | Robot monitoring system and method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005122128A (ja) * | 2003-09-25 | 2005-05-12 | Fuji Photo Film Co Ltd | Speech recognition system and program |
KR20110100620A (ko) * | 2008-11-10 | 2011-09-14 | Google Inc. | Multi-sensor speech detection |
KR101893768B1 (ko) * | 2017-02-27 | 2018-09-04 | VTouch Co., Ltd. | Method, system, and non-transitory computer-readable recording medium for providing a speech recognition trigger |
KR20180130315A (ko) * | 2017-05-29 | 2018-12-07 | LG Electronics Inc. | Home appliance and method for operating the same |
KR20190001434A (ko) * | 2017-06-27 | 2019-01-04 | Samsung Electronics Co., Ltd. | System and electronic device for selecting a speech recognition model |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6061646A (en) * | 1997-12-18 | 2000-05-09 | International Business Machines Corp. | Kiosk for multiple spoken languages |
KR100847152B1 (ko) | 2006-12-28 | 2008-07-18 | Yujin Robot Co., Ltd. | Robot guide system |
KR100904191B1 (ko) | 2008-05-29 | 2009-06-22 | Dasa Robot Co., Ltd. | Guide robot |
JP6705410B2 (ja) * | 2017-03-27 | 2020-06-03 | Casio Computer Co., Ltd. | Speech recognition device, speech recognition method, program, and robot |
US20180357998A1 (en) * | 2017-06-13 | 2018-12-13 | Intel IP Corporation | Wake-on-voice keyword detection with integrated language identification |
WO2019161193A2 (en) * | 2018-02-15 | 2019-08-22 | DMAI, Inc. | System and method for adaptive detection of spoken language via multiple speech models |
US11170761B2 (en) * | 2018-12-04 | 2021-11-09 | Sorenson Ip Holdings, Llc | Training of speech recognition systems |
2019
- 2019-07-02 WO PCT/KR2019/007989 (WO2019172735A2, ko): active, Application Filing
- 2019-07-15 KR KR1020190085392A (KR20190090745A, ko): not active, Application Discontinuation
- 2019-09-12 US US16/569,233 (US11437042B2, en): active
Also Published As
Publication number | Publication date |
---|---|
KR20190090745A (ko) | 2019-08-02 |
WO2019172735A2 (ko) | 2019-09-12 |
US20200005794A1 (en) | 2020-01-02 |
US11437042B2 (en) | 2022-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019172735A3 (ko) | Communication robot and method for operating the same | |
EP3023979B1 (en) | Method and system for recognizing speech using wildcards in an expected response | |
US10380992B2 (en) | Natural language generation based on user speech style | |
JP6873333B2 (ja) | Speech recognition system and method of using a speech recognition system | |
US8612235B2 (en) | Method and system for considering information about an expected response when performing speech recognition | |
EP0943139B1 (en) | A method and device for activating a voice-controlled function in a multi-station network through using both speaker-dependent and speaker-independent speech recognition | |
US9202465B2 (en) | Speech recognition dependent on text message content | |
EP4235646A3 (en) | Adaptive audio enhancement for multichannel speech recognition | |
Renals et al. | Neural networks for distant speech recognition | |
WO2002069320A3 (en) | Spoken language interface | |
US20150056951A1 (en) | Vehicle telematics unit and method of operating the same | |
CN105609101B (zh) | Speech recognition system and speech recognition method | |
CN107093427A (zh) | Automatic speech recognition of disfluent speech | |
EP0720149A1 (en) | Speech recognition bias equalisation method and apparatus | |
CN110347863A (zh) | Script recommendation method and apparatus, and storage medium | |
US20160019884A1 (en) | Methods and apparatus for training a transformation component | |
CN102543077A (zh) | Male acoustic model adaptation based on language-independent female speech data | |
CN109785827A (zh) | Neural network for use in speech recognition arbitration | |
Kim et al. | End-to-End Speech Recognition with Auditory Attention for Multi-Microphone Distance Speech Recognition. | |
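CN109949803B (zh) | Method and system for controlling building service facilities based on intelligent recognition of semantic instructions | |
KR20140067687A (ko) | Vehicle system capable of interactive speech recognition | |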
US20170018273A1 (en) | Real-time adaptation of in-vehicle speech recognition systems | |
CA2393999A1 (en) | Method for the voice-operated identification of the user of a telecommunication line in a telecommunications network in the course of a dialog with a voice-operated dialog system | |
US20030040915A1 (en) | Method for the voice-controlled initiation of actions by means of a limited circle of users, whereby said actions can be carried out in appliance | |
EP3232436A3 (en) | Application services interface to asr |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19763729; Country of ref document: EP; Kind code of ref document: A2 |