KR102393147B1 - Modification of visual content to facilitate improved speech recognition - Google Patents

Modification of visual content to facilitate improved speech recognition

Info

Publication number
KR102393147B1
Authority
KR
South Korea
Prior art keywords
visual content
visual
user
display
layout
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR1020167037034A
Other languages
English (en)
Korean (ko)
Other versions
KR20170016399A (ko)
Inventor
Andreas Stolcke
Geoffrey Zweig
Malcolm Slaney
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing, LLC
Publication of KR20170016399A
Application granted
Publication of KR102393147B1
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/106 Display of layout of documents; Previewing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/174 Form filling; Merging
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/183 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a single remote source

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Rehabilitation Tools (AREA)
  • Eye Examination Apparatus (AREA)
  • Controls And Circuits For Display Device (AREA)
  • Road Signs Or Road Markings (AREA)
  • Digital Computer Display Output (AREA)
KR1020167037034A 2014-06-06 2015-06-03 Modification of visual content to facilitate improved speech recognition Active KR102393147B1 (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/297,742 US9583105B2 (en) 2014-06-06 2014-06-06 Modification of visual content to facilitate improved speech recognition
US14/297,742 2014-06-06
PCT/US2015/033865 WO2015187756A2 (en) 2014-06-06 2015-06-03 Modification of visual content to facilitate improved speech recognition

Publications (2)

Publication Number Publication Date
KR20170016399A (ko) 2017-02-13
KR102393147B1 (ko) 2022-04-29

Family

ID=54540159

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020167037034A Active KR102393147B1 (ko) 2014-06-06 2015-06-03 Modification of visual content to facilitate improved speech recognition

Country Status (11)

Country Link
US (1) US9583105B2
EP (1) EP3152754B1
JP (1) JP6545716B2
KR (1) KR102393147B1
CN (1) CN106463119B
AU (1) AU2015271726B2
BR (1) BR112016026904B1
CA (1) CA2948523C
MX (1) MX361307B
RU (1) RU2684475C2
WO (1) WO2015187756A2

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9613267B2 (en) * 2012-05-31 2017-04-04 Xerox Corporation Method and system of extracting label:value data from a document
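KR102342117B1 (ko) * 2015-03-13 2021-12-21 LG Electronics Inc. Terminal and home appliance system including the same
WO2017183943A1 (ko) * 2016-04-21 2017-10-26 VisualCamp Co., Ltd. Display device, and input processing method and system using the same
KR101904889B1 (ko) 2016-04-21 2018-10-05 VisualCamp Co., Ltd. Display device, and input processing method and system using the same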
SG11201908535XA (en) * 2017-03-17 2019-10-30 Uilicious Private Ltd Systems, methods and computer readable media for ambiguity resolution in instruction statement interpretation
US10142686B2 (en) * 2017-03-30 2018-11-27 Rovi Guides, Inc. System and methods for disambiguating an ambiguous entity in a search query based on the gaze of a user
AU2019223427A1 (en) * 2018-02-26 2019-11-14 Nintex Pty Ltd Method and system for chatbot-enabled web forms and workflows
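CN109445757B (zh) * 2018-09-21 2022-07-29 深圳变设龙信息科技有限公司 New design drawing generation method, apparatus, and terminal device
JP7414231B2 (ja) * 2019-07-11 2024-01-16 Chubu Electric Power Co., Inc. Multimodal speech recognition device and multimodal speech recognition method
KR102909001B1 (ko) * 2020-04-29 2026-01-08 Hyundai Motor Company Vehicle speech recognition method and apparatus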
WO2022067342A2 (en) 2020-09-25 2022-03-31 Apple Inc. Methods for interacting with virtual controls and/or an affordance for moving virtual objects in virtual environments
CN119621211A (zh) * 2022-01-03 2025-03-14 Apple Inc. Devices, methods, and graphical user interfaces for navigating and inputting or revising content
EP4463754A1 (en) 2022-01-12 2024-11-20 Apple Inc. Methods for displaying, selecting and moving objects and containers in an environment
US12541280B2 (en) 2022-02-28 2026-02-03 Apple Inc. System and method of three-dimensional placement and refinement in multi-user communication sessions
CN119404170A (zh) 2022-04-20 2025-02-07 Apple Inc. Obscured objects in a three-dimensional environment
US20240233288A1 (en) 2022-09-24 2024-07-11 Apple Inc. Methods for controlling and interacting with a three-dimensional environment
WO2024064950A1 (en) 2022-09-24 2024-03-28 Apple Inc. Methods for time of day adjustments for environments and environment presentation during communication sessions
WO2024254096A1 (en) 2023-06-04 2024-12-12 Apple Inc. Methods for managing overlapping windows and applying visual effects
US12386418B2 (en) * 2023-09-08 2025-08-12 Huawei Technologies Co., Ltd. Gaze assisted input for an electronic device
WO2025210829A1 (ja) * 2024-04-04 2025-10-09 Maxell, Ltd. Information terminal, operation input device, and method for operating an information terminal
US20250377719A1 (en) 2024-06-09 2025-12-11 Apple Inc. Methods of interacting with content in a virtual environment

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3530591B2 (ja) * 1994-09-14 2004-05-24 Canon Inc. Speech recognition apparatus, information processing apparatus using the same, and methods therefor
US6629074B1 (en) * 1997-08-14 2003-09-30 International Business Machines Corporation Resource utilization indication and commit mechanism in a data processing system and method therefor
US7720682B2 (en) 1998-12-04 2010-05-18 Tegic Communications, Inc. Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
ATE282880T1 (de) 2000-01-27 2004-12-15 Siemens AG System and method for gaze-focused speech processing
US6741791B1 (en) * 2000-01-31 2004-05-25 Intel Corporation Using speech to select a position in a program
US7036080B1 (en) 2001-11-30 2006-04-25 Sap Labs, Inc. Method and apparatus for implementing a speech interface for a GUI
US20050182558A1 (en) * 2002-04-12 2005-08-18 Mitsubishi Denki Kabushiki Kaisha Car navigation system and speech recognizing device therefor
US7158779B2 (en) * 2003-11-11 2007-01-02 Microsoft Corporation Sequential multimodal input
CN102272827B (zh) * 2005-06-01 2013-07-10 Tegic Communications, Inc. Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US7627819B2 (en) * 2005-11-01 2009-12-01 At&T Intellectual Property I, L.P. Visual screen indicator
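JP4399607B2 (ja) * 2006-02-13 2010-01-20 National University Corporation Saitama University Gaze-controlled display device and display method
CN101395607B (zh) * 2006-03-03 2011-10-05 Koninklijke Philips Electronics N.V. Method and device for automatically generating a summary of a plurality of images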
US8793620B2 (en) 2011-04-21 2014-07-29 Sony Computer Entertainment Inc. Gaze-assisted computer interface
US9250703B2 (en) 2006-03-06 2016-02-02 Sony Computer Entertainment Inc. Interface with gaze detection and voice input
US20080141166A1 (en) * 2006-12-11 2008-06-12 Cisco Technology, Inc. Using images in alternative navigation
US7983915B2 (en) * 2007-04-30 2011-07-19 Sonic Foundry, Inc. Audio content search engine
JP5230120B2 (ja) * 2007-05-07 2013-07-10 Nintendo Co., Ltd. Information processing system and information processing program
US20130125051A1 (en) * 2007-09-28 2013-05-16 Adobe Systems Incorporated Historical review using manipulable visual indicators
US8386260B2 (en) * 2007-12-31 2013-02-26 Motorola Mobility Llc Methods and apparatus for implementing distributed multi-modal applications
US8438485B2 (en) * 2009-03-17 2013-05-07 Unews, Llc System, method, and apparatus for generating, customizing, distributing, and presenting an interactive audio publication
US9197736B2 (en) 2009-12-31 2015-11-24 Digimarc Corporation Intuitive computing methods and systems
US9507418B2 (en) * 2010-01-21 2016-11-29 Tobii Ab Eye tracker based contextual action
JP2012022589A (ja) * 2010-07-16 2012-02-02 Hitachi Ltd Product selection support method
US10120438B2 (en) * 2011-05-25 2018-11-06 Sony Interactive Entertainment Inc. Eye gaze to alter device behavior
US9423870B2 (en) 2012-05-08 2016-08-23 Google Inc. Input determination method
US9823742B2 (en) * 2012-05-18 2017-11-21 Microsoft Technology Licensing, Llc Interaction and management of devices using gaze detection
KR102156175B1 (ko) * 2012-10-09 2020-09-15 Samsung Electronics Co., Ltd. Interfacing device for providing a user interface using multi-modality, and method using the device

Also Published As

Publication number Publication date
RU2016147071A3 2018-12-29
BR112016026904A2 (pt) 2017-08-15
WO2015187756A3 (en) 2016-01-28
WO2015187756A2 (en) 2015-12-10
CN106463119A (zh) 2017-02-22
EP3152754B1 (en) 2018-01-10
MX2016016131A (es) 2017-03-08
AU2015271726A1 (en) 2016-11-17
KR20170016399A (ko) 2017-02-13
JP6545716B2 (ja) 2019-07-17
CA2948523A1 (en) 2015-12-10
EP3152754A2 (en) 2017-04-12
JP2017525002A (ja) 2017-08-31
US20150356971A1 (en) 2015-12-10
AU2015271726B2 (en) 2020-04-09
MX361307B (es) 2018-12-03
CA2948523C (en) 2021-12-07
CN106463119B (zh) 2020-07-10
US9583105B2 (en) 2017-02-28
BR112016026904B1 (pt) 2023-03-14
RU2684475C2 (ru) 2019-04-09
RU2016147071A (ru) 2018-06-01
BR112016026904A8 (pt) 2021-07-13

Similar Documents

Publication Publication Date Title
KR102393147B1 (ko) Modification of visual content to facilitate improved speech recognition
US11854550B2 (en) Determining input for speech processing engine
EP3596585B1 (en) Invoking automated assistant function(s) based on detected gesture and gaze
US9720644B2 (en) Information processing apparatus, information processing method, and computer program
US10824310B2 (en) Augmented reality virtual personal assistant for external representation
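KR102429583B1 (ko) Electronic device, method for providing a guide thereof, and non-transitory computer-readable recording medium
KR20170065563A (ko) Eye gaze for spoken language understanding in multi-modal conversational interactions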
US20140304606A1 (en) Information processing apparatus, information processing method and computer program
WO2019087811A1 (ja) Information processing device and information processing method
JP6983118B2 (ja) Dialogue system control method, dialogue system, and program
US12361947B2 (en) Automated assistant interaction prediction using fusion of visual and audio input
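KR20210042460A (ko) Artificial intelligence device and method for recognizing speech containing multiple languages
KR102792918B1 (ko) Electronic device and control method thereof
JPWO2020116001A1 (ja) Information processing device and information processing method
KR20250096753A (ko) Artificial intelligence device and operating method thereof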

Legal Events

Date Code Title Description
E13-X000 Pre-grant limitation requested

St.27 status event code: A-2-3-E10-E13-lim-X000

PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

P22-X000 Classification modified

St.27 status event code: A-2-2-P10-P22-nap-X000

P22-X000 Classification modified

St.27 status event code: A-2-2-P10-P22-nap-X000

A201 Request for examination
P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

GRNT Written decision to grant
PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U12-oth-PR1002

Fee payment year number: 1

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 4

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 5

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-4-4-U10-U11-OTH-PR1001 (AS PROVIDED BY THE NATIONAL OFFICE)

Year of fee payment: 5