JP6723591B1 - Method and apparatus for inputting face information into a database - Google Patents
Method and apparatus for inputting face information into a database
- Publication number
- JP6723591B1 (application JP2019184911A)
- Authority
- JP
- Japan
- Prior art keywords
- information
- database
- face
- photographed
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/41—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7837—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
- G06F16/784—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/166—Detection; Localisation; Normalisation using acquisition arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/50—Maintenance of biometric data or enrolment thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/70—Multimodal biometrics, e.g. combining information from different biometric modalities
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Library & Information Science (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Collating Specific Patterns (AREA)
- Image Analysis (AREA)
- Studio Devices (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
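In the scene shown in the top view of FIG. 2, three photographed persons 201, 202 and 203 are located within the capture range of the video capture unit 204. The apparatus 200 for face information input further includes an audio capture unit 205. Note that the relative positions of the audio capture unit 205 and the video capture unit 204 are not limited to those shown in FIG. 2.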
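For a scene like the one in FIG. 2, relating a localized sound direction to a face in the frame can be sketched as follows. This is a minimal Python illustration, not the patented implementation. It assumes a pinhole camera with a known horizontal field of view, an audio capture unit located at the camera so that azimuth is measured from the optical axis (positive to the right), and face bounding boxes supplied by some face detector. All function and parameter names are hypothetical. If the two units are offset from each other, an additional calibration step would be needed.

```python
import math


def azimuth_to_pixel_x(azimuth_deg, image_width, horizontal_fov_deg):
    """Map a real-scene azimuth to a pixel column (assumed pinhole model).

    Returns None when the direction lies outside the camera's field of view.
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    azimuth = math.radians(azimuth_deg)
    if abs(azimuth) > half_fov:
        return None
    # Focal length in pixels, derived from the assumed field of view.
    focal_px = (image_width / 2.0) / math.tan(half_fov)
    return image_width / 2.0 + focal_px * math.tan(azimuth)


def pick_speaker_face(azimuth_deg, face_boxes, image_width, horizontal_fov_deg):
    """Return the id of the face whose horizontal centre is closest to the
    pixel column implied by the sound direction, or None if there is none.

    face_boxes maps a person id to (x_min, y_min, x_max, y_max) in pixels.
    """
    x = azimuth_to_pixel_x(azimuth_deg, image_width, horizontal_fov_deg)
    if x is None or not face_boxes:
        return None

    def distance(person_id):
        x_min, _, x_max, _ = face_boxes[person_id]
        return abs((x_min + x_max) / 2.0 - x)

    return min(face_boxes, key=distance)


# Hypothetical boxes for the three photographed persons of FIG. 2.
faces = {
    "201": (100, 200, 300, 400),
    "202": (540, 200, 740, 400),
    "203": (1000, 200, 1200, 400),
}
speaker = pick_speaker_face(-30.0, faces, image_width=1280, horizontal_fov_deg=90.0)
```

With these illustrative values, a sound arriving from 30 degrees to the left of the optical axis selects the leftmost face, and a direction outside the field of view selects no face.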
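The sound source localization described above concerns the association of audio and video in spatial direction, whereas the embodiment that captures lip movement concerns the association of video and audio in time.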
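The temporal association can be illustrated by comparing the time at which each person's lips start moving with the time at which the recorded voice starts. The Python sketch below is an assumption-laden illustration only: the onset times are taken as already measured on a common clock, and the tolerance `max_offset_s` and the function name are hypothetical choices, not values from the patent.

```python
def match_speaker_by_lip_onset(voice_start_s, lip_onsets_s, max_offset_s=0.5):
    """Return the id of the person whose lip-movement onset is closest in time
    to the start of the recorded voice, or None if no onset is close enough.

    lip_onsets_s maps a person id to the lip-movement start time in seconds.
    """
    best_id = None
    best_gap = None
    for person_id, onset in lip_onsets_s.items():
        gap = abs(onset - voice_start_s)
        # Ignore people whose lips started moving too far from the voice onset.
        if gap > max_offset_s:
            continue
        if best_gap is None or gap < best_gap:
            best_id, best_gap = person_id, gap
    return best_id


# Hypothetical onsets for the three photographed persons.
onsets = {"201": 1.0, "202": 3.25, "203": 5.0}
speaker = match_speaker_by_lip_onset(3.2, onsets)
```

A tolerance is used here because lip motion and audible speech rarely begin at exactly the same instant. When nobody's lips start moving near the voice onset, the sketch returns no match rather than guessing.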
Claims (14)
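- A method for inputting face information into a database, comprising:
a step of performing video capture of one or more photographed persons and extracting face information of the one or more photographed persons from the video frames during capture;
a step of recording the voice, during capture, of at least one photographed person among the one or more photographed persons;
a step of performing semantic analysis on the recorded voice and extracting corresponding information from the voice; and
a step of associating the extracted information with the face information of the photographed person who spoke the information and inputting them into the database,
wherein the step of associating the extracted information with the face information of the photographed person who spoke the information comprises:
a step of determining, by sound source localization, the direction in the real scene of the photographed person who spoke the information;
a step of mapping direction between the real scene and the video scene; and
a step of determining the position in the video scene of the photographed person who spoke the information from that person's direction in the real scene.
- The method according to claim 1, wherein the face information includes facial feature information that can be used to recognize the one or more photographed persons.
- The method according to claim 1 or 2, wherein the voice of the at least one photographed person includes the speaker's own identity information, and
the extracted corresponding information includes the speaker's own identity information.
- The method according to claim 3, wherein the identity information includes a full name.
- The method according to claim 1 or 2, wherein the voice of the at least one photographed person includes information about the scene in which the speaker is located, and
the extracted corresponding information includes the information about the scene in which the speaker is located.
- The method according to claim 1, wherein the step of associating the extracted information with the face information of the photographed person who spoke the information comprises
a step of analyzing the lip movement of the one or more photographed persons based on the video frames during capture.
- The method according to claim 6, wherein the start time of the lip movement is compared with the start time at which the voice is recorded.
- The method according to claim 1, wherein it is detected whether the face information of the at least one photographed person is stored in the database, and, if the face information of the at least one photographed person is not present in the database, the recorded voice is analyzed.
- The method according to claim 1, wherein it is detected whether the face information of the at least one photographed person is stored in the database, and, if the face information of the at least one photographed person is stored in the database, the extracted information is used to supplement the information associated with the face information of the at least one photographed person stored in the database.
- The method according to claim 1, wherein the information is stored in the database as text information.
- A processor chip circuit for inputting face information into a database, comprising
a circuit unit that executes the steps of the method according to any one of claims 1 to 10.
- An electronic device comprising:
a video sensor that performs video capture of one or more photographed persons;
an audio sensor that records the voice, during capture, of at least one photographed person among the one or more photographed persons; and
the processor chip circuit according to claim 11, which associates the information of the corresponding photographed person with the face information and inputs them into a database.
- The electronic device according to claim 12, wherein the electronic device is implemented as a wearable device, and
the wearable device includes a speaker that plays back the content of the information as audio when information corresponding to a recognized face exists in the database.
- A computer-readable storage medium storing a program that includes instructions, wherein
the instructions, when executed by a processor of an electronic device, cause the electronic device to execute the method according to any one of claims 1 to 10.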
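The flow in claims 8 to 10 (check whether a face is already stored, enter it if it is not, otherwise supplement its associated information, and store that information as text) can be sketched in Python as below. This is only an illustration under stated assumptions: the in-memory list standing in for the database, the cosine-similarity comparison of facial feature vectors, the 0.8 threshold, and all names are hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_record(database, face_feature, threshold=0.8):
    """Return the stored record that best matches the face feature, or None."""
    best = None
    best_score = threshold
    for record in database:
        score = cosine_similarity(record["face_feature"], face_feature)
        if score >= best_score:
            best, best_score = record, score
    return best


def enter_or_supplement(database, face_feature, extracted_text):
    """Create a record for an unknown face, or add text to a known face's record."""
    record = find_record(database, face_feature)
    if record is None:
        # Face not stored yet: enter it together with the extracted information.
        record = {"face_feature": list(face_feature), "info": []}
        database.append(record)
    # Information is kept as plain text; exact duplicates are not stored twice.
    if extracted_text and extracted_text not in record["info"]:
        record["info"].append(extracted_text)
    return record


# Hypothetical usage with toy two-dimensional feature vectors.
db = []
enter_or_supplement(db, [1.0, 0.0], "name: Alice")
enter_or_supplement(db, [1.0, 0.0], "works in the sales department")
enter_or_supplement(db, [0.0, 1.0], "name: Bob")
```

In this toy run the first two calls refer to the same face, so the second call supplements the existing record instead of creating a new one, and the third call enters a new face.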
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910686122.3A CN110196914B (zh) | 2019-07-29 | 2019-07-29 | Method and apparatus for entering face information into a database |
CN201910686122.3 | 2019-07-29 |
Publications (2)
Publication Number | Publication Date |
---|---|
JP6723591B1 (ja) | 2020-07-15 |
JP2021022351A (ja) | 2021-02-18 |
Family
ID=67756178
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019184911A Active JP6723591B1 (ja) | 2019-07-29 | 2019-10-08 | Method and apparatus for inputting face information into a database |
Country Status (6)
Country | Link |
---|---|
US (1) | US10922570B1 (ja) |
EP (1) | EP3772016B1 (ja) |
JP (1) | JP6723591B1 (ja) |
KR (1) | KR20220041891A (ja) |
CN (1) | CN110196914B (ja) |
WO (1) | WO2021017096A1 (ja) |
-
2019
- 2019-07-29 CN CN201910686122.3A patent/CN110196914B/zh active Active
- 2019-09-03 KR KR1020227006755A patent/KR20220041891A/ko active Search and Examination
- 2019-09-03 WO PCT/CN2019/104108 patent/WO2021017096A1/zh active Application Filing
- 2019-10-08 JP JP2019184911A patent/JP6723591B1/ja active Active
- 2019-11-08 US US16/678,838 patent/US10922570B1/en active Active
- 2019-11-26 EP EP19211509.5A patent/EP3772016B1/en active Active
Also Published As
Publication number | Publication date |
---|---|
JP2021022351A (ja) | 2021-02-18 |
US10922570B1 (en) | 2021-02-16 |
US20210034898A1 (en) | 2021-02-04 |
EP3772016A1 (en) | 2021-02-03 |
WO2021017096A1 (zh) | 2021-02-04 |
CN110196914B (zh) | 2019-12-27 |
KR20220041891A (ko) | 2022-04-01 |
EP3772016B1 (en) | 2022-05-18 |
CN110196914A (zh) | 2019-09-03 |
Legal Events
Code | Title | Description |
---|---|---|
A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 Effective date: 20191008 |
A871 | Explanation of circumstances concerning accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A871 Effective date: 20191008 |
A975 | Report on accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A971005 Effective date: 20191107 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20191224 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20200312 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 Effective date: 20200609 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 Effective date: 20200618 |
R150 | Certificate of patent or registration of utility model | Ref document number: 6723591 Country of ref document: JP Free format text: JAPANESE INTERMEDIATE CODE: R150 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |