GR1008860B - System for speaker diarization from audiovisual data - Google Patents
System for speaker diarization from audiovisual data
- Publication number
- GR1008860B
- Authority
- GR
- Greece
- Prior art keywords
- speakers
- isolation
- audiovisual data
- text
- minutes
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/10—Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
- G06F18/256—Fusion techniques of classification results relating to different input data, e.g. multimodal recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Human Computer Interaction (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Game Theory and Decision Science (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A speaker diarization method that uses cameras (5) and microphones (4) to generate minutes automatically. Once the speakers have been identified, the minutes mark the time at which each speaker spoke and, by means of a speech-to-text transcription system (21), record the content of the speech. The method uses an IoT-style message-exchange architecture, which simplifies communication between the devices used and the processing modules. It can be used, for example, to generate meeting minutes automatically or to transcribe television video into text.
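The abstract names the building blocks (microphones, cameras, a speech-to-text module, and IoT-style message exchange) but does not give a message format or a fusion rule. The following is a minimal sketch under stated assumptions, not the patented implementation: the `MessageBus` class, the topic names `video/active_speaker` and `audio/transcript`, the message fields, and the largest-time-overlap rule for attributing a transcript to a speaker are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[dict], None]


class MessageBus:
    """Tiny in-process stand-in for an IoT-style publish/subscribe broker (assumed)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message synchronously to every handler of this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


@dataclass
class MinutesEntry:
    start: float  # seconds from the start of the recording
    end: float
    speaker: str
    text: str


class MinutesBuilder:
    """Combines camera and transcription messages into speaker-attributed minutes."""

    def __init__(self, bus: MessageBus) -> None:
        self.entries: List[MinutesEntry] = []
        self._visual: List[Tuple[float, float, str]] = []
        bus.subscribe("video/active_speaker", self._on_video)
        bus.subscribe("audio/transcript", self._on_transcript)

    def _on_video(self, msg: dict) -> None:
        # Hypothetical camera-module message: who appears to speak, and when.
        self._visual.append((msg["start"], msg["end"], msg["speaker"]))

    def _on_transcript(self, msg: dict) -> None:
        # Assumed fusion rule: pick the visually active speaker whose interval
        # overlaps the transcribed audio segment the most. This sketch expects
        # the video messages to arrive first; a real system would buffer.
        best, best_overlap = "unknown", 0.0
        for start, end, speaker in self._visual:
            overlap = min(end, msg["end"]) - max(start, msg["start"])
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        self.entries.append(MinutesEntry(msg["start"], msg["end"], best, msg["text"]))

    def render(self) -> str:
        ordered = sorted(self.entries, key=lambda e: e.start)
        return "\n".join(
            f"[{e.start:.1f}s-{e.end:.1f}s] {e.speaker}: {e.text}" for e in ordered
        )


def demo() -> str:
    bus = MessageBus()
    builder = MinutesBuilder(bus)
    bus.publish("video/active_speaker", {"start": 0.0, "end": 4.0, "speaker": "Speaker A"})
    bus.publish("video/active_speaker", {"start": 4.0, "end": 9.0, "speaker": "Speaker B"})
    bus.publish("audio/transcript", {"start": 0.5, "end": 3.5, "text": "Let us begin."})
    bus.publish("audio/transcript", {"start": 4.2, "end": 8.0, "text": "I agree."})
    return builder.render()


if __name__ == "__main__":
    print(demo())
```

Because the modules only share topic names and message fields, a camera or microphone module could in principle be swapped or added without touching the minutes builder, which is the kind of loose coupling the abstract attributes to the message-exchange architecture.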
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GR20150100564A GR1008860B (el) | 2015-12-29 | 2015-12-29 | System for speaker diarization from audiovisual data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GR20150100564A GR1008860B (el) | 2015-12-29 | 2015-12-29 | System for speaker diarization from audiovisual data |
Publications (1)
Publication Number | Publication Date |
---|---|
GR1008860B (el) | 2016-09-27 |
Family
ID=58186181
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GR20150100564A GR1008860B (el) | 2015-12-29 | 2015-12-29 | System for speaker diarization from audiovisual data |
Country Status (1)
Country | Link |
---|---|
GR (1) | GR1008860B (el) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0795851A2 (en) * | 1996-03-15 | 1997-09-17 | Kabushiki Kaisha Toshiba | Method and system for microphone array input type speech recognition |
US6219640B1 (en) * | 1999-08-06 | 2001-04-17 | International Business Machines Corporation | Methods and apparatus for audio-visual speaker recognition and utterance verification |
US6567775B1 (en) * | 2000-04-26 | 2003-05-20 | International Business Machines Corporation | Fusion of audio and video based speaker identification for multimedia information access |
US6754631B1 (en) * | 1998-11-04 | 2004-06-22 | Gateway, Inc. | Recording meeting minutes based upon speech recognition |
US20040267521A1 (en) * | 2003-06-25 | 2004-12-30 | Ross Cutler | System and method for audio/video speaker detection |
WO2006089355A1 (en) * | 2005-02-22 | 2006-08-31 | Voice Perfect Systems Pty Ltd | A system for recording and analysing meetings |
JP2007233239A (ja) * | 2006-03-03 | 2007-09-13 | National Institute Of Advanced Industrial & Technology | Utterance event separation method, utterance event separation system, and utterance event separation program |
US20090110225A1 (en) * | 2007-10-31 | 2009-04-30 | Hyun Soo Kim | Method and apparatus for sound source localization using microphones |
US20090147995A1 (en) * | 2007-12-07 | 2009-06-11 | Tsutomu Sawada | Information processing apparatus and information processing method, and computer program |
WO2012023268A1 (ja) * | 2010-08-16 | 2012-02-23 | NEC Corporation | Multi-microphone speaker classification device, method, and program |
US20140016835A1 (en) * | 2012-07-13 | 2014-01-16 | National Chiao Tung University | Human identification system by fusion of face recognition and speaker recognition, method and service robot thereof |
EP2765791A1 (en) * | 2013-02-08 | 2014-08-13 | Thomson Licensing | Method and apparatus for determining directions of uncorrelated sound sources in a higher order ambisonics representation of a sound field |
2015-12-29: GR application GR20150100564A, patent GR1008860B (el), status active (IP Right Grant)
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP3752957A4 (en) | SYSTEM AND PROCEDURE FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VIDEO-BASED VOICE RECOGNITION | |
EP4283613A3 (en) | Noise mitigation for a voice interface device | |
PH12016502029A1 (en) | In-call translation | |
EP3920178A4 (en) | METHOD AND SYSTEM FOR AUDIO DETECTION AND DEVICE | |
EP3860133A4 (en) | AUDIO AND VIDEO QUALITY IMPROVEMENT METHOD AND SYSTEM WITH SCENE DETECTION AND DISPLAY DEVICE | |
EP3822831A4 (en) | VOICE RECOGNITION, BODY PORTABLE DEVICE AND ELECTRONIC DEVICE | |
EP3751561A3 (en) | Hotword recognition | |
EP3487148A4 (en) | VIDEO CONFERENCE IMPLEMENTATION PROCESS, DEVICE AND SYSTEM AND CLOUD DESKTOP END UNIT | |
EP3323083A4 (en) | APPARATUS AND METHODS FOR FACIAL RECOGNITION AND VIDEO ANALYSIS FOR IDENTIFYING INDIVIDUALS IN CONTEXTUAL VIDEO STREAMS | |
WO2018055455A8 (en) | Tonal/transient structural separation for audio effects | |
WO2011130083A3 (en) | Camera-assisted noise cancellation and speech recognition | |
EP3721380A4 (en) | METHOD AND DEVICE FOR FACIAL RECOGNITION | |
EP3403261A4 (en) | AUTOMATIC DETERMINATION OF TIMING WINDOWS FOR LANGUAGE SUBTITLES IN AN AUDIO DATA STREAM | |
EP3685312A4 (en) | METHOD AND SYSTEM FOR DETECTION OF IMAGE CONTENT | |
EP3663906A4 (en) | INFORMATION PROCESSING DEVICE, VOICE RECOGNITION SYSTEM AND INFORMATION PROCESSING METHOD | |
EP3511933A4 (en) | SYSTEM AND METHOD FOR PROVIDING VOICE RECOGNITION IMAGE FEEDBACK | |
EP3440826A4 (en) | SYSTEM AND METHOD FOR MONITORING VOICE AND VIDEO CALLING THIRD PARTIES | |
EP3446488A4 (en) | SYSTEM AND METHOD FOR REAL-TIME SYNCHRONIZATION FOR MEDIA CONTENT OF MULTIPLE DEVICES AND SPEAKER SYSTEMS | |
EP4084434A4 (en) | SERVER AND SERVER SIDE PROCESSING METHOD FOR ACTIVELY INITIATING A CONVERSATION, AND VOICE INTERACTION SYSTEM CAPABLE OF ACTIVELY INITIATING A CONVERSATION | |
EP3533033A4 (en) | SYSTEM AND METHOD FOR DEFINING, CAPTURING, ASSEMBLING AND DISPLAYING PERSONALIZED VIDEO CONTENT | |
EP3931826A4 (en) | SERVER SUPPORTING VOICE RECOGNITION OF A DEVICE AND METHOD OF OPERATING THE SERVER | |
EP3890332A4 (en) | VIDEO DISTRIBUTION METHOD AND ELECTRONIC DEVICE | |
EP3425635A4 (en) | AUDIO PROCESSING DEVICE, IMAGE PROCESSING DEVICE, MICROPHONE NETWORK SYSTEM, AND AUDIO PROCESSING METHOD | |
EP3782359A4 (en) | METHOD OF COMBINING CONTENT FROM MULTIPLE FRAMES AND ELECTRONIC DEVICE FOR THIS | |
EP3660839A4 (en) | SYSTEM, SERVER AND METHOD FOR SPEECH RECOGNITION OF HOME APPLIANCE |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PG | Patent granted | Effective date: 20161020 |