GR1008860B - System for speaker diarization from audiovisual data - Google Patents

System for speaker diarization from audiovisual data

Info

Publication number
GR1008860B
Authority
GR
Greece
Prior art keywords
speakers
isolation
audiovisual data
text
minutes
Prior art date
Application number
GR20150100564A
Other languages
English (en)
Inventor
Κωνσταντινος Δημητριου Σπυροπουλος
Σταυρος Ιωαννη Περαντωνης
Ευαγγελος Χρηστου Σπυρου
Δημητριος Ηλια Σγουροπουλος
Γεωργιος Αποστολου Σιαντικος
Θεοδωρος Δημητριου Γιαννακοπουλος
Original Assignee
Κωνσταντινος Δημητριου Σπυροπουλος
Εθνικο Κεντρο Ερευνας Φυσικων Επιστημων "Δημοκριτος"
Σταυρος Ιωαννη Περαντωνης
Ευαγγελος Χρηστου Σπυρου
Δημητριος Ηλια Σγουροπουλος
Γεωργιος Αποστολου Σιαντικος
Θεοδωρος Δημητριου Γιαννακοπουλος
Priority date
Filing date
Publication date
Application filed by Κωνσταντινος Δημητριου Σπυροπουλος, Εθνικο Κεντρο Ερευνας Φυσικων Επιστημων "Δημοκριτος", Σταυρος Ιωαννη Περαντωνης, Ευαγγελος Χρηστου Σπυρου, Δημητριος Ηλια Σγουροπουλος, Γεωργιος Αποστολου Σιαντικος, Θεοδωρος Δημητριου Γιαννακοπουλος
Priority to GR20150100564A
Publication of GR1008860B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256 Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A speaker diarization method that uses cameras (5) and microphones (4) to generate minutes automatically. Once the speakers have been identified, the minutes mark the time at which each speaker spoke and, by means of a speech-to-text transcription system (21), record the content of the speech. The method uses an IoT-style message-exchange architecture, which simplifies communication between the devices used and the processing modules. It can be used, for example, to generate meeting minutes automatically or to transcribe television video into text.
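
The flow described in the abstract (devices and processing modules exchanging messages, with speaker turns and transcribed text combined into minutes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the in-process `MessageBus` stands in for whatever IoT messaging layer is actually used, and the topic names `diarization/turn` and `transcription/text`, the message fields, and the `MinutesBuilder` class are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


class MessageBus:
    """Tiny in-process publish/subscribe bus (a stand-in for an IoT broker)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Deliver the payload to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(payload)


@dataclass
class Turn:
    start: float   # seconds from the start of the recording
    end: float
    speaker: str
    text: str = ""


def fmt(seconds):
    """Format a time offset in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


class MinutesBuilder:
    """Hypothetical module that merges speaker turns and transcripts."""

    def __init__(self, bus):
        self.turns = []
        bus.subscribe("diarization/turn", self.on_turn)
        bus.subscribe("transcription/text", self.on_text)

    def on_turn(self, msg):
        self.turns.append(Turn(msg["start"], msg["end"], msg["speaker"]))

    def on_text(self, msg):
        # Attach the text to the turn whose interval contains its timestamp.
        for turn in self.turns:
            if turn.start <= msg["time"] < turn.end:
                turn.text = (turn.text + " " + msg["text"]).strip()
                return

    def render(self):
        ordered = sorted(self.turns, key=lambda t: t.start)
        return "\n".join(
            f"[{fmt(t.start)}-{fmt(t.end)}] {t.speaker}: {t.text}" for t in ordered
        )


# Example run with made-up messages.
bus = MessageBus()
minutes = MinutesBuilder(bus)
bus.publish("diarization/turn", {"start": 0.0, "end": 4.5, "speaker": "Speaker A"})
bus.publish("diarization/turn", {"start": 4.5, "end": 9.0, "speaker": "Speaker B"})
bus.publish("transcription/text", {"time": 1.0, "text": "Welcome everyone."})
bus.publish("transcription/text", {"time": 5.2, "text": "Thank you."})
print(minutes.render())
```

Keeping the modules decoupled behind topics means a camera-based or microphone-based identification module could be swapped without touching the minutes builder; a real deployment would presumably replace `MessageBus` with a networked broker.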
GR20150100564A 2015-12-29 2015-12-29 System for speaker diarization from audiovisual data GR1008860B (el)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GR20150100564A GR1008860B (el) 2015-12-29 2015-12-29 System for speaker diarization from audiovisual data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GR20150100564A GR1008860B (el) 2015-12-29 2015-12-29 System for speaker diarization from audiovisual data

Publications (1)

Publication Number Publication Date
GR1008860B (el) 2016-09-27

Family

ID=58186181

Family Applications (1)

Application Number Title Priority Date Filing Date
GR20150100564A GR1008860B (el) 2015-12-29 2015-12-29 System for speaker diarization from audiovisual data

Country Status (1)

Country Link
GR (1) GR1008860B (el)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0795851A2 (en) * 1996-03-15 1997-09-17 Kabushiki Kaisha Toshiba Method and system for microphone array input type speech recognition
US6219640B1 (en) * 1999-08-06 2001-04-17 International Business Machines Corporation Methods and apparatus for audio-visual speaker recognition and utterance verification
US6567775B1 (en) * 2000-04-26 2003-05-20 International Business Machines Corporation Fusion of audio and video based speaker identification for multimedia information access
US6754631B1 (en) * 1998-11-04 2004-06-22 Gateway, Inc. Recording meeting minutes based upon speech recognition
US20040267521A1 (en) * 2003-06-25 2004-12-30 Ross Cutler System and method for audio/video speaker detection
WO2006089355A1 (en) * 2005-02-22 2006-08-31 Voice Perfect Systems Pty Ltd A system for recording and analysing meetings
JP2007233239A (ja) * 2006-03-03 2007-09-13 National Institute Of Advanced Industrial & Technology Utterance event separation method, utterance event separation system, and utterance event separation program
US20090110225A1 (en) * 2007-10-31 2009-04-30 Hyun Soo Kim Method and apparatus for sound source localization using microphones
US20090147995A1 (en) * 2007-12-07 2009-06-11 Tsutomu Sawada Information processing apparatus and information processing method, and computer program
WO2012023268A1 (ja) * 2010-08-16 2012-02-23 日本電気株式会社 Multi-microphone speaker classification device, method, and program
US20140016835A1 (en) * 2012-07-13 2014-01-16 National Chiao Tung University Human identification system by fusion of face recognition and speaker recognition, method and service robot thereof
EP2765791A1 (en) * 2013-02-08 2014-08-13 Thomson Licensing Method and apparatus for determining directions of uncorrelated sound sources in a higher order ambisonics representation of a sound field

Similar Documents

Publication Publication Date Title
EP3752957A4 (en) SYSTEM AND PROCEDURE FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VIDEO-BASED VOICE RECOGNITION
EP4283613A3 (en) Noise mitigation for a voice interface device
PH12016502029A1 (en) In-call translation
EP3920178A4 (en) METHOD AND SYSTEM FOR AUDIO DETECTION AND DEVICE
EP3860133A4 (en) AUDIO AND VIDEO QUALITY IMPROVEMENT METHOD AND SYSTEM WITH SCENE DETECTION AND DISPLAY DEVICE
EP3822831A4 (en) VOICE RECOGNITION, BODY PORTABLE DEVICE AND ELECTRONIC DEVICE
EP3751561A3 (en) Hotword recognition
EP3487148A4 (en) VIDEO CONFERENCE IMPLEMENTATION PROCESS, DEVICE AND SYSTEM AND CLOUD DESKTOP END UNIT
EP3323083A4 (en) APPARATUS AND METHODS FOR FACIAL RECOGNITION AND VIDEO ANALYSIS FOR IDENTIFYING INDIVIDUALS IN CONTEXTUAL VIDEO STREAMS
WO2018055455A8 (en) Tonal/transient structural separation for audio effects
WO2011130083A3 (en) Camera-assisted noise cancellation and speech recognition
EP3721380A4 (en) METHOD AND DEVICE FOR FACIAL RECOGNITION
EP3403261A4 (en) AUTOMATIC DETERMINATION OF TIMING WINDOWS FOR LANGUAGE SUBTITLES IN AN AUDIO DATA STREAM
EP3685312A4 (en) METHOD AND SYSTEM FOR DETECTION OF IMAGE CONTENT
EP3663906A4 (en) INFORMATION PROCESSING DEVICE, VOICE RECOGNITION SYSTEM AND INFORMATION PROCESSING METHOD
EP3511933A4 (en) SYSTEM AND METHOD FOR PROVIDING VOICE RECOGNITION IMAGE FEEDBACK
EP3440826A4 (en) SYSTEM AND METHOD FOR MONITORING VOICE AND VIDEO CALLING THIRD PARTIES
EP3446488A4 (en) SYSTEM AND METHOD FOR REAL-TIME SYNCHRONIZATION FOR MEDIA CONTENT OF MULTIPLE DEVICES AND SPEAKER SYSTEMS
EP4084434A4 (en) SERVER AND SERVER SIDE PROCESSING METHOD FOR ACTIVELY INITIATING A CONVERSATION, AND VOICE INTERACTION SYSTEM CAPABLE OF ACTIVELY INITIATING A CONVERSATION
EP3533033A4 (en) SYSTEM AND METHOD FOR DEFINING, CAPTURING, ASSEMBLING AND DISPLAYING PERSONALIZED VIDEO CONTENT
EP3931826A4 (en) SERVER SUPPORTING VOICE RECOGNITION OF A DEVICE AND METHOD OF OPERATING THE SERVER
EP3890332A4 (en) VIDEO DISTRIBUTION METHOD AND ELECTRONIC DEVICE
EP3425635A4 (en) AUDIO PROCESSING DEVICE, IMAGE PROCESSING DEVICE, MICROPHONE NETWORK SYSTEM, AND AUDIO PROCESSING METHOD
EP3782359A4 (en) METHOD OF COMBINING CONTENT FROM MULTIPLE FRAMES AND ELECTRONIC DEVICE FOR THIS
EP3660839A4 (en) SYSTEM, SERVER AND METHOD FOR SPEECH RECOGNITION OF HOME APPLIANCE

Legal Events

Date Code Title Description
PG Patent granted

Effective date: 20161020