WO2005036786A1 - Digital radio receiver with speech synthesis - Google Patents

Digital radio receiver with speech synthesis

Info

Publication number
WO2005036786A1
Authority
WO
WIPO (PCT)
Prior art keywords
receiver
speech
radio
information
digital radio
Prior art date
Application number
PCT/GB2004/004295
Other languages
English (en)
Inventor
Gavin Robert Ferris
Original Assignee
Radioscape Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Radioscape Limited
Publication of WO2005036786A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H40/00 Arrangements specially adapted for receiving broadcast information
    • H04H40/18 Arrangements characterised by circuits or components specially adapted for receiving
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03J TUNING RESONANT CIRCUITS; SELECTING RESONANT CIRCUITS
    • H03J2200/00 Indexing scheme relating to tuning resonant circuits and selecting resonant circuits
    • H03J2200/30 Radio receiver with speech synthesis ability, used for conveying information that is shown on the display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H2201/00 Aspects of broadcast communication
    • H04H2201/10 Aspects of broadcast communication characterised by the type of broadcast system
    • H04H2201/20 Aspects of broadcast communication characterised by the type of broadcast system digital audio broadcasting [DAB]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/68 Systems specially adapted for using specific information, e.g. geographical or meteorological information
    • H04H60/73 Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information
    • H04H60/74 Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information using programme related information, e.g. title, composer or interpreter

Definitions

  • the present invention provides a number of methods and techniques for making a digital radio (e.g. a Digital Audio Broadcasting (DAB) radio) usable without visual feedback.
  • a digital radio e.g. a Digital Audio Broadcasting (DAB) radio
  • Digital radio receivers can decode digital radio signals conforming to one or more different formats, such as DAB, DRM or IBOC.
  • digital radios have displays that can display textual information broadcast with a radio channel (e.g. the name of the song being broadcast on that channel). Because digital radios generally have displays that can show text and in some cases graphics, much of the user interaction development for digital radios has centred on new graphical user interfaces, or other ways of presenting information to the end-user that take advantage of the display and the ability of a digital radio channel to carry an information payload.
  • a digital radio receiver comprising a speech synthesis module able to generate synthesised speech, the synthesised speech conveying information ordinarily shown on a display of the receiver.
  • the present invention moves away from the established bias in digital radio user interaction design by ignoring the display, or at least supplementing it with synthesised speech.
  • the synthesised speech could confirm a control operation performed by an end-user (e.g. switching to pre-set station 1 could be accompanied by a synthesised voice saying "preset one"; the operation of tuning could be accompanied by the synthesised voice saying "tuning", etc.).
  • the receiver may include a display and the information conveyed by the synthesised speech is then also shown on the display.
  • the synthesized speech may not correspond to anything in fact displayed on a display.
  • the synthesised speech could be generated by synthesising textual information broadcast with a digital radio channel received by the receiver.
  • the textual information could be DLS, such as the station name the receiver is currently tuned into. It could be an alert relating to a newly received DLS.
  • the textual information could be stored in memory (e.g. to facilitate rapid replay at a future time; to enable the device to cycle through several discrete items of synthesised speech under the control of the end-user).
  • compressed speech versions of the textual information could also be downloaded via transport protocols such as MOT.
  • the receiver could also be able to process a broadcast data file, such as a DLS announcement or radio station jingle.
  • the data file could also include information that improves the accuracy of the speech synthesis.
  • the receiver could implement a radio pause or rewind function in which, when activated, live radio is temporarily stored in memory whilst synthesised speech is output from a speaker in the receiver.
  • a method of enabling a digital radio receiver to provide information, comprising the step of synthesising speech in a speech synthesis module in the receiver, the synthesised speech conveying information ordinarily shown on a display of the receiver.
  • the present invention therefore uses audio as the primary source of feedback in place of the LCD used in radios available today. These techniques are predominantly aimed at making radios easier to use for the visually impaired but are also applicable to application areas where visual feedback may be distracting, such as automotive applications.
  • the present invention can be implemented in any radio system that provides additional textual information and hence includes DAB, DRM, IBOC and others.
  • the receiver confirms all operations performed by the user using a spoken equivalent of the operation's result. For example, changing the volume would cause a spoken message of the form 'Volume now 10'. For a button that has multiple functions, each press would announce the new mode; for example, this might cycle between 'Volume mode', 'Bass mode', 'Treble mode', 'Balance mode' and 'Playback resumed'.
  • DAB provides textual information about services
  • this can also be communicated to the user using a spoken equivalent.
  • the interaction to select a new channel would be something like:
    • User presses 'select' button on the digital radio
    • Audio for current service is muted
    • Radio says 'Select station mode' (i.e. speech synthesised)
    • User presses 'up' button on the digital radio
    • Radio looks for next station and outputs a spoken representation of its name, e.g. 'BBC Radio four'
    • User presses 'up' button again
    • Radio says 'BBC Five live'
    • User presses 'select' button
    • Radio says 'BBC Five live selected'
    • Radio starts playing Five live.
  • the radio makes use of a text to speech TTS module that converts text into speech that would normally be fed back to the user via the LCD.
  • the LCD can still be used to provide a second source of information.
  • TTS is a well understood problem and there are a number of vendors selling off-the-shelf TTS libraries suitable for the embedded devices typically found in today's digital radios. High quality TTS can consume large amounts of memory and processor time, so different radio models will have different TTS modules, chosen as the best trade-off between quality and use of the available resources.
  • with spoken feedback, the radio can be frustratingly slow to operate. There are a number of techniques that can be employed to improve this:
    • If the radio is in the middle of speaking and the user presses a key, then the current phrase should be interrupted and the phrase corresponding to the new key press should be started immediately. For example, when changing the volume the user might press the 'Up' button a number of times. Instead of hearing 'Volume now 5', 'Volume now 6' and 'Volume now 7', the user might hear something like 'Vol..', 'Vol..', 'Volume now 7'.
    • The radio can have an advanced mode where confirmations are minimised and abbreviated.
    • Radio station names can be abbreviated using contracted form hints transmitted with the station name. In some cases this leads to names of stations that can't easily be pronounced; in these cases the TTS module would return an error saying that the station name was unpronounceable and the normal longer name would be used instead.
    • If the radio supports a pause/rewind facility, then the current station could be paused while the radio is providing spoken feedback. This will eliminate the chance of parts of programmes being missed due to spoken feedback interrupting the programme. The user would then be able to 'catch up' to the live programme at a time of their choosing.
  • DAB also sends associated textual information with a programme known as Dynamic Label Segments (DLS).
  • DLS Dynamic Label Segments
  • a user may want to be able to hear these along with the actual on air audio.
  • the DLS text can be spoken by the radio after first muting the current service. This would use the same TTS functionality described above.
  • the difficulty for a user is knowing when the DLS has changed and also knowing if it contains something that the user has not heard before.
  • a radio station will have a series of DLS labels that it cycles through for a given programme.
  • the radio can cache a set of the most recent DLS texts. It can also remember which DLS the user has listened to. When they press the DLS button they would be able to cycle through the cached DLS with the most recent unlistened to DLS being presented first. If the user is interested in being notified of new DLS content then the radio can have a mode that beeps or says something like 'New DLS available' when new DLS is received.
  • the broadcaster could transmit spoken versions of the station name and DLS for the receiver to pick up, store and subsequently use.
  • the receiver uses the broadcaster supplied 'speech tags' rather than relying on synthesis in the receiver itself. These could make use of modern vocoding technology for high compression, or may simply provide additional intonation information to make the speech synthesis more realistic.
  • the format would have to be agreed between broadcasters and radio manufacturers. The trade-off is between high compression, and therefore low cost for broadcasters, and higher cost for receiver manufacturers, who would need to support another speech decoder.
  • the Multimedia Object Transport (MOT) protocol is the most natural choice for distributing these speech tags.
  • the MOT carousel could be sent in Programme Associated Data (PAD) or sent as a separate data service.
  • PAD Program Associated Data
  • the whole layout of the receiver's physical controls, the control surface design (e.g. encompassing Braille) and the expected semantics of operation of the receiver are standardised so as to provide the maximum assistance to a particular target group of users, for example those suffering from visual impairment. It is expected that any such standardisation would provide a set of HMI (human machine interface) 'idioms' for such routine tasks as navigating in and out of submenus, indicating selection, indicating a list of options etc., all provided through a standardised control set and requiring no visual feedback.
  • HMI human machine interface
  • the radio could contain a microphone and voice recognition module. This could be used to provide custom settings for different users of a shared radio. Different users may have different preset stations, preferred listening volume and mode of use preferences. The radio could also remember the last station each user was listening to.
  • An example application is a digital radio in a family car. On power up the radio would ask for the user's name and then load their custom settings. If the radio isn't able to recognise the user, or nothing is heard within a preset timeout, then it would default to the last user.
  • Voice control: the receiver can be further extended to enable voice-based control, allowing control without physical contact or visual feedback, making it potentially useful for the elderly or those with serious physical impairment. Alternatively, voice control could be combined with visual feedback if so desired. Voice control would use a voice recognition module to understand the user's request. There are varying levels of sophistication of implementation. The simplest implementation would have a predefined set of possible commands that closely match the names of the buttons on an equivalent 'physical' interface. For example, commands might include 'Select', 'Volume', 'Up', 'Down' etc. Restricting the set to a known set of commands makes the recognition task easier and hence translates to a cheaper end unit price.
  • the end user experience can be made easier by also enabling the radio to understand station names. This would enable a user to select a station by saying a station name rather than having to go through a sequence like 'Select', 'Up', 'Up', 'Up', 'Select'. This would of course require more powerful speech recognition than in the previous example.
  • DAB includes the facility to transmit an Electronic Programme Guide (EPG) which includes a list of the programmes that are on for various stations.
  • EPG Electronic Programme Guide
  • a voiced feedback receiver could enable the user to browse the EPG and listen to the details of each programme. These details include the name, time, duration and description. Each of these could be spoken in turn as the user navigates through the EPG. If the radio supports setting up timed recordings based on EPG then these could also be spoken to the user so that they can be reminded of what they have scheduled to be recorded.
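The channel-selection interaction listed in the description ('select', 'up', 'up', 'select') can be read as a small state machine. The following is a minimal Python sketch of that reading. The class name `StationSelector`, its fields, and the `spoken` list that stands in for TTS output are assumptions made for illustration; only the button names and the spoken phrases come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class StationSelector:
    """Hypothetical model of the 'select' / 'up' interaction."""
    stations: list
    current: int = 0          # index of the station being played
    cursor: int = 0           # index highlighted while browsing
    selecting: bool = False   # True while in 'Select station mode'
    muted: bool = False
    spoken: list = field(default_factory=list)  # stands in for TTS output

    def say(self, text):
        self.spoken.append(text)

    def press(self, button):
        if button == "select" and not self.selecting:
            # Enter selection mode: mute the service, then announce the mode.
            self.muted = True
            self.selecting = True
            self.cursor = self.current
            self.say("Select station mode")
        elif button == "up" and self.selecting:
            # Step to the next station and speak its name.
            self.cursor = (self.cursor + 1) % len(self.stations)
            self.say(self.stations[self.cursor])
        elif button == "select" and self.selecting:
            # Confirm the choice, announce it and resume audio.
            self.current = self.cursor
            self.selecting = False
            self.muted = False
            self.say(f"{self.stations[self.current]} selected")
```

Pressing 'up' outside selection mode is deliberately ignored here; a real receiver would presumably map it to another function such as volume.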
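The description suggests that a new key press should cut off the phrase currently being spoken. A minimal Python sketch of that behaviour follows. It simulates speech as text voiced a few characters at a time, so it does not drive a real TTS engine; `InterruptibleSpeaker`, `tick` and `demo_volume_presses` are hypothetical names introduced for illustration.

```python
class InterruptibleSpeaker:
    """Simulated speaker: a new phrase cuts off the one in progress."""

    def __init__(self):
        self.heard = []      # fragments that were actually voiced
        self._phrase = ""
        self._pos = 0        # how many characters have been voiced so far

    def say(self, phrase):
        self._cut_off()                      # interrupt whatever is playing
        self._phrase, self._pos = phrase, 0

    def tick(self, chars=1):
        """Advance simulated playback by a number of characters."""
        self._pos = min(len(self._phrase), self._pos + chars)

    def finish(self):
        """Let the current phrase play to the end."""
        self._pos = len(self._phrase)
        self._cut_off()

    def _cut_off(self):
        if self._pos:
            self.heard.append(self._phrase[:self._pos])
        self._phrase, self._pos = "", 0


def demo_volume_presses():
    """Three quick 'Up' presses starting from volume 4."""
    speaker, volume = InterruptibleSpeaker(), 4
    for _ in range(3):
        volume += 1
        speaker.say(f"Volume now {volume}")
        speaker.tick(3)      # the next press arrives after three characters
    speaker.finish()         # no further presses: the last phrase completes
    return speaker.heard
```

With these assumed timings the listener hears two truncated fragments followed by the full final confirmation, which matches the 'Vol..', 'Vol..', 'Volume now 7' example in the description.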
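The DLS caching behaviour described above (keep the most recent texts, remember which ones the user has heard, present the most recent unheard text first) could be sketched as follows. This is an illustrative sketch only: `DlsCache`, its `capacity` default, and the choice to refresh recency when a label repeats are assumptions, not details from the application.

```python
from collections import OrderedDict


class DlsCache:
    """Hypothetical cache of recent DLS texts with a 'heard' flag each."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self._items = OrderedDict()   # text -> heard flag, oldest first

    def receive(self, text):
        """Store a DLS text; return True if it was not already cached."""
        if text in self._items:
            self._items.move_to_end(text)    # refresh recency, keep the flag
            return False
        self._items[text] = False
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
        return True

    def mark_heard(self, text):
        if text in self._items:
            self._items[text] = True

    def playback_order(self):
        """Unheard texts first, newest first within each group."""
        newest_first = list(reversed(self._items))
        unheard = [t for t in newest_first if not self._items[t]]
        heard = [t for t in newest_first if self._items[t]]
        return unheard + heard
```

A receiver in notification mode could beep or say 'New DLS available' whenever `receive` returns True, and the DLS button would step through `playback_order()`, calling `mark_heard` after each text is spoken.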
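Two fallbacks in the description can be combined into one decision: prefer a broadcaster-supplied speech tag when one has been stored, otherwise synthesise the abbreviated station name, and fall back to the longer name if the TTS module reports the short form as unpronounceable. The sketch below assumes a toy TTS stand-in (`toy_tts`) whose 'no vowels' rule is purely hypothetical; a real engine would apply its own criteria. All names in the code, including the example station names in the usage note, are illustrative.

```python
class Unpronounceable(Exception):
    """Raised by the TTS stand-in when it cannot voice a word."""


def toy_tts(text):
    """Toy stand-in for a TTS engine (assumption, not a real library).

    Rejects alphabetic words longer than three letters with no vowels.
    """
    for word in text.split():
        if word.isalpha() and len(word) > 3 and not any(
            ch in "aeiouyAEIOUY" for ch in word
        ):
            raise Unpronounceable(word)
    return f"<speech:{text}>"


def announce_station(long_name, short_name=None, speech_tags=None, tts=toy_tts):
    """Return the audio to play for a station name."""
    speech_tags = speech_tags or {}
    if long_name in speech_tags:
        return speech_tags[long_name]   # broadcaster-supplied tag is preferred
    if short_name:
        try:
            return tts(short_name)      # try the abbreviated form first
        except Unpronounceable:
            pass                        # fall back to the normal longer name
    return tts(long_name)
```

For example, `announce_station("Example Radio", "XMPLRD")` falls back to synthesising 'Example Radio', while passing `speech_tags={"Example Radio": b"..."}` returns the stored tag without invoking synthesis at all.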
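The two levels of voice control described above (a small fixed command set, optionally extended with station names) can be illustrated with a short Python sketch. It assumes the voice recognition module has already turned audio into text, which is the hard part and is not shown. The function name `interpret`, the `COMMANDS` tuple and the 0.8 similarity cutoff are assumptions; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib

# Restricted vocabulary mirroring the button names given in the description.
COMMANDS = ("select", "volume", "up", "down")


def interpret(utterance, station_names=()):
    """Map recognised text to a command, a station, or 'unknown'."""
    text = utterance.strip().lower()
    if text in COMMANDS:
        return ("command", text)
    # Optional second level: tolerate small recognition errors in station names.
    by_lower = {name.lower(): name for name in station_names}
    match = difflib.get_close_matches(text, list(by_lower), n=1, cutoff=0.8)
    if match:
        return ("station", by_lower[match[0]])
    return ("unknown", utterance)
```

Keeping the command check as an exact lookup reflects the point made in the description that a restricted set makes recognition easier; the fuzzy station lookup is the more demanding extension.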

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Circuits Of Receivers In General (AREA)

Abstract

The present invention relates to a digital radio receiver comprising a speech synthesis module able to generate synthesised speech, the synthesised speech conveying information ordinarily shown on a display of the receiver. The described embodiment uses audio as the primary source of feedback in place of the LCD used in radios available today. These techniques are predominantly aimed at making radios easier to use for the visually impaired, but are also applicable to application areas where visual feedback may be distracting, such as automotive applications. The method can be implemented in any digital radio system that provides additional textual information, and hence includes DAB, DRM, RDS, IBOC and others.
PCT/GB2004/004295 2003-10-08 2004-10-08 Digital radio receiver with speech synthesis WO2005036786A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0323551A GB0323551D0 (en) 2003-10-08 2003-10-08 DAB radio system with voiced control feedback
GB0323551.2 2003-10-08

Publications (1)

Publication Number Publication Date
WO2005036786A1 (fr) 2005-04-21

Family

ID=29433501

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2004/004295 WO2005036786A1 (fr) 2003-10-08 2004-10-08 Digital radio receiver with speech synthesis

Country Status (2)

Country Link
GB (2) GB0323551D0 (fr)
WO (1) WO2005036786A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2411664C2 (ru) * 2006-06-16 2011-02-10 Nokia Corporation Identification of broadcast channels
EP2407961A3 (fr) * 2010-07-13 2012-02-01 Sony Europe Limited Broadcast system using text to speech conversion
US8694322B2 (en) 2005-08-05 2014-04-08 Microsoft Corporation Selective confirmation for execution of a voice activated user interface

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE449399T1 (de) 2005-05-31 2009-12-15 Telecom Italia Spa Provision of speech synthesis on user terminals over a communications network
EP1879310A1 (fr) * 2006-07-11 2008-01-16 Harman Becker Automotive Systems GmbH Method for decoding a digital radio broadcast data stream

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0259970A2 (fr) * 1986-09-08 1988-03-16 British Broadcasting Corporation Radio receivers
GB2295062A (en) * 1994-11-08 1996-05-15 British Broadcasting Corp Recording the call sign of a radio station for play back when tuned
EP0966102A1 (fr) * 1998-06-17 1999-12-22 Deutsche Thomson-Brandt Gmbh Method and device for signalling a change of programme or programme source to the listener by means of an acoustic marker
EP1041755A2 (fr) * 1999-03-27 2000-10-04 Robert Bosch Gmbh Method for transmitting and processing traffic information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
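CN1130907C (zh) * 1998-06-02 2003-12-10 通用仪器公司 (General Instrument Corporation) Method and apparatus for providing sound clips to enhance the user interface of a set-top box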
GB0307451D0 (en) * 2003-03-31 2003-05-07 Matsushita Electric Ind Co Ltd Digital receiver with aural interface

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KOZAMERNIK F: "DIGITAL AUDIO BROADCASTING", EBU REVIEW- TECHNICAL, EUROPEAN BROADCASTING UNION. BRUSSELS, BE, no. 279, 21 March 1999 (1999-03-21), pages 13 - 27, XP000848407, ISSN: 0251-0936 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8694322B2 (en) 2005-08-05 2014-04-08 Microsoft Corporation Selective confirmation for execution of a voice activated user interface
RU2411664C2 (ru) * 2006-06-16 2011-02-10 Nokia Corporation Identification of broadcast channels
EP2407961A3 (fr) * 2010-07-13 2012-02-01 Sony Europe Limited Broadcast system using text to speech conversion
CN102378050A (zh) * 2010-07-13 2012-03-14 索尼欧洲有限公司 (Sony Europe Limited) Broadcast system using text to speech conversion
US9263027B2 (en) 2010-07-13 2016-02-16 Sony Europe Limited Broadcast system using text to speech conversion
CN102378050B (zh) * 2010-07-13 2017-03-01 索尼欧洲有限公司 (Sony Europe Limited) Broadcast system using text to speech conversion

Also Published As

Publication number Publication date
GB2406983B (en) 2005-12-21
GB2406983A (en) 2005-04-13
GB0323551D0 (en) 2003-11-12
GB0422412D0 (en) 2004-11-10

Similar Documents

Publication Publication Date Title
KR100754676B1 (ko) Apparatus and method for managing electronic programme guide data in a digital broadcast receiving terminal
EP2407961B1 (fr) Broadcast system using text to speech conversion
US8831948B2 (en) System and method for synthetically generated speech describing media content
CN1753502B (zh) System and method for providing advertisement music
JP3086368B2 (ja) Broadcast communication apparatus
US20070260460A1 (en) Method and system for announcing audio and video content to a user of a mobile radio terminal
EP1860807A2 (fr) Apparatus and method for receiving digital multimedia broadcasting in an electronic device
US11258841B2 (en) Method for the transmission of audio contents in a hybrid receiver, system, receiver and program associated with the method
CN102244750B (zh) Video display apparatus with sound level control function and control method thereof
EP1528700A3 (fr) Digital/multimedia broadcast receiver
US20090247096A1 (en) Method And System For Integrated FM Recording
EP1734750A2 (fr) Method and device for receiving digital television broadcast signals
US6697608B2 (en) Digital audio/visual receiver with recordable memory
US20040194137A1 (en) Method, system, and apparatus for aural presentation of program guide
WO2005036786A1 (fr) Digital radio receiver with speech synthesis
JP5551186B2 (ja) Broadcast receiving apparatus and method for audio output of programme information in a broadcast receiving apparatus
KR20070111798A (ko) Method for sharing broadcast information in a portable terminal
US8472902B2 (en) Radio broadcast receiving apparatus and radio broadcast receiving method
WO2004102844A2 (fr) Improvements to digital radio
EP1465361A2 (fr) Digital receiver with aural interface
KR102339168B1 (ko) Method and apparatus for controlling digital radio broadcasting
JP2001127718A (ja) Advertisement audio insertion method and apparatus
KR101253640B1 (ko) Method for providing subtitles and apparatus therefor
KR20070025770A (ko) Digital multimedia broadcasting receiving apparatus and method for providing digital multimedia broadcasting channel information in a digital multimedia broadcasting receiving apparatus
EP2953373A1 (fr) Radio receiver and method for processing broadcast signals in a broadcast signal receiving device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed from 20040101)
122 EP: PCT application non-entry in European phase