WO2005036786A1 - Digital radio receiver with speech synthesis - Google Patents
Digital radio receiver with speech synthesis
- Publication number
- WO2005036786A1 (application PCT/GB2004/004295)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- receiver
- speech
- radio
- information
- digital radio
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H40/00—Arrangements specially adapted for receiving broadcast information
- H04H40/18—Arrangements characterised by circuits or components specially adapted for receiving
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03J—TUNING RESONANT CIRCUITS; SELECTING RESONANT CIRCUITS
- H03J2200/00—Indexing scheme relating to tuning resonant circuits and selecting resonant circuits
- H03J2200/30—Radio receiver with speech synthesis ability, used for conveying information that is shown on the display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H2201/00—Aspects of broadcast communication
- H04H2201/10—Aspects of broadcast communication characterised by the type of broadcast system
- H04H2201/20—Aspects of broadcast communication characterised by the type of broadcast system digital audio broadcasting [DAB]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H60/00—Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
- H04H60/68—Systems specially adapted for using specific information, e.g. geographical or meteorological information
- H04H60/73—Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information
- H04H60/74—Systems specially adapted for using specific information, e.g. geographical or meteorological information using meta-information using programme related information, e.g. title, composer or interpreter
Definitions
- the present invention provides a number of methods and techniques for making a digital radio (e.g. a Digital Audio Broadcasting (DAB) radio) usable without visual feedback.
- a digital radio e.g. a Digital Audio Broadcasting (DAB) radio
- Digital radio receivers can decode digital radio signals conforming to one or more different formats, such as DAB, DRM or IBOC.
- digital radios have displays that can show textual information broadcast with a radio channel (e.g. the name of the song being broadcast on that channel). Because digital radios generally have displays that can show text and in some cases graphics, much of the user interaction development for digital radios has centered on new graphical user interfaces, or other ways of presenting information to the end-user that take advantage of the display and the ability of a digital radio channel to carry an information payload.
- a digital radio receiver comprising a speech synthesis module able to generate synthesised speech, the synthesised speech conveying information ordinarily shown on a display of the receiver.
- the present invention moves away from the established bias in digital radio user interaction design by ignoring, or at least supplementing, the display with synthesized speech.
- the synthesised speech could confirm a control operation performed by an end-user (e.g. switching to pre-set station 1 could be accompanied by a synthesized voice saying "preset one"; the operation of tuning could be accompanied by the synthesized voice saying "tuning", etc.).
- the receiver may include a display and the information conveyed by the synthesised speech is then also shown on the display.
- the synthesized speech may not correspond to anything in fact displayed on a display.
- the synthesised speech could be generated by synthesising textual information broadcast with a digital radio channel received by the receiver.
- the textual information could be DLS, such as the station name the receiver is currently tuned into. It could be an alert relating to a newly received DLS.
- the textual information could be stored in memory (e.g. to facilitate rapid replay at a future time; to enable the device to cycle through several discrete items of synthesised speech under the control of the end-user).
- compressed speech versions of the textual information could also be downloaded via transport protocols such as MOT.
- the receiver could also be able to process a broadcast data file, such as a DLS announcement or radio station jingle.
- the data file could also include information that improves the accuracy of the speech synthesis.
- the receiver could activate a radio pause or rewind function, in which live radio is temporarily stored in memory whilst synthesized speech is output from a speaker in the receiver.
- a method of enabling a digital radio receiver to provide information, comprising the step of synthesising speech in a speech synthesis module in the receiver, the synthesised speech conveying information ordinarily shown on a display of the receiver.
- the present invention therefore uses audio as the primary source of feedback in place of the LCD used in radios available today. These techniques are predominantly aimed at making radios easier to use for the visually impaired but are also applicable to application areas where visual feedback may be distracting, such as automotive applications.
- the present invention can be implemented in any radio system that provides additional textual information and hence includes DAB, DRM, IBOC and others.
- the receiver confirms all operations performed by the user using a spoken equivalent of the operation's result. For example, changing the volume would cause a spoken message of the form 'Volume now 10'. For a button that has multiple functions, each press would announce the new mode; for example, this might cycle between 'Volume mode', 'Bass mode', 'Treble mode', 'Balance mode' and 'Playback resumed'.
- DAB provides textual information about services
- this can also be communicated to the user using a spoken equivalent.
- the interaction to select a new channel would be something like:
  • User presses 'select' button on the digital radio
  • Audio for current service is muted
  • Radio says 'Select station mode' (i.e. speech synthesised)
  • User presses 'up' button on the digital radio
  • Radio looks for next station and outputs a spoken representation of its name (e.g. 'BBC Radio four')
  • User presses 'up' button again
  • Radio says 'BBC Five live'
  • User presses 'select' button
  • Radio says 'BBC Five live selected'
  • Radio starts playing Five live.
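The dialogue above can be read as a small state machine. The following Python sketch is illustrative only: the class `StationSelector`, its `press` and `speak` methods, and the station list are hypothetical names introduced here, and `speak` simply records phrases where a real receiver would call its speech synthesis module.

```python
class StationSelector:
    """Hypothetical model of the 'select station' dialogue."""

    def __init__(self, stations, current=0):
        self.stations = stations   # ordered station names (assumed already known)
        self.current = current     # index of the station being played
        self.cursor = current      # index being browsed in select mode
        self.selecting = False
        self.muted = False
        self.spoken = []           # stands in for the synthesised speech output

    def speak(self, phrase):
        self.spoken.append(phrase)

    def press(self, button):
        if button == "select" and not self.selecting:
            self.selecting = True
            self.muted = True      # mute the current service before speaking
            self.cursor = self.current
            self.speak("Select station mode")
        elif button == "up" and self.selecting:
            self.cursor = (self.cursor + 1) % len(self.stations)
            self.speak(self.stations[self.cursor])
        elif button == "select" and self.selecting:
            self.current = self.cursor
            self.selecting = False
            self.speak(f"{self.stations[self.current]} selected")
            self.muted = False     # resume audio on the chosen station


selector = StationSelector(["BBC Radio three", "BBC Radio four", "BBC Five live"])
for button in ["select", "up", "up", "select"]:
    selector.press(button)
print(selector.spoken)
```

Keeping the browsing cursor separate from the playing station means the current service is only changed once the user confirms with the second 'select' press.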
- the radio makes use of a text-to-speech (TTS) module that converts into speech the text that would normally be fed back to the user via the LCD.
- the LCD can still be used to provide a second source of information.
- TTS is a well understood problem and there are a number of vendors selling off-the-shelf TTS libraries suitable for the embedded devices typically found in today's digital radios. High quality TTS can consume a large amount of memory and processor time, so different radio models will have different TTS modules, chosen as the best trade-off between quality and use of the resources available.
- the radio is frustratingly slow to operate. There are a number of techniques that can be employed to improve this:
  • If the radio is in the middle of speaking and the user presses a key, then the current phrase should be interrupted and the phrase corresponding to the new key press should be started immediately. For example, when changing the volume the user might press the 'Up' button a number of times. Instead of hearing 'Volume now 5', 'Volume now 6' and 'Volume now 7', the user might hear something like 'Vol..', 'Vol..', 'Volume now 7'.
  • The radio can have an advanced mode where confirmations are minimised and abbreviated.
  • Radio station names can be abbreviated using contracted form hints transmitted with the station name. In some cases this leads to names of stations that can't easily be pronounced; in these cases the TTS module would return an error saying that the station name was unpronounceable and the normal longer name would be used instead.
  • If the radio supports a pause/rewind facility, then the current station could be paused while the radio is providing spoken feedback. This will eliminate the chance of parts of programmes being missed due to spoken feedback interrupting the programme. The user would then be able to 'catch up' to the live programme at a time of their choosing.
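The interruption technique can be illustrated with a short simulation. This is a sketch under stated assumptions, not a real TTS interface: `InterruptibleAnnouncer`, `say` and `advance` are hypothetical names, and speech output is modelled as characters emitted over time.

```python
class InterruptibleAnnouncer:
    """Hypothetical announcer in which a new phrase cuts off the current one."""

    def __init__(self):
        self.pending = ""    # part of the current phrase not yet spoken
        self._partial = ""   # part of the current phrase already spoken
        self.heard = []      # what the user actually heard, in order

    def say(self, phrase):
        # A new key press interrupts whatever is still being spoken.
        self._flush(interrupted=bool(self.pending))
        self.pending = phrase

    def advance(self, n_chars):
        """Simulate the synthesiser outputting n_chars more characters."""
        self._partial += self.pending[:n_chars]
        self.pending = self.pending[n_chars:]
        if not self.pending:
            self._flush(interrupted=False)

    def _flush(self, interrupted):
        if self._partial:
            self.heard.append(self._partial + (".." if interrupted else ""))
        self._partial = ""


announcer = InterruptibleAnnouncer()
announcer.say("Volume now 5")
announcer.advance(3)            # user presses 'Up' again after 'Vol'
announcer.say("Volume now 6")
announcer.advance(3)
announcer.say("Volume now 7")
announcer.advance(100)          # no further key presses: phrase completes
print(announcer.heard)
```

The same structure would let an 'advanced mode' substitute shorter phrases before they are passed to `say`.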
- DAB also sends associated textual information with a programme known as Dynamic Label Segments (DLS).
- DLS Dynamic Label Segments
- a user may want to be able to hear these along with the actual on air audio.
- the DLS text can be spoken by the radio after first muting the current service. This would use the same TTS functionality described above.
- the difficulty for a user is knowing when the DLS has changed and also knowing if it contains something that the user has not heard before.
- a radio station will have a series of DLS labels that it cycles through for a given programme.
- the radio can cache a set of the most recent DLS texts. It can also remember which DLS the user has listened to. When they press the DLS button they would be able to cycle through the cached DLS with the most recent unlistened to DLS being presented first. If the user is interested in being notified of new DLS content then the radio can have a mode that beeps or says something like 'New DLS available' when new DLS is received.
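A minimal sketch of such a cache follows. The class `DlsCache`, its method names and the capacity of 8 are assumptions made for illustration; the text does not specify how many labels are kept or how they are stored.

```python
from collections import OrderedDict


class DlsCache:
    """Hypothetical cache of recent DLS texts with a per-label 'heard' flag."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self._items = OrderedDict()   # text -> heard flag, oldest first

    def receive(self, text):
        """Store a received label; return True if it is new to the cache."""
        if text in self._items:
            return False              # station is cycling through known labels
        self._items[text] = False
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the oldest label
        return True

    def mark_heard(self, text):
        if text in self._items:
            self._items[text] = True

    def playback_order(self):
        """Unheard labels first (newest first), then heard ones (newest first)."""
        newest_first = list(reversed(self._items))
        unheard = [t for t in newest_first if not self._items[t]]
        heard = [t for t in newest_first if self._items[t]]
        return unheard + heard
```

The return value of `receive` is where a notification mode could hook in, for example by beeping or saying 'New DLS available' whenever it is true.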
- the broadcaster could transmit spoken versions of the station name and DLS for the receiver to pick up, store and subsequently use.
- the receiver uses the broadcaster-supplied 'speech tags' rather than relying on synthesis in the receiver itself. These could make use of modern vocoding technology for high compression, or may simply provide additional intonation information to make the speech synthesis more realistic.
- the format would have to be agreed between broadcasters and radio manufacturers. The trade-off is between high compression, and therefore low cost for broadcasters, and higher cost for receiver manufacturers due to needing to support another audio speech decoder.
- the Multimedia Object Transport (MOT) protocol is the most natural choice for distributing these speech tags.
- the MOT carousel could be sent in Programme Associated Data (PAD) or sent as a separate data service.
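One way a receiver might combine broadcaster-supplied speech tags with its own synthesis is to prefer a stored tag and fall back to TTS when none has been received. The sketch below assumes a plain dictionary of previously stored tags and a caller-supplied `synthesise` function; both are hypothetical, and no MOT decoding is shown.

```python
def audio_for_label(text, speech_tags, synthesise):
    """Prefer a broadcaster-supplied speech tag; otherwise synthesise locally.

    speech_tags: dict mapping label text to stored audio data (assumed format).
    synthesise:  callable turning text into audio data (stands in for TTS).
    """
    tag = speech_tags.get(text)
    if tag is not None:
        return "tag", tag
    return "tts", synthesise(text)


def fake_tts(text):
    # Placeholder for a real TTS module, used only for this example.
    return b"tts:" + text.encode("utf-8")


stored_tags = {"BBC Radio four": b"<compressed speech>"}
print(audio_for_label("BBC Radio four", stored_tags, fake_tts))
print(audio_for_label("BBC Five live", stored_tags, fake_tts))
```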
- PAD Program Associated Data
- the whole layout of the receiver physical controls, the control surface design (e.g., encompassing Braille) and the expected semantics of operation of the receiver is standardised so as to provide the maximum assistance to a particular target group of users, for example those suffering from visual impairment. It is expected that any such standardisation would provide a set of HMI (human machine interface) 'idioms' for such routine tasks as navigating in and out of submenus, indicating selection, indicating list of options etc., all provided through a standardised control set and requiring no visual feedback.
- HMI human machine interface
- the radio could contain a microphone and voice recognition module. This could be used to provide custom settings for different users of a shared radio. Different users may have different preset stations, preferred listening volume and mode of use preferences. The radio could also remember the last station each user was listening to.
- An example application is a digital radio in a family car. On power up the radio would ask for the user's name and then load their custom settings. If the radio isn't able to recognise the user or nothing is heard then it would default to the last user within a preset timeout.
- Voice control: The receiver can be further extended to enable voice based control, allowing control without physical contact or visual feedback, making it potentially useful for the elderly or those with serious physical impairment. Alternatively, voice control could be combined with visual feedback if so desired. Voice control would use a voice recognition module to understand the user's request. There are varying levels of sophistication of implementation. The simplest implementation would have a predefined set of possible commands that closely match the names of the buttons on an equivalent 'physical' interface. For example, commands might include: 'Select', 'Volume', 'Up', 'Down' etc. Restricting the set to a known set of commands makes the recognition task easier and hence translates to a cheaper end unit price.
- the end user experience can be made easier by also enabling the radio to understand station names. This would enable a user to select a station by saying a radio name rather than having to go through a sequence like 'Select', 'Up', 'Up', 'Up', 'Select'. This would of course require more powerful speech recognition than in the previous example.
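The two levels of sophistication can be sketched as a lookup over already-recognised text. This example does not perform speech recognition itself: it assumes a recogniser has produced a text string, and the function `interpret`, the `COMMANDS` tuple and the similarity cutoff are hypothetical choices. Fuzzy matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

COMMANDS = ("select", "volume", "up", "down")   # assumed fixed vocabulary


def interpret(utterance, station_names, cutoff=0.8):
    """Return ('command', word), ('station', name) or None for recognised text."""
    heard = " ".join(utterance.lower().split())
    # Level 1: exact match against the small predefined command set.
    if heard in COMMANDS:
        return ("command", heard)
    # Level 2: approximate match against the known station names.
    by_lower = {name.lower(): name for name in station_names}
    close = difflib.get_close_matches(heard, list(by_lower), n=1, cutoff=cutoff)
    if close:
        return ("station", by_lower[close[0]])
    return None


stations = ["BBC Radio four", "BBC Five live"]
print(interpret("Select", stations))
print(interpret("bbc radio for", stations))
```

Checking the fixed command set first keeps the cheap path cheap; the station list changes with reception, which is part of why it needs the more capable matching.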
- DAB includes the facility to transmit an Electronic Programme Guide (EPG) which includes a list of the programmes that are on for various stations.
- EPG Electronic Programme Guide
- a voiced feedback receiver could enable the user to browse the EPG and listen to the details of each programme. These details include the name, time, duration and description. Each of these could be spoken in turn as the user navigates through the EPG. If the radio supports setting up timed recordings based on EPG then these could also be spoken to the user so that they can be reminded of what they have scheduled to be recorded.
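As an illustration of speaking each EPG detail in turn, the sketch below turns one programme entry into an ordered list of phrases that could be passed to the TTS module one at a time. The `Programme` fields, the phrase wording and the sample entry are assumptions for this example and do not reflect the DAB EPG data format.

```python
from dataclasses import dataclass


@dataclass
class Programme:
    name: str
    start: str          # start time as "HH:MM" (assumed representation)
    duration_min: int   # duration in minutes
    description: str


def spoken_fields(programme):
    """Return the phrases to speak, in order: name, time, duration, description."""
    return [
        programme.name,
        f"Starts at {programme.start}",
        f"{programme.duration_min} minutes",
        programme.description,
    ]


entry = Programme("Evening News", "18:00", 30, "National and international news.")
for phrase in spoken_fields(entry):
    print(phrase)
```

Returning separate phrases rather than one long sentence lets the user interrupt or skip ahead between fields while navigating.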
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Circuits Of Receivers In General (AREA)
Abstract
The present invention relates to a digital radio receiver comprising a speech synthesis module able to generate synthesised speech, the synthesised speech conveying information ordinarily shown on a display of the receiver. The described implementation uses audio as the primary source of feedback in place of the LCD used in radios available today. These techniques are predominantly aimed at making radios easier to use for the visually impaired, but are also applicable to application areas where visual feedback may be distracting, such as automotive applications. The method can be implemented in any digital radio system that provides additional textual information and hence includes DAB, DRM, RDS, IBOC and others.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0323551A GB0323551D0 (en) | 2003-10-08 | 2003-10-08 | DAB radio system with voiced control feedback |
GB0323551.2 | 2003-10-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2005036786A1 (fr) | 2005-04-21 |
Family
ID=29433501
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2004/004295 WO2005036786A1 (fr) | 2003-10-08 | 2004-10-08 | Digital radio receiver with speech synthesis |
Country Status (2)
Country | Link |
---|---|
GB (2) | GB0323551D0 (fr) |
WO (1) | WO2005036786A1 (fr) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
RU2411664C2 (ru) * | 2006-06-16 | 2011-02-10 | Нокиа Корпорейшн | Identification of broadcast channels |
EP2407961A3 (fr) * | 2010-07-13 | 2012-02-01 | Sony Europe Limited | Broadcast system using text to speech conversion |
US8694322B2 (en) | 2005-08-05 | 2014-04-08 | Microsoft Corporation | Selective confirmation for execution of a voice activated user interface |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE449399T1 (de) | 2005-05-31 | 2009-12-15 | Telecom Italia Spa | Provision of speech synthesis on user terminals over a communications network |
EP1879310A1 (fr) * | 2006-07-11 | 2008-01-16 | Harman Becker Automotive Systems GmbH | Method for decoding digital broadcast data streams |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0259970A2 (fr) * | 1986-09-08 | 1988-03-16 | British Broadcasting Corporation | Radio receivers |
GB2295062A (en) * | 1994-11-08 | 1996-05-15 | British Broadcasting Corp | Recording the call sign of a radio station for play back when tuned |
EP0966102A1 (fr) * | 1998-06-17 | 1999-12-22 | Deutsche Thomson-Brandt Gmbh | Method and device for signalling a change of programme or programme source to the listener by means of an acoustic mark |
EP1041755A2 (fr) * | 1999-03-27 | 2000-10-04 | Robert Bosch Gmbh | Method for transmitting and processing traffic information |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1130907C (zh) * | 1998-06-02 | 2003-12-10 | 通用仪器公司 | Method and apparatus for providing sound clips to enhance the user interface of a set-top box |
GB0307451D0 (en) * | 2003-03-31 | 2003-05-07 | Matsushita Electric Ind Co Ltd | Digital receiver with aural interface |
-
2003
- 2003-10-08 GB GB0323551A patent/GB0323551D0/en not_active Ceased
-
2004
- 2004-10-08 WO PCT/GB2004/004295 patent/WO2005036786A1/fr active Application Filing
- 2004-10-08 GB GB0422412A patent/GB2406983B/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
KOZAMERNIK F: "DIGITAL AUDIO BROADCASTING", EBU REVIEW- TECHNICAL, EUROPEAN BROADCASTING UNION. BRUSSELS, BE, no. 279, 21 March 1999 (1999-03-21), pages 13 - 27, XP000848407, ISSN: 0251-0936 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8694322B2 (en) | 2005-08-05 | 2014-04-08 | Microsoft Corporation | Selective confirmation for execution of a voice activated user interface |
RU2411664C2 (ru) * | 2006-06-16 | 2011-02-10 | Нокиа Корпорейшн | Identification of broadcast channels |
EP2407961A3 (fr) * | 2010-07-13 | 2012-02-01 | Sony Europe Limited | Broadcast system using text to speech conversion |
CN102378050A (zh) * | 2010-07-13 | 2012-03-14 | 索尼欧洲有限公司 | Broadcast system using text to speech conversion |
US9263027B2 (en) | 2010-07-13 | 2016-02-16 | Sony Europe Limited | Broadcast system using text to speech conversion |
CN102378050B (zh) * | 2010-07-13 | 2017-03-01 | 索尼欧洲有限公司 | Broadcast system using text to speech conversion |
Also Published As
Publication number | Publication date |
---|---|
GB2406983B (en) | 2005-12-21 |
GB2406983A (en) | 2005-04-13 |
GB0323551D0 (en) | 2003-11-12 |
GB0422412D0 (en) | 2004-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100754676B1 (ko) | Apparatus and method for managing electronic programme guide data in a digital broadcast receiving terminal | |
EP2407961B1 (fr) | Broadcast system using text to speech conversion | |
US8831948B2 (en) | System and method for synthetically generated speech describing media content | |
CN1753502B (zh) | System and method for providing advertisement music | |
JP3086368B2 (ja) | Broadcast communication device | |
US20070260460A1 (en) | Method and system for announcing audio and video content to a user of a mobile radio terminal | |
EP1860807A2 (fr) | Apparatus and method for receiving digital multimedia broadcast in an electronic device | |
US11258841B2 (en) | Method for the transmission of audio contents in a hybrid receiver, system, receiver and program associated with the method | |
CN102244750B (zh) | Video display device with sound level control function and control method thereof | |
EP1528700A3 (fr) | Digital/multimedia broadcast receiver | |
US20090247096A1 (en) | Method And System For Integrated FM Recording | |
EP1734750A2 (fr) | Method and device for receiving digital television broadcast signals | |
US6697608B2 (en) | Digital audio/visual receiver with recordable memory | |
US20040194137A1 (en) | Method, system, and apparatus for aural presentation of program guide | |
WO2005036786A1 (fr) | Digital radio receiver with speech synthesis | |
JP5551186B2 (ja) | Broadcast receiving device and method for outputting programme information as audio in a broadcast receiving device | |
KR20070111798A (ko) | Method for sharing broadcast information in a portable terminal | |
US8472902B2 (en) | Radio broadcast receiving apparatus and radio broadcast receiving method | |
WO2004102844A2 (fr) | Improvements to digital radio | |
EP1465361A2 (fr) | Digital receiver with an aural interface | |
KR102339168B1 (ko) | Method and apparatus for controlling digital radio broadcasting | |
JP2001127718A (ja) | Method and apparatus for inserting advertisement audio | |
KR101253640B1 (ko) | Method for providing subtitles and apparatus therefor | |
KR20070025770A (ko) | Digital multimedia broadcast receiving apparatus and method for providing digital multimedia broadcast channel information in a digital multimedia broadcast receiving apparatus | |
EP2953373A1 (fr) | Radio receiver and method for processing broadcast signals in a broadcast signal receiving device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DPEN | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101) | ||
122 | Ep: pct application non-entry in european phase |