EP2608195A1 - Secure text-to-speech synthesis in portable electronic devices - Google Patents
- Publication number
- EP2608195A1 (application EP11195142.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- portable electronic
- text
- server
- electronic device
- message
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G10L13/047—Architecture of speech synthesisers
Definitions
- The present disclosure relates generally to text-to-speech synthesis. More particularly, the present disclosure relates to a method and system for secure text-to-speech synthesis in portable electronic devices.
- Portable electronic devices include, for example, several types of mobile stations such as simple cellular telephones, smart phones, wireless personal digital assistants (PDAs), and laptop computers with wireless 802.11 or Bluetooth capabilities.
- Text-to-speech synthesis can be used in a number of applications to convert normal language text into speech, and can be implemented in software or hardware. For example, those who are visually impaired may use text-to-speech systems to read textual material. The use of text-to-speech synthesis can be useful in portable electronic devices, such as for the reading of email and text messages.
- FIG. 1 is a block diagram of a portable electronic device in accordance with the disclosure.
- FIG. 2 is a block diagram of a host system of an example configuration in accordance with the disclosure.
- FIG. 3 is a flowchart of a method in accordance with the disclosure.
- FIG. 4 is a diagram of an example method of server-side authentication in accordance with the disclosure.
- FIG. 5 is a diagram of an example method of server-side authentication in accordance with the disclosure.
- FIG. 6 is a diagram of an example method of device-side authentication in accordance with the disclosure.
- The embodiments described herein generally relate to mobile wireless communication devices, hereafter referred to as portable electronic devices.
- Examples of applicable communication devices include pagers, cellular phones, cellular smartphones, wireless organizers, personal digital assistants, computers, laptops, tablets, media players, e-book readers, handheld wireless communication devices, wirelessly enabled notebook computers and the like.
- numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. Also, the description is not to be considered as limiting the scope of the embodiments described herein.
- Embodiments of the invention may be represented as a software product (a computer program product) stored in a machine-readable medium (also referred to as a computer-readable medium, a processor-readable medium, or a computer-usable medium having a computer-readable program code embodied therein or stored thereon).
- the machine-readable medium may be any suitable medium, including magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), memory device (volatile or non-volatile), or similar storage mechanism.
- the machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor, or processors, to perform steps in a method according to an embodiment of the invention.
- Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described invention may also be stored on the machine-readable medium.
- Software running from the machine readable medium may interface with circuitry or other hardware to perform the described tasks.
- a portable electronic device may be a two-way communication device with advanced data communication capabilities, including the capability to communicate with other portable electronic devices or computer systems through a network of transceiver stations.
- the portable electronic device may also have the capability to allow voice communication.
- It may be referred to as a pager, cellular phone, cellular smartphone, wireless organizer, personal digital assistant, computer, laptop, tablet, media player, e-book reader, handheld wireless communication device, wirelessly enabled notebook computer and the like.
- the portable electronic device 100 includes a number of components such as a main processor 102 that controls the overall operation of the portable electronic device 100. Communication functions, including data and voice communications, are performed through a communication subsystem 104. Data received by the portable electronic device 100 can be optionally decompressed and decrypted by decoder 103, operating according to any suitable decompression techniques (e.g. using decompression techniques for MPEG, JPEG, ZIP compression techniques) and decryption techniques (e.g. using decryption techniques for Data Encryption Standard (DES), Triple DES, or Advanced Encryption Standard (AES) encryption techniques).
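- As a rough illustration of the optional decrypt and decompress stages attributed to decoder 103, the following minimal Python sketch treats both stages as optional. The function name `decode_payload`, the caller-supplied `decrypt` callable, and the decrypt-before-decompress ordering are assumptions made for illustration (the disclosure does not fix an order). zlib stands in for whichever decompression technique is actually used, and the decryption routine (for example AES) is left to the caller.

```python
import zlib
from typing import Callable, Optional


def decode_payload(
    payload: bytes,
    decrypt: Optional[Callable[[bytes], bytes]] = None,
    compressed: bool = False,
) -> bytes:
    """Optionally decrypt, then optionally decompress, a received payload.

    Assumption: the sender compressed first and encrypted second, so the
    receiver reverses that order. Both stages are skipped when not needed.
    """
    data = payload
    if decrypt is not None:
        # Hypothetical hook: a real device would plug in DES, Triple DES,
        # or AES decryption here.
        data = decrypt(data)
    if compressed:
        # zlib stream (DEFLATE-based); stands in for any suitable codec.
        data = zlib.decompress(data)
    return data
```

- A payload that is neither encrypted nor compressed passes through unchanged, which mirrors the "optionally" wording of the description.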
- the communication subsystem 104 receives messages from and sends messages to a wireless network 200.
- the communication subsystem 104 can be configured in accordance with the Global System for Mobile Communication (GSM) and General Packet Radio Services (GPRS) standards, Enhanced Data GSM Environment (EDGE), Universal Mobile Telecommunications Service (UMTS), or the like. New standards are still being defined, but it is believed that they will have similarities to the network behavior described herein, and it will also be understood by persons skilled in the art that the embodiments described herein are intended to use any other suitable standards that are developed in the future.
- the wireless link connecting the communication subsystem 104 with the wireless network 200 represents one or more different Radio Frequency (RF) channels, operating according to defined protocols specified for GSM/GPRS communications. With newer network protocols, these channels are capable of supporting both circuit switched voice communications and packet switched data communications.
- Although the wireless network 200 associated with the portable electronic device 100 is a GSM/GPRS wireless network in one example implementation, other wireless networks may also be associated with the portable electronic device 100 in variant implementations.
- the different types of wireless networks that may be employed include, for example, data-centric wireless networks, voice-centric wireless networks, and dual-mode networks that can support both voice and data communications over the same physical base stations.
- Combined dual-mode networks include, but are not limited to, Code Division Multiple Access (CDMA) or CDMA2000 networks, GSM/GPRS networks, and third-generation (3G) and fourth-generation (4G) networks, such as EDGE, UMTS, High-Speed Downlink Packet Access (HSDPA), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE), and LTE Advanced.
- Some other examples of data-centric networks include WiFi 802.11, Mobitex™ and DataTAC™ network communication systems.
- Examples of voice-centric data networks include Personal Communication Systems (PCS) networks like GSM and Time Division Multiple Access (TDMA) systems.
- the main processor 102 can also interact with additional subsystems such as a Random Access Memory (RAM) 106, a flash memory 108, a display 110, an auxiliary input/output (I/O) subsystem 112, a data port 114, a keyboard 116, a speaker 118, a microphone 120, short-range communications 122 and other device subsystems 124.
- the display 110 and the keyboard 116 may be used for both communication-related functions, such as entering a text message for transmission over the network 200, and device-resident functions such as a calculator or task list.
- the portable electronic device 100 can send and receive communication signals over the wireless network 200 after required network registration or activation procedures have been completed.
- Network access is associated with a subscriber or user of the portable electronic device 100.
- The portable electronic device 100 can, for example, use a SIM/RUIM card 126 (i.e. a Subscriber Identity Module or a Removable User Identity Module) inserted into a SIM/RUIM interface 128 in order to communicate with a network.
- A SIM is typically a component of a SIM card that can be inserted into a mobile device in order to associate that device with the user identified by the SIM.
- SIM cards can have various form factors such as a full size that is approximately the size of a credit card, a smaller mini size, and a still smaller micro size.
- a SIM could be stored on an embedded UICC (eUICC) or a similar software-based SIM module. Any such card or software module will be referred to herein as a SIM card, but it should be understood that such entities do not necessarily have the form factor of a card. Without the SIM card 126, the portable electronic device 100 is not fully operational for communication with the wireless network 200. By inserting the SIM card/RUIM 126 into the SIM/RUIM interface 128, a subscriber can access all subscribed services.
- the SIM card/RUIM 126 includes a processor and memory for storing information. Once the SIM card/RUIM 126 is inserted into the SIM/RUIM interface 128, it is coupled to the main processor 102. In order to identify the subscriber, the SIM card/RUIM 126 can include some user parameters such as an International Mobile Subscriber Identity (IMSI).
- An advantage of using the SIM card/RUIM 126 is that a subscriber is not necessarily bound by any single physical portable electronic device.
- the SIM card/RUIM 126 may store additional subscriber information for a portable electronic device as well, including datebook (or calendar) information and recent call information. Alternatively, user identification information can also be programmed into the flash memory 108.
- the portable electronic device 100 is a battery-powered device and includes a battery interface 132 for receiving one or more rechargeable batteries 130.
- the battery 130 can be a smart battery with an embedded microprocessor.
- the battery interface 132 is coupled to a regulator (not shown), which assists the battery 130 in providing power V+ to the portable electronic device 100.
- future technologies such as micro fuel cells may provide the power to the portable electronic device 100.
- the portable electronic device 100 also includes an operating system 134 and software components 136, which are described in more detail below.
- the operating system 134 and the software components 136 that are executed by the main processor 102 are typically stored in a persistent store such as the flash memory 108, which may alternatively be a read-only memory (ROM) or similar storage element (not shown).
- portions of the operating system 134 and the software components 136 such as specific device applications, or parts thereof, may be temporarily loaded into a volatile store such as the RAM 106.
- Other software components can also be included, as is well known to those skilled in the art.
- the subset of software applications 136 that control basic device operations, including data and voice communication applications, will normally be installed on the portable electronic device 100 during its manufacture.
- Other software applications include a message application 138 that can be any suitable software program that allows a user of the portable electronic device 100 to send and receive electronic messages.
- Messages that have been sent or received by the user are typically stored in the flash memory 108 of the portable electronic device 100 or some other suitable storage element in the portable electronic device 100. In at least some embodiments, some of the sent and received messages may be stored remotely from the device 100 such as in a data store of an associated host system with which the portable electronic device 100 communicates.
- the software applications can further include a device state module 140, a Personal Information Manager (PIM) 142, and other suitable modules (not shown).
- the device state module 140 provides persistence, i.e. the device state module 140 ensures that important device data is stored in persistent memory, such as the flash memory 108, so that the data is not lost when the portable electronic device 100 is turned off or loses power.
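- The persistence behavior described for the device state module 140 can be sketched as follows. This is a minimal Python illustration, not the disclosed implementation: the JSON file format, the function names `save_state` and `load_state`, and the write-to-temporary-file-then-rename strategy are all assumptions chosen so that a power loss during a write leaves the previous state intact.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state to persistent storage so it survives power loss."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to the storage device
        # Swap the new file into place in a single rename step.
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path: str) -> dict:
    """Read the last saved state, or an empty state if none exists."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

- On restart, `load_state` returns whatever `save_state` last completed, which is the property the description attributes to the persistent store.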
- the PIM 142 includes functionality for organizing and managing data items of interest to the user, such as, but not limited to, e-mail, contacts, calendar events, voice mails, appointments, and task items.
- a PIM application has the ability to send and receive data items via the wireless network 200.
- PIM data items may be seamlessly integrated, synchronized, and updated via the wireless network 200 with the portable electronic device subscriber's corresponding data items stored and/or associated with a host computer system. This functionality creates a mirrored host computer on the portable electronic device 100 with respect to such items. This can be particularly advantageous when the host computer system is the portable electronic device subscriber's office computer system.
- the portable electronic device 100 also includes a connect module 144, and an information technology (IT) policy module 146.
- The connect module 144 implements the communication protocols that are required for the portable electronic device 100 to communicate with the wireless infrastructure and any host system, such as an enterprise system, with which the portable electronic device 100 is authorized to interface. An example of a wireless infrastructure and an enterprise system is given in Figure 2, which is described in more detail below.
- the connect module 144 includes a set of APIs that can be integrated with the portable electronic device 100 to allow the portable electronic device 100 to use any number of services associated with the enterprise system.
- the connect module 144 allows the portable electronic device 100 to establish an end-to-end secure, authenticated communication pipe with the host system.
- a subset of applications for which access is provided by the connect module 144 can be used to pass IT policy commands from the host system to the portable electronic device 100. This can be done in a wireless or wired manner.
- These instructions can then be passed to the IT policy module 146 to modify the configuration of the device 100. Alternatively, in some cases, the IT policy update can also be done over a wired connection.
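- The hand-off of IT policy commands to the IT policy module 146 can be pictured with the short Python sketch below. The policy keys in `ALLOWED_KEYS`, the dictionary representation of commands, and the function name `apply_it_policy` are hypothetical; the disclosure does not specify a command format.

```python
from typing import Any, Dict

# Hypothetical set of settings that an IT policy is permitted to change.
ALLOWED_KEYS = {"password_required", "max_password_age_days", "camera_enabled"}


def apply_it_policy(config: Dict[str, Any], commands: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new device configuration with recognized policy commands applied.

    Unrecognized keys are ignored rather than applied, so a malformed
    command cannot introduce arbitrary settings.
    """
    updated = dict(config)  # leave the caller's configuration untouched
    for key, value in commands.items():
        if key in ALLOWED_KEYS:
            updated[key] = value
    return updated
```

- The same function applies whether the commands arrived wirelessly or over a wired connection, since only the delivery path differs.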
- software applications can also be installed on the portable electronic device 100.
- These software applications can be third party applications, which are added after the manufacture of the portable electronic device 100. Examples of third party applications include games, calculators, utilities, etc.
- the additional applications can be loaded onto the portable electronic device 100 through at least one of the wireless network 200, the auxiliary I/O subsystem 112, the data port 114, the short-range communications subsystem 122, or any other suitable device subsystem 124.
- This flexibility in application installation increases the functionality of the portable electronic device 100 and may provide enhanced on-device functions, communication-related functions, or both.
- secure communication applications may enable electronic commerce functions and other such financial transactions to be performed using the portable electronic device 100.
- the data port 114 enables a subscriber to set preferences through an external device or software application and extends the capabilities of the portable electronic device 100 by providing for information or software downloads to the portable electronic device 100 other than through a wireless communication network.
- the alternate download path may, for example, be used to load an encryption key onto the portable electronic device 100 through a direct and thus reliable and trusted connection to provide secure device communication.
- the data port 114 can be any suitable port that enables data communication between the portable electronic device 100 and another computing device.
- the data port 114 can be a serial or a parallel port. In some instances, the data port 114 can be a USB port that includes data lines for data transfer and a supply line that can provide a charging current to charge the battery 130 of the portable electronic device 100.
- the short-range communications subsystem 122 provides for communication between the portable electronic device 100 and different systems or devices, without the use of the wireless network 200.
- the subsystem 122 may include an infrared device and associated circuits and components for short-range communication.
- Examples of short-range communication standards include standards developed by the Infrared Data Association (IrDA), Bluetooth™, Near Field Communication (NFC) standards, and the 802.11 family of standards developed by IEEE.
- a received signal such as a text message, an e-mail message, or web page download will be processed by the communication subsystem 104 and input to the main processor 102.
- the main processor 102 in conjunction with the decoder 103, will then process the received signal for output to the display 110 or alternatively to the auxiliary I/O subsystem 112.
- a subscriber may also compose data items, such as e-mail messages, for example, using the keyboard 116 in conjunction with the display 110 and possibly the auxiliary I/O subsystem 112.
- the auxiliary subsystem 112 may include devices such as: a touch screen, mouse, track ball, trackpad, optical joystick, infrared fingerprint detector, or a roller wheel with dynamic button pressing capability.
- the keyboard 116 can include an alphanumeric keyboard and/or telephone-type keypad. However, other types of keyboards, including a virtual keyboard or an external keyboard, may also be used.
- a composed item may be transmitted over the wireless network 200 through the communication subsystem 104.
- For voice communications, the overall operation of the portable electronic device 100 may be substantially similar, except that the received signals are output to the speaker 118, and signals for transmission are generated by the microphone 120.
- Alternative voice or audio I/O subsystems such as a voice message recording subsystem, can also be implemented on the portable electronic device 100.
- Although voice or audio signal output is accomplished primarily through the speaker 118, the display 110 can also be used to provide additional information such as the identity of a calling party, duration of a voice call, or other voice call related information.
- FIG. 2 is a block diagram illustrating components of a configuration of a host system 250 with which the portable electronic device 100 can, in conjunction with the connect module 144, communicate.
- the host system 250 will typically be a corporate enterprise or other local area network (LAN), but may also be a home office computer or some other private system, for example, in variant implementations.
- the host system 250 is depicted as a LAN of an organization to which a user of the portable electronic device 100 belongs.
- a plurality of portable electronic devices can communicate wirelessly with the host system 250 through one or more nodes 202 of the wireless network 200.
- the host system 250 includes a number of network components connected to each other by a network 260.
- a user's desktop computer 262a is situated on a LAN connection.
- the portable electronic device 100 can be directly coupled to the computer 262a by a serial or a Universal Serial Bus (USB) connection, for example, or can be wirelessly connected, such as through a Bluetooth connection.
- the connection of the portable electronic device 100 to the computer facilitates the loading of information (e.g. PIM data, private symmetric encryption keys to facilitate secure communications) from the user computer 262a to the portable electronic device 100, and may be particularly useful for bulk information updates often performed in initializing the portable electronic device 100 for use.
- the information downloaded to the portable electronic device 100 may include digital certificates used in the exchange of messages.
- The user computers 262a-262n will typically also be connected to other peripheral devices, such as printers, which are not explicitly shown in Figure 2.
- Only a subset of network components of the host system 250 are shown in Figure 2 for ease of exposition, and it will be understood that the host system 250 will include additional components that are not explicitly shown in Figure 2 for this example configuration. More generally, the host system 250 may represent a smaller part of a larger network (not shown) of the organization, and may include different components and/or be arranged in different topologies than that shown in the example embodiment of Figure 2.
- the wireless communication support components 270 can include a message management server 272, a mobile data server (MDS) 274, a web server, such as Hypertext Transfer Protocol (HTTP) server 275, a contact server 276, and a device manager module 278.
- HTTP servers can also be located outside the enterprise system, as indicated by the HTTP server 275 attached to the network 224.
- the device manager module 278 includes an IT Policy editor 280 and an IT user property editor 282, as well as other software components for allowing an IT administrator to configure the portable electronic devices 100.
- the support components 270 also include a data store 284, and an IT policy server 286.
- the IT policy server 286 includes a processor 288, a network interface 290 and a memory unit 292.
- the processor 288 controls the operation of the IT policy server 286 and executes functions related to the standardized IT policy as described below.
- the network interface 290 allows the IT policy server 286 to communicate with the various components of the host system 250 and the portable electronic devices 100.
- the memory unit 292 can store functions used in implementing the IT policy as well as related data. Those skilled in the art know how to implement these various components. Other components may also be included as is well known to those skilled in the art. Further, in some implementations, the data store 284 can be part of any one of the servers.
- the portable electronic device 100 communicates with the host system 250 through node 202 of the wireless network 200 and a shared network infrastructure 224 such as a service provider network or the public Internet. Access to the host system 250 may be provided through one or more routers (not shown), and computing devices of the host system 250 may operate from behind a firewall or proxy server 266.
- the proxy server 266 provides a secure node and a wireless internet gateway for the host system 250. The proxy server 266 intelligently routes data to the correct destination server within the host system 250.
- the host system 250 can include a wireless VPN router (not shown) to facilitate data exchange between the host system 250 and the portable electronic device 100.
- the wireless VPN router allows a VPN connection to be established directly through a specific wireless network to the portable electronic device 100.
- The wireless VPN router can be used with the Internet Protocol (IP) Version 6 (IPv6) and IP-based wireless networks. This protocol can provide enough IP addresses so that each portable electronic device has a dedicated IP address, making it possible to push information to a portable electronic device at any time.
- Messages intended for a user of the portable electronic device 100 are initially received by a message server 268 of the host system 250.
- Such messages may originate from any number of sources.
- a message may have been sent by a sender from the computer 262b within the host system 250, from a different portable electronic device (not shown) connected to the wireless network 200 or a different wireless network, or from a different computing device, or other device capable of sending messages, via the shared network infrastructure 224, possibly through an application service provider (ASP) or Internet service provider (ISP), for example.
- the message server 268 typically acts as the primary interface for the exchange of messages, particularly e-mail messages, within the organization and over the shared network infrastructure 224. Each user in the organization that has been set up to send and receive messages is typically associated with a user account managed by the message server 268.
- Some example implementations of the message server 268 include a Microsoft Exchange™ server, a Lotus Domino™ server, a Novell Groupwise™ server, or another suitable mail server installed in a corporate environment.
- the host system 250 may include multiple message servers 268.
- the message server 268 may also be adapted to provide additional functions beyond message management, including the management of data associated with calendars and task lists, for example.
- When messages are received by the message server 268, they are typically stored in a data store associated with the message server 268.
- the data store may be a separate hardware unit, such as data store 284, with which the message server 268 communicates. Messages can be subsequently retrieved and delivered to users by accessing the message server 268.
- an e-mail client application operating on a user's computer 262a may request the e-mail messages associated with that user's account stored on the data store associated with the message server 268. These messages are then retrieved from the data store and stored locally on the computer 262a.
- the data store associated with the message server 268 can store copies of each message that is locally stored on the portable electronic device 100.
- the data store associated with the message server 268 can store all of the messages for the user of the portable electronic device 100 and only a smaller number of messages can be stored on the portable electronic device 100 to conserve memory. For instance, the most recent messages (i.e. those received in the past two to three months for example) can be stored on the portable electronic device 100.
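The retention policy described above can be sketched as a simple date cutoff. This is only an illustration: the 90-day window, the function name `messages_for_device`, and the dictionary layout of a message are assumptions, since the text says only "two to three months".

```python
from datetime import datetime, timedelta

# Hypothetical retention window; the text only says "two to three months".
RETENTION = timedelta(days=90)


def messages_for_device(store, now, retention=RETENTION):
    """Return the subset of the server-side store that is kept on the device."""
    cutoff = now - retention
    return [m for m in store if m["received"] >= cutoff]


now = datetime(2012, 6, 1)
# The server-side data store keeps every message for the user.
store = [
    {"id": 1, "received": datetime(2011, 12, 25)},
    {"id": 2, "received": datetime(2012, 4, 10)},
    {"id": 3, "received": datetime(2012, 5, 30)},
]
on_device = messages_for_device(store, now)
print([m["id"] for m in on_device])
```

The server-side list is left untouched, which matches the idea that the data store holds all messages while the device holds only the recent ones.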
- the message application 138 (see Figure 1 ) operating on the portable electronic device 100 may also request messages associated with the user's account from the message server 268.
- the message application 138 may be configured (either by the user or by an administrator, possibly in accordance with an organization's IT policy) to make this request at the direction of the user, at some pre-defined time interval, or upon the occurrence of some pre-defined event.
- the portable electronic device 100 is assigned its own e-mail address, and messages addressed specifically to the portable electronic device 100 are automatically redirected to the portable electronic device 100 as they are received by the message server 268.
- the message management server 272 can be used to specifically provide support for the management of messages, such as e-mail messages, that are to be handled by portable electronic devices. Generally, while messages are still stored on the message server 268, the message management server 272 can be used to control when, if, and how messages are sent to the portable electronic device 100. The message management server 272 also facilitates the handling of messages composed on the portable electronic device 100, which are sent to the message server 268 for subsequent delivery.
- the message management server 272 may monitor the user's "mailbox" (e.g. the message store associated with the user's account on the message server 268) for new e-mail messages, and apply user-definable filters to new messages to determine if and how the messages are relayed to the user's portable electronic device 100.
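One way to picture user-definable filters is as an ordered list of predicates, each paired with an action. The sketch below is hypothetical: the `Email` and `RelayFilter` types, the action names ("relay", "headers_only", "hold"), and the first-match-wins rule are assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Email:
    sender: str
    subject: str
    size_bytes: int


@dataclass
class RelayFilter:
    matches: Callable[[Email], bool]
    action: str  # hypothetical action names: "relay", "headers_only", "hold"


def relay_decision(message: Email, filters: List[RelayFilter], default: str = "relay") -> str:
    """Return the action of the first matching filter, or the default action."""
    for f in filters:
        if f.matches(message):
            return f.action
    return default


user_filters = [
    RelayFilter(lambda m: m.sender.endswith("@newsletter.example"), "hold"),
    RelayFilter(lambda m: m.size_bytes > 100_000, "headers_only"),
]

print(relay_decision(Email("boss@corp.example", "Budget", 4_000), user_filters))
print(relay_decision(Email("news@newsletter.example", "Weekly", 4_000), user_filters))
```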
- the message management server 272 may also, through an encoder 273, compress messages, using any suitable compression technology (e.g. using a compression technique such as ZIP, MPEG, JPEG and other known techniques) and encrypt messages (e.g.
- the message management server 272 may also receive messages composed on the portable electronic device 100 (e.g. encrypted using Triple DES), decrypt and decompress the composed messages, re-format the composed messages if desired so that they will appear to have originated from the user's computer 262a, and re-route the composed messages to the message server 268 for delivery.
- DES Data Encryption Standard
- Triple DES Triple DES
- AES Advanced Encryption Standard
- Certain properties or restrictions associated with messages that are to be sent from and/or received by the portable electronic device 100 can be defined (e.g. by an administrator in accordance with IT policy) and enforced by the message management server 272. These may include whether the portable electronic device 100 may receive encrypted and/or signed messages, minimum encryption key sizes, whether outgoing messages must be encrypted and/or signed, and whether copies of all secure messages sent from the portable electronic device 100 are to be sent to a pre-defined copy address, for example.
- the message management server 272 may also be adapted to provide other control functions, such as only pushing certain message information or pre-defined portions (e.g. "blocks") of a message stored on the message server 268 to the portable electronic device 100. For example, in some cases, when a message is initially retrieved by the portable electronic device 100 from the message server 268, the message management server 272 may push only the first part of a message to the portable electronic device 100, with the part being of a pre-defined size (e.g. 2 KB). The user can then request that more of the message be delivered in similar-sized blocks by the message management server 272 to the portable electronic device 100, possibly up to a maximum pre-defined message size. Accordingly, the message management server 272 facilitates better control over the type of data and the amount of data that is communicated to the portable electronic device 100, and can help to minimize potential waste of bandwidth or other resources.
- pre-defined portions e.g. "blocks”
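The block-wise delivery described above can be sketched as follows. The 2 KB block size comes from the text; the 32 KB maximum message size and the function name `next_block` are assumptions chosen for illustration.

```python
BLOCK_SIZE = 2 * 1024         # 2 KB, the example block size given in the text
MAX_MESSAGE_SIZE = 32 * 1024  # hypothetical cap; the text gives no value


def next_block(body: bytes, offset: int,
               block_size: int = BLOCK_SIZE,
               max_size: int = MAX_MESSAGE_SIZE) -> bytes:
    """Return the next block to push, or b"" once the message or the cap is exhausted."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    limit = min(len(body), max_size)
    return body[offset:min(offset + block_size, limit)]


# The device first receives one block, then asks for more until nothing is left.
body = b"x" * 5000
offset, sizes = 0, []
while True:
    block = next_block(body, offset)
    if not block:
        break
    sizes.append(len(block))
    offset += len(block)
print(sizes)
```

Slicing raw bytes can split a multi-byte character across two blocks, so a real implementation would probably need to split on character or MIME-part boundaries.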
- the MDS 274 encompasses any other server that stores information that is relevant to the corporation.
- the mobile data server 274 may include, but is not limited to, databases, online data document repositories, customer relationship management (CRM) systems, or enterprise resource planning (ERP) applications.
- CRM customer relationship management
- ERP enterprise resource planning
- the MDS 274 can also connect to the Internet or other public network, through HTTP server 275 or other suitable web server such as a File Transfer Protocol (FTP) server, to retrieve HTTP webpages and other data. Requests for webpages are typically routed through MDS 274 and then to HTTP server 275, through suitable firewalls and other protective mechanisms. The web server then retrieves the webpage over the Internet, and returns it to MDS 274.
- FTP File Transfer Protocol
- MDS 274 is typically provided, or associated, with an encoder 277 that permits retrieved data, such as retrieved webpages, to be compressed, using any suitable compression technology, and encrypted, using any suitable encryption technique, as described above, and then pushed to the portable electronic device 100 via the shared network infrastructure 224 and the wireless network 200.
- the contact server 276 can provide information for a list of contacts for the user in a similar fashion as the address book on the portable electronic device 100. Accordingly, for a given contact, the contact server 276 can include the name, phone number, work address and e-mail address of the contact, among other information. The contact server 276 can also provide a global address list that contains the contact information for all of the contacts associated with the host system 250.
- the message management server 272, the MDS 274, the HTTP server 275, the contact server 276, the device manager module 278, the data store 284 and the IT policy server 286 do not need to be implemented on separate physical servers within the host system 250.
- some or all of the functions associated with the message management server 272 may be integrated with the message server 268, or some other server in the host system 250.
- the host system 250 may include multiple message management servers 272, particularly in variant implementations where a large number of portable electronic devices need to be supported.
- the present disclosure describes a method for secure text-to-speech conversion of text using speech or voice synthesis.
- Voice synthesis of text-based content can be used to synthesize any text-based content to an audio format for output to a recipient.
- voice synthesis can be used in various text-to-speech systems.
- the voice used in various text-to-speech systems is either a machine-generated voice, or a voice provided by a voice actor, not the sender, both of which can sound unnatural.
- the method generally consists of authenticating at least one of an originator or a recipient of text-based content to access a voiceprint associated with the text-based content (302), and converting at least a portion of the text-based content to an audio format for output to the recipient in accordance with the originator's voiceprint (304).
- text-based content encompasses any digital text content that can be synthesized for audio read out to the recipient. Examples of text-based content include, but are not limited to, email messages, text messages, attachments to messages, calendar appointments, tasks, and text portions of any thereof.
- the "originator" of the text-based content is the creator of the digital content, who, in many of the described embodiments, is the sender of a message, such as the sender of an email message.
- the text-based content originates from another source, such as the creator of an attachment who is different from the sender of the message, or the creator of a calendar appointment or task, who may be, for example, the user of a portable electronic device (recipient) him or herself.
- HMM hidden Markov models
- HMM also called Statistical Parametric Synthesis
- the voiceprint can be stored as a voiceprint file. Speech waveforms can then be generated from the voiceprint.
- Advantages of HMM-based voice synthesis include the relatively short training time (e.g.
- a voiceprint can also, for example, be an adaptive voiceprint that adapts a "common" voice to a specific speaker.
- a server to provide secure text-to-speech conversion of text-based content can include a processor capable of authenticating at least one of an originator or a recipient of the text-based content, accessing a voiceprint associated with the originator of the text-based content; and converting, in accordance with the voiceprint, at least a portion of the text-based content to an audio format for output to the recipient or listener.
- a portable electronic device to provide secure text-to-speech conversion of text-based content can include a processor, such as main processor 102 (see Figure 1), configured to authenticate an originator of the text-based content; access a voiceprint associated with the originator; convert at least a portion of the text-based content to an audio format in accordance with the voiceprint; and cause the audio format to be output to an audio output device of the portable device, such as speaker 118 (see Figure 1) or another associated sound output device, such as a speaker paired to a portable electronic device through Bluetooth™, such as a Bluetooth™ car set, or a headset.
- authentication of at least one of the originator or the recipient can include verifying a digital signature of at least one of the originator or the recipient, or authenticating the originator in accordance with a digital certificate accompanying the text-based content. Authenticating at least one of the originator or the recipient can further include verifying that the recipient is a trusted recipient.
- Server-side authentication permits a trusted third party, such as a wireless service provider or an enterprise to authenticate the originator and/or the recipient, and securely perform conversion of the text-based content to an audio format.
- the text-based content is an email message.
- the following examples assume that the originator of the text-based content has created a voiceprint, using HMM or other appropriate method, prior to creating or transmitting text-based content.
- the originator of the message is authenticated by a trusted server 400 using a certificate-based authentication.
- the voiceprint file (HMM VP) is uploaded by the originator (A) to the server 400 of a trusted third party with a digital certificate (Cert A) (402).
- This step can occur at any time prior to, or contemporaneously with, the creation or transmission of the text-based content, and the communication may be secured using known or later developed methods, such as Secure Sockets Layer (SSL).
- a digital certificate, also known as a public key certificate or identity certificate, generally refers to an electronic document which uses a digital signature to bind a public key with an identity - information such as the name of a person or an organization, their address, etc.
- the certificate can be used to verify that a public key belongs to an individual.
- a valid digital signature gives a recipient reason to believe that the message was created by a known sender, and that it was not altered in transit. While the present embodiment will be described using digital certificates, the authentication method can easily be adapted to use shared symmetric keys, or to use messages simply signed with a digital signature.
- User A then sends a message to B, over any appropriate communication channel, and signs the message with the private key that corresponds to Cert A.
- An email client such as provided by message application 138 (see Figure 1 ), on a portable electronic device associated with B, receives the signed message (404).
- the email client then forwards the email message and signature to the trusted server 400 (406), for example, over a secure connection, such as a Secure Sockets Layer (SSL) or a Transport Layer Security (TLS) connection.
- SSL Secure Sockets Layer
- TLS Transport Layer Security
- the forwarding of the message and signature to the server 400 can be performed automatically by the email client on receipt of a signed message; can be user-initiated; can be based on specified conditions, such as time of day, presence status of user B; or can be based on user-specified preferences.
- only a portion of a message is signed with A's digital signature, and only the signed portion is forwarded and converted to speech.
- This embodiment is particularly appropriate for large text messages, such as long email threads, where it is only desirable to convert the newly added portion of the thread into speech.
- On receipt of the message, the server 400 verifies the signature (408), thereby verifying the identity of sender A. If the signature verifies correctly, the server 400 then verifies the status of the certificate to ensure it has not expired or been revoked (410). The server 400 then retrieves the voiceprint file (HMM VP) associated with the certificate (412), converts the email message to an audio format (414) and transfers the audio message back to B's device to play (416).
- HMM VP voiceprint file
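The server-side steps (408) to (414) can be sketched as a chain of checks that must all pass before the voiceprint is used. This is a minimal sketch under loud assumptions: HMAC-SHA256 with a shared key stands in for a real certificate-based signature (the text notes the method can be adapted to shared symmetric keys), `synthesize` is a placeholder that returns tagged bytes rather than audio, and the `Enrollment` record and all names are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class Enrollment:
    """What the server holds for one originator after the upload step (402)."""
    verify_key: bytes    # stand-in for the public key bound by Cert A
    not_after: datetime  # certificate expiry
    revoked: bool        # certificate revocation status
    voiceprint: bytes    # the uploaded HMM VP file


def sign(key: bytes, message: bytes) -> bytes:
    """Stand-in signature: HMAC-SHA256 under a shared key (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def synthesize(voiceprint: bytes, text: bytes) -> bytes:
    """Placeholder for the speech synthesizer; returns tagged bytes, not audio."""
    return b"audio<" + voiceprint + b">" + text


def convert_signed_message(enrolled: Dict[str, Enrollment], cert_id: str,
                           message: bytes, signature: bytes,
                           now: datetime) -> Optional[bytes]:
    record = enrolled.get(cert_id)
    if record is None:
        return None
    # (408) verify the signature, and thereby the identity of the sender
    if not hmac.compare_digest(sign(record.verify_key, message), signature):
        return None
    # (410) verify the certificate has not expired or been revoked
    if record.revoked or now > record.not_after:
        return None
    # (412) retrieve the voiceprint and (414) convert the message
    return synthesize(record.voiceprint, message)


key_a = b"demo-key-for-A"
enrolled = {"cert-A": Enrollment(key_a, datetime(2013, 1, 1), False, b"vp-A")}
msg = b"Meeting moved to 3 pm"
audio = convert_signed_message(enrolled, "cert-A", msg, sign(key_a, msg),
                               datetime(2012, 6, 1))
```

Returning `None` on any failed check means the voiceprint is never touched unless both the signature and the certificate status are valid.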
- the message can be automatically intercepted by a server prior to receipt by the email client.
- the intercepting server can be, for example, the enterprise server 270 (see Figure 2 ), which may also be the trusted server. Alternately, the intercepting server may forward the intercepted message to the trusted server.
- the conditions for such interception can also be based on specified conditions, such as time of day, presence status of user B; or can be based on user-specified preferences.
- the originator A is authenticated at the trusted server, as described above, and the trusted server then sends the audio message directly to the device, or back to the enterprise server which then pushes the result to the device. This embodiment avoids having the email client of the device have to send the message, which may be quite large if attachments are included, to the trusted server.
- the recipient of the message is authenticated by a trusted server 500 and compared to a list of trusted recipients provided by sender A.
- A can grant permissions to certain trusted parties to be able to play messages using his voiceprint.
- A's voiceprint file (HMM VP) is transmitted, over a secure connection, to a server 500, with a list of trusted recipients (502).
- the email client at B signs the message with B's signature, and forwards the signed email message to the trusted server 500 (506), as described above.
- the server 500 verifies B's signature (508), thereby verifying the identity of the recipient B.
- If B's signature verifies correctly, the server 500 then verifies that B is included in the list of trusted recipients previously provided by A (510). If B is on the list of trusted recipients, server 500 retrieves A's voiceprint file (HMM VP) (512), converts the email message to an audio format (514) and transfers the audio message back to B's device (516).
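The recipient-side checks (508) to (514) can be sketched in the same spirit. The assumptions here are loud ones: HMAC-SHA256 under a per-recipient shared key stands in for B's digital signature, the "audio" is a tagged byte string rather than synthesized speech, and the `TrustedServer` class and its method names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Optional, Set, Tuple


def sign(key: bytes, data: bytes) -> bytes:
    """Stand-in signature: HMAC-SHA256 under a shared key (an assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()


class TrustedServer:
    def __init__(self, recipient_keys: Dict[str, bytes]):
        self._recipient_keys = recipient_keys
        # originator -> (voiceprint, trusted recipients)
        self._voiceprints: Dict[str, Tuple[bytes, Set[str]]] = {}

    def upload(self, originator: str, voiceprint: bytes,
               trusted_recipients: Iterable[str]) -> None:
        """(502) store the voiceprint together with the trusted-recipient list."""
        self._voiceprints[originator] = (voiceprint, set(trusted_recipients))

    def convert(self, originator: str, recipient: str, message: bytes,
                recipient_signature: bytes) -> Optional[bytes]:
        key = self._recipient_keys.get(recipient)
        entry = self._voiceprints.get(originator)
        if key is None or entry is None:
            return None
        # (508) verify the recipient's signature
        if not hmac.compare_digest(sign(key, message), recipient_signature):
            return None
        voiceprint, trusted = entry
        # (510) check the recipient against the originator's trusted list
        if recipient not in trusted:
            return None
        # (512) retrieve the voiceprint and (514) convert; placeholder output
        return b"audio<" + voiceprint + b">" + message


keys = {"B": b"key-B", "C": b"key-C"}
server = TrustedServer(keys)
server.upload("A", b"vp-A", ["B"])
msg = b"See you at noon"
for_b = server.convert("A", "B", msg, sign(keys["B"], msg))
for_c = server.convert("A", "C", msg, sign(keys["C"], msg))
```

Here C signs correctly but is not on A's list, so only B receives a converted message.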
- a further embodiment (not shown) authenticates both A and B by combining the authentication methods shown in Figures 4 and 5.
- the server verifies the identity of B as well as verifying that the content is from A.
- Sender A's voiceprint, digital certificate and a list of trusted recipients are transmitted to the server.
- A causes a digitally signed message to be sent to B; the signed message is, in turn, signed by the email client of B and forwarded to the server.
- the server then authenticates B based on B's signature and the list of trusted recipients, and authenticates A based on A's signature and the status of A's certificate. If both A and B are authenticated, A's voiceprint is retrieved, and the message converted as set out above.
- device-side authentication can be implemented using digital rights management (DRM) techniques.
- DRM digital rights management
- the voiceprint file is stored in an encrypted format, and only decrypted if the identity of A and/or B is verified.
- the originator uploads the voiceprint file (HMM VP) to a server 600, with a list of trusted recipients, or other permissions (602).
- the server encrypts the file (604).
- the email client of the device associated with B requests the encrypted voiceprint file for A from a server 600, such as a Lightweight Directory Access Protocol (LDAP) server (606). If A has set permissions as to who can retrieve the voiceprint file, the server 600 can verify the identity of B (608).
- LDAP Lightweight Directory Access Protocol
- the server and B's device then establish a key to use to encrypt the content (610), using, for example, the Simple Password Exponential Key Exchange (SPEKE), or Public Key Infrastructure (PKI) techniques.
- SPEKE Simple Password Exponential Key Exchange
- PKI Public Key Infrastructure
- the encrypted file is transmitted to B (612).
- the file can either be directly stored encrypted or re-encrypted with a different key which is stored in a protected manner within the hardware of the device.
- the encrypted voice print file/decryption key is only accessible to the email client, which can be enforced using, for example, code signing on the email client application.
- When the email client of B receives a digitally signed email message (614) from A (for which it has a stored voiceprint file), the email client verifies the email signature (616), temporarily decrypts the voiceprint file (618), and converts the email to an audio format (620).
- the encrypted voiceprint file can optionally have an attached digital certificate to permit B to verify that the source of the signed message is A, as described above in relation to Figure 4 .
- Any suitable techniques such as those supported by S/MIME (Secure/Multipurpose Internet Mail Extensions), can be used in the exchange of certificates.
- the certificate permits the verification of trust status, revocation status and expiry on the message, and only if all conditions are valid is the voiceprint file temporarily decrypted. For example, using a certificate permits A to revoke the certificate at any time, thereby easily disallowing future text-to-speech conversions.
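The "only if all conditions are valid" rule can be sketched as a single gate in front of the decryption step. The `Certificate` fields, the set-based trust and revocation lookups, and the function name are assumptions for illustration; a real client would rely on its platform's certificate validation rather than hand-written checks like these.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Set


@dataclass(frozen=True)
class Certificate:
    serial: str
    issuer: str
    not_before: datetime
    not_after: datetime


def may_decrypt_voiceprint(cert: Certificate, now: datetime,
                           trusted_issuers: Set[str],
                           revoked_serials: Set[str]) -> bool:
    """True only if trust status, revocation status and expiry are all valid."""
    trusted = cert.issuer in trusted_issuers
    not_revoked = cert.serial not in revoked_serials
    in_validity = cert.not_before <= now <= cert.not_after
    return trusted and not_revoked and in_validity


cert_a = Certificate("0A01", "Example CA", datetime(2012, 1, 1), datetime(2013, 1, 1))
now = datetime(2012, 6, 1)
allowed = may_decrypt_voiceprint(cert_a, now, {"Example CA"}, set())
# If A revokes the certificate, the same check fails from then on.
after_revocation = may_decrypt_voiceprint(cert_a, now, {"Example CA"}, {"0A01"})
```

Because the gate is evaluated on every message, revoking the certificate is enough to block later conversions without touching the stored voiceprint file.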
- Other methods are also contemplated to permit the originator to set more fine grained control over the use of the voiceprint file.
- the trusted server and recipient devices could store and enforce rules permitting conversion of only a limited number of words of a message, or limit conversion to a particular time, date or time range.
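Such usage rules could be enforced just before conversion. The sketch below is hypothetical: the `UsageRule` fields, the whitespace-based word count, and the choice to truncate rather than refuse an over-long message are assumptions, not details from the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class UsageRule:
    max_words: Optional[int] = None         # limit on words converted per message
    not_before: Optional[datetime] = None   # start of the permitted time range
    not_after: Optional[datetime] = None    # end of the permitted time range


def text_permitted_for_conversion(text: str, rule: UsageRule,
                                  now: datetime) -> Optional[str]:
    """Return the portion of the text that may be converted, or None if refused."""
    if rule.not_before is not None and now < rule.not_before:
        return None
    if rule.not_after is not None and now > rule.not_after:
        return None
    words = text.split()
    if rule.max_words is not None:
        words = words[:rule.max_words]
    return " ".join(words)


rule = UsageRule(max_words=5, not_after=datetime(2012, 12, 31))
inside = text_permitted_for_conversion(
    "Please review the attached report before Friday", rule, datetime(2012, 6, 1))
outside = text_permitted_for_conversion("Hello", rule, datetime(2013, 1, 1))
```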
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mobile Radio Communication Systems (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Description
- The present disclosure relates generally to text-to-speech synthesis. More particularly, the present disclosure relates to a method and system for secure text-to-speech synthesis in portable electronic devices.
- Electronic devices, including portable electronic devices, have gained widespread use and may provide a variety of functions including, for example, telephonic, electronic messaging and other personal information manager (PIM) application functions. Portable electronic devices include, for example, several types of mobile stations such as simple cellular telephones, smart phones, wireless personal digital assistants (PDAs), and laptop computers with wireless 802.11 or Bluetooth capabilities.
- Text-to-speech synthesis can be used in a number of applications to convert normal language text into speech, and can be implemented in software or hardware. For example, those who are visually impaired may use text-to-speech systems to read textual material. The use of text-to-speech synthesis can be useful in portable electronic devices, such as for the reading of email and text messages.
- Improvements in devices using text-to-speech synthesis are desirable.
- Embodiments of the present invention will now be described, by way of example only, with reference to the attached Figures, wherein:
- Figure 1 is a block diagram of a portable electronic device in accordance with the disclosure.
- Figure 2 is a block diagram of a host system of an example configuration in accordance with the disclosure.
- Figure 3 is a flowchart of a method in accordance with the disclosure.
- Figure 4 is a diagram of an example method of server-side authentication in accordance with the disclosure.
- Figure 5 is a diagram of an example method of server-side authentication in accordance with the disclosure.
- Figure 6 is a diagram of an example method of device-side authentication in accordance with the disclosure.
- The embodiments described herein generally relate to mobile wireless communication devices, hereafter referred to as portable electronic devices. Examples of applicable communication devices include pagers, cellular phones, cellular smartphones, wireless organizers, personal digital assistants, computers, laptops, tablets, media players, e-book readers, handheld wireless communication devices, wirelessly enabled notebook computers and the like. It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. Also, the description is not to be considered as limiting the scope of the embodiments described herein.
- Embodiments of the invention may be represented as a software, or computer program, product stored in a machine-readable medium (also referred to as a computer-readable medium, a processor-readable medium, or a computer-usable medium having a computer-readable program code embodied therein or stored thereon). The machine-readable medium may be any suitable medium, including magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), memory device (volatile or non-volatile), or similar storage mechanism. The machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor, or processors, to perform steps in a method according to an embodiment of the invention. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described invention may also be stored on the machine-readable medium. Software running from the machine readable medium may interface with circuitry or other hardware to perform the described tasks.
- A portable electronic device may be a two-way communication device with advanced data communication capabilities, including the capability to communicate with other portable electronic devices or computer systems through a network of transceiver stations. The portable electronic device may also have the capability to allow voice communication. Depending on the functionality provided by the portable electronic device, it may be referred to as a pager, cellular phone, cellular smartphone, wireless organizer, personal digital assistant, computer, laptop, tablet, media player, e-book reader, handheld wireless communication device, wirelessly-enabled notebook computer and the like. To aid the reader in understanding the structure of the portable electronic device and how it communicates with other devices and host systems, reference will now be made to Figures 1 through 4.
- Referring first to Figure 1, shown therein is a block diagram of a portable electronic device 100. The portable electronic device 100 includes a number of components such as a main processor 102 that controls the overall operation of the portable electronic device 100. Communication functions, including data and voice communications, are performed through a communication subsystem 104. Data received by the portable electronic device 100 can be optionally decompressed and decrypted by decoder 103, operating according to any suitable decompression techniques (e.g. using decompression techniques for MPEG, JPEG, ZIP compression techniques) and decryption techniques (e.g. using decryption techniques for Data Encryption Standard (DES), Triple DES, or Advanced Encryption Standard (AES) encryption techniques). The communication subsystem 104 receives messages from and sends messages to a wireless network 200. In this example embodiment of the portable electronic device 100, the communication subsystem 104 can be configured in accordance with the Global System for Mobile Communication (GSM) and General Packet Radio Services (GPRS) standards, Enhanced Data GSM Environment (EDGE), Universal Mobile Telecommunications Service (UMTS), or the like. New standards are still being defined, but it is believed that they will have similarities to the network behavior described herein, and it will also be understood by persons skilled in the art that the embodiments described herein are intended to use any other suitable standards that are developed in the future. The wireless link connecting the communication subsystem 104 with the wireless network 200 represents one or more different Radio Frequency (RF) channels, operating according to defined protocols specified for GSM/GPRS communications. With newer network protocols, these channels are capable of supporting both circuit switched voice communications and packet switched data communications.
- Although the wireless network 200 associated with portable electronic device 100 is a GSM/GPRS wireless network in one example implementation, other wireless networks may also be associated with the portable electronic device 100 in variant implementations. The different types of wireless networks that may be employed include, for example, data-centric wireless networks, voice-centric wireless networks, and dual-mode networks that can support both voice and data communications over the same physical base stations. Combined dual-mode networks include, but are not limited to, Code Division Multiple Access (CDMA) or CDMA2000 networks, GSM/GPRS networks, and third-generation (3G) and fourth-generation (4G) networks, such as EDGE, UMTS, High-Speed Downlink Packet Access (HSDPA), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE), and LTE Advanced. Some other examples of data-centric networks include WiFi 802.11, Mobitex™ and DataTAC™ network communication systems. Examples of other voice-centric data networks include Personal Communication Systems (PCS) networks like GSM and Time Division Multiple Access (TDMA) systems. The main processor 102 can also interact with additional subsystems such as a Random Access Memory (RAM) 106, a flash memory 108, a display 110, an auxiliary input/output (I/O) subsystem 112, a data port 114, a keyboard 116, a speaker 118, a microphone 120, short-range communications 122 and other device subsystems 124.
- Some of the subsystems of the portable electronic device 100 perform communication-related functions, whereas other subsystems may provide "resident" or on-device functions. By way of example, the display 110 and the keyboard 116 may be used for both communication-related functions, such as entering a text message for transmission over the network 200, and device-resident functions such as a calculator or task list.
- The portable electronic device 100 can send and receive communication signals over the wireless network 200 after required network registration or activation procedures have been completed. Network access is associated with a subscriber or user of the portable electronic device 100. To identify a subscriber, the portable electronic device 100 can, for example, use a SIM/RUIM card 126 (i.e. Subscriber Identity Module or a Removable User Identity Module) inserted into a SIM/RUIM interface 128 in order to communicate with a network. A SIM is typically a component of a SIM card that can be inserted into a mobile device in order to associate that device with the user identified by the SIM. SIM cards can have various form factors such as a full size that is approximately the size of a credit card, a smaller mini size, and a still smaller micro size. Other cards, such as the Universal Integrated Circuit Card (UICC), the Universal Subscriber Identity Module (USIM) card, or the Removable User Identity Module (R-UIM) card, may function in a similar manner. Alternatively, a SIM could be stored on an embedded UICC (eUICC) or a similar software-based SIM module. Any such card or software module will be referred to herein as a SIM card, but it should be understood that such entities do not necessarily have the form factor of a card. Without the SIM card 126, the portable electronic device 100 is not fully operational for communication with the wireless network 200. By inserting the SIM card/RUIM 126 into the SIM/RUIM interface 128, a subscriber can access all subscribed services. Services may include: web browsing and messaging such as e-mail, voice mail, Short Message Service (SMS), and Multimedia Messaging Services (MMS). More advanced services may include: point of sale, field service and sales force automation. The SIM card/RUIM 126 includes a processor and memory for storing information. Once the SIM card/RUIM 126 is inserted into the SIM/RUIM interface 128, it is coupled to the main processor 102. In order to identify the subscriber, the SIM card/RUIM 126 can include some user parameters such as an International Mobile Subscriber Identity (IMSI). An advantage of using the SIM card/RUIM 126 is that a subscriber is not necessarily bound by any single physical portable electronic device. The SIM card/RUIM 126 may store additional subscriber information for a portable electronic device as well, including datebook (or calendar) information and recent call information. Alternatively, user identification information can also be programmed into the flash memory 108.
- The portable electronic device 100 is a battery-powered device and includes a battery interface 132 for receiving one or more rechargeable batteries 130. In at least some embodiments, the battery 130 can be a smart battery with an embedded microprocessor. The battery interface 132 is coupled to a regulator (not shown), which assists the battery 130 in providing power V+ to the portable electronic device 100. Although current technology makes use of a battery, future technologies such as micro fuel cells may provide the power to the portable electronic device 100.
- The portable electronic device 100 also includes an operating system 134 and software components 136, which are described in more detail below. The operating system 134 and the software components 136 that are executed by the main processor 102 are typically stored in a persistent store such as the flash memory 108, which may alternatively be a read-only memory (ROM) or similar storage element (not shown). Those skilled in the art will appreciate that portions of the operating system 134 and the software components 136, such as specific device applications, or parts thereof, may be temporarily loaded into a volatile store such as the RAM 106. Other software components can also be included, as is well known to those skilled in the art.
- The subset of software applications 136 that control basic device operations, including data and voice communication applications, will normally be installed on the portable electronic device 100 during its manufacture. Other software applications include a message application 138 that can be any suitable software program that allows a user of the portable electronic device 100 to send and receive electronic messages. Various alternatives exist for the message application 138 as is well known to those skilled in the art. Messages that have been sent or received by the user are typically stored in the flash memory 108 of the portable electronic device 100 or some other suitable storage element in the portable electronic device 100. In at least some embodiments, some of the sent and received messages may be stored remotely from the device 100 such as in a data store of an associated host system with which the portable electronic device 100 communicates.
- The software applications can further include a
device state module 140, a Personal Information Manager (PIM) 142, and other suitable modules (not shown). The device state module 140 provides persistence, i.e. the device state module 140 ensures that important device data is stored in persistent memory, such as the flash memory 108, so that the data is not lost when the portable electronic device 100 is turned off or loses power. - The
PIM 142 includes functionality for organizing and managing data items of interest to the user, such as, but not limited to, e-mail, contacts, calendar events, voice mails, appointments, and task items. A PIM application has the ability to send and receive data items via thewireless network 200. PIM data items may be seamlessly integrated, synchronized, and updated via thewireless network 200 with the portable electronic device subscriber's corresponding data items stored and/or associated with a host computer system. This functionality creates a mirrored host computer on the portableelectronic device 100 with respect to such items. This can be particularly advantageous when the host computer system is the portable electronic device subscriber's office computer system. - The portable
electronic device 100 also includes aconnect module 144, and an information technology (IT)policy module 146. Theconnect module 144 implements the communication protocols that are required for the portableelectronic device 100 to communicate with the wireless infrastructure and any host system, such as an enterprise system, with which the portableelectronic device 100 is authorized to interface. Examples of a wireless infrastructure and an enterprise system are given inFigures 3 and 4 , which are described in more detail below. - The
connect module 144 includes a set of APIs that can be integrated with the portable electronic device 100 to allow the portable electronic device 100 to use any number of services associated with the enterprise system. The connect module 144 allows the portable electronic device 100 to establish an end-to-end secure, authenticated communication pipe with the host system. A subset of applications for which access is provided by the connect module 144 can be used to pass IT policy commands from the host system to the portable electronic device 100. This can be done in a wireless or wired manner. These instructions can then be passed to the IT policy module 146 to modify the configuration of the device 100. Alternatively, in some cases, the IT policy update can also be done over a wired connection. - Other types of software applications can also be installed on the portable
electronic device 100. These software applications can be third-party applications, which are added after the manufacture of the portable electronic device 100. Examples of third-party applications include games, calculators, utilities, etc. - The additional applications can be loaded onto the portable
electronic device 100 through at least one of thewireless network 200, the auxiliary I/O subsystem 112, thedata port 114, the short-range communications subsystem 122, or any othersuitable device subsystem 124. This flexibility in application installation increases the functionality of the portableelectronic device 100 and may provide enhanced on-device functions, communication-related functions, or both. For example, secure communication applications may enable electronic commerce functions and other such financial transactions to be performed using the portableelectronic device 100. - The
data port 114 enables a subscriber to set preferences through an external device or software application and extends the capabilities of the portableelectronic device 100 by providing for information or software downloads to the portableelectronic device 100 other than through a wireless communication network. The alternate download path may, for example, be used to load an encryption key onto the portableelectronic device 100 through a direct and thus reliable and trusted connection to provide secure device communication. Thedata port 114 can be any suitable port that enables data communication between the portableelectronic device 100 and another computing device. Thedata port 114 can be a serial or a parallel port. In some instances, thedata port 114 can be a USB port that includes data lines for data transfer and a supply line that can provide a charging current to charge thebattery 130 of the portableelectronic device 100. - The short-
range communications subsystem 122 provides for communication between the portableelectronic device 100 and different systems or devices, without the use of thewireless network 200. For example, thesubsystem 122 may include an infrared device and associated circuits and components for short-range communication. Examples of short-range communication standards include standards developed by the Infrared Data Association (IrDA), Bluetooth™, Near Field Communication (NFC) standards, and the 802.11 family of standards developed by IEEE. - In use, a received signal such as a text message, an e-mail message, or web page download will be processed by the
communication subsystem 104 and input to themain processor 102. Themain processor 102, in conjunction with thedecoder 103, will then process the received signal for output to thedisplay 110 or alternatively to the auxiliary I/O subsystem 112. A subscriber may also compose data items, such as e-mail messages, for example, using thekeyboard 116 in conjunction with thedisplay 110 and possibly the auxiliary I/O subsystem 112. Theauxiliary subsystem 112 may include devices such as: a touch screen, mouse, track ball, trackpad, optical joystick, infrared fingerprint detector, or a roller wheel with dynamic button pressing capability. Thekeyboard 116 can include an alphanumeric keyboard and/or telephone-type keypad. However, other types of keyboards, including a virtual keyboard or an external keyboard, may also be used. A composed item may be transmitted over thewireless network 200 through thecommunication subsystem 104. - For voice communications, the overall operation of the portable
electronic device 100 may be substantially similar, except that the received signals are output to thespeaker 118, and signals for transmission are generated by themicrophone 120. Alternative voice or audio I/O subsystems, such as a voice message recording subsystem, can also be implemented on the portableelectronic device 100. Although voice or audio signal output is accomplished primarily through thespeaker 118, thedisplay 110 can also be used to provide additional information such as the identity of a calling party, duration of a voice call, or other voice call related information. -
Figure 2 is a block diagram illustrating components of a configuration of a host system 250 with which the portable electronic device 100 can, in conjunction with the connect module 144, communicate. The host system 250 will typically be a corporate enterprise or other local area network (LAN), but may also be a home office computer or some other private system, for example, in variant implementations. In the example shown in Figure 2, the host system 250 is depicted as a LAN of an organization to which a user of the portable electronic device 100 belongs. Typically, a plurality of portable electronic devices can communicate wirelessly with the host system 250 through one or more nodes 202 of the wireless network 200. - The
host system 250 includes a number of network components connected to each other by anetwork 260. For instance, a user'sdesktop computer 262a is situated on a LAN connection. The portableelectronic device 100 can be directly coupled to thecomputer 262a by a serial or a Universal Serial Bus (USB) connection, for example, or can be wirelessly connected, such as through a Bluetooth connection. The connection of the portableelectronic device 100 to the computer facilitates the loading of information (e.g. PIM data, private symmetric encryption keys to facilitate secure communications) from theuser computer 262a to the portableelectronic device 100, and may be particularly useful for bulk information updates often performed in initializing the portableelectronic device 100 for use. The information downloaded to the portableelectronic device 100 may include digital certificates used in the exchange of messages. - The
user computers 262a-262n will typically also be connected to other peripheral devices, such as printers, etc., which are not explicitly shown in Figure 2. Furthermore, only a subset of network components of the host system 250 is shown in Figure 2 for ease of exposition, and it will be understood that the host system 250 will include additional components that are not explicitly shown in Figure 2 for this example configuration. More generally, the host system 250 may represent a smaller part of a larger network (not shown) of the organization, and may include different components and/or be arranged in different topologies than that shown in the example embodiment of Figure 2. - To facilitate the operation of the portable
electronic device 100 and the wireless communication of messages and message-related data between the portableelectronic device 100 and components of thehost system 250, a number of wirelesscommunication support components 270 can be provided. In some implementations, the wirelesscommunication support components 270 can include amessage management server 272, a mobile data server (MDS) 274, a web server, such as Hypertext Transfer Protocol (HTTP)server 275, acontact server 276, and adevice manager module 278. HTTP servers can also be located outside the enterprise system, as indicated by theHTTP server 275 attached to thenetwork 224. Thedevice manager module 278 includes anIT Policy editor 280 and an ITuser property editor 282, as well as other software components for allowing an IT administrator to configure the portableelectronic devices 100. In an alternative embodiment, there may be one editor that provides the functionality of both theIT policy editor 280 and the ITuser property editor 282. Thesupport components 270 also include adata store 284, and anIT policy server 286. TheIT policy server 286 includes aprocessor 288, anetwork interface 290 and amemory unit 292. Theprocessor 288 controls the operation of theIT policy server 286 and executes functions related to the standardized IT policy as described below. Thenetwork interface 290 allows theIT policy server 286 to communicate with the various components of thehost system 250 and the portableelectronic devices 100. Thememory unit 292 can store functions used in implementing the IT policy as well as related data. Those skilled in the art know how to implement these various components. Other components may also be included as is well known to those skilled in the art. Further, in some implementations, thedata store 284 can be part of any one of the servers. - In this example embodiment, the portable
electronic device 100 communicates with thehost system 250 throughnode 202 of thewireless network 200 and a sharednetwork infrastructure 224 such as a service provider network or the public Internet. Access to thehost system 250 may be provided through one or more routers (not shown), and computing devices of thehost system 250 may operate from behind a firewall orproxy server 266. Theproxy server 266 provides a secure node and a wireless internet gateway for thehost system 250. Theproxy server 266 intelligently routes data to the correct destination server within thehost system 250. - In some implementations, the
host system 250 can include a wireless VPN router (not shown) to facilitate data exchange between thehost system 250 and the portableelectronic device 100. The wireless VPN router allows a VPN connection to be established directly through a specific wireless network to the portableelectronic device 100. The wireless VPN router can be used with the Internet Protocol (IP) Version 6 (IPV6) and IP-based wireless networks. This protocol can provide enough IP addresses so that each portable electronic device has a dedicated IP address, making it possible to push information to a portable electronic device at any time. An advantage of using a wireless VPN router is that it can be an off-the-shelf VPN component, and does not require a separate wireless gateway and separate wireless infrastructure. A VPN connection can be a Transmission Control Protocol (TCP)/IP or User Datagram Protocol (UDP)/IP connection for delivering the messages directly to the portableelectronic device 100 in this alternative implementation. - Messages intended for a user of the portable
electronic device 100 are initially received by amessage server 268 of thehost system 250. Such messages may originate from any number of sources. For instance, a message may have been sent by a sender from thecomputer 262b within thehost system 250, from a different portable electronic device (not shown) connected to thewireless network 200 or a different wireless network, or from a different computing device, or other device capable of sending messages, via the sharednetwork infrastructure 224, possibly through an application service provider (ASP) or Internet service provider (ISP), for example. - The
message server 268 typically acts as the primary interface for the exchange of messages, particularly e-mail messages, within the organization and over the sharednetwork infrastructure 224. Each user in the organization that has been set up to send and receive messages is typically associated with a user account managed by themessage server 268. Some example implementations of themessage server 268 include a Microsoft Exchange™ server, a Lotus Domino™server, a Novell Groupwise™server, or another suitable mail server installed in a corporate environment. In some implementations, thehost system 250 may includemultiple message servers 268. Themessage server 268 may also be adapted to provide additional functions beyond message management, including the management of data associated with calendars and task lists, for example. - When messages are received by the
message server 268, they are typically stored in a data store associated with themessage server 268. In at least some embodiments, the data store may be a separate hardware unit, such asdata store 284, with which themessage server 268 communicates. Messages can be subsequently retrieved and delivered to users by accessing themessage server 268. For instance, an e-mail client application operating on a user'scomputer 262a may request the e-mail messages associated with that user's account stored on the data store associated with themessage server 268. These messages are then retrieved from the data store and stored locally on thecomputer 262a. The data store associated with themessage server 268 can store copies of each message that is locally stored on the portableelectronic device 100. Alternatively, the data store associated with themessage server 268 can store all of the messages for the user of the portableelectronic device 100 and only a smaller number of messages can be stored on the portableelectronic device 100 to conserve memory. For instance, the most recent messages (i.e. those received in the past two to three months for example) can be stored on the portableelectronic device 100. - When operating the portable
electronic device 100, the user may wish to have e-mail messages retrieved for delivery to the portableelectronic device 100. The message application 138 (seeFigure 1 ) operating on the portableelectronic device 100 may also request messages associated with the user's account from themessage server 268. Themessage application 138 may be configured (either by the user or by an administrator, possibly in accordance with an organization's IT policy) to make this request at the direction of the user, at some pre-defined time interval, or upon the occurrence of some pre-defined event. In some implementations, the portableelectronic device 100 is assigned its own e-mail address, and messages addressed specifically to the portableelectronic device 100 are automatically redirected to the portableelectronic device 100 as they are received by themessage server 268. - The
message management server 272 can be used to specifically provide support for the management of messages, such as e-mail messages, that are to be handled by portable electronic devices. Generally, while messages are still stored on themessage server 268, themessage management server 272 can be used to control when, if, and how messages are sent to the portableelectronic device 100. Themessage management server 272 also facilitates the handling of messages composed on the portableelectronic device 100, which are sent to themessage server 268 for subsequent delivery. - For example, the
message management server 272 may monitor the user's "mailbox" (e.g. the message store associated with the user's account on the message server 268) for new e-mail messages, and apply user-definable filters to new messages to determine if and how the messages are relayed to the user's portableelectronic device 100. Themessage management server 272 may also, through anencoder 273, compress messages, using any suitable compression technology (e.g. using a compression technique such as ZIP, MPEG, JPEG and other known techniques) and encrypt messages (e.g. using an encryption technique such as Data Encryption Standard (DES), Triple DES, or Advanced Encryption Standard (AES)), and push them to the portableelectronic device 100 via the sharednetwork infrastructure 224 and thewireless network 200. Themessage management server 272 may also receive messages composed on the portable electronic device 100 (e.g. encrypted using Triple DES), decrypt and decompress the composed messages, re-format the composed messages if desired so that they will appear to have originated from the user'scomputer 262a, and re-route the composed messages to themessage server 268 for delivery. - Certain properties or restrictions associated with messages that are to be sent from and/or received by the portable
electronic device 100 can be defined (e.g. by an administrator in accordance with IT policy) and enforced by themessage management server 272. These may include whether the portableelectronic device 100 may receive encrypted and/or signed messages, minimum encryption key sizes, whether outgoing messages must be encrypted and/or signed, and whether copies of all secure messages sent from the portableelectronic device 100 are to be sent to a pre-defined copy address, for example. - The
message management server 272 may also be adapted to provide other control functions, such as only pushing certain message information or pre-defined portions (e.g. "blocks") of a message stored on themessage server 268 to the portableelectronic device 100. For example, in some cases, when a message is initially retrieved by the portableelectronic device 100 from themessage server 268, themessage management server 272 may push only the first part of a message to the portableelectronic device 100, with the part being of a pre-defined size (e.g. 2 KB). The user can then request that more of the message be delivered in similar-sized blocks by themessage management server 272 to the portableelectronic device 100, possibly up to a maximum pre-defined message size. Accordingly, themessage management server 272 facilitates better control over the type of data and the amount of data that is communicated to the portableelectronic device 100, and can help to minimize potential waste of bandwidth or other resources. - The
MDS 274 encompasses any other server that stores information that is relevant to the corporation. The mobile data server 274 may include, but is not limited to, databases, online data document repositories, customer relationship management (CRM) systems, or enterprise resource planning (ERP) applications. The MDS 274 can also connect to the Internet or other public network, through HTTP server 275 or other suitable web server such as a File Transfer Protocol (FTP) server, to retrieve HTTP webpages and other data. Requests for webpages are typically routed through MDS 274 and then to HTTP server 275, through suitable firewalls and other protective mechanisms. The web server then retrieves the webpage over the Internet, and returns it to MDS 274. As described above in relation to message management server 272, MDS 274 is typically provided, or associated, with an encoder 277 that permits retrieved data, such as retrieved webpages, to be compressed, using any suitable compression technology, and encrypted, using any suitable encryption technique, as described above, and then pushed to the portable electronic device 100 via the shared network infrastructure 224 and the wireless network 200. - The
contact server 276 can provide information for a list of contacts for the user in a similar fashion as the address book on the portable electronic device 100. Accordingly, for a given contact, the contact server 276 can include the name, phone number, work address and e-mail address of the contact, among other information. The contact server 276 can also provide a global address list that contains the contact information for all of the contacts associated with the host system 250. - It will be understood by persons skilled in the art that the
message management server 272, the MDS 274, the HTTP server 275, the contact server 276, the device manager module 278, the data store 284 and the IT policy server 286 do not need to be implemented on separate physical servers within the host system 250. For example, some or all of the functions associated with the message management server 272 may be integrated with the message server 268, or some other server in the host system 250. Alternatively, the host system 250 may include multiple message management servers 272, particularly in variant implementations where a large number of portable electronic devices need to be supported. - The present disclosure describes a method for secure text-to-speech conversion of text using speech or voice synthesis. Voice synthesis can be used to convert any text-based content to an audio format for output to a recipient. For example, in a car environment it may be undesirable for a driver to read an email or text message, and text-to-speech synthesis can be used instead. However, the voice used in various text-to-speech systems is either a machine-generated voice, or a voice provided by a voice actor, not the sender, both of which can sound unnatural. Therefore, it is desirable to provide text-to-speech synthesis that permits messages to be read in a natural manner, for example in the voice of the sender, or another voice that is familiar to the recipient. However, to prevent the originator's voice from being used or distributed inappropriately or in an unauthorized manner, appropriate security and access controls need to be provided. For example, when voice synthesis is used to read an email message, it is desirable to have appropriate controls to ensure that the message is read in the originator's voice, not the voice of another person. 
Such controls permit an originator's voiceprint file to be publicly accessible, but limit its use for voice synthesis to text-based content created by the originator, or sent to a trusted recipient. In this way a person can be assured that their voice cannot be used for content they did not write.
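As a minimal sketch of the access control just described, the following Python fragment shows one possible reading of the rule: a voiceprint may be used only for content written by its authenticated owner, or for content sent to a recipient the owner trusts. The names `Voiceprint`, `trusted_recipients` and `may_synthesize` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voiceprint:
    """Hypothetical record for a publicly accessible voiceprint file."""
    owner: str                                    # person whose voice is modelled
    trusted_recipients: frozenset = frozenset()   # recipients the owner trusts


def may_synthesize(voiceprint, originator, originator_authenticated, recipient):
    """Return True if the voiceprint may be used to read this content aloud.

    Assumed policy: allow use when the content was created by the
    authenticated owner of the voiceprint, or when it is being sent to
    a recipient the owner has marked as trusted.
    """
    written_by_owner = originator_authenticated and originator == voiceprint.owner
    sent_to_trusted = recipient in voiceprint.trusted_recipients
    return written_by_owner or sent_to_trusted
```

Under this reading, a message claiming to come from the owner but failing authentication is refused unless the recipient is trusted, so possession of the voiceprint file alone does not permit synthesis.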
- Referring to
Figure 3, the method generally consists of authenticating at least one of an originator or a recipient of text-based content to access a voiceprint associated with the text-based content (302), and converting at least a portion of the text-based content to an audio format for output to the recipient in accordance with the originator's voiceprint (304). As used herein, "text-based content" encompasses any digital text content that can be synthesized for audio read out to the recipient. Examples of text-based content include, but are not limited to, email messages, text messages, attachments to messages, calendar appointments, tasks, and text portions of any thereof. The "originator" of the text-based content is the creator of the digital content, who, in many of the described embodiments, is the sender of a message, such as the sender of an email message. However, there are embodiments where the text-based content originates from another source, such as the creator of an attachment who is different from the sender of the message, or the creator of a calendar appointment or task, who may be, for example, the user of a portable electronic device (recipient) himself or herself. - The present disclosure is not limited to a particular method of voice synthesis, and any known or later developed voice synthesis method can be used. As an example, voice synthesis based on hidden Markov models (HMM) is one method that can be used. HMM-based synthesis (also called Statistical Parametric Synthesis) models the frequency spectrum (vocal tract), fundamental frequency (vocal source), and duration (prosody) of a person's speech to generate hidden Markov models (sequence model and observation model), which are collectively termed herein a "voiceprint" of a speaker, sender or originator. The voiceprint can be stored as a voiceprint file. Speech waveforms can then be generated from the voiceprint. Advantages of HMM-based voice synthesis include the relatively short training time (e.g. less than approximately 2 hours) needed to generate a voiceprint, and the small size (e.g. typically less than approximately 5MB) of the resulting voiceprint file. The small size of the voiceprint files reduces the bandwidth needed to transmit such files, and the memory needed to store them. In contrast, other natural voice synthesis techniques can require many hours of training, and generate voiceprint files of 11MB to 1GB in size. A voiceprint can also, for example, be an adaptive voiceprint that adapts a "common" voice to a specific speaker.
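The two steps of Figure 3 (302 and 304) can be sketched roughly as follows. This is an illustration only: the function and parameter names are hypothetical, and the `authenticate` and `synthesize` callables stand in for whatever authentication scheme and voice synthesis method (HMM-based or otherwise) an implementation actually uses.

```python
class AuthenticationError(Exception):
    """Raised when access to the voiceprint is not authorized."""


def secure_text_to_speech(content, originator, voiceprints, authenticate, synthesize):
    """Convert text-based content to audio in the originator's voice.

    content      -- the text to be read out
    originator   -- identifier of the creator of the content
    voiceprints  -- mapping from originator identifier to voiceprint data
    authenticate -- callable(originator, content) -> bool  (assumed interface)
    synthesize   -- callable(content, voiceprint) -> audio (assumed interface)
    """
    # Step 302: authenticate before any voiceprint is accessed.
    if not authenticate(originator, content):
        raise AuthenticationError("not authorized to use this voiceprint")
    voiceprint = voiceprints[originator]

    # Step 304: convert the text in accordance with the originator's voiceprint.
    return synthesize(content, voiceprint)
```

Placing the authentication check before the voiceprint lookup means an unauthenticated request never touches the voiceprint data at all.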
- Referring to
Figures 1 and 2, the method can be accomplished at a server, such as the message management server 272 (see Figure 2), or on a portable electronic device, or on a combination thereof. For example, a server to provide secure text-to-speech conversion of text-based content can include a processor capable of authenticating at least one of an originator or a recipient of the text-based content; accessing a voiceprint associated with the originator of the text-based content; and converting, in accordance with the voiceprint, at least a portion of the text-based content to an audio format for output to the recipient or listener. A portable electronic device to provide secure text-to-speech conversion of text-based content according to the present disclosure can include a processor, such as main processor 102 (see Figure 1), configured to authenticate an originator of the text-based content; access a voiceprint associated with the originator; convert at least a portion of the text-based content to an audio format in accordance with the voiceprint; and cause the audio format to be output to an audio output device of the portable device, such as speaker 118 (see Figure 1), or to another associated sound output device, such as a speaker paired to the portable electronic device through Bluetooth™ (for example, a Bluetooth™ car set or a headset). - A number of methods can be used to provide the security or authentication controls at the server-side or the device-side. According to embodiments, authentication of at least one of the originator or the recipient can include verifying a digital signature of at least one of the originator or the recipient, or authenticating the originator in accordance with a digital certificate accompanying the text-based content. Authenticating at least one of the originator or the recipient can further include verifying that the recipient is a trusted recipient.
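The authentication checks described above might be sketched as follows. Because the Python standard library has no public-key signature primitives, this example substitutes an HMAC over a shared symmetric key for the digital signature; that substitution, the choice to require both checks, and the names `sign`, `authenticate` and `trusted_recipients` are all assumptions made for illustration.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag (stand-in for a digital signature)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def authenticate(message, tag, originator_key, recipient, trusted_recipients):
    """Check the originator's tag and that the recipient is trusted.

    This sketch requires both checks to pass; an implementation could
    equally apply only one of them.
    """
    # compare_digest avoids leaking information through timing differences.
    signature_ok = hmac.compare_digest(sign(originator_key, message), tag)
    recipient_ok = recipient in trusted_recipients
    return signature_ok and recipient_ok
```

A tampered message or an unknown recipient both cause `authenticate` to return `False`, in which case no voiceprint would be accessed.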
- Server-side authentication permits a trusted third party, such as a wireless service provider or an enterprise to authenticate the originator and/or the recipient, and securely perform conversion of the text-based content to an audio format. For the purposes of illustration in the following example embodiments, the text-based content is an email message. The following examples assume that the originator of the text-based content has created a voiceprint, using HMM or other appropriate method, prior to creating or transmitting text-based content.
- In a first embodiment, as shown in Figure 4, the originator of the message is authenticated by a trusted server 400 using certificate-based authentication. The voiceprint file (HMM VP) is uploaded by the originator (A) to the server 400 of a trusted third party with a digital certificate (Cert A) (402). This step can occur at any time prior to, or contemporaneously with, the creation or transmission of the text-based content, and the communication may be secured using known or later developed methods, such as Secure Sockets Layer (SSL). A digital certificate (also known as a public key certificate or identity certificate) generally refers to an electronic document which uses a digital signature to bind a public key with an identity (information such as the name of a person or an organization, their address, etc.). The certificate can be used to verify that a public key belongs to an individual. A valid digital signature gives a recipient reason to believe that the message was created by a known sender, and that it was not altered in transit. While the present embodiment will be described using digital certificates, the authentication method can be easily adapted to use shared symmetric keys, or to use messages simply signed with a digital signature.
- User A then signs a message with the private key that corresponds to Cert A and sends it to B over any appropriate communication channel. An email client, such as provided by message application 138 (see Figure 1), on a portable electronic device associated with B, receives the signed message (404). The email client then forwards the email message and signature to the trusted server 400 (406), for example, over a secure connection, such as a Secure Sockets Layer (SSL) or a Transport Layer Security (TLS) connection. The forwarding of the message and signature to the server 400 can be performed automatically by the email client on receipt of a signed message; can be user-initiated; can be based on specified conditions, such as time of day or presence status of user B; or can be based on user-specified preferences. In accordance with an embodiment, only a portion of a message is signed with A's digital signature, and only the signed portion is forwarded and converted to speech. This embodiment is particularly appropriate for large text messages, such as long email threads, where it is only desirable to convert the newly added portion of the thread into speech.
- On receipt of the message, the server 400 verifies the signature (408), thereby verifying the identity of sender A. If the signature verifies correctly, the server 400 then verifies the status of the certificate to ensure it has not expired or been revoked (410). The server 400 then retrieves the voiceprint file (HMM VP) associated with the certificate (412), converts the email message to an audio format (414) and transfers the audio message back to B's device to play (416).
- Alternatively, as shown by the dotted path 407, the message can be automatically intercepted by a server prior to receipt by the email client. The intercepting server can be, for example, the enterprise server 270 (see Figure 2), which may also be the trusted server. Alternatively, the intercepting server may forward the intercepted message to the trusted server. Such interception can also be based on specified conditions, such as time of day or presence status of user B, or on user-specified preferences. The originator A is authenticated at the trusted server, as described above, and the trusted server then sends the audio message directly to the device, or back to the enterprise server, which then pushes the result to the device. This embodiment avoids requiring the email client of the device to send the message, which may be quite large if attachments are included, to the trusted server.
- In the embodiment shown in Figure 5, the recipient of the message is authenticated by a trusted server 500 and compared to a list of trusted recipients provided by sender A. In this way, A can grant permissions to certain trusted parties to be able to play messages using his voiceprint. In this embodiment, A's voiceprint file (HMM VP) is transmitted, over a secure connection, to a server 500, with a list of trusted recipients (502). When A causes a text message to be sent to B (504), the email client at B signs the message with B's signature, and forwards the signed email message to the trusted server 500 (506), as described above. On receipt of the message, the server 500 verifies B's signature (508), thereby verifying the identity of the recipient B. If B's signature verifies correctly, the server 500 then verifies that B is included in the list of trusted recipients previously provided by A (510). If B is on the list of trusted recipients, the server 500 retrieves A's voiceprint file (HMM VP) (512), converts the email message to an audio format (514) and transfers the audio message back to B's device (516).
- A further embodiment (not shown) authenticates both A and B by combining the authentication methods shown in Figures 4 and 5. In other words, the server verifies the identity of B as well as verifying that the content is from A. Sender A's voiceprint, digital certificate and a list of trusted recipients are transmitted to the server. A causes a digitally signed message to be sent to B; the signed message is, in turn, signed by the email client of B and forwarded to the server. The server then authenticates B based on B's signature and the list of trusted recipients, and authenticates A based on A's signature and the status of A's certificate. If both A and B are authenticated, A's voiceprint is retrieved, and the message is converted as set out above.
- According to further embodiments, device-side authentication can be implemented using digital rights management (DRM) techniques. In an example device-side solution, the voiceprint file is stored in an encrypted format, and only decrypted if the identity of A and/or B is verified. Referring to Figure 6, the originator uploads the voiceprint file (HMM VP) to a server 600, with a list of trusted recipients, or other permissions (602). The server encrypts the file (604). The email client of the device associated with B then requests the encrypted voiceprint file for A from the server 600, such as a Lightweight Directory Access Protocol (LDAP) server (606). If A has set permissions as to who can retrieve the voiceprint file, the server 600 can verify the identity of B (608). The server and B's device then establish a key to use to encrypt the content (610), using, for example, the Simple Password Exponential Key Exchange (SPEKE), or Public Key Infrastructure (PKI) techniques. The encrypted file is transmitted to B (612). When the file is received by B, it can either be stored directly in its encrypted form or re-encrypted with a different key which is stored in a protected manner within the hardware of the device. In accordance with various embodiments, the encrypted voiceprint file/decryption key is only accessible to the email client, which can be enforced using, for example, code signing on the email client application. When the email client of B receives a digitally signed email message (614) from A (for which it has a stored voiceprint file), the email client verifies the email signature (616), temporarily decrypts the voiceprint file (618), and converts the email to audio format (620).
- The encrypted voiceprint file can optionally have an attached digital certificate to permit B to verify that the source of the signed message is A, as described above in relation to Figure 4. Any suitable techniques, such as those supported by S/MIME (Secure/Multipurpose Internet Mail Extensions), can be used in the exchange of certificates. The certificate permits the verification of trust status, revocation status and expiry on the message, and only if all conditions are valid is the voiceprint file temporarily decrypted. For example, using a certificate permits A to revoke the certificate at any time, thereby easily disallowing future text-to-speech conversions. Other methods are also contemplated to permit the originator to exercise more fine-grained control over the use of the voiceprint file. For example, the trusted server and recipient devices could store and enforce rules permitting conversion of only a limited number of words of a message, or limiting conversion to a particular time, date or time range.
- The above-described embodiments of the present invention are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope of the invention, which is defined solely by the claims appended hereto.
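The combined server-side checks described for Figures 4 and 5 (originator signature, certificate status, recipient signature, trusted-recipient list, then conversion) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: HMAC-SHA256 with shared symmetric keys stands in for certificate-based digital signatures (the description notes that shared symmetric keys may be substituted), the certificate status is reduced to an expiry time and a revoked flag, and the actual speech synthesis is replaced by a placeholder. The names `Registration`, `TrustedServer`, `sign` and `synthesize` are hypothetical and do not appear in the source.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Registration:
    """What originator A uploads ahead of time (compare steps 402 and 502)."""
    key: bytes            # assumption: shared key standing in for Cert A
    voiceprint: bytes     # opaque voiceprint payload (HMM VP)
    expires: datetime     # simplified certificate expiry
    revoked: bool = False
    trusted_recipients: set = field(default_factory=set)


def sign(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 tag, used here in place of a real digital signature."""
    return hmac.new(key, message, hashlib.sha256).digest()


def synthesize(message: bytes, voiceprint: bytes) -> dict:
    """Placeholder for text-to-speech: returns a record, not real audio."""
    return {
        "voiceprint_id": hashlib.sha256(voiceprint).hexdigest()[:8],
        "text": message.decode("utf-8"),
    }


class TrustedServer:
    def __init__(self):
        self.registrations = {}    # originator id -> Registration
        self.recipient_keys = {}   # recipient id -> shared key

    def convert(self, sender, recipient, message, sender_sig, recipient_sig, now=None):
        now = now or datetime.now(timezone.utc)
        reg = self.registrations.get(sender)
        if reg is None:
            raise PermissionError("unknown originator")
        # Verify the originator's signature (compare step 408).
        if not hmac.compare_digest(sign(reg.key, message), sender_sig):
            raise PermissionError("originator signature invalid")
        # Check that the credential is neither expired nor revoked (compare step 410).
        if reg.revoked or now >= reg.expires:
            raise PermissionError("certificate expired or revoked")
        # Verify the recipient's signature over the signed message (compare step 508).
        rkey = self.recipient_keys.get(recipient)
        if rkey is None or not hmac.compare_digest(
            sign(rkey, message + sender_sig), recipient_sig
        ):
            raise PermissionError("recipient signature invalid")
        # Check the trusted-recipient list supplied by the originator (compare step 510).
        if recipient not in reg.trusted_recipients:
            raise PermissionError("recipient not trusted by originator")
        # Retrieve the voiceprint and convert (compare steps 412/414 and 512/514).
        return synthesize(message, reg.voiceprint)
```

In this sketch, B's client would compute `sign(key_b, message + sender_sig)` before forwarding, and the server refuses conversion as soon as any single check fails, so revoking A's credential or removing B from the trusted list blocks further conversions without touching the stored voiceprint.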
Claims (10)
- A method of secure text-to-speech conversion, the method comprising: authenticating (302) at least one of an originator or a recipient of text-based content to access a voiceprint associated with the text-based content; and converting (304), in accordance with the voiceprint, at least a portion of the text-based content to an audio format for output to the recipient.
- The method of claim 1, wherein authenticating at least one of the originator or the recipient includes verifying a digital signature of at least one of the originator or the recipient (408, 508, 616).
- The method of claim 1, wherein authenticating at least one of the originator or the recipient includes authenticating the originator in accordance with a digital certificate accompanying the text-based content (410).
- The method of claim 1, wherein authenticating at least one of the originator or the recipient further includes verifying that the recipient is a trusted recipient (510).
- The method of claim 1, further comprising decrypting the voiceprint prior to use in converting at least a portion of the text-based content to an audio format (618).
- The method of claim 1, wherein accessing the voiceprint comprises retrieving a voiceprint file from a trusted third party (606).
- The method of claim 1, wherein the voiceprint permits synthesis of the originator's speech.
- The method of claim 7, wherein the voiceprint uses hidden Markov models of the originator's voice characteristics.
- A server for providing secure text-to-speech conversion according to the method of any one of claims 1 - 8.
- A portable electronic device capable of providing secure text-to-speech conversion according to the method of any one of claims 1 - 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP11195142.2A EP2608195B1 (en) | 2011-12-22 | 2011-12-22 | Secure text-to-speech synthesis for portable electronic devices |
CA2799734A CA2799734C (en) | 2011-12-22 | 2012-12-20 | Secure text-to-speech synthesis in portable electronic devices |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP11195142.2A EP2608195B1 (en) | 2011-12-22 | 2011-12-22 | Secure text-to-speech synthesis for portable electronic devices |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2608195A1 (en) | 2013-06-26 |
EP2608195B1 (en) | 2016-10-05 |
Family
ID=45440279
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP11195142.2A Active EP2608195B1 (en) | 2011-12-22 | 2011-12-22 | Secure text-to-speech synthesis for portable electronic devices |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP2608195B1 (en) |
CA (1) | CA2799734C (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11323458B1 (en) * | 2016-08-22 | 2022-05-03 | Paubox, Inc. | Method for securely communicating email content between a sender and a recipient |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1168297A1 (en) * | 2000-06-30 | 2002-01-02 | Nokia Mobile Phones Ltd. | Speech synthesis |
US20040098266A1 (en) * | 2002-11-14 | 2004-05-20 | International Business Machines Corporation | Personal speech font |
EP1703492A1 (en) * | 2005-03-16 | 2006-09-20 | Research In Motion Limited | System and method for personalised text-to-voice synthesis |
EP2306450A1 (en) * | 2008-07-11 | 2011-04-06 | NTT DoCoMo, Inc. | Voice synthesis model generation device, voice synthesis model generation system, communication terminal device and method for generating voice synthesis model |
Also Published As
Publication number | Publication date |
---|---|
EP2608195B1 (en) | 2016-10-05 |
CA2799734C (en) | 2017-12-05 |
CA2799734A1 (en) | 2013-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9166977B2 (en) | Secure text-to-speech synthesis in portable electronic devices | |
AU2005204221B2 (en) | Providing certificate matching in a system and method for searching and retrieving certificates | |
CA2639161C (en) | System and method for processing attachments to messages sent to a mobile device | |
US7949355B2 (en) | System and method for processing attachments to messages sent to a mobile device | |
US8254582B2 (en) | System and method for controlling message attachment handling functions on a mobile device | |
US10992613B2 (en) | Electronic mail system providing message character set formatting features and related methods | |
US20100031028A1 (en) | Systems and methods for selecting a certificate for use with secure messages | |
AU2005204223B2 (en) | System and method for searching and retrieving certificates | |
CA2639092A1 (en) | System and method for displaying a security encoding indicator associated with a message attachment | |
CA2538443C (en) | System and method for sending encrypted messages to a distribution list | |
US10594644B2 (en) | Methods for delivering electronic mails on request, electronic mail servers and computer programs implementing said methods | |
CA2799734C (en) | Secure text-to-speech synthesis in portable electronic devices | |
EP2574002B1 (en) | Systems and methods for selecting a certificate for use with secure messages | |
CA2639659C (en) | System and method for controlling message attachment handling functions on a mobile device | |
CA2587155C (en) | System and method for processing messages with encryptable message parts | |
CA2638443C (en) | Systems and methods for preserving auditable records of an electronic device | |
CA2638442A1 (en) | Systems and methods for selecting a certificate for use with secure messages | |
CA2693729A1 (en) | Communications system with polling server providing dynamic record id polling and related methods |
Legal Events
Code | Title | Description |
---|---|---|
17P | Request for examination filed | Effective date: 20111222 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: BLACKBERRY LIMITED |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: BLACKBERRY LIMITED |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R079 Ref document number: 602011030908 Country of ref document: DE Free format text: PREVIOUS MAIN CLASS: G10L0013040000 Ipc: G10L0013000000 |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G10L 13/00 20060101AFI20160404BHEP Ipc: G10L 13/047 20130101ALN20160404BHEP Ipc: G10L 13/033 20130101ALN20160404BHEP |
INTG | Intention to grant announced | Effective date: 20160422 |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
REG | Reference to a national code | Ref country code: GB Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: CH Ref legal event code: EP |
REG | Reference to a national code | Ref country code: AT Ref legal event code: REF Ref document number: 835227 Country of ref document: AT Kind code of ref document: T Effective date: 20161015 |
REG | Reference to a national code | Ref country code: IE Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R096 Ref document number: 602011030908 Country of ref document: DE |
REG | Reference to a national code | Ref country code: FR Ref legal event code: PLFP Year of fee payment: 6 |
REG | Reference to a national code | Ref country code: NL Ref legal event code: MP Effective date: 20161005 |
REG | Reference to a national code | Ref country code: LT Ref legal event code: MG4D |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
REG | Reference to a national code | Ref country code: AT Ref legal event code: MK05 Ref document number: 835227 Country of ref document: AT Kind code of ref document: T Effective date: 20161005 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20170105 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20170106 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20170205 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20170206 Ref country code: BE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R097 Ref document number: 602011030908 Country of ref document: DE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20170105 |
26N | No opposition filed | Effective date: 20170706 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
REG | Reference to a national code | Ref country code: IE Ref legal event code: MM4A |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20161222 Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20161231 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20161231 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20161222 Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
REG | Reference to a national code | Ref country code: FR Ref legal event code: PLFP Year of fee payment: 7 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20111222 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20161222 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20161005 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20231227 Year of fee payment: 13 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR Payment date: 20231227 Year of fee payment: 13 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 20231229 Year of fee payment: 13 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R082 Ref document number: 602011030908 Country of ref document: DE Ref country code: DE Ref legal event code: R081 Ref document number: 602011030908 Country of ref document: DE Owner name: MALIKIE INNOVATIONS LTD., IE Free format text: FORMER OWNER: BLACKBERRY LIMITED, WATERLOO, ONTARIO, CA |
REG | Reference to a national code | Ref country code: GB Ref legal event code: 732E Free format text: REGISTERED BETWEEN 20240530 AND 20240605 |