WO2002076120A1 - System and method of converting a message and transmitting the converted message

Info

Publication number
WO2002076120A1
WO2002076120A1 PCT/AU2002/000335 AU0200335W WO02076120A1 WO 2002076120 A1 WO2002076120 A1 WO 2002076120A1 AU 0200335 W AU0200335 W AU 0200335W WO 02076120 A1 WO02076120 A1 WO 02076120A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
recipient
user
audio
data
Prior art date
Application number
PCT/AU2002/000335
Other languages
French (fr)
Inventor
Glenn Charles Brien
Ian Edward Dixon
Warwick Peter Freeland
Jeremy Peter Mocek
Original Assignee
Famoice Technology Pty Ltd
Priority date
Filing date
Publication date
Application filed by Famoice Technology Pty Ltd
Publication of WO2002076120A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/06 Message adaptation to terminal or network requirements
    • H04L51/066 Format adaptation, e.g. format conversion or compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12 Messaging; Mailboxes; Announcements

Definitions

  • A system for transmitting a converted message over a communications network from a sender to a recipient, comprising: data entry means for entering a text-based message in a free-form style; means for converting said free-form text-based message into an audio message to be sent, as said converted message, to said recipient over said communications network.
  • According to a ninth aspect of the invention there is provided a method of transmitting a converted message over a communications network from a sender to a recipient, said method comprising the steps of: entering a text-based message in a free-form style into a data entry means; converting the free-form text-based message into an audio message; and transmitting said audio message as said converted message to said recipient over said communications network.
  • Figure 1 is a schematic diagram of a system used to transmit a message between a sender and recipient in accordance with the invention;
  • Figure 2 is a screen view of menu options available to the user in constructing a message;
  • Figure 3 is a screen showing choices of voices of famous characters or special effects that the user can select;
  • Figure 4 is a view similar to Figure 2 but showing a selected famous character;
  • Figure 5 is a screen view showing options available as to the type of message to be constructed;
  • Figure 6 is a screen showing the options available to the user as to the style of the message to be sent;
  • A user may use either the full name in the field or a three-letter variant.
  • The user can choose to mix between the three-letter variants and the full names. As an example, the following combination of text entries would be valid: Enter your 1st Name <Jack>, their 1st Name <Jill>, their No. <0409550206>, Famoice <Elvis>, Style <cool>, Msg Type <bor>, Time <1500>
  • The message body decoder means 72 essentially converts the body of the SMS message from text to audio.
  • The message body decoder means 72 may be incorporated as part of the server 4 or be otherwise linked to the server 4.
  • The text generally comprises words which are delineated by space characters; however, other delineating methods may be employed, including statistical algorithms that determine word boundaries automatically from poorly delineated bodies.
  • Each word in the message body may be a control word, an attribute word, an SMS code or a regular word, each to be hereinafter described.
  • The decoder 72 scans the message body from left to right, causing instructions to be generated that will be used to translate the SMS message into an audio message.
  • BX - used to mix in a background sound effect
  • INVC - used to switch voices for only one word or phrase
  • VX - used to enable a voice effect, for example shouting
  • On receipt of the constructed message, the message body decoder 72, on detecting the code VC, retrieves the Elvis Presley voice for that sequence from database 20, or alternatively from data storage means 18 in server 4, or alternatively via a speech synthesis module, such that the message then includes the words "and also from him" in the voice of Elvis Presley.
  • Example 2
  • Attribute words may be any regular words, SMS codes or other sequences of ASCII or extended byte characters that are programmed to be recognised in conjunction with their adjoining control words. If one or more attribute words follow a control word, the longest sequence of control word followed by attribute words that is programmed as a valid attribute sequence for that specific control word is chosen and is then decoded into instructions to the message body decoder 72.
  • For example, in SMS language "D8" is equivalent to "date" and "IMHO" is equivalent to "in my humble opinion".
  • If an SMS code is used as an attribute word, it is used as such in decoding the instructions to the message body decoder, based on the preceding control word. If, however, the SMS code is not part of a control word/attribute word sequence, that is, the SMS code is in the body of the text, it is converted by the message body decoder 72 into equivalent text, which in turn is converted to voice. This feature allows users to type the bulk of their message in familiar abbreviated SMS form.
  • SMS codes can be used as attribute words or control words or they can appear in the body of the text in which case their text equivalent is generated.
  • SMS emoticons are also supported as SMS codes. These are sequences of extended ASCII characters that in text form resemble a simple picture. For example :-) is a smile. Emoticons get converted to equivalent text and in turn get converted to voice. For example, "I cw2cu 2nite. You make me feel :-)))" gets converted into “I can't wait to see you tonight. You make me feel very happy” which then gets converted into voice.
  • Regular words are all words that are not recognised as the above special classes of words and all words that are recognised as one of the above special classes of words but that in sequence, do not form a valid control pattern. All regular words are considered input text to the speech synthesis/reconstruction process and will be spoken in the currently selected voice. A default voice may be used where a character's voice has not been selected. Included in the stream of regular words are all the punctuation marks that are not part of the control sequence or SMS code. By way of example:
  • The message body decoder 72 comprises a parser or converter unit 76, a message builder 78, a text to speech synthesiser 80, a waveform concatenator 82, a waveform mixer 84 and a waveform effects filter 86.
  • The SMS message body, which has been transmitted via the recipient decoder 70, is received by the parser unit 76, which breaks the message down into words and punctuation marks. It also searches through the input message for patterns that match the control sequences described above. From these patterns, the parser unit 76 builds up a construction formula for the final audio message. One embodiment of this construction sequence is to represent each unique waveform segment in a list. For example:
  • The instructed sequence comprises a text to speech segment where the text is converted into the voice of Elvis, who would utter "hello from Elvis". It also includes sound effects taken from the explosion.wav file to incorporate a background sound effect, and a further text to speech portion in the voice of Homer Simpson, who would utter "and hello from Homer".
  • The message builder unit 78, after receiving the message construction sequence from the parser unit 76, takes the sequence and, using the various modules 80 through to 86 attached to the message builder unit 78, constructs the message one segment at a time. It concatenates the message sequence together, applies any post-concatenation filters to the audio message and finally sends the complete audio message on to the audio message delivery system 74.
  • The text to speech synthesiser unit 80 receives each consecutive set of words that are to be spoken in a particular voice as an utterance. This utterance is then converted to speech and passed back as a waveform. It is to be understood that any method of speech synthesis could be used in this module, and even different methods for different voices could be used. For example, Rhetorical's rvoice system or the Festival Speech Synthesis System, which use cluster unit concatenation technology, could be used, or formant synthesisers could also be used.
  • BFN will index into the Mickey Mouse database and extract the "bye for now" pre-recorded audio fragment, which will then be concatenated into the message.
  • The "bye for now" audio fragment may contain Mickey Mouse saying "cheers from all your friends at the mouse club".
  • The waveform concatenator unit 82 is used to join each of the waveform segments together. It is to be noted that sound effects, movie quotes and music clips are indexed as pre-recorded audio waveform files by the parser unit 76 and so are passed directly to the waveform concatenator 82 for inclusion in the final audio waveform.
  • The waveform mixer unit 84 is used to mix a background track into the final waveform or waveform segment, and the waveform effects filter 86 is used to add to the system effects such as "drunk" or "underwater" sounds, such that the final waveform or waveform segment is processed by passing it through a filter that modifies the acoustic properties of the waveform to simulate a particular effect. Some effects, such as shout, whisper or other emotions, may be implemented by selecting a different voice for the synthesis operation.
  • A user wishing to instruct and send a short audio message or SAM does so by inputting text into their telephone device, such as mobile telephone 10, which may be in the format SAM.xxxxx.recipient phone number (or other such template that provides for free-form entry xxxxx), where the x's denote the body of the text, which may include any one or more control words in combination with attribute words, SMS codes and regular words.
  • On depressing the send button, once the message has been constructed, the message is sent via mobile network 8 and fixed communications network 6 to a recipient decoder means 70 linked to the network 6.
  • Decoder means 70 analyses the SMS message and strips out the recipient ID, in this case the recipient telephone number, and the rest of the message is passed to a message body decoder 72.
  • An outgoing audio message to a recipient may include further information that teaches the recipient how to construct and send an SMS message, if that recipient does not already know, or further information about a feature of the audio messaging system. This would have the effect of enticing the recipient to participate more fully in the use of the audio messaging system features. For example, a custom program for interactive voice response by the recipient may be used that teaches the recipient about the product feature or entices the recipient to try the system for the first time. Another example would be to send an SMS message to the recipient, the content of which introduces certain product features or trains the recipient on becoming a sender of an audio message to a further recipient.
  • The interactive response, via DTMF tones, can be used by the recipient to progress through a tutorial instruction or to initiate the delivery of an SMS message outlining the tutorial steps or a requested product feature to his or her telephone.
  • The appendage of the further information can be instigated by the server means 4 by appending an audio message introducing the recipient to a particular product feature or indicating to the recipient that a tutorial may be heard on how to use the system and to send a message. Then all the recipient has to do is indicate on their telephone terminal that they wish to proceed to listen to a tutorial or the information on a product feature of the system and follow an interactive voice response session.
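
The decoding steps listed above (the recipient decoder stripping the recipient number from a message in the SAM.xxxxx.recipient phone number format, then the message body decoder expanding SMS codes and emoticons into text before synthesis) can be illustrated with a minimal Python sketch. The function names and the lookup table are hypothetical; only the table entries "cw2cu", "2nite", ":-)))" and "D8" are taken from the examples in the text, and the real decoder would also handle control words and attribute words, which are omitted here.

```python
# Hypothetical lookup table of SMS codes and emoticons.
# Only these pairs appear as examples in the text; a real table would be larger.
SMS_CODES = {
    "cw2cu": "can't wait to see you",
    "2nite": "tonight",
    "d8": "date",
    ":-)))": "very happy",
}


def split_sam(message: str) -> tuple:
    """Split 'SAM.<body>.<recipient number>' into (recipient, body).

    The body may itself contain full stops, so the prefix is taken from
    the first '.' and the recipient number from the last '.'.
    """
    prefix, sep, rest = message.partition(".")
    if prefix != "SAM" or not sep:
        raise ValueError("not a SAM message")
    body, sep, recipient = rest.rpartition(".")
    if not sep or not recipient.isdigit():
        raise ValueError("missing recipient number")
    return recipient, body


def expand_sms_codes(body: str) -> str:
    """Replace SMS codes and emoticons in the body with equivalent text."""
    words = []
    for token in body.split():
        # Emoticons are made of punctuation, so try the whole token first.
        if token.lower() in SMS_CODES:
            words.append(SMS_CODES[token.lower()])
            continue
        # Otherwise separate trailing punctuation and look up the core word.
        core = token.rstrip(".,!?")
        trail = token[len(core):]
        words.append(SMS_CODES.get(core.lower(), core) + trail)
    return " ".join(words)


recipient, body = split_sam("SAM.I cw2cu 2nite. You make me feel :-))).0409550206")
text_for_synthesis = expand_sms_codes(body)
```

In this sketch the expanded text would then be handed to whatever speech synthesis step is in use; splitting on the first and last full stop is a design assumption that lets the free-form body contain ordinary punctuation.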

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A system and method for transmitting an audio message to a recipient, the audio message resulting from a conversion process. Data entry means (10, 17, 19) is used by a sender to enter a text-based message, either in free-form style or using a template structure, which is sent to a processing means (4, 72) and converted into the audio message, which may be in a voice known to the recipient with or without sound effects. The message is then sent over a communications network (6) to a recipient communications device (9, 12) to be heard by the recipient. The data template structure may be downloaded from the processing means (4) to the data entry means (10, 17, 19) or to the recipient communications device (9, 12).

Description

SYSTEM AND METHOD OF CONVERTING A MESSAGE AND TRANSMITTING THE CONVERTED MESSAGE

Field of the invention

This invention relates to a method and system for constructing and transmitting a message over a communications network. The invention also relates to a data template structure for display on a communications device that allows subscribers of mobile devices to construct a message using the template structure for delivery to another telephone subscriber or other communications device, which message may have certain characteristics such as being in the voice of a famous character or person, or including sound effects. The invention also relates to a method of displaying data template structures on a display means of a communications device.

Background of the invention

At present there exist methods of transmitting text messages, for example using the Short Message Service (SMS), between mobile telephone users. To construct and send a message, a user generally inputs the recipient's name and/or telephone number using the keypad of the mobile telephone and then inputs the text message that they wish to send to the recipient. The message is then sent through a mobile communications network by depressing the send button, and the recipient will be alerted on their mobile phone when they receive a text message, which they can view by standard techniques.
The above system of transmitting SMS messages is limited in the sense that the message is simply a plain text message with no specific characteristics, apart from the identity of the sender and the text itself.
The present invention makes sending a message a unique and fun experience in that the message may be sent to include the voice of a famous or recognisable character with a particular tone or style and in a particular type of message whether it be a greeting, hello, or a comment that is typically associated with the famous character. The present invention provides an interactive way of selecting, by the user wishing to construct the message, a number of choices of famous or recognisable characters, the style in which the message is to be heard, and the type of greeting or type of message. More particularly, the recipient will be able to hear the message rather than just simply viewing the contents of the message on the screen of their communications device. Furthermore, the message can be sent to a variety of voice function communication devices.

Summary of the invention

According to a first aspect of the invention there is provided a data template structure for display on a display means of a user communications device to allow said user to construct a message to be sent to a recipient; said structure comprising: one or more fields that permit entry by said user of data representative of characteristics of said message and said recipient; wherein said data identifies an audio sequence and on receipt of said message by said recipient said audio sequence is heard on a recipient communications device by said recipient as part of said message. The user and recipient communications devices may be wireless, linked to a cellular communications network or wired and linked to a fixed telecommunications network, such as the PSTN.
The audio sequence may be the voice of a famous or recognisable character, well known to most recipients, or a song or other special effects sounds. A song may be sung in the voice of a famous character. Where the audio sequence is in the voice of a famous character, background and end sound effects may be included as part of said audio sequence.
The fields may allow direct insertion by said user of said data representing characteristics of said message or may be formed as an interactive menu having various levels. Thus, on selecting one field, specific choices for that field may be displayed on said display means, for example a range of choices in whose famous voice the message is to be heard. The entry of data in said fields may be abbreviated.
A second field may allow entry of the user's name, either first names, nicknames or surnames. A third field may allow entry of the recipient's name, either first name, nickname or surname. A fourth field may allow entry of the recipient's mobile or other telephone number. A fifth field may allow entry of the type of message to be sent, for example, happy, sad, bored, sorry. A sixth field may allow entry in said structure of the type of message to be sent, for example, a greeting, happy birthday, congratulations etc. A seventh field may allow entry of the time at which the message should be sent.
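As a rough illustration of the template fields listed above, the following Python sketch models them as a simple record and expands abbreviated entries. It is only a sketch under stated assumptions: the class, the field names (including `voice` for the first field and `mood` for the fifth) and the abbreviation table are hypothetical, and only the abbreviation "bor" is taken from an example in this document.

```python
from dataclasses import dataclass

# Hypothetical three-letter variants. Only "bor" appears as an example in
# the document; the other entries are assumptions for illustration.
ABBREVIATIONS = {"bor": "bored", "hap": "happy", "sor": "sorry"}


@dataclass
class MessageTemplate:
    voice: str             # assumed first field: whose voice is used
    sender_name: str       # second field
    recipient_name: str    # third field
    recipient_number: str  # fourth field
    mood: str              # fifth field, e.g. happy, sad, bored, sorry
    message_type: str      # sixth field, e.g. greeting, happy birthday
    send_time: str         # seventh field, assumed to be 24-hour HHMM


def expand(value: str) -> str:
    """Return the full form of an abbreviated entry, or the entry unchanged."""
    return ABBREVIATIONS.get(value.lower(), value)


def build_template(entries: dict) -> MessageTemplate:
    """Build a template from raw field entries, expanding abbreviations."""
    data = dict(entries)
    # Only the descriptive fields are expanded; names are left untouched.
    for key in ("mood", "message_type"):
        data[key] = expand(data[key])
    template = MessageTemplate(**data)
    if not template.recipient_number.isdigit():
        raise ValueError("recipient number must contain digits only")
    t = template.send_time
    if not (len(t) == 4 and t.isdigit() and int(t[:2]) < 24 and int(t[2:]) < 60):
        raise ValueError("send time must be in HHMM form")
    return template


example = build_template({
    "voice": "Elvis",
    "sender_name": "Jack",
    "recipient_name": "Jill",
    "recipient_number": "0409550206",
    "mood": "bor",
    "message_type": "greeting",
    "send_time": "1500",
})
```

Expanding abbreviations only in the descriptive fields is a design choice of this sketch, so that a name that happens to match an abbreviation is not altered.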
Each of said communication devices of said user and said recipient may be respectively linked to a fixed communications network either directly or through a cellular communications network, where said communication devices are wireless, cellular devices. A server means, linked to said communications network or directly to the cellular communications network, may have data storage and processing means or be linked to a data storage and processing means. Said data storage and processing means may create and store said audio sequences such as recordings of voices of famous characters whether in a particular style or type. The data storage means may include a telephone book matching recipient telephone numbers with recipient names.
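The storage role described above, holding audio sequences by character, style and type together with a telephone book that matches names to numbers, might be sketched as follows. This is a minimal in-memory illustration, not the actual server design: the class name, method names and the example file name are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class ServerStore:
    """Hypothetical in-memory stand-in for the data storage means."""

    def __init__(self) -> None:
        self.phone_book: Dict[str, str] = {}
        self.audio: Dict[Tuple[str, str, str], str] = {}

    def add_contact(self, name: str, number: str) -> None:
        self.phone_book[name.lower()] = number

    def resolve_recipient(self, name_or_number: str) -> str:
        """Return a telephone number, looking names up in the telephone book."""
        if name_or_number.isdigit():
            return name_or_number
        return self.phone_book[name_or_number.lower()]  # KeyError if unknown

    def store_audio(self, voice: str, style: str, message_type: str, path: str) -> None:
        self.audio[(voice.lower(), style.lower(), message_type.lower())] = path

    def find_audio(self, voice: str, style: str, message_type: str) -> Optional[str]:
        """Return the stored recording for this combination, if any."""
        return self.audio.get((voice.lower(), style.lower(), message_type.lower()))


store = ServerStore()
store.add_contact("Jill", "0409550206")
store.store_audio("Elvis", "cool", "greeting", "elvis_cool_greeting.wav")
```

Keys are lower-cased here so that lookups do not depend on how the user capitalised an entry; that is an assumption of the sketch rather than something the text specifies.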
The server means may also store the data template structures and download these on request to a user wishing to construct a message.
According to a second aspect of the invention there is provided a method of displaying a data template structure on a display means of a communications device of a user, comprising the steps of: receiving at a server means a request by said user to construct a message to be sent to a recipient; downloading said data template structure from said server means to said communications device; said data template structure including a field permitting entry of data representative of characteristics of said message and said recipient; wherein said data identifies an audio sequence which is heard by said recipient.
According to a third aspect of the invention there is provided a method of transmitting a data template structure between users, each user having a communications device for reception and transmission of said structure, said method comprising the steps of: a first user receiving said structure downloaded from a server means; transmitting the downloaded structure as a message from a communication device of said first user to a communication device of a second user; storing said structure on said communication device of said second user; wherein said second user is able to retrieve the stored structure, create a message by entering data in at least one field of said structure, said data identifying an audio sequence to be heard by a further user, and transmitting the created message to said further user.
The message may be an SMS message.
According to a fourth aspect of the invention there is provided a data menu structure for display on a display means of a communications device to allow a user to construct a message to be sent to a recipient; said structure comprising: at least one field that permits selection of options from menu options and permits entry of data against a selected menu option by said user, said data being representative of characteristics of said message and said recipient; wherein at least one of said selected menu options or data entries provides a string of text identifying an audio sequence, such that on receipt of said message by said recipient said audio sequence is heard by said recipient on a recipient communications device.
According to a fifth aspect of the invention there is provided a method of displaying a data menu structure on a display means of a communications device of a user, comprising the steps of: receiving at a server means a request by said user to construct a message to be sent to a recipient; establishing an application session between said server means and said communications device; sending a menu structure from said server means to said communications device; selecting, by said user, one or more menu options whilst navigating said menu structure and entering data required by said user or said server means; transmitting navigation, selection and data entry commands from said communications device to said server means, said commands permitting entry of data representative of characteristics of said message and said recipient; wherein at least one set of entered data or selected data represents a string of text identifying an audio sequence to be heard by said recipient.
The user may navigate up and down the menu structure. The menu structure may be refreshed or modified depending upon such navigation.
According to a sixth aspect of the invention there is provided a method of transmitting a data menu structure between a first user and a second user respectively using a first user communications device and a second user communications device, said method comprising the steps of: said first user participating in an application session between a server means and said first user communications device; sending an application connection query, by said first user, to said second user communications device; sending a menu structure representative of said application selected by said first user from said server means to said second user communications device; wherein said second user is able to use the application menu structure to create a message by entering data or selecting a menu option in at least one field of said structure, said data identifying an audio sequence to be heard by a further user, and transmitting the created message to said further user. The application connection query may be sent to the server means before being sent to the second user communications device. A network identifier of the second user may be provided by said first user to said server means. On sending the menu structure to the second user, the server means may send a network identifier for the application selected by said first user so that the second user can establish a further or later session with the server means where the application is stored. Thus the second user may have the option of responding to the server means at the time of receipt of the menu structure using the menu structure or at a later time. The further user may be the first user.
According to a seventh aspect of the invention there is provided a system for constructing and transmitting a message over a communications network from a sender to a recipient, said system comprising: a sender communications device on which said message is constructed by said sender by entry of data representative of characteristics of said message and a recipient identifier into said sender communications device; processing means linked to said communications network for receiving said message and converting the received message into an audio message and transmitting said audio message over said communications network to said recipient using a recipient identifier; and a recipient communications device for accessing said audio message.
According to an eighth aspect of the invention there is provided a system for transmitting a converted message over a communications network from a sender to a recipient, said system comprising: data entry means for entering a text-based message in a freeform style; means for converting said freeform text-based message into an audio message to be sent as said converted message, to said recipient over said communications network.
According to a ninth aspect of the invention there is provided a method of transmitting a converted message over a communications network from a sender to a recipient, said method comprising the steps of: entering a text-based message in a freeform style into a data entry means; converting the freeform text-based message into an audio message; and transmitting said audio message as said converted message to said recipient over said communications network.
Brief description of the drawings
The invention will hereinafter be described in a preferred embodiment, by way of example only, with reference to the drawings wherein:
Figure 1 is a schematic diagram of a system used to transmit a message between a sender and recipient in accordance with the invention;
Figure 2 is a screen view of menu options available to the user in constructing a message;
Figure 3 is a screen showing choices of voices of famous characters or special effects that the user can select;
Figure 4 is a view similar to Figure 2 but showing a selected famous character;
Figure 5 is a screen view showing options available as to the type of message to be constructed;
Figure 6 is a screen showing the options available to the user as to the style of the message to be sent;
Figure 7 is a screen allowing the input of the recipient's name or otherwise providing a search facility to locate either the name or telephone number of the recipient;
Figure 8 shows a screen with the various selected options prior to sending the message;
Figure 9 is a view displaying the recipient's phone number on the screen of the user's mobile telephone;
Figure 10 provides a further embodiment of options or characteristics available to the user in constructing their message;
Figure 11 is a block diagram of a further embodiment of the present invention; and
Figure 12 is a block diagram of a message body decoder means in Figure 11.
Detailed description of preferred embodiments
With reference to Figure 1 there is shown a system that enables the downloading of data template structures to one or more mobile telephones over a number of networks, which then allows users of the mobile telephones or wired telephones to construct, using a data template, a message which is to be heard by the intended recipient. Specifically the system 2 incorporates a server means 4 (which may be owned and operated by a service provider) linked to a communications network 6, which may be the PSTN or the Internet, which in turn is linked to a Public Land Mobile Network (PLMN) 8 to which one or more mobile telephones 10, 12 are linked. One or more wired telephones 7, 9 are connected to network 6. The server means 4 may be linked directly to PLMN 8 and has processing means 14, communication means 15, memory means 16 and data storage means 18. A further data base or storage means 20 is linked to the server 4. Either of the storage means 18 or 20 stores recordings of voices or audio from famous characters, which may be in a particular style and may be a particular type of message, and any one or more songs, any of which may be recorded or constructed using text to speech conversion means in the voice of a famous character, together with any phone numbers and corresponding subscriber names of subscribers of the mobile network (or fixed network) who are registered as users with the system on the server means 4. A wireless computer processor or PC 17 is linked to the mobile network 8, as an alternative to a mobile telephone, for inputting text and sending it to a website stored at the server 4 for on-forwarding to a recipient. Alternatively, a 'fixed' computer 19 linked to network 6 may be used to construct a message and send it to the website.
One example of what a user requiring to send a message will see on the screen of their mobile phone is shown in Figure 2. Generally a request is made by the user on depressing a key on their mobile telephone or fixed/wired telephone, which is transmitted across the mobile network 8 and/or communications network 6 to be received at the server means 4. The server means 4 will then download a template structure to the requesting mobile phone or wired phone, for example mobile phone 10, to allow entry and navigation of the various fields shown in Figure 2. A computer program stored in the memory means 16 will provide instructions to the processing means 14 to extract the template structure from either data storage means 18 or 20 and send it to the requesting phone 10 through the processing means 14 and communication means 15. Specifically there are six fields, being field 30 to allow selection of the famous character's voice in which the message is to be heard, field 32 for the type of message to be sent, field 34 for the style in which the message is to be sent (this may be an optional feature), field 36 for entry of the sender's name, field 38 for entry of the recipient's name and field 40, which may be optional, for entry of the time at which the message is to be sent. The user may select any one of fields 30 through to 40 using a navigation tool to gain further levels or choices under each of the six fields. It is to be noted that the layout of the screen is not limited to showing six fields but can include further fields, for example the recipient's telephone number and any other text that the sender wishes to enter to be heard by the recipient in the famous character's voice.
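By way of non-limiting illustration, the six fields of the template of Figure 2 might be represented as a simple record. The following is a minimal sketch only; the class name MessageTemplate, its attribute names and the choice of which fields are treated as required are assumptions made for illustration and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MessageTemplate:
    """Hypothetical record mirroring fields 30 to 40 of Figure 2."""
    voice: str = ""                   # field 30: famous character's voice
    msg_type: str = ""                # field 32: type of message
    style: Optional[str] = None       # field 34: style (may be optional)
    sender_name: str = ""             # field 36: sender's name
    recipient_name: str = ""          # field 38: recipient's name
    send_time: Optional[str] = None   # field 40: send time (may be optional)

    def missing_required(self) -> List[str]:
        """Return the names of required fields that are still empty."""
        # Assumption: every field not described as optional is required.
        required = ("voice", "msg_type", "sender_name", "recipient_name")
        return [name for name in required if not getattr(self, name)]


template = MessageTemplate(voice="Elvis", msg_type="hi", sender_name="Jack")
print(template.missing_required())
```

A server or handset could use such a check to decide whether the template is ready to send.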
At the option of the user, he or she can hear the final message before sending it to the recipient by sending the message to a service provider's telephone number (or equivalent or a designated number) and not providing the telephone number of the recipient. The user's phone will ring shortly thereafter to enable the user to hear the message. The user can then edit the message and send it to the recipient by inserting the recipient's telephone number at the end of the message, and sending the message to the number of the service provider. The recipient of the message may find the identity of the sender by inserting at the end of the received message an option to hear the number of the sender spoken to the recipient.
By selecting field 30 for the famous character's voice the user will be directed to a further screen shown in Figure 3 which provides choices of famous characters in whose voice the message is to be heard. Specifically there is shown on the screen 42 a selection from Elvis, rocker, sexy, FX, DJ. Many other choices may be available. Selecting rocker or sexy provides a message in that particular tone or style, while selecting FX includes background effects in the message such as recordings from a sporting event, a war scene or a street scene. By selecting DJ the message will either have in the background a song, or the song simply on its own or even the song sung in the voice of the famous character. Different screen fonts or characters can be used to identify optional and mandatory fields. Once the user has chosen which famous voice or character they wish to use they confirm the selection by clicking on the select button 44 or depressing a key to invoke the selection button. Either entry of a key on the key pad or navigation to the desired option selects the famous voice for the message and then returns the user to the top level as is shown in Figure 4. Figure 4 now shows that the voice or character Elvis has been selected and then the user will next select the type of message to be sent in Elvis' voice by highlighting or selecting the field 32. They will then be shown the screen shown in Figure 5 whereby any one of the selections hi, bored, call, wassup or sorry may be selected as the type of message. The user then clicks the select button 44 and will then be guided back to the uppermost menu screen shown in either Figure 4 or Figure 2. On selecting the field 34, which as mentioned before may be optional, the user will be guided to the screen shown in Figure 6 which provides a number of options such as cool, funny, bizarre or rude. Other options may include happy or sad.
Once the user has selected the style of the message by clicking on button 44 they will then be directed back to the top menu where they may select field 36 to include their name, then field 38 to select the recipient's name and optionally field 40 to provide a time to send the message, which can be any time within the next 24 hours down to the specific minute, for example 15.32. Alternatively the sender's name may be defaulted from within the phone where applicable, and the name of the recipient or their telephone number may be typed in. Where only the phone number or only the recipient's name is known, a phone book stored in the database 20 may be used to locate the matching name or telephone number. For example in Figure 7 there is shown a screen that allows insertion or typing of the recipient's name in the space 46 or, equivalently, use of button 48 which allows the search facility to track down the name of the recipient based on a telephone number.
When all of the required fields are completed the screen will look something like that shown in Figure 8 where all of the fields have been completed except the optional field "send time". To send the message all the user has to do is click on the send function 50 in which case the user will see that a default telephone number for the recipient is shown on their screen in Figure 9 with the option to confirm that the number was correct by clicking on the button OK 52 in which case the server means will receive all of the information input onto the template and send the message to the recipient mobile phone 12 for example.
Once a blank template structure has been downloaded from the server 4 to a first user's telephone, he or she may forward the structure as an SMS message to a second user's telephone by dialling the second user's number and selecting a "send" option. On receipt of the message, the second user can store the template on their telephone and retrieve it to construct a message, as described, and transmit it to a further user's telephone.
An alternative data entry screen may be that shown in Figure 10 where for example users are familiar with the options available and can immediately input their choices under the various fields between the delimiters identified by > and <. As an example on the screen 54 there is shown a first field 56 that allows entry of the user's or sender's name or nickname, a second field 58 to allow entry of the recipient's name or nickname, a third field 60 to allow entry of the recipient's telephone number, a fourth field 62 to allow entry of the famous character in whose voice the message will be heard by the recipient, a fifth field for a message type 64, a sixth field for the style of a message 66 and finally a seventh field for the time that the message is to be sent 68. All of the required data is entered between the parentheses or delimiters, such as "Jack", "Jill". The first field 56, for entry of the user's name or sender's name or nickname, is required and may be up to 10 characters in length. Field 58, for entry of the recipient's name, may or may not be required depending on whether the corresponding telephone number is stored in the database 20. It is more likely that the sender will know the name of the recipient but not necessarily their telephone number as well. Thus the entry of the name of the recipient may be up to 10 characters in length and in field 60 if the telephone number is also known then the number may also be up to 10 characters in length. Thus the first three fields 56, 58 and 60 together can be up to 30 characters in length. The prompts and delimiters used to guide the sender in constructing the message require up to 99 characters to display on the screen, which for SMS messages leaves another 61 characters free for the user to enter values between the brackets or the delimiters. Thus for the other four fields 62, 64, 66 and 68, 31 characters are left to use.
Four of these characters are used for the time that the message is to be sent in field 68 which leaves 27 characters for the Famoice or famous voice, style and message type. The famous or recognisable voice of a character in which the message is to be heard may be selected in Field 62 such as Elvis, DJ, FX. As mentioned previously this field is required. The next field message type 64 will also be required and can be for example a congratulatory message, a greeting, a birthday greeting and so forth. The style field 66 may be optional such as in what tone the message is to be transmitted, for example, happy, sad, angry.
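The character budget described above can be checked with a short calculation. The following sketch simply restates the arithmetic of the description; the constant and function names are hypothetical and chosen for illustration.

```python
SMS_MAX_CHARS = 160          # maximum length of a single SMS text message
PROMPT_CHARS = 99            # prompts and delimiters displayed on the screen
NAME_NUMBER_CHARS = 3 * 10   # fields 56, 58 and 60, up to 10 characters each
TIME_CHARS = 4               # field 68, for example 1500

# Characters free for all values entered between the delimiters.
free_for_values = SMS_MAX_CHARS - PROMPT_CHARS                   # 61
# Characters left for fields 62, 64, 66 and 68.
left_for_other_fields = free_for_values - NAME_NUMBER_CHARS      # 31
# Characters left for the famous voice, message type and style.
left_for_voice_type_style = left_for_other_fields - TIME_CHARS   # 27


def fits_budget(voice: str, msg_type: str, style: str = "") -> bool:
    """Check that fields 62, 64 and 66 together fit in the remaining space."""
    return len(voice) + len(msg_type) + len(style) <= left_for_voice_type_style


print(free_for_values, left_for_other_fields, left_for_voice_type_style)
print(fits_budget("Elvis", "bor", "cool"))
```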
Where sound effects are used via the template the user would set the field 62 to FX and then enter a message type in field 64. The style field would be left blank. Furthermore where a piece of music or song needs to be sent via the template then the user would set field 62 to DJ and then enter a message type; the style field will also be left blank. A number of variants may exist for each recording of the voice of the famous or recognisable character from which the sender can select. Thus for example the Elvis voice may say the message "hello" in six different ways. The user can specify explicitly which variant they would like to send by entering hi1, hi2, hi3 and so on. Alternatively a default hi could be specified instead of the variations on hi.
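As an illustration of how a numbered variant might be separated from the message type, consider the following sketch. The function name split_variant, the pattern used and the choice of variant 1 as the default are assumptions for illustration only.

```python
import re

# Assumption: a message type is a run of letters optionally followed by a
# variant number, for example "hi2"; a bare "hi" selects the default variant.
VARIANT_RE = re.compile(r"^([A-Za-z]+)(\d*)$")


def split_variant(msg_type: str, default_variant: int = 1):
    """Split a message type such as 'hi2' into ('hi', 2)."""
    match = VARIANT_RE.match(msg_type.strip())
    if match is None:
        raise ValueError("unrecognised message type: %r" % msg_type)
    base, digits = match.groups()
    variant = int(digits) if digits else default_variant
    return base.lower(), variant


print(split_variant("hi2"))
print(split_variant("hi"))
```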
In order to minimise the number of key strokes for end users that are perhaps more experienced with entry of data into the templates, it is possible to use a number of abbreviations for various fields such as the famous voice or famoice, or style and message type. The abbreviated form of three characters from the full field value include the following examples:
Elvis: Elv or Els
Cool: Coo or Col
Bored: Bor or Brd
Happy: Hap or Hpy
Sad: Sad
Thus a user may use either the full name in the field or a three letter variant. The user can choose to mix between the three letter variants and the full names and as an example the following combination of text entries would be valid:
Enter your 1st Name<Jack>, their 1st Name<Jill>, their No.<0409550206>, Famoice<Elvis>, Style<cool>, Msg Type<bor>, Time<1500>
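By way of example only, the abbreviations listed above could be resolved to their full field values with a lookup table. The sketch below assumes that matching is not case sensitive and that unknown values are passed through unchanged; the names ABBREVIATIONS and expand are hypothetical.

```python
# Full field values and their three letter variants, as listed above.
ABBREVIATIONS = {
    "elvis": ("elv", "els"),
    "cool": ("coo", "col"),
    "bored": ("bor", "brd"),
    "happy": ("hap", "hpy"),
    "sad": ("sad",),
}

# Build a reverse lookup so that both full names and variants resolve.
_LOOKUP = {}
for full_name, variants in ABBREVIATIONS.items():
    _LOOKUP[full_name] = full_name
    for variant in variants:
        _LOOKUP[variant] = full_name


def expand(value: str) -> str:
    """Return the full field value for a full name or three letter variant."""
    key = value.strip().lower()
    # Assumption: values not in the table are returned unchanged (lowercased).
    return _LOOKUP.get(key, key)


print(expand("Elv"), expand("cool"), expand("bor"))
```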
As mentioned previously the server means 4 may be linked to the database 20 which can store a telephone book which contains matching entries for a mobile telephone number or PSTN number to the actual subscriber associated with that telephone number. The phone book makes it easier for users to enter a phone number when filling in one of the template structures in preparation for sending a message. When a message which has been constructed by a sender is received by the server means 4, the server means determines the values contained within the field 58 (or equivalently field 38) and in field 60. If no value is supplied for the phone number in the field 60, the processing means 14 of the server means 4 will search the user's phone book for an entry that matches the value in the field 58 for the name of the recipient. If it finds an entry it uses the number stored in the phone book in data base 20 and proceeds to send the message. If a value has been supplied in the field 60 for the recipient's telephone number the server means 4 or more particularly the processing means 14 checks to see whether a number has been supplied or a name. If a number has been supplied it uses it and proceeds to send the message. If a name has been supplied then it looks up the name in the user's phone book as mentioned previously and proceeds to send the message. Examples of templates that are all valid, assuming that Jill and her phone number have been entered into Jack's on-line phone book at database 20, are as follows:
Enter your 1st Name<Jack>, their 1st Name<Jill>, their No.<0409550206>, Famoice<Elvis>, Style<cool>, Msg Type<bor>, Time<1500>
Enter your 1st Name<Jack>, Their 1st Name<Jill>, Their No.<>, Famoice<Elvis>, Style<cool>, Msg Type<bor>, Time<1500>
Enter your 1st Name<Jack>, Their 1st Name<Jill>, Their No.<Jill>, Famoice<Elvis>, Style<cool>, Msg Type<bor>, Time<1500>
A further embodiment of a data template to be displayed on a user's mobile telephone incorporates the use of spaces to input characters for each field separated by full stops or dots. The template allows users to enter information with a minimum number of keystrokes in an easy to remember format. A typical example would be as follows:
Jack.Jill.0409550206.elv.hi.hap
This would be equivalent to:
Enter your 1st Name<Jack>, Their 1st Name<Jill>, Their No.<0409550206>, Famoice<Elvis>, Msg Type<hi>, Style<hap>, Time<>
Optional fields are omitted if they are not used in the period delimited template.
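The two template forms and the phone book lookup described above may be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the dictionary keys, the positional order assumed for the period delimited form and the use of a plain dictionary as the phone book are all hypothetical.

```python
import re

# Matches one 'Label<value>' pair in the delimiter based template.
FIELD_RE = re.compile(r"([^<>,]+)<([^<>]*)>")

# Assumed positional order of the period delimited template.
DOT_ORDER = ("sender", "recipient", "number", "voice", "msg_type", "style", "time")


def parse_bracket_template(text):
    """Parse 'Label<value>, Label<value>, ...' into a dict of field values."""
    fields = {}
    for label, value in FIELD_RE.findall(text):
        label = label.strip().lower()
        if "your" in label:
            key = "sender"
        elif "no." in label:
            key = "number"
        elif "their" in label:
            key = "recipient"
        elif "famoice" in label:
            key = "voice"
        elif "style" in label:
            key = "style"
        elif "type" in label:
            key = "msg_type"
        elif "time" in label:
            key = "time"
        else:
            continue  # unknown prompt: ignored in this sketch
        fields[key] = value.strip()
    return fields


def parse_dot_template(text):
    """Parse 'Jack.Jill.0409550206.elv.hi.hap'; trailing fields may be omitted."""
    parts = [part.strip() for part in text.strip().split(".")]
    return dict(zip(DOT_ORDER, parts))


def resolve_recipient(fields, phone_book):
    """Return the recipient number, consulting the phone book when needed."""
    number_field = fields.get("number", "")
    if number_field.isdigit():
        return number_field  # a number was supplied, so use it directly
    # Either nothing or a name was supplied in the number field.
    name = number_field or fields.get("recipient", "")
    return phone_book.get(name.lower())  # None when no entry matches


jacks_phone_book = {"jill": "0409550206"}  # hypothetical on-line phone book
fields = parse_dot_template("Jack.Jill.0409550206.elv.hi.hap")
print(resolve_recipient(fields, jacks_phone_book))
```

In this sketch a recipient that cannot be resolved yields None, leaving the error handling to the caller.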
The PLMN 8 may be any digital cellular network, such as GSM or CDMA, that allows transmission of text messages. In the above embodiments a GSM or Global System for Mobiles network is used which allows the transmission of text based messages between the mobile terminal 10 or 12 and the server means 4. Firstly, Unstructured Supplementary Services Data (USSD) uses a signalling channel in the GSM network as the bearer and this is commonly the fast associated control channel. An advantage of using a USSD service is that it is session oriented, which means that when a user accesses the service, a session is established and the radio connection stays open until the user, application or time out function releases it. By using USSD the user wishing to construct a message sends a short code, for example #145#, which routes the message through the mobile network 8 to the correct application and obtains the particular service requested, and then, together with the server means 4, the message is forwarded to the recipient. Alternatively, a further protocol that can be used is the SIM Application Toolkit (SAT) which allows for the server means to download various menu structures or templates to the subscriber's mobile telephone for participating in an interactive manner. The SIM Application Toolkit has been agreed and incorporated within the GSM standard and allows the flexibility to update the SIM to alter the services and download new services over the radio link. For example, aside from assisting in downloading menu structures, the network operators can remotely provision the user's mobile telephone by sending codes from the server 4 to the mobile phones in the form of short messages or GPRS data. Information may also be sent from the mobile phone back to the server 4 using the SAT over a predefined bearer channel.
In using the first channel, USSD, once the user has constructed their message and clicked the send button the data is sent through this channel and delivered to the server 4 directly from a USSD processor forming part of the mobile network.
In a further embodiment of the present invention, an application session may be established between a first user, for example having a first user communication device being mobile telephone 10, and server means 4. A data menu structure is transmitted from the server means 4 to be displayed on display means of the mobile telephone 10 to enable the first user to construct a message that is to be sent to a second user who for example may be using a communications device such as mobile telephone 12. The data menu structure includes a number of menu fields that enable the first user to navigate the menu structure and select options from a range of menu options and enter data against a selected menu option which data may be representative of characteristics of the message to be sent to the second user. The menu structure can be refreshed or modified at any time depending upon the first user's navigation. Once a user has constructed a message, that would typically be in the voice of a famous or recognisable character and may include special sound effects, the first user in choosing to send the message to a second user may select to send an application connection query to the mobile telephone 12 of the second user through the server means 4. Navigation, selection and data entry commands are sent to the server means 4 from the mobile telephone 10. A network identifier identifying the second user's mobile telephone is also sent by the first user as part of the message to the server means. The server means then sends to the second user's mobile telephone 12 an initial menu structure and, optionally, a network identifier identifying the application selected by the first user and also identifying the application being stored at the server means 4. The second user then has the option to respond to the server means 4 with the application menu structure that is transmitted to their mobile telephone 12 at the time they receive it or at a later time. 
The second user may choose to respond later as they are not able to respond at the time of receipt of the menu structure or in situations for example where a connection is lost. In the latter situation the user may require reconnection to establish a further USSD session in order to use the application menu structure to respond to the sender or alternatively create a new message by entering data or selecting a menu option in at least one field of the menu structure. At least one set of data entered by the second user identifies an audio sequence that is to be heard by a further user, which may be the first user, and whereby the message is then subsequently transmitted to that further user using a network identifier and transmitted by the server means 4.
The first mentioned network identifier may be a telephone number at which the user is to connect to the server means 4 in order to have access to the application that has already been transmitted by the sending user.
In addition to constructing a message and sending a completed template to the server, a user can also request a new template or help. A user may submit a request for a blank template to be sent to their mobile phone by sending the following message to the server 4: template
The template request service is only available to registered users and if an unregistered user requests a template they will receive an error message.
A user may request help on how to use the system by sending the following message to the server 4: help
The server will respond to a help request by sending an SMS containing the help line number as well as the website address of the service provider. Other information that is able to fit into the maximum 160 characters in a text message may also be included. Certain characters, such as #, ., < and >, are reserved for the purposes of future extensions to the template syntax. These characters are not to be used as field values.
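The handling of the template and help requests, and the check for reserved characters, might be sketched as follows. The function names and the returned action labels are hypothetical and are used only to illustrate the behaviour described above.

```python
# Characters reserved for future extensions to the template syntax.
RESERVED_CHARS = set("#.<>")


def is_valid_field_value(value: str) -> bool:
    """Return True when a field value contains no reserved character."""
    return not (set(value) & RESERVED_CHARS)


def handle_request(text: str, sender_is_registered: bool) -> str:
    """Decide how the server responds to an incoming text message.

    The returned strings are illustrative action labels only.
    """
    command = text.strip().lower()
    if command == "template":
        # Only registered users may request a blank template.
        return "send_template" if sender_is_registered else "send_error"
    if command == "help":
        return "send_help"
    # Anything else is treated as a completed template or message.
    return "process_message"


print(handle_request("template", sender_is_registered=False))
print(is_valid_field_value("Jack"), is_valid_field_value("J#ck"))
```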
In a further embodiment, a user may construct a short audio message or SAM in a "freeform" style and send the message to either the user's own phone or to another person's phone. Using the freeform system, an SMS message is constructed by a sender on their telephone, for example mobile phone 10, and transmitted to a decoder which may form part of a server 4 or be otherwise connected to server 4. The message will be transmitted over mobile network 8 and communications network 6 to the decoder either directly or through server 4. The freeform SAM system imposes little or no formatting requirements upon the SMS message and uses key words and parsing to decode how the SMS message is to be converted into an audio format to be received by a recipient. The freeform system is principally designed to accommodate the conversion of arbitrary input text into speech of a person and more particularly in one or more characteristics or voices of a famous character.
Alternatively, a wireless computer processor or PC 17 linked to mobile network 8 may be used for inputting text and sending it to a website stored at server 4 for on-forwarding to a recipient. A fixed computer 19 linked to network 6 may also be used to construct the message and send it to the website or to server 4.
With reference to Figure 11, the basic system architecture of the freeform system is shown. Once the message is constructed by the user, it is sent over networks 8 and 6 and received at the recipient decoder 70, which decodes the desired recipient of the message by scanning the end of the SMS message for a telephone number. If a valid telephone number is detected at the end of the message, it is stripped from the message and becomes the recipient identification (or ID) and the remainder of the message is passed on through the system. If no telephone number is detected then the sender telephone number is used as the recipient telephone number or ID. The rest of the message (not including the recipient ID) is sent to a message body decoder means 72 (to be hereinafter described), where it is converted into an audio message, and is delivered via the audio messaging delivery system 74 to the intended recipient over the network 6 to a PSTN phone or over networks 6 and 8 to a mobile telephone 12.
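The operation of the recipient decoder 70 may be illustrated by the following sketch. The specification does not define what constitutes a valid telephone number; the pattern below (a trailing run of 8 to 15 digits, optionally preceded by a plus sign) and the function name decode_recipient are assumptions made for illustration.

```python
import re

# Assumption: a valid telephone number is 8 to 15 digits at the very end of
# the message, optionally prefixed by '+', and separated from the body by
# whitespace (or making up the whole message).
TRAILING_NUMBER_RE = re.compile(r"(?:^|\s)(\+?\d{8,15})\s*$")


def decode_recipient(sms_text: str, sender_number: str):
    """Return (recipient_id, message_body) for an incoming SMS message."""
    match = TRAILING_NUMBER_RE.search(sms_text)
    if match:
        # Strip the number from the message; it becomes the recipient ID.
        recipient_id = match.group(1)
        body = sms_text[:match.start()].strip()
    else:
        # No number found: the sender's own number is used instead.
        recipient_id = sender_number
        body = sms_text.strip()
    return recipient_id, body


print(decode_recipient("hello from me 0409550206", "0400000000"))
print(decode_recipient("hello from me", "0400000000"))
```

The second call corresponds to the preview case in which the sender omits the recipient's number and so hears the message on their own phone.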
The message body decoder means 72 essentially converts the body of the SMS message from text to audio. Again, the message body decoder means 72 may be incorporated as part of the server 4 or be otherwise linked to the server 4. The text generally comprises words which are delineated by space characters, however other delineating methods may be employed, including statistical algorithms to determine word boundaries automatically from poorly delineated bodies. Each word in the message body may be a control word, an attribute word, an SMS code or a regular word, each to be hereinafter described. The decoder 72 scans the message body from left to right causing instructions to be generated that will be used to translate the SMS message into an audio message. Again as with a previous embodiment, at the option of the user, he or she can hear the final message before sending it to the recipient by sending the message to a telephone number of the service provider (or equivalent), or a designated number, and not providing the telephone number of the recipient. Thus, the recipient decoder 70, not detecting a recipient telephone number or ID, will use the user's telephone number. Therefore, the user's phone will ring shortly thereafter to enable the user to hear the message. The user can then edit the message and send it to the recipient by inserting the recipient's telephone number at the end of the message and sending the message to the number of the service provider. The recipient of the message may find the identity of the sender or user by inserting at the end of the received message an option to hear the number of the sender spoken to the recipient.
Control words
Control words are used to implement the text to audio conversion for the message body decoder. Control words are provided for such functions as switching the current voice to another voice or including sound effects, movie quotes or music clips or mixing in background sound effects.
Generally, most control words need to be followed by one or more attribute words. The attribute words are used to affect the mode of the control word. Examples of supported control words include the following:
VC - used to switch between voices;
/VC - used to resort to a previous voice;
FX - used to include a sound effect, movie quote or music clip;
BX - used to mix in a background sound effect;
INVC - used to switch voices for only one word or phrase;
VX - used to enable a voice effect, for example shouting; and
/VX - used to disable the current voice effect.
The following are examples of messages that can be constructed using the above mentioned control words.
Example 1
"hello from me VC Elvis and also from him"
On receipt of the constructed message, the message body decoder 72, on detecting the code VC, retrieves the Elvis Presley voice for that sequence from database 20, from data storage means 18 in server 4, or via a speech synthesis module, such that the message then includes the words "and also from him" in the voice of Elvis Presley.
Example 2
"VC Scotty did you know that VC Mickey, Mickey Mouse /VC once said that VC Mickey you are my best friend". The sequences "did you know that" and "once said that" are spoken in Scotty's voice, and "Mickey Mouse" and "you are my best friend" are spoken in the voice of Mickey Mouse. In each case the message body decoder converts the detected text into speech and, on detection of a control word together with an attribute word, fetches the appropriate audio sequence for the character, for example Scotty or Mickey Mouse, from the data storage means 18, database 20 or via the speech synthesis module, prior to construction of the audio message.
Example 3
"I am feeling very sad FX sigh please come and see me". The control word FX causes a sighing sound effect to be included after the word "sad", and the rest of the phrase is spoken in a standardised voice.
Example 4
"BX jazz music well lets get down and boogie". This phrase causes a jazz music track to be mixed with the spoken message "well lets get down and boogie".
Attribute words may be any regular words, SMS codes or other sequences of ASCII or extended byte characters that are programmed to be recognised in conjunction with their adjoining control words. If one or more attribute words follow a control word, the longest sequence of the control word followed by attribute words that is programmed as a valid attribute sequence for that specific control word is chosen, and is then decoded into instructions for the message body decoder 72.
For example, "That would make a baby FX ~:,-o ". The "~:,-o" is detected as a valid SMS code for a crying baby. Therefore, "~:,-o" forms an attribute word to FX which sequence is recognised as the control sequence to insert a crying baby sound effect after the spoken word "baby".
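The longest-match rule for attribute sequences might be sketched as below. The table `VALID_ATTRIBUTES` and the function `match_control` are hypothetical names, and the table contents are illustrative only; the specification does not list the programmed sequences.

```python
# Hypothetical table of programmed attribute sequences per control word.
VALID_ATTRIBUTES = {
    "FX": {("laugh",), ("sigh",), ("cry",), ("cry", "baby"), ("~:,-o",)},
}


def match_control(tokens, i):
    """Try to read a control sequence starting at tokens[i].

    Returns (control_word, attribute_words, tokens_consumed) for the longest
    valid sequence, or None if tokens[i] does not start a valid sequence (in
    which case the caller would treat it as a regular word).
    """
    control = tokens[i].upper()
    table = VALID_ATTRIBUTES.get(control)
    if not table:
        return None
    longest = max(len(seq) for seq in table)
    available = len(tokens) - i - 1
    # Try the longest candidate first, then progressively shorter ones.
    for n in range(min(longest, available), 0, -1):
        candidate = tuple(t.lower() for t in tokens[i + 1 : i + 1 + n])
        if candidate in table:
            return control, candidate, n + 1
    return None
```

With this table, "FX cry baby" is consumed as one three-token sequence, while in "FX laugh baby" only "FX laugh" matches and "baby" is left in the stream as a regular word.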
Attribute words may be SMS codes, which are abbreviations within the SMS language. For example, "D8" is equivalent to "date" and "IMHO" is equivalent to "in my humble opinion". If an SMS code is used as an attribute word, it is used as such in decoding the instructions to the message body decoder, based on the preceding control word. If, however, the SMS code is not part of a control word/attribute word sequence, that is, the SMS code is in the body of the text, it is converted by the message body decoder 72 into equivalent text, which in turn is converted to voice. This feature allows users to type the bulk of their message in familiar abbreviated SMS form. Thus, SMS codes can be used as attribute words or control words, or they can appear in the body of the text, in which case their text equivalent is generated.
SMS emoticons are also supported as SMS codes. These are sequences of extended ASCII characters that in text form resemble a simple picture. For example :-) is a smile. Emoticons get converted to equivalent text and in turn get converted to voice. For example, "I cw2cu 2nite. You make me feel :-)))" gets converted into "I can't wait to see you tonight. You make me feel very happy" which then gets converted into voice.
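The conversion of SMS codes and emoticons in the body of the text into equivalent text could look like the following sketch. The dictionary `SMS_CODES` and the function `expand_sms_codes` are hypothetical; only the mappings quoted in this description are taken from the text, and a real table would be considerably larger.

```python
# Illustrative lookup table, keyed in lower case.
SMS_CODES = {
    "d8": "date",
    "imho": "in my humble opinion",
    "bfn": "bye for now",
    "cw2cu": "can't wait to see you",
    "2nite": "tonight",
    ":-)": "happy",
    ":-)))": "very happy",
}


def expand_sms_codes(text):
    """Replace SMS codes and emoticons with their text equivalents."""
    out = []
    for token in text.split():
        # Emoticons contain punctuation, so try the whole token first.
        if token.lower() in SMS_CODES:
            out.append(SMS_CODES[token.lower()])
            continue
        # Otherwise separate trailing punctuation and look up the core word.
        core = token.rstrip(".,!?;")
        trail = token[len(core):]
        out.append(SMS_CODES.get(core.lower(), core) + trail)
    return " ".join(out)
```

The expanded string would then be passed to the speech synthesis step as ordinary regular words.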
Regular words are all words that are not recognised as the above special classes of words and all words that are recognised as one of the above special classes of words but that in sequence, do not form a valid control pattern. All regular words are considered input text to the speech synthesis/reconstruction process and will be spoken in the currently selected voice. A default voice may be used where a character's voice has not been selected. Included in the stream of regular words are all the punctuation marks that are not part of the control sequence or SMS code. By way of example:
"Hell there. My name is Xandoe. What's yours?". This sentence is completely regular and is passed, with its content unmodified to the speech synthesis process. "She'll be laughing FX laugh baby will be crying FX cry baby". The phrases of regular words in this sentence are "she'll be laughing" and "baby will be crying". Even though "baby" is an attribute word, it is only a valid attribute word for FX CRY not FX LAUGH, so it gets treated as a regular word in this context. A default voice is set such that the minimum message for conversion is simply input text. That is, the recipient can be the sender and the voice a default voice.
With reference to Figure 12 there is shown in more detail the components of the message body decoder 72 in Figure 11. It comprises a parser or converter unit 76, a message builder 78, a text to speech synthesiser 80, a waveform concatenator 82, a waveform mixer 84 and a waveform effects filter 86. With reference to Figure 11 the SMS message body which has been transmitted via the recipient decoder 70 is received by the parser unit 76 which breaks the message down into words and punctuation marks. It also searches through the input message for patterns that match the control sequences described above. From these patterns, the parser unit 76 builds up a construction formula for the final audio message. For example, one embodiment of this construction sequence is to represent each unique waveform segment in a list. For example:
StartList
Segment = TTS, Voice = Elvis, Txt = "Hello from Elvis"
Segment = FX, File = explosion.wav
Segment = TTS, Voice = HomerSimpson, Txt = "and hello from Homer"
EndList
Thus, the instructed sequence comprises a text to speech segment where the text is converted into the voice of Elvis, who would utter "hello from Elvis". It also includes sound effects taken from the explosion.wav file to incorporate a background sound effect, and a further text to speech portion in the voice of Homer Simpson, who would utter "and hello from Homer". The message builder unit 78, after receiving the message construction sequence from the parser unit 76, takes the sequence and, using the various modules 80 through to 86 attached to the message builder unit 78, constructs the message one segment at a time. It concatenates the message sequence together, applies any post-concatenation filters to the audio message and finally sends the complete audio message on to the audio message delivery system 74.
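One way to represent the construction sequence and the segment-by-segment work of the message builder is sketched below. The `Segment` class, the `build_message` function and the two callables passed to it are assumptions made for illustration; waveforms are modelled as plain lists of samples, and the synthesiser and file loader are left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    kind: str                    # "TTS" or "FX", as in the list above
    voice: Optional[str] = None  # used by TTS segments
    text: Optional[str] = None   # used by TTS segments
    file: Optional[str] = None   # used by FX segments


def build_message(
    segments: List[Segment],
    synthesise: Callable[[str, str], List[float]],
    load_file: Callable[[str], List[float]],
) -> List[float]:
    """Build the final waveform one segment at a time and concatenate."""
    waveform: List[float] = []
    for seg in segments:
        if seg.kind == "TTS":
            waveform.extend(synthesise(seg.voice, seg.text))
        elif seg.kind == "FX":
            waveform.extend(load_file(seg.file))
        else:
            raise ValueError(f"unknown segment kind: {seg.kind}")
    return waveform


# The construction sequence from the example list.
EXAMPLE_SEQUENCE = [
    Segment("TTS", voice="Elvis", text="Hello from Elvis"),
    Segment("FX", file="explosion.wav"),
    Segment("TTS", voice="HomerSimpson", text="and hello from Homer"),
]
```

Passing the synthesiser and loader in as parameters keeps the sketch independent of any particular speech engine, in keeping with the statement that any method of speech synthesis could be used.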
The text to speech synthesiser unit 80 receives each consecutive set of words that are to be spoken in a particular voice as an utterance. This utterance is then converted to speech and passed back as a waveform. It is to be understood that any method of speech synthesis could be used in this module and even different methods for different voices could be used. For example, Rhetorical's rvoice system, or the Festival Speech synthesis system, which use cluster unit concatenation technology could be used, or formant synthesisers could also be used.
An Application Service Provider (ASP) model may be used in the system to allow rapid and seamless integration of speech synthesis engines from any provider. The ASP model would send a synthesis request over a network, such as communications network 6, which may be the Internet, for remote conversion. It would then receive, at some later time, the waveform of speech back for further processing.
A further aspect of the invention is that some voices may be provided using pre-recorded phrases designed to match SMS codes, termed Fixed Domain voices. In this way, even before a complete general purpose text to speech synthesiser is available for a particular voice, users can add specific content in that character's voice. This type of voice uses a fall-back synthesis voice to articulate any words or phrases that are not supported by the Fixed Domain voice. It should be noted that the recorded audio for specific SMS codes in Fixed Domain voices need not exactly match the textual translation of those SMS codes.
For example: "vc Elvis I heard Mickey Mouse always says vc mickey bfn" will cause "I heard Mickey Mouse always says" to be spoken in Elvis' voice, and BFN will index into the Mickey Mouse database and extract the "bye for now" pre-recorded audio fragment, which will then be concatenated into the message. The "bye for now" audio fragment may contain Mickey Mouse saying "cheers from all your friends at the mouse club".
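A Fixed Domain voice with a fall-back synthesis voice could be planned along the following lines. The table `FIXED_DOMAIN`, the file name in it and the function `plan_fixed_domain` are hypothetical and serve only to illustrate the lookup-then-fall-back behaviour.

```python
# Hypothetical index of pre-recorded fragments, keyed by voice then SMS code.
FIXED_DOMAIN = {
    "mickey": {"bfn": "mickey_bfn.wav"},
}


def plan_fixed_domain(voice, words, fallback_voice="default"):
    """Plan playback for words spoken in a Fixed Domain voice.

    Words with a pre-recorded fragment are played from file; runs of any
    other words are handed to the fall-back synthesis voice.
    """
    recordings = FIXED_DOMAIN.get(voice.lower(), {})
    plan = []
    pending = []  # words waiting to be synthesised by the fall-back voice
    for word in words:
        clip = recordings.get(word.lower())
        if clip is None:
            pending.append(word)
            continue
        if pending:
            plan.append(("tts", fallback_voice, " ".join(pending)))
            pending = []
        plan.append(("recording", voice, clip))
    if pending:
        plan.append(("tts", fallback_voice, " ".join(pending)))
    return plan
```

Because the plan refers to a file rather than to text, the recorded fragment is free to differ from the literal translation of the SMS code.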
The waveform concatenator unit 82 is used to join each of the waveform segments together. It is to be noted that sound effects, movie quotes and music clips are indexed as pre-recorded audio waveform files by the parser unit 76 and so are passed directly to the waveform concatenator 82 for inclusion in the final audio waveform. The waveform mixer unit 84 is used to mix a background track into the final waveform or waveform segment. The waveform effects filter 86 is used to add effects such as "drunk" or "underwater" sounds: the final waveform or waveform segment is passed through a filter that modifies the acoustic properties of the waveform to simulate a particular effect. Some effects, such as shout, whisper or other emotions, may be implemented by selecting a different voice for the synthesis operation.
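The concatenation and mixing operations can be illustrated on waveforms modelled as plain lists of samples. The function names, the looping of the background track and the `gain` parameter are assumptions; a practical implementation would also handle sample rates and clipping, which this sketch ignores.

```python
def concatenate(segments):
    """Join waveform segments end to end into one waveform."""
    out = []
    for segment in segments:
        out.extend(segment)
    return out


def mix_background(foreground, background, gain=0.5):
    """Mix a background track under a foreground waveform.

    Assumption: the background is looped to cover the foreground and is
    scaled by `gain` before being added sample by sample.
    """
    if not background:
        return list(foreground)
    return [
        sample + gain * background[i % len(background)]
        for i, sample in enumerate(foreground)
    ]
```

An effects filter would have the same shape as `mix_background`: it takes a waveform and returns a modified waveform of the same length.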
In summary, a user wishing to construct and send a short audio message (SAM) does so by inputting text into their telephone device, such as mobile telephone 10. The text may be in the format SAM.xxxxx.recipient phone number (or another such template that provides for free-form entry xxxxx), where the x's denote the body of the text, which may include any one or more control words in combination with attribute words, SMS codes and regular words. Once the message has been constructed, depressing the send button sends the message via mobile network 8 and fixed communications network 6 to a recipient decoder means 70 linked to the network 6. Decoder means 70 analyses the SMS message and strips out the recipient ID, in this case the recipient telephone number, and the rest of the message is passed to a message body decoder 72. Depending on the control words or attribute words contained in the message, the message body decoder builds the message, extracting text to be converted to speech and forwarding this on to the synthesiser unit 80. Once all the text is converted into speech, which may call upon various audio recordings stored in the storage means, such as data storage 20, this is passed back to the message builder 78 in the message body decoder 72, and any background sound effects or filtering effects are used to construct the final audio message. All of the functionality associated with the recipient decoder and message body decoder can be encompassed in a server, such as server 4 in Figure 1, as previously mentioned. The constructed audio message is then delivered to the recipient, either at a fixed telephone or a mobile telephone 12, over communications networks 6 and 8 and is received as a normal telephone call.
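The role of the recipient decoder can be sketched as below, assuming the SAM.xxxxx.recipient phone number template and assuming that a recipient number is a run of 6 to 15 digits, optionally preceded by "+", at the end of the message. The function name `decode_recipient` and the number pattern are hypothetical. When no recipient number is found, the sender's own number is returned, which corresponds to the preview behaviour described earlier.

```python
import re

# Assumed pattern: a separator, then the recipient number at the very end.
_RECIPIENT = re.compile(r"[.\s](\+?\d{6,15})\s*$")


def decode_recipient(sms_text, sender_number):
    """Return (recipient_number, message_body) for an incoming SMS."""
    body = sms_text
    if body.upper().startswith("SAM."):
        body = body[4:]  # drop the template prefix
    match = _RECIPIENT.search(body)
    if match:
        return match.group(1), body[: match.start()].strip()
    # No recipient given: deliver to the sender so they can hear a preview.
    return sender_number, body.strip()
```

The returned body is what would be handed on to the message body decoder.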
In a further embodiment of the present invention, an outgoing audio message to a recipient may include further information that teaches the recipient how to construct and send an SMS message, if that recipient does not already know, or further information about a feature of the audio messaging system. This would have the effect of enticing the recipient to participate more fully in the use of the audio messaging system features. For example, a custom program for interactive voice response by the recipient may be used that teaches the recipient about the product feature or entices the recipient to try the system for the first time. Another example would be to send an SMS message to the recipient, the content of which introduces certain product features or trains the recipient to become a sender of an audio message to a further recipient. The interactive response, via DTMF tones, can be used by the recipient to progress through a tutorial or to initiate the delivery to his or her telephone of an SMS message outlining the tutorial steps or a requested product feature. The further information, by way of tutorial or product features, can be added by the server means 4 by appending an audio message introducing the recipient to a particular product feature or indicating to the recipient that a tutorial may be heard on how to use the system and send a message. All the recipient then has to do is indicate on their telephone terminal that they wish to listen to a tutorial or to the information on a product feature of the system, and follow an interactive voice response session.
It will also be appreciated that various modifications and alterations may be made to the preferred embodiments above, without departing from the scope and spirit of the present invention.

Claims

1. A data template structure for display on a display means of a user communications device to allow said user to construct a message to be sent to a recipient; said structure comprising:
one or more fields that permit entry by said user of data representative of characteristics of said message and said recipient; wherein said data identifies an audio sequence, and on receipt of said message by said recipient said audio sequence is heard on a recipient communications device by said recipient as part of said message.
2. A structure according to claim 1, wherein said audio sequence includes the voice of a person.
3. A structure according to claim 2, wherein said person is a famous or recognisable character.
4. A structure according to any one of claims 1 to 3, wherein said audio sequence includes sound effects.
5. A structure according to any one of claims 1 to 4, wherein said one or more fields permits entry of the user's name.
6. A structure according to any one of the previous claims wherein said one or more fields allows entry of said recipient identifier.
7. A structure according to claim 6 wherein said one or more fields allows entry of a further recipient identifier.
8. A structure according to claim 6 or claim 7 wherein said recipient identifier is the name of the recipient.
9. A structure according to claim 6 or claim 7 wherein said recipient identifier is a telephone number of said recipient.
10. A structure according to any one of the previous claims wherein said one or more fields allows entry of the type of message to be sent, for example, happy, sad, bored etc.
11. A structure according to any one of the previous claims wherein said one or more fields allows entry of the style of message to be sent, for example, a greeting, happy birthday etc.
12. A structure according to any one of the previous claims wherein said one or more fields allows entry of the time at which said message is to be sent to said recipient.
13. A structure according to any one of the previous claims wherein entry of data in said one or more fields may be formed as an interactive menu having various options, one of which is chosen by said user.
14. A structure according to any one of the previous claims wherein entry of said data in any one of the fields is abbreviated.
15. A structure according to any one of the previous claims wherein said user communications device and said recipient communications device are linked to said communications network.
16. A structure according to claim 15 wherein said user communication device is a wireless device linked to said communications network through a cellular communications network.
17. A structure according to any one of the previous claims wherein a server means linked to said communications network receives the constructed message, processes the message and translates the message into an audio format.
18. A structure according to claim 17 wherein in assembling said audio sequence, said server means retrieves audio recordings from a data storage means in said server means or a data storage means linked to said communications network.
19. A method of displaying a data template structure on a display means of a communications device of a user, comprising the steps of: receiving at a server means a request by said user to construct a message to be sent to a recipient; downloading said data template structure from said server means to said communications device; said data template structure including a field permitting entry of data representative of characteristics of said message and said recipient; wherein said data identifies an audio sequence which is to be heard by said recipient.
20. A method according to claim 19, wherein said data template structure includes further fields, said field and said further fields allowing entry of any one or more of the following: text to be converted into audio; a sound effect; a voice of a character known to the recipient; the type of message to be sent; the style in which the message is to be sent; the user's name; a recipient identifier; or the time at which the message is to be sent to said recipient.
21. A method according to claim 19 or claim 20, wherein said communications device of said user is a wireless device, such as a mobile telephone.
22. A method according to claim 19 or claim 20 wherein said user communications device is a wireless computing processor means, such as a PC.
23. A method of transmitting a data template structure between users, each user having a communications device for reception and transmission of said structure, said method comprising the steps of: a first user receiving said structure downloaded from a server means; transmitting the downloaded structure as a message from a communication device of said first user to a communication device of a second user; storing said structure on said communication device of said second user; wherein said second user is able to retrieve the stored structure, create a message by entering data in at least one field of said structure, said data identifying an audio sequence to be heard by a further user, and transmitting the created message to said further user.
24. A data menu structure for display on a display means of a communications device to allow a user to construct a message to be sent to a recipient; said structure comprising: at least one field that permits selection of options from menu options and permits entry of data against a selected menu option by said user, said data being representative of characteristics of said message and said recipient; wherein at least one of said selected menu options or data entries provides a string of text identifying an audio sequence, such that on receipt of said message by said recipient said audio sequence is heard by said recipient on a recipient communications device.
25. A structure according to claim 24 wherein on completion of the message by said user, said user transmits the message to a server means over a communications network, which server means processes the received message and converts the received message into an audio message and transmits the audio message to said recipient.
26. A method of displaying a data menu structure on a display means of a communications device of a user, comprising the steps of: receiving at a server means a request by said user to construct a message to be sent to a recipient; establishing an application session between said server means and said communications device; sending a menu structure from said server means to said communications device; selecting, by said user, one or more menu options whilst navigating said menu structure and entering data required by said user or said server means; transmitting navigation, selection and data entry commands from said communications device to said server means, said commands permitting entry of data representative of characteristics of said message and said recipient; wherein at least one set of entered data or selected data represents a string of text identifying an audio sequence to be heard by said recipient.
27. A method of transmitting a data menu structure between a first user and a second user respectively using a first user communications device and a second user communications device, said method comprising the steps of: said first user participating in an application session between a server means and said first user communications device; sending an application connection query, by said first user, to said second user communications device; sending a menu structure representative of said application selected by said first user from said server means to said second user communications device; wherein said second user is able to use the application menu structure to create a message by entering data or selecting a menu option in at least one field of said structure, said data identifying an audio sequence to be heard by a further user, and transmitting the created message to said further user.
28. A system for constructing and transmitting a message over a communications network from a sender to a recipient, said system comprising: a sender communications device on which said message is constructed by said sender by entry of data representative of characteristics of said message and a recipient identifier into said sender communications device; processing means linked to said communications network for receiving said message and converting the received message into an audio message and transmitting said audio message over said communications network to said recipient using a recipient identifier; and a recipient communications device for accessing said audio message.
29. A system according to claim 28, further comprising recipient decoder means for receiving the constructed message and separating said recipient identifier from said constructed message.
30. A system according to claim 28 or claim 29, wherein said processing means includes a message body decoder means for converting said message into said audio message.
31. A system according to claim 30, wherein said message body decoder means includes a parser for receiving said constructed message body and breaking said message body into words, punctuation marks etc., and forming a message construction sequence.
32. A system according to claim 31, wherein said message body decoder means further comprises a message builder for receiving a message construction sequence from said parser and constructing the message into an audio format.
33. A system according to claim 32, wherein said message builder is in communication with a number of modules one of which is a text to speech synthesiser which converts text into speech and is passed back as a waveform to said message builder.
34. A system according to claim 33, further including a waveform concatenator module for joining various waveform segments and for receiving pre-recorded audio waveforms indexed by said parser for inclusion for the final audio message for delivery to said recipient.
35. A system according to claim 34, further comprising a waveform mixer module for mixing background audio into said final audio message, or a waveform segment.
36. A system according to claim 35, further including a waveform effects filter module for adding particular sound effects to said final audio message such that the acoustic properties of the sound effects and therefore of the audio message is modified to simulate a particular effect.
37. A system according to any one of claims 32 to 36 wherein said audio message resulting from said message builder is combined with said recipient identifier before being made available for said recipient.
38. A system according to any one of claims 28 to 37, wherein said recipient identifier is the name of the recipient.
39. A system according to any one of claims 28 to 37, wherein said recipient identifier is the telephone number of the recipient.
40. A system according to any one of claims 28 to 39, wherein said sender communications device is a mobile telephone.
41. A system according to any one of claims 28 to 39, wherein said sender communications device is a mobile computing processor, such as a wireless PC.
42. A system according to any one of the previous claims wherein said audio message transmitted to said recipient includes information as to how to construct and send a message or information on features of the system or data template structure.
43. A system according to claim 42, wherein said information is tailored for the recipient depending on said recipient's level of experience with the system.
44. A system according to any one of the previous claims wherein a message sent to said recipient contains information on features of the system or on how to create and send a text message converted into an audio message to a further recipient.
45. A system according to any one of claims 42 to 44, wherein said information is provided as a tutorial to said recipient and said recipient may respond interactively using keys of said recipient communications device.
46. A system for transmitting a converted message over a communications network from a sender to a recipient, said system comprising: data entry means for entering a text-based message in a free-form style; means for converting said freeform text-based message into an audio message to be sent as said converted message, to said recipient over said communications network.
47. A system according to claim 46 wherein said converting means comprises means for controlling how said text-based message is converted into said audio message.
48. A system according to claim 47 wherein said text-based message includes a control sequence to indicate to said conversion means that a designated portion of said audio message is to change to a particular voice and/or include a sound effect.
49. A system according to claim 48 wherein said text-based message includes an attribute sequence used in conjunction with said control sequence to attribute a characteristic of the particular voice and/or sound effect.
50. A system according to claim 48 or claim 49 wherein said voice is that of a famous character or a voice known to said recipient.
51. A system according to claim 49 wherein said attribute sequence comprises ASCII characters.
52. A system according to claim 51 wherein said attribute sequence or said control sequence is an SMS code.
53. A system according to any one of claims 46 to 52 wherein said text-based message is an SMS message.
54. A method of transmitting a converted message over a communications network from a sender to a recipient, said method comprising the steps of: entering a text-based message in a freeform style into a data entry means; converting the freeform text-based message into an audio message; and transmitting said audio message as said converted message to said recipient over said communications network.
55. A method according to claim 54 further comprising the step of controlling how said text-based message is converted into said audio message.
56. A method according to claim 55 wherein said controlling step involves using a control sequence to indicate that a designated portion of said audio message is to change to a particular voice and/or include a sound effect.
57. A method according to claim 56 wherein said controlling step further involves using an attribute sequence associated with said control sequence to attribute a characteristic of said particular voice and/or said sound effect.
PCT/AU2002/000335 2001-03-19 2002-03-19 System and method of converting a message and transmitting the converted message WO2002076120A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AUPR3804 2001-03-19
AUPR3804A AUPR380401A0 (en) 2001-03-19 2001-03-19 Data template structure

Publications (1)

Publication Number Publication Date
WO2002076120A1 true WO2002076120A1 (en) 2002-09-26

Family

ID=3827812

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2002/000335 WO2002076120A1 (en) 2001-03-19 2002-03-19 System and method of converting a message and transmitting the converted message

Country Status (2)

Country Link
AU (1) AUPR380401A0 (en)
WO (1) WO2002076120A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1411736A1 (en) * 2002-10-14 2004-04-21 Swisscom AG System and method for converting text messages prepared with a mobile equipment into voice messages
WO2008070094A2 (en) 2006-12-05 2008-06-12 Nuance Communication, Inc. Wireless server based text to speech email

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111768755A (en) * 2020-06-24 2020-10-13 华人运通(上海)云计算科技有限公司 Information processing method, information processing apparatus, vehicle, and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0981252A2 (en) * 1998-08-19 2000-02-23 Lucent Technologies Inc. Using discrete message-oriented services to deliver short audio communications
EP1003344A2 (en) * 1998-11-20 2000-05-24 Nortel Networks Corporation Simultaneous text and audio transmission for sponsored calls
DE19856440A1 (en) * 1998-12-08 2000-06-15 Bosch Gmbh Robert Transmission frame and radio unit with transmission frame

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1411736A1 (en) * 2002-10-14 2004-04-21 Swisscom AG System and method for converting text messages prepared with a mobile equipment into voice messages
WO2008070094A2 (en) 2006-12-05 2008-06-12 Nuance Communication, Inc. Wireless server based text to speech email
EP2095250A2 (en) * 2006-12-05 2009-09-02 Nuance Communication, Inc. Wireless server based text to speech email
EP2095250A4 (en) * 2006-12-05 2010-12-15 Nuance Communication Inc Wireless server based text to speech email

Also Published As

Publication number Publication date
AUPR380401A0 (en) 2001-04-12

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 2004-01-01)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP

WWW WIPO information: withdrawn in national office

Country of ref document: JP