GB2406470A - Display of facial poses associated with a message to a mobile - Google Patents

Display of facial poses associated with a message to a mobile

Info

Publication number
GB2406470A
Authority
GB
United Kingdom
Prior art keywords
message
mobile communication
communication device
operable
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0322503A
Other versions
GB0322503D0 (en)
Inventor
Simon Michael Rowe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Technology Europe Ltd
Original Assignee
Canon Research Centre Europe Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Research Centre Europe Ltd filed Critical Canon Research Centre Europe Ltd
Priority to GB0322503A priority Critical patent/GB2406470A/en
Publication of GB0322503D0 publication Critical patent/GB0322503D0/en
Publication of GB2406470A publication Critical patent/GB2406470A/en
Withdrawn legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12 Messaging; Mailboxes; Announcements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/205 3D [Three Dimensional] animation driven by audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/72427 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting games or graphical animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. SMS or e-mail
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/18 Information format or content conversion, e.g. adaptation by the network of the transmitted or received information for the purpose of wireless delivery to users or terminals

Abstract

The words in a message are processed to determine corresponding facial expressions and a sequence of rendered images which may be displayed on a mobile communication device. Mobile communication devices 2 send voice and text messages to each other via a server 4. Server 4 may process each message and animate a 3D computer model of a head so that the face of the model speaks the words in the message. The message may be sent to the destination mobile communication device together with rendered images of the 3D computer model defining a sequence of images showing the face of the model speaking the message. The recipient mobile communication device outputs the voice or text of the message in synchronism with the display of the image sequence so that the face in the images appears to speak the words of the message. Little processing capacity is required by either the message-generating mobile communication device or the recipient mobile communication device because animation of the computer model is carried out by the server 4.

Description

MESSAGING SYSTEM FOR MOBILE COMMUNICATION APPARATUS
The present invention relates to the transmission of messages to a mobile (wireless) communication apparatus, such as a mobile telephone, personal digital assistant (PDA) or cordless telephone (for example a DECT telephone).
The transmission of text, voice and image messages to a mobile communication device (that is, a one-way or simplex communication as opposed to a real-time two-way interactive telephone conversation) is well known.
However, the content of the message that can be communicated is restricted by the processing resources of one or both of the receiving and sending mobile communication devices.
The present invention has been made with this problem in mind.
According to the present invention, there is provided a mobile communication messaging method and system in which a text or voice message is processed to animate a three-dimensional computer model of a face to speak the words of the message and to generate images of the three-dimensional computer model speaking the words. The message data and image data are then sent to the destination mobile communication device, where the images are displayed as the message is output so that an image sequence of a face speaking the words of the message is seen by the user.
By processing a 3D computer model to generate rendered images and sending the rendered images to the recipient mobile communication device, the recipient mobile communication device can display an image sequence in which a face appears to speak the associated voice or text message without significant processing resources being required in the recipient mobile communication device.
In addition, the features facilitate the transmission and output of a voice or text message accompanied by images of the face of the person who sent the message speaking the message.
The processing may be carried out by a server so that no significant processing resources are required in either the message-generating mobile communication device or the message-receiving mobile communication device.
The rendered images sent to the mobile communication device may be generated from different viewing directions. In this way, the appearance that the face is three-dimensional is enhanced when the images are displayed.
Instead of processing a 3D computer model and sending the rendered images to a recipient mobile communication device, the processing may be performed in the recipient mobile communication device itself.
Accordingly, the present invention also provides a mobile communication messaging method and system in which a text or voice message is received by a mobile communication device, which then animates a three-dimensional computer model of a face and renders images of the computer model to generate a sequence of images of the face speaking the words of the message. The rendered images are then displayed to the user as the message is output.
Such a system requires the message-receiving mobile communication device to have sufficient processing resources to process the 3D computer model and generate the rendered images, but not the message-generating mobile communication device.
The present invention also provides a computer program product, embodied for example as a storage device carrying instructions or a signal carrying instructions, comprising instructions for programming a programmable processing apparatus to become operable to perform a method as set out above or to become configured as an apparatus as set out above.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 schematically shows the components of a first embodiment of the invention, together with the notional functional processing units into which the processing apparatus components may be thought of as being configured when programmed by computer program instructions;
Figure 2, comprising Figures 2a, 2b and 2c, shows the processing operations performed by the apparatus in the system of Figure 1 in the first embodiment;
Figure 3, comprising Figures 3a and 3b, shows the processing operations performed by the apparatus in the system of Figure 1 in a second embodiment;
Figure 4 schematically shows the components of a mobile communication device in a third embodiment of the invention, together with the notional functional processing units into which the mobile communication device may be thought of as being configured when programmed by computer program instructions; and
Figure 5, comprising Figures 5a and 5b, shows the processing operations performed by mobile communication devices in the third embodiment.
First Embodiment

Referring to Figure 1, a first embodiment of the present invention comprises a plurality of mobile communication devices 2 operable to send voice and text messages to each other via a server 4 by transmitting signals 6 via a communication network, which, in this embodiment, comprises a plurality of conventional base stations 8 and the Internet 10 but may comprise other forms of communication network.
A personal computer (PC) 5 is also provided for sending a voice or text message to one or more of the mobile communication devices 2 via the server 4.
As will be explained in more detail below, server 4 is arranged to process each message and to animate a three dimensional (3D) computer model of a head (which may be the head of the user of the mobile communication device from which the message was sent) so that the face of the computer model speaks the words in the message. The message is then sent from the server to the destination mobile communication device(s) together with rendered images of the 3D computer model generated by server 4 defining a sequence of images showing the face of the model speaking the message. At the recipient mobile communication device(s) the voice or text of the message is output in synchronism with the display of the images in the sequence so that the face in the images appears to speak the words of the message to the user. Very little processing capacity is required by either the message-generating mobile communication device or the recipient mobile communication device for this form of messaging because all processing of the 3D computer model is carried out by the server 4.
Each mobile communication device 2 comprises a cellular mobile telephone, a personal digital assistant (PDA) operable to communicate as a mobile telephone, or a cordless telephone such as a DECT telephone (in which case the corresponding base station 8 may be located in the user's building). Each mobile communication device 2 has a display 12 for the display of text and a speaker (not shown) for the output of sound.
In this embodiment, each mobile communication device 2 is operable to transmit and receive data.
The transmission of signals 6 between each mobile communication device 2 and a base station 8 is wireless.
The transmission of signals 6 between base stations 8 and server 4 may be wireless and/or along electrical or optical cable.
In addition to conventional components, each mobile communication device 2 stores computer program instructions defining a face messaging application 14.
When executed by a processor in the mobile communication device 2, the computer program instructions of the face messaging application provide an interface 16 to allow the user to compose a text or voice message, together with either an image of the face of a person who is to appear to speak the message when it is received by the recipient mobile communication device or, instead of image data, data identifying a 3D computer model of a face already existing at the server 4, and to transmit the data to the server 4.
The computer program instructions of the face messaging application also define a viewer 18 when executed by a processor in the mobile communication device 2. The viewer 18 controls the mobile communication device 2 to display a received text message on display 12 or to output a received voice message via a speaker (not shown) and in synchronism with the output of the message to display face images received from server 4 on display 12.
The computer program instructions defining the face messaging application 14 may be provided within a mobile communication device 2 by the manufacturer of the device, or may be downloaded as a signal 6 for example by telephoning the server 4 or an alternative apparatus (not shown) storing the computer program instructions for download.
One of the mobile communication devices 2 shown in Figure 1 is operable to receive digital image data, for example from a camera 20. The digital camera 20 may be separate from the mobile communication device 2 or may be integral therewith.
Voice and/or text messages may also be sent to a mobile communication device 2 using PC 5. More particularly, PC 5 connects with server 4 via a web page (not shown) hosted by server 4 or a different server (not shown), which makes available the functionality of interface 16 of face messaging application 14 to the user of PC 5.
PC 5 is operable to receive digital image data, for example from a digital camera 20, for transmission to server 4.
Server 4 is provided by a mobile communication service provider, an Internet service provider or a third party.
In this embodiment, server 4 comprises a programmable processing apparatus containing, in a conventional manner, one or more processors, memories, graphics card, etc. Server 4 is programmed to operate in accordance with programming instructions input, for example, as data stored on a data storage medium 24 (such as optical CD ROM, semiconductor ROM, magnetic recording medium, etc), and/or as a signal 26 (for example an electrical or optical signal input to the server 4, for example from a remote database, by transmission over a communication network such as the Internet or by transmission through the atmosphere), and/or entered by a user via a user input device such as a keyboard. The programming instructions for server 4 may be supplied either in compiled, computer-executable format or in a format (such as source code) for conversion to a compiled format.
When programmed by the programming instructions, server 4 can be thought of as being configured as a number of functional units for performing processing operations and a number of data stores. Examples of such functional units and data stores are shown in Figure 1. The functional units and data stores illustrated in Figure 1 are, however, notional, and are shown for illustration purposes only to assist understanding; they do not necessarily represent units and data stores into which the processor, memory, etc. of server 4 actually become configured.
Referring to the functional units shown in Figure 1, interface module 30 is operable to control communication of the server 4 with Internet 10, base stations 8, mobile communication devices 2 and PC 5.
Text data store 32 is configured to store the text data of a text message received from a mobile communication device 2 or PC 5.
Voice data store 34 is configured to store the voice data of a voice message received from a mobile communication device 2 or PC 5.
Image data store 36 is configured to store image data showing a face received from a mobile communication device 2 or PC 5.
Face model generation module 38 is operable to process image data from image data store 36 to generate a 3D computer model of the face shown in the image data.
Face model store 40 is configured to store 3D computer face models generated by face model generation module 38 and pre-generated 3D computer face models. In this embodiment, face model store 40 stores face models comprising 3D computer models of celebrity faces in addition to each 3D computer model of a face of a user generated by face model generation module 38.
Viseme identification module 42 is operable to process the text message data stored in text data store 32 or the voice message data stored in voice data store 34 to identify the visemes of the words in the message, that is, the visually distinguishable speech postures of a face speaking the phonemes of the words in the message.
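By way of illustration only (the patent does not give its mapping), a minimal Python sketch of such viseme identification might look as follows; the phoneme-to-viseme table and the placeholder grapheme-to-phoneme step are assumptions:

```python
# Hypothetical sketch of viseme identification for a text message.
# Many phonemes share one visually distinguishable mouth posture (viseme).
PHONEME_TO_VISEME = {
    "p": "closed_lips", "b": "closed_lips", "m": "closed_lips",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "aa": "open_jaw", "ae": "open_jaw",
    "iy": "spread_lips", "uw": "rounded_lips",
}

def text_to_phonemes(word: str) -> list[str]:
    # Placeholder: a real system would use a pronunciation dictionary
    # or grapheme-to-phoneme model here.
    return [c for c in word.lower() if c in PHONEME_TO_VISEME]

def identify_visemes(message: str) -> list[str]:
    """Map each phoneme in the message to its speech posture (viseme)."""
    visemes = []
    for word in message.split():
        for phoneme in text_to_phonemes(word):
            visemes.append(PHONEME_TO_VISEME[phoneme])
    return visemes

print(identify_visemes("mama moves"))  # ['closed_lips', 'closed_lips', ...]
```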
Expression identification module 44 is operable to process the text message data stored in text data store 32 or the voice message data stored in voice data store 34 to identify expression information therein. For example, symbols may be included in a text message to convey expressions, such as :-( to convey "sad" and :-) to convey "happy". Expression identification module 44 is operable to process the text data to identify such symbols. Similarly, expression identification module 44 is operable to process voice message data stored in voice data store 34 to identify expressions conveyed by the way in which the words are spoken (such as happy, sad, laughter, etc).
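A minimal sketch of the text-side expression identification, assuming a simple symbol table (the symbol set shown is illustrative):

```python
import re

# Hypothetical sketch: detecting expression symbols (emoticons) in a text
# message, as expression identification module 44 is described as doing.
EMOTICON_EXPRESSIONS = {":-)": "happy", ":-(": "sad", ":-D": "laughter"}

def identify_expressions(text: str) -> list[tuple[int, str]]:
    """Return (character offset, expression) for each symbol found."""
    pattern = "|".join(re.escape(sym) for sym in EMOTICON_EXPRESSIONS)
    return [(m.start(), EMOTICON_EXPRESSIONS[m.group()])
            for m in re.finditer(pattern, text)]

print(identify_expressions("See you soon :-) unless it rains :-("))
# [(13, 'happy'), (33, 'sad')]
```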
Face model animation module 46 is operable to amend a 3D computer face model to provide the face with each speech posture and expression identified by viseme identification module 42 and expression identification module 44.
Face model rendering module 48 is operable to render images of the 3D computer face models as animated by face model animation module 46 from different viewing directions.
Text to speech conversion module 50 is operable to convert text message data stored in text data store 32 to voice message data if a received text message is to be converted to voice before being sent to a recipient mobile communication device 2.
Speech to text conversion module 52 is operable to convert voice message data stored in voice data store 34 to text message data if a received voice message is to be converted to text data before being sent to a recipient mobile communication device 2.
Billing module 54 is operable to charge the accounts of mobile communication device users for the provision of services by server 4.
Figure 2 shows the processing operations performed by two mobile communication devices 2 and server 4 during the transmission, receipt and output of a message.
In Figures 2a and 2b, processing operations on the left hand side of the dotted line are performed by a message originating mobile communication device 2, and processing operations on the right-hand side of the dotted line are performed by server 4. In Figure 2c, processing operation shown on the left-hand side of the dotted line are performed by server 4, and processing operations shown on the right-hand side of the dotted line are performed by each message-receiving mobile communication device 2.
At step S2-2, face messaging application 14 in the message-originating mobile communication device 2 generates a text or voice message in accordance with instructions from a user, together with the address of each recipient mobile communication device for the message and either image data of a face recorded by digital camera 20 or data identifying a 3D computer face model already existing in face model store 40 of server 4. The address of each recipient mobile communication device may comprise a telephone number, e-mail address, etc. The identification of an existing face model in face model store 40 within server 4 may comprise the telephone number of the message-originating mobile communication device 2. This telephone number uniquely identifies the message-originating mobile communication device 2 and may therefore be used by server 4 to identify a face model in face model store 40 previously generated using image data received from the message-originating mobile communication device.
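The data assembled at step S2-2 might be organised as below; all field names and values are hypothetical, chosen only to illustrate the description above:

```python
# A minimal sketch of the data a face messaging application might assemble
# at step S2-2. Either face image data or a face model identifier is
# supplied, not both.
message_payload = {
    "message_type": "text",              # or "voice"
    "message_data": "Happy birthday! :-)",
    "recipients": ["+447700900123"],     # telephone number, e-mail address, etc.
    "face_image": None,                  # image data recorded by digital camera 20, or
    "face_model_id": "+447700900456",    # data identifying a model in face model store 40
}
```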
In this embodiment, the message data, recipient address(es) and image data/face model identification data is transmitted from the message-originating mobile communication device 2 to server 4 as a multi-media messaging service (MMS) message or, if no image data is being sent and instead face model identification data is being sent, as a short messaging service (SMS) message.
It should be noted that the processing operations at step S2-2 may be performed by PC 5 instead of a mobile communication device 2. More particularly, the data may be compiled and sent to server 4 by accessing a website hosted by server 4 or a different processing apparatus (not shown) using PC 5.
At step S2-4, the transmitted data is received by server 4. If the message is a text message, then interface module 30 stores the text data in text data store 32.
Alternatively, if the message is a voice message, then interface module 30 stores the voice data in voice data store 34. Billing module 54 charges the account of the message-originating mobile communication device for the provision of services of the server 4 for subsequent processing and sending of the message.
At step S2-6, if the received data contains face image data, then face model generation module 38 processes the received image data to generate a 3D computer model of the face. This processing is performed using the method described in co-pending UK patent application 0209877.0 and co-pending US application 10/424919 (the full contents of which are incorporated herein by cross-reference), or, for example, as described in "A Morphable Model for the Synthesis of 3D Faces" by Blanz and Vetter, in SIGGRAPH'99 Conference Proceedings, pages 187-194 (1999). The resulting face model is stored in face model store 40 for subsequent use by the user of the message-originating mobile communication device 2.
Alternatively, if the received data defines a face model already existing in face model store 40, then interface module 30 retrieves the defined 3D computer face model from face model store 40.
At step S2-8, viseme identification module 42 processes the message data stored at step S2-4 to identify the phonemes making up the words in the message and the corresponding speech postures. This processing is performed in a conventional way, for example as described in "Computer Facial Animation" by Parke and Waters, published by A K Peters Limited, ISBN 1-56881-014-8, Section 8.2.
In addition, expression identification module 44 performs processing at step S2-8 to identify expression symbols in text message data or to infer expressions from the way in which words are spoken in voice message data. As mentioned previously, in the case of text data, expression identification module 44 processes the text to identify expression symbols such as :-) meaning "happy" and :-( meaning "sad". In the case of voice message data, expression identification module 44 performs processing as described in "You BEEP Machine -- Emotion in Automatic Speech Understanding Systems" by Huber et al in Proceedings of the Workshop on Text, Speech and Dialog (TSD'98), pages 223-228, Brno, 1998, Masaryk University.
At step S2-10, face model animation module 46 performs processing to pose the 3D computer face model generated or retrieved at step S2-6 to generate the next viseme and/or facial expression identified at step S2-8 (this being the first viseme and/or facial expression the first time step S2-10 is performed). This processing comprises amending the mouth of the 3D computer face model in the case of a viseme, and also other parts of the face, such as the eyes, eyebrows and forehead, in the case of facial expressions.
At step S2-12, face model rendering module 48 renders an image of the 3D computer face model generated at step S2-10 from the next predetermined viewing direction (this being the first predetermined viewing direction the first time step S2-12 is performed).
At step S2-14, processing is carried out to determine whether any visemes or facial expressions remain to be generated.
Steps S2-10 to S2-14 are repeated until a rendered image of each viseme and facial expression in the message stored at step S2-4 has been generated. As a result, a sequence of images is generated, each showing a viseme, a facial expression or the combination of a viseme and facial expression, such that when the sequence is displayed, the face appears to speak the words of the message with the expressions intended by the user of the message-originating mobile communication device. By rendering the images from different viewing directions, it has been found that the face in the images appears to be significantly more three-dimensional than if all of the images are rendered from the same viewing direction.
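The loop of steps S2-10 to S2-14 might be sketched as follows; `pose_face_model` and `render` are stand-ins for face model animation module 46 and face model rendering module 48, and the viewing directions shown are illustrative assumptions:

```python
import itertools

VIEWING_DIRECTIONS = [(-10, 0), (0, 0), (10, 0)]  # (yaw, pitch) in degrees

def generate_image_sequence(face_model, poses, pose_face_model, render):
    """Render one image per identified viseme/expression pose."""
    direction_cycle = itertools.cycle(VIEWING_DIRECTIONS)
    images = []
    for pose in poses:                       # S2-10: apply the next pose
        posed = pose_face_model(face_model, pose)
        view = next(direction_cycle)         # vary the direction for a 3D effect
        images.append(render(posed, view))   # S2-12: render an image
    return images                            # loop ends when no poses remain (S2-14)
```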
At step S2-16, a message is generated in the required format for output to the defined recipient mobile communication device(s). More particularly, if a text message was received at step S2-4 together with instructions to send a speech message to the recipient mobile communication device(s), then processing is performed by text to speech conversion module 50 to convert the text data stored at step S2-4 to voice data.
Similarly, if a voice message was received at step S2-4 together with instructions to send a text message, then speech to text conversion module 52 performs processing to convert the voice message data to text message data.
At step S2-18, interface module 30 transmits the message data stored at step S2-4 (or, if format conversion has been performed, the message data generated at step S2-16) to the recipient mobile communication device(s) identified in the data received at step S2-4, together with the rendered images generated at step S2-12 and computer program instructions defining face messaging application 14.
The data transmitted at step S2-18 defines the sequence in which the rendered images are to be displayed and the times at which the rendered images are to be displayed in relation to the output of words in the accompanying message (if the message comprises voice data) or relative to the scrolling of text (if the message comprises text data).
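One possible (hypothetical) layout for this sequencing data; the field names and timings are illustrative assumptions:

```python
# Which rendered image to display, and when, relative to the output of the
# voice message (or the scrolling of a text message).
display_schedule = {
    "message_type": "voice",
    "frames": [
        {"image_index": 0, "show_at_ms": 0},    # first viseme as speech starts
        {"image_index": 1, "show_at_ms": 120},
        {"image_index": 2, "show_at_ms": 260},
        # ... one entry per rendered image in the sequence
    ],
}
```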
The data is transmitted at step S2-18 as a Wireless Application Protocol (WAP) PUSH or as a multi-media messaging service (MMS) message.
At step S2-20, interface module 30 determines whether the data has been successfully sent. Steps S2-18 and S2-20 are repeated until the data has been successfully sent to each recipient mobile communication device 2.
At step S2-22, interface module 30 sends a completion message to the message-originating mobile communication device 2, which receives and displays the message at step S2-24.
Referring to Figure 2c, at step S2-26, following data transmission from the server at step S2-18, the data is received by a message-receiving mobile communication device 2.
The computer program instructions of face messaging application 14 are executed by the recipient mobile communication device 2, and viewer 18 causes the rendered images to be displayed in sequence on display device 12 together with a scrolling text message on display device 12 (in the case of a text message) or the output of speech (in the case of a voice message). As noted previously, the data received at step S2-26 defines the rendered images to be displayed in synchronism with the scrolling text message or output speech, with the result that the displayed images appear to speak the words of the message.
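A minimal sketch of how viewer 18 might consume a schedule like the one sketched earlier, displaying each rendered image at its due time while the message is output; `show_image` is a stand-in for the device's display call:

```python
import time

def play_in_sync(images, schedule, show_image):
    """Display each rendered image at its scheduled offset from the start."""
    start = time.monotonic()
    for frame in schedule["frames"]:
        delay = frame["show_at_ms"] / 1000 - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)  # wait until this frame falls due
        show_image(images[frame["image_index"]])
```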
Second Embodiment

A second embodiment of the present invention will now be described.
In the first embodiment, the message-originating mobile communication device 2 transmits data to server 4, which processes the data and transmits the message to the message-receiving mobile communication device 2 together with rendered images of a 3D computer face model.
In the second embodiment, however, the message-originating mobile communication device 2 transmits a message directly to each message-receiving mobile communication device 2 (that is, the message is not sent via server 4).
Alternatively, the voice or text message is sent directly to the message-receiving mobile communication device 2 using PC 5 to connect to a conventional message-generating server (not shown) which causes the required message to be transmitted to the message-receiving mobile communication device 2.
The message-receiving mobile communication device 2 then contacts server 4 to obtain rendered images showing a face speaking the words of the message.
The functional processing units and data stores in the second embodiment are the same as those in the first embodiment, with the exception that the message-originating mobile communication device does not need to store face messaging application 14.
Figure 3 shows the processing operations performed in the second embodiment.
Referring to Figure 3, processing operations on the left hand side of the dotted line are performed by the message-receiving mobile communication device 2, while processing operations on the right-hand side of the dotted line are performed by server 4.
At step S3-2, the message-receiving mobile communication device 2 receives a voice or text message from the message-originating mobile communication device 2 or PC 5 in a conventional format such as an SMS message or MMS message.
At step S3-4, interface 16 of face messaging application 14 controls the message-receiving mobile communication device 2 to forward the received message to the server 4 together with data defining the display characteristics of display 12 of the message-receiving mobile communication device 2, and data defining a 3D computer face model stored in face model store 40 of server 4 or alternatively image data showing a face to be used to generate a new 3D computer face model.
In this embodiment, the data defining the display characteristics of display 12 comprises data defining the display size in pixels.
The data defining an existing 3D computer face model stored in face model store 40 may comprise the telephone number of the message-originating mobile communication device 2 if the user of the message-originating mobile communication device has previously sent a face image to server 4 for the generation of a 3D computer face model therefrom. Alternatively, the data may define a 3D computer face model of a celebrity, etc., stored in face model store 40.
At step S3-6, the data transmitted from the message-receiving mobile communication device 2 is received by server 4, the text message data is stored in text data store 32 or the voice message data is stored in voice data store 34, and billing module 54 charges the account of the message-receiving mobile communication device 2.
At steps S3-8 and S3-10, processing corresponding to that performed in the first embodiment at steps S2-6 to S2-8 is performed. As this processing has been described above, it will not be described again here.
At step S3-12, face model rendering module 48 reads the data defining the number of pixels in the display 12 of mobile communication device 2.
At steps S3-14 to S3-20, processing is performed corresponding to that performed in the first embodiment at steps S2-10 to S2-16. The only difference is that face model rendering module 48 renders each image at step S3-16 in such a way that the face has a size to substantially fill a display having the number of pixels determined at step S3-12 (or to substantially fill a predetermined part of the display if a text message is also to be displayed).
In this way, the rendering is tailored to the display of the message-receiving mobile communication device 2.
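A sketch of such display-size tailoring, assuming the display characteristics comprise the pixel dimensions as described; the 60/40 split between face and scrolling text is an illustrative assumption:

```python
def face_viewport(display_w: int, display_h: int, with_text: bool):
    """Return the pixel region the rendered face should substantially fill."""
    if with_text:
        face_h = int(display_h * 0.6)   # reserve the remainder for scrolling text
    else:
        face_h = display_h              # voice message: use the whole display
    return display_w, face_h

print(face_viewport(128, 160, with_text=True))   # (128, 96)
```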
As the other processing operations performed at steps S3-14 to S3-20 have been described above with reference to steps S2-10 to S2-16 in the first embodiment, they will not be described again here.
At step S3-22, interface module 30 sends the message and rendered images back to the message-receiving mobile communication device 2. This transmitted data has the form of a WAP PUSH or an MMS message. As in the first embodiment, the transmitted data defines the order in which the rendered images are to be displayed and the time at which each rendered image is to be displayed relative to the message content so that the displayed images appear to speak the words of the message when the message is output or displayed to the user of the message-receiving mobile communication device 2.
At step S3-24, the data transmitted at step S3-22 is received by the message-receiving mobile communication device, and interface 16 controls the device to output voice message data via a speaker or to scroll text message data on display 12 while at the same time displaying the rendered images. As a result, the rendered images displayed to the user of the message-receiving mobile communication device 2 show a face speaking the words of the message as the words are output from the speaker or displayed on display 12.
As a modification of the second embodiment, the message-receiving mobile communication device 2 may retain the message data received from the message-originating mobile communication device 2 (or a copy thereof) as well as forwarding a copy of the message data (or the original message data) to the server 4 at step S3-4. In this way, it is unnecessary for the server 4 to return the message data at step S3-22; instead, only the rendered image data need be sent by the server at this step. The message data retained by the message-receiving mobile communication device 2 can then be output through a speaker (in the case of voice data) or displayed (in the case of text data) in synchronism with the received rendered images at step S3-24.
Third Embodiment

A third embodiment of the present invention will now be described.
In the first and second embodiments, the processing to generate the rendered images for display on the message-receiving mobile communication device 2 is performed by server 4. However, in the third embodiment, these processing operations are performed by the message-originating mobile communication device 2 or the message-receiving mobile communication device 2. In this case, it is necessary for one, but not both, of the mobile communication devices 2 to have sufficient processing capacity to perform the processing operations necessary to generate the rendered images.
Referring to Figure 4, each mobile communication device 2 that is to generate rendered images is programmed by computer program instructions to have functional units and data stores comprising interface 16, viewer 18, text data store 32, voice data store 34, image data store 36, face model generation module 38, face model store 40, viseme identification module 42, expression identification module 44, face model animation module 46, face model rendering module 48, text to speech conversion module 50 and speech to text conversion module 52. Each of these functional units and data stores has been described previously and accordingly will not be described again here.
The processing operations performed in the third embodiment will now be described with reference to Figure 5. By way of example, these processing operations will be described assuming that the message-originating mobile communication device performs processing to generate rendered images for display on the message-receiving mobile communication device. However, corresponding processing operations to generate the rendered images may be performed by the message-receiving mobile communication device to process a received message instead of the message-originating mobile communication device (in which case the rendered images can be tailored to the display size of display 12 on the message-receiving mobile communication device, as in the second embodiment).
Referring to Figure 5, processing operations on the left hand side of the dotted line are performed by the message-originating mobile communication device 2, while processing operations on the right-hand side of the dotted line are performed by the message-receiving mobile communication device 2.
At step S5-2, face model generation module 38 within the message-originating mobile communication device processes input image data to generate a 3D computer model of a face (in the same way that processing is performed at step S2-6 in the first embodiment). Alternatively, an existing 3D computer face model is retrieved from face model store 40 in accordance with input instructions from a user.
At steps S5-4 to S5-12, processing operations are performed corresponding to the processing operations performed at steps S2-8 to S2-16 in the first embodiment.
As these processing operations have been described above, they will not be described again here.
At step S5-14, interface 16 of the message-originating mobile communication device 2 transmits data containing the voice or text message and the rendered images generated at step S5-8 to the message-receiving mobile communication device 2.
At step S5-16, interface 16 checks that the data has been sent successfully and steps S5-14 to S5-16 are repeated until it is determined that the data has been successfully transmitted.
At step S5-18, interface 16 causes a completion message to be displayed on display 12 of the message-originating mobile communication device 2.
At step S5-20, the data transmitted at step S5-14 is received by the message-receiving mobile communication device 2. Voice data is output through a speaker or text data is scrolled on display 12, while at the same time the rendered images are displayed in sequence, so that the face in the rendered images appears to speak the words of the message as they are output/displayed.
Modifications and Variations

Many modifications and variations can be made to the embodiments described above within the scope of the accompanying claims.
For example, in the first and second embodiments described above, the functional components and data stores of server 4 are described as part of a single apparatus. However, the functional components and data stores may be provided in a number of separate apparatus which act together to constitute server 4.
In the embodiments described above, processing is performed to identify every speech posture for a message and to render an image of the 3D computer face model having every identified face posture. However, instead, some face postures may be omitted and rendered images may be generated for only the important face postures.
In the embodiments described above, expression identification module 44 is provided and processing is performed to identify facial expressions for the message and to render images of the 3D computer face model having the identified facial expressions. However, expression identification module 44, the processing to identify facial expressions and the processing to render images of the 3D computer model having the identified facial expressions may be omitted.
If expression identification module 44 and the associated processing operations are present, the processing may be amended to include word searching to identify words such as "yes", "no", etc. in a text message or voice message and to render images of the 3D computer face model performing corresponding head movements, such as a nod or a shake of the head.
In the first embodiment described above, each mobile communication device 2 could pre-register details of its screen size with the server 4, which would then be stored in the server 4. In this way, when a message is received by server 4, the rendering at step S2-12 can be performed to generate a facial image which will substantially fill the screen (or a predetermined part thereof) of the intended message-receiving mobile communication device 2.
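Returning to the word-searching modification above, a minimal sketch; the trigger-word lists are illustrative assumptions, not taken from the patent:

```python
HEAD_MOVEMENTS = {
    "nod": {"yes", "yeah", "ok", "sure"},
    "shake": {"no", "nope", "never"},
}

def identify_head_movements(message: str) -> list[tuple[int, str]]:
    """Return (word index, movement) for each trigger word in the message."""
    movements = []
    for i, word in enumerate(message.lower().split()):
        for movement, triggers in HEAD_MOVEMENTS.items():
            if word.strip(".,!?") in triggers:
                movements.append((i, movement))
    return movements

print(identify_head_movements("No, I will not. Yes you will!"))
# [(0, 'shake'), (4, 'nod')]
```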
In the first embodiment described above, server 4 may maintain a database of the mobile communication devices 2 which have previously received face messaging application software from the server 4. In this way, server 4 may check the database and omit the face messaging application software from the data transmitted at step S2- 18 if the intended message-receiving mobile communication device has previously received the software.
In the embodiments described above, in the case of a text message, the data transmitted at steps S2-18, S3-22 and S5-14 causes the text message to be scrolled on the display 12 of the message-receiving mobile communication device in synchronism with the display of the rendered images. Instead, the data may cause the text message to be statically displayed while the rendered images are displayed on display 12.
In the embodiments described above, the functional processing units of the server 4 and the face messaging application 14 on each mobile communication device 2 are defined by programming instructions. However, the processing of some, or all, of the functional processing units could, of course, be performed using hardware or a combination of hardware and programming instructions.
Other modifications are, of course, possible.

Claims (40)

1. A method of generating a communication signal carrying a message for receipt by a mobile communication device, the method comprising: processing data defining a message containing words to determine at least some of the respective speech postures of a mouth speaking the words; posing at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and causing a signal to be generated conveying image data defining the sequence of rendered images for receipt by a mobile communication device.
2. A method according to claim 1, wherein: the data defining the message is processed to determine facial expressions in addition to the speech postures; and the three-dimensional computer model of the face is posed in accordance with at least some of the determined facial expressions as well as the determined speech postures, and images are rendered to generate a sequence of images showing the face having the speech postures and the facial expressions.
3. A method according to claim 1 or claim 2, wherein at least some images in the sequence are rendered from different viewing directions.
4. A method according to any preceding claim, wherein each image is rendered in dependence upon the size of display of the mobile communication device to which the images are to be sent.
5. A method according to any preceding claim, wherein a text message is processed to determine the at least some of the respective speech postures.
6. A method according to any of claims 1 to 4, wherein a voice message is processed to determine the at least some of the respective speech postures.
7. A method according to any preceding claim, wherein the signal for receipt by the mobile communication device is generated to convey data defining the message in addition to the image data.
8. A method according to claim 7, wherein the signal for receipt by the mobile communication device is generated to convey a voice message.
9. A method according to claim 8, wherein the signal is generated to convey data to cause the receiving mobile communication device to output the voice message via a speaker and to display the rendered images substantially in synchronism therewith so that the displayed images show a face speaking the words as the words are output.
10. A method according to claim 7, wherein the signal for receipt by the mobile communication device is generated to convey a text message.
11. A method according to claim 10, wherein the signal is generated to convey data to cause the receiving mobile communication device to display the message as scrolling text and to display the rendered images substantially in synchronism therewith so that the displayed images show a face speaking the words as the words scroll on the display.
12. A method according to any preceding claim when performed by a mobile communication device.
13. A method according to any of claims 1 to 11 when performed by a server to process data defining a message received from a first mobile communication device to generate the signal conveying the image data for receipt by a second mobile communication device.
14. A method according to claim 13, further comprising charging an account associated with the first mobile communication device.
15. A method according to any of claims 1 to 11 when performed by a server to process data defining a message received from a mobile communication device to generate the signal conveying the image data for receipt by the same mobile communication device.
16. A method according to claim 15, further comprising charging an account associated with the mobile communication device.
17. A method of processing a message received from a mobile communication device, the method comprising: processing data defining a message containing words to determine at least some of the respective speech postures of a mouth speaking the words; posing at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and outputting the message while at the same time displaying the rendered images in sequence.
18. A method of sending a message between mobile communication devices, the method comprising: sending a message containing words and data defining a recipient mobile communication device for the message from a first mobile communication device to a server; processing the received message at the server to: - determine at least some of the respective speech postures of a mouth speaking the words in the message; - pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and - cause a signal to be sent to the defined recipient mobile communication device conveying data defining the message and image data defining the sequence of rendered images; and receiving the signal at the recipient mobile communication device and outputting the message to a user while at the same time displaying the rendered images in sequence.
19. A method of sending a message between mobile communication devices, the method comprising: sending a message containing words from a first mobile communication device to a second mobile communication device; receiving the message at the second mobile communication device and forwarding the message to a server; processing the received message at the server to: - determine at least some of the respective speech postures of a mouth speaking the words in the message; - pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and cause a signal to be sent to the second mobile communication device conveying image data defining the sequence of rendered images; and receiving the signal at the second mobile communication device and outputting the message to a user while at the same time displaying the rendered images in sequence.
20. Apparatus for generating a communication signal carrying a message for receipt by a mobile communication device, the apparatus comprising: a viseme identifier operable to process data defining a message containing words to determine at least some of the respective speech postures of a mouth speaking the words; a three-dimensional computer model animator operable to pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture; a renderer operable to render images of the three dimensional computer model to generate a sequence of rendered images showing the face in each determined speech posture; and a signal generator, operable to cause a signal to be generated conveying image data defining the sequence of rendered images for receipt by a mobile communication device.
21. Apparatus according to claim 20, further comprising an expression identifier operable to process the data defining the message to determine facial expressions, and wherein: the three-dimensional computer model animator is operable to pose the three-dimensional computer model of the face in accordance with at least some of the determined facial expressions as well as the determined speech postures; and the renderer is operable to render images to generate a sequence of images showing the face having the speech postures and the facial expressions.
22. Apparatus according to claim 20 or claim 21, wherein the renderer is operable to render images from different viewing directions.
23. Apparatus according to any of claims 20 to 22, wherein the renderer is operable to render each image in dependence upon the size of display of the mobile communication device to which the images are to be sent.
24. Apparatus according to any of claims 20 to 23, wherein the viseme identifier is operable to process text message data.
25. Apparatus according to any of claims 20 to 24, wherein the viseme identifier is operable to process voice message data.
26. Apparatus according to any of claims 20 to 25, wherein the signal generator is operable to cause a signal to be generated for receipt by the mobile communication device conveying data defining the message in addition to the image data.
27. Apparatus according to claim 26, wherein the signal generator is operable to cause a signal to be generated for receipt by the mobile communication device conveying voice message data.
28. Apparatus according to claim 27, wherein the signal generator is operable to cause a signal to be generated conveying data to cause the receiving mobile communication device to output the voice message via a speaker and to display the rendered images substantially in synchronism therewith so that the displayed images show a face speaking the words as the words are output.
29. Apparatus according to claim 26, wherein the signal generator is operable to cause a signal to be generated for receipt by the mobile communication device conveying text message data.
30. Apparatus according to claim 29, wherein the signal generator is operable to cause a signal to be generated conveying data to cause the receiving mobile communication device to display the message as scrolling text and to display the rendered images substantially in synchronism therewith so that the displayed images show a face speaking the words as the words scroll on the display.
31. Apparatus according to any of claims 20 to 30, wherein the apparatus is a mobile communication device.
32. Apparatus according to any of claims 20 to 30, wherein the apparatus is a server operable to process data defining a message received from a first mobile communication device to cause a signal to be generated conveying the image data for receipt by a second mobile communication device.
33. Apparatus according to claim 32, further comprising a billing module operable to charge an account associated with the first mobile communication device.
34. Apparatus according to any of claims 20 to 30, wherein the apparatus is a server operable to process data defining a message received from a mobile communication device to cause a signal to be generated conveying the image data for receipt by the same mobile communication device.
35. Apparatus according to claim 34, further comprising a billing module operable to charge an account associated with the mobile communication device.
36. Apparatus for processing a message received from a mobile communication device, the apparatus comprising: a viseme identifier operable to process data defining a message containing words to determine at least some of the respective speech postures of a mouth speaking the words; a three-dimensional computer model animator operable to pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture; a renderer operable to render an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and an output controller operable to output the message while at the same time displaying the rendered images in sequence.
37. A system for sending a message between mobile communication devices, the system comprising: a first mobile communication device operable to send a message containing words and data defining a recipient mobile communication device for the message to a server; a server operable to process the received message to: - determine at least some of the respective speech postures of a mouth speaking the words in the message; - pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and - cause a signal to be sent to the defined recipient mobile communication device conveying data defining the message and image data defining the sequence of rendered images; and a second mobile communication device operable to receive the signal and to output the message to a user while at the same time displaying the rendered images in sequence.
38. A system for sending a message between mobile communication devices, the system comprising: a first mobile communication device operable to send a message containing words to a second mobile communication device; a second mobile communication device operable to receive the message and to send the message to a server; and a server operable to process the received message to: - determine at least some of the respective speech postures of a mouth speaking the words in the message; - pose at least the mouth of a three-dimensional computer model of a face in accordance with each respective determined speech posture and rendering an image thereof to generate a sequence of rendered images showing the face in each determined speech posture; and - cause a signal to be sent to the second mobile communication device conveying image data defining the sequence of rendered images; wherein the second mobile communication device is further operable to receive the signal and to output the message to a user while at the same time displaying the rendered images in sequence.
39. A storage medium storing computer program instructions for programming a programmable processing apparatus to become operable to perform a method as set out in at least one of claims 1 to 17.
40. A signal carrying computer program instructions for programming a programmable processing apparatus to become operable to perform a method as set out in at least one of claims 1 to 17.
GB0322503A 2003-09-25 2003-09-25 Display of facial poses associated with a message to a mobile Withdrawn GB2406470A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0322503A GB2406470A (en) 2003-09-25 2003-09-25 Display of facial poses associated with a message to a mobile

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0322503A GB2406470A (en) 2003-09-25 2003-09-25 Display of facial poses associated with a message to a mobile

Publications (2)

Publication Number Publication Date
GB0322503D0 GB0322503D0 (en) 2003-10-29
GB2406470A true GB2406470A (en) 2005-03-30

Family

ID=29286845

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0322503A Withdrawn GB2406470A (en) 2003-09-25 2003-09-25 Display of facial poses associated with a message to a mobile

Country Status (1)

Country Link
GB (1) GB2406470A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006048681A1 (en) * 2004-11-08 2006-05-11 Simon Watson Visual messaging system
WO2008096099A1 (en) * 2007-02-05 2008-08-14 Amegoworld Ltd A communication network and devices for text to speech and text to facial animation conversion
GB2574098A (en) * 2018-03-26 2019-11-27 Orbital Media And Advertising Ltd Interactive systems and methods

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6181351B1 (en) * 1998-04-13 2001-01-30 Microsoft Corporation Synchronizing the moveable mouths of animated characters with recorded speech
US20030035412A1 (en) * 2001-07-31 2003-02-20 Xuejun Wang Animated audio messaging
JP2003178319A (en) * 2001-12-13 2003-06-27 Plaza Create Co Ltd Data transceiver, terminal and image forming method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6181351B1 (en) * 1998-04-13 2001-01-30 Microsoft Corporation Synchronizing the moveable mouths of animated characters with recorded speech
US20030035412A1 (en) * 2001-07-31 2003-02-20 Xuejun Wang Animated audio messaging
JP2003178319A (en) * 2001-12-13 2003-06-27 Plaza Create Co Ltd Data transceiver, terminal and image forming method

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006048681A1 (en) * 2004-11-08 2006-05-11 Simon Watson Visual messaging system
WO2008096099A1 (en) * 2007-02-05 2008-08-14 Amegoworld Ltd A communication network and devices for text to speech and text to facial animation conversion
GB2459073A (en) * 2007-02-05 2009-10-14 Amegoworld Ltd A communication network and devices for text to speech and text to facial animation conversion
GB2459073B (en) * 2007-02-05 2011-10-12 Amegoworld Ltd A communication network and devices
RU2488232C2 (en) * 2007-02-05 2013-07-20 Амеговорлд Лтд Communication network and devices for text to speech and text to facial animation conversion
GB2574098A (en) * 2018-03-26 2019-11-27 Orbital Media And Advertising Ltd Interactive systems and methods
GB2581943A (en) * 2018-03-26 2020-09-02 Orbital Media And Advertising Ltd Interactive systems and methods
GB2574098B (en) * 2018-03-26 2020-09-30 Orbital Media And Advertising Ltd Interactive systems and methods
GB2581943B (en) * 2018-03-26 2021-03-31 Virtturi Ltd Interactive systems and methods

Also Published As

Publication number Publication date
GB0322503D0 (en) 2003-10-29

Similar Documents

Publication Publication Date Title
EP2127341B1 (en) A communication network and devices for text to speech and text to facial animation conversion
US10593166B2 (en) Haptically enabled messaging
EP1480425B1 (en) Portable terminal and program for generating an avatar based on voice analysis
RU2442294C2 (en) Method and device for receiving and displaying animated sms-messages
CN1672178B (en) Method and device for instant motion picture communication
US8373799B2 (en) Visual effects for video calls
US20020194006A1 (en) Text to visual speech system and method incorporating facial emotions
JP2005115896A (en) Communication apparatus and method
US20060019636A1 (en) Method and system for transmitting messages on telecommunications network and related sender terminal
GB2406470A (en) Display of facial poses associated with a message to a mobile
KR101403226B1 (en) system and method for transferring message
WO2003091902A1 (en) Method and apparatus for conveying messages and simple patterns in communications network
JP2002342234A (en) Display method
WO2002054802A1 (en) Method for editing and sending data
JP2005057431A (en) Video phone terminal apparatus
TWM608752U (en) Scenario type interactive message delivery system with text and animated image
GB2480173A (en) A data structure for representing an animated model of a head/face wherein hair overlies a flat peripheral region of a partial 3D map
JP2005216087A (en) Electronic mail reception device and electronic mail transmission device
Emura et al. Personal Media Producer: A System for Creating 3D CG Animation from Mobile Phone E-mail.
Ostermann PlayMail–Put Words into Other People's Mouth
Gonçalves et al. Expressive audiovisual SMS reading

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)