WO2007136208A1 - Method and system of video phone calling using talker sensitive avatar - Google Patents

Method and system of video phone calling using talker sensitive avatar

Info

Publication number
WO2007136208A1
Authority
WO
WIPO (PCT)
Prior art keywords
avatar
speaker
shape
counterpart
user
Prior art date
Application number
PCT/KR2007/002449
Other languages
French (fr)
Inventor
Seongho Lim
Original Assignee
Seongho Lim
Priority date
Filing date
Publication date
Priority claimed from KR20060052287A external-priority patent/KR100768666B1/en
Application filed by Seongho Lim filed Critical Seongho Lim
Publication of WO2007136208A1 publication Critical patent/WO2007136208A1/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/131 - Protocols for games, networked simulations or virtual reality


Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to a video communication method using an avatar naturally moving in accordance with a speaker, and a system therefor. There is provided a method of transmitting an avatar while performing video communication, the method comprising: a speaker determining step of monitoring the user's and the counterpart's voices and determining the current speaker; and an avatar motion step of retrieving listening-shape or speaking-shape avatar data and transmitting the retrieved avatar data to the counterpart's videophone once a speaker is determined. According to the present invention, video communication using an avatar acting naturally according to the speaker is enabled.

Description

METHOD AND SYSTEM OF VIDEO PHONE CALLING USING TALKER SENSITIVE AVATAR
Technical Field
[1] The present invention relates to a video communication method using an avatar naturally moving in accordance with a speaker, and a system therefor. More specifically, the present invention relates to a method and system in which a user transmits an avatar instead of the user's real image while communicating on a telephone capable of video communication, and the user's and the counterpart's voices are monitored so that the avatar moves naturally in accordance with the current speaker.
Background Art
[2] A conventional video communication method using an avatar is a technique that retrieves and transmits data for a speaking-shape avatar when the user speaks, while a non-speaking-shape avatar is transmitted otherwise.
[3] However, since only a simple non-speaking-shape avatar of the user is transmitted while the counterpart speaks, and the avatar is thus controlled inappropriately, communication using the avatar is unnatural and inconvenient.
Disclosure of Invention
Technical Problem
[4] Accordingly, the present invention has been made in order to solve the above problems, and it is an object of the present invention to provide a video communication method and a system therefor, in which, while a user communicates using a videophone, a listening-shape or speaking-shape user avatar is displayed on the counterpart's videophone in accordance with the current speaker.
Technical Solution
[5] In order to accomplish the above object of the invention, according to one aspect of the invention, there is provided a video communication method using an avatar naturally moving in accordance with a speaker, the video communication method comprising the steps of: a speaker determining step for monitoring user's and counterpart's voices and determining a current speaker; and an avatar motion step for retrieving listening-shape avatar data or speaking-shape avatar data and transmitting the retrieved avatar data to a counterpart's videophone, if a speaker is determined, whereby the avatar is transmitted while video communicating.
[6] Here, an energy sensing technique that simply detects changes in amplitude or frequency, a more specialized voice activity detection (VAD) technique, or the like can be used in the speaker determining step.
[7] Here, in the avatar motion step, either a speaking-shape avatar or a listening-shape avatar is selected and transmitted once a speaker is determined.
[8] At this point, if the user's voice and the counterpart's voice are detected simultaneously, the user is determined as the current speaker, and the speaking-shape avatar is transmitted.
[9] In addition, once a speaker is determined, the speaker's voice can be analyzed using a variety of voice analysis techniques, so that lip movements, speaking patterns, or listening patterns can be distinguished in further detail.
[10] Examples of such voice analysis techniques include voice magnitude and pattern analysis, phoneme recognition, speech recognition, and the like.
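To make the speaker determining step concrete, the following is a minimal sketch of the energy sensing approach named in paragraph [6], including the tie-breaking rule of paragraph [8]; the frame size, threshold value, and function names are assumptions for illustration, not details given in the patent.

```python
import numpy as np

SPEAKING, LISTENING, SILENT = "speaking", "listening", "silent"

def frame_energy(samples: np.ndarray) -> float:
    """Mean squared amplitude of one short audio frame."""
    return float(np.mean(samples.astype(np.float64) ** 2))

def determine_speaker(user_frame: np.ndarray,
                      counterpart_frame: np.ndarray,
                      threshold: float = 1e-4) -> str:
    """Decide who is currently speaking from one pair of audio frames.

    Implements paragraphs [6]-[8]: simple energy sensing on both channels,
    and if both voices are detected at the same time the user is treated as
    the current speaker, so the speaking-shape avatar wins.
    """
    user_active = frame_energy(user_frame) > threshold
    counterpart_active = frame_energy(counterpart_frame) > threshold

    if user_active:              # covers the simultaneous-speech case
        return SPEAKING
    if counterpart_active:
        return LISTENING
    return SILENT                # neither side talks: keep the non-speaking shape
```

A production speaker determining unit would smooth these per-frame decisions (for example with hangover frames or hysteresis) or substitute a proper VAD, as paragraph [6] allows.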
Advantageous Effects
[11] According to the present invention, when a user speaks while performing avatar communication, a counterpart will see a speaking-shape avatar, and when the counterpart speaks, the counterpart will see a listening-shape avatar.
[12] That is, by determining the speaker, the user's avatar is controlled in real time to clearly distinguish listening from speaking forms or motions, and thus the user can enjoy video communication with an avatar that responds more naturally to the speaker.
Brief Description of the Drawings
[13] Further objects and advantages of the invention can be more fully understood from the following detailed description taken in conjunction with the accompanying drawings in which:
[14] FIG. 1 is a conceptual view showing the connectivity of a video communication system using an avatar moving in accordance with a speaker according to an embodiment of the invention;
[15] FIG. 2 is a view showing the configuration of a videophone used for a video communication system using an avatar moving in accordance with a speaker;
[16] FIG. 3 is a flowchart illustrating a method of operating an avatar of a video communication system using an avatar moving in accordance with a speaker; and
[17] FIG. 4 is a view showing the configuration of an avatar server connected to a video communication system using an avatar moving in accordance with a speaker according to another embodiment of the invention.
[18]
Mode for the Invention
[19] Hereinafter, a video communication method using an avatar naturally moving in accordance with a speaker and a system therefor according to a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.
[20] Embodiment 1
[21] FIG. 1 is a conceptual view showing the connectivity of a video communication system using an avatar naturally moving in accordance with a speaker (hereinafter, referred to as a video communication system) according to an embodiment of the invention, and the embodiment is described referring to FIG. 1.
[22] The video communication system comprises a plurality of videophones 100 and 200, an avatar server 300, and a telephone network 400.
[23] The videophones 100 and 200 are divided into a transmitter videophone 100 and a receiver videophone 200, i.e., any kind of telephone that can transfer a user's image to a counterpart through a camera. That is, the videophone can be any kind of terminal, including a general wired telephone, a mobile communication terminal, a personal digital assistant (PDA), and the like.
[24] The avatar server 300 stores a plurality of avatar data and transfers an avatar if a user requests to transmit a specific avatar.
[25] The telephone network 400 comprises all kinds of general wired or wireless telephone communication networks, i.e., a communication network capable of voice, video, or data communication.
[26] FIG. 2 is a view showing the configuration of a videophone used for a video communication system, and the embodiment is described referring to FIG. 2.
[27] The videophone 100 or 200 refers to both the transmitter-side and the receiver-side terminals. Here, mainly the transmitter-side terminal is described for convenience.
[28] The videophone 100 comprises a camera 102, a microphone 104, a keypad 106, a video and audio mixing unit 108, an avatar output unit 110, a speaker determining unit 112, an avatar database (DB) 114, a control unit 116, a speaker 151, a video and audio processing unit 153, and a display unit 154.
[29] The camera 102 is an apparatus for photographing an image of a user and converting the image into an electrical signal, i.e., a camera that is generally used for a videophone or a digital camera.
[30] The microphone 104 is an apparatus for converting a user's voice into an electrical signal, i.e., a microphone that is generally used for a telephone, a cellular phone, or the like.
[31] The keypad 106 is an apparatus for inputting numerals or special characters, i.e., an apparatus that generates an electrical signal corresponding to a numeral or character when the user presses a specific button, such as the general keypad used to input numerals in a telephone, a cellular phone, or the like.
[32] The video and audio mixing unit 108 appropriately mixes the user's image and voice inputted from the camera 102 and the microphone 104 with an avatar image and a sound or voice outputted from the avatar output unit 110, in response to a request of the control unit 116 or in accordance with a characteristic of the avatar data, or selects an image and a voice, and converts the mixed or selected image and voice into a data form that can be transmitted through the telephone network 400.
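The role of the video and audio mixing unit 108 in paragraph [32] can be sketched as a select-or-mix step before encoding; the `MediaFrame` type and the rule that the user's own voice is kept in avatar mode are illustrative assumptions, since the patent does not fix data formats.

```python
from dataclasses import dataclass

@dataclass
class MediaFrame:
    video: bytes   # one encoded video frame (camera image or avatar image)
    audio: bytes   # one block of encoded audio (voice or avatar sound)

def mix_or_select(user: MediaFrame, avatar: MediaFrame, use_avatar: bool) -> MediaFrame:
    """Choose what to hand to the encoder for the telephone network 400.

    In avatar mode the avatar image replaces the camera image while the
    user's own voice is kept; real pixel compositing and audio mixing
    (e.g. overlaying an avatar sound) are beyond this sketch.
    """
    if use_avatar:
        return MediaFrame(video=avatar.video, audio=user.audio)
    return MediaFrame(video=user.video, audio=user.audio)
```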
[33] The avatar output unit 110 retrieves an avatar requested by the control unit 116 from the avatar DB and outputs the retrieved avatar to the video and audio mixing unit 108.
[34] The speaker determining unit 112 monitors a user's voice inputted through the microphone 104 or a counterpart's voice received through the telephone network and determines a current speaker, thereby allowing the control unit 116 to transmit an avatar corresponding to the speaker.
[35] On the other hand, once a speaker is determined, the speaker's voice can be analyzed using a variety of voice analysis techniques so that the control unit 116 can transmit an avatar of a more detailed form.
[36] The avatar DB 114 stores data on an avatar that a user desires to transmit. The avatar data stored in the avatar DB includes images of animals or things, icons, pictorial characters, and the like, covering all kinds of still avatars and moving avatars.
[37] Here, the avatar data can be general pictorial data that can be animated, such as a graphic interchange format (GIF) file, or vector or coordinate data that can contain motions, shapes, and the like.
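As an illustration of how the avatar DB 114 of paragraphs [36] and [37] might be organized, the sketch below keys each avatar to one asset per shape; the field names, shape keys, and the GIF/vector distinction are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class AvatarAsset:
    """One renderable asset for a single avatar shape."""
    kind: str                      # "gif" for pictorial data, "vector" for coordinate data
    payload: bytes                 # encoded animation or serialized motion/shape coordinates
    sound: Optional[bytes] = None  # optional sound clip, e.g. an agreement sound (see paragraph [38])

@dataclass
class AvatarEntry:
    """One record in the avatar DB, holding every shape of a single avatar."""
    name: str
    shapes: Dict[str, AvatarAsset] = field(default_factory=dict)
    # expected keys: "non_speaking", "speaking", "listening"

    def asset_for(self, shape: str) -> AvatarAsset:
        # fall back to the non-speaking shape if a specific shape is missing
        return self.shapes.get(shape, self.shapes["non_speaking"])
```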
[38] On the other hand, the avatar data can include a sound or a voice. For example, if a sound expressing agreement, such as "indeed" or "oh", is outputted while a listening-shape avatar is displayed, the user can feel more comfortable than when communicating by viewing an avatar alone.
[39] Here, the avatar data can be downloaded in a variety of ways, such as from a website, through a personal computer (PC), or the like.
[40] The control unit 116 requests the avatar output unit 110 to output data for displaying an avatar of a specific form based on a signal inputted from the speaker determining unit 112 and a signal inputted through the keypad 106.
[41] On the other hand, a user can switch from video communication to avatar communication, or vice versa. That is, the user can switch to transmitting the user's own image by operating the keypad 106 while an avatar is being displayed in a communication, or can switch to transmitting an avatar while the user's image is being displayed.
[42] In addition, a user can communicate while simultaneously displaying the user's image and an avatar on the counterpart's videophone.
[43] Such switching can be implemented by the control unit 116 recognizing a designated command inputted through the keypad 106.
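The switching of paragraphs [41] to [43] amounts to the control unit 116 mapping a designated keypad command to a transmission mode, roughly as sketched below; the key sequences and mode names are hypothetical, since the patent only states that a designated command is recognized.

```python
# Hypothetical keypad commands for switching transmission modes (paragraphs [41]-[43]).
MODE_COMMANDS = {
    "*1": "camera",   # transmit the user's real image
    "*2": "avatar",   # transmit the selected avatar instead of the image
    "*3": "both",     # display the user's image and the avatar simultaneously
}

class ControlUnit:
    """Tiny stand-in for the mode-switching part of the control unit 116."""

    def __init__(self) -> None:
        self.mode = "avatar"   # avatar communication chosen at call setup (S100)

    def on_keypad_input(self, keys: str) -> None:
        """Switch the transmission mode when a designated command is recognized."""
        if keys in MODE_COMMANDS:
            self.mode = MODE_COMMANDS[keys]
```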
[44] The speaker 151 is an apparatus for converting an electrical signal into a sound so that people can hear the sound, which means a general speaker of a wired or a wireless telephone.
[45] The video and audio processing unit 153 converts data received from the telephone network into video and audio data.
[46] The display unit 154 is an apparatus for converting an electrical signal into an image so that people can see the image, which means a liquid crystal display screen or the like of a general wired or a wireless telephone.
[47] FIG. 3 is a flowchart illustrating a method of operating an avatar of a video communication system, and the embodiment is described referring to FIG. 3.
[48] When a transmitter places a phone call using the videophone 100, the transmitter can select whether to communicate with a video, to communicate with an avatar, or to communicate without a video function.
[49] If the transmitter selects avatar communication through the keypad 106 (S100), the control unit 116 provides a list of avatars that the transmitter can select from among the avatar data stored in the avatar DB 114.
[50] If the transmitter selects an avatar to transmit to a receiver's videophone 200 (S102) and inputs a phone number of a specific receiver, a video communication using an avatar is started (S104).
[51] If the communication is started, the control unit 116 requests the avatar output unit 110 to retrieve avatar data for displaying a non-speaking-shape avatar among a plurality of avatar data stored in the avatar DB 114, and the video and audio mixing unit 108 transmits the non-speaking-shape avatar data outputted from the avatar output unit 110 to the receiver's videophone 200 so that the non-speaking-shape avatar is displayed on the receiver's videophone 200 (S106).
[52] The non-speaking-shape avatar is an avatar that makes a tiny gesture, such as lifting a finger or slightly nodding, while remaining in a non-speaking state, even though it is essentially still and does not move its lips.
[53] That is, the non-speaking-shape avatar makes a motion just large enough to indicate that the call is still connected even though it is not speaking. Through the non-speaking-shape avatar, the counterpart can recognize that the communication is still in progress.
[54] The speaker determining unit 112 monitors whether the transmitter's voice is received through the microphone 104 or the counterpart's voice is received through the telephone network while the communication continues, and determines the current speaker (S107).
[55] If the user is determined as a current speaker by the speaker determining unit 112, the control unit 116 retrieves avatar data for displaying a speaking-shape avatar among a plurality of avatar data stored in the avatar DB 114, and the video and audio mixing unit 108 transmits the retrieved speaking-shape avatar data to the receiver's videophone 200 so that the speaking-shape avatar is displayed on the receiver's videophone 200 (S108).
[56] The speaking-shape avatar is an avatar that moves its lips or shows a motion expressing that it is speaking.
[57] If the counterpart is determined as a current speaker by the speaker determining unit 112, the control unit 116 retrieves avatar data for displaying a listening-shape avatar among a plurality of avatar data stored in the avatar DB 114, and the video and audio mixing unit 108 transmits the retrieved listening-shape avatar data to the receiver's videophone 200 so that the listening-shape avatar is displayed on the receiver's videophone 200 (S109).
[58] The listening-shape avatar is an avatar that expresses, through a motion such as nodding occasionally or cocking an ear, that it is listening to what the counterpart says.
[59] The speaking-shape or listening-shape avatar can be created in a variety of forms. Once a speaker is determined by the speaker determining unit 112, the shape of the avatar can be further controlled by analyzing the speaker's voice and classifying its magnitude or changes into a number of levels.
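Putting steps S106 to S109 together, the shape selection on the transmitter side reduces to a small loop; the sketch below reuses determine_speaker() from the earlier energy-sensing sketch and yields only the shape name, leaving retrieval from the avatar DB 114 and transmission by the mixing unit 108 to the surrounding code.

```python
import numpy as np
from typing import Iterable, Iterator, Tuple

# Assumes determine_speaker() from the earlier energy-sensing sketch.

def avatar_shapes(frames: Iterable[Tuple[np.ndarray, np.ndarray]]) -> Iterator[str]:
    """Yield the avatar shape to transmit for each (user, counterpart) audio frame pair.

    Mirrors steps S106-S109: start with the non-speaking shape, switch to the
    speaking shape while the user talks, to the listening shape while the
    counterpart talks, and back to the non-speaking shape when neither talks.
    """
    yield "non_speaking"                                            # S106: initial avatar
    for user_frame, counterpart_frame in frames:
        state = determine_speaker(user_frame, counterpart_frame)    # S107
        if state == "speaking":
            yield "speaking"                                        # S108
        elif state == "listening":
            yield "listening"                                       # S109
        else:
            yield "non_speaking"
```

For each yielded shape, the control unit 116 would retrieve the matching avatar data from the avatar DB 114 and pass it to the video and audio mixing unit 108 for transmission to the receiver's videophone 200.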
[60] Embodiment 2
[61] FIG. 4 is a view showing the configuration of an avatar server connected to a video communication system using an avatar moving in accordance with a speaker according to another embodiment of the invention, and the embodiment is described referring to FIG. 4.
[62] The videophone 100 described above comprises the avatar DB for storing avatar data, the speaker determining unit, and the like in a single body, and therefore video communication using such an avatar can be implemented without an additional server or apparatus.
[63] However, the service of the present invention can also be implemented by installing such constitutional components in a separate server connected to the telephone network, rather than in the videophone 100.
[64] For this purpose, an additional avatar server 300 is required, and the avatar server 300 comprises a web server module 310, an avatar control module 320, a speaker determining module 330, and an avatar DB 360.
[65] The web server module 310 is connected to the telephone network 400 and provides a browser that enables transmitting and receiving data with the videophone 100 or 200.
[66] The avatar control module 320 functions the same as the control unit 116 of the embodiment described above. The avatar control module retrieves predetermined avatar data in response to a signal inputted from the speaker determining module 330 and allows the retrieved avatar data to be transmitted to a counterpart's videophone.
[67] The speaker determining module 330 functions the same as the speaker determining unit 112 of the embodiment described above. The speaker determining module determines a current speaker and allows an avatar of a predetermined form to be transmitted, which can be implemented using an energy sensing technique, VAD technique, or the like described above.
[68] The avatar DB 360 functions the same as the avatar DB 114 of the embodiment described above, which stores data on an avatar desired to be transmitted by a user.
[69] The avatar server 300 transmits and displays a non-speaking-shape avatar on the counterpart's videophone when a video communication is commenced. If a speaker is determined, the avatar server detects the determination, appropriately mixes a naturally listening or speaking avatar with the user's image or voice, or selects an avatar, and transmits and displays the mixed or selected avatar on the counterpart's videophone.
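A rough sketch of this server-hosted variant, assuming the avatar server 300 sits in the media path between the two videophones; the class layout is illustrative, and it reuses determine_speaker() and AvatarEntry from the earlier sketches rather than any interface defined by the patent.

```python
class AvatarServer:
    """Server-side variant (Embodiment 2): the avatar DB 360, the speaker
    determining module 330, and the avatar control module 320 live in the
    avatar server 300 instead of in the videophone."""

    def __init__(self, avatar_db):
        self.avatar_db = avatar_db   # mapping of avatar name -> AvatarEntry (avatar DB 360)

    def process_frame(self, selected_avatar, user_frame, counterpart_frame):
        """Return the avatar asset to forward to the counterpart for one audio frame."""
        state = determine_speaker(user_frame, counterpart_frame)    # speaker determining module 330
        shape = {"speaking": "speaking", "listening": "listening"}.get(state, "non_speaking")
        entry = self.avatar_db[selected_avatar]                     # avatar control module 320
        return entry.asset_for(shape)   # mixed/selected downstream and sent to the counterpart
```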
[70] Accordingly, a video communication system using an avatar can be constructed without adding, to the videophones 100 and 200, the separate constitutional components required for such video communication.
[71] A video communication method using an avatar naturally moving in accordance with a speaker and a system therefor according to the embodiments of the present invention have been described above. However, the scope of rights of the present invention is not limited to the embodiments.
[72] For example, even when the movements or expressions of the user's avatar to be transmitted are controlled by analyzing the user's real movements or expressions captured by a camera, the invention can still be applied: if the counterpart's voice is detected, an ear of the user's avatar may be slightly moved or enlarged, or the color of the avatar may be changed, to add an effect or change showing that the user is listening to what the counterpart says; and if the user's voice is detected while the avatar is transmitted, the previous effect or change is discarded or another one is added. Therefore, it is apparent that communication using an avatar naturally moving in accordance with a speaker can be implemented in this way as well.
Industrial Applicability
[73] Although messenger programs that provide a video communication function on a PC, such as the Microsoft Network (MSN) Messenger, are already widely distributed, the number of video communication users is in reality still small.
[74] One of the most important reasons for this is the reluctance of users to show their camera images to a counterpart, or worries about privacy infringement.
[75] That is, in order to promote video communication, a technique that can ease this reluctance and these privacy worries should be provided to users, and strong demand is expected for an avatar video communication technique as the technique most appropriate for this purpose.
[76] At that point, a technique for identifying the current speaker and transmitting an avatar that naturally listens or speaks accordingly is expected to be essential.
[77]

Claims

[1] A video communication method using an avatar naturally moving in accordance with a speaker, the video communication method comprising the steps of: a speaker determining step for monitoring user's and counterpart's voices and determining a current speaker; and an avatar motion step for retrieving listening-shape avatar data or speaking-shape avatar data and transmitting the retrieved avatar data to a counterpart's videophone, if a speaker is determined, whereby the avatar is transmitted while video communicating.
[2] The method according to claim 1, wherein if a user's voice and a counterpart's voice are simultaneously detected, the speaking-shape avatar data is retrieved and transmitted to the counterpart's videophone.
[3] A video communication system using an avatar naturally moving in accordance with a speaker, the video communication system comprising: a speaker determining unit for monitoring user's and counterpart's voices and determining a current speaker; and a control unit for controlling to retrieve listening-shape avatar data or speaking-shape avatar data and transmitting the retrieved avatar data to a counterpart's videophone, if a speaker is determined by the speaker determining unit, whereby the avatar is transmitted while video communicating.
[4] The system according to claim 3, wherein if a user's voice and a counterpart's voice are simultaneously detected, the control unit controls to retrieve the speaking-shape avatar data.
PCT/KR2007/002449 2006-05-24 2007-05-21 Method and system of video phone calling using talker sensitive avatar WO2007136208A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2006-0046811 2006-05-24
KR20060046811 2006-05-24
KR20060052287A KR100768666B1 (en) 2006-05-24 2006-06-12 Method and system of video phone calling using talker sensitive avata
KR10-2006-0052287 2006-06-12

Publications (1)

Publication Number Publication Date
WO2007136208A1 (en)

Family

ID=38723503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2007/002449 WO2007136208A1 (en) 2006-05-24 2007-05-21 Method and system of video phone calling using talker sensitive avatar

Country Status (1)

Country Link
WO (1) WO2007136208A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1326445A2 (en) * 2001-12-20 2003-07-09 Matsushita Electric Industrial Co., Ltd. Virtual television phone apparatus
KR20040016778A (en) * 2002-08-19 2004-02-25 임성호 Method and System of Video Phone Calling using Avata
US20040097221A1 (en) * 2002-11-20 2004-05-20 Lg Electronics Inc. System and method for remotely controlling character avatar image using mobile phone
US20040235531A1 (en) * 2003-05-20 2004-11-25 Ntt Docomo, Inc. Portable terminal, and image communication program
KR20040103047A (en) * 2003-05-30 2004-12-08 에스케이 텔레콤주식회사 Method and System for Providing Avatar Image Service during Talking over the Phone
US20060046699A1 (en) * 2001-07-26 2006-03-02 Olivier Guyot Method for changing graphical data like avatars by mobile telecommunication terminals


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07746597

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07746597

Country of ref document: EP

Kind code of ref document: A1