CN104835190A - 3D instant messaging system and messaging method - Google Patents


Info

Publication number
CN104835190A
Authority
CN
China
Prior art keywords
face
human face
module
client
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510215785.9A
Other languages
Chinese (zh)
Inventor
陆远刚
盛蕴
张桂戌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201510215785.9A
Publication of CN104835190A
Legal status: Pending

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention provides a 3D instant messaging system comprising clients and a server. A face synthesis unit and a speech synthesis unit are disposed inside each client. The face synthesis unit comprises a face feature extraction device, a model fitting device and a texture mapping device. The face feature extraction device extracts facial features from a 2D face photo. The model fitting device projects a 3D face mesh model onto the 2D face photo according to the extracted features, obtaining the texture coordinates of the 3D face mesh model. The texture mapping device maps the 2D face photo back onto the 3D face mesh to form a 3D face. The speech synthesis unit generates a voice stream and 3D face animation from the 3D face and the input text and outputs them to the server. The server realizes the information exchange between the clients. By introducing 3D technology into the messaging system, the invention lets users design customized 3D face animation and chat with it on the internet, making the experience more interesting and vivid.

Description

3D instant messaging system and messaging method
Technical field
The present invention relates to a communication system, and in particular to a 3D instant messaging system and a messaging method thereof.
Background
Instant messaging (IM) systems are internet-based real-time communication systems. They let users talk online with others without worrying about mail delays or telephone costs. In 1996, ICQ, the first instant messaging tool, was born and quickly became the instant messaging system with the largest user base in the world at that time. Afterwards, IM tools of all kinds sprang up like mushrooms after rain. Today, instant messaging systems have become an indispensable internet communication tool for many people.
Instant messaging systems mainly support remote dialogue and exchange through text, audio and video. Text chat is rather monotonous, while video chat is only possible when a camera (webcam) is installed and both parties are willing to be seen. To achieve personalization, instant messaging systems such as Microsoft MSN Messenger mainly let the user load a 2D photo; this is static processing. Tencent QQ instead adopts user interaction, letting users personalize by buying ornaments for an avatar picture, but this is still 2D technology.
Therefore, the market needs a compromise: an instant messaging system that avoids monotonous text chat yet spares users the awkwardness of facing a stranger on camera.
Summary of the invention
The object of the present invention is to provide a personalized 3D instant messaging system. With this system, a user can synthesize a realistic 3D face from a single 2D face photo and personalize the generated 3D face. When chatting on the internet, the text the user types is converted by a text decomposition module into corresponding phonemes and visemes, which drive the system to generate the corresponding voice and animation.
The present invention proposes a 3D instant messaging system comprising clients and a server. Each client handles user login and the input and output of information, and contains a face synthesis unit and a speech synthesis unit. The face synthesis unit comprises a face feature extraction device, a model fitting device and a texture mapping device. The face feature extraction device extracts facial features from a 2D face photo. The model fitting device projects a 3D face mesh model onto the 2D face photo according to the extracted features, obtaining the texture coordinates of the 3D face mesh model. The texture mapping device maps the 2D face photo back onto the 3D face mesh according to these texture coordinates, forming a 3D face. The speech synthesis unit generates a voice stream and 3D face animation from the 3D face and the input text and outputs them to the server. The server realizes the information exchange between the clients.
In the proposed 3D instant messaging system, the face feature extraction device extracts the facial features interactively.
In the proposed 3D instant messaging system, the model fitting device comprises a pose estimation module, a global calibration module, a local alignment module and a boundary alignment module. The pose estimation module estimates 3D information from the 2D face photo according to the facial features. The global calibration module projects the 3D face mesh model onto the 2D face photo according to this 3D information. The local alignment module fits the facial features of the 3D face model to those in the 2D face photo. The boundary alignment module draws the boundary of the 3D face model toward the boundary of the 2D face photo using a spring model algorithm.
In the proposed 3D instant messaging system, the speech synthesis unit comprises a text decomposition module, a visual text-to-speech module and an animation synthesis module. The text decomposition module decomposes the input text into phonemes. The visual text-to-speech module converts the phonemes into a voice stream and a viseme sequence synchronized with the voice stream. The animation synthesis module generates the 3D face animation from the viseme sequence and outputs it in synchrony with the voice stream.
In the proposed 3D instant messaging system, the client further comprises a 3D face personalization module for decorating the 3D face.
The invention also proposes a 3D instant messaging method comprising the following steps:
Step 1: log in to the client, and input the user information and a 2D face photo into the client;
Step 2: extract facial features from the 2D face photo with the face feature extraction device;
Step 3: project the 3D face mesh model onto the 2D face photo with the model fitting device according to the extracted facial features, obtaining the texture coordinates of the 3D face mesh model;
Step 4: map the 2D face photo back onto the 3D face mesh with the texture mapping device according to the texture coordinates, forming the 3D face;
Step 5: select the friend to communicate with and input text in the client; the speech synthesis unit generates a voice stream and 3D face animation from the 3D face and the input text and outputs them to the server;
Step 6: the server sends the voice stream and the 3D face animation to the corresponding client according to the selected friend, realizing 3D instant messaging.
With the proposed 3D instant messaging system, a user can synthesize a realistic 3D face from a single 2D face photo and personalize the synthesized face. During internet chat, the typed text drives the speech synthesis unit to produce the 3D face animation and voice. By introducing 3D technology into network instant messaging, the invention lets users design and chat with personalized 3D face animation, adding interest and vividness to the experience and promoting innovation in instant messaging systems.
With the proposed 3D instant messaging system, a user only needs to input a single 2D face photo of himself to generate a 3D face animation and chat with the other party. A 3D face is a three-dimensional avatar that can represent the user in 3D space. In the system, the user synthesizes a 3D face from a 2D face photo with a single-view face synthesis technique; the face can show different expressions and emotions, giving an immersive, face-to-face feeling. The expressions are synthesized by a text-to-visual-speech engine driven by a set of predefined phonemes and visemes.
Description of the drawings
Fig. 1 is a flowchart of the 3D face synthesis in the present invention.
Fig. 2 shows the positions of the facial feature points in the present invention.
Fig. 3 illustrates the estimation of the head rotation θ_Z in the present invention.
Fig. 4a is a view of the mid-section position used in the estimation of the head rotation θ_Y.
Fig. 4b is a front view used in the estimation of the head rotation θ_Y.
Fig. 5a is a schematic diagram of a 2D face in an embodiment of the present invention.
Fig. 5b shows the face boundary before the spring model is applied in an embodiment of the present invention.
Fig. 5c shows the face boundary after the spring model is applied in an embodiment of the present invention.
Fig. 6 is the workflow diagram of the speech synthesis unit in the present invention.
Fig. 7 shows the viseme corresponding to each phoneme of Table 1 in the present invention.
Fig. 8 shows the fitting of the mouth-shape animation curves in the present invention.
Fig. 9 is a flowchart of the 3D instant messaging method of the present invention.
Fig. 10 is a schematic diagram of the client login interface in an embodiment of the present invention.
Fig. 11 shows the client chat window interface in an embodiment of the present invention.
Figs. 12a-12d are schematic diagrams of personalized 3D faces in embodiments of the present invention.
Fig. 13 shows the server interface in an embodiment of the present invention.
Figs. 14a-14c are schematic diagrams of the client login interface in an embodiment of the present invention.
Fig. 14d shows one's own client interface in an embodiment of the present invention.
Fig. 14e shows the other party's client interface in an embodiment of the present invention.
Fig. 15 shows the structure of the 3D instant messaging system of the present invention.
Embodiments
The invention is described in further detail below with reference to specific embodiments and the drawings. Except for the contents specifically mentioned below, the processes, conditions and experimental methods for implementing the invention are common knowledge in the art, and the invention is not particularly limited in this respect.
As shown in Fig. 15, the present invention proposes a 3D instant messaging system comprising clients and a server. The client handles user login and the input and output of information, and is provided with a face synthesis unit 1 and a speech synthesis unit 2.
In the proposed system, the face synthesis unit 1 comprises a face feature extraction device 11, a model fitting device 12 and a texture mapping device 13. The face feature extraction device 11 extracts facial features from the 2D face photo. The model fitting device 12 projects the 3D face mesh model onto the 2D face photo according to the extracted features, obtaining the texture coordinates of the 3D face mesh model. The texture mapping device 13 maps the 2D face photo back onto the 3D face mesh according to the texture coordinates, forming the 3D face.
In the proposed system, the speech synthesis unit 2 generates a voice stream and 3D face animation from the 3D face and the input text and outputs them to the server.
In the proposed system, the server realizes the information exchange between the clients.
In the proposed system, the face feature extraction device 11 extracts the facial features interactively.
In the proposed system, the model fitting device 12 comprises a pose estimation module 121, a global calibration module 122, a local alignment module 123 and a boundary alignment module 124. The pose estimation module 121 estimates 3D information from the 2D face photo according to the facial features. The global calibration module 122 projects the 3D face mesh model onto the 2D face photo according to this 3D information. The local alignment module 123 fits the facial features of the 3D face model to those in the 2D face photo. The boundary alignment module 124 draws the boundary of the 3D face model toward the boundary of the 2D face photo using a spring model algorithm.
In the present invention, the spring model is a mathematical model based on the principle of conservation of energy in physics. Using the balance between internal and external forces, it moves the 2D projection of the 3D face mesh to the position of the face boundary in the photo. In the spring model, every edge of the 3D face mesh is regarded as a spring, so the 3D face mesh model is a 3D model formed by springs and their intersection points.
An external force is applied to the spring model to move the mesh to the boundary position of the face in the photo. Because the gradient differs markedly between boundary and non-boundary regions of the photo, the external force is defined through the gradient:
F_i^ext = τ · ∇(G_σ * I(n_i));
where I(n_i) is the intensity of image I at vertex n_i, τ is a weight constant controlling the balance between internal and external forces, G_σ is a 2-D Gaussian filter with standard deviation σ, and ∇ is the nabla (Hamiltonian) operator.
Since each vertex of the 3D mesh is connected to several edges, when one vertex moves, the connected vertices move with it, and these vertices apply a reaction force that resists the deformation of the mesh; this is the internal force. When the internal and external forces are equal, the vertex moves to its equilibrium position, namely the boundary position of the 2D face, completing the boundary alignment. As shown in Fig. 5c, the grey broken line along the face outline is the face boundary after the spring model is applied.
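As a concrete illustration (not part of the patent text), the external-force computation can be sketched in Python with NumPy: the photo is smoothed by a 2-D Gaussian G_σ, and the force at a mesh vertex is τ times the image gradient sampled there. The function names and the separable-filter implementation are our own assumptions.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    # 1-D Gaussian kernel, normalised to sum to 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth(image, sigma=1.0):
    # Separable 2-D Gaussian filtering: G_sigma * I
    k = gaussian_kernel(sigma, radius=int(3 * sigma))
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def external_force(image, vertex, tau=1.0, sigma=1.0):
    """F_i^ext = tau * grad(G_sigma * I) sampled at vertex (row, col)."""
    g = smooth(image.astype(float), sigma)
    gy, gx = np.gradient(g)  # gradients along rows and columns
    r, c = int(round(vertex[0])), int(round(vertex[1]))
    return tau * np.array([gy[r, c], gx[r, c]])
```

For a photo with a strong vertical edge, the force at a vertex near the edge points toward it, while far from any boundary the force vanishes, which is exactly what pulls the mesh projection onto the face contour.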
In the proposed system, the speech synthesis unit 2 comprises a text decomposition module 21, a visual text-to-speech module 22 and an animation synthesis module 23. The text decomposition module 21 decomposes the input text into phonemes. The visual text-to-speech module 22 converts the phonemes into a voice stream and a viseme sequence synchronized with the voice stream. The animation synthesis module 23 generates the 3D face animation from the viseme sequence and outputs it in synchrony with the voice stream.
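The module chain (text → phonemes → synchronized visemes) can be sketched as follows. The toy lexicon and the viseme labels are illustrative assumptions for this sketch (the labels follow Table 1 below), not the patent's actual engine.

```python
# Hypothetical two-word lexicon mapping words to ARPAbet-style phonemes
LEXICON = {"hi": ["hh", "ay"], "tea": ["t", "iy"]}
# Phoneme-to-viseme labels taken from Table 1 (classes 1-9, a-g)
VISEME_OF = {"hh": "9", "ay": "2", "t": "b", "iy": "5"}

def decompose(text):
    """Text decomposition module: split the input text into a phoneme sequence."""
    return [p for w in text.lower().split() for p in LEXICON.get(w, [])]

def to_visemes(phonemes):
    """Visual TTS module (viseme half): map each phoneme to its viseme class,
    keeping the viseme sequence aligned one-to-one with the phoneme sequence."""
    return [VISEME_OF[p] for p in phonemes]
```

Feeding "hi tea" through both stages yields the viseme sequence that the animation synthesis module would then render frame by frame.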
In the proposed system, the client further comprises a 3D face personalization module 3 for decorating the 3D face.
Based on the above 3D communication system, the present invention proposes a 3D instant messaging method which, as shown in Fig. 9, comprises the following steps:
Step 1: log in to the client, and input the user information and a 2D face photo into the client;
Step 2: extract facial features from the 2D face photo with the face feature extraction device 11;
Step 3: project the 3D face mesh model onto the 2D face photo with the model fitting device 12 according to the extracted facial features, obtaining the texture coordinates of the 3D face mesh model;
Step 4: map the 2D face photo back onto the 3D face mesh with the texture mapping device 13 according to the texture coordinates, forming the 3D face;
Step 5: select the friend to communicate with and input text in the client; the speech synthesis unit 2 generates a voice stream and 3D face animation from the 3D face and the input text and outputs them to the server;
Step 6: the server sends the voice stream and the 3D face animation to the corresponding client according to the selected friend, realizing 3D instant messaging.
As shown in Fig. 1, the face synthesis of the present invention goes through the following stages: facial feature extraction, model fitting and texture mapping.
In the present invention, facial feature extraction extracts facial features such as the eyebrows, eyes, nose, mouth and jaw from the 2D face photo. In single-view face synthesis it serves two purposes: first, it provides geometric information for 3D face pose estimation; second, it provides positional information for model fitting.
The model fitting device 12 is composed of the pose estimation module 121, the global calibration module 122, the local alignment module 123 and the boundary alignment module 124. 3D face pose estimation infers 3D information from the 2D face photo; because the depth information is missing, the pose must be estimated from the inherent geometric information in the facial features, and the accuracy of the estimate depends on how precisely the features were extracted. According to the pose estimation result, global calibration projects a predefined 3D face mesh model onto the 2D face photo through rotation, scaling and translation. Since faces differ from one another, the projection of a generic 3D face model cannot match the features and contour of every face, so local alignment and boundary alignment are needed. Local alignment calibrates local features such as the eyes and mouth shape after global calibration, so that the eyes and mouth of the 3D face model fit those in the 2D face photo. Boundary alignment then draws the boundary of the 3D face model toward the boundary of the 2D face photo with the spring model algorithm. Through these steps, the texture coordinates of the 3D face mesh model can be calculated; mapping the texture elements back onto the 3D face mesh model yields a realistic 3D face, which is then mapped to screen space, completing the texture mapping.
In the present invention, the speech synthesis unit 2 supports text input in different languages and converts the input text into voice and 3D face animation output, improving the realism of the synthesized face animation.
In the present invention, because the facial feature points lie on different parts of the face, and both the face pose and the complex background of the 2D face photo affect extraction accuracy, this embodiment extracts the feature points on the input 2D face photo manually through user interaction. The ten facial feature points marked by the user are shown as black dots in Fig. 2, and the 3D face pose is estimated from the geometric relationships between the feature points.
In the present invention, 3D face pose estimation infers 3D information from the 2D image and is the key to the single-view face synthesis technique. Because the depth information is missing, the pose must be estimated from the inherent geometric information in the facial features. Pose estimation can be regarded as the estimation of θ_X, θ_Y and θ_Z in the following formula:
[X']   [ 1    -θ_Z   θ_Y ]   [S_X  0    0  ]   [X]   [T_X]
[Y'] = [ θ_Z   1    -θ_X ] × [ 0   S_Y  0  ] × [Y] + [T_Y]
[Z']   [-θ_Y   θ_X   1   ]   [ 0   0    S_Z]   [Z]   [T_Z]
This formula is a 3D affine transformation under the orthographic projection model; for very small Euler angles it maps any point P = (X, Y, Z) on the 3D face model to a target point P' = (X', Y', Z'). Here θ_X, θ_Y and θ_Z are the rotation angles around the X, Y and Z axes of 3D space, S_X, S_Y and S_Z are the scaling factors, and (T_X, T_Y, T_Z)^T is the corresponding translation vector.
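The small-angle affine transform above can be sketched directly (our own Python with NumPy; the function name is an assumption):

```python
import numpy as np

def project_point(p, theta, scale, t):
    """Small-angle 3D affine transform under the orthographic projection model:
    P' = R(theta) @ diag(S) @ P + T, with the rotation linearised for small angles."""
    tx, ty, tz = theta
    R = np.array([[1.0, -tz,  ty],
                  [ tz, 1.0, -tx],
                  [-ty,  tx, 1.0]])
    return R @ (np.asarray(scale, float) * np.asarray(p, float)) + np.asarray(t, float)
```

With all angles zero and unit scale the transform is the identity; a small θ_Z mixes X and Y exactly as in the matrix above.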
θ_Z can be estimated by computing the angle between the line segment P_L P_R in the 2D face picture and the 2-D horizontal axis, as in Fig. 3. To estimate the head rotation θ_Y, the present invention adopts a method based on a circular cross-section, under the assumption that the horizontal cross-section of the head through the eyes is circular, as shown in Figs. 4a and 4b. The value of θ_Y can then be estimated as:
θ_Y = ∠P_C O P_3 = arcsin( 2·|P_CX − P_3X| / |P_2X − P_1X| );
where P_1 and P_2 are the two eye points along the horizontal eye direction in the face picture, P_3 is the midpoint of P_1 P_2, and the subscript X denotes the horizontal axis. θ_X describes the head tilting in the vertical direction, i.e. the rotation about the axis perpendicular to this direction.
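The two angle estimates can be sketched directly from the formulas above (our own Python; the function names and the interpretation of P_C as the projected face-center point are assumptions):

```python
import math

def estimate_theta_z(p_left, p_right):
    """theta_Z: angle between the eye line P_L P_R and the horizontal axis."""
    return math.atan2(p_right[1] - p_left[1], p_right[0] - p_left[0])

def estimate_theta_y(p1x, p2x, pcx, p3x):
    """theta_Y under the circular-cross-section assumption:
    theta_Y = arcsin(2 * |P_Cx - P_3x| / |P_2x - P_1x|)."""
    return math.asin(2 * abs(pcx - p3x) / abs(p2x - p1x))
```

For a level pair of eyes θ_Z is zero, and for a frontal face (P_C coinciding with the eye-line midpoint P_3) θ_Y is zero, matching the geometric intuition of Figs. 3 and 4.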
As shown in Fig. 2, the left-eye center, the right-eye center and the midpoint between the two eyes are mainly used to estimate the face angles; with the estimated angles, the 3D face model can be rotated and orthographically projected onto the 2D face view.
As shown in Fig. 2, the four points at the left-eye center, the right-eye center, the midpoint between the eyes and the mouth center are used for the global calibration of the model; the two points at the left-eye and right-eye centers calibrate the width of the face model; the midpoint between the eyes and the mouth center calibrate the height of the face model. The four points at the left mouth corner, the right mouth corner, the lower midpoint of the upper lip and the upper midpoint of the lower lip are used for the mouth-shape calibration in local alignment.
Figs. 5a-5c illustrate the effect of the spring model on the 3D face boundary. Fig. 5a is the 2D face photo used to synthesize the 3D face model; the grey broken line inside the facial contour in Fig. 5b is the 3D face boundary before the spring model is applied, and the grey broken line along the facial contour in Fig. 5c is the face boundary after it is applied. The comparison clearly shows that, after the spring model is applied, the 3D face boundary coincides with the original face boundary.
The present invention adopts the CANDIDE-3 model and the OpenGL graphics library: the texture mapping functions of OpenGL are called to map the 2D face view onto the calibrated CANDIDE-3 model, generating the 3D face. Finally the generated 3D face is mapped to screen space, completing the texture mapping.
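The texture-coordinate step can be illustrated as follows. Normalising projected pixel positions into [0, 1] with a flipped v axis is a common OpenGL convention; this sketch is our own assumption about one plausible implementation, not code from the patent.

```python
import numpy as np

def texture_coords(projected_xy, img_w, img_h):
    """Turn projected mesh-vertex positions (in photo pixels) into
    OpenGL-style texture coordinates in [0, 1]. v is flipped because
    image rows grow downward while texture v grows upward."""
    uv = np.asarray(projected_xy, dtype=float).copy()
    uv[:, 0] /= img_w            # u: column / width
    uv[:, 1] = 1.0 - uv[:, 1] / img_h  # v: flipped row / height
    return uv
```

Each mesh vertex thus receives a (u, v) pair that tells the texture-mapping call which patch of the photo to paste onto the corresponding triangle of the calibrated model.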
In the present invention, the speech synthesis unit 2 takes English text as input. The text decomposition module 21 decomposes the input text into a timed phoneme sequence; the generated phonemes are used both for speech synthesis and for viseme generation, and the generated viseme sequence completes the synthesis of the 3D face animation. The visual text-to-speech module 22 performs the text-to-speech function and obtains the phonemes corresponding to the text.
Table 1: phoneme-to-viseme mapping table

Viseme   Phoneme set                  Examples
1        (silence, default)           -
2        ay, ah                       bite, but
3        ey, eh, ae                   bait, bet, bat
4        er                           bird
5        ix, iy, ih, ax, axr, y       debit, beet, bit, about, butter, yacht
6        uw, uh, w                    boot, book, way
7        ao, aa, oy, ow               bought, bott, boy, boat
8        aw                           bout
9        g, hh, k, ng                 gay, hay, key, sing
a        r                            ray
b        l, d, n, en, el, t           lay, day, noon, button, bottle, tea
c        s, z                         sea, zone
d        ch, sh, jh, zh               choke, she, joke, azure
e        th, dh                       thin, then
f        f, v                         fin, van
g        m, em, b, p                  mom, bottom, bee, pea
Once the phonemes corresponding to the text have been obtained, a self-defined phoneme-to-viseme table converts each phoneme into the corresponding viseme. As shown in the first branch of Fig. 6, when the user inputs English text, the speech synthesis unit 2 first decomposes it into phonemes and then generates the corresponding voice stream, completing the speech synthesis.
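Table 1 can be transcribed as a lookup dictionary; this sketch is ours (viseme class 1, the silence default, is omitted from the map):

```python
# Phoneme-to-viseme mapping transcribed from Table 1 (viseme classes 2-9, a-g)
_TABLE = {
    "2": "ay ah",            "3": "ey eh ae",   "4": "er",
    "5": "ix iy ih ax axr y", "6": "uw uh w",   "7": "ao aa oy ow",
    "8": "aw",               "9": "g hh k ng",  "a": "r",
    "b": "l d n en el t",    "c": "s z",        "d": "ch sh jh zh",
    "e": "th dh",            "f": "f v",        "g": "m em b p",
}
PHONEME_TO_VISEME = {p: viseme
                     for viseme, phonemes in _TABLE.items()
                     for p in phonemes.split()}
```

Converting a phoneme sequence to visemes is then a direct dictionary lookup, e.g. the phonemes of "tea" (t, iy) map to viseme classes b and 5.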
As shown in Table 1, the present invention adopts 16 clearly distinguishable basic viseme classes, corresponding to the 16 basic visemes shown in Fig. 7. Based on these 16 basic visemes, the phoneme-to-viseme translation table defined herein was refined through careful expert tuning.
Table 2: rules of phoneme-to-viseme mapping
As shown in Table 2, the translation table contains three mapping forms: a single phoneme mapped to a single viseme, multiple phonemes mapped to multiple visemes, and multiple phonemes mapped to a single viseme. As shown in the lower branch of Fig. 6, the system maps the phonemes to their corresponding visemes, obtaining a viseme sequence synchronized with the phonemes and completing the face synthesis.
Since visemes describe articulation, the present invention mainly implements mouth-shape animation. Based on the 16 different visemes, different mouth-shape animation rules are defined herein. Fig. 8 illustrates the mouth-shape animation principle. When the upper lip moves upward (the upward arrow in the figure), a sine curve C1 is used to fit the change of the upper lip: the height of the mouth midpoint is first computed as the height of the sine curve, and the positions of the other upper-lip points are then derived from the sine curve. The lower lip moves in the direction of the downward arrow and follows the same principle. When the mouth closes, curves C1 and C2 converge to the line segment C3, and the height of the sine curve is 0.
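The sine-curve fitting of the upper lip can be sketched as follows: a minimal illustration under the stated rule that the mouth-midpoint height sets the amplitude and the mouth corners stay fixed. The half-period-sine form and the function name are our assumptions.

```python
import math

def lip_offsets(mouth_points_x, x_left, x_right, mid_height):
    """Vertical offsets of upper-lip points fitted with a sine curve:
    zero at both mouth corners, mid_height at the mouth midpoint,
    other points following sin over the mouth span."""
    span = x_right - x_left
    return [mid_height * math.sin(math.pi * (x - x_left) / span)
            for x in mouth_points_x]
```

Sampling the corners and the midpoint of a mouth spanning x = 0..10 with amplitude 2 gives offsets 0, 2, 0, i.e. the curve C1 of Fig. 8 at its open position.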
For a convenient chat experience, the client of the present invention consists of two interfaces: the login interface and the dialog box. Fig. 10 shows the client login interface, which lets the user input a user name and a 2D face photo and interactively mark the ten feature points. After these operations, the client sends the user's information to the server. The user receives the online-user list passed back by the server, selects a user to chat with, and enters the client dialog box shown in Fig. 11. As the figure shows, the dialog box consists of a text input box, a chat message display box, 3D face frames for oneself and the other party, and controls such as a send button, a chat record button and a personalization button. After the user inputs text, the client sends it to the other party's client. At the same time, the speech synthesis unit 2 of this client parses the text into phonemes and then generates the voice stream and the corresponding viseme sequence. One thread controls the voice stream and another thread controls the animation generation; this multi-thread mechanism keeps the phonemes and visemes synchronized, so the animation sequence is output together with the voice, producing instant chat with text, voice and animation synchronized in real time.
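The two-thread voice/animation scheme can be illustrated with Python threads; the barrier start, the logging and the timing are our own simplifications of the synchronization mechanism described above.

```python
import threading
import time

def run_synchronised(visemes, frame_time=0.001, log=None):
    """Sketch of the two-thread scheme: one thread 'plays' the voice stream
    while another consumes the viseme sequence; a barrier makes both start
    together so audio and animation stay aligned."""
    if log is None:
        log = []
    start = threading.Barrier(2)

    def play_audio():
        start.wait()               # wait until both threads are ready
        log.append("audio:start")  # stand-in for streaming the voice

    def play_animation():
        start.wait()
        for v in visemes:          # render one viseme frame per tick
            log.append(f"frame:{v}")
            time.sleep(frame_time)

    threads = [threading.Thread(target=play_audio),
               threading.Thread(target=play_animation)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

A real client would replace the log entries with audio playback and OpenGL frame rendering, but the start-together/join structure is the same.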
To make chat more interesting and vivid, the present invention also provides a 3D face personalization module 3. As shown in Fig. 11, four personalized accessories are offered for the user to choose from; the four accessory icons at the top right of the interface represent a yellow checked cap, a red checked cap, black glasses and yellow glasses. When the user has not yet put on an accessory, clicking its icon on the client puts the corresponding accessory on the 3D face in both one's own and the other party's client. Clicking the icon of an accessory already worn takes it off the 3D faces in both clients. Fig. 12a shows the front view after putting on a cap, Fig. 12b the 3D face rotated by 30 degrees, Fig. 12c the front view after putting on glasses, and Fig. 12d the 3D face rotated by 30 degrees.
The server interface, shown in Fig. 13, consists of four controls: an online-user list, a user chat text box, an open-server button and a close-server button. The online-user list displays the users currently online, and the chat text box records the chat messages of all users. In the present invention, text communication between users is relayed through the server: text sent from one user to another first goes to the server, which then forwards the information to the intended recipient. The server also stores other user information, such as user names, 2D face photos and facial feature points. Because the server must handle various messages, the client adds a message type tag to the header of every message it sends, indicating whether the message is a connection request to the server, a message to another user, or another type; according to the tag, the server applies the corresponding processing and completes the corresponding operation. Clicking the open-server button makes the server open a port to wait for user connections and start its various services; clicking the close-server button makes the server close the port, disconnect, and exit the server interface.
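The message-header tagging can be sketched as a toy framing scheme; the tag names, the separator and the dispatch logic are hypothetical illustrations, not taken from the patent.

```python
def pack(msg_type, sender, body):
    """Frame a message with a type tag at the head, so the server can
    tell a connection request from a chat message without parsing the body."""
    return f"{msg_type}|{sender}|{body}"

def dispatch(raw):
    """Server-side routing keyed on the header tag."""
    msg_type, sender, body = raw.split("|", 2)
    if msg_type == "CONN":
        return ("register", sender)        # connection request: register the user
    if msg_type == "CHAT":
        return ("relay", sender, body)     # chat text: forward to the recipient
    return ("unknown", sender)
```

Splitting on at most two separators keeps the chat body intact even if it contains the separator character itself.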
The 3D communication system of the present invention is used as follows. The user first opens the client, and a login dialog box is displayed, as shown in Figure 14a. The user enters a user name and a 2D face photo in the login interface, as shown in Figure 14b, and connects to the server end, at which point the client sends the user's information to the server end. The server end receives and stores the user information and returns the online user list, from which the user selects a chat partner, as shown in Figure 14c. The user can then enter the client's dialog box and chat with the online user, as shown in Figures 14d and 14e. As shown in Figure 14d, the user first enters chat text in the input box and clicks the send button; the client sends the chat text to the other party's client and updates the chat record (chat text, sender's user name, sending time) in the chat message display box. At the same time, the client generates the voice and face animation corresponding to the text through the speech synthesis unit 2. When the user receives chat text from the other party, the client updates the other party's chat text in the chat message display box and generates the voice and the other party's face animation through the speech synthesis unit 2. Figure 14d shows the state of the user's own client during the chat, and Figure 14e shows the state of the other party's client. The invention provides 4 personalized accessories, a yellow plaid cap, a red plaid cap, black glasses and yellow glasses, for the user to select; the user may click any one or more accessory icons to dress up the 3D face, which is displayed simultaneously in both clients. To remove an accessory, the user only needs to click the icon of the accessory already worn. Figures 14d and 14e show a chat in which one side wears the yellow plaid cap and black glasses while the other side wears the red plaid cap and yellow glasses.
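The speech-synthesis flow used in the chat step above (input text decomposed into phonemes, each phoneme yielding an audio frame and a synchronized viseme driving the 3D face animation) can be sketched minimally. The phoneme-to-viseme table, the per-phoneme frame duration and all names here are tiny illustrative stand-ins, not the patent's actual data or algorithm.

```python
# Minimal sketch: text -> phonemes -> (voice stream, synchronized viseme track).
PHONEME_TO_VISEME = {"h": "closed", "e": "wide", "l": "tongue_up", "o": "round"}

def text_to_phonemes(text):
    # Stand-in decomposition: one phoneme per letter known to the table.
    return [ch for ch in text.lower() if ch in PHONEME_TO_VISEME]

def synthesize(text, frame_ms=80):
    phonemes = text_to_phonemes(text)
    voice_stream = []   # (start_ms, phoneme) pairs standing in for audio frames
    viseme_track = []   # (start_ms, viseme) pairs driving the face animation
    for i, ph in enumerate(phonemes):
        t = i * frame_ms
        voice_stream.append((t, ph))
        viseme_track.append((t, PHONEME_TO_VISEME[ph]))
    return voice_stream, viseme_track
```

Emitting the two tracks from the same timestamp loop is what keeps the voice stream and the face animation synchronized, as claim 4's viseme sequence requires.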
The protection scope of the present invention is not limited to the above embodiments. Changes and advantages that those skilled in the art can conceive without departing from the spirit and scope of the inventive concept are all included in the present invention, and the protection scope is defined by the appended claims.

Claims (6)

1. A 3D instant messaging system, characterized in that it comprises: a client and a server end;
the client is configured to handle user login and the input and output of information; a face synthesis unit (1) and a speech synthesis unit (2) are disposed in the client; wherein,
the face synthesis unit (1) comprises: a face feature extraction device (11), a model nesting device (12) and a texture mapping device (13); the face feature extraction device (11) is configured to extract face features from a 2D face photo; the model nesting device (12) is configured to calibrate a 3D face mesh model according to the extracted face features, obtaining the texture coordinates of the 3D face mesh model; the texture mapping device (13) maps the 2D face photo back onto the 3D face mesh according to the texture coordinates to form a 3D face;
the speech synthesis unit (2) is configured to generate a voice stream and a 3D face animation from the 3D face and the input text, and to output them to the server end;
the server end is configured to implement the information interaction between the clients.
2. The 3D instant messaging system as claimed in claim 1, characterized in that the face feature extraction device (11) extracts the face features interactively.
3. The 3D instant messaging system as claimed in claim 1, characterized in that the model nesting device (12) comprises: a pose estimation module (121), a global calibration module (122), a local alignment module (123) and a boundary alignment module (124); wherein,
the pose estimation module (121) estimates 3D information from the 2D face photo according to the face features;
the global calibration module (122) projects the 3D face mesh model onto the 2D face photo according to the 3D information;
the local alignment module (123) is configured to match the facial features of the 3D face model to those in the 2D face photo;
the boundary alignment module (124) pulls the boundary of the 3D face model to the boundary of the 2D face photo using a spring model algorithm.
4. The 3D instant messaging system as claimed in claim 1, characterized in that the speech synthesis unit (2) comprises: a text decomposition module (21), a visual text-to-speech module (22) and an animation synthesis module (23); wherein,
the text decomposition module (21) decomposes the input text into phonemes;
the visual text-to-speech module (22) converts the phonemes into a voice stream and a viseme sequence synchronized with the voice stream;
the animation synthesis module (23) generates the 3D face animation according to the viseme sequence and outputs it synchronously with the voice stream.
5. The 3D instant messaging system as claimed in claim 1, characterized in that the client further comprises a 3D face personalization module (3), the 3D face personalization module (3) being configured to decorate the 3D face.
6. A 3D instant messaging method, characterized in that it is applied to the 3D instant messaging system as claimed in claim 1 and comprises the following steps:
Step 1: logging in to the client and inputting the user information and a 2D face photo into the client;
Step 2: extracting, by the face feature extraction device (11), face features from the 2D face photo;
Step 3: projecting, by the model nesting device (12), the 3D face mesh model onto the 2D face photo according to the extracted face features, obtaining the texture coordinates of the 3D face mesh model;
Step 4: mapping, by the texture mapping device (13), the 2D face photo back onto the 3D face mesh according to the texture coordinates to form a 3D face;
Step 5: selecting, in the client, the friend to communicate with and inputting text, the speech synthesis unit (2) generating a voice stream and a 3D face animation from the 3D face and the input text and outputting them to the server end;
Step 6: sending, by the server end, the voice stream and the 3D face animation to the corresponding client according to the selected friend, thereby realizing 3D instant messaging.
CN201510215785.9A 2015-04-29 2015-04-29 3D instant messaging system and messaging method Pending CN104835190A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510215785.9A CN104835190A (en) 2015-04-29 2015-04-29 3D instant messaging system and messaging method


Publications (1)

Publication Number Publication Date
CN104835190A true CN104835190A (en) 2015-08-12

Family

ID=53813055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510215785.9A Pending CN104835190A (en) 2015-04-29 2015-04-29 3D instant messaging system and messaging method

Country Status (1)

Country Link
CN (1) CN104835190A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1427626A (en) * 2001-12-20 2003-07-02 松下电器产业株式会社 Virtual television telephone device
CN101120348A (en) * 2005-02-15 2008-02-06 Sk电信有限公司 Method and system for providing news information by using three dimensional character for use in wireless communication network
KR20120137826A (en) * 2011-06-13 2012-12-24 한국과학기술원 Retargeting method for characteristic facial and recording medium for the same
CN102426712A (en) * 2011-11-03 2012-04-25 中国科学院自动化研究所 Three-dimensional head modeling method based on two images
CN102999942A (en) * 2012-12-13 2013-03-27 清华大学 Three-dimensional face reconstruction method
CN103235943A (en) * 2013-05-13 2013-08-07 苏州福丰科技有限公司 Principal component analysis-based (PCA-based) three-dimensional (3D) face recognition system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
G. C. Feng et al.: "Virtual View Face Image Synthesis Using a 3D Spring-Based Face Model from a Single Image", Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition *
Mario Rincon-Nigro et al.: "A Text-Driven Conversational Avatar Interface for Instant Messaging on Mobile Devices", IEEE Transactions on Human-Machine Systems *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106411695A (en) * 2016-08-29 2017-02-15 广州华多网络科技有限公司 User characteristic information area pendant dynamic updating method and device and smart terminal
WO2018049979A1 (en) * 2016-09-14 2018-03-22 厦门幻世网络科技有限公司 Animation synthesis method and device
CN107205083A (en) * 2017-05-11 2017-09-26 腾讯科技(深圳)有限公司 Information displaying method and device
CN108880975A (en) * 2017-05-16 2018-11-23 腾讯科技(深圳)有限公司 Information display method, apparatus and system
CN108880975B (en) * 2017-05-16 2020-11-10 腾讯科技(深圳)有限公司 Information display method, device and system
CN108022172A (en) * 2017-11-30 2018-05-11 广州星天空信息科技有限公司 Virtual social method and system based on threedimensional model
CN111865771B (en) * 2018-08-08 2023-01-20 创新先进技术有限公司 Message sending method and device and electronic equipment
CN109274575A (en) * 2018-08-08 2019-01-25 阿里巴巴集团控股有限公司 Message method and device and electronic equipment
CN111865771A (en) * 2018-08-08 2020-10-30 创新先进技术有限公司 Message sending method and device and electronic equipment
CN111294665A (en) * 2020-02-12 2020-06-16 百度在线网络技术(北京)有限公司 Video generation method and device, electronic equipment and readable storage medium
CN112541957A (en) * 2020-12-09 2021-03-23 北京百度网讯科技有限公司 Animation generation method, animation generation device, electronic equipment and computer readable medium
US11948236B2 (en) 2020-12-09 2024-04-02 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for generating animation, electronic device, and computer readable medium
CN112541957B (en) * 2020-12-09 2024-05-21 北京百度网讯科技有限公司 Animation generation method, device, electronic equipment and computer readable medium
TWI824883B (en) * 2022-12-14 2023-12-01 輔仁大學學校財團法人輔仁大學 A virtual reality interactive system that uses virtual reality to simulate expressions and emotions for training

Similar Documents

Publication Publication Date Title
CN104835190A (en) 3D instant messaging system and messaging method
US11670033B1 (en) Generating a background that allows a first avatar to take part in an activity with a second avatar
EP3370208B1 (en) Virtual reality-based apparatus and method to generate a three dimensional (3d) human face model using image and depth data
US11736756B2 (en) Producing realistic body movement using body images
JP4449723B2 (en) Image processing apparatus, image processing method, and program
CN110503703A (en) Method and apparatus for generating image
KR101743763B1 (en) Method for providng smart learning education based on sensitivity avatar emoticon, and smart learning education device for the same
CN110286756A (en) Method for processing video frequency, device, system, terminal device and storage medium
CN110163054A (en) A kind of face three-dimensional image generating method and device
US20140085293A1 (en) Method of creating avatar from user submitted image
KR20220005424A (en) Method and apparatus for creating a virtual character, electronic equipment, computer readable storage medium and computer program
KR101743764B1 (en) Method for providing ultra light-weight data animation type based on sensitivity avatar emoticon
US20020194006A1 (en) Text to visual speech system and method incorporating facial emotions
US11798238B2 (en) Blending body mesh into external mesh
KR102491140B1 (en) Method and apparatus for generating virtual avatar
CN108090940A (en) Text based video generates
US20160004905A1 (en) Method and system for facial expression transfer
CN107333086A (en) A kind of method and device that video communication is carried out in virtual scene
CN115049016B (en) Model driving method and device based on emotion recognition
KR102334705B1 (en) Apparatus and method for drawing webtoon
CN107146275B (en) Method and device for setting virtual image
KR20220006022A (en) Slider block processing method and apparatus for virtual characters, electronic equipment, computer readable storage medium and computer program
KR20160010810A (en) Realistic character creation method and creating system capable of providing real voice
EP4152269B1 (en) Method and apparatus of training model, device, and medium
KR102388773B1 (en) Method for three dimensions modeling service and Apparatus therefor

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150812