CN108174123A - Data processing method, apparatus and system - Google Patents
- Publication number
- CN108174123A CN108174123A CN201711443989.3A CN201711443989A CN108174123A CN 108174123 A CN108174123 A CN 108174123A CN 201711443989 A CN201711443989 A CN 201711443989A CN 108174123 A CN108174123 A CN 108174123A
- Authority
- CN
- China
- Prior art keywords
- lip
- user
- data
- image
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/265—Mixing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Abstract
This application provides a data processing method, apparatus and system, including: obtaining user voice data and user text data, where the user voice data corresponds to the user text data; determining a lip image set corresponding to the user text data; adjusting the lip image set to obtain a lip image set corresponding to a facial image, and synthesizing lip video data corresponding to the facial image; and synthesizing the user voice data and the lip video data to obtain user video data. Based on the user voice data and combined with a facial image, the application can display the voice data on the facial image, achieving the effect of presenting user voice data with a facial image. This enriches the exchange modes of instant messaging applications.
Description
Technical field
This application relates to the field of communication technology, and in particular to a data processing method, apparatus and system.
Background technology
On today's increasingly developed Internet, some social applications can send messages by voice. However, a speech message has a relatively single presentation form and a poor interaction effect.
Summary of the invention
In view of this, the application provides a data processing method, apparatus and system that can enrich the exchange modes of instant messaging applications.
To achieve these goals, this application provides following technical characteristics:
A data processing method, including:
Obtaining user voice data and user text data, where the user voice data corresponds to the user text data;
Determining a lip image set corresponding to the user text data;
Adjusting the lip image set to obtain a lip image set corresponding to a facial image, and synthesizing lip video data corresponding to the facial image; and
Synthesizing the user voice data and the lip video data to obtain user video data.
Optionally, obtaining the user voice data and the user text data includes:
Obtaining the user text data in response to text data input by the user, and converting the text data into voice data to obtain the user voice data; or
Obtaining the user voice data in response to voice data input by the user, and converting the voice data into text data to obtain the user text data.
Optionally, determining the lip image set corresponding to the user text data includes:
Performing semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items;
Determining multiple lip images corresponding to the multiple segments respectively;
Adjusting the corresponding lip images based on the segment attribute information; and
Composing the adjusted lip images into the lip image set.
Optionally, determining the multiple lip images corresponding to the multiple segments respectively includes any of the following:
Among multiple lip images classified by final (yunmu), determining the lip image corresponding to the final of a segment;
Among multiple lip images classified by initial (shengmu) and final, determining the lip image corresponding to the initial and final of a segment; or
Inputting the initial and final into a lip image model, and obtaining the lip image output by the lip image model.
Optionally, adjusting the lip image set to obtain the lip image set corresponding to the facial image includes:
Adjusting the lip features in the facial image so that they match the lip features in the lip images; and
Determining the several adjusted facial images as the lip image set corresponding to the facial image.
Optionally, synthesizing the user voice data and the lip video data to obtain the user video data includes:
Determining coding parameters of the user voice data to obtain an encoded audio file;
Determining coding parameters of the lip video data to obtain an encoded video file; and
Performing audio-video synchronization on the encoded audio file and the encoded video file to obtain the user video data.
A data processing apparatus, including:
a data obtaining unit, configured to obtain user voice data and user text data, where the user voice data corresponds to the user text data;
an image set determining unit, configured to determine a lip image set corresponding to the user text data;
an adjustment unit, configured to adjust the lip image set to obtain a lip image set corresponding to a facial image, and synthesize lip video data corresponding to the facial image; and
a synthesis unit, configured to synthesize the user voice data and the lip video data to obtain user video data.
Optionally, the image set determining unit includes:
a segmentation unit, configured to perform semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items;
a lip image determining unit, configured to determine multiple lip images corresponding to the multiple segments respectively;
a lip image adjusting unit, configured to adjust the corresponding lip images based on the segment attribute information; and
a composing unit, configured to compose the adjusted lip images into the lip image set.
Optionally, the adjustment unit includes:
an adjusting subunit, configured to adjust the lip features in the facial image so that they match the lip features in the lip images; and
a determining subunit, configured to determine the several adjusted facial images as the lip image set corresponding to the facial image.
A data processing system, including:
a sending terminal, configured to determine the facial image to use and send the facial image to a server, and to send user voice data or user text data to the server;
the server, configured to receive and store the facial image; obtain the user voice data and the user text data, where the user voice data corresponds to the user text data; determine a lip image set corresponding to the user text data; adjust the lip image set to obtain a lip image set corresponding to the facial image, and synthesize lip video data corresponding to the facial image; synthesize the user voice data and the lip video data to obtain user video data; and send the user video data to a receiving terminal; and
the receiving terminal, configured to receive and display the user video data.
Through the above technical means, the following advantageous effects can be achieved:
Based on the user voice data and combined with a facial image, the application can display the voice data on the facial image, achieving the effect of presenting user voice data with a facial image. This enriches the exchange modes of instant messaging applications.
Description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative efforts.
Fig. 1a is a structural diagram of a data processing system disclosed in an embodiment of this application;
Fig. 1b is a flow chart of a data processing method disclosed in an embodiment of this application;
Fig. 2 is a flow chart of a data processing method disclosed in an embodiment of this application;
Fig. 3 is a schematic diagram of some lip shapes classified by final, disclosed in an embodiment of this application;
Figs. 4a-4c are schematic diagrams of some lip shapes disclosed in an embodiment of this application;
Fig. 5 is a schematic diagram of some lip feature points disclosed in an embodiment of this application;
Fig. 6 is a structural diagram of a data processing apparatus disclosed in an embodiment of this application.
Specific embodiment
The technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are merely some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative efforts shall fall within the protection scope of this application.
At present, to diversify the exchange modes in instant messaging applications, this application provides a scheme for presenting voice data as video and a scheme for presenting text data as video.
According to an embodiment of this application, a data processing method is provided. Referring to Fig. 1a, the system includes: a sending terminal 100, a server 200 and a receiving terminal 300.
The specific implementation of the data processing method is described below. Referring to Fig. 1b, it includes the following steps:
Step S101: The sending terminal 100 determines the facial image to use, and sends the facial image to the server 200.
Step S102: The sending terminal 100 sends user voice data or user text data to the server 200.
Step S103: The server 200 receives the user voice data or the user text data, and obtains both the user voice data and the user text data; the user voice data corresponds to the user text data.
When what the sending terminal 100 sends is user text data, the server 200 obtains the user text data in response to the text data input by the user, and then converts the text data into voice data to obtain the user voice data. Converting text data into voice data is a mature technology, and details are not described here.
When what the sending terminal 100 sends is user voice data, the server 200 obtains the user voice data in response to the voice data input by the user, and then converts the voice data into text data to obtain the user text data. Converting voice data into text data is also a mature technology, and details are not described here.
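The server-side dispatch of step S103 can be sketched as follows. This is a minimal illustration only: `text_to_speech` and `speech_to_text` are hypothetical placeholders standing in for whatever mature TTS/ASR engine the server actually uses.

```python
def text_to_speech(text: str) -> bytes:
    # Placeholder: a real implementation would call a TTS engine.
    return f"<audio:{text}>".encode("utf-8")

def speech_to_text(audio: bytes) -> str:
    # Placeholder: a real implementation would call an ASR engine.
    return audio.decode("utf-8").removeprefix("<audio:").removesuffix(">")

def obtain_both(payload, kind: str):
    """Given either user text data or user voice data, return the pair
    (voice_data, text_data) so that the two always correspond."""
    if kind == "text":
        return text_to_speech(payload), payload
    elif kind == "voice":
        return payload, speech_to_text(payload)
    raise ValueError(f"unknown payload kind: {kind}")
```

Whichever form the sending terminal uploads, the server ends up holding both corresponding representations.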
Step S104: The server 200 determines a lip image set corresponding to the user text data.
Referring to Fig. 2, this step specifically includes:
Step S201: Perform semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items.
The text data is segmented into multiple segments according to its language category. For example, take the user text data "Hello大家好": it is first determined that the text data contains two language categories, English and Chinese. The English part is segmented in the English manner, e.g., each word is one segment; the Chinese part is segmented in the Chinese manner, e.g., each Chinese character is one segment. Segmenting the user text data then yields: "Hello", "大", "家", "好".
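The bilingual segmentation described above can be sketched minimally: runs of ASCII letters form one English token, and each CJK character becomes its own token. This is a simplification for illustration; real Chinese word segmentation is more involved.

```python
def segment(text: str) -> list[str]:
    tokens, word = [], ""
    for ch in text:
        if ch.isascii() and ch.isalpha():
            word += ch                      # extend the current English word
        else:
            if word:
                tokens.append(word)         # flush the pending English word
                word = ""
            if "\u4e00" <= ch <= "\u9fff":  # CJK unified ideograph range
                tokens.append(ch)           # one Chinese character per segment
            # other characters (punctuation, spaces) are dropped
    if word:
        tokens.append(word)
    return tokens
```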
Step S202: Determine multiple lip images corresponding to the multiple segments respectively.
This step can be realized in three ways:
First realization: classification by final (yunmu).
Analysis of a large amount of lip-shape data shows that the lip shape depends mainly on the final of the segment (for example, a, ang, ao). Therefore, multiple lip-shape classes, and the lip image corresponding to each class, can be defined based on the final. Fig. 3 illustrates some lip shapes classified by final.
Thus, after segmentation, the lip image corresponding to a segment's final can be looked up. For example, for "大" (da), the final is "a", so the lip image corresponding to the final "a" is looked up.
Second realization: classification by initial (shengmu) and final.
The lip shape depends mainly on the final of a segment, but the initial also makes some difference, so the lip image can be determined jointly by the segment's initial and final.
Thus, after segmentation, the lip image corresponding to the combination of initial and final can be looked up. For example, for "大" (da), the initial is "d" and the final is "a", so the lip image corresponding to "da" is looked up.
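The initial/final split used by the first two lookup approaches can be sketched as below. The `INITIALS` list is the standard pinyin initial inventory; the lip-image table is a hypothetical mapping from final to an image identifier, since the patent does not specify storage.

```python
# Two-letter initials must be tried before their one-letter prefixes.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

def split_pinyin(syllable: str) -> tuple[str, str]:
    """Split a pinyin syllable into (initial, final); initial may be ''."""
    for ini in INITIALS:
        if syllable.startswith(ini):
            return ini, syllable[len(ini):]
    return "", syllable          # the syllable begins with its final

# Hypothetical final -> lip image mapping (first realization).
LIP_BY_FINAL = {"a": "lip_a.png", "ang": "lip_ang.png", "ao": "lip_ao.png"}

def lip_for(syllable: str) -> str:
    _, final = split_pinyin(syllable)
    return LIP_BY_FINAL[final]
```

The second realization would key the table on the full `(initial, final)` pair instead of the final alone.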
Third realization: determining the lip image with a lip image model.
A lip image model is trained in advance on initials and finals: an existing trainable model is trained on the initials, finals and lip-shape data of a large number of words, and the lip image model is obtained once training completes.
Therefore, the initial and final of a segment can be obtained by analysis and input into the lip image model; after the model computes, the lip image corresponding to the segment is obtained.
Figs. 4a-4c show the lip images of "大", "家" and "好" respectively.
Step S203: Adjust the corresponding lip images based on the segment attribute information.
The attribute information of a segment can include attributes such as its emotion information and volume. Taking emotion as an example, different emotions correspond to different lip images: the lip shape of saying "大家好" when the emotion is happy differs from that when the emotion is furious.
In addition, the opening of the lips grows as the volume rises and shrinks as the volume falls, so the degree of lip opening can also be adjusted based on the volume of the speech.
A large number of lip-shape samples and their attribute information can be collected in advance, and a model trained with the attribute information of the samples as input and the lip images as output. After training, the model can output the lip image corresponding to given attribute information.
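The volume-driven part of this adjustment can be sketched geometrically: scale the vertical spread of the inner-lip landmarks (n1-n8 in Fig. 5) around their centroid. The landmark layout and the linear scaling rule are illustrative assumptions, not the patent's prescribed formula.

```python
def adjust_opening(inner_lip: list[tuple[float, float]],
                   volume: float, base_volume: float = 1.0):
    """Widen or narrow the mouth opening proportionally to volume."""
    scale = volume / base_volume
    # Vertical centroid of the inner-lip landmarks.
    cy = sum(y for _, y in inner_lip) / len(inner_lip)
    # Move each point away from (or toward) the centroid vertically.
    return [(x, cy + (y - cy) * scale) for x, y in inner_lip]
```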
Step S204: Compose the adjusted lip images into the lip image set.
The above process is performed for each segment, yielding a lip image for each of the multiple segments. Segmenting the user text data produces the segments in their order of appearance in the text; following that order, the sequence of the corresponding lip images is determined, and the multiple ordered lip images are taken as the lip image set.
Returning to Fig. 1b, the method proceeds to step S105: adjust the lip image set to obtain a lip image set corresponding to the facial image, and synthesize the lip video data corresponding to the facial image.
The sending terminal uploads the facial image to the server in advance, so the server obtains the facial image corresponding to the sending terminal 100. The facial image contains a lip region.
This step is illustrated below with "大家好" as an example.
First process: obtain the lip image for "大".
Step A: Recognize the facial image and determine lip feature matrix 1.
Referring to Fig. 5, the lips have many feature points: outer lip feature points m1-m10 and inner lip feature points n1-n8. The feature points can be composed into a feature matrix in a definite way; the specific matrix construction depends on the actual algorithm and is not detailed here.
Step B: Recognize the lip image corresponding to "大" in the lip image set and determine lip feature matrix 2.
Step C: Determine transformation matrix 1 between lip feature matrix 1 and lip feature matrix 2.
Step D: Determine the product of lip feature matrix 1 and transformation matrix 1 (product 1) as facial image 1 bearing lip image 1.
Second process: obtain the lip image for "家" on the basis of the lip shape of "大".
Step A: Take the product of lip feature matrix 1 and transformation matrix 1 (product 1) as lip feature matrix 3.
Step B: Recognize the lip image corresponding to "家" in the lip image set and determine lip feature matrix 4.
Step C: Determine transformation matrix 2 between lip feature matrix 3 and lip feature matrix 4.
Step D: Determine the product of lip feature matrix 3 and transformation matrix 2 (product 2) as facial image 2 bearing lip image 2.
Third process: obtain the lip image for "好" on the basis of the lip shape of "家".
Step A: Take the product of lip feature matrix 3 and transformation matrix 2 (product 2) as lip feature matrix 5.
Step B: Recognize the lip image corresponding to "好" in the lip image set and determine lip feature matrix 6.
Step C: Determine transformation matrix 3 between lip feature matrix 5 and lip feature matrix 6.
Step D: Determine the product of lip feature matrix 5 and transformation matrix 3 (product 3) as facial image 3 bearing lip image 3.
Facial image 1 bearing lip image 1, facial image 2 bearing lip image 2 and facial image 3 bearing lip image 3 are determined as the lip image set corresponding to the facial image. The several facial images are then composed into a video, yielding the lip video data corresponding to the facial image.
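The three processes above can be sketched numerically. Assuming each lip feature matrix is an N×2 array of landmark coordinates (Fig. 5) and the transform between two matrices is a 2×2 linear map recovered by least squares so that A @ T ≈ B, the per-word chaining looks like this. The 2×2 linear model is an illustrative assumption; the patent does not fix the exact matrix construction.

```python
import numpy as np

def transform_between(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Least-squares transform T such that a @ T approximates b."""
    t, *_ = np.linalg.lstsq(a, b, rcond=None)
    return t

def chain(face_lip: np.ndarray, targets: list[np.ndarray]) -> list[np.ndarray]:
    """Walk the per-word lip matrices: each product (the adjusted lips)
    becomes the starting matrix for the next word, as in processes 1-3."""
    frames, current = [], face_lip
    for target in targets:
        current = current @ transform_between(current, target)
        frames.append(current)
    return frames
```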
Step S106: The server 200 synthesizes the user voice data and the lip video data to obtain the user video data.
The server 200 determines the coding parameters of the user voice data to obtain an encoded audio file; determines the coding parameters of the lip video data to obtain an encoded video file; and performs audio-video synchronization on the encoded audio file and the encoded video file to obtain the user video data.
For example, the lip video data can be encoded with H.264 at a frame rate of 30 frames per second, and the audio with AAC using 1 channel and a sample rate of 44100 Hz, finally synthesized into MP4 format.
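The encode-and-mux step with the parameters named above could be expressed as an ffmpeg invocation, sketched here as command construction only. The file names are placeholders, and real code would execute the list with `subprocess.run`; the patent itself does not mandate ffmpeg.

```python
def build_mux_command(audio_in: str, frames_in: str, out_mp4: str) -> list[str]:
    return [
        "ffmpeg",
        "-i", frames_in,     # lip video stream
        "-i", audio_in,      # user voice stream
        "-c:v", "libx264",   # H.264 video codec
        "-r", "30",          # 30 frames per second
        "-c:a", "aac",       # AAC audio codec
        "-ac", "1",          # 1 audio channel
        "-ar", "44100",      # 44100 Hz sample rate
        "-shortest",         # trim so audio and video lengths stay aligned
        out_mp4,             # MP4 container output
    ]
```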
Step S107: The server 200 sends the user video data to the receiving terminal 300.
Through the above technical means, the application can achieve the following advantageous effects:
Based on the user voice data and combined with a facial image, the application can display the voice data on the facial image, achieving the effect of presenting user voice data with a facial image. This enriches the exchange modes of instant messaging applications.
Referring to Fig. 6, this application provides a data processing apparatus, including:
a data obtaining unit 31, configured to obtain user voice data and user text data, where the user voice data corresponds to the user text data;
an image set determining unit 32, configured to determine a lip image set corresponding to the user text data;
an adjustment unit 33, configured to adjust the lip image set to obtain a lip image set corresponding to a facial image, and synthesize lip video data corresponding to the facial image; and
a synthesis unit 34, configured to synthesize the user voice data and the lip video data to obtain user video data.
The image set determining unit 32 includes:
a segmentation unit 321, configured to perform semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items;
a lip image determining unit 322, configured to determine multiple lip images corresponding to the multiple segments respectively;
a lip image adjusting unit 323, configured to adjust the corresponding lip images based on the segment attribute information; and
a composing unit 324, configured to compose the adjusted lip images into the lip image set.
The adjustment unit 33 includes:
a lip adjusting unit 331, configured to adjust the lip features in the facial image so that they match the lip features in the lip images; and
a determining unit 332, configured to determine the several adjusted facial images as the lip image set corresponding to the facial image.
For the specific content of the above solution, refer to the embodiment shown in Fig. 1b; details are not repeated here.
Through the above technical means, the application can achieve the following advantageous effects:
Based on the user voice data and combined with a facial image, the application can display the voice data on the facial image, achieving the effect of presenting user voice data with a facial image. This enriches the exchange modes of instant messaging applications.
If the functions described in the method of this embodiment are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computing-device-readable storage medium. Based on such an understanding, the part of this embodiment that contributes to the prior art, or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to perform all or some of the steps of the methods in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may refer to one another.
The foregoing description of the disclosed embodiments enables a person skilled in the art to implement or use this application. Various modifications to these embodiments will be apparent to a person skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A data processing method, characterized by including:
obtaining user voice data and user text data, where the user voice data corresponds to the user text data;
determining a lip image set corresponding to the user text data;
adjusting the lip image set to obtain a lip image set corresponding to a facial image, and synthesizing lip video data corresponding to the facial image; and
synthesizing the user voice data and the lip video data to obtain user video data.
2. The method according to claim 1, characterized in that obtaining the user voice data and the user text data includes:
obtaining the user text data in response to text data input by a user, and converting the text data into voice data to obtain the user voice data; or
obtaining the user voice data in response to voice data input by a user, and converting the voice data into text data to obtain the user text data.
3. The method according to claim 1, characterized in that determining the lip image set corresponding to the user text data includes:
performing semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items;
determining multiple lip images corresponding to the multiple segments respectively;
adjusting the corresponding lip images based on the segment attribute information; and
composing the adjusted lip images into the lip image set.
4. The method according to claim 1, characterized in that determining the multiple lip images corresponding to the multiple segments respectively includes any of the following:
among multiple lip images classified by final, determining the lip image corresponding to the final of a segment;
among multiple lip images classified by initial and final, determining the lip image corresponding to the initial and final of a segment; or
inputting the initial and final into a lip image model, and obtaining the lip image output by the lip image model.
5. The method according to claim 1, characterized in that adjusting the lip image set to obtain the lip image set corresponding to the facial image includes:
adjusting the lip features in the facial image so that they match the lip features in the lip images; and
determining the several adjusted facial images as the lip image set corresponding to the facial image.
6. The method according to claim 1, characterized in that synthesizing the user voice data and the lip video data to obtain the user video data includes:
determining coding parameters of the user voice data to obtain an encoded audio file;
determining coding parameters of the lip video data to obtain an encoded video file; and
performing audio-video synchronization on the encoded audio file and the encoded video file to obtain the user video data.
7. A data processing apparatus, characterized by including:
a data obtaining unit, configured to obtain user voice data and user text data, where the user voice data corresponds to the user text data;
an image set determining unit, configured to determine a lip image set corresponding to the user text data;
an adjustment unit, configured to adjust the lip image set to obtain a lip image set corresponding to a facial image, and synthesize lip video data corresponding to the facial image; and
a synthesis unit, configured to synthesize the user voice data and the lip video data to obtain user video data.
8. The apparatus according to claim 7, characterized in that the image set determining unit includes:
a segmentation unit, configured to perform semantic analysis and word segmentation on the user text data to obtain multiple segments and multiple corresponding segment attribute information items;
a lip image determining unit, configured to determine multiple lip images corresponding to the multiple segments respectively;
a lip image adjusting unit, configured to adjust the corresponding lip images based on the segment attribute information; and
a composing unit, configured to compose the adjusted lip images into the lip image set.
9. The apparatus according to claim 7, characterized in that the adjustment unit includes:
an adjusting subunit, configured to adjust the lip features in the facial image so that they match the lip features in the lip images; and
a determining subunit, configured to determine the several adjusted facial images as the lip image set corresponding to the facial image.
10. A data processing system, characterized by including:
a sending terminal, configured to determine the facial image to use and send the facial image to a server, and to send user voice data or user text data to the server;
the server, configured to receive and store the facial image; obtain the user voice data and the user text data, where the user voice data corresponds to the user text data; determine a lip image set corresponding to the user text data; adjust the lip image set to obtain a lip image set corresponding to the facial image, and synthesize lip video data corresponding to the facial image; synthesize the user voice data and the lip video data to obtain user video data; and send the user video data to a receiving terminal; and
the receiving terminal, configured to receive and display the user video data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711443989.3A CN108174123A (en) | 2017-12-27 | 2017-12-27 | Data processing method, apparatus and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711443989.3A CN108174123A (en) | 2017-12-27 | 2017-12-27 | Data processing method, apparatus and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108174123A true CN108174123A (en) | 2018-06-15 |
Family
ID=62518236
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711443989.3A Pending CN108174123A (en) | 2017-12-27 | 2017-12-27 | Data processing method, apparatus and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108174123A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112218080A (en) * | 2019-07-12 | 2021-01-12 | 北京新唐思创教育科技有限公司 | Image processing method, device, equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005128177A (en) * | 2003-10-22 | 2005-05-19 | Ace:Kk | Pronunciation learning support method, learner's terminal, processing program, and recording medium with the program stored thereto |
CN101482975A (en) * | 2008-01-07 | 2009-07-15 | 丰达软件(苏州)有限公司 | Method and apparatus for converting words into animation |
CN101751692A (en) * | 2009-12-24 | 2010-06-23 | 四川大学 | Method for voice-driven lip animation |
CN104756188A (en) * | 2012-09-18 | 2015-07-01 | 金详哲 | Device and method for changing shape of lips on basis of automatic word translation |
CN106875947A (en) * | 2016-12-28 | 2017-06-20 | 北京光年无限科技有限公司 | For the speech output method and device of intelligent robot |
CN107204027A (en) * | 2016-03-16 | 2017-09-26 | 卡西欧计算机株式会社 | Image processing apparatus, display device, animation producing method and cartoon display method |
CN107330961A (en) * | 2017-07-10 | 2017-11-07 | 湖北燿影科技有限公司 | A kind of audio-visual conversion method of word and system |
- 2017-12-27: CN CN201711443989.3A patent/CN108174123A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Czyzewski et al. | An audio-visual corpus for multimodal automatic speech recognition | |
Nguyen et al. | Generative spoken dialogue language modeling | |
Cao et al. | Expressive speech-driven facial animation | |
US9060095B2 (en) | Modifying an appearance of a participant during a video conference | |
CN104732593B (en) | A kind of 3D animation editing methods based on mobile terminal | |
WO2018108013A1 (en) | Medium displaying method and terminal | |
CN109218629B (en) | Video generation method, storage medium and device | |
CN111161739B (en) | Speech recognition method and related product | |
US20110222782A1 (en) | Information processing apparatus, information processing method, and program | |
CN103024530A (en) | Intelligent television voice response system and method | |
CN104598644A (en) | User fond label mining method and device | |
US20170270701A1 (en) | Image processing device, animation display method and computer readable medium | |
CN107274903A (en) | Text handling method and device, the device for text-processing | |
CN110784662A (en) | Method, system, device and storage medium for replacing video background | |
CN111524045A (en) | Dictation method and device | |
CN106708789A (en) | Text processing method and device | |
CN108174123A (en) | Data processing method, apparatus and system | |
US20230326369A1 (en) | Method and apparatus for generating sign language video, computer device, and storage medium | |
Eyben et al. | Audiovisual vocal outburst classification in noisy acoustic conditions | |
US20220375223A1 (en) | Information generation method and apparatus | |
Liz-Lopez et al. | Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges | |
Deng et al. | Unsupervised audiovisual synthesis via exemplar autoencoders | |
CN111160051B (en) | Data processing method, device, electronic equipment and storage medium | |
Stappen et al. | MuSe 2020--The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop | |
CN114359450A (en) | Method and device for simulating virtual character speaking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-06-15