CN106447750A - Depth photo image reconstruction expression synchronization video generation method - Google Patents
- Publication number: CN106447750A
- Application number: CN201610867180.2A
- Authority: CN (China)
- Prior art keywords: generation method; video; expression; image reconstruction; video generation
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention discloses a depth photo image reconstruction expression synchronization video generation method and relates to the medical technical field. The method comprises a cloud database, a data processing module and a terminal device. The method combines existing mobile medical platform registration, information consultation, live broadcast, instant message, cloud computing and big data and the like, can integrate resources of hospitals and doctors in one region or a plurality of regions, and enables the doctors to improve medical levels more quickly, to cooperate mutually better and to provide better service for patients; and meanwhile, the method improves operation efficiency of a medical mechanism, lightens the medical burden of the patients and improves medical efficiency.
Description
Technical field
The present invention relates to the field of computer image processing, and in particular to a depth photo image reconstruction expression synchronization video generation method.
Background art
Video technology is the transmission of moving images. In the telecommunications field it is called a video service or videoconferencing service; in the computer field it is usually called multimedia communication, streaming media communication, and so on.
With the development of mobile Internet technology, more and more people interact over the Internet. In the early days of the Internet, people mostly communicated by typing text. With advances in communication technology and image synthesis technology, people are no longer satisfied with plain text and increasingly mix short videos, animated images, or animated emoticons into their conversations. At present, people can generally only shoot a video or pick a suitable emoticon from the emoticon library of the chat software. Multimedia content such as short videos, animated images, or emoticons that match a real facial expression cannot be pushed in real time according to the text actually entered, so original, personalized synthesized videos cannot be produced, and interactivity is weak.
Summary of the invention
The object of the invention is to provide a depth photo image reconstruction expression synchronization video generation method that addresses the above shortcomings of the prior art.
A depth photo image reconstruction expression synchronization video generation method comprises the following steps:
(1) using a cooperative-target approach, record a portrait video of a real person, capturing image sequences for the typical mouth shapes;
(2) extract the facial feature positions as matching features between images, so that the inter-frame movement of the facial features stays within a small range;
(3) preprocess the portraits of the typical mouth shapes and extract the facial feature positions;
(4) reconstruct the mouth shapes, deriving additional mouth-shape expressions from the real mouth-shape sequences to form an expression dictionary, which is stored in a database;
(5) identify the pronunciation of the input text and look it up in the expression dictionary;
(6) combine the corresponding animated images of the different mouth shapes in the order of the recognized input characters;
(7) interpolate and smooth the video sequence to form a dynamic video synchronized with the text.
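Steps (5) and (6) can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the pronunciation table, the frame file names, the `NEUTRAL` fallback, and the rule of keying each syllable by its last vowel are all assumptions introduced for the example.

```python
# Hypothetical expression dictionary: vowel sound -> captured mouth-shape frames.
# The file names are placeholders standing in for real image data.
EXPRESSION_DICT = {
    "a": ["a_0.png", "a_1.png"],
    "o": ["o_0.png", "o_1.png"],
    "e": ["e_0.png", "e_1.png"],
    "i": ["i_0.png", "i_1.png"],
    "u": ["u_0.png", "u_1.png"],
}

# Toy pronunciation table (assumption): character -> pinyin syllable.
PRONUNCIATION = {"你": "ni", "好": "hao", "我": "wo"}

# Fallback frames for characters without a known pronunciation (assumption).
NEUTRAL = ["neutral.png"]


def mouth_key(syllable):
    """Return the last letter of the syllable that has a dictionary entry."""
    for ch in reversed(syllable):
        if ch in EXPRESSION_DICT:
            return ch
    return None


def frames_for_text(text):
    """Step (5): look up each character; step (6): concatenate in input order."""
    frames = []
    for char in text:
        syllable = PRONUNCIATION.get(char)
        key = mouth_key(syllable) if syllable else None
        frames.extend(EXPRESSION_DICT[key] if key else NEUTRAL)
    return frames
```

Calling `frames_for_text("你好")` returns the frames for "i" followed by the frames for "o", which a later stage would smooth into a continuous sequence.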
Preferably, the image sequence in step (1) is captured by a video camera, a group of cameras, or a mobile phone camera.
Preferably, the dynamic video generated in step (7) is stored on a storage device or stored in a cloud database over the Internet.
Preferably, the dynamic video in step (7) can be replaced by an animated image.
Preferably, the expressions in the expression dictionary of step (4) are sorted and stored in a defined order.
Preferably, the order can be Chinese pinyin alphabetical order or English alphabetical order.
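One way to keep the expression dictionary in alphabetical order is a sorted key list searched by bisection. The sketch below is illustrative only; the class name `OrderedExpressionDictionary` and its methods are hypothetical, and the patent does not specify a storage structure.

```python
import bisect


class OrderedExpressionDictionary:
    """Expression dictionary whose keys stay in alphabetical (pinyin) order."""

    def __init__(self):
        self._keys = []    # pinyin keys, always sorted
        self._frames = []  # frame lists, parallel to _keys

    def add(self, key, frames):
        """Insert or overwrite an entry while preserving the sort order."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._frames[i] = frames
        else:
            self._keys.insert(i, key)
            self._frames.insert(i, frames)

    def lookup(self, key):
        """Binary-search the sorted keys; return None when the key is absent."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._frames[i]
        return None

    def keys(self):
        return list(self._keys)
```

Keeping the keys sorted makes lookups logarithmic in the dictionary size and lets the stored entries be listed in the same order as a pinyin or English alphabet table.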
The advantages of the invention are as follows. The invention uses image synthesis technology to form short videos, animated images, or emoticon packs. A camera device collects multi-feature, multi-mode samples of the photographed subject; image interpolation and reconstruction then form an expression dictionary. The input text is recognized and used to query the expression dictionary, and the resulting images are synthesized into an original, personalized video. The user can share the result over the network or store it, which enriches the ways people communicate. The generated video or animated image closely matches the real scene and is true to life, which makes communication more engaging and offers more variety than the emoticon packs in conventional chat software.
Brief description of the drawings
Fig. 1 is a flow chart of the depth photo image reconstruction expression synchronization video generation method of the invention.
Fig. 2 is a schematic block diagram of the depth photo image reconstruction expression synchronization video generation device of the invention.
Detailed description of the embodiments
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
As shown in Fig. 1, a depth photo image reconstruction expression synchronization video generation method comprises the following steps:
(1) set up a video camera or a group of cameras, or call the mobile phone camera, and, using a cooperative-target approach, record a portrait video of a real person, capturing image sequences for the typical mouth shapes;
(2) instruct the subject to record expressions according to phonetic prompts; the prompts may follow a depth mode with a larger number of prompts or a vowel mode with fewer prompts, such as a, o, e, i, u; the subject makes each expression as prompted, the images or image sequences are stored, and the facial feature positions are extracted as matching features between images, so that the inter-frame movement of the facial features stays within a small range;
(3) reconstruct the mouth shapes, deriving additional mouth-shape expressions from the real mouth-shape sequences to form an expression dictionary; voice or text is entered through human-computer interaction to set the sentence to be rendered, and the result is stored in a database;
(4) identify the pronunciation of the input text and look it up in the expression dictionary; the expressions in the expression dictionary are sorted and stored in a defined order, which can be Chinese pinyin alphabetical order or English alphabetical order;
(5) combine the corresponding animated images of the different mouth shapes in the order of the recognized input characters;
(6) interpolate and smooth the video sequence to form a dynamic video or animated image, superimpose the text, and store the result;
(7) store the above audio/video file or animated image file on a storage device, or share it over the network.
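The interpolation smoothing in step (6) could, for instance, be a linear cross-fade between consecutive key frames. The patent does not name an interpolation scheme, so the following is only a sketch under that assumption; frames are represented as small 2D lists of grayscale values, and the function names `blend` and `smooth_sequence` are hypothetical.

```python
def blend(frame_a, frame_b, t):
    """Linearly blend two equally sized grayscale frames.

    t = 0 reproduces frame_a and t = 1 reproduces frame_b.
    """
    return [
        [round((1 - t) * a + t * b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def smooth_sequence(key_frames, steps_between=1):
    """Insert `steps_between` blended frames between each pair of key frames."""
    if not key_frames:
        return []
    out = []
    for current, nxt in zip(key_frames, key_frames[1:]):
        out.append(current)
        for k in range(1, steps_between + 1):
            out.append(blend(current, nxt, k / (steps_between + 1)))
    out.append(key_frames[-1])
    return out
```

With two key frames and `steps_between=3`, the output has five frames: the two originals and three evenly spaced blends. A cross-fade only works well when the facial features are already aligned between frames, which is why step (2) keeps inter-frame movement small.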
In the present invention, the image sequence in step (1) is captured by a video camera, a group of cameras, or a mobile phone camera.
In the present invention, the dynamic video generated in step (7) can be stored on a storage device or stored in a cloud database over the Internet.
In addition, as shown in Fig. 2, a dynamic expression image reconstruction and video synthesis system is designed using the method of the invention. The portrait is photographed with the subject's cooperation: the subject adjusts mouth shape or pose as instructed, for example by mimicking the pronunciation of pinyin finals, and the images or image sequences are stored. The software removes noise and smooths the images by filtering. When text is entered through the human-machine interface, the software automatically combines the captured images or image sequences in order into a video or animated image, yielding an animated image in which the subtitles and the mouth-shape expressions are synchronized. The device can be implemented as software on a mobile phone, calling the phone camera to capture images; alternatively, a single camera, a multi-camera array, or a moving camera can be set up in space to obtain more photo samples from multiple imaging angles. The image reconstruction system can likewise be implemented on a mobile phone or on a purpose-built depth photo platform.
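The text says only that noise is removed "by filtering" and does not name a filter. As one common choice, a 3x3 median filter suppresses isolated noisy pixels while preserving edges. The sketch below assumes grayscale frames stored as 2D lists; the function name `median_filter` is hypothetical.

```python
from statistics import median_low


def median_filter(image):
    """Apply a 3x3 median filter to a 2D grayscale image.

    Border pixels use only the neighbours that exist. median_low keeps
    every output value equal to one of the input pixel values.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(median_low(window))
        result.append(row)
    return result
```

For a uniform patch with a single bright outlier in the middle, the filter replaces the outlier with the surrounding value and leaves the other pixels unchanged.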
In summary, the invention uses image synthesis technology to form short videos, animated images, or emoticon packs. A camera device collects multi-feature, multi-mode samples of the photographed subject; image interpolation and reconstruction then form an expression dictionary. The input text is recognized and used to query the expression dictionary, and the resulting images are synthesized into an original, personalized video. The user can share the result over the network or store it, which enriches the ways people communicate. The generated video or animated image closely matches the real scene and is true to life, which makes communication more engaging and offers more variety than the emoticon packs in conventional chat software.
As those skilled in the art will appreciate, the invention can be implemented in other embodiments without departing from its spirit or essential characteristics. The embodiments disclosed above are therefore, in all respects, merely illustrative and not exclusive. All changes within the scope of the invention, or within a scope equivalent to it, are encompassed by the invention.
Claims (6)
1. A depth photo image reconstruction expression synchronization video generation method, characterized in that it comprises the following steps:
(1) using a cooperative-target approach, record a portrait video of a real person, capturing image sequences for the typical mouth shapes;
(2) extract the facial feature positions as matching features between images, so that the inter-frame movement of the facial features stays within a small range;
(3) preprocess the portraits of the typical mouth shapes and extract the facial feature positions;
(4) reconstruct the mouth shapes, deriving additional mouth-shape expressions from the real mouth-shape sequences to form an expression dictionary, which is stored in a database;
(5) identify the pronunciation of the input text and look it up in the expression dictionary;
(6) combine the corresponding animated images of the different mouth shapes in the order of the recognized input characters;
(7) interpolate and smooth the video sequence to form a dynamic video synchronized with the text.
2. The depth photo image reconstruction expression synchronization video generation method according to claim 1, characterized in that the image sequence in step (1) is captured by a video camera, a group of cameras, or a mobile phone camera.
3. The depth photo image reconstruction expression synchronization video generation method according to claim 1 or 2, characterized in that the dynamic video generated in step (7) is stored on a storage device or stored in a cloud database over the Internet.
4. The depth photo image reconstruction expression synchronization video generation method according to claim 3, characterized in that the dynamic video in step (7) can be replaced by an animated image.
5. The depth photo image reconstruction expression synchronization video generation method according to claim 4, characterized in that the expressions in the expression dictionary of step (4) are sorted and stored in a defined order.
6. The depth photo image reconstruction expression synchronization video generation method according to claim 5, characterized in that the order can be Chinese pinyin alphabetical order or English alphabetical order.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610867180.2A (CN106447750A) | 2016-09-30 | 2016-09-30 | Depth photo image reconstruction expression synchronization video generation method
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610867180.2A (CN106447750A) | 2016-09-30 | 2016-09-30 | Depth photo image reconstruction expression synchronization video generation method
Publications (1)
Publication Number | Publication Date |
---|---|
CN106447750A (en) | 2017-02-22
Family
ID=58171964
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610867180.2A (Pending; CN106447750A) | Depth photo image reconstruction expression synchronization video generation method | 2016-09-30 | 2016-09-30
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106447750A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1707550A (en) * | 2005-04-14 | 2005-12-14 | 张远辉 | Establishment of pronunciation and articalation mouth shape cartoon databank and access method thereof |
CN1731833A (en) * | 2005-08-23 | 2006-02-08 | 孙丹 | Method for composing audio/video file by voice driving head image |
CN101482975A (en) * | 2008-01-07 | 2009-07-15 | 丰达软件(苏州)有限公司 | Method and apparatus for converting words into animation |
CN101751692A (en) * | 2009-12-24 | 2010-06-23 | 四川大学 | Method for voice-driven lip animation |
CN102542586A (en) * | 2011-12-26 | 2012-07-04 | 暨南大学 | Personalized cartoon portrait generating system based on mobile terminal and method |
Legal Events
Code | Title
---|---
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination