CN103745462B - Human mouth-shape video reconstruction system and reconstruction method - Google Patents
- Publication number
- CN103745462B CN201310745441.XA CN201310745441A
- Authority
- CN
- China
- Prior art keywords
- mouth
- shape
- video
- human body
- speaks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Processing Or Creating Images (AREA)
Abstract
The present invention provides a human mouth-shape video reconstruction system based on the time evolution of ring-shaped elastic space dynamics, and a corresponding method. The method comprises four steps (information reading, preprocessing, mouth-shape reconstruction, and video output) and has two implementations: the associative inversion method and the logical correction method. The method and system can either invert the read-in mouth-shape information onto a single-frame image, or correct a video composed of multiple frames according to the read-in mouth-shape information; in both cases a reconstructed human mouth-shape video is generated. Compared with traditional mouth-shape reconstruction methods and systems, the method and system are accurate and efficient and need no database, which saves storage space and also makes mouth-shape conversion more flexible. More preferably, all units of the system can be integrated on one intelligent terminal, which may be any of various smartphones, tablet computers (such as an iPad), palmtop computers, smart handheld devices, and the like.
Description
Technical field
The present invention relates to the field of video image processing, and in particular to a human mouth-shape video reconstruction system and reconstruction method based on the time evolution of ring-shaped elastic space dynamics.
Background technology
With the continuing development and maturing of computer technology, face modeling and animation, a distinctive branch of computer graphics, has attracted increasing attention, and changing the human mouth shape in videos and images is an especially widely used application. Many situations require reconstructing the mouth shape of a person in an existing video or image, that is, generating a series of mouth-shape actions from a static image, or modifying the mouth shape in an existing video. To achieve this, existing techniques generally analyze and process a large amount of existing video and image information to build a mouth-shape database, and then retrieve the relevant information from that database for each specific problem. Although this approach can convert the human mouth shape in videos and images fairly accurately, its limitations are obvious. On the one hand, it depends on a huge mouth-shape database built in advance and requires a huge number of data samples, so its portability is poor. On the other hand, the algorithm involves a large amount of computational analysis and is highly complex, which also limits its range of application.
Summary of the invention
In view of the deficiencies of the prior art, the technical problem to be solved by the present invention is to provide a human mouth-shape video reconstruction method and system with high precision and good portability, so that, according to the required mouth shape, a single-frame image of the target object can be evolved into a video, or a video of the target object composed of multiple frames can be modified and inverted. Traditional mouth-shape conversion techniques depend on a huge mouth-shape database that contains a speech library and the corresponding mouth-shape images to be called during conversion. On the one hand, this database occupies a large amount of space; on the other hand, because the database cannot construct new mouth shapes by itself, in practice it cannot handle conversions involving mouth shapes that are not in the database. The system of the present invention differs from traditional mouth-shape conversion systems: it needs no such database and can complete the video reconstruction of the human mouth shape quickly and accurately.
The technical solution adopted by the present invention is as follows:
A human mouth-shape video reconstruction method, comprising the following four steps:
(1) Information reading: human body information and mouth-shape information are read in from the input port. The human body information is a single-frame image of the target object or a video composed of multiple frames; the mouth-shape information is any one or more of text, sound, image, and video.
(2) Preprocessing: the mouth-shape information read in through the input port is recognized and converted, and the converted mouth-shape information is displayed in real time on the display module; the human body information read in through the input port is analyzed and the position of the mouth is located.
(3) Mouth-shape reconstruction: using the time-evolution method based on ring-shaped elastic space dynamics, the human mouth-shape video is reconstructed from the preprocessed mouth-shape information and human body information.
(4) Video output: the reconstructed human mouth-shape video is output through the output port.
The flow chart of the technical solution is shown in Figure 1.
In step (3), the mouth-shape reconstruction is based on the time evolution of ring-shaped elastic space dynamics. The ring-shaped elastic space is a planar space in which the order of points and the distance between points are defined, and it has the following four properties:
1. The distance between any two points P1 and P2 in the ring-shaped elastic space is variable.
2. The order of any two points P1 and P2 in the ring-shaped elastic space is invariant. That is, for any third point P3 in the space distinct from P1 and P2, the clockwise (or counterclockwise) order of the three points does not change under any transformation.
3. Any point P in the ring-shaped elastic space can be acted on by a force F of magnitude f at angle α to the horizontal axis, and its position changes as a result: it is displaced by a certain amount from its original position along the direction at angle α to the horizontal axis.
4. When any point P in the ring-shaped elastic space is acted on by a force F, the force also affects other points in the space: each such point is treated as if acted on by a force of magnitude f' at angle α' to the horizontal axis, which is called the associated effect. The position of the point relative to P determines α', and its distance from P determines f'. When the distance from the point to P exceeds the influence range R, the point is considered unaffected by the associated effect of F.
A schematic diagram of the ring-shaped elastic space is shown in Figure 2.
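Properties 3 and 4 of the elastic space can be illustrated with a minimal sketch. The patent states only that relative position determines α' and that distance determines f'; the linear falloff of f' to zero at R, the choice α' = α, the unit relation between force and displacement, and the name `apply_force` are all assumptions made here for illustration.

```python
import math

def apply_force(points, index, f, alpha, R):
    """Apply a force of magnitude f at angle alpha to points[index].

    Other points within the influence range R receive an associated
    force (assumed: same direction, magnitude decaying linearly to 0 at R).
    Points farther than R are left unchanged. Displacement is taken to be
    numerically equal to the force (an assumption, not from the patent).
    """
    if R <= 0:
        raise ValueError("influence range R must be positive")
    px, py = points[index]
    moved = []
    for (x, y) in points:
        d = math.hypot(x - px, y - py)
        if d > R:
            moved.append((x, y))  # beyond the influence range: unaffected
            continue
        f_assoc = f * (1.0 - d / R)  # assumed falloff; equals f at P itself
        moved.append((x + f_assoc * math.cos(alpha),
                      y + f_assoc * math.sin(alpha)))
    return moved
```

With a falloff of this kind, the pushed point moves farthest and its neighbours follow by smaller amounts, which is the qualitative behaviour that property 4 describes.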
Changes in mouth shape are produced by the orbicularis oris muscle of the lips under the control of the buccal branch of the facial nerve, so the mouth shape can be studied by building the ring-shaped elastic space model described above. When the mouth shape changes at time t, it can be regarded that n points P1, P2, ..., Pn of the ring-shaped elastic space are acted on by forces F1, F2, ..., Fn respectively; the combined effect of these n forces causes local displacement, rotation, or stretching of the space, that is, a change of mouth shape. In step (3), the processing module of the system identifies the position of the mouth shape in the video or image and its change over the time series, builds the corresponding ring-shaped elastic space model, and extracts the forces acting on each region of the model at each time t. It then uses the human body information to build a new ring-shaped elastic space model and applies the extracted forces, at the corresponding times, to the corresponding positions on the new model, which completes the human mouth-shape video reconstruction. The corresponding positions can be determined from the four contour lines of the mouth shape and the feature points on them; to ensure conversion precision, in practice each contour line should carry at least 3 feature points, as shown in Figure 3. Determining the corresponding positions is the association based on the ring-shaped elastic space.
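The extract-then-apply idea in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: each frame is reduced to a list of (x, y) feature points already placed in one-to-one correspondence, the per-point displacement between consecutive frames is used as a stand-in for the extracted force, and the function names are hypothetical.

```python
def extract_forces(frame_a, frame_b):
    """Per-feature-point displacement between two consecutive source frames.

    Assumption: displacement is used as a proxy for the force at each point.
    """
    return [(bx - ax, by - ay)
            for (ax, ay), (bx, by) in zip(frame_a, frame_b)]

def apply_forces(target_points, forces):
    """Move each target feature point by the force extracted for it."""
    return [(x + fx, y + fy)
            for (x, y), (fx, fy) in zip(target_points, forces)]

def reconstruct(source_frames, target_points):
    """Replay the source mouth-shape changes on the target feature points."""
    current = list(target_points)
    frames = [current]
    for a, b in zip(source_frames, source_frames[1:]):
        current = apply_forces(current, extract_forces(a, b))
        frames.append(current)
    return frames
```

For example, `reconstruct([[(0, 0), (1, 0)], [(0, 1), (1, -1)]], [(10, 10), (12, 10)])` yields two target frames, the second with the points moved to `(10, 11)` and `(12, 9)`.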
Preferably, the time-evolution method based on ring-shaped elastic space dynamics in step (3) is the associative inversion method: a live person, acting as the synchronization object, imitates the mouth-shape information shown on the display module; the real-time acquisition module captures a simulation video; and the simulation video is matched with the read-in human body information on the basis of the ring-shaped elastic space, completing the reconstruction of the human mouth-shape video. As shown in Figure 4, in this method the synchronization object imitates the mouth shapes to be reconstructed on site and the process is captured as a simulation video. A ring-shaped elastic space model is built from this video and then analyzed and processed, so that the mouth shapes to be reconstructed are reproduced accurately and efficiently on the human body information of the target object, realizing their reconstruction at the mouth of the target object. The flow of this method is shown schematically in Figure 6.
Specifically, the synchronization object imitates mouth shapes according to the mouth-shape information shown on the display module, for example by reading aloud a displayed passage of text or imitating displayed mouth-shape pictures. Meanwhile, the processing module controls the real-time acquisition module to capture the simulation video of the synchronization object as the basis for mouth-shape reconstruction. After acquisition, the processing module divides the captured simulation video evenly into n frames at a rate of N frames per second (for a simulation video of duration T seconds, n = T × N), corresponding to times t1, t2, ..., tn. It locates the mouth shape in each frame and links the contour and feature points of that mouth shape to the contour and feature points of the mouth shape in the read-in human body information. The frame rate N can be chosen according to the actual situation, but must satisfy the sampling theorem so that the segmented images reflect the mouth-shape information to be reconstructed. The higher the segmentation frequency, the higher both the complexity and the precision of the reconstruction; the lower the frequency, the lower both.
When the human body information read in step (1) is a single-frame image, the linked correspondence maps the mouth-shape feature points of every frame of the simulation video onto that single human body frame. When the human body information read in step (1) is a video composed of multiple frames, the linked correspondence maps the mouth-shape feature points of every frame of the simulation video onto the corresponding frame of the human body information video. The corresponding frame is determined as follows: the frames segmented from the human body information video and from the simulation video are numbered; if the two videos have the same number of frames, corresponding frames are those with the same number; if not, corresponding frames are those at the same proportional position within their totals. When the simulation video has more frames than the human body information video, the surplus frames are discarded proportionally; when it has fewer, the missing frames are interpolated proportionally, and the intermediate mouth shapes for interpolation are constructed by the dynamics time evolution based on the ring-shaped elastic space.
After the linked correspondence is complete, the change in mouth shape from frame i to frame (i+1) of the simulation video is analyzed to obtain the forces acting on each feature point of the corresponding ring-shaped elastic space model at time t = (i/N) seconds. Applying these forces to the ring-shaped elastic space model of the human body information reconstructs the mouth shape at t = (i/N) seconds. Once every frame of the new video has been reconstructed, the reconstructed human mouth-shape video is obtained.
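The frame-correspondence rule above (same number when the frame counts match, same proportional position otherwise) can be sketched as below. The function names are hypothetical, frames are assumed to be lists of (x, y) feature points, and the linear blend used for fractional positions is only a placeholder for the dynamics-based interpolation the patent describes.

```python
def corresponding_positions(n_sim, n_body):
    """For each body-video frame j, the simulation-video position that lies
    at the same proportional place in its own sequence (may be fractional)."""
    if n_sim <= 0 or n_body <= 0:
        raise ValueError("frame counts must be positive")
    return [j * n_sim / n_body for j in range(n_body)]

def sample(frames, pos):
    """Feature points at a possibly fractional frame position.

    Assumption: linear blending between neighbouring frames stands in for
    the elastic-space time evolution used by the patent for interpolation.
    """
    lo = int(pos)
    hi = min(lo + 1, len(frames) - 1)  # clamp at the last frame
    w = pos - lo
    return [(ax + w * (bx - ax), ay + w * (by - ay))
            for (ax, ay), (bx, by) in zip(frames[lo], frames[hi])]
```

With equal counts the positions are simply 0, 1, 2, ...; with more simulation frames than body frames some simulation frames are skipped; with fewer, fractional positions appear and intermediate shapes have to be generated.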
Preferably, the time-evolution method based on ring-shaped elastic space dynamics in step (3) is the logical correction method: without relying on a live performance, mouth-shape primitives are called from the mouth-shape primitive module directly according to the required mouth-shape information to build a mouth-shape state template manually, and the missing transitional states are then generated by the dynamics time evolution based on the ring-shaped elastic space to complete the video reconstruction. As shown in Figure 5, this method needs no on-site imitation by a synchronization object; instead, on the basis of the human body information and the mouth-shape information, a mouth-shape state template is generated manually by calling mouth-shape primitives, and the evolution of the ring-shaped elastic space model is then built to generate the mouth-shape video of the target object, realizing the video reconstruction of the target object's mouth. The flow of this method is shown schematically in Figure 7.
A mouth-shape primitive is a model of one of the most basic states of the human mouth shape, such as the pinyin "a" mouth shape (mouth open), the "o" mouth shape (lips pursed), and the "i" mouth shape (lips spread); all transitional mouth-shape states can be generated from the primitives by the dynamics time evolution based on the ring-shaped elastic space. A transitional mouth-shape state is a mouth-shape state produced while one primitive changes into another; for example, between the closed-mouth primitive and the primitive for pinyin "a", the transitional states are the mouth shapes as the mouth slowly opens.
Specifically, when the display module shows the mouth-shape information to be reconstructed, n basic mouth shapes that meet the requirement can be selected manually from the mouth-shape library and used to correct the mouth shape in the frames at specific positions, constructing a time-series-based mouth-shape state template. When the human body information read in step (1) is a single-frame image, all information other than the mouth shape in the template is extended from the single-frame image; when it is a video composed of multiple frames, the information other than the mouth shape in the template is consistent with the video. Information other than the mouth shape means all information outside the mouth in the image or video, including other parts of the body (such as the nose, eyes, cheeks, torso, and limbs) and the person's surroundings. For example, blinking of the eyes, swaying of the body, and other people passing behind are all regarded as changes in the information outside the mouth.
After the mouth-shape state template has been built, the region around the mouth in the human body information is associated on the basis of the ring-shaped elastic space, so that changes of the mouth produce corresponding effects in the surrounding region; this constructs the corresponding ring-shaped elastic space model. Then, by analyzing the change from the i-th primitive to the (i+1)-th primitive, the forces acting on each point of the model in the i-th stage are obtained; extending the action of these forces over a longer time series yields all the mouth-shape transitional states between the two stages. When all (n-1) transitions have been reconstructed, the reconstruction of the human mouth-shape video is complete.
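The way n primitives give rise to (n-1) transitions can be sketched as follows. This is a simplified illustration, not the patent's dynamics: each primitive is assumed to be a list of (x, y) feature points, the total change between two primitives is spread in equal increments over a chosen number of intermediate states, and `transition` and `build_template` are hypothetical names.

```python
def transition(prim_a, prim_b, steps):
    """Intermediate states strictly between two primitives.

    Assumption: the change from prim_a to prim_b is applied in equal
    increments, a stand-in for extending the force over a longer time series.
    """
    states = []
    for k in range(1, steps + 1):
        w = k / (steps + 1)
        states.append([(ax + w * (bx - ax), ay + w * (by - ay))
                       for (ax, ay), (bx, by) in zip(prim_a, prim_b)])
    return states

def build_template(primitives, steps):
    """Time-ordered template: each primitive followed by its transition
    to the next one, giving len(primitives) - 1 transitions in total."""
    sequence = [primitives[0]]
    for a, b in zip(primitives, primitives[1:]):
        sequence.extend(transition(a, b, steps))
        sequence.append(b)
    return sequence
```

For instance, with a closed-mouth primitive and an open "a" primitive that differ by 4 units at one feature point, `transition(closed, open_a, 3)` produces states at 1, 2, and 3 units, matching the idea of the mouth slowly opening.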
For the associative inversion method, the present invention also provides a human mouth-shape video reconstruction system comprising an input port, an output port, a processing module, a display module, and a real-time acquisition module, wherein:
The input port is used to read in human body information and mouth-shape information; the human body information is a single-frame image of the target object or a video composed of multiple frames, and the mouth-shape information is any one or more of text, sound, image, and video;
The output port is used to output the reconstructed human mouth-shape video;
The display module displays in real time the mouth-shape information read in through the input port;
The processing module converts the mouth-shape information read in through the input port and then reconstructs the human mouth-shape video on the basis of the human body information;
The real-time acquisition module captures video of the synchronization object in real time during reconstruction by the associative inversion method.
The connections between the modules are shown in Figure 8. The input port and the processing module, the processing module and the output port, the processing module and the real-time acquisition module, and the processing module and the display module may each be connected by wired or wireless means to ensure effective data transmission. Depending on actual needs, all connections may be wired, all may be wireless, or some may be wired and the others wireless.
The processing module is a terminal with video image processing and information analysis capabilities, selected from digital chips and intelligent terminals. An intelligent terminal is a device that can capture external information, perform computation, analysis, and processing, and exchange information with other terminals, including but not limited to desktop computers, notebook computers, and mobile intelligent terminals. A mobile intelligent terminal is a portable intelligent terminal, including but not limited to various smartphones, tablet computers (such as an iPad), palmtop computers, and smart handheld game consoles. A digital chip is a purpose-designed chip, built with integrated electronic technology, that can perform computation, analysis, and processing and can control other devices through extension, including but not limited to single-chip microcomputers, ARM, DSP, and FPGA.
The real-time acquisition module is any one or more of a video camera, a still camera, a webcam, a digital imaging device, and an intelligent terminal with a camera function.
The display module is any one or more of a monitor, a display screen, a projector, and an intelligent terminal.
Specifically, the synchronization object imitates mouth shapes according to the mouth-shape information shown on the display module, for example by reading aloud a displayed passage of text or imitating displayed mouth-shape pictures. Meanwhile, the processing module controls the real-time acquisition module to capture the simulation video of the synchronization object as the basis for mouth-shape reconstruction. After acquisition, the processing module divides the captured simulation video evenly into n frames at a rate of N frames per second (for a simulation video of duration T seconds, n = T × N), corresponding to times t1, t2, ..., tn. It locates the mouth shape in each frame and links the contour and feature points of that mouth shape to the contour and feature points of the mouth shape in the read-in human body information. The segmentation frequency can be chosen according to the actual situation, but must satisfy the sampling theorem so that the segmented images reflect the mouth-shape information to be reconstructed. The higher the segmentation frequency, the higher both the complexity and the precision of the reconstruction; the lower the frequency, the lower both.
When the human body information read in through the input port is a single-frame image, the linked correspondence maps the mouth-shape feature points of every frame of the simulation video onto the single-frame human body information image. When the human body information read in through the input port is a video composed of multiple frames, the linked correspondence maps the mouth-shape feature points of every frame of the simulation video onto the corresponding frame of the human body information video. The corresponding frame is determined as follows: the frames segmented from the human body information video and from the simulation video are numbered; if the two videos have the same number of frames, corresponding frames are those with the same number; if not, corresponding frames are those at the same proportional position within their totals. When the simulation video has more frames than the human body information video, the surplus frames are discarded proportionally; when it has fewer, the missing frames are interpolated proportionally, and the intermediate mouth shapes for interpolation are constructed by the dynamics time evolution based on the ring-shaped elastic space.
After the linked correspondence is complete, the change in mouth shape from frame i to frame (i+1) of the simulation video is analyzed to obtain the forces acting on each feature point of the corresponding ring-shaped elastic space model at time t = (i/N) seconds. Applying these forces to the ring-shaped elastic space model of the human body information reconstructs the mouth shape at t = (i/N) seconds. Once every frame of the new video has been reconstructed, the reconstructed human mouth-shape video is obtained.
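The timing relations used above, n = TN frames for a T-second video and t = i/N for frame i, can be written out as a small sketch. The function name and the 1-based frame numbering are assumptions for illustration.

```python
def frame_times(T, N):
    """Times (in seconds) of frames 1..n for a T-second video sampled at
    N frames per second, where n = T * N."""
    if T <= 0 or N <= 0:
        raise ValueError("duration and frame rate must be positive")
    n = round(T * N)  # total number of frames
    return [i / N for i in range(1, n + 1)]
```

A 2-second video at 4 frames per second, for example, gives 8 frames at 0.25-second spacing.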
For the logical correction method, the present invention also provides a human mouth-shape video reconstruction system comprising an input port, an output port, a processing module, a display module, and a mouth-shape primitive module, wherein:
The input port is used to read in human body information and mouth-shape information; the human body information is a single-frame image of the target object or a video composed of multiple frames, and the mouth-shape information is any one or more of text, sound, image, and video;
The output port is used to output the reconstructed human mouth-shape video;
The display module displays in real time the mouth-shape information read in through the input port;
The processing module converts the mouth-shape information read in through the input port and then reconstructs the human mouth-shape video on the basis of the human body information;
The mouth-shape primitive module stores the basic mouth-shape primitives, which are called during reconstruction by the logical correction method to build the mouth-shape state template manually.
A mouth-shape primitive is a model of one of the most basic states of the human mouth shape, such as the pinyin "a" mouth shape (mouth open), the "o" mouth shape (lips pursed), and the "i" mouth shape (lips spread); all transitional mouth-shape states can be generated from the primitives by the dynamics time evolution based on the ring-shaped elastic space. A transitional mouth-shape state is a mouth-shape state produced while one primitive changes into another; for example, between the closed-mouth primitive and the primitive for pinyin "a", the transitional states are the mouth shapes as the mouth slowly opens.
The connections between the modules are shown in Figure 9. The input port and the processing module, the processing module and the output port, the processing module and the mouth-shape primitive module, and the processing module and the display module may each be connected by wired or wireless means to ensure effective data transmission. Depending on actual needs, all connections may be wired, all may be wireless, or some may be wired and the others wireless.
The processing module is a terminal capable of video and image processing and information analysis, selected from digital chips and intelligent terminals. An intelligent terminal is a device that can capture external information, perform computation, analysis, and processing, and exchange information with other terminals, including but not limited to desktop computers, notebook computers, and mobile intelligent terminals. A mobile intelligent terminal is a portable intelligent terminal, including but not limited to smartphones, tablet computers (such as the iPad), palmtop computers, and smart handheld game consoles. A digital chip is a chip that is designed and manufactured with integrated electronic technology, can perform computation, analysis, and processing, and can control other devices through extension, including but not limited to single-chip microcontrollers, ARM, DSP, and FPGA devices.
The display module is selected from any one or more of a monitor, a display screen, a projector, and an intelligent terminal.
The mouth-shape primitive module stores the basic mouth-shape models so that they can be called during reconstruction by the logic correction method to build a mouth-shape state template manually. Conventional mouth-shape conversion techniques depend on a huge mouth-shape database that contains a speech library and the corresponding mouth-shape images for lookup during conversion. Such a database occupies a large amount of storage and, because it cannot construct new mouth shapes on its own, cannot handle conversions involving mouth shapes that it does not contain. Unlike conventional mouth-shape conversion systems, the system of the present invention needs no such database and can complete the video reconstruction of the human mouth shape quickly and accurately.
Preferably, the mouth-shape video reconstruction system of the present invention may be a desktop computer, notebook computer, or mobile intelligent terminal with a camera function. The mobile intelligent terminal is a portable intelligent terminal, including but not limited to smartphones, tablet computers (such as the iPad), palmtop computers, and smart handheld game consoles. Specifically, the system may consist of just one desktop computer, one notebook computer, or one mobile intelligent terminal with a camera function. In that case, the device's communication and data transmission module serves as the input port and output port of the system, the processing core as the processing module, the camera as the real-time acquisition module, the display screen as the display module, and the storage unit as the mouth-shape primitive module. The system may also be a combination of such devices. For example, the camera and display screen of a mobile intelligent terminal may serve as the real-time acquisition module and the display module, respectively, while the communication module, processing core, and storage unit of a notebook computer serve as the input/output port, the processing module, and the mouth-shape primitive module, respectively.
Preferably, the time-evolution method based on annular-elastic-space dynamics in step (3) is the logic correction method: without relying on a live performance by a real person, the mouth-shape primitive module is called directly according to the required mouth-shape information to build a mouth-shape state template manually, and the missing transitional states are then generated by dynamics time evolution based on the annular elastic space to complete the video reconstruction. The flow is shown schematically in Fig. 7. A mouth-shape primitive is a model of one of the most basic states of the human mouth, such as the mouth shape for pinyin "a" (mouth open), "o" (lips pursed), or "i" (lips spread), from which all transitional mouth-shape states can be generated by dynamics time evolution based on the annular elastic space. A transitional mouth-shape state is a state produced while one mouth-shape primitive changes into another. For example, between the closed-mouth primitive and the primitive for pronouncing pinyin "a", the transitional states are the mouth shapes that occur as the mouth gradually opens.
Specifically, when the display module shows the mouth-shape information to be reconstructed, n basic mouth shapes that meet the requirement can be selected manually from the mouth-shape library and used to correct the mouth shapes in the frames at specific positions, thereby constructing a time-series-based mouth-shape state template. When the human body information read in through the input port is a single-frame image, all information other than the mouth shape in the template is extended from that single-frame image. When the human body information is a video composed of multiple frames, the information other than the mouth shape in the template is kept consistent with the video. The information other than the mouth shape is all information in the image or video outside the mouth, including the other parts of the body (such as the nose, eyes, cheeks, trunk, and limbs) and the person's surroundings. For example, blinking, swaying of the body, and other people passing behind the person are all regarded as changes in the information outside the mouth.
After the mouth-shape state template has been built, the positions around the mouth in the human body information are associated on the basis of the annular elastic space, so that the region around the mouth is affected accordingly by changes of the mouth; that is, the corresponding annular elastic space model is constructed. Analyzing the change from the i-th mouth-shape primitive to the (i+1)-th then gives the force acting on each point of the annular elastic space model for the i-th stage, and extending the action of that force over a longer time series yields all the transitional mouth-shape states between the two primitives. Once all (n-1) transition stages have been reconstructed, the reconstruction of the human mouth-shape video is complete.
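The way a force on one point spreads to the surrounding region of the elastic-space model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation. The text specifies only that the distance from the loaded point determines the magnitude of the induced force, that the relative position determines its direction, and that points beyond a range R are unaffected. The linear falloff with distance, the choice to keep the applied direction for every affected point, the unit compliance (displacement numerically equal to force), and the names correlated_force and apply_force are all assumptions made here for illustration.

```python
import math


def correlated_force(p, q, angle, magnitude, radius):
    """Force vector felt at point q when a force is applied at point p.

    ASSUMPTION: the magnitude falls off linearly with distance and reaches
    zero at `radius`; the direction is kept equal to the applied direction.
    The source text does not give these formulas.
    """
    d = math.dist(p, q)
    if d > radius:
        # Points beyond the influence range are not affected at all.
        return (0.0, 0.0)
    scale = magnitude * (1.0 - d / radius)
    return (scale * math.cos(angle), scale * math.sin(angle))


def apply_force(points, index, angle, magnitude, radius):
    """Displace every point according to the force applied at points[index].

    ASSUMPTION: unit compliance, so the displacement equals the force vector.
    """
    src = points[index]
    moved = []
    for q in points:
        fx, fy = correlated_force(src, q, angle, magnitude, radius)
        moved.append((q[0] + fx, q[1] + fy))
    return moved


# Hypothetical example: push the first point straight up with force 2.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
moved = apply_force(points, 0, math.pi / 2, 2.0, 2.0)
for before, after in zip(points, moved):
    print(before, "->", after)
```

In this example the loaded point moves by the full amount, the neighbour at distance 1 moves by half as much, and the point at distance 5 lies outside the range and stays where it is.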
The beneficial effects of the present invention are:
(1) The present invention can invert the read-in mouth-shape information onto a single-frame image to generate the reconstructed human mouth-shape video, and can also correct a video composed of multiple frames according to the read-in mouth-shape information to generate the reconstructed video, so it is widely applicable.
(2) The present invention has two specific embodiments, associative inversion and logic correction. The former completes the reconstruction quickly and efficiently through the synchronized performance of a live person. The latter requires mouth-shape primitives to be called manually but does not depend on a live performance, so correction can be done offline. Together the two methods meet the needs of mouth-shape video reconstruction in different situations.
(3) The system hardware is simple to configure and low in cost. On the software side, only ordinary video and image processing software and a small set of mouth-shape primitives are needed, with no additional software deployment. Compared with conventional mouth-shape reconstruction systems, the system of the present invention needs no database, which saves storage space and makes mouth-shape conversion more flexible.
(4) More preferably, all units of the system can be integrated on a single intelligent terminal, which may be a smartphone, tablet computer, palmtop computer, or smart handheld device, so the system is highly portable.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of the annular elastic space.
Fig. 3 is a schematic diagram of the contour lines and feature points used when mouth-shape positions are matched in the method of the present invention. In the figure, L1 to L4 and L1' to L4' are the contour lines of the two mouth shapes, and P1 to P6 and P1' to P6' are the key points on the contour lines of the two mouth shapes. Each contour line must carry at least 3 corresponding points to ensure an accurate transformation.
Fig. 4 is an information conversion diagram of the associative inversion method of the present invention.
Fig. 5 is an information conversion diagram of the logic correction method of the present invention.
Fig. 6 is a schematic flow chart of the associative inversion method of the present invention.
Fig. 7 is a schematic flow chart of the logic correction method of the present invention.
Fig. 8 is a structural diagram of the system corresponding to the associative inversion method of the present invention.
Fig. 9 is a structural diagram of the system corresponding to the logic correction method of the present invention.
Detailed description of the embodiments
To illustrate the human mouth-shape video reconstruction method of the present invention in more detail, the invention is described below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 6, the mouth-shape reconstruction method of the present invention is illustrated by using the associative inversion method, with B as the synchronization object, to reconstruct from a single photograph of target object A a video of A reading a lecture script aloud.
A desktop computer equipped with a camera serves as the reconstruction system: the USB interface serves as the input and output port of the system, the processor as the processing module, the camera as the real-time acquisition module, and the monitor as the display module.
(1) Information read-in: the system reads in the photograph of A through the USB interface as the human body information to be processed, and reads in the lecture script document as the mouth-shape information to be processed.
(2) Preprocessing: the processor recognizes that the mouth-shape information is in text format and, for the convenience of the associative inversion method, passes the text directly to the monitor for display. At the same time, the processor analyzes the photograph of A, identifies and locks the position of A's mouth in the photograph, and selects the feature points of the mouth shape, such as the two mouth corners and the centers of the four lip lines.
(3) Mouth-shape reconstruction: synchronization object B mimics the mouth shapes according to the text shown on the monitor by reading the lecture script. Meanwhile, the camera captures a video of B reading the script (duration 1000 seconds), i.e. the simulation video, which serves as the basis for mouth-shape reconstruction. After capture, the processor divides the simulation video of B into 30000 frames at 30 frames per second, corresponding to times t1, t2, ..., t30000, locates the mouth shape in each frame, and selects the same mouth-shape feature points, namely the two mouth corners and the centers of the four lip lines. Because the human body information, i.e. the photograph of A, is a single-frame image, the feature points in each of the 30000 frames are matched with the corresponding feature points in the photograph of A and linked to the surrounding positions, establishing a time-series-based annular elastic space model. Then, analyzing the change of mouth shape from frame 1 to frame 2 of B's simulation video gives the force acting on each feature point of the annular elastic space model at the moment t = 1/30 second. Applying that force to the annular elastic space model of A's photograph completes the reconstruction of A's mouth shape at t = 1/30 second. When all 30000 frames have been reconstructed, the video of A reading the lecture script is obtained.
(4) Video output: the reconstructed video of A reading the lecture script is output through the USB interface.
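The frame bookkeeping in step (3) is simple arithmetic: 1000 seconds at 30 frames per second gives 30000 frames, and each pair of consecutive frames defines one transition whose force is applied to the model built from A's photograph. The following is a minimal sketch of that bookkeeping. The helper name frame_transitions is hypothetical, and the convention that the transition from frame k to frame k+1 is stamped t = k/30 second is an assumption inferred from the frame 1 to frame 2 example above.

```python
def frame_transitions(duration_s, fps):
    """Return the frame count and the list of consecutive-frame transitions.

    Each transition is (from_frame, to_frame, time_in_seconds), with frames
    numbered from 1. ASSUMPTION: the transition k -> k+1 is stamped k / fps.
    """
    n_frames = round(duration_s * fps)
    transitions = [(k, k + 1, k / fps) for k in range(1, n_frames)]
    return n_frames, transitions


n_frames, transitions = frame_transitions(1000, 30)
print(n_frames)          # number of frames in the simulation video
print(transitions[0])    # first transition: frame 1 -> frame 2
print(len(transitions))  # one transition fewer than there are frames
```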
In this embodiment, a smartphone may alternatively serve as the reconstruction system: the Wi-Fi interface serves as the input and output port of the system, the phone processor as the processing module, the phone camera as the real-time acquisition module, and the phone display screen as the display module.
(1) Information read-in: the system reads in the photograph of A through the Wi-Fi interface as the human body information to be processed, and reads in the lecture script document as the mouth-shape information to be processed.
(2) Preprocessing: the phone processor recognizes that the mouth-shape information is in text format and, for the convenience of the associative inversion method, passes the text directly to the display screen for display. At the same time, the processor analyzes the photograph of A, identifies and locks the position of A's mouth in the photograph, and selects the feature points of the mouth shape, such as the two mouth corners and the centers of the four lip lines.
(3) Mouth-shape reconstruction: synchronization object B mimics the mouth shapes according to the text shown on the display screen by reading the lecture script. Meanwhile, the phone camera captures the simulation video of B, which the phone processor divides into 30000 frames at 30 frames per second, corresponding to times t1, t2, ..., t30000. The processor locates the mouth shape in each frame and selects the same mouth-shape feature points, namely the two mouth corners and the centers of the four lip lines. Because the human body information, i.e. the photograph of A, is a single-frame image, the feature points in each of the 30000 frames are matched with the corresponding feature points in the photograph of A and linked to the surrounding positions, establishing a time-series-based annular elastic space model. Then, analyzing the change of mouth shape from frame 1 to frame 2 of B's simulation video gives the force acting on each feature point of the annular elastic space model at the moment t = 1/30 second. Applying that force to the annular elastic space model of A's photograph completes the reconstruction of A's mouth shape at t = 1/30 second. When all 30000 frames have been reconstructed, the video of A reading the lecture script is obtained.
(4) Video output: the reconstructed video of A reading the lecture script is output through the Wi-Fi interface.
Embodiment 2
As shown in Fig. 7, the mouth-shape reconstruction method of the present invention is illustrated below by using the logic correction method to correct the mouth shape in a segment of a video of announcer C. In this embodiment, C is the target object. A smartphone serves as the reconstruction system: the Wi-Fi interface serves as the input and output port of the system, the phone processor as the processing module, the phone display screen as the display module, and the phone's storage unit as the mouth-shape primitive module.
(1) Information read-in: the system reads in the video of announcer C through the Wi-Fi interface and clips out the part to be corrected as the human body information to be processed. At the same time, it reads in the corrected speech content as the mouth-shape information to be processed.
(2) Preprocessing: the processor recognizes that the mouth-shape information is in audio format and, for the convenience of the logic correction method, converts it into the corresponding mouth-shape video and passes that video to the display screen for display.
(3) Mouth-shape reconstruction: when the display screen shows the mouth-shape information to be reconstructed, basic mouth shapes that meet the requirement can be called manually from the mouth-shape primitive module and used to correct the mouth shapes in the frames at specific positions, constructing a time-series-based mouth-shape state template. The information other than the mouth shape in the template, here for example the movement of the person's limbs and changes in the surroundings, must be consistent with the video. For example, if the segment to be reconstructed is speech in which the mouth goes from closed, to pronouncing "a", and back to closed, it suffices to write three mouth shapes into the frames at the corresponding times: the initial closed mouth, the mouth at its widest when pronouncing "a", and the closed mouth after pronunciation ends. These form the state template for this segment, from which the corresponding annular elastic space model is established. Analyzing the two changes between the three stages of this model gives the force acting on each feature point of the annular elastic space model in each of the two stages. Extending the action of the force over a longer time series then yields all the transitional mouth-shape states in the two stages, namely the frames in which the mouth slowly opens and the frames in which it slowly closes. For example, if 30 frames must be built between two mouth-shape primitives to complete the video reconstruction, the force obtained from the analysis is divided into 30 parts, which are applied to the annular elastic space model one after another to produce 30 transitional mouth-shape states.
(4) Video output: the generated video replaces the clipped part of the original video, and the reconstructed video of announcer C is output through the Wi-Fi interface.
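The division of the force into 30 parts in step (3) can be sketched as follows. The sketch assumes that each feature point's displacement is proportional to the force applied to it, so that applying equal parts of the force one after another moves each point in equal steps toward its position in the next primitive. The text does not state the force-displacement law, so this is an assumption. The helper transition_frames, the convention that the last generated state coincides with the target primitive, and the sample coordinates are all hypothetical.

```python
def transition_frames(start, end, steps):
    """Generate `steps` transitional states between two mouth-shape primitives.

    `start` and `end` are equal-length lists of (x, y) feature points.
    ASSUMPTION: displacement is proportional to force, so applying 1/steps
    of the force at a time moves every point by an equal increment.
    """
    frames = []
    for s in range(1, steps + 1):
        frac = s / steps  # fraction of the total force applied so far
        frames.append([
            (x0 + (x1 - x0) * frac, y0 + (y1 - y0) * frac)
            for (x0, y0), (x1, y1) in zip(start, end)
        ])
    return frames


# Hypothetical feature points: upper-lip centre and lower-lip centre.
closed = [(0.0, 0.0), (0.0, 0.0)]
open_a = [(0.0, 3.0), (0.0, -3.0)]

frames = transition_frames(closed, open_a, 30)
print(len(frames))  # number of transitional states generated
print(frames[14])   # state after half of the force has been applied
```

Running the same helper from the open primitive back to the closed one would give the frames for the closing half of the segment.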
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations, and alterations may be made depending on design requirements and other factors, insofar as they fall within the scope of the appended claims and their equivalents.
Claims (14)
1. A human mouth-shape video reconstruction method, characterized by comprising the following four steps:
(1) Information read-in: human body information and mouth-shape information are read in through an input port, the human body information being selected from a single-frame image of a target object or a video composed of multiple frames, and the mouth-shape information being selected from any one or more of text, sound, image, and video;
(2) Preprocessing: the mouth-shape information read in through the input port is recognized and converted, and the converted mouth-shape information is displayed in real time on a display module; the human body information read in through the input port is analyzed and the position of the mouth is locked;
(3) Mouth-shape reconstruction: human mouth-shape video reconstruction is performed from the preprocessed mouth-shape information and human body information by a time-evolution method based on annular-elastic-space dynamics;
(4) Video output: the reconstructed human mouth-shape video is output through an output port;
wherein the annular elastic space is a planar space in which the order of points and the distance between points are defined, and which has the following 4 properties:
1) for any two points P1 and P2 in the annular elastic space, the distance between them is variable;
2) for any two points P1 and P2 in the annular elastic space, their order is constant, that is, for any third point P3 in the annular elastic space different from P1 and P2, the clockwise or counterclockwise order of the three points does not change under any transformation;
3) any point P in the annular elastic space can be acted on by a force F of magnitude f at an angle α to the horizontal axis, and its position changes as a result, appearing as a displacement from its original position along the direction at angle α to the horizontal axis;
4) when any point P in the annular elastic space is acted on by a force F, the force F, while affecting P, also affects the other points in the annular elastic space, each of which is subjected to an equivalent force of magnitude f' at an angle α' to the horizontal axis, called the correlation force; the spatial position of the other point relative to P determines α', and its distance from P determines f'; when that distance is greater than the influence range R, the point is considered unaffected by the correlation force of F.
2. The human mouth-shape video reconstruction method according to claim 1, characterized in that: the time-evolution method based on annular-elastic-space dynamics in step (3) is an associative inversion method, in which a live person acting as a synchronization object mimics the mouth-shape information shown on the display module, a real-time acquisition module captures the simulation video, and the simulation video is coupled with the read-in human body information on the basis of the annular elastic space, thereby completing the reconstruction of the human mouth-shape video.
3. The human mouth-shape video reconstruction method according to claim 1, characterized in that: the time-evolution method based on annular-elastic-space dynamics in step (3) is a logic correction method, in which, without relying on a live performance by a real person, a mouth-shape primitive module is called directly according to the required mouth-shape information to build a mouth-shape state template manually, and the missing transitional states are generated to complete the video reconstruction.
4. A human mouth-shape video reconstruction system for the reconstruction method according to claim 2, characterized in that: the video reconstruction system comprises an input port, an output port, a processing module, a display module, and a real-time acquisition module, wherein:
the input port is used to read in human body information and mouth-shape information, the human body information being selected from a single-frame image of a target object or a video composed of multiple frames, and the mouth-shape information being selected from any one or more of text, sound, image, and video;
the output port is used to output the reconstructed human mouth-shape video;
the display module is used to display, in real time, the mouth-shape information read in through the input port;
the processing module is used to convert the mouth-shape information read in through the input port and then reconstruct the human mouth-shape video on the basis of the human body information;
the real-time acquisition module is used to capture the video of the synchronization object in real time during reconstruction by the associative inversion method.
5. The human mouth-shape video reconstruction system according to claim 4, characterized in that: the real-time acquisition module is selected from any one or more of a digital imaging device and an intelligent terminal with a camera function.
6. The human mouth-shape video reconstruction system according to claim 4, characterized in that: the real-time acquisition module is selected from any one or more of a video camera and a still camera.
7. The human mouth-shape video reconstruction system according to claim 4, characterized in that: the real-time acquisition module is a webcam-type camera.
8. A human mouth-shape video reconstruction system for the reconstruction method according to claim 3, characterized in that: the video reconstruction system comprises an input port, an output port, a processing module, a display module, and a mouth-shape primitive module, wherein:
the input port is used to read in human body information and mouth-shape information, the human body information being selected from a single-frame image of a target object or a video composed of multiple frames, and the mouth-shape information being selected from any one or more of text, sound, image, and video;
the output port is used to output the reconstructed human mouth-shape video;
the display module is used to display, in real time, the mouth-shape information read in through the input port;
the processing module is used to convert the mouth-shape information read in through the input port and then reconstruct the human mouth-shape video on the basis of the human body information;
the mouth-shape primitive module is used to store the basic mouth-shape primitives so that they can be called during reconstruction by the logic correction method to build a mouth-shape state template manually.
9. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the processing module is a terminal capable of video and image processing and information analysis.
10. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the display module is selected from any one or more of a monitor, a display screen, and a projector.
11. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the display module is an intelligent terminal.
12. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the mouth-shape video reconstruction system is a desktop computer or notebook computer with a camera function.
13. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the mouth-shape video reconstruction system is a mobile intelligent terminal.
14. The human mouth-shape video reconstruction system according to any one of claims 4-8, characterized in that: the mouth-shape video reconstruction system is a smartphone, tablet computer, palmtop computer, or smart handheld device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310745441.XA CN103745462B (en) | 2013-12-27 | 2013-12-27 | A kind of human body mouth shape video reconfiguration system and reconstructing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103745462A CN103745462A (en) | 2014-04-23 |
CN103745462B (en) | 2016-11-02 |
Family
ID=50502477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310745441.XA Active CN103745462B (en) | 2013-12-27 | 2013-12-27 | A kind of human body mouth shape video reconfiguration system and reconstructing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103745462B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104298961B (en) * | 2014-06-30 | 2018-02-16 | 中国传媒大学 | Video method of combination based on Mouth-Shape Recognition |
CN108831463B (en) * | 2018-06-28 | 2021-11-12 | 广州方硅信息技术有限公司 | Lip language synthesis method and device, electronic equipment and storage medium |
CN109168067B (en) * | 2018-11-02 | 2022-04-22 | 深圳Tcl新技术有限公司 | Video time sequence correction method, correction terminal and computer readable storage medium |
CN114554267B (en) * | 2022-02-22 | 2024-04-02 | 上海艾融软件股份有限公司 | Audio and video synchronization method and device based on digital twin technology |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101101752A (en) * | 2007-07-19 | 2008-01-09 | 华中科技大学 | Monosyllabic language lip-reading recognition system based on vision character |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100332229A1 (en) * | 2009-06-30 | 2010-12-30 | Sony Corporation | Apparatus control based on visual lip share recognition |
-
2013
- 2013-12-27 CN CN201310745441.XA patent/CN103745462B/en active Active
Non-Patent Citations (2)
Title |
---|
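"Extraction of Visual Features for Lipreading"; Iain Matthews et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2002-02-28; vol. 24, no. 2; pp. 198-213 * |
"Orthogonal transform description of lip contours in a vision-driven speech synthesis system"; Li Gang et al.; Optics and Precision Engineering; 2007-07-31; vol. 15, no. 7; pp. 1117-1123 * |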
Also Published As
Publication number | Publication date |
---|---|
CN103745462A (en) | 2014-04-23 |
Similar Documents
Publication | Title |
---|---|
CN103745423B (en) | Mouth shape teaching system and teaching method |
CN103745462B (en) | A kind of human body mouth shape video reconfiguration system and reconstructing method |
CN108615009A (en) | Sign language interpretation and communication system based on dynamic hand gesture recognition |
CN113901894A (en) | Video generation method, device, server and storage medium |
CN103544724A (en) | System and method for realizing fictional cartoon character on mobile intelligent terminal by augmented reality and card recognition technology |
WO2020134436A1 (en) | Method for generating animated expression and electronic device |
CN101271591A (en) | Interactive multi-viewpoint three-dimensional model reconstruction method |
CN109801349A (en) | Sound-driven real-time expression generation method and system for three-dimensional animated characters |
CN110751708A (en) | Method and system for driving face animation in real time through voice |
CN111724458B (en) | Voice-driven three-dimensional face animation generation method and network structure |
CN110415701A (en) | Lip reading recognition method and device |
CN103778661B (en) | Method, system and computer for generating a speaker's three-dimensional motion model |
CN104376309A (en) | Method for structuring gesture movement basic element models on basis of gesture recognition |
CN110008961A (en) | Real-time text recognition method, device, computer equipment and storage medium |
CN107704817A (en) | Detection algorithm for animal face key points |
CN100487732C (en) | Method for generating cartoon portrait based on photo of human face |
CN108810561A (en) | Three-dimensional idol live broadcasting method and device based on artificial intelligence |
CN112419334A (en) | Micro surface material reconstruction method and system based on deep learning |
CN114697759B (en) | Virtual image video generation method and system, electronic device and storage medium |
CN111105487B (en) | Face synthesis method and device in virtual teacher system |
CN116416386A (en) | Digital twin L5-level simulation-based high-definition rendering and restoring system |
CN116563923A (en) | RGBD-based facial acupoint positioning method, digital twin system and device |
CN110322545A (en) | Campus three-dimensional digital modeling method, system, device and storage medium |
CN114494542A (en) | Character driving animation method and system based on convolutional neural network |
CN111582067B (en) | Facial expression recognition method, system, storage medium, computer program and terminal |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C14 | Grant of patent or utility model |
GR01 | Patent grant |