CN111640187B - Video stitching method and system based on interpolation transition - Google Patents

Video stitching method and system based on interpolation transition Download PDF

Info

Publication number
CN111640187B
CN111640187B CN202010310346.7A CN202010310346A CN111640187B CN 111640187 B CN111640187 B CN 111640187B CN 202010310346 A CN202010310346 A CN 202010310346A CN 111640187 B CN111640187 B CN 111640187B
Authority
CN
China
Prior art keywords
video
image
images
interpolation
transition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010310346.7A
Other languages
Chinese (zh)
Other versions
CN111640187A (en
Inventor
邢云冰
陈益强
戴连君
张钧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN202010310346.7A priority Critical patent/CN111640187B/en
Publication of CN111640187A publication Critical patent/CN111640187A/en
Application granted granted Critical
Publication of CN111640187B publication Critical patent/CN111640187B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Marketing (AREA)
  • Business, Economics & Management (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Circuits (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides a video stitching method and a system based on interpolation transition, comprising the following steps: unifying the object sizes before and after the video, searching the position of the optimal splicing point, unifying the illumination brightness before and after the video and the object position, calculating the number of interpolation transition images and generating an interpolation transition image sequence. The technical scheme provided by the invention has the technical effects of smooth and fluent video transition, high speed and strong instantaneity.

Description

Video stitching method and system based on interpolation transition
Technical Field
The invention relates to the technical field of computer vision and video processing, in particular to a method and a system for splicing videos by utilizing interpolation transition frames.
Background
Video stitching is an important item of video processing. In video communication, when the network bandwidth fluctuates, the variable code rate video communication system can dynamically adjust the code rate in a scalable coding mode (for example, adjusting resolution, frame rate or image quality), but for a fixed code rate video communication system, a receiving end can generate obvious blocking phenomenon, and the video communication system is particularly characterized by video time-out. For the first case, the received image sequence can be uniformly interpolated to improve the frame rate and smoothness of the video, and for the second case, the pause image sequence segments need to be supplemented to maintain the duration and continuity of the video. The uniform interpolation of the first case can be regarded as a special case of the second case.
Unlike image stitching at a spatial scale, video stitching is a process at a temporal scale that improves the smoothness of video playback by generating interpolated transition images between adjacent videos. Video stitching is also widely used in video synthesis and video editing, for example, in sign language video synthesis, each sign language word corresponds to a video segment, if video segments corresponding to different sign language words are directly connected, the connected sign language videos may have problems of hand movement and angle transfer deviation, and adjacent videos may generate obvious visual abrupt switching phenomenon during playing, so that interpolation transition between adjacent videos is required.
Currently, there are two main methods for splicing video clips: traditional optical flow-based methods and deep learning-based methods.
In the optical flow-based method, the optical flow of each pixel point in the image is calculated, the optical flow can represent the motion change information of the object in space, and the expression form of the optical flow of the interpolation image can be obtained through the linear relation between the optical flow of the interpolation image and the optical flow of the adjacent image so as to generate the interpolation transition image.
In the method based on deep learning, a mode of training and predicting is generally adopted, a generating model of an interpolation image is obtained through known continuous image data in a training stage, and an interpolation image with a transitional effect is finally generated by calculating an input model of an adjacent image in a predicting stage. In order to achieve a more natural and smooth playing effect, the spliced video can be subjected to re-smooth transition treatment.
In the optical flow based method, the quality of the interpolated image is very sensitive to the accuracy of the optical flow, and if the optical flow of a certain pixel point is wrong, all areas between the correct position and the wrong position in the interpolated image can be significantly affected. On the one hand, due to the complexity of searching, optical flow algorithms generally detect motion only in a small range, but the motion amplitude between non-continuous video segments (e.g., two different sign language videos) is generally large, so that it is difficult to accurately detect optical flow between adjacent images. On the other hand, the optical flow algorithm can accurately track pixel points (such as corner points) with obvious features, but is relatively easy to lose for objects with less textures (such as hands).
In the deep learning-based method, a large amount of data is generally required for training a model, and obtaining training data is a time-consuming and labor-consuming task, for example, in sign language video synthesis, continuous videos expressing a plurality of sign language words do not exist. Furthermore, the time complexity of deep learning is generally high, i.e. not suitable for real-time stitching situations.
The video image may be divided into a foreground and a background, the foreground being composed of one or more target objects, assuming P is used (i) Representing the ith target object, then an image containing L target objects
Figure BDA0002457527720000021
Wherein P is (0) Representing the background. Each target object comprises a plurality of key nodes, and part of the key nodes belong to anchor nodes. The key nodes refer to nodes capable of acquiring corresponding relations between adjacent images, and the anchor nodes are nodes with unchanged relative positions in the adjacent images. The anchor node only moves in a translation way, and the movement of the non-anchor key node consists of two parts of translation and rotation, and the related key node is required to be used as a fulcrum during rotation, namely the non-anchor key node rotates around the related key node.
Disclosure of Invention
In view of the above problems, the present invention provides a video stitching method based on interpolation transition, which includes:
step 1, acquiring adjacent video clips to be spliced, which comprise a first video and a tail video, and adjusting the sizes of target objects of all images in the tail video according to the average distance value of anchor nodes of each target object in the first video;
step 2, taking the average position value of the anchor node of each target object of each frame of image in the adjacent video segment as the origin of coordinates, obtaining the position of the non-anchor key node of the same target object in each frame of image and the direction between the key nodes, obtaining the splicing cost of the front and rear images of the first video and the tail video according to the position of the image in the video segment, and taking the image pair with the minimum splicing cost as the optimal splicing point image of the first video and the tail video respectively;
step 3, adjusting the brightness of the target objects of all images in the tail video by taking the average brightness value of each target object in the head video as a reference, calculating the average position value of the anchor node of each target object of the best splicing point image in the head video, and adjusting the positions of the target objects of all images in the tail video by taking the position value as a reference;
step 4, calculating the motion trail and the speed of all non-anchor key nodes of all target objects between two optimal splicing point images, and obtaining the number of interpolation transition images through the motion trail and the speed;
step 5, according to the position values of all key nodes of all target objects of two optimal splicing point images, interpolating the number of transitional images and the motion track of the key nodes to obtain the positions of the key nodes in each frame of interpolation transitional images, triangulating the boundary nodes of the key nodes and the images in each frame of interpolation transitional images to form triangular grids, respectively taking the vertexes of each triangle as target vertexes, respectively taking the corresponding nodes of two optimal splicing point images as source vertexes, calculating affine transformation of each pair of triangles to obtain transformation matrixes, respectively mapping triangular images formed by the source vertexes to triangular areas formed by the target vertexes, calculating transformation matrixes of all triangles in the triangular grids to form two mapped image areas, respectively corresponding to the two optimal splicing point images;
and 6, classifying the areas of the interpolation transition images according to the environment types of the areas in the two optimal splicing point images, respectively mixing the two optimal splicing point images according to the types of the areas in the interpolation transition images and the preset proportion to obtain an interpolation transition image, and splicing an interpolation transition image sequence containing the interpolation transition image between the head video and the tail video to obtain a final spliced video.
In the video stitching method based on interpolation transition, the classifying the region of the interpolation transition image in the step 6 specifically includes:
the regions of the interpolated transition image are divided into three classes: the first type of region is that the corresponding region in the two best splice point images is foreground or background, the second type of region is that the corresponding region of the previous best splice point image is foreground and the corresponding region of the next best splice point image is background, and the third type of region is that the corresponding region of the previous best splice point image is background and the corresponding region of the next best splice point image is foreground.
The video stitching method based on interpolation transition, wherein the step 6 comprises the following steps:
according to the position k of the interpolation transition image in the interpolation transition image sequence, the previous best splicing point image P is arranged in the first type area T And the last best splice point image Q S Is scaled to the pixels of (2)
Figure BDA0002457527720000031
And->
Figure BDA0002457527720000032
Mixing, in the second mapping region, P T Is proportional to +.>
Figure BDA0002457527720000033
Reduction, Q S Is proportional to +.>
Figure BDA0002457527720000034
Increasing, in the third mapping region, Q S Is proportional to +.>
Figure BDA0002457527720000035
Increase, P T Is proportional to +.>
Figure BDA0002457527720000036
Reduction;
wherein, k is more than or equal to 1 and less than or equal to L, and L is the number of frames of the interpolation transition image sequence.
The video stitching method based on interpolation transition, wherein the stitching cost of the front and rear images of the first video and the tail video obtained in the step 2 specifically includes:
Figure BDA0002457527720000041
where alpha and beta are the balance parameters,
Figure BDA0002457527720000042
for image Q in tail video j Is provided for the location of the non-anchor critical node,
Figure BDA0002457527720000043
for image Q in tail video j Direction between related critical nodes of (a), a ∈>
Figure BDA0002457527720000044
Image P in the first video i Positions of non-anchor critical nodes, +.>
Figure BDA0002457527720000045
Image P in the first video i I and j are the positions of the image in the first video and the last video, respectively.
In the video stitching method based on interpolation transition, in the step 4, the motion track is a straight line or a curve.
The invention also provides a video splicing system based on interpolation transition, which comprises the following steps:
the method comprises the steps that a module 1, adjacent video clips to be spliced, comprising a first video and a tail video, are obtained, and the sizes of target objects of all images in the tail video are adjusted according to the average distance value of anchor nodes of each target object in the first video;
the module 2 takes the average position value of the anchor node of each target object of each frame of image in the adjacent video segment as the origin of coordinates to obtain the position of the non-anchor key node of the same target object in each frame of image and the direction between the key nodes, and obtains the splicing cost of the front and rear images of the head video and the tail video according to the position of the image in the video segment, and takes the image pair with the minimum splicing cost as the optimal splicing point image of the head video and the tail video respectively;
the module 3 adjusts the brightness of the target objects of all the images in the tail video by taking the average brightness value of each target object in the head video as a reference, calculates the average position value of the anchor node of each target object of the best splicing point image in the head video, and adjusts the positions of the target objects of all the images in the tail video by taking the position value as a reference;
the module 4 calculates the motion trail and the speed of all non-anchor key nodes of all target objects between two optimal splicing point images, and obtains the number of interpolation transition images through the motion trail and the speed;
the module 5 interpolates the number of transition images and the motion track of the key nodes according to the position values of all key nodes of all target objects of two optimal splicing point images to obtain the positions of the key nodes in each frame of interpolation transition image, triangulates the boundary nodes of the key nodes and the images in each frame of interpolation transition image as a scattered point set to form triangular grids, calculates affine transformation of each pair of triangles by taking the vertexes of each triangle as the destination vertexes and the corresponding nodes of the two optimal splicing point images as the source vertexes to obtain transformation matrixes, maps the triangle images formed by the source vertexes to triangle areas formed by the destination vertexes, calculates transformation matrixes of all triangles in the triangular grids to form two mapped image areas, and corresponds to the two optimal splicing point images respectively;
and a module 6, classifying the areas of the interpolation transition images according to the environment types of the areas in the two optimal splicing point images, respectively mixing the two optimal splicing point images according to the types of the areas in the interpolation transition images and the preset proportion to obtain the interpolation transition images, and splicing the interpolation transition image sequence containing the interpolation transition images between the head video and the tail video to obtain the final spliced video.
The video stitching system based on interpolation transition, wherein the classifying the region of the interpolation transition image in the module 6 specifically includes:
the regions of the interpolated transition image are divided into three classes: the first type of region is that the corresponding region in the two best splice point images is foreground or background, the second type of region is that the corresponding region of the previous best splice point image is foreground and the corresponding region of the next best splice point image is background, and the third type of region is that the corresponding region of the previous best splice point image is background and the corresponding region of the next best splice point image is foreground.
The video stitching system based on interpolation transition, wherein the module 6 comprises:
according to the position k of the interpolation transition image in the interpolation transition image sequence, the previous best splicing point image P is arranged in the first type area T And the last best splice point image Q S Is scaled to the pixels of (2)
Figure BDA0002457527720000051
And->
Figure BDA0002457527720000052
Mixing, in the second mapping region, P T Is proportional to +.>
Figure BDA0002457527720000053
Reduction, Q S Is proportional to +.>
Figure BDA0002457527720000054
Increasing, in the third mapping region, Q S Is proportional to +.>
Figure BDA0002457527720000055
Increase, P T Is proportional to +.>
Figure BDA0002457527720000056
Reduction;
wherein, k is more than or equal to 1 and less than or equal to L, and L is the number of frames of the interpolation transition image sequence.
The video stitching system based on interpolation transition, wherein the stitching cost of the front and rear images of the first video and the tail video obtained in the module 2 specifically includes:
Figure BDA0002457527720000057
where alpha and beta are the balance parameters,
Figure BDA0002457527720000058
for image Q in tail video j Is provided for the location of the non-anchor critical node,
Figure BDA0002457527720000059
for image Q in tail video j Direction between related critical nodes of (a), a ∈>
Figure BDA00024575277200000510
Image P in the first video i Positions of non-anchor critical nodes, +.>
Figure BDA00024575277200000511
Image P in the first video i I and j are the positions of the image in the first video and the last video, respectively.
The video stitching system based on interpolation transition, wherein the motion track in the module 4 is a straight line or a curve.
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
(1) The video transition is smooth and has a gradual transition effect. The invention divides the video image into a background, a rigid foreground and a flexible foreground, different areas have different transition strategies, the pixel values of the overlapped areas are mixed in proportion, the pixels of the non-overlapped areas are increased or decreased in proportion, and the invention accords with the reaction process of human eyes for observing moving objects.
(2) The speed is high, and the instantaneity is high. No search analysis or network recursion is required for the entire frame of images. The method is characterized in that all calculation processing is based on a single pixel, the main performance bottleneck calculates the shortest distance in the image mixing stage, but the distance comparison is only limited in a triangle grid, the calculation complexity is inversely proportional to the number of key nodes, and the result can be obtained in advance by adopting a proper distance measurement and searching mode.
Drawings
FIG. 1 is a diagram of the location and orientation of key nodes of a human body (head-to-tail is a stitching point image);
FIG. 2 is a triangulation (intermediate image of FIG. 1);
FIG. 3 is a triangle area and corresponding stitching point image;
fig. 4 is an interpolated transition image.
Detailed Description
The invention provides a video stitching method based on interpolation transition, which comprises the following steps:
step one, size consistency.
In the adjacent video segments, the average distance value of the anchor node of each target object in the previous video segment (the first video segment) is calculated, and the size of the same target object of all images in the next video segment (the last video segment) is adjusted by taking the distance value as a reference. The sizes of all target objects of all images in the subsequent video segment are adjusted.
And step two, searching the position of the optimal splicing point.
And calculating the average position value of the anchor node of each target object of each frame of image in the adjacent video segment, and recalculating the position of the non-anchor key node of the same target object in the image and the direction between the related key nodes by taking the position value as the origin of coordinates.
In the adjacent video segments, the splicing cost of the front and rear images is calculated, wherein the splicing cost comprises the positions of the images in the video segments, the distances of the corresponding non-anchor key nodes of all target objects and the direction difference between the related key nodes of all target objects. And taking the image pair with the minimum splicing cost as the optimal splicing point image of the adjacent video clips respectively. All images following the best splice point in the previous video and all images preceding the best splice point in the next video are deleted.
And thirdly, illumination consistency and position consistency.
In the adjacent video segments, calculating the average brightness value of each target object in the previous video segment, and adjusting the brightness of the same target object of all images in the next video segment by taking the brightness value as a reference. The brightness of all target objects of all images in the subsequent video is adjusted.
In the adjacent video segments, calculating the average position value of the anchor node of each target object of the splicing point image in the previous video segment, and adjusting the positions of the same target objects of all images in the subsequent video segment by taking the position value as a reference. The positions of all target objects of all images in the subsequent video are adjusted.
And step four, calculating the number of interpolation transition images.
And calculating the longest motion trail of all non-anchor key nodes of all target objects of the splice point images in the adjacent video segments, wherein the longest motion trail can be a straight line or a curve. And respectively calculating the speed of the corresponding non-anchor key nodes of the splicing point image, namely the distance between the splicing point image and the corresponding nodes of the adjacent images. And obtaining the number of interpolation transition images through the motion trail and the speed.
And fifthly, generating an interpolation transition image sequence.
And interpolating the number of the transition images and the motion track of the key nodes according to the position values of all the key nodes of all the target objects of the splicing point images in the adjacent video segments to obtain the positions of the corresponding key nodes in each frame of the interpolation transition images.
In each frame of interpolation transition image, triangulation is carried out by taking key nodes and boundary nodes of the image as a scattered point set, so as to form a triangular grid. And taking the vertex of each triangle as a target vertex, respectively taking the corresponding nodes of the two stitching point images as source vertices, and calculating affine transformation of each pair of triangles. And mapping the triangle image formed by the source vertexes to the triangle area formed by the destination vertexes according to the transformation matrix. And calculating transformation matrixes of all triangles in the triangle mesh to form two mapped image areas, wherein the two mapped image areas correspond to the two splicing point images respectively.
The regions of the interpolated transition image are divided into three classes: the corresponding areas in the two splicing point images are foreground or background, the corresponding area of the previous splicing point image is foreground and the corresponding area of the next splicing point image is background, and the corresponding area of the previous splicing point image is background and the corresponding area of the next splicing point image is foreground. According to the position of the interpolation image in the whole image sequence, in a first type region, the pixels of two splicing point images are mixed proportionally, in a second type region, the foreground pixels of the previous splicing point image are reduced proportionally, the background pixels of the next splicing point image are correspondingly increased, and in a third type region, the foreground pixels of the next splicing point image are increased proportionally, and the background pixels of the previous splicing point image are correspondingly reduced. The order of increasing and decreasing is measured as the shortest distance between the foreground pixel and the foreground of the other splice point image.
In order to make the above features and effects of the present invention more clearly understood, the following specific examples are given with reference to the accompanying drawings.
For ease of understanding, a possible application scenario of the method according to the invention is first given before a detailed description of the method according to the invention is given. In sign language video synthesis, video segments corresponding to different sign language words need to be spliced, and a previous sign language video is formed by N frames of images and is marked as { P } 1 ,P 2 …P N The later sign language video consists of M frames of images, which is marked as { Q } M ,Q M-1 …Q 1 And (3)
Figure BDA0002457527720000081
Figure BDA0002457527720000082
Wherein->
Figure BDA0002457527720000083
And
Figure BDA0002457527720000084
respectively represent the images P i And Q j Is of a single color, ++>
Figure BDA0002457527720000085
And->
Figure BDA0002457527720000086
Representing the human body and being the only target object in the foreground. In sign language movements, the main moving objects are hands and arms, and the human torso can be considered to be approximately rigid. Thus, skeletal nodes of the human body are key nodes such as left and right hands, left and right elbows, left and right five fingers, etc., wherein the neck, left and right shoulders and left and right hips are anchor nodes whose coordinate values are expressed as (X) joint ,Y joint ) For example, the coordinate value of the left hand is (X LHand ,Y LHand )。
For the above application scenario, an embodiment of the interpolation transition based video stitching of the present invention is given below. The basic steps are as follows:
step one, size consistency.
Respectively calculating the previous video { P } 1 ,P 2 …P N Sum-next video { Q } M ,Q M-1 …Q 1 Average distance value (W of human anchor node in } P ,H P ) And (W) Q ,H Q )。
Figure BDA0002457527720000087
Figure BDA0002457527720000088
Figure BDA0002457527720000089
Figure BDA00024575277200000810
Respectively in W P /W Q And H P /H Q To scale { Q } M ,Q M-1 …Q 1 Width and height.
Step two, searching the position of the optimal splicing point
Calculating the previous video { P } 1 ,P 2 …P N Every frame of picture P i Average position value of human body anchor node
Figure BDA00024575277200000811
Figure BDA00024575277200000812
/>
Figure BDA0002457527720000091
To be used for
Figure BDA0002457527720000092
Recalculating the image P for the origin of coordinates i Is not the location of the anchor critical node of the human body
Figure BDA0002457527720000093
And the direction between the relevant key nodes +.>
Figure BDA0002457527720000094
Figure BDA0002457527720000095
Figure BDA0002457527720000096
Figure BDA0002457527720000097
Wherein the method comprises the steps of
Figure BDA0002457527720000098
Is the location of the relevant key node, for example for the left hand, which is the left elbow.
Similarly, calculate the next video { Q } M ,Q M-1 …Q 1 Each frame of image Q j Is not the location of the anchor critical node of the human body
Figure BDA0002457527720000099
And the direction between the relevant key nodes +.>
Figure BDA00024575277200000910
Calculating the front and rear images P i And Q j Is a concatenation cost (P) i ,Q j )。
Figure BDA00024575277200000911
Where α and β are balance parameters, α determines the relative importance of the image in the video segment position relative to the distance corresponding to the human non-anchor key node, and β balances the relative importance of the directional difference between the relevant key nodes.
Taking the image pair P with minimum splicing cost T And Q S Respectively as { P ] 1 ,P 2 …P N Sum { Q } M ,Q M-1 …Q 1 The best splice point image. Delete { P ] 1 ,P 2 …P N All images after number T in { Q }, Q M ,Q M-1 …Q 1 All images preceding number S in }.
Step three, illumination consistency and position consistency
Respectively calculating the previous video { P } 1 ,P 2 …P T Sum of the lastSegment video { Q S ,Q S-1 …Q 1 Average brightness value L of human body in } P And L Q Will { Q S ,Q S-1 …Q 1 Brightness of human body of each frame image in } is heightened by L P -L Q
Separately computing splice point images P T And Q S Average position value of human body anchor node
Figure BDA00024575277200000912
And->
Figure BDA00024575277200000913
Figure BDA00024575277200000914
Figure BDA00024575277200000915
Figure BDA00024575277200000916
Figure BDA00024575277200000917
Respectively by
Figure BDA0002457527720000101
And->
Figure BDA0002457527720000102
For moving scale, adjust { Q S ,Q S-1 …Q 1 Location of }. />
Step four, calculating the number of interpolation transition images
Computing a splice point image P T And Q S The longest motion track of the human body non-anchor key node is in the embodiment of the inventionWherein the motion trail is parabolic.
The general trajectory equation for a parabola is:
Figure BDA0002457527720000103
where a, b, c, θ are parameters of a parabola and t is an intermediate variable of the equation. To be used for
Figure BDA0002457527720000104
Figure BDA0002457527720000105
Solving a, b, c and theta to obtain a motion track for the known points.
Calculation of P T And Q S Track length of human body non-anchor key node
Figure BDA0002457527720000106
Figure BDA0002457527720000107
The longest trace is marked as D ST Respectively calculate D ST Speed of corresponding node
Figure BDA0002457527720000108
And->
Figure BDA0002457527720000109
Figure BDA00024575277200001010
Figure BDA00024575277200001011
In the embodiment of the invention, if the non-anchor key nodes of the human body move at a constant speed along the parabolic track, the number of the interpolation transition images is as follows
Figure BDA00024575277200001012
Step five, generating an interpolation transition image sequence
The generated interpolation transition image sequence consists of L frame images and is marked as { R } 1 ,R 2 …R L -computing the splice point image P in a step four manner T And Q S Uniformly dividing each motion track into L+1 parts, wherein the dividing points are interpolation transition images R of each frame k The position of the key node corresponding to the human body, wherein k is more than or equal to 1 and less than or equal to L. As shown in fig. 1.
At R k In the method, 8 boundary nodes (upper left, upper right, lower left, lower right, middle left, middle right, middle upper and lower middle) of the human body key nodes and the images are used as a scattered point set to perform triangulation to form a triangular grid, and the triangular grid is shown in figure 2. The vertex of each triangle is used as the target vertex, and P is respectively used as the vertex T And Q S Is used as a source vertex, and affine transformation of each pair of triangles is calculated. According to the transformation matrix, the triangle image composed of the source vertexes is mapped to the triangle region composed of the destination vertexes, respectively, as shown in fig. 3. Calculating transformation matrixes of all triangles in the triangle mesh to form two mapped image areas, wherein the two mapped image areas correspond to P respectively T And Q S
R is R k Is divided into three categories: first is P T And Q S The corresponding areas in the two are foreground or background, and P T Corresponding region of (1) is foreground and Q S The corresponding region of (2) is background, thirdly P T Corresponding region of (1) is background and Q S Is foreground. According to R k Position k in the interpolated transition image sequence, P in the first type region T And Q S Is scaled to the pixels of (2)
Figure BDA0002457527720000111
And->
Figure BDA0002457527720000112
Mixing, in the second mapping region, P T Is proportional to +.>
Figure BDA0002457527720000113
Reduction, Q S Correspondingly increasing background pixels of (2), in the third class of mapping region, Q S Is proportional to +.>
Figure BDA0002457527720000114
Increase, P T Correspondingly reduced background pixels. The order of increasing and decreasing is measured by the shortest distance of the foreground pixel and the foreground of another splice point image, as shown in fig. 4. Preferably, to calculate the shortest distance with minimal temporal and spatial complexity, manhattan distance is used as a metric and the search is spread around from the current position of the foreground pixel. Where the order refers to, taking the second type of region as an example, assume that the region has 100 pixels in total, 30 of which are foreground, each corresponding to one of the shortest distances. If k=1, l=9, it is indicated that the 1 st frame interpolation image only retains P T 90% of the foreground pixels, i.e. proportionally reduced +.>
Figure BDA0002457527720000115
The 10% decrease is those pixels with the greatest shortest distance, and the 10% decrease is those areas, then Q S I.e. corresponding increase.
It should be understood that, according to an embodiment of the present invention, any method in the prior art may be used to obtain key nodes (including anchor nodes) of a human body, for example, the key nodes may be obtained directly by using a Kinect camera, or may be obtained by calculating by using a human body gesture recognition (e.g. deep learning) method.
The following is a system example corresponding to the above method example, and this embodiment mode may be implemented in cooperation with the above embodiment mode. The related technical details mentioned in the above embodiments are still valid in this embodiment, and in order to reduce repetition, they are not repeated here. Accordingly, the related technical details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides a video splicing system based on interpolation transition, which comprises the following steps:
the method comprises the steps that a module 1, adjacent video clips to be spliced, comprising a first video and a tail video, are obtained, and the sizes of target objects of all images in the tail video are adjusted according to the average distance value of anchor nodes of each target object in the first video;
the module 2 takes the average position value of the anchor node of each target object of each frame of image in the adjacent video segment as the origin of coordinates to obtain the position of the non-anchor key node of the same target object in each frame of image and the direction between the key nodes, and obtains the splicing cost of the front and rear images of the head video and the tail video according to the position of the image in the video segment, and takes the image pair with the minimum splicing cost as the optimal splicing point image of the head video and the tail video respectively;
the module 3 adjusts the brightness of the target objects of all the images in the tail video by taking the average brightness value of each target object in the head video as a reference, calculates the average position value of the anchor node of each target object of the best splicing point image in the head video, and adjusts the positions of the target objects of all the images in the tail video by taking the position value as a reference;
the module 4 calculates the motion trail and the speed of all non-anchor key nodes of all target objects between two optimal splicing point images, and obtains the number of interpolation transition images through the motion trail and the speed;
the module 5 interpolates the number of transition images and the motion track of the key nodes according to the position values of all key nodes of all target objects of two optimal splicing point images to obtain the positions of the key nodes in each frame of interpolation transition image, triangulates the boundary nodes of the key nodes and the images in each frame of interpolation transition image as a scattered point set to form triangular grids, calculates affine transformation of each pair of triangles by taking the vertexes of each triangle as the destination vertexes and the corresponding nodes of the two optimal splicing point images as the source vertexes to obtain transformation matrixes, maps the triangle images formed by the source vertexes to triangle areas formed by the destination vertexes, calculates transformation matrixes of all triangles in the triangular grids to form two mapped image areas, and corresponds to the two optimal splicing point images respectively;
and a module 6, classifying the areas of the interpolation transition images according to the environment types of the areas in the two optimal splicing point images, respectively mixing the two optimal splicing point images according to the types of the areas in the interpolation transition images and the preset proportion to obtain the interpolation transition images, and splicing the interpolation transition image sequence containing the interpolation transition images between the head video and the tail video to obtain the final spliced video.
The video stitching system based on interpolation transition, wherein the classifying the region of the interpolation transition image in the module 6 specifically includes:
the regions of the interpolated transition image are divided into three classes: the first type of region is that the corresponding region in the two best splice point images is foreground or background, the second type of region is that the corresponding region of the previous best splice point image is foreground and the corresponding region of the next best splice point image is background, and the third type of region is that the corresponding region of the previous best splice point image is background and the corresponding region of the next best splice point image is foreground.
The video stitching system based on interpolation transition, wherein the module 6 comprises:
according to the position k of the interpolation transition image in the interpolation transition image sequence, the previous best splicing point image P is arranged in the first type area T And the last best splice point image Q S Is scaled to the pixels of (2)
Figure BDA0002457527720000121
And->
Figure BDA0002457527720000122
Mixing, in the second mapping region, P T Is proportional to +.>
Figure BDA0002457527720000123
Reduction, Q S Is proportional to +.>
Figure BDA0002457527720000131
Increasing, in the third mapping region, Q S Is proportional to +.>
Figure BDA0002457527720000132
Increase, P T Is proportional to +.>
Figure BDA0002457527720000133
Reduction;
wherein, k is more than or equal to 1 and less than or equal to L, and L is the number of frames of the interpolation transition image sequence.
The video stitching system based on interpolation transition, wherein the stitching cost of the front and rear images of the first video and the tail video obtained in the module 2 specifically includes:
Figure BDA0002457527720000134
where alpha and beta are the balance parameters,
Figure BDA0002457527720000135
for image Q in tail video j Is provided for the location of the non-anchor critical node,
Figure BDA0002457527720000136
for image Q in tail video j Direction between related critical nodes of (a), a ∈>
Figure BDA0002457527720000137
Image P in the first video i Positions of non-anchor critical nodes, +.>
Figure BDA0002457527720000138
Image P in the first video i I and j are the positions of the image in the first video and the last video, respectively. />

Claims (10)

1. A method of video stitching based on interpolation transitions, comprising:
step 1, acquiring adjacent video clips to be spliced, which comprise a first video and a tail video, and adjusting the sizes of target objects of all images in the tail video according to the average distance value of anchor nodes of each target object in the first video;
step 2, taking the average position value of the anchor node of each target object of each frame of image in the adjacent video segment as the origin of coordinates, obtaining the position of the non-anchor key node of the same target object in each frame of image and the direction between the key nodes, obtaining the splicing cost of the front and rear images of the first video and the tail video according to the position of the image in the video segment, and taking the image pair with the minimum splicing cost as the optimal splicing point image of the first video and the tail video respectively;
step 3, adjusting the brightness of the target objects of all images in the tail video by taking the average brightness value of each target object in the head video as a reference, calculating the average position value of the anchor node of each target object of the best splicing point image in the head video, and adjusting the positions of the target objects of all images in the tail video by taking the position value as a reference;
step 4, calculating the motion trail and the speed of all non-anchor key nodes of all target objects between two optimal splicing point images, and obtaining the number of interpolation transition images through the motion trail and the speed;
step 5, according to the position values of all key nodes of all target objects of two optimal splicing point images, interpolating the number of transitional images and the motion track of the key nodes to obtain the positions of the key nodes in each frame of interpolation transitional images, triangulating the boundary nodes of the key nodes and the images in each frame of interpolation transitional images to form triangular grids, respectively taking the vertexes of each triangle as target vertexes, respectively taking the corresponding nodes of two optimal splicing point images as source vertexes, calculating affine transformation of each pair of triangles to obtain transformation matrixes, respectively mapping triangular images formed by the source vertexes to triangular areas formed by the target vertexes, calculating transformation matrixes of all triangles in the triangular grids to form two mapped image areas, respectively corresponding to the two optimal splicing point images;
and 6, classifying the areas of the interpolation transition images according to the environment types of the areas in the two optimal splicing point images, respectively mixing the two optimal splicing point images according to the types of the areas in the interpolation transition images and the preset proportion to obtain an interpolation transition image, and splicing an interpolation transition image sequence containing the interpolation transition image between the head video and the tail video to obtain a final spliced video.
2. The method for video stitching based on interpolation transition according to claim 1, wherein classifying the region of the interpolation transition image in the step 6 specifically includes:
the regions of the interpolated transition image are divided into three classes: the first type of region is that the corresponding region in the two best splice point images is foreground or background, the second type of region is that the corresponding region of the previous best splice point image is foreground and the corresponding region of the next best splice point image is background, and the third type of region is that the corresponding region of the previous best splice point image is background and the corresponding region of the next best splice point image is foreground.
3. The interpolation transition based video stitching method as recited in claim 2, wherein the step 6 includes:
according to the position k of the interpolation transition image in the interpolation transition image sequence, the previous best splicing point image P is arranged in the first type area T And the last best splice point image Q S Is scaled to the pixels of (2)
Figure FDA0002457527710000021
And->
Figure FDA0002457527710000022
Mixing, in the second mapping region, P T Foreground pixels of (a)Proportional->
Figure FDA0002457527710000023
Reduction, Q S Is proportional to +.>
Figure FDA0002457527710000024
Increasing, in the third mapping region, Q S Is proportional to +.>
Figure FDA0002457527710000025
Increase, P T Is proportional to +.>
Figure FDA0002457527710000026
Reduction;
wherein, k is more than or equal to 1 and less than or equal to L, and L is the number of frames of the interpolation transition image sequence.
4. The method for video stitching based on interpolation transition according to claim 1, 2 or 3, wherein the stitching cost of the front and rear images of the first video and the tail video obtained in the step 2 specifically includes:
Figure FDA0002457527710000027
where alpha and beta are the balance parameters,
Figure FDA0002457527710000028
for image Q in tail video j Positions of non-anchor critical nodes, +.>
Figure FDA0002457527710000029
For image Q in tail video j Direction between related critical nodes of (a), a ∈>
Figure FDA00024575277100000210
Image P in the first video i Positions of non-anchor critical nodes, +.>
Figure FDA00024575277100000211
Image P in the first video i I and j are the positions of the image in the first video and the last video, respectively.
5. The interpolation transition-based video stitching method according to claim 1, wherein the motion trajectory in step 4 is a straight line or a curved line.
6. A video stitching system based on interpolation transitions, comprising:
the method comprises the steps that a module 1, adjacent video clips to be spliced, comprising a first video and a tail video, are obtained, and the sizes of target objects of all images in the tail video are adjusted according to the average distance value of anchor nodes of each target object in the first video;
the module 2 takes the average position value of the anchor node of each target object of each frame of image in the adjacent video segment as the origin of coordinates to obtain the position of the non-anchor key node of the same target object in each frame of image and the direction between the key nodes, and obtains the splicing cost of the front and rear images of the head video and the tail video according to the position of the image in the video segment, and takes the image pair with the minimum splicing cost as the optimal splicing point image of the head video and the tail video respectively;
the module 3 adjusts the brightness of the target objects of all the images in the tail video by taking the average brightness value of each target object in the head video as a reference, calculates the average position value of the anchor node of each target object of the best splicing point image in the head video, and adjusts the positions of the target objects of all the images in the tail video by taking the position value as a reference;
the module 4 calculates the motion trail and the speed of all non-anchor key nodes of all target objects between two optimal splicing point images, and obtains the number of interpolation transition images through the motion trail and the speed;
the module 5 interpolates the number of transition images and the motion track of the key nodes according to the position values of all key nodes of all target objects of two optimal splicing point images to obtain the positions of the key nodes in each frame of interpolation transition image, triangulates the boundary nodes of the key nodes and the images in each frame of interpolation transition image as a scattered point set to form triangular grids, calculates affine transformation of each pair of triangles by taking the vertexes of each triangle as the destination vertexes and the corresponding nodes of the two optimal splicing point images as the source vertexes to obtain transformation matrixes, maps the triangle images formed by the source vertexes to triangle areas formed by the destination vertexes, calculates transformation matrixes of all triangles in the triangular grids to form two mapped image areas, and corresponds to the two optimal splicing point images respectively;
and a module 6, classifying the areas of the interpolation transition images according to the environment types of the areas in the two optimal splicing point images, respectively mixing the two optimal splicing point images according to the types of the areas in the interpolation transition images and the preset proportion to obtain the interpolation transition images, and splicing the interpolation transition image sequence containing the interpolation transition images between the head video and the tail video to obtain the final spliced video.
7. The interpolation transition-based video stitching system of claim 6 wherein the module 6 classifies regions of the interpolation transition image in particular comprising:
the regions of the interpolated transition image are divided into three classes: the first type of region is that the corresponding region in the two best splice point images is foreground or background, the second type of region is that the corresponding region of the previous best splice point image is foreground and the corresponding region of the next best splice point image is background, and the third type of region is that the corresponding region of the previous best splice point image is background and the corresponding region of the next best splice point image is foreground.
8. The interpolation transition based video stitching system of claim 7, wherein the module 6 includes:
interpolation is carried out according to the interpolation transition imagePosition k in the transition image sequence, in the first type of region, the previous best splice point image P T And the last best splice point image Q S Is scaled to the pixels of (2)
Figure FDA0002457527710000031
And->
Figure FDA0002457527710000032
Mixing, in the second mapping region, P T Is proportional to +.>
Figure FDA0002457527710000033
Reduction, Q S Is proportional to +.>
Figure FDA0002457527710000041
Increasing, in the third mapping region, Q S Is proportional to +.>
Figure FDA0002457527710000042
Increase, P T Is proportional to +.>
Figure FDA0002457527710000043
Reduction;
wherein, k is more than or equal to 1 and less than or equal to L, and L is the number of frames of the interpolation transition image sequence.
9. The interpolation transition-based video stitching system according to claim 6, 7 or 8, wherein the cost of obtaining the front and rear images of the front video and the rear video in the module 2 specifically includes:
Figure FDA0002457527710000044
where alpha and beta are the balance parameters,
Figure FDA0002457527710000045
for image Q in tail video j Positions of non-anchor critical nodes, +.>
Figure FDA0002457527710000046
For image Q in tail video j Direction between related critical nodes of (a), a ∈>
Figure FDA0002457527710000047
Image P in the first video i Positions of non-anchor critical nodes, +.>
Figure FDA0002457527710000048
Image P in the first video i I and j are the positions of the image in the first video and the last video, respectively.
10. The interpolation transition-based video stitching system according to claim 6, wherein the motion profile in block 4 is a straight line or a curved line.
CN202010310346.7A 2020-04-20 2020-04-20 Video stitching method and system based on interpolation transition Active CN111640187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010310346.7A CN111640187B (en) 2020-04-20 2020-04-20 Video stitching method and system based on interpolation transition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010310346.7A CN111640187B (en) 2020-04-20 2020-04-20 Video stitching method and system based on interpolation transition

Publications (2)

Publication Number Publication Date
CN111640187A CN111640187A (en) 2020-09-08
CN111640187B true CN111640187B (en) 2023-05-02

Family

ID=72332727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010310346.7A Active CN111640187B (en) 2020-04-20 2020-04-20 Video stitching method and system based on interpolation transition

Country Status (1)

Country Link
CN (1) CN111640187B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200739A (en) * 2020-09-30 2021-01-08 北京大米科技有限公司 Video processing method and device, readable storage medium and electronic equipment
WO2022193090A1 (en) * 2021-03-15 2022-09-22 深圳市大疆创新科技有限公司 Video processing method, electronic device and computer-readable storage medium
CN113518235B (en) * 2021-04-30 2023-11-28 广州繁星互娱信息科技有限公司 Live video data generation method, device and storage medium
CN113766275B (en) * 2021-09-29 2023-05-30 北京达佳互联信息技术有限公司 Video editing method, device, terminal and storage medium
CN114125324B (en) * 2021-11-08 2024-02-06 北京百度网讯科技有限公司 Video stitching method and device, electronic equipment and storage medium
CN114286174B (en) * 2021-12-16 2023-06-20 天翼爱音乐文化科技有限公司 Video editing method, system, equipment and medium based on target matching

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1694512A (en) * 2005-06-24 2005-11-09 清华大学 Synthesis method of virtual viewpoint in interactive multi-viewpoint video system
CN102201115A (en) * 2011-04-07 2011-09-28 湖南天幕智能科技有限公司 Real-time panoramic image stitching method of aerial videos shot by unmanned plane
CN102307309A (en) * 2011-07-29 2012-01-04 杭州电子科技大学 Somatosensory interactive broadcasting guide system and method based on free viewpoints
CN102903085A (en) * 2012-09-25 2013-01-30 福州大学 Rapid image mosaic method based on corner matching
CN102999901A (en) * 2012-10-17 2013-03-27 中国科学院计算技术研究所 Method and system for processing split online video on the basis of depth sensor
CN103489165A (en) * 2013-10-01 2014-01-01 中国人民解放军国防科学技术大学 Decimal lookup table generation method for video stitching
CN103501415A (en) * 2013-10-01 2014-01-08 中国人民解放军国防科学技术大学 Overlap structural deformation-based video real-time stitching method
CN103533266A (en) * 2013-10-01 2014-01-22 中国人民解放军国防科学技术大学 360-degree stitched-type panoramic camera with wide view field in vertical direction
CN103646416A (en) * 2013-12-18 2014-03-19 中国科学院计算技术研究所 Three-dimensional cartoon face texture generation method and device
CN104091318A (en) * 2014-06-16 2014-10-08 北京工业大学 Chinese sign language video transition frame synthesizing method
CN104103050A (en) * 2014-08-07 2014-10-15 重庆大学 Real video recovery method based on local strategies
CN107426524A (en) * 2017-06-06 2017-12-01 微鲸科技有限公司 A kind of method and apparatus of the Multi-Party Conference based on virtual panoramic
WO2018005701A1 (en) * 2016-06-29 2018-01-04 Cellular South, Inc. Dba C Spire Wireless Video to data
TWI639136B (en) * 2017-11-29 2018-10-21 國立高雄科技大學 Real-time video stitching method
CN108734728A (en) * 2018-04-25 2018-11-02 西北工业大学 A kind of extraterrestrial target three-dimensional reconstruction method based on high-resolution sequence image
CN109165550A (en) * 2018-07-13 2019-01-08 首都师范大学 A kind of multi-modal operation track fast partition method based on unsupervised deep learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090300692A1 (en) * 2008-06-02 2009-12-03 Mavlankar Aditya A Systems and methods for video streaming and display
US10204449B2 (en) * 2015-09-01 2019-02-12 Siemens Healthcare Gmbh Video-based interactive viewing along a path in medical imaging
US20170195568A1 (en) * 2016-01-06 2017-07-06 360fly, Inc. Modular Panoramic Camera Systems
US10553012B2 (en) * 2018-04-16 2020-02-04 Facebook Technologies, Llc Systems and methods for rendering foveated effects

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1694512A (en) * 2005-06-24 2005-11-09 清华大学 Synthesis method of virtual viewpoint in interactive multi-viewpoint video system
CN102201115A (en) * 2011-04-07 2011-09-28 湖南天幕智能科技有限公司 Real-time panoramic image stitching method of aerial videos shot by unmanned plane
CN102307309A (en) * 2011-07-29 2012-01-04 杭州电子科技大学 Somatosensory interactive broadcasting guide system and method based on free viewpoints
CN102903085A (en) * 2012-09-25 2013-01-30 福州大学 Rapid image mosaic method based on corner matching
CN102999901A (en) * 2012-10-17 2013-03-27 中国科学院计算技术研究所 Method and system for processing split online video on the basis of depth sensor
CN103489165A (en) * 2013-10-01 2014-01-01 中国人民解放军国防科学技术大学 Decimal lookup table generation method for video stitching
CN103501415A (en) * 2013-10-01 2014-01-08 中国人民解放军国防科学技术大学 Overlap structural deformation-based video real-time stitching method
CN103533266A (en) * 2013-10-01 2014-01-22 中国人民解放军国防科学技术大学 360-degree stitched-type panoramic camera with wide view field in vertical direction
CN103646416A (en) * 2013-12-18 2014-03-19 中国科学院计算技术研究所 Three-dimensional cartoon face texture generation method and device
CN104091318A (en) * 2014-06-16 2014-10-08 北京工业大学 Chinese sign language video transition frame synthesizing method
CN104103050A (en) * 2014-08-07 2014-10-15 重庆大学 Real video recovery method based on local strategies
WO2018005701A1 (en) * 2016-06-29 2018-01-04 Cellular South, Inc. Dba C Spire Wireless Video to data
CN107426524A (en) * 2017-06-06 2017-12-01 微鲸科技有限公司 A kind of method and apparatus of the Multi-Party Conference based on virtual panoramic
TWI639136B (en) * 2017-11-29 2018-10-21 國立高雄科技大學 Real-time video stitching method
CN108734728A (en) * 2018-04-25 2018-11-02 西北工业大学 A kind of extraterrestrial target three-dimensional reconstruction method based on high-resolution sequence image
CN109165550A (en) * 2018-07-13 2019-01-08 首都师范大学 A kind of multi-modal operation track fast partition method based on unsupervised deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
安平 ; 刘占伟 ; 刘苏醒 ; 张兆扬 ; .视频会议系统中基于图像拼合的中间视合成.上海大学学报(自然科学版).2007,(04),全文. *
方赵林 ; 彭洁 ; 葛春霞 ; 秦绪佳 ; .基于改进加权算法的实时图像数据融合研究.浙江工业大学学报.2017,(第03期),全文. *
王凯 ; 陈朝勇 ; 吴敏 ; 姚辉 ; 张翔 ; .一种改进的非线性加权图像拼接融合方法.小型微型计算机系统.2017,(第05期),全文. *

Also Published As

Publication number Publication date
CN111640187A (en) 2020-09-08

Similar Documents

Publication Publication Date Title
CN111640187B (en) Video stitching method and system based on interpolation transition
WO2022033048A1 (en) Video frame interpolation method, model training method, and corresponding device
US11348314B2 (en) Fast and deep facial deformations
CN111294665B (en) Video generation method and device, electronic equipment and readable storage medium
US10489956B2 (en) Robust attribute transfer for character animation
Ding et al. Spatio-temporal recurrent networks for event-based optical flow estimation
US11915439B2 (en) Method and apparatus of training depth estimation network, and method and apparatus of estimating depth of image
US20180005039A1 (en) Method and apparatus for generating an initial superpixel label map for an image
Liu et al. Using unsupervised deep learning technique for monocular visual odometry
CN111462205A (en) Image data deformation and live broadcast method and device, electronic equipment and storage medium
WO2024051756A1 (en) Special effect image drawing method and apparatus, device, and medium
Jiang et al. A neural refinement network for single image view synthesis
Reso et al. Occlusion-aware method for temporally consistent superpixels
Anasosalu et al. Compact and accurate 3-D face modeling using an RGB-D camera: let's open the door to 3-D video conference
CN111915587A (en) Video processing method, video processing device, storage medium and electronic equipment
US11158122B2 (en) Surface geometry object model training and inference
CN112308875A (en) Unsupervised image segmentation based on background likelihood estimation
Ihm et al. Low-cost depth camera pose tracking for mobile platforms
Chang et al. Mono-star: Mono-camera scene-level tracking and reconstruction
CN112990134A (en) Image simulation method and device, electronic equipment and storage medium
Deng et al. A Dynamic Graph CNN with Cross-Representation Distillation for Event-Based Recognition
US11533451B2 (en) System and method for frame rate up-conversion of video data
CN111651033A (en) Driving display method and device for human face, electronic equipment and storage medium
Andrei et al. Unsupervised learning of visual odometry using direct motion modeling
Iu et al. Re-examining the optical flow constraint. A new optical flow algorithm with outlier rejection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant