CN114241378A - Method and device for generating transition video, electronic equipment and storage medium - Google Patents

Method and device for generating transition video, electronic equipment and storage medium Download PDF

Info

Publication number
CN114241378A
CN114241378A (application CN202111547771.9A)
Authority
CN
China
Prior art keywords
image
triangulation
generating
points
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111547771.9A
Other languages
Chinese (zh)
Inventor
成丹妮
罗超
邹宇
李巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Information Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Information Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Information Technology Shanghai Co Ltd filed Critical Ctrip Travel Information Technology Shanghai Co Ltd
Priority to CN202111547771.9A priority Critical patent/CN114241378A/en
Publication of CN114241378A publication Critical patent/CN114241378A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures


Abstract

The invention provides a method, an apparatus, an electronic device, and a storage medium for generating a transition video. The method comprises the following steps: receiving an image pair comprising a first image and a second image; extracting n key feature points from each of the first image and the second image to form a first point set and a second point set; forming a first set of triangulation regions of the first image from the first point set; forming a second set of triangulation regions of the second image from the second point set; determining a matching relationship between the first and second sets of triangulation regions; calculating a first affine matrix and a second affine matrix; generating an intermediate set of triangulation regions from the first and second sets of triangulation regions based on the first and second affine matrices; generating an intermediate frame; and generating the transition video based on the first image, the intermediate frame, and the second image. The invention thereby realizes the generation of transition videos.

Description

Method and device for generating transition video, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of computer application, in particular to a method and a device for generating transition videos, electronic equipment and a storage medium.
Background
With the growth of internet information and the continuous upgrading of network technology, traditional pictures and text can no longer satisfy users' need to acquire information quickly, and mobile video technology has developed rapidly. For an OTA (online travel agency), presenting the information of products on sale accurately and attractively through video can effectively improve the user experience. Producing high-quality video as an information medium is therefore important. Video takes many forms but can be divided into two main categories: video generated from pictures and video shot on location. Shot video requires professional equipment and camera skills, so its cost is high, and the accumulated stock of such footage is far smaller than the stock of high-quality pictures. Automatically generating video from existing pictures by machine, and mining transition styles for picture-composited video, can therefore greatly reduce video production cost, enrich picture-composited video, and has high application value.
Therefore, how to generate transition videos based on images is a technical problem to be solved urgently in the field.
Disclosure of Invention
To overcome the above defects in the prior art, the invention provides a method, an apparatus, an electronic device, and a storage medium for generating transition videos, so that transition videos can be generated from images.
According to an aspect of the present invention, there is provided a method of generating a transition video, comprising:
receiving an image pair comprising a first image and a second image;
extracting n key feature points from the first image and the second image respectively to form a first point set and a second point set, wherein n is an integer greater than 0;
forming a first set of triangulation regions of the first image from the first set of points;
forming a second set of triangulation regions for the second image from the second set of points;
determining a matching relationship of triangulation regions in the first triangulation region set and the second triangulation region set according to the first point set and the second point set;
calculating, for each matched pair, a first affine matrix and a second affine matrix that respectively map the matched first triangulation region and second triangulation region onto an intermediate triangulation region;
generating an intermediate triangulation region set according to the first triangulation region set and the second triangulation region set based on the first affine matrix and the second affine matrix;
generating an intermediate frame according to the generated intermediate triangulation area set;
generating a transition video based on the first image, the intermediate frame, and the second image.
In some embodiments of the present application, before extracting n key feature points from the first image and the second image, respectively, the method includes:
obtaining a similarity score for the image pair;
and calculating the number n of key feature points according to the similarity scores of the image pairs.
In some embodiments of the present application, said calculating the number n of key feature points from the similarity scores of the image pairs comprises:
when the similarity score s of the image pair is less than 0.5, letting n = 9;
when the similarity score s of the image pair is greater than or equal to 0.5, letting n = 10 × s + 4.
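This rule can be sketched as a small helper (a hypothetical illustration; the patent does not specify how a fractional 10 × s + 4 is made integral, so rounding is assumed here):

```python
def key_point_count(s: float) -> int:
    """Number of key feature points n for a similarity score s in [0, 1].

    Per the described rule: n = 9 when s < 0.5, otherwise n = 10*s + 4
    (rounded to an integer here -- the rounding is an assumption).
    """
    if s < 0.5:
        return 9
    return round(10 * s + 4)
```

At s = 0.5 both branches agree (10 × 0.5 + 4 = 9), so n grows continuously with the similarity score.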
In some embodiments of the present application, the first point set and the second point set each include the four corner points of the image, the center points of its four edges, and the image centroid, the centroid being calculated from the zeroth-order and first-order moments of the image.
In some embodiments of the present application, when the similarity score s of the image pair is greater than or equal to 0.5, the extracting n key feature points from the first image and the second image respectively includes:
extracting the four corner points, the center points of the four edges, and the centroid of the image;
and extracting the remaining n − 9 feature points with a feature point matching algorithm.
In some embodiments of the present application, the calculating the first and second affine matrices for the transformation of the matched first and second triangulation areas to the intermediate triangulation area respectively comprises:
selecting a pair of matched first triangulation areas and second triangulation areas;
calculating the vertex coordinates of the middle triangulation area according to the selected vertex coordinates of the first triangulation area and the selected vertex coordinates of the second triangulation area;
and respectively calculating the first affine matrix and the second affine matrix according to the vertex coordinates of the selected first triangulation area, the vertex coordinates of the second triangulation area and the vertex coordinates of the middle triangulation area.
In some embodiments of the present application, the vertex coordinates, the first affine matrix, and the second affine matrix of the intermediate triangulated region are calculated based on a fusion coefficient, the method further comprising:
generating a plurality of intermediate frames based on different fusion coefficients to form an intermediate frame sequence;
a transition video is generated based on the first image, the sequence of intermediate frames, and the second image.
According to another aspect of the present application, there is also provided an apparatus for generating transition video, including:
a receiving module for receiving an image pair, the image pair comprising a first image and a second image;
an extraction module, configured to extract n key feature points from the first image and the second image, respectively, to form a first point set and a second point set, where n is an integer greater than 0;
a first triangulation module for forming a first set of triangulated regions for the first image from the first set of points;
a second triangulation module for forming a second set of triangulation regions for the second image from the second set of points;
a matching module, configured to determine a matching relationship between triangulation regions in the first triangulation region set and the second triangulation region set according to the first point set and the second point set;
an affine matrix calculation module, configured to calculate, for each matched pair, a first affine matrix and a second affine matrix that respectively map the matched first triangulation region and second triangulation region onto an intermediate triangulation region;
the intermediate triangulation generation module is used for generating an intermediate triangulation area set according to the first triangulation area set and the second triangulation area set based on the first affine matrix and the second affine matrix;
the intermediate frame generating module is used for generating an intermediate frame according to the generated intermediate triangulation area set;
and the transition module is used for generating a transition video based on the first image, the intermediate frame and the second image.
According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps of the method of generating transition video as described above.
According to a further aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of generating transition video as described above.
Compared with the prior art, the invention has the advantages that:
and (3) by extracting feature points which can be matched with image keys, forming a point set by the feature points and the image contour points, and acquiring a triangulation area of the image. And then, corresponding triangulation areas of the two images are obtained based on the corresponding matching points, and intermediate transition images are constructed and combined through affine change, so that a transition video is generated. Therefore, the image key points can be efficiently extracted based on the existing image data construction model, the matching area is obtained, the transition video is generated through shape gradual change, the labor cost of video production and shooting can be greatly saved, the richness of the image generation video is ensured, and the user experience is improved.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 shows a flow diagram of a method of generating a transition video according to an embodiment of the invention.
Fig. 2 shows a schematic diagram of key feature points of an image according to an embodiment of the invention.
Fig. 3 shows a flow diagram for generating a transition video based on a sequence of intermediate frames according to another embodiment of the invention.
Fig. 4 is a block diagram illustrating an apparatus for generating a transition video according to an embodiment of the present invention.
Fig. 5 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the disclosure.
Fig. 6 schematically illustrates an electronic device in an exemplary embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Referring first to fig. 1, fig. 1 shows a flow diagram of a method of generating a transition video according to an embodiment of the invention. The method comprises the following steps:
step S110: an image pair is received, the image pair including a first image and a second image.
Step S120: and respectively extracting n key feature points from the first image and the second image to form a first point set and a second point set, wherein n is an integer greater than 0.
Specifically, prior to step S120, a similarity score of the image pair may be obtained, and the number n of key feature points calculated from it. The similarity score may be produced by any image similarity measure, which this application does not limit, and can be normalized to a value between 0 and 1. The number of key feature points to extract is then determined from this score.
Specifically, calculating the number n of key feature points from the similarity score of the image pair comprises: when the similarity score s of the image pair is less than 0.5, letting n = 9; when s is greater than or equal to 0.5, letting n = 10 × s + 4.
In this embodiment, when the similarity of two images is below 0.5, the images are dissimilar, and conventional feature point matching algorithms generally cannot extract valid matches. This embodiment therefore constructs a set of fixed-attribute matching points: the four corner points of the image, the center points of its four edges, and the image centroid. As shown in fig. 2, fig. 2 is a schematic diagram of the 9 key feature points of an image 200 according to an embodiment of the invention.
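A minimal sketch of collecting these nine fixed-attribute points for a w × h image; the centroid is passed in as a parameter, since it is computed separately from the image moments (the function name and signature are hypothetical):

```python
def fixed_key_points(w, h, centroid):
    """Return the 9 fixed-attribute points of a w x h image:
    the four corners, the four edge midpoints, and the image centroid."""
    corners = [(0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1)]
    midpoints = [((w - 1) / 2, 0), ((w - 1) / 2, h - 1),
                 (0, (h - 1) / 2), (w - 1, (h - 1) / 2)]
    return corners + midpoints + [centroid]
```

Because these nine points exist in every image regardless of content, they always yield a valid (if coarse) triangulation even when feature matching fails.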
Specifically, the image centroid can be calculated from the zeroth-order and first-order moments of the image. The coordinates (x_c, y_c) of the centroid P_m are calculated as:

x_c = M_10 / M_00,  y_c = M_01 / M_00

where M_00 is the zeroth-order moment of the image, and M_10 and M_01 are its first-order moments.
The zeroth-order moment M_00 is calculated as:

M_00 = Σ_{j=1..J} Σ_{i=1..I} V(i, j)

where V(i, j) is the gray value at pixel (i, j), I is the number of pixel columns, and J is the number of pixel rows.
The first-order moments M_10 and M_01 are calculated as:

M_10 = Σ_{j=1..J} Σ_{i=1..I} i · V(i, j)

M_01 = Σ_{j=1..J} Σ_{i=1..I} j · V(i, j)
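The moment formulas above translate directly into code; a pure-Python sketch, taking the image as a row-major list of rows of gray values (a hypothetical helper, not the patent's implementation):

```python
def image_centroid(img):
    """Centroid (x_c, y_c) from the zeroth- and first-order image moments.

    img is a list of rows; V(i, j) is the gray value at column i, row j,
    so x corresponds to the column index and y to the row index.
    """
    m00 = m10 = m01 = 0.0
    for j, row in enumerate(img):
        for i, v in enumerate(row):
            m00 += v       # zeroth-order moment: total intensity
            m10 += i * v   # first-order moment in x
            m01 += j * v   # first-order moment in y
    return m10 / m00, m01 / m00
```

For a uniform image the centroid falls at the geometric center; brighter regions pull it toward them.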
Further, when the similarity score s of the image pair is greater than or equal to 0.5, extracting n key feature points from each of the first image and the second image comprises: extracting the four corner points, the center points of the four edges, and the centroid of the image; and extracting the remaining n − 9 feature points with a feature point matching algorithm. The feature point matching algorithm may include, but is not limited to, SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), ORB (Oriented FAST and Rotated BRIEF), and the like. With these algorithms, feature point similarities are computed and the n − 9 feature point pairs with the highest similarity are extracted as key feature points.
Step S130: forming a first set of triangulated regions for the first image from the first set of points.
Step S140: forming a second set of triangulated regions for the second image from the second set of points.
Specifically, the foregoing steps S130 and S140 may adopt Delaunay triangulation to obtain a set of triangulation regions formed by a plurality of triangulation regions of the first image and the second image.
A triangulation is defined as follows: let V be a finite point set in the two-dimensional real domain, let an edge e be a closed line segment whose endpoints are points of the set, and let E be a set of such edges. A triangulation T = (V, E) of the point set V is then a planar graph G satisfying: no edge of the graph contains any point of the set other than its endpoints; no two edges intersect; and every face of the graph is a triangular face, the union of all triangular faces being the convex hull of the point set V. For Delaunay triangulation, a Delaunay edge is defined as follows: an edge e of E with endpoints a and b is a Delaunay edge if there exists a circle passing through a and b whose interior contains no other point of V (at most, additional points of V may lie on the circle itself); this is also called the empty-circle property. A Delaunay triangulation is then a triangulation T of the point set V that contains only Delaunay edges. Equivalently, a triangulation T of V is a Delaunay triangulation if and only if the interior of the circumcircle of every triangle in T contains no point of V. Delaunay triangulation can be implemented by the edge-flipping algorithm, the incremental (point-by-point insertion) algorithm, the divide-and-conquer algorithm, the Bowyer-Watson algorithm, and the like; the present application is not limited in this respect.
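The empty-circle property at the heart of this definition can be checked with the standard in-circumcircle determinant. The sketch below is the predicate only, not a full triangulation algorithm (the helper name is hypothetical):

```python
def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle (a, b, c).

    Uses the classic 3x3 determinant test; the sign is normalized by the
    triangle's orientation so vertex order does not matter.
    """
    rows = [(p[0] - d[0], p[1] - d[1],
             (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2) for p in (a, b, c)]
    det = (rows[0][0] * (rows[1][1] * rows[2][2] - rows[1][2] * rows[2][1])
           - rows[0][1] * (rows[1][0] * rows[2][2] - rows[1][2] * rows[2][0])
           + rows[0][2] * (rows[1][0] * rows[2][1] - rows[1][1] * rows[2][0]))
    orient = ((b[0] - a[0]) * (c[1] - a[1])
              - (b[1] - a[1]) * (c[0] - a[0]))  # > 0 for counter-clockwise
    return det * orient > 0
```

For the unit square, the fourth corner lies exactly on (not inside) the circumcircle of the triangle formed by the other three, so either diagonal yields a valid Delaunay edge.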
Step S150: and determining a matching relationship of triangulation areas in the first triangulation area set and the second triangulation area set according to the first point set and the second point set.
Specifically, since the triangulation regions are generated from the point sets, and the key feature points in the first point set and the second point set are matched one to one, the triangulation regions in the first and second sets also match pairwise. Multiple pairs of matched triangulation regions are thus obtained from this matching relationship.
Step S160: calculate, for each matched pair, a first affine matrix and a second affine matrix that respectively map the matched first triangulation region and second triangulation region onto the intermediate triangulation region.
Specifically, step S160 can be implemented by the following steps: selecting a pair of matched first triangulation areas and second triangulation areas; calculating the vertex coordinates of the middle triangulation area according to the selected vertex coordinates of the first triangulation area and the selected vertex coordinates of the second triangulation area; and respectively calculating the first affine matrix and the second affine matrix according to the vertex coordinates of the selected first triangulation area, the vertex coordinates of the second triangulation area and the vertex coordinates of the middle triangulation area.
In some implementations, a blending function of the two images may be set to control the fusion coefficient used for the gradual change. The triangle coordinates of the intermediate frame's triangulation regions are obtained by interpolating with the fusion coefficient, as shown in the following equation:
V_M = (1 − α) · V_1 + α · V_2
where V_1 is the vertex coordinates of the selected first triangulation region, V_2 is the vertex coordinates of the selected second triangulation region, V_M is the vertex coordinates of the intermediate triangulation region, and α is the fusion coefficient.
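Applied to all three vertices of a matched pair, this interpolation might look like the following sketch (the helper name is hypothetical):

```python
def interpolate_triangle(v1, v2, alpha):
    """Vertex coordinates of the intermediate triangle:
    V_M = (1 - alpha) * V_1 + alpha * V_2.

    v1, v2 are lists of three (x, y) vertices; alpha in [0, 1] is the
    fusion coefficient (0 -> first image, 1 -> second image).
    """
    return [((1 - alpha) * x1 + alpha * x2, (1 - alpha) * y1 + alpha * y2)
            for (x1, y1), (x2, y2) in zip(v1, v2)]
```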
The first affine matrix M_1 and the second affine matrix M_2 satisfy:

V_M = M_1 · V_1

V_M = M_2 · V_2
the above formula can be further split into:
Figure BDA0003416200580000071
wherein x ismAnd ymThe vertex coordinates of the middle triangulation region, x11 and y12 are the vertex coordinates of the first triangulation region,
Figure BDA0003416200580000072
is M1。M2Can also be obtained by splitting in the same way.
Thus, given the vertex coordinates of a matched pair of first and second triangulation regions together with the vertex coordinates of the intermediate triangulation region, M_1 and M_2 can be solved: the three vertex correspondences yield six linear equations for the six unknowns of each matrix.
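Solving for the six entries of an affine matrix from three vertex correspondences reduces to two 3 × 3 linear systems, one for the x row (a, b, c) and one for the y row (d, e, f). A pure-Python sketch using Cramer's rule (the helper names are hypothetical):

```python
def solve_affine(src, dst):
    """2x3 affine matrix M = [(a, b, c), (d, e, f)] mapping the three
    src vertices onto the three dst vertices: dst = M . [x, y, 1]."""
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    A = [[x, y, 1.0] for x, y in src]  # coefficient matrix, one row per vertex
    dA = det3(A)
    rows = []
    for k in (0, 1):  # k = 0 solves the x-row (a, b, c); k = 1 the y-row (d, e, f)
        rhs = [p[k] for p in dst]
        coeffs = []
        for col in (0, 1, 2):  # Cramer's rule: replace column `col` with rhs
            Ak = [row[:] for row in A]
            for r in range(3):
                Ak[r][col] = rhs[r]
            coeffs.append(det3(Ak) / dA)
        rows.append(tuple(coeffs))
    return rows

def apply_affine(M, p):
    """Apply a 2x3 affine matrix to point p = (x, y)."""
    (a, b, c), (d, e, f) = M
    return a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f
```

The system is solvable whenever the three source vertices are not collinear, which a valid triangulation guarantees.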
Step S170: and generating an intermediate triangulation area set from the first triangulation area set and the second triangulation area set based on the first affine matrix and the second affine matrix.
Step S170 can be shown as follows:
Z = (1 − α) · M_1 · A + α · M_2 · B
where Z is the intermediate triangulation region, A and B are a pair of matched first and second triangulation regions, M_1 and M_2 are the first and second affine matrices, and α is the fusion coefficient.
Step S180: and generating an intermediate frame according to the generated intermediate triangulation area set.
Step S190: generating a transition video based on the first image, the intermediate frame, and the second image.
In the above method for generating a transition video, key matchable feature points are extracted, combined with the image contour points into a point set, and the triangulation regions of the image are obtained. Corresponding triangulation regions of the two images are then matched through the corresponding points, and intermediate transition images are constructed and combined via affine transformation to generate the transition video. Image key points can thus be extracted efficiently from existing image data, matching regions obtained, and a transition video generated through gradual shape morphing, greatly reducing the labor cost of video production and shooting while preserving the richness of image-generated video and improving the user experience.
Turning next to fig. 3, fig. 3 shows a flow diagram for generating a transition video based on an intermediate frame sequence according to another embodiment of the invention. Fig. 3 comprises the following steps:
step S181: generating a plurality of intermediate frames based on different fusion coefficients to form an intermediate frame sequence;
step S191: a transition video is generated based on the first image, the sequence of intermediate frames, and the second image.
Thus, by adjusting the fusion coefficient gradually, the multiple intermediate frames transition smoothly into one another, yielding a smoother transition video.
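The gradual schedule of fusion coefficients for N intermediate frames can be as simple as evenly spaced values strictly between 0 and 1 (a hypothetical linear schedule; the patent does not prescribe the spacing, and easing curves would work equally well):

```python
def fusion_schedule(num_frames):
    """Evenly spaced fusion coefficients in (0, 1) for num_frames
    intermediate frames.

    alpha = 0 would reproduce the first image and alpha = 1 the second,
    so the endpoints are excluded.
    """
    return [k / (num_frames + 1) for k in range(1, num_frames + 1)]
```

Rendering one intermediate frame per coefficient, then concatenating first image, intermediate frames, and second image, yields the transition clip.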
This key-feature-point-based transition video generation method morphs the matched regions gradually while generating a transition video from two images, improving image utilization, greatly reducing the cost of acquiring and producing video, and enriching picture-composited video.
The foregoing is merely an exemplary description of various implementations of the invention and is not intended to be limiting thereof.
The invention also provides an apparatus for generating a transition video; fig. 4 is a schematic diagram of such an apparatus according to an embodiment of the invention. The apparatus 400 for generating a transition video comprises a receiving module 410, an extraction module 420, a first triangulation module 430, a second triangulation module 440, a matching module 450, an affine matrix calculation module 460, an intermediate triangulation generation module 470, an intermediate frame generation module 480, and a transition module 490.
The receiving module 410 is configured to receive an image pair, the image pair comprising a first image and a second image;
the extracting module 420 is configured to extract n key feature points from the first image and the second image, respectively, to form a first point set and a second point set, where n is an integer greater than 0;
a first triangulation module 430 for forming a first set of triangulated regions for the first image from the first set of points;
the second triangulation module 440 is configured to form a second set of triangulation regions for the second image from the second set of points;
the matching module 450 is configured to determine a matching relationship between triangulation regions in the first triangulation region set and the second triangulation region set according to the first point set and the second point set;
the affine matrix calculation module 460 is configured to calculate, for each matched pair, a first affine matrix and a second affine matrix that respectively map the matched first and second triangulation regions onto the intermediate triangulation region;
the intermediate triangulation generation module 470 is configured to generate an intermediate triangulation region set according to the first triangulation region set and the second triangulation region set based on the first affine matrix and the second affine matrix;
the intermediate frame generating module 480 is configured to generate an intermediate frame according to the generated intermediate triangulation region set;
transition module 490 is configured to generate a transition video based on the first image, the intermediate frame, and the second image.
In the apparatus for generating a transition video provided by the invention, key matchable feature points are extracted, combined with the image contour points into a point set, and the triangulation regions of the image are obtained. Corresponding triangulation regions of the two images are then matched through the corresponding points, and intermediate transition images are constructed and combined via affine transformation to generate the transition video. Image key points can thus be extracted efficiently from existing image data, matching regions obtained, and a transition video generated through gradual shape morphing, greatly reducing the labor cost of video production and shooting while preserving the richness of image-generated video and improving the user experience.
Fig. 4 is merely a schematic diagram of the apparatus for generating a transition video provided by the invention; splitting, combining, and adding modules remain within the scope of the invention without departing from its concept. The apparatus may be implemented in software, hardware, firmware, plug-ins, or any combination thereof, and the invention is not limited in this respect.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, which when executed by, for example, a processor, may implement the steps of the method of generating a transition video as described in any of the above embodiments. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the present invention described in the above-mentioned method part of generating transition videos of the present description, when said program product is run on the terminal device.
Referring to fig. 5, a program product 400 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
In an exemplary embodiment of the present disclosure, there is also provided an electronic device, which may include a processor, and a memory for storing executable instructions of the processor. Wherein the processor is configured to perform the steps of the method of generating transition videos in any of the above embodiments via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
Wherein the storage unit stores program code executable by the processing unit 610 to cause the processing unit 610 to perform steps according to various exemplary embodiments of the present invention described in the method section of generating transition video above in this specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the above-mentioned method for generating a transition video according to the embodiments of the present disclosure.
Compared with the prior art, the invention has the advantages that:
Key feature points that can be matched between the two images are extracted and, together with the image contour points, form a point set from which the triangulation regions of each image are obtained. The corresponding triangulation regions of the two images are then matched through the corresponding point pairs, and intermediate transition images are constructed and combined through affine transformation, thereby generating the transition video. In this way, image key points can be efficiently extracted using a model built on existing image data, matching regions can be obtained, and the transition video can be generated through gradual shape change, which greatly saves the labor cost of video production and shooting, ensures the richness of videos generated from images, and improves user experience.
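As a concrete illustration of the affine step summarized above, the following Python sketch (function and variable names are illustrative, not taken from the patent; a fusion coefficient of 0.5 is assumed) computes an intermediate triangle for one matched pair of triangulation regions and solves for the two affine matrices that map each source triangle onto it:

```python
import numpy as np

def affine_from_triangles(src, dst):
    """Solve the 2x3 affine matrix A such that A @ [x, y, 1] maps each
    vertex of triangle `src` onto the corresponding vertex of `dst`."""
    src = np.asarray(src, dtype=float)          # shape (3, 2)
    dst = np.asarray(dst, dtype=float)          # shape (3, 2)
    src_h = np.hstack([src, np.ones((3, 1))])   # homogeneous rows [x, y, 1]
    # src_h @ A^T = dst has an exact solution for non-degenerate triangles.
    return np.linalg.solve(src_h, dst).T        # shape (2, 3)

def intermediate_triangle(tri1, tri2, t):
    """Vertex-wise linear blend of two matched triangles; t in [0, 1]."""
    return (1.0 - t) * np.asarray(tri1, float) + t * np.asarray(tri2, float)

# One matched pair of triangulation regions, fusion coefficient t = 0.5.
tri1 = [(0, 0), (10, 0), (0, 10)]
tri2 = [(2, 2), (12, 4), (2, 14)]
mid = intermediate_triangle(tri1, tri2, 0.5)
A1 = affine_from_triangles(tri1, mid)   # first affine matrix
A2 = affine_from_triangles(tri2, mid)   # second affine matrix
```

Applying `A1` to the pixels of the first region and `A2` to those of the second brings both onto the same intermediate geometry, after which the warped contents can be blended into a transition frame.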
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A method of generating transition video, comprising:
receiving an image pair comprising a first image and a second image;
extracting n key feature points from the first image and the second image respectively to form a first point set and a second point set, wherein n is an integer greater than 0;
forming a first set of triangulation regions of the first image from the first set of points;
forming a second set of triangulation regions for the second image from the second set of points;
determining a matching relationship of triangulation regions in the first triangulation region set and the second triangulation region set according to the first point set and the second point set;
calculating, respectively, a first affine matrix and a second affine matrix for transforming the matched first triangulation region and second triangulation region into an intermediate triangulation region;
generating an intermediate triangulation region set from the first triangulation region set and the second triangulation region set based on the first affine matrix and the second affine matrix;
generating an intermediate frame according to the generated intermediate triangulation region set;
generating a transition video based on the first image, the intermediate frame, and the second image.
2. The method of generating a transition video according to claim 1, wherein before the extracting of n key feature points from the first image and the second image respectively, the method comprises:
obtaining a similarity score for the image pair;
and calculating the number n of key feature points according to the similarity score of the image pair.
3. The method of generating a transition video of claim 2, wherein the calculating of the number n of key feature points from the similarity score of the image pair comprises:
when the similarity score s of the image pair is less than 0.5, letting n = 9;
when the similarity score s of the image pair is greater than or equal to 0.5, letting n = 10 × s + 4.
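The piecewise rule of claim 3 can be sketched as follows (the rounding of 10 × s + 4 to an integer is an assumption; the claim does not specify it). Note that at s = 0.5 both branches give n = 9, so the rule is continuous:

```python
def num_key_feature_points(s: float) -> int:
    """Number n of key feature points as a function of the
    image-pair similarity score s, per the piecewise rule."""
    if s < 0.5:
        return 9
    # Rounding is assumed here; n must be an integer.
    return round(10 * s + 4)
```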
4. The method of claim 3, wherein the first point set and the second point set each comprise the four corner points of the image, the center points of the four edges, and the center of gravity of the image, the center of gravity being calculated from the zeroth-order moment and the first-order moments of the image.
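The center of gravity in claim 4 follows from the raw image moments: M00 = Σ I(x, y), M10 = Σ x·I(x, y), M01 = Σ y·I(x, y), giving the centroid (M10/M00, M01/M00). A minimal numpy sketch, assuming a grayscale intensity image (names are illustrative):

```python
import numpy as np

def center_of_gravity(img):
    """Centroid (cx, cy) of a grayscale image from its zeroth- and
    first-order raw moments: cx = M10/M00, cy = M01/M00."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]       # pixel coordinate grids
    m00 = img.sum()                   # zeroth-order moment
    m10 = (xs * img).sum()            # first-order moment in x
    m01 = (ys * img).sum()            # first-order moment in y
    return m10 / m00, m01 / m00

# For a uniform image the center of gravity is the geometric center.
```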
5. The method of generating a transition video of claim 4, wherein when the similarity score s of the image pair is greater than or equal to 0.5, the extracting n key feature points from the first image and the second image respectively comprises:
extracting the four corner points, the center points of the four edges, and the center of gravity of the image;
and extracting the remaining n − 9 feature points based on a feature point matching algorithm.
6. The method of generating a transition video of claim 1, wherein the calculating, respectively, of the first affine matrix and the second affine matrix for transforming the matched first triangulation region and second triangulation region into the intermediate triangulation region comprises:
selecting a pair of matched first and second triangulation regions;
calculating the vertex coordinates of the intermediate triangulation region from the vertex coordinates of the selected first triangulation region and second triangulation region;
and calculating the first affine matrix and the second affine matrix respectively from the vertex coordinates of the selected first triangulation region, the vertex coordinates of the second triangulation region, and the vertex coordinates of the intermediate triangulation region.
7. The method of generating a transition video of claim 6, wherein the vertex coordinates of the intermediate triangulation region, the first affine matrix, and the second affine matrix are computed based on a fusion coefficient, the method further comprising:
generating a plurality of intermediate frames based on different fusion coefficients to form an intermediate frame sequence;
and generating the transition video based on the first image, the intermediate frame sequence, and the second image.
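The sequence of claim 7 can be sketched by varying the fusion coefficient t across frames. The per-triangle warps are elided here; the sketch assumes two already-warped, aligned images and shows only the cross-dissolve that yields the intermediate-frame sequence (names are illustrative):

```python
import numpy as np

def blend_frames(warped_a, warped_b, num_frames):
    """Cross-dissolve two aligned (already warped) images over a sequence
    of fusion coefficients t in (0, 1), one per intermediate frame."""
    a = np.asarray(warped_a, dtype=float)
    b = np.asarray(warped_b, dtype=float)
    frames = []
    for k in range(1, num_frames + 1):
        t = k / (num_frames + 1)            # fusion coefficient for frame k
        frames.append((1.0 - t) * a + t * b)
    return frames

# Transition video = [first image] + intermediate frames + [second image].
```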
8. An apparatus for generating transition video, comprising:
a receiving module for receiving an image pair, the image pair comprising a first image and a second image;
an extraction module, configured to extract n key feature points from the first image and the second image, respectively, to form a first point set and a second point set, where n is an integer greater than 0;
a first triangulation module for forming a first set of triangulated regions for the first image from the first set of points;
a second triangulation module for forming a second set of triangulation regions for the second image from the second set of points;
a matching module, configured to determine a matching relationship between triangulation regions in the first triangulation region set and the second triangulation region set according to the first point set and the second point set;
an affine matrix calculation module, configured to calculate, respectively, a first affine matrix and a second affine matrix for transforming the matched first triangulation region and second triangulation region into an intermediate triangulation region;
an intermediate triangulation generation module, configured to generate an intermediate triangulation region set from the first triangulation region set and the second triangulation region set based on the first affine matrix and the second affine matrix;
an intermediate frame generation module, configured to generate an intermediate frame according to the generated intermediate triangulation region set;
and a transition module, configured to generate a transition video based on the first image, the intermediate frame, and the second image.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a storage medium having stored thereon a computer program which, when executed by the processor, carries out the method of generating a transition video according to any one of claims 1 to 7.
10. A storage medium having stored thereon a computer program which, when executed by a processor, performs the method of generating a transition video according to any one of claims 1 to 7.
CN202111547771.9A 2021-12-16 2021-12-16 Method and device for generating transition video, electronic equipment and storage medium Pending CN114241378A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111547771.9A CN114241378A (en) 2021-12-16 2021-12-16 Method and device for generating transition video, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114241378A 2022-03-25

Family

ID=80757561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111547771.9A Pending CN114241378A (en) 2021-12-16 2021-12-16 Method and device for generating transition video, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114241378A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105513094A (en) * 2015-12-17 2016-04-20 上海交通大学 Stereo vision tracking method and stereo vision tracking system based on 3D Delaunay triangulation
CN110555812A (en) * 2019-07-24 2019-12-10 广州视源电子科技股份有限公司 image adjusting method and device and computer equipment
CN112637674A (en) * 2020-12-14 2021-04-09 深圳市纪元数码技术开发有限公司 Video transition effect processing method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GANHUA LI et al.: "View synthesis based on texture mapping", Fifth World Congress on Intelligent Control and Automation, vol. 5, 18 October 2004, pages 3911-3914 *
DAI XIAOWEN et al.: "Research on video shot transition detection algorithm based on improved BEMD", Journal of Optoelectronics · Laser, vol. 21, no. 02, 15 May 2009, pages 270-273 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination