WO2021217385A1 - Video processing method and apparatus


Info

Publication number
WO2021217385A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
video
target
sub
response
Prior art date
Application number
PCT/CN2020/087350
Other languages
English (en)
Chinese (zh)
Inventor
陈希
周游
刘洁
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN202080041521.5A (CN113906731B)
Priority to PCT/CN2020/087350 (WO2021217385A1)
Publication of WO2021217385A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Definitions

  • the present invention relates to the technical field of video processing, in particular to a video processing method and device.
  • users can add various objects to the video in a video processing program, including but not limited to text, objects, images, and so on.
  • however, such editing generally requires professional image processing skills, so a simpler image processing method is needed to enable ordinary users to add objects to a video.
  • the embodiments of the present invention provide a video processing method and device to solve the above-mentioned problem of adding objects to a video.
  • an embodiment of the present invention discloses a video processing method, which is applied to a remote control device, and includes:
  • the synthesized video is a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space and the position and posture information of the shooting device of the movable platform when each frame of the target video was captured, where the target video is a video collected by the shooting device of the movable platform while the movable platform moves in the space;
  • the embodiment of the present invention also discloses a video processing device.
  • the device includes a processor, a memory, and a computer program stored on the memory and running on the processor.
  • when the computer program is executed, the processor is configured to:
  • the synthesized video is a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in the space and the pose information of the shooting device of the movable platform when shooting each frame of the target video, where the target video is a video collected by the shooting device of the movable platform while the movable platform moves in the space;
  • in the embodiments of the present invention, the display object edited by the user is obtained in response to the user's object content editing operation; the position information of the display object in the space is obtained in response to the user's object position editing operation; and the synthesized video is obtained.
  • the synthesized video is a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space and the pose information of the shooting device of the movable platform when shooting each frame of the target video.
  • the present invention only requires the user to input the display object and a position, and the display object input by the user is then synthesized into the video at that position, thereby reducing the complexity of the user's operation.
  • Figure 1 shows a flow chart of the steps of a video processing method in Embodiment 1 of the present invention
  • FIG. 2 shows a flowchart of the steps of another video processing method in Embodiment 1 of the present invention
  • FIG. 3 shows a flowchart of the steps of another video processing method in Embodiment 1 of the present invention
  • FIG. 4 shows a flowchart of the steps of another video processing method in Embodiment 1 of the present invention.
  • FIG. 5 shows a flowchart of the steps of another video processing method in Embodiment 1 of the present invention.
  • Fig. 6 shows a structural block diagram of a video processing device in Embodiment 2 of the present application.
  • the present invention is used to receive the display object edited by the user and the position information of the display object in the space to synthesize the video.
  • the display object in the synthesized video is projected into the video according to its position information in space; that is, rather than being placed directly on a two-dimensional image, the display object appears as a three-dimensional display in the synthesized video.
  • the present invention can be applied to a video processing application program.
  • the user can import the target video into the video processing application program and input the display object.
  • the system can automatically obtain the position information of the display object in the space, which reduces the user's operation complexity.
  • Figure 1 shows a flowchart of the steps of a video processing method according to Embodiment 1 of the present invention, which may specifically include the following steps:
  • Step 101 In response to the user's object content editing operation, obtain a display object edited by the user.
  • the object content editing operation can be any operation through which the user inputs the display object. For example, an input box may be provided so that the user can enter text in it; or an object library may be provided and its objects displayed in a to-be-selected area, from which the user can select one object as the display object. Of course, the user can also cancel a previously selected display object and reselect a new one.
  • the display object may include any displayable object, for example, text, picture, etc., and the text content, picture content, text color, font, picture shape, etc. are not limited.
  • Step 102 In response to the user's object position editing operation, obtain position information of the display object in the space.
  • the position information in the space can be determined for the display object, and the position information in the space is represented by three-dimensional coordinates or latitude and longitude in the space.
  • the present invention does not impose restrictions on it.
  • the position information in the space can be a spatial position directly edited by the user, the position information obtained by projecting a pixel position determined in an image of the target video into space, or the spatial position information of an object automatically recognized in the target video.
  • the object position editing operation is used to input a position. The input position can be a three-dimensional coordinate in space, in which case the position information of the object in the space is the three-dimensional coordinate input by the user; the position input by the user can also be a pixel position determined in any image frame of the target video, in which case the pixel position needs to be projected into space, and the projected spatial position is the position information of the display object in the space.
  • the present invention allows the user not only to edit the display object but also to input its position, so that the display object is displayed at the position designated by the user and the effect of the synthesized video better matches the user's needs.
  • Step 103 Obtain a synthesized video.
  • the synthesized video is a video obtained by projecting the display object onto each frame of image in the target video according to the position information of the display object in space and the position and posture information of the shooting device of the movable platform when shooting each frame of the target video, wherein the target video is a video collected by the shooting device of the movable platform when the movable platform moves in the space.
  • specifically, the projection position and projection attitude of the display object in each frame of the target video are determined according to the position information of the display object in the space and the pose information; then the display object is projected into each frame of image according to its projection position and projection posture in that frame, so as to obtain the target composite video.
  • the target video can be a video shot by a camera on a movable platform
  • the movable platform can be any device that can move in space, for example an aircraft, a sliding device, a car, or a train, with a shooting device mounted on it. While the video is being shot, the shooting device moves along with the movable platform, so the pose information of the shooting device differs from frame to frame.
  • the shooting device may be any device with shooting function, for example, a camera, a mobile phone with shooting function, a tablet computer, etc.
  • the pose information of the shooting device includes the position and posture of the shooting device. Since the position is changed by translation and the posture is changed by rotation, the pose information can be represented by a translation matrix and a rotation matrix.
  • the pose information is used to indicate the displacement and rotation relationship between the coordinate system of the camera and the world coordinate system in the three-dimensional space.
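  • For illustration only (not part of the disclosure): a minimal Python sketch of this projection step, assuming the intrinsic matrix K and the per-frame rotation matrix R and translation vector t of the shooting device are known.

```python
import numpy as np

def project_point(p_world, R, t, K):
    """Project a 3D world point into pixel coordinates for one frame.

    R (3x3) and t (3,) are the rotation and translation of the world-to-camera
    transform taken from the frame's pose information; K (3x3) is the camera
    intrinsic matrix (assumed known here).
    """
    p_cam = R @ p_world + t            # world coordinates -> camera coordinates
    if p_cam[2] <= 0:                  # point lies behind the camera, cannot be drawn
        return None
    uvw = K @ p_cam                    # camera coordinates -> homogeneous pixel coords
    return uvw[:2] / uvw[2]            # perspective divide -> (u, v) pixel position

# Example: one anchor point of the display object projected into one frame.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)          # placeholder pose for a single frame
print(project_point(np.array([0.5, -0.2, 8.0]), R, t, K))
```

  • Repeating such a projection with each frame's pose gives the per-frame projection position of the display object; the same idea extends to every vertex of a three-dimensional model of the display object.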
  • Step 104 Display the synthesized video.
  • the synthesized video can be shown to the user so that the user can view the synthesized effect. If the user is satisfied with the effect, the synthesized video can be stored and shared; if the user is not satisfied, the synthesized video can be deleted.
  • the method further includes step 105:
  • Step 105 In response to the user's video selection operation, determine the target video from the videos collected by the shooting device of the movable platform according to the video selection operation.
  • the video selection operation is used to select the target video from the videos collected by the shooting device of the movable platform.
  • the videos collected by the shooting devices of the movable platforms can be classified or displayed according to the movable platform, so that the user can select the target video from the displayed videos.
  • the user can directly select one of the videos displayed as the target video, or the user can input the identification of the movable platform, and then select one of the videos from the filtered videos as the target video.
  • the video selection operation can be a direct selection operation on the target video, or a combination of a selection operation on the movable platform and a selection operation on the target video.
  • the selection operation can be click, long press, drag and other operations.
  • the video selection operation includes a first movable platform selection sub-operation
  • the step 105 includes a sub-step 1051:
  • Sub-step 1051 in response to the user's first movable platform selection sub-operation, determine a target movable platform among a plurality of movable platforms, and determine the video captured by the shooting device of the target movable platform as the Target video.
  • the first movable platform selection sub-operation is used to select one of the displayed movable platforms as the target movable platform.
  • the first movable platform selection sub-operation may be a user's click operation, drag operation, long press operation, etc. on one of the movable platforms.
  • the user can select one of the movable platforms, or directly enter the identifier of a movable platform in the input box; then the videos shot by the movable platform selected by the user are displayed. If the selected movable platform has shot only one video, that video can be directly used as the target video; if the selected movable platform has shot multiple videos, a selection control can be provided for each video so that the user selects one of them as the target video.
  • the invention can provide the user with the selection function of the movable platform, and when there are a large number of movable platforms, it can assist the user to quickly filter out the target video.
  • the video selection operation includes a first video selection sub-operation
  • the step 105 includes a sub-step 1052:
  • sub-step 1052 in response to the first video selection sub-operation of the user in the video set, the video selected by the user is determined as the target video, the video set including at least one video captured by a shooting device of a movable platform.
  • the first video selection sub-operation is used to select one of the videos from the displayed video set as the target video.
  • the first video selection operation may be a click operation, a drag operation, or a long press operation of one of the videos by the user.
  • the present invention can also display all the videos shot by all the movable platforms in the area to be selected, so that the user can select one of the videos as the target video.
  • if the user does not remember the identifier of the movable platform, this can assist the user in selecting the video directly and improve the success rate of selecting the target video.
  • the video selection operation includes a second movable platform selection sub-operation and a second video selection sub-operation
  • the step 105 includes sub-steps 1053 to 1054:
  • sub-step 1053 in response to the second movable platform selection sub-operation of the user, the video collected by the shooting device of the movable platform selected by the user is displayed as a candidate video.
  • the second movable platform selection sub-operation is used to select one of the displayed movable platforms as the target movable platform.
  • the second movable platform selection sub-operation may be a user's click operation, drag operation, long press operation, etc. on one of the movable platforms.
  • the user can select one movable platform or multiple movable platforms. If the user selects a movable platform, the video collected by the movable platform will be displayed; if the user selects multiple movable platforms, the videos collected by multiple movable platforms will be displayed separately according to the movable platform, or The videos collected by multiple mobile platforms are mixed and displayed together.
  • Sub-step 1054 in response to the user's second video selection sub-operation of the candidate video, determine the video selected by the user as the target video.
  • the second video selection sub-operation is used to select the target video from the videos collected by the movable platform selected by the user.
  • the second video selection operation may be a click operation, a drag operation, or a long press operation of one of the videos by the user.
  • the invention enables the user to select the movable platform first and then select the video, filtering out a set of candidate videos through the movable platform. This reduces the number of selectable videos when the user chooses, avoids making the user select directly from a large number of videos, and helps to increase the speed at which users select videos.
  • the video selection operation includes a segment selection sub-operation
  • the step 105 includes a sub-step 1055:
  • Sub-step 1055 in response to the user's segment selection sub-operation, determine a video segment in the to-be-edited video as the target video, wherein the to-be-edited video includes one of the following: a video selected by the user from the videos captured by the shooting devices of one or more movable platforms, a video captured by the shooting device of a movable platform selected by the user, and a video selected by the user from the videos captured by the shooting device of the selected movable platform.
  • the segment selection sub-operation is used to select multiple frames of images from one or more to-be-edited videos to form the target video.
  • the segment selection operation can be realized by setting the start image and the end image in the video to be edited, so that the target video is a video segment composed of images between the start image and the end image in the video to be edited.
  • the segment selection operation can also be achieved by selecting each frame of image, so that the target video is a video segment composed of a number of continuous or discontinuous images selected from the video to be edited.
  • the segment selection operation can also be achieved by setting the start playback time and end playback time of the video to be edited, so that the target video is a video segment between the start playback time and the end playback time.
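  • For illustration only (not part of the disclosure): a minimal sketch of frame-based segment extraction with OpenCV, assuming the to-be-edited video is a local file and the start and end frame indices have already been derived from the user's segment selection.

```python
import cv2

def extract_segment(src_path, dst_path, start_frame, end_frame):
    """Copy frames [start_frame, end_frame) from src_path into dst_path."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)  # jump to the start image
    for _ in range(start_frame, end_frame):
        ok, frame = cap.read()
        if not ok:                                 # source ended early
            break
        writer.write(frame)

    cap.release()
    writer.release()
```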
  • the above-mentioned video to be edited can be obtained in a manner similar to the target video selection in sub-step 1051, i.e. the video collected by the shooting device of the movable platform selected by the user; or in a manner similar to sub-step 1052, i.e. a video selected by the user from the videos captured by the shooting devices of one or more movable platforms; or in a manner similar to sub-steps 1053 to 1054, i.e. a video selected by the user from the videos captured by the shooting device of the selected movable platform.
  • the target video can also be obtained by editing, so that the target video can better meet the needs of users.
  • the step 101 includes sub-step 1011:
  • Sub-step 1011 in response to the user's object content editing operation, obtain a three-dimensional model corresponding to the display object edited by the user; the synthesized video is a video obtained by projecting the three-dimensional model corresponding to the display object onto each frame of image in the target video.
  • the user can directly select one of the displayed three-dimensional models; in another example, the user can input the identifier of the three-dimensional model to obtain the corresponding three-dimensional model.
  • the object content editing operation includes a first input sub-operation; the sub-step 1011 includes a sub-step 10111:
  • Sub-step 10111 in response to the user's first input sub-operation, obtain the object identifier input by the user, and determine the three-dimensional model corresponding to the object identifier as the display object.
  • the first input operation is used to input the object identifier, and the user can input the object identifier in the input box, thereby obtaining the three-dimensional model corresponding to the object identifier from the three-dimensional model library.
  • the object identifier is a unique identifier of the display object, and may include at least one of the following: numbers, letters, and special symbols.
  • the three-dimensional model can be a three-dimensional model of any object, creature, text, etc.
  • the object content editing operation includes a second input sub-operation and a first model selection sub-operation;
  • the sub-step 1011 includes sub-steps 10112 to 10113:
  • Sub-step 10112 in response to the second input sub-operation of the user, obtain the object identifier input by the user, and display the three-dimensional model corresponding to the object identifier as a candidate model.
  • the same object identifier can correspond to multiple three-dimensional models of different types and styles, so that the user can select one of the multiple three-dimensional models corresponding to the same object identifier as the display object.
  • taking text as an example, the three-dimensional models can be divided by font into a KaiTi model, a SongTi model, and a boldface model; they can also be divided by color into a red model, a black model, a green model, and so on; and by style into a regular model, an italic model, a bold model, and so on.
  • Sub-step 10113 in response to the user's first model selection sub-operation of the candidate model, determine the candidate model selected by the user as the display object.
  • the first model selection sub-operation is used to select one of the three-dimensional models from the multiple three-dimensional models identified by the designated object as the display object.
  • the first model selection sub-operation may be operations such as clicking, long pressing, and dragging on the three-dimensional model.
  • the present invention can provide different three-dimensional models for the same object identification for users to select, which helps to improve the richness and diversity of three-dimensional models, and thus can better meet the needs of users.
  • the object content editing operation includes: a second model selection sub-operation, and the sub-step 1011 includes sub-steps 10114 to 10115:
  • sub-step 10114 multiple candidate three-dimensional models are displayed.
  • the candidate three-dimensional model can be any three-dimensional model for users to select.
  • the candidate three-dimensional model can be displayed in the to-be-selected area in a certain order for the user to select.
  • Sub-step 10115 in response to the user's second model selection sub-operation, determine any of the three-dimensional models selected by the user from a plurality of candidate three-dimensional models as the display object.
  • the second model selection sub-operation is used to select one of the candidate three-dimensional models as the display object.
  • the second model selection sub-operation may be operations such as clicking, long pressing, and dragging on the three-dimensional model.
  • the present invention can display all the provided candidate three-dimensional models to the user for selection by the user, which helps to improve the flexibility of the user's selection.
  • the display object includes at least one of numbers, letters, special symbols, and object identifiers.
  • the object identifier may be the name, number, etc. of the object.
  • the display object may be one of numbers, letters, special symbols, and object identifiers, or a combination of two or more of them, which is not limited in the embodiment of the present invention.
  • a sub-step 1012 is further included:
  • Sub-step 1012 in response to the user's attribute editing sub-operation, obtain the object attribute information input by the user, and set the display object according to the object attribute information, where the object attribute information includes at least one of the following: the size of the display object, the transparency of the display object, the degree of blurring of the display object, and the color of the display object.
  • the attribute editing sub-operation is used to adjust the attributes of the display object.
  • the size of the display object can be directly adjusted by dragging the boundary of the display object, the transparency and the degree of blurring of the display object can be adjusted through a sliding bar or direct input, and the color of the display object can be selected from the color wheel.
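  • For illustration only (not part of the disclosure): a rough sketch of applying such attribute settings (size, blur, transparency) to a display-object image before compositing it onto a frame; parameter names and value ranges are assumptions, not taken from the patent.

```python
import cv2

def apply_attributes(obj_img, size=None, blur=0):
    """Resize and blur a display-object image (BGR) according to its attributes."""
    out = obj_img.copy()
    if size is not None:               # size attribute: (width, height)
        out = cv2.resize(out, size)
    if blur > 0:                       # blur attribute: odd Gaussian kernel size
        out = cv2.GaussianBlur(out, (blur, blur), 0)
    return out

def overlay(frame, obj_img, top_left, alpha):
    """Alpha-blend obj_img onto frame at top_left; alpha is the object's opacity.

    Assumes the object image fits entirely inside the frame.
    """
    x, y = top_left
    h, w = obj_img.shape[:2]
    roi = frame[y:y + h, x:x + w]
    frame[y:y + h, x:x + w] = cv2.addWeighted(obj_img, alpha, roi, 1.0 - alpha, 0)
    return frame
```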
  • the present invention can flexibly adjust the attributes of the display object, which helps to enrich the effect of the synthesized video.
  • step 102 includes sub-step 1021:
  • Sub-step 1021 in response to the user's object position editing operation, obtain the target pixel position edited by the user in the target image frame, and determine the position of the display object in the space according to the projection position of the target pixel position in the space.
  • the target image frame is an image frame in the target video.
  • the projection position of the target pixel position in the space is the position of the target pixel position in the space, which can be determined according to the pose information of the camera and the target pixel position.
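  • For illustration only (not part of the disclosure): one common way to realize this pixel-to-space projection, assuming the depth of the selected pixel along the camera ray is available (for example from triangulated feature points near the selected position); the patent itself does not prescribe a specific method.

```python
import numpy as np

def pixel_to_world(u, v, depth, R, t, K):
    """Back-project pixel (u, v) of the target image frame into world space.

    R, t are the world-to-camera pose of the target image frame and K the
    camera intrinsics; depth is the assumed distance along the viewing ray.
    """
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray with z == 1
    p_cam = ray * depth                             # point on the ray at that depth
    return R.T @ (p_cam - t)                        # camera -> world coordinates
```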
  • the target pixel position may be a position selected by the user, or a position determined according to a preset rule. Therefore, the object position editing operation may be a click operation, a long press operation, or a drag operation on one of the pixel positions.
  • the target pixel position is the position of a pixel in the target image frame and can be represented by two-dimensional coordinates; the image can be represented by a two-dimensional matrix, where one dimension represents the row index and the other represents the column index. For example, the target pixel position (10, 20) denotes the pixel in the 10th row and 20th column of the image.
  • the target pixel position is selected based on the target image frame, and the target image frame can be any frame image in the target video, or an image frame determined based on a certain principle.
  • when the target image frame is an arbitrary image, the effect of the synthesized video may be poor; when the target image frame is an image frame determined according to a certain principle, the effect of the synthesized video is better.
  • the object position editing operation includes a first object position editing sub-operation or a second object position editing sub-operation
  • the sub-step 1021 includes sub-steps 10211 or 10212:
  • Sub-step 10211 in response to the user's first object position editing sub-operation, determine the target pixel position according to the position of the pixel selected in the target image frame by the user in the target image frame.
  • the first object position editing sub-operation is used to select a pixel in the target image frame, and it can be an operation such as clicking, long-pressing, or dragging one of the pixels, or an operation of inputting a pixel position.
  • Sub-step 10212 in response to the user's second object position editing sub-operation, determine the target pixel position according to the position in the target image frame of the pixel area selected by the user in the target image frame.
  • the second object position editing sub-operation is used to select a pixel area in the target image frame, and it can be an operation in which the user delimits a range in the target image frame, or an operation of inputting the boundary of the pixel area.
  • the user can either directly enclose a rectangular area, or input the start row and end row, start column and end column of the rectangular area.
  • the center position of the pixel area may be used as the target pixel position, or any position in the pixel area may be determined as the target pixel position according to the actual application.
  • the present invention can select the target pixel position in a variety of ways, and realizes the diversified selection of the target pixel position.
  • the method further includes sub-step 10213:
  • sub-step 10213 if the object in space indicated by the pixel or the pixel area selected by the user is a stationary object, or the number of feature points in the pixel area selected by the user is less than or equal to the preset feature point number threshold, third prompt information is displayed, where the third prompt information is used to prompt the user that the pixel or the pixel area is not selectable, or to prompt the user to select another pixel or pixel area.
  • the third prompt information can be any form of information, including text, sound, pattern, color, and so on; for example, the text "This pixel is not selectable, please select one of the remaining pixels" or "This pixel area is not selectable, please select one of the remaining pixel areas" can be displayed, or a not-selectable pattern can pop up.
  • the present invention recognizes whether the pixel or the pixel area corresponds to a stationary object, thereby prompting the user to select the location of a non-stationary object, which helps to improve the effect of the synthesized video.
  • the feature points can be obtained through feature extraction methods such as Harris corner detection, HOG (Histogram of Oriented Gradients), and so on. If the number of feature points in the pixel area selected by the user is less than or equal to the preset feature point number threshold, there are not enough representative feature points, indicating that the texture is too weak to be tracked, so the user is prompted that the area is not selectable.
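  • For illustration only (not part of the disclosure): a rough check of this feature-count condition using Shi-Tomasi corners (a stand-in for the Harris-style features mentioned above); the rectangle coordinates and the threshold value are placeholders.

```python
import cv2

def area_selectable(gray_frame, x, y, w, h, min_features=10):
    """Return True if the selected pixel area has enough trackable feature points."""
    roi = gray_frame[y:y + h, x:x + w]
    corners = cv2.goodFeaturesToTrack(roi, maxCorners=200,
                                      qualityLevel=0.01, minDistance=5)
    count = 0 if corners is None else len(corners)
    return count > min_features        # too few points: texture too weak to track
```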
  • the sub-step 10212 includes the sub-step 102121:
  • the center position of the pixel area selected by the user in the target image frame is determined as the target pixel position.
  • the center position may be the average of the positions of all pixels in the pixel area, or it may be the center of the geometric shape, determined according to the geometric shape of the pixel area.
  • the sub-step 10212 includes sub-steps 102122 to 102123:
  • Sub-step 102122 displaying the pixel point area selected by the user in the target image frame, and marking and displaying feature points in the pixel point area.
  • the boundary pixels of the display pixel area can be marked in the target image frame, for example, the boundary pixels of the pixel area are displayed in a special color or boldly displayed.
  • feature points can be displayed in special colors, or displayed in bold, etc.
  • Sub-step 102123 in response to the user's feature point selection sub-operation in the pixel point area, determine the target pixel location according to the location of the feature point selected by the user.
  • the feature point selection sub-operation is used to select one of the feature points from the pixel area, and the feature point selection sub-operation can be a click operation, a long press operation, a drag operation, etc. on one of the feature points.
  • the present invention can mark and display the pixel point area and the feature points in the pixel point area to assist the user in selecting the feature point, thereby determining the target pixel position.
  • the sub-step 102121 includes the sub-step 1021211:
  • the position of the center of gravity of the pixel area is determined according to the position of the feature point in the pixel area selected by the user in the target image frame, and the position of the center of gravity is determined as the target pixel position.
  • the positions of the feature points in the pixel area can be averaged to obtain the target pixel position. For example, if there are three feature points in the pixel area, P1 (x1, y1), P2 (x2, y2), and P3 (x3, y3), where x1 and y1 are the row number and column number of the first feature point P1, x2 and y2 are the row number and column number of the second feature point P2, and x3 and y3 are the row number and column number of the third feature point P3, then the row number and column number of the center position are the respective averages: ((x1+x2+x3)/3, (y1+y2+y3)/3).
  • the method further includes sub-step 1022:
  • sub-step 1022 the display object edited by the user is displayed at the target pixel position of the target image frame.
  • the present invention can display the display object edited by the user at the target pixel position during the editing process, so that the user can preview the effect. If the user is not satisfied with the effect, the target pixel position can be reselected, allowing the user to adjust the target pixel position according to expectations.
  • the method further includes sub-steps 1023 to 1024:
  • Sub-step 1023 displaying the target sub-video in the target video, where the target sub-video includes: the video collected by the shooting device when the motion state of the shooting device satisfies a preset motion condition.
  • the preset motion condition means that the shooting device is displaced during shooting, rather than remaining static or merely panning in place.
  • the target sub-video is composed of multiple continuous image frames, and these multiple continuous images need to meet two conditions.
  • the first condition is that the sum of the average translation amount of the feature points between adjacent image frames is greater than or equal to the preset distance threshold to ensure a sufficient translation amount.
  • the second condition is that the parallax of the multiple consecutive image frames is greater than or equal to the preset parallax threshold, so that translation caused by the camera merely panning in place can be filtered out.
  • the number of multiple consecutive image frames needs to be greater than or equal to a preset image number threshold.
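  • For illustration only (not part of the disclosure): a rough sketch of checking the first and third conditions (summed average feature translation and frame count) with pyramidal Lucas-Kanade optical flow; the thresholds are placeholders and the parallax test is omitted for brevity.

```python
import cv2
import numpy as np

def enough_camera_motion(frames, dist_thresh=40.0, min_frames=15):
    """Check that a run of consecutive grayscale frames shows enough translation.

    frames: list of consecutive grayscale images; dist_thresh is the preset
    distance threshold on the summed average feature translation; min_frames
    is the preset image-number threshold (both values illustrative).
    """
    if len(frames) < min_frames:
        return False
    total = 0.0
    pts = cv2.goodFeaturesToTrack(frames[0], 300, 0.01, 7)
    for prev, cur in zip(frames[:-1], frames[1:]):
        if pts is None or len(pts) == 0:
            return False
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, cur, pts, None)
        good_old = pts[status.ravel() == 1]
        good_new = nxt[status.ravel() == 1]
        if len(good_new) == 0:
            return False
        # average translation of tracked feature points between adjacent frames
        total += np.mean(np.linalg.norm(good_new - good_old, axis=2))
        pts = good_new
    return total >= dist_thresh
```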
  • the image frame selected by the user in the target sub-video is determined as the target image frame.
  • the first image frame selection sub-operation is used to select the target image frame from the target sub-video.
  • the target sub-video tag can be displayed, so that the user can select one of the images as the target image frame.
  • the user can click, long press, drag one of the image frames, or click the selection control of each image frame.
  • the present invention can determine the target image frame from the target sub-video, which has relatively large disparity. Since a larger disparity means the motion of the shooting device was more obvious when shooting the target sub-video, compositing the display object into the target sub-video makes the display object show obvious movement and changes in the synthesized video, which helps to increase the richness of the synthesized video.
  • the method further includes sub-step 1025:
  • Sub-step 1025 in response to the user's second image frame selection sub-operation in the remaining sub-videos, first prompt information is displayed, where the first prompt information is used to prompt that the remaining sub-videos are not selectable, and the remaining sub-videos include the videos other than the target sub-video.
  • the second image frame selection sub-operation is used to select the target image frame from the remaining sub-videos.
  • the second image frame selection sub-operation is illegal in the present invention.
  • the first prompt information may be any form of information, including: text, sound, pattern, color, etc., for example, the text "This image frame is not selectable" may be displayed, or a pattern that indicates that the image frame is not selectable may be displayed.
  • the present invention can prompt the user that the image frame in the non-target sub-video is not selectable when the user selects the image frame, and realizes the humanized interaction.
  • the first prompt information includes first prompt sub-information
  • the sub-step 1025 includes sub-step 10251:
  • sub-step 10251 in response to the user's second image frame selection sub-operation in the remaining sub-videos, the first prompt sub-information is displayed, and the first prompt sub-information is used to prompt the user to select an image frame in the target sub-video.
  • the first prompt information can be any form of prompt. After the target sub-video is marked and displayed, the user will be prompted to select a target image frame from the target sub-video.
  • the first prompt information includes text, sound, pattern, color, and so on; for example, each image frame in the target sub-video is displayed with a mask mark, and the text "Please select the target image frame from the masked target sub-video" is displayed.
  • the present invention can prompt the user to select the target sub-video when the user selects an image frame that is not selectable, and further realizes more friendly interaction.
  • the determining the image frame selected by the user in the target sub-video as the target image frame includes sub-steps 10241 to 10242:
  • Sub-step 10241 displaying the key frames in the target sub-video.
  • the key frame is an image frame containing more information in the target sub-video, and the target sub-video may include one or more key frames.
  • key frame markers can be displayed, for example, the key frame is displayed in a larger size, or the key frame is covered with a mask of a certain color.
  • sub-step 10242 in response to the user's third image frame selection sub-operation in the key frames, the key frame selected by the user is determined as the target image frame, and the target image frame is displayed.
  • the third image frame selection sub-operation is used to select one of the multiple key frames as the target image frame.
  • the third image frame selection sub-operation may be operations such as clicking, long pressing, and dragging the key frame.
  • when displaying the target image frame, the target image frame can be extracted from the target sub-video and displayed in a separate area; it can also remain displayed within the target sub-video, in which case the target image frame needs to be marked, for example displayed in a larger size or covered with a mask of a certain color.
  • the method further includes sub-step 10243:
  • Sub-step 10243 in response to the user's fourth image frame selection sub-operation in the remaining image frames, second prompt information is displayed, where the second prompt information is used to prompt that the remaining image frames are not selectable, and the remaining image frames include The image frames other than the key frame in the target sub-video.
  • the fourth image frame selection sub-operation is used to select image frames from image frames other than the key frame in the target sub-video.
  • the fourth image frame selection sub-operation is illegal in the present invention.
  • in this case, the selection of the target image frame fails, and the user is prompted through the second prompt information that the image frame is not selectable.
  • the second prompt information can be any form of information, including: text, sound, pattern, color, etc., for example, the text "This image frame is not selectable" may be displayed, or a pattern that indicates that the image frame is not selectable may be displayed.
  • the present invention can provide key frames with more information for the user to select, so that the selected target image frame includes more information, so as to provide the user with more optional pixel positions.
  • the 10241 includes sub-step 102411:
  • Sub-step 102411 mark and display the key frame in the target sub-video.
  • the mark display is used to clearly distinguish the key frame from the other image frames in the target sub-video, for example by displaying the key frame in a larger size or covering it with a mask of a certain color.
  • the embodiment of the present invention may mark key frames, so that the user can more easily see the key frames, and it is convenient for the user to select the target image frame.
  • the key frame satisfies at least one of the following conditions:
  • the amount of translation between the current key frame and the previous key frame is greater than a translation threshold
  • the amount of rotation between the current key frame and the previous key frame is greater than a rotation threshold
  • the total number of the feature points that are successfully tracked and matched in the key frame is less than a matching threshold
  • the number of the feature points on the key frame is less than the number threshold.
  • the value of the translation information is the distance between the position where the camera captured the current key frame and the position where it captured the previous key frame. The larger the value of the translation information, the more the camera translated during shooting; the value can be calculated from the pose information of each key frame taken by the camera.
  • the value of the rotation information is the angle by which the camera rotated between capturing the previous key frame and the current key frame. The larger the value of the rotation information, the more the camera rotated during shooting; the value can be calculated from the pose information of each key frame taken by the camera.
  • the number of feature points that are successfully tracked and matched in a key frame is the number of feature points that appear in different key frames at the same time. The fewer feature points are successfully tracked and matched, the fewer feature points remain in the image frame, which means that the image frame cannot continue to be tracked and the feature point calculation needs to be performed again based on that image frame.
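  • For illustration only (not part of the disclosure): a sketch of a key-frame decision combining the four conditions above; the camera positions and rotation matrices come from the pose information, and all threshold values are placeholders.

```python
import numpy as np

def is_key_frame(t_cur, t_prev, R_cur, R_prev, tracked_matches, feature_count,
                 trans_thresh=0.5, rot_thresh_deg=10.0,
                 match_thresh=50, count_thresh=80):
    """Decide whether the current frame should become a new key frame.

    t_*: camera positions (3,), R_*: camera rotation matrices (3x3) from the
    pose information; tracked_matches / feature_count are feature-tracking
    statistics. All thresholds are illustrative placeholders.
    """
    # translation amount relative to the previous key frame
    translation = np.linalg.norm(t_cur - t_prev)
    # rotation amount: angle of the relative rotation R_prev^T * R_cur
    rel = R_prev.T @ R_cur
    angle = np.degrees(np.arccos(np.clip((np.trace(rel) - 1.0) / 2.0, -1.0, 1.0)))
    return (translation > trans_thresh
            or angle > rot_thresh_deg
            or tracked_matches < match_thresh
            or feature_count < count_thresh)
```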
  • the displaying the target image frame includes sub-step 102421.
  • Sub-step 102421 marking the feature points in the target image frame in the displayed target image frame.
  • feature points can be marked in the target image frame.
  • the method of marking the feature points includes but is not limited to: marking with special colors, encircling the feature points, and accentuating the color of the feature points.
  • the present invention can prompt the user to select the characteristic point as the target pixel position by marking the characteristic point, which is convenient for the user to select and improves the efficiency of selecting the target pixel position.
  • the method further includes step 106:
  • Step 106 In response to the user's object content editing operation, display the display object edited by the user in the editing interface.
  • the editing interface is an interface for the user to edit the object, and the editing interface can be an input box.
  • the display object edited by the user can be displayed, so that the user can see in real time whether the input display object is correct.
  • the target video is captured and acquired by the photographing device when a movable platform tracks the target object in the space
  • the acquiring of the position information of the display object in the space includes sub-steps 1026 to 1027:
  • sub-step 1026 the position information of the tracking object of the camera of the movable platform is obtained.
  • the movable platform can track an object for shooting in a surrounding or moving manner.
  • the tracking object usually represents the subject of the target video, so the user may edit content relating to the subject, and the display object is usually placed at the location of the subject or in its surrounding area.
  • Sub-step 1027 Determine the position information of the display object in the space according to the position information of the tracking object.
  • a position that does not cover the tracking object can be determined as the target pixel position according to the position information of the tracking object, or the position information of the tracking object can be directly determined as the target pixel position; for example, according to the position information of the tracking object, a position in front of, behind, to the left of, or to the right of the tracking object is used as the target pixel position. Then, the projection position of the target pixel position in the space is determined as the position information of the display object in the space.
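  • For illustration only (not part of the disclosure): a trivial sketch of deriving the display position from the tracking object's position by a fixed offset, so the display object does not cover the subject; the offset value is a hypothetical choice.

```python
import numpy as np

def display_position_from_tracked(tracked_pos, offset=(0.0, 0.0, 1.5)):
    """Place the display object at a fixed spatial offset from the tracking object."""
    return np.asarray(tracked_pos, dtype=float) + np.asarray(offset, dtype=float)
```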
  • the present invention can determine the position information of the display object in the space according to the position information of the tracking object, and can avoid the user from editing the position information in the scene of tracking and shooting, and further reduces the complexity of the user's operation.
  • the method further includes step 107:
  • Step 107 In response to the user's position adjustment operation of the display object, adjust the position information of the display object in the space.
  • the user can also directly adjust the position information of the display object in the space, and the user confirms the adjusted position information so that the adjusted position information becomes effective and the original position information becomes invalid.
  • the user can input the three-dimensional coordinates corresponding to the position information in the adjusted space in the input box, or input the adjustment amount of the three-dimensional coordinates, or directly drag the display object.
  • the present invention can enable the user to flexibly adjust the position information of the display object, so that the synthesized video is more in line with the user's needs.
  • the method further includes step 108:
  • Step 108 In response to the user's orientation adjustment operation of the display object, adjust the orientation of the display object;
  • correspondingly, the video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in the space and the pose information of the shooting device of the movable platform when shooting each frame of the image includes: a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space, the pose information of the shooting device of the movable platform when shooting each frame of the image in the target video, and the adjusted orientation of the display object.
  • the orientation of the display object is used to characterize the relative pose relationship between the spatial position of the display object and the projection position of the target pixel position in space. With different orientations, the projection of the display object in each image frame is different.
  • the orientation of the display object can be set as the default orientation.
  • the default orientation of the display object can be any orientation of the display object.
  • the orientation adjustment operation is used to adjust the orientation of the display object.
  • the user can rotate the display object, or input a rotation parameter in the input box.
  • the rotation parameter may include a rotation angle and a direction.
  • the user can adjust only the position or only the orientation of the display object, or adjust both the orientation and the position at the same time.
  • the orientation of the display object can be flexibly adjusted by the user, so that the synthesized video meets the needs of the user.
  • the position information of the display object in space is the coordinate position of the spatial position of the display object in a coordinate system
  • the orientation of the display object is the orientation of the display object in the coordinate system
  • the coordinate system is a coordinate system established by the projection position of the target pixel position in space as the origin.
  • the coordinate system is a three-dimensional coordinate system
  • the coordinate position is based on the three-dimensional coordinate in the three-dimensional coordinate system
  • the orientation of the spatial position of the display object can be represented by a vector based on the three-dimensional coordinate system.
  • the coordinate position of the display object's spatial position in the coordinate system can be (10, 12, 30).
  • the distance between the spatial position of the display object and the projection position of the target pixel position in the space is 10 on the x-axis, 12 on the y-axis, and 30 on the z-axis.
  • the orientation of the spatial position of the display object in the coordinate system may be (-10, -12, -30), which represents the projection position of the display object facing the target pixel position in space.
  • the step 107 includes sub-steps 1071 to 1073:
  • Sub-step 1071 in response to the user's click operation on the operation interface, obtain the initial touch point position corresponding to the click operation.
  • the operation interface is an interface for adjusting the pose of the display object.
  • a sub-window can be launched separately as the operation interface.
  • the display object in the sub-window is displayed in the initial pose.
  • the user can click the save control in the sub-window to save the adjusted pose.
  • the sub-window can be closed, and the display object will be displayed in the target image frame in the adjusted pose, so that the user can view the effect.
  • the pose can be adjusted directly in the target image frame, and the interface where the target image frame is located is used as the operation interface.
  • the user can click any position on the display object as the initial touch point position.
  • Sub-step 1072 in response to the user's drag operation on the operation interface, obtain the change in the position of the touch point corresponding to the drag operation.
  • the amount of change in the contact position is the amount of change on the operation interface.
  • the change in the position of the contact point can be expressed by the number of pixels, and the change in the position of the contact point is a vector, which not only represents the size but also the direction.
  • the change in the position of the contact point can be expressed in two-dimensional coordinates.
  • Sub-step 1073 Adjust the position information of the display object in the space according to the change in the position of the touch point.
  • the spatial position change is the position change in space
  • the spatial position change can be expressed by a spatial distance
  • the spatial position change is a vector, which not only represents the size but also the direction.
  • the amount of spatial position change can be expressed in three-dimensional coordinates.
  • the conversion relationship between the two-dimensional coordinates and the three-dimensional coordinates can be preset, so as to determine the corresponding spatial position change according to the change of the contact position.
  • the two-dimensional coordinates can be converted into three-dimensional coordinates through a 2×3 matrix, and then the adjusted position information of the display object in the space is determined according to the spatial position change and the position information of the display object in the space before the adjustment.
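  • For illustration only (not part of the disclosure): a sketch of the drag-to-space conversion, written with a 3x2 matrix (the transpose of the 2x3 matrix mentioned above) so that a 2D touch-point change maps to a 3D position change; the mapping values are placeholders.

```python
import numpy as np

def touch_delta_to_space_delta(delta_uv, M=None):
    """Map a 2D touch-point change (du, dv) in pixels to a 3D position change."""
    if M is None:
        scale = 0.01                   # placeholder: metres of motion per pixel dragged
        M = np.array([[scale, 0.0],
                      [0.0, scale],
                      [0.0, 0.0]])     # no change along the depth axis
    return M @ np.asarray(delta_uv, dtype=float)

# new_position = old_position + touch_delta_to_space_delta((dx_pixels, dy_pixels))
```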
  • the present invention can realize the adjustment of the spatial position by dragging operation.
  • step 1091 and/or step 1092 and/or step 1093 and/or step 1094 are further included:
  • Step 1091 In response to the user's play operation of the synthesized video, play the synthesized video.
  • the user can click, long press, or drag the synthesized video to play the synthesized video, or import the synthesized video into the playback software to play.
  • Step 1092 In response to the user's confirmation operation on the synthesized video, store the synthesized video.
  • the user can store the synthesized video in a designated storage device, where the storage device includes, but is not limited to: a magnetic disk, an optical disk, and a cache.
  • Step 1093 In response to the user's deleting operation on the synthesized video, delete the synthesized video.
  • the user can operate the delete control after selecting the synthesized video to delete the synthesized video.
  • Step 1094 In response to the user's sharing operation of the synthesized video, the synthesized video is sent to a target user specified by the sharing operation, and the target user includes: users registered in different applications.
  • the user can realize the sharing of the synthesized video through an application program with a file sharing function, for example, the synthesized video can be shared through social software.
  • the movable platform is an unmanned aerial vehicle.
  • in the embodiments of the present invention, the display object edited by the user is obtained in response to the user's object content editing operation; the position information of the display object in the space is obtained in response to the user's object position editing operation; and the synthesized video is obtained.
  • the synthesized video is a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space and the pose information of the shooting device of the movable platform when shooting each frame of the target video.
  • the present invention only requires the user to input the display object and a position, and the display object input by the user is then synthesized into the video at that position, thereby reducing the complexity of the user's operation.
  • Referring to FIG. 6, there is shown a structural block diagram of a video processing device according to the second embodiment of the present application, which specifically includes a processor 210, a memory 220, and a computer program stored on the memory 220 and capable of running on the processor 210. When the processor executes the computer program, it is configured to:
  • the synthesized video is a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in the space and the pose information of the shooting device of the movable platform when shooting each frame of the target video, where the target video is a video collected by the shooting device of the movable platform while the movable platform moves in the space;
  • the processor is further configured to:
  • the target video is determined from the videos collected by the shooting device of the movable platform according to the video selection operation.
  • the video selection operation includes a first movable platform selection sub-operation
  • the processor is further configured to:
  • a target movable platform is determined among a plurality of movable platforms, and a video captured by a shooting device of the target movable platform is determined as the target video.
  • the video selection operation includes a first video selection sub-operation
  • the processor is further configured to:
  • the video selected by the user is determined as the target video, and the video collection includes videos collected by at least one shooting device of a movable platform.
  • the video selection operation includes a second movable platform selection sub-operation and a second video selection sub-operation
  • the processor is further configured to:
  • the video collected by the camera of the movable platform selected by the user is displayed as a candidate video
  • the video selected by the user is determined as the target video.
  • the video selection operation includes a segment selection sub-operation
  • the processor is further configured to:
  • a video segment is determined in the to-be-edited video as the target video, where the to-be-edited video includes one of the following: a video selected by the user from the videos collected by the shooting devices of one or more movable platforms, the video collected by the shooting device of the movable platform selected by the user, and a video selected by the user from the videos collected by the shooting device of the selected movable platform.
  • the processor is further configured to:
  • the synthesized video is a video obtained by projecting the three-dimensional model corresponding to the display object onto each frame of the target video.
  • the object content editing operation includes a first input sub-operation; the processor is further configured to:
  • the object identifier input by the user is acquired, and the three-dimensional model corresponding to the object identifier is determined as the display object.
  • the object content editing operation includes a second input sub-operation and a first model selection sub-operation; the processor is further configured to:
  • the candidate model selected by the user is determined as the display object.
  • the object content editing operation includes: a second model selection sub-operation, and the processor is further configured to:
  • any of the three-dimensional models selected by the user from among a plurality of candidate three-dimensional models is determined as the display object.
  • the display object includes at least one of numbers, letters, special symbols, and object identifiers.
  • the processor is further configured to:
  • the object attribute information input by the user is obtained, and the display object is set according to the object attribute information.
  • the object attribute information includes at least one of the following: the size of the display object, the transparency of the display object, the degree of blurring of the display object, and the color of the display object.
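A small sketch of how such attribute information might be held and applied; the field names, default values, and the setter calls on `display_object` are illustrative assumptions rather than the patent's interface.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class DisplayObjectAttributes:
    size: float = 1.0                              # scale factor for the display object
    transparency: float = 0.0                      # 0.0 = fully opaque, 1.0 = fully transparent
    blur: float = 0.0                              # degree of blurring (e.g. blur radius in pixels)
    color: Tuple[int, int, int] = (255, 255, 255)  # RGB color of the display object

def apply_attributes(display_object, attrs: DisplayObjectAttributes):
    """Set the display object according to the attribute information input by the user (hypothetical setters)."""
    display_object.scale(attrs.size)
    display_object.set_alpha(1.0 - attrs.transparency)
    display_object.set_blur(attrs.blur)
    display_object.set_color(attrs.color)
    return display_object
```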
  • the processor is further configured to:
  • the target image frame is an image frame in the target video.
  • the object position editing operation includes a first object position editing sub-operation or a second object position editing sub-operation
  • the processor is further configured to:
  • the target pixel position is determined according to the position in the target image frame of the pixel area selected by the user in the target image frame.
  • the processor is further configured to:
  • the third prompt information is displayed
  • the third prompt information is used to prompt the user that the pixel point or the pixel point area is not selectable, or to prompt the user to select another pixel point or pixel point area.
  • the processor is further configured to:
  • the center position of the pixel area selected by the user in the target image frame is determined as the target pixel position.
  • the processor is further configured to:
  • the target pixel location is determined according to the location of the feature point selected by the user.
  • the processor is further configured to:
  • the position of the center of gravity of the pixel area is determined according to the position of the feature point in the pixel area selected by the user in the target image frame, and the position of the center of gravity is determined as the target pixel position.
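A minimal sketch of taking the center of gravity of the feature points inside the user-selected pixel area as the target pixel position; the rectangular region format and variable names are assumptions.

```python
from typing import Optional
import numpy as np

def target_pixel_from_region(feature_points: np.ndarray,   # (N, 2) pixel coordinates of detected feature points
                             region: tuple) -> Optional[np.ndarray]:
    """Return the center of gravity of the feature points inside the user-selected pixel area."""
    x0, y0, x1, y1 = region                                  # user-selected area as an axis-aligned box
    inside = feature_points[
        (feature_points[:, 0] >= x0) & (feature_points[:, 0] <= x1) &
        (feature_points[:, 1] >= y0) & (feature_points[:, 1] <= y1)
    ]
    if len(inside) == 0:
        return None                                          # no feature points: the area is not selectable
    return inside.mean(axis=0)                               # center of gravity = target pixel position
```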
  • the processor is further configured to:
  • the display object edited by the user is displayed at the target pixel position of the target image frame.
  • the processor is further configured to:
  • the target sub-video includes: a video collected by the shooting device when the motion state of the shooting device satisfies a preset motion condition;
  • the image frame selected by the user in the target sub-video is determined as the target image frame.
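As a rough illustration, the target sub-video could be filtered by a motion condition such as the one sketched below, where the preset condition is assumed (purely for illustration) to be a cap on the shooting device's speed.

```python
def select_target_sub_video(frames, speeds, max_speed=1.0):
    """Keep the frames captured while the shooting device's motion satisfies the assumed condition (speed <= max_speed)."""
    return [frame for frame, speed in zip(frames, speeds) if speed <= max_speed]
```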
  • the processor is further configured to:
  • the first prompt information is displayed when the image frame selected by the user falls outside the target sub-video.
  • the first prompt information includes first prompt sub-information
  • the processor is further configured to:
  • the first prompt sub-information is displayed, and the first prompt sub-information is used to prompt the user to select an image frame in the target sub-video.
  • the processor is further configured to:
  • the key frame selected by the user is determined as the target image frame, and the target image frame is displayed.
  • the processor is further configured to:
  • second prompt information is displayed.
  • the second prompt information is used to prompt that the remaining image frames are not selectable, and the remaining image frames include: image frames in the target sub-video other than the key frames.
  • the processor is further configured to:
  • the key frame satisfies at least one of the following conditions (a minimal check of these conditions is sketched after this list):
  • the amount of translation between the current key frame and the previous key frame is greater than a translation threshold
  • the amount of rotation between the current key frame and the previous key frame is greater than a rotation threshold
  • the total number of the feature points that are successfully tracked and matched in the key frame is less than a matching threshold
  • the number of the feature points on the key frame is less than the number threshold.
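A minimal check of the key-frame conditions listed above; the concrete threshold values and the units of the translation and rotation amounts are assumptions for illustration.

```python
def is_key_frame(translation: float,          # translation relative to the previous key frame
                 rotation: float,             # rotation relative to the previous key frame
                 tracked_matches: int,        # feature points successfully tracked and matched on this frame
                 feature_count: int,          # feature points detected on this frame
                 translation_threshold: float = 0.5,   # assumed value
                 rotation_threshold: float = 0.2,      # assumed value
                 matching_threshold: int = 50,         # assumed value
                 number_threshold: int = 100) -> bool: # assumed value
    """A frame qualifies as a key frame if it satisfies at least one of the listed conditions."""
    return (translation > translation_threshold
            or rotation > rotation_threshold
            or tracked_matches < matching_threshold
            or feature_count < number_threshold)
```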
  • the processor is further configured to:
  • the processor is further configured to:
  • the display object edited by the user is displayed in the editing interface.
  • the target video is captured and acquired by the photographing device when the movable platform tracks the target object in the space
  • the processor is further configured to:
  • the location information of the display object in the space is determined according to the location information of the tracking object.
  • the processor is further configured to:
  • the position information of the display object in the space is adjusted.
  • the processor is further configured to:
  • the synthesized video is: a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space, the pose information of the shooting device of the movable platform when shooting each frame of the target video, and the adjusted orientation of the display object.
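Building on the projection sketch given earlier, a hedged illustration of also applying the adjusted orientation of the display object before projection; `model_points`, `R_obj`, and the row-vector convention are assumptions.

```python
import numpy as np

def project_oriented_object(model_points: np.ndarray,    # (N, 3) points of the display object's model (object frame)
                            R_obj: np.ndarray,            # (3, 3) adjusted orientation of the display object
                            obj_position: np.ndarray,     # (3,) position of the display object in space
                            R: np.ndarray,                # (3, 3) world-to-camera rotation for this frame
                            t: np.ndarray,                # (3,) world-to-camera translation for this frame
                            K: np.ndarray) -> np.ndarray: # (3, 3) assumed camera intrinsic matrix
    """Orient and place the object in space, then project its points into one frame (row-vector convention)."""
    world_points = model_points @ R_obj.T + obj_position  # apply the adjusted orientation, then the position
    cam_points = world_points @ R.T + t                   # world -> camera for this frame
    pix = cam_points @ K.T
    return pix[:, :2] / pix[:, 2:3]                       # (N, 2) pixel coordinates (points behind the camera not handled)
```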
  • the processor is further configured to:
  • the processor is further configured to:
  • the synthesized video is sent to a target user specified by the sharing operation, and the target user includes: users registered in different applications.
  • the movable platform is an unmanned aerial vehicle.
  • the display object edited by the user is obtained in response to the user's object content editing operation; the position information of the display object in the space is obtained in response to the user's object position editing operation; and the synthesized video is obtained, where the synthesized video is: a video obtained by projecting the display object onto each frame of the target video according to the position information of the display object in space and the pose information of the shooting device of the movable platform when shooting each frame of the target video.
  • the present invention only requires the user to input the display object and the position, so that the display object input by the user is synthesized into the video according to the position, thereby reducing the complexity of the user's operation.
  • the device embodiments described above are merely illustrative.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units.
  • Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
  • since the device embodiment is basically similar to the method embodiment, the description is relatively simple; for related parts, reference may be made to the description of the method embodiment.
  • any reference signs placed between parentheses should not be construed as a limitation to the claims.
  • the word “comprising” does not exclude the presence of elements or steps not listed in the claims.
  • the word “a” or “an” preceding an element does not exclude the presence of multiple such elements.
  • the application can be realized by means of hardware including several different elements and by means of a suitably programmed computer. In the unit claims listing several devices, several of these devices may be embodied in the same hardware item.
  • the use of the words first, second, third, etc. does not indicate any order; these words can be interpreted as names.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present invention relates to a video processing method and apparatus. The method comprises: obtaining, in response to an object content editing operation of a user, a display object edited by the user (101); in response to an object position editing operation of the user, obtaining position information of the display object in a space (102); obtaining a composite video, the composite video being a video obtained by projecting the display object onto each image frame in a target video according to the position information of the display object in the space and pose information of a photographing device of a movable platform while photographing each image frame in the target video, and the target video being a video captured by the photographing device of the movable platform when the movable platform moves in the space (103); and displaying the composite video (104). According to the present invention, a user only needs to input a display object and a position, such that the display object input by the user is composited into a video according to the position, thereby reducing the complexity of operation by the user.
PCT/CN2020/087350 2020-04-28 2020-04-28 Procédé et appareil de traitement vidéo WO2021217385A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080041521.5A CN113906731B (zh) 2020-04-28 2020-04-28 一种视频处理方法和装置
PCT/CN2020/087350 WO2021217385A1 (fr) 2020-04-28 2020-04-28 Procédé et appareil de traitement vidéo

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/087350 WO2021217385A1 (fr) 2020-04-28 2020-04-28 Procédé et appareil de traitement vidéo

Publications (1)

Publication Number Publication Date
WO2021217385A1 true WO2021217385A1 (fr) 2021-11-04

Family

ID=78373253

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087350 WO2021217385A1 (fr) 2020-04-28 2020-04-28 Procédé et appareil de traitement vidéo

Country Status (2)

Country Link
CN (1) CN113906731B (fr)
WO (1) WO2021217385A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567166A (zh) * 2023-07-07 2023-08-08 广东省电信规划设计院有限公司 一种视频融合方法、装置、电子设备及存储介质
WO2024060856A1 (fr) * 2022-09-20 2024-03-28 腾讯科技(深圳)有限公司 Procédé et appareil de traitement de données, dispositif électronique, support de stockage et produit-programme

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101630416A (zh) * 2008-07-17 2010-01-20 鸿富锦精密工业(深圳)有限公司 图片编辑系统及其编辑方法
JP2013008257A (ja) * 2011-06-27 2013-01-10 Celsys:Kk 画像合成プログラム
CN104036476A (zh) * 2013-03-08 2014-09-10 三星电子株式会社 用于提供增强现实的方法以及便携式终端
WO2015056826A1 (fr) * 2013-10-18 2015-04-23 주식회사 이미지넥스트 Appareil et procédé de traitement des images d'un appareil de prise de vues
CN106097435A (zh) * 2016-06-07 2016-11-09 北京圣威特科技有限公司 一种增强现实拍摄系统及方法
CN108346171A (zh) * 2017-01-25 2018-07-31 阿里巴巴集团控股有限公司 一种图像处理方法、装置、设备和计算机存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109857905B (zh) * 2018-11-29 2022-03-15 维沃移动通信有限公司 一种视频编辑方法及终端设备
CN110505498B (zh) * 2019-09-03 2021-04-02 腾讯科技(深圳)有限公司 视频的处理、播放方法、装置及计算机可读介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101630416A (zh) * 2008-07-17 2010-01-20 鸿富锦精密工业(深圳)有限公司 图片编辑系统及其编辑方法
JP2013008257A (ja) * 2011-06-27 2013-01-10 Celsys:Kk 画像合成プログラム
CN104036476A (zh) * 2013-03-08 2014-09-10 三星电子株式会社 用于提供增强现实的方法以及便携式终端
WO2015056826A1 (fr) * 2013-10-18 2015-04-23 주식회사 이미지넥스트 Appareil et procédé de traitement des images d'un appareil de prise de vues
CN106097435A (zh) * 2016-06-07 2016-11-09 北京圣威特科技有限公司 一种增强现实拍摄系统及方法
CN108346171A (zh) * 2017-01-25 2018-07-31 阿里巴巴集团控股有限公司 一种图像处理方法、装置、设备和计算机存储介质

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024060856A1 (fr) * 2022-09-20 2024-03-28 腾讯科技(深圳)有限公司 Procédé et appareil de traitement de données, dispositif électronique, support de stockage et produit-programme
CN116567166A (zh) * 2023-07-07 2023-08-08 广东省电信规划设计院有限公司 一种视频融合方法、装置、电子设备及存储介质
CN116567166B (zh) * 2023-07-07 2023-10-17 广东省电信规划设计院有限公司 一种视频融合方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN113906731A (zh) 2022-01-07
CN113906731B (zh) 2023-10-13

Similar Documents

Publication Publication Date Title
Lai et al. Semantic-driven generation of hyperlapse from 360 degree video
US10958891B2 (en) Visual annotation using tagging sessions
US11055534B2 (en) Method and apparatus for 3-D auto tagging
US11482192B2 (en) Automated object selection and placement for augmented reality
US10861159B2 (en) Method, system and computer program product for automatically altering a video stream
JP6599435B2 (ja) イベント空間で生じるイベントの3次元再構成における3次元再構成システムによる周囲の処理を制限するためのシステムおよび方法
US9381429B2 (en) Compositing multiple scene shots into a video game clip
CN106664376B (zh) 增强现实设备和方法
JP3773670B2 (ja) 情報呈示方法および情報呈示装置および記録媒体
JP2021511729A (ja) 画像、又はビデオデータにおいて検出された領域の拡張
US20180160194A1 (en) Methods, systems, and media for enhancing two-dimensional video content items with spherical video content
US20180276882A1 (en) Systems and methods for augmented reality art creation
TW200922324A (en) Image processing device, dynamic image reproduction device, and processing method and program in them
US20230410332A1 (en) Structuring visual data
JP7459870B2 (ja) 画像処理装置、画像処理方法、及び、プログラム
JP2009077363A (ja) 画像処理装置、動画再生装置、これらにおける処理方法およびプログラム
US20180005430A1 (en) System, method and apparatus for rapid film pre-visualization
WO2021217385A1 (fr) Procédé et appareil de traitement vidéo
KR20180130504A (ko) 정보 처리 장치, 정보 처리 방법, 프로그램
US20160381290A1 (en) Apparatus, method and computer program
CN110996150A (zh) 视频融合方法、电子设备及存储介质
US20160379682A1 (en) Apparatus, method and computer program
US20190155465A1 (en) Augmented media
Langlotz et al. AR record&replay: situated compositing of video content in mobile augmented reality
EP3503101A1 (fr) Interface utilisateur basée sur des objets

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20932929

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20932929

Country of ref document: EP

Kind code of ref document: A1