WO2019042341A1 - Video editing method and apparatus - Google Patents

Video editing method and apparatus

Info

Publication number
WO2019042341A1
WO2019042341A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
key frame
clip
segments
segment
Prior art date
Application number
PCT/CN2018/103148
Other languages
English (en)
French (fr)
Inventor
狄杰
Original Assignee
优酷网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 优酷网络技术(北京)有限公司 filed Critical 优酷网络技术(北京)有限公司
Publication of WO2019042341A1 publication Critical patent/WO2019042341A1/zh

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content

Definitions

  • the present disclosure relates to the field of video editing, and more particularly to video editing methods and apparatus.
  • the present disclosure proposes a method capable of automatically acquiring clip material according to user requirements.
  • the present disclosure also proposes corresponding devices.
  • a video editing method comprising: receiving a clip index; determining, among a plurality of tags of the video, the tags that match the clip index, the plurality of tags corresponding to a plurality of segments of the video; and merging the segments corresponding to the tags matching the clip index to obtain a final clip.
  • the clip index includes at least one of a text and a picture.
  • the tag includes at least one of a text and a picture.
  • the tag includes text; the method further includes: dividing the video into the plurality of segments; determining a key frame in each segment; and performing image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
  • the tag includes a picture; the method further includes: dividing the video into the plurality of segments; and determining a key frame in each segment and using the key frame as the picture of the tag corresponding to the segment that includes the key frame.
  • a video editing apparatus comprising: a clip index receiving module for receiving a clip index; a matching tag determining module for determining, among a plurality of tags of the video, the tags that match the clip index, the plurality of tags corresponding to the plurality of segments of the video; and a segment merging module for merging the segments corresponding to the tags matching the clip index to obtain a final clip.
  • the clip index includes at least one of a text and a picture.
  • the tag includes at least one of a text and a picture.
  • the tag includes text; the apparatus further includes: a first video segmentation module, configured to divide the video into the plurality of segments; a first key frame determining module, configured to determine a key frame in each segment; and an image recognition module, configured to perform image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
  • the tag includes a picture; the apparatus further includes: a second video segmentation module, configured to divide the video into the plurality of segments; and a second key frame determining module, configured to determine a key frame in each segment and use the key frame as the picture of the tag corresponding to the segment that includes the key frame.
  • an apparatus for video editing comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the method described above.
  • a non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above method.
  • FIG. 1 illustrates a flow chart of a video editing method in accordance with an exemplary embodiment of the present disclosure.
  • FIG. 2(a), 2(b), and 2(c) illustrate schematic diagrams of an exemplary application example of the present disclosure.
  • FIG. 3 illustrates a structural block diagram of a video editing apparatus according to an exemplary embodiment of the present disclosure.
  • FIG. 4 illustrates a structural block diagram of an apparatus for video editing, according to an exemplary embodiment of the present disclosure.
  • FIG. 1 illustrates a flow chart of a video editing method in accordance with an exemplary embodiment of the present disclosure. This method can be applied to servers or terminal devices. As shown in Figure 1, the method includes the following steps.
  • Step 102 Receive a clip index.
  • the clip index can be received from the client.
  • the clip index includes text.
  • the clip index includes a picture.
  • the clip index includes both text and a picture.
  • Step 104 Determine, among the plurality of tags of the video, the tags that match the clip index, the plurality of tags corresponding to the plurality of segments of the video.
  • the tag includes text, such as a person's name, the name of a building (such as "monument"), a description of a behavior (such as "shooting"), a description of the background (such as "the sea"), or a description of the scene (such as "indoor"), and so on.
  • the tag includes a picture, for example, one or more image frames of the corresponding segment, such as a picture centered on a certain person who appears in it, or a picture of a specific scene, and the like.
  • the tag includes both text and a picture.
  • for example, if the aforementioned clip index includes the name of an actor, then a tag that includes the actor's name, the name of the character played by the actor, or a picture in which the actor appears can be considered to match the clip index.
  • likewise, if the clip index includes a picture centered on a certain person, a tag that includes a picture centered on that person, or that includes the person's name, can be considered to match the clip index.
  • the tags corresponding to one video segment can match different clip indexes. For example, if the tag corresponding to a segment includes a person's name and a scene description, the tag can be determined to match the clip index either when the clip index includes the person's name or a picture in which the person appears, or when the clip index includes the scene description or a picture in which the scene appears. Likewise, if the tag corresponding to a segment includes a picture of a certain person in a certain scene, the tag can be determined to match the clip index when the clip index includes the person's name or a picture of the person, or the scene description or a picture of the scene.
  • the above is for illustration only and is not intended to limit the present disclosure in any way. Those skilled in the art can decide according to their own needs how to determine whether a tag and a clip index match.
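As an illustration of the matching step, a minimal Python sketch follows. The data shapes, names, and the simple substring-matching rule are assumptions made for the example, not details from the patent:

```python
# Hypothetical sketch: match a text clip index against per-segment tag text.

def matching_tags(clip_index: str, tags: dict) -> list:
    """Return the segment ids whose tag text contains the clip index.

    `tags` maps a segment id to its tag text (e.g. names, scene or
    behavior descriptions recognized from the segment's key frames).
    """
    needle = clip_index.strip().lower()
    return [seg_id for seg_id, text in tags.items()
            if needle in text.lower()]

tags = {
    "seg_01": "Person M, indoor, shooting",
    "seg_02": "the sea, monument",
    "seg_03": "Person M, the sea",
}
print(matching_tags("person m", tags))  # ['seg_01', 'seg_03']
```

A real system would likely use fuzzier matching (aliases, character names, face matching for picture indexes), but the tag-to-segment correspondence works the same way.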
  • Step 106 Merge the segments corresponding to the tags matching the clip index to obtain a final clip.
  • all the segments corresponding to the tags matching the clip index may be merged automatically.
  • the segments may be merged into one final clip according to the time stamp of each segment.
  • after the final clip is sent to the user, the user can freely edit it, for example, deleting one or more segments from it, inserting other video segments, or adjusting the order of the segments.
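The merge-by-timestamp behavior described above can be sketched as follows. Representing a segment as a `(start, end)` pair of second offsets is an illustrative assumption:

```python
# Hypothetical sketch of Step 106: order the matched segments by their start
# timestamps to form the edit list of the final clip.

def merge_segments(segments):
    """segments: list of (start_sec, end_sec) taken from the source video.
    Returns them ordered by start time, i.e. the edit list of the final clip."""
    return sorted(segments, key=lambda s: s[0])

# Segments matched for some clip index, in arbitrary order.
matched = [(190, 260), (65, 72), (550, 680)]
edit_list = merge_segments(matched)
print(edit_list)  # [(65, 72), (190, 260), (550, 680)]

total = sum(end - start for start, end in edit_list)  # duration of the final clip
```

The actual concatenation of the video data (demuxing, re-encoding or stream copying) is outside the scope of this sketch.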
  • the corresponding final clip is obtained automatically according to the clip index, which greatly reduces the labor cost of video editing and provides great convenience for video editing.
  • the above embodiment is applied to a server.
  • suppose the user edits an episode of a TV series and hopes to obtain a highlight reel of a certain character's appearances in the episode. The user can then trigger, on the client, a clip request for the episode with the character's name as the clip index.
  • after receiving the clip request, the server may determine the 10 tags of the episode that match the character's name, and determine the corresponding 10 segments according to the correspondence between tags and segments, for example, the segment from 1 min 05 s to 1 min 12 s, the segment from 3 min 10 s to 4 min 20 s, the segment from 9 min 10 s to 11 min 20 s, and so on; it then merges the 10 segments to obtain the final clip and sends the final clip to the client.
  • the corresponding splicing information is also sent to the client, so that the user can subsequently edit the final clip, such as deleting one or more segments, inserting other video segments, or adjusting the order of the segments.
  • the user may be presented with the initially matched tag information, to facilitate screening by the user to determine the clip material that best fits expectations.
  • the client may display to the user the several tags, among the video's plurality of tags, that initially match the person's name.
  • some or all of these matching tags include, in addition to the information matching the person's name, other information about the corresponding segment, such as scene description information and behavior description information.
  • each displayed tag can be configured with a corresponding selection control and/or delete control. The user can select, through the selection controls, some or all of these tags as the finally determined tags that match the clip index. Further, the segments corresponding to the tags selected by the user may be merged to obtain the final clip.
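The screening flow above (the user keeps some of the candidate tags, and only their segments are merged) could be sketched as follows; the tag ids and segment spans are illustrative:

```python
# Hypothetical sketch: keep only user-selected tags, then build the edit list.

def final_clip(candidates, selected_ids):
    """candidates: {tag_id: (start_sec, end_sec)}; selected_ids: the tags the
    user kept via the selection controls. Returns the ordered edit list."""
    kept = [span for tag_id, span in candidates.items() if tag_id in selected_ids]
    return sorted(kept)

candidates = {"t1": (65, 72), "t2": (190, 260), "t3": (550, 680)}
print(final_clip(candidates, {"t1", "t3"}))  # [(65, 72), (550, 680)]
```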
  • FIG. 2(a), 2(b), and 2(c) show schematic diagrams according to an exemplary application example of the present disclosure.
  • Fig. 2(a) shows a schematic diagram of a page for receiving a clip index input by a user in a client of the terminal device.
  • the user can input a clip index for video A in the box below the display area of video A; in this illustrative example, the user inputs "Person M" as the clip index.
  • the user can then click on the scissors icon to the right of the clip index input box to trigger the clip operation.
  • the terminal device can send the clip index to the server.
  • the server receives the clip index and returns, to the corresponding client on the terminal device, the complete information of the several tags of video A that initially match the clip index.
  • these tags can be displayed to the user on the screen, with different tags displayed on separate lines.
  • time stamp information of the segment corresponding to the tag may also be displayed, for example, the start time and the end time of the segment in the video A.
  • the information in each tag that matches the clip index (such as the "People M" field in this example) can be highlighted, for example, in a special color/font.
  • X1, X2, Y1, and Y2 in Fig. 2(b) are used to refer to other information in the corresponding tag.
  • the user may also obtain further information for the corresponding segment by operation of the tag in Figure 2(b).
  • a tag can be clicked to request the server to send the segment corresponding to the tag.
  • the server may, in response to the request, send the segment to a corresponding client on the terminal device, the client may play the segment to facilitate preview by the user.
  • in Fig. 2(b), a selection control is shown to the left of each tag and a delete control to its right. Based on the complete information of each tag, the user can filter out material that clearly does not fit expectations, and click the "Next" control at the bottom right of the page to request the server to merge the segments corresponding to the selected tags into the final clip and return it to the client.
  • as shown in Fig. 2(c), the final clip is then presented to the user. In one example, the user clicks the display area of the final clip to play it. Controls for downloading, uploading, and the like may also be provided for the final clip, which will not be described again.
  • the client page diagrams shown in FIG. 2(a) to FIG. 2(c) are only examples and do not limit the display content, display manner, or arrangement of the pages in any way.
  • a person skilled in the art can set appropriate display content, display manner, and arrangement as needed.
  • the server/terminal device may obtain the tags corresponding to the respective segments in advance; when a clip request is received, the final clip is obtained in the manner shown in FIG. 1.
  • the server/terminal device may, after receiving a clip request, analyze the video in real time to obtain the tags corresponding to the plurality of video segments, then determine the tags that match the clip index carried in the clip request, and merge the corresponding segments to obtain the final clip.
  • several examples of how to obtain the tags corresponding to the segments of a video are given below.
  • in the case where the tag includes text, the method described in FIG. 1 further includes: dividing the video into the plurality of segments; determining a key frame in each segment; and performing image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
  • different shots in the video may be identified by detecting physical parameters (e.g., feature values) of the video frames, so that the video is split into the plurality of segments at the time stamps of shot switches; the video may also first be divided into multiple small segments by shot, the small segments then clustered, and the small segments belonging to the same cluster aggregated into one scene, so that the video is split into the plurality of segments at the time stamps of scene switches; in another example, the video may be split into a plurality of segments of equal duration, and so on.
  • a person skilled in the art may divide the video into a plurality of segments by any suitable means, which is not limited in the present disclosure.
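One concrete way to realize the shot-switch idea above is a histogram-difference detector. The following sketch is illustrative only: it assumes frames are grayscale numpy arrays and uses a fixed threshold; a production system would decode real frames and tune or learn the threshold:

```python
# Hypothetical sketch: detect shot boundaries via frame-histogram distance.
import numpy as np

def hist(frame, bins=16):
    h, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return h / h.sum()

def shot_boundaries(frames, threshold=0.5):
    """Indices where the histogram distance to the previous frame exceeds
    `threshold` -- treated as shot switches (segment start points)."""
    cuts = []
    for i in range(1, len(frames)):
        d = 0.5 * np.abs(hist(frames[i]) - hist(frames[i - 1])).sum()
        if d > threshold:
            cuts.append(i)
    return cuts

# Two synthetic "shots": five dark frames followed by five bright frames.
rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (48, 64), dtype=np.uint8) for _ in range(5)]
bright = [rng.integers(180, 255, (48, 64), dtype=np.uint8) for _ in range(5)]
print(shot_boundaries(dark + bright))  # [5]: one cut where the shot changes
```

The cut indices then become the segment boundaries used for tagging and merging.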
  • static frame extraction may be employed to determine the key frames in a segment, such as the first-frame method, the last-frame method, the first-and-last-frame method, the pixel-frame-averaging method, or the histogram-averaging method.
  • dynamic key-frame extraction may also be employed, such as key-frame extraction based on cluster analysis, key-frame extraction based on motion analysis, or key-frame extraction based on semantic content (e.g., for video using the MPEG-7 standard).
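As one illustration of the static methods named above, here is a sketch of the pixel-frame-averaging idea: pick the frame closest to the segment's mean frame. Representing frames as numpy arrays is an assumption for the example:

```python
# Hypothetical sketch of pixel-frame-averaging key-frame selection.
import numpy as np

def key_frame_index(frames):
    """Return the index of the frame nearest (in mean absolute pixel
    difference) to the average frame of the segment."""
    stack = np.stack([f.astype(np.float64) for f in frames])
    mean = stack.mean(axis=0)
    dists = [np.abs(f - mean).mean() for f in stack]
    return int(np.argmin(dists))

# Four flat synthetic frames with brightness 10, 50, 52, 90; mean is 50.5.
frames = [np.full((4, 4), v, dtype=np.uint8) for v in (10, 50, 52, 90)]
print(key_frame_index(frames))  # 1: the frame nearest the segment average
```

The first-frame and last-frame methods would simply return index `0` or `len(frames) - 1` instead.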
  • when performing image recognition, the subjects in a key frame may be identified, and/or the behaviors and/or expressions of those subjects, and/or the text information (such as subtitles) and/or symbol information in the key frame, and/or the background of the key frame, and/or the scene of the key frame, and so on; text information related to the identified objects is then provided as the tag or as part of the tag.
  • a person skilled in the art may employ any suitable image recognition means, which is not limited in the present disclosure.
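To show how recognition results could become tag text, a small sketch follows. `recognize` is a hypothetical stand-in for real detection/OCR models, and its output fields are invented for illustration:

```python
# Hypothetical sketch: turn image-recognition results for a key frame into
# the text of the segment's tag.

def recognize(key_frame):
    # Placeholder: a real implementation would run subject detection,
    # behavior/scene classification, and subtitle OCR here.
    return {"subject": "Person M", "behavior": "shooting", "scene": "indoor"}

def tag_text(key_frame):
    """Join the recognized fields into the tag text for the segment."""
    results = recognize(key_frame)
    order = ("subject", "behavior", "scene")
    return ", ".join(results[k] for k in order if k in results)

print(tag_text(None))  # "Person M, shooting, indoor"
```

Tag text produced this way is exactly what the matching step (Step 104) compares against the clip index.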
  • in the case where the tag includes a picture, the method of FIG. 1 further includes: dividing the video into the plurality of segments; and determining a key frame in each segment and using the key frame as the picture of the tag corresponding to the segment that includes the key frame.
  • through the above implementations, the text/picture tag corresponding to each segment can be obtained automatically, which provides great convenience for obtaining the tags corresponding to the segments of a video.
  • FIG. 3 illustrates a structural block diagram of a video editing apparatus 300 according to an exemplary embodiment of the present disclosure.
  • the device can be applied to a server or a terminal device.
  • the apparatus 300 includes a clip index receiving module 302, a matching tag determining module 304, and a segment merging module 306.
  • Clip index receiving module 302 is for receiving a clip index.
  • the matching tag determination module 304 is configured to determine a tag of the plurality of tags of the video that matches the clip index, the plurality of tags corresponding to the plurality of segments of the video.
  • the segment merging module 306 is configured to merge the segments corresponding to the tags that match the clip index to obtain a final clip.
  • the clip index includes at least one of a text and a picture.
  • the tag includes at least one of a text and a picture.
  • the tag includes text; the apparatus 300 further includes: a first video segmentation module (not shown), configured to divide the video into the plurality of segments; a first key frame determining module (not shown), configured to determine a key frame in each segment; and an image recognition module (not shown), configured to perform image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
  • the tag includes a picture; the apparatus 300 further includes: a second video segmentation module (not shown), configured to divide the video into the plurality of segments; and a second key frame determining module (not shown), configured to determine a key frame in each segment and use the key frame as the picture of the tag corresponding to the segment that includes the key frame.
  • FIG. 4 is a block diagram of an apparatus 400 for video editing, according to an exemplary embodiment.
  • device 400 can be provided as a server or a terminal device.
  • apparatus 400 includes a processing component 422 that further includes one or more processors, and memory resources represented by memory 432 for storing instructions executable by processing component 422, such as an application.
  • An application stored in memory 432 may include one or more modules each corresponding to a set of instructions.
  • processing component 422 is configured to execute instructions to perform the methods described above.
  • Device 400 may also include a power supply component 426 configured to perform power management of device 400, a wired or wireless network interface 450 configured to connect device 400 to the network, and an input/output (I/O) interface 458.
  • Device 400 can operate based on an operating system stored in memory 432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
  • a non-transitory computer readable storage medium such as a memory 432 comprising computer program instructions executable by processing component 422 of apparatus 400 to perform the above method.
  • the present disclosure can be a system, method, and/or computer program product.
  • the computer program product can comprise a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can hold and store the instructions used by the instruction execution device.
  • the computer readable storage medium can be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of computer readable storage media includes: a portable computer disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read only memory (CD-ROM), digital versatile disk (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the above.
  • a computer readable storage medium, as used herein, is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
  • the computer readable program instructions described herein can be downloaded from a computer readable storage medium to various computing/processing devices or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in each computing/processing device .
  • computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the computer readable program instructions can execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • in some embodiments, a customized electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized by utilizing state information of the computer readable program instructions, and the electronic circuit can execute the computer readable program instructions to implement various aspects of the present disclosure.
  • the computer readable program instructions can be provided to a processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • the computer readable program instructions can also be stored in a computer readable storage medium that causes a computer, programmable data processing apparatus, and/or other device to operate in a particular manner, such that the computer readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • the computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device, so that a series of operational steps are performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams can represent a module, a program segment, or a portion of instructions that comprises one or more executable instructions for implementing the specified logical functions.
  • in some alternative implementations, the functions noted in the blocks can occur in an order different from that shown in the drawings; for example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present disclosure relates to a video editing method and apparatus. The method includes: receiving a clip index; determining, among a plurality of tags of a video, the tags that match the clip index, the plurality of tags corresponding to a plurality of segments of the video; and merging the segments corresponding to the tags matching the clip index to obtain a final clip. According to the present disclosure, the workload of video editing can be greatly reduced, providing great convenience for video editing.

Description

Video editing method and apparatus
Cross-reference
This application claims priority to Chinese Patent Application No. 201710787710.7, filed on September 4, 2017, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of video editing, and in particular to a video editing method and apparatus.
Background
In the prior art, video segments are usually marked, searched for, and cut out manually, and then spliced together, which involves an enormous labor cost.
Summary
In view of this, the present disclosure proposes a method capable of automatically acquiring clip material according to user requirements. The present disclosure also proposes a corresponding apparatus.
According to one aspect of the present disclosure, a video editing method is provided, the method including: receiving a clip index; determining, among a plurality of tags of a video, the tags that match the clip index, the plurality of tags corresponding to a plurality of segments of the video; and merging the segments corresponding to the tags matching the clip index to obtain a final clip.
In one possible implementation, the clip index includes at least one of text and a picture.
In one possible implementation, the tag includes at least one of text and a picture.
In one possible implementation, the tag includes text; the method further includes: dividing the video into the plurality of segments; determining a key frame in each segment; and performing image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
In one possible implementation, the tag includes a picture; the method further includes: dividing the video into the plurality of segments; and determining a key frame in each segment and using the key frame as the picture of the tag corresponding to the segment that includes the key frame.
According to another aspect of the present disclosure, a video editing apparatus is provided, the apparatus including: a clip index receiving module for receiving a clip index; a matching tag determining module for determining, among a plurality of tags of a video, the tags that match the clip index, the plurality of tags corresponding to a plurality of segments of the video; and a segment merging module for merging the segments corresponding to the tags matching the clip index to obtain a final clip.
In one possible implementation, the clip index includes at least one of text and a picture.
In one possible implementation, the tag includes at least one of text and a picture.
In one possible implementation, the tag includes text; the apparatus further includes: a first video segmentation module for dividing the video into the plurality of segments; a first key frame determining module for determining a key frame in each segment; and an image recognition module for performing image recognition on the key frame to obtain the text of the tag corresponding to the segment that includes the key frame.
In one possible implementation, the tag includes a picture; the apparatus further includes: a second video segmentation module for dividing the video into the plurality of segments; and a second key frame determining module for determining a key frame in each segment and using the key frame as the picture of the tag corresponding to the segment that includes the key frame.
According to another aspect of the present disclosure, an apparatus for video editing is provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the present disclosure, a non-volatile computer readable storage medium is provided, having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above method.
According to the aspects of the present disclosure, the corresponding final clip can be obtained automatically from the clip index, which greatly reduces the workload of video editing and provides great convenience for video editing.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present disclosure together with the description, and serve to explain the principles of the present disclosure.
FIG. 1 shows a flowchart of a video editing method according to an exemplary embodiment of the present disclosure.
FIG. 2(a), FIG. 2(b), and FIG. 2(c) show schematic diagrams of an exemplary application example of the present disclosure.
FIG. 3 shows a structural block diagram of a video editing apparatus according to an exemplary embodiment of the present disclosure.
FIG. 4 shows a structural block diagram of an apparatus for video editing according to an exemplary embodiment of the present disclosure.
Detailed description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements having the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used herein means "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the following detailed description in order to better explain the present disclosure. Those skilled in the art will understand that the present disclosure can be practiced without certain of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the subject matter of the present disclosure.
FIG. 1 shows a flowchart of a video editing method according to an exemplary embodiment of the present disclosure. The method can be applied to a server or a terminal device. As shown in FIG. 1, the method includes the following steps.
Step 102: receive a clip index.
For example, the clip index can be received from a client.
In one possible implementation, the clip index includes text.
In one possible implementation, the clip index includes a picture.
In one possible implementation, the clip index includes both text and a picture.
Step 104: determine, among a plurality of tags of the video, the tags that match the clip index, the plurality of tags corresponding to a plurality of segments of the video.
In one possible implementation, the tag includes text, for example a person's name, the name of a building (such as "monument"), a behavior description (such as "shooting"), a background description (such as "the sea"), a scene description (such as "indoor"), and so on.
In one possible implementation, the tag includes a picture, for example, one or more image frames of the corresponding segment, such as a picture centered on a certain person who appears in it, or a picture of a specific scene, and so on.
In one possible implementation, the tag includes both text and a picture.
For example, if the aforementioned clip index includes the name of an actor, then a tag that includes the actor's name, the name of the character played by the actor, or a picture in which the actor appears can be considered to match the clip index.
For example, if the aforementioned clip index includes a picture centered on a certain person, then a tag that includes a picture centered on that person, or that includes the person's name, can be considered to match the clip index.
In one possible implementation, the tags corresponding to one video segment can match different clip indexes. For example, if the tag corresponding to a segment includes a person's name and a scene description, the tag can be determined to match the clip index either when the clip index includes the person's name or a picture in which the person appears, or when the clip index includes the scene description or a picture in which the scene appears. Likewise, if the tag corresponding to a segment includes a picture of a person in a scene, the tag can be determined to match the clip index either when the clip index includes the person's name or a picture of the person, or when it includes the scene description or a picture of the scene. The above is for illustration only and does not limit the present disclosure in any way; those skilled in the art can decide according to their own needs how to determine whether a tag and a clip index match.
Step 106: merge the segments corresponding to the tags matching the clip index to obtain a final clip.
In one possible implementation, all the segments corresponding to the tags matching the clip index may be merged automatically, for example, merged into one final clip in the order of each segment's time stamp. In one example of this implementation, after the final clip is sent to the user, the user can freely edit it, for example, deleting one or more segments from it, inserting other video segments, or adjusting the order of the segments.
In the above embodiment, the corresponding final clip can be obtained automatically from the clip index, which greatly reduces the labor cost of video editing and provides great convenience for video editing.
在应用本公开的一个示例中,上述实施例应用于服务器。用户对某个电视连续剧的一集进行剪辑,希望得到该剧集中某个人物的出现片段集锦。则用户可在客户端触发针对该剧集、以该人物名称为剪辑索引的剪辑请求。服务器收到该剪辑请求后,可确定该剧集的标签中与该人物名称匹配的10个标签,并根据标签与片段的对应关系确定对应的10个片段,例如,包括位于1分05秒~1分12秒的片段、3分10秒~4分20秒的片段、9分10秒~11分20秒的片段……,然后合并这10个片段以得到剪辑成片,并发送该剪辑成片给客户端。同时发送给客户端的还有相应的拼接信息,以便后续用户对该剪辑成片进行编辑,例如删除其中的一个或多个片段、插入其他的视频片段或调整片段的顺序等。
在一种可能的实现方式中,在接收到剪辑索引后,可向用户展示初步匹 配的标签信息,以便于用户进行筛选以确定最符合期望的剪辑素材。例如,当用户输入的针对某视频的剪辑索引包括某人物名称时,可在客户端向用户展示显示该视频的多个标签中与所述人物名称初步匹配的若干标签,这些匹配的标签中的部分或全部除包括与该人物名称匹配的信息外,还包括相应片段的其他信息,例如场景描述信息、行为描述信息等。所展示的每个标签均可配置有对应的选中控件和/或删除控件。用户可通过该选择控件选择这些标签中的部分或全部作为最终确定的与所述剪辑索引匹配的标签。进一步地,可合并用户所选择的标签对应的片段以得到剪辑成片。
Figs. 2(a), 2(b), and 2(c) show schematic diagrams of an exemplary application example according to the present disclosure. Fig. 2(a) shows a schematic page, in a client on a terminal device, for receiving a clip index entered by the user. The user can enter a clip index for video A in the box below the display area of video A; in this illustrative example, the user enters "person M" as the clip index. The user can then tap the scissors icon to the right of the clip index input box to trigger the clipping operation.
The terminal device may send the clip index to the server. The server receives the clip index and returns, to the corresponding client on the terminal device, the complete information of the several tags, among the tags of video A, that preliminarily match the clip index. As shown in Fig. 2(b), these tags may be displayed to the user on the screen, with different tags on different lines. When a tag is displayed, the timestamp information of its corresponding segment may also be shown, for example the start time and end time of the segment within video A. The information in each tag that matches the clip index (the "person M" field in this example) may be highlighted, for example in a special color or font. X1, X2, Y1, and Y2 in Fig. 2(b) denote the other information in the respective tags.
In one example, the user can also obtain further information about a corresponding segment by operating on a tag in Fig. 2(b). For example, the user may tap a tag to request the server to send the segment corresponding to that tag. In response to the request, the server may send the segment to the corresponding client on the terminal device, and the client may play the segment for the user to preview.
In Fig. 2(b), each tag has a selection control on its left and a deletion control on its right. Based on the complete information of the tag corresponding to each segment, the user can filter out material that clearly does not meet expectations, and then tap the "Next" control in the lower-right corner of the page to request the server to combine the segments corresponding to the selected tags into a finished clip and return it to the client.
As shown in Fig. 2(c), the finished clip is presented to the user. In one example, the user taps the display area of the finished clip to play it. Controls for downloading, uploading, and so on may also be provided for the finished clip, which are not described here one by one.
It should be noted that the schematic client pages shown in Figs. 2(a) to 2(c) are merely examples and do not limit, in any respect, the displayed content, display manner, or layout of the pages. Those skilled in the art may set suitable display content, display manners, and layouts as needed.
In one possible implementation, the server/terminal device may obtain the tags corresponding to the segments in advance, and obtain the finished clip in the manner shown in Fig. 1 upon receiving a clip request.
In one possible implementation, upon receiving a clip request, the server/terminal device may analyze the video in real time to obtain tags corresponding to a plurality of video segments, then determine the tags matching the clip index carried in the clip request, and merge the corresponding segments to obtain the finished clip.
Several examples of how to obtain the tags corresponding to the segments of a video are given below.
In one possible implementation, where the tag includes text, the method of Fig. 1 further includes: splitting the video into the plurality of segments; determining a key frame in each segment; and performing image recognition on the key frame to obtain the text in the tag corresponding to the segment that includes the key frame.
For example, in one example, different shots in the video may be identified by detecting physical parameters (for example, feature values) of the video frames, so that the video is split into the plurality of segments at the timestamps of shot changes. In one example, the video may first be split into many small segments by shot, the small segments may then be clustered, and the small segments belonging to the same cluster may be aggregated into one scene, so that the video is split into the plurality of segments at the timestamps of scene changes. In one example, the video may be split into a plurality of segments of equal duration, and so on. Those skilled in the art may split the video into a plurality of segments by any applicable means; the present disclosure is not limited in this respect.
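As a rough illustration of the first example (splitting at shot changes detected from per-frame feature values), the sketch below compares consecutive frame histograms and cuts wherever the normalized difference exceeds a threshold. The histogram input format and the threshold value are assumptions for illustration, not details from the disclosure.

```python
def split_on_shots(frame_histograms, threshold=0.5):
    """Split a video into segments at shot changes. Each frame is
    represented by a (hypothetical) grayscale histogram given as a
    list of bin counts; returns (start_frame, end_frame) index pairs."""
    def diff(h1, h2):
        # Normalized L1 distance between two histograms, in [0, 1].
        total = sum(h1) or 1
        return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)

    cuts = [0]
    for i in range(1, len(frame_histograms)):
        if diff(frame_histograms[i - 1], frame_histograms[i]) > threshold:
            cuts.append(i)  # shot boundary before frame i
    cuts.append(len(frame_histograms))
    return [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)]
```

In practice the per-frame feature could be any of the "physical parameters" the text mentions; the histogram is just one common choice.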
For example, in one example, static key-frame extraction may be used to determine the key frame in a segment, such as the first-frame method, the last-frame method, the first-and-last-frame method, the pixel-frame averaging method, or the histogram averaging method. In one example, dynamic key-frame extraction may be used to determine the key frame in a segment, such as key-frame extraction based on cluster analysis, key-frame extraction based on motion analysis, or key-frame extraction based on semantic content (for example, for video using the MPEG-7 standard). Those skilled in the art may determine the key frame in a segment by any applicable means; the present disclosure is not limited in this respect.
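The "histogram averaging" method mentioned above can be sketched as picking the frame whose histogram is closest to the segment's mean histogram. The list-of-histograms input is an illustrative assumption about the representation, not the disclosure's implementation.

```python
def keyframe_by_histogram_average(frame_histograms):
    """Return the index of the key frame: the frame whose histogram
    has the smallest squared distance to the segment's average
    histogram (one reading of the 'histogram averaging' method)."""
    n = len(frame_histograms)
    avg = [sum(bin_col) / n for bin_col in zip(*frame_histograms)]

    def dist(h):
        return sum((a - b) ** 2 for a, b in zip(h, avg))

    return min(range(n), key=lambda i: dist(frame_histograms[i]))
```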
For example, during image recognition, the subject in the key frame (such as a person, animal, plant, or building) may be recognized, and/or the action and/or expression of the subject in the key frame, and/or the textual information (such as subtitles) and/or symbol information in the key frame, and/or the background of the key frame, and/or the scene of the key frame, and so on; textual information related to the recognized objects is then provided as the tag or as part of the tag. Those skilled in the art may use any applicable image-recognition means; the present disclosure is not limited in this respect.
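Assembling the recognized items into a text tag might look like the sketch below. The recognizer itself is out of scope here, and the field names (`subjects`, `actions`, and so on) are hypothetical, not taken from the disclosure.

```python
def build_text_tag(recognized):
    """Flatten image-recognition output for a key frame into a flat
    list of tag strings, in a fixed field order. `recognized` is a
    hypothetical dict mapping result categories to string lists."""
    fields = ("subjects", "actions", "captions", "background", "scene")
    tag = []
    for field in fields:
        tag.extend(recognized.get(field, []))
    return tag
```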
In one possible implementation, where the tag includes a picture, the method of Fig. 1 further includes: splitting the video into the plurality of segments; and determining a key frame in each segment and using the key frame as the picture in the tag corresponding to the segment that includes the key frame.
As noted above, those skilled in the art may use any applicable technical means to split the video and to determine the key frames; the present disclosure is not limited in this respect.
Through the above implementations, the text/picture tags corresponding to the segments can be obtained automatically, which greatly facilitates obtaining the tags corresponding to the segments of a video.
Fig. 3 shows a structural block diagram of a video clipping apparatus 300 according to an exemplary embodiment of the present disclosure. The apparatus may be applied to a server or a terminal device. As shown in Fig. 3, the apparatus 300 includes a clip index receiving module 302, a matching tag determination module 304, and a segment merging module 306. The clip index receiving module 302 is configured to receive a clip index. The matching tag determination module 304 is configured to determine, among a plurality of tags of a video, a tag that matches the clip index, the plurality of tags corresponding to a plurality of segments of the video. The segment merging module 306 is configured to merge the segments corresponding to the tags that match the clip index, to obtain a finished clip.
In one possible implementation, the clip index includes at least one of text and a picture.
In one possible implementation, the tag includes at least one of text and a picture.
In one possible implementation, the tag includes text, and the apparatus 300 further includes: a first video splitting module (not shown), configured to split the video into the plurality of segments; a first key frame determination module (not shown), configured to determine a key frame in each segment; and an image recognition module (not shown), configured to perform image recognition on the key frame to obtain the text in the tag corresponding to the segment that includes the key frame.
In one possible implementation, the tag includes a picture, and the apparatus 300 further includes: a second video splitting module (not shown), configured to split the video into the plurality of segments; and a second key frame determination module (not shown), configured to determine a key frame in each segment and use the key frame as the picture in the tag corresponding to the segment that includes the key frame.
Fig. 4 is a block diagram of an apparatus 400 for video clipping according to an exemplary embodiment. For example, the apparatus 400 may be provided as a server or a terminal device. Referring to Fig. 4, the apparatus 400 includes a processing component 422, which further includes one or more processors, and memory resources represented by a memory 432 for storing instructions, such as application programs, executable by the processing component 422. The application programs stored in the memory 432 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 422 is configured to execute the instructions so as to perform the method described above.
The apparatus 400 may also include a power component 426 configured to perform power management of the apparatus 400, a wired or wireless network interface 450 configured to connect the apparatus 400 to a network, and an input/output (I/O) interface 458. The apparatus 400 may operate based on an operating system stored in the memory 432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, there is also provided a non-volatile computer-readable storage medium, such as the memory 432 including computer program instructions, which are executable by the processing component 422 of the apparatus 400 to carry out the method described above.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable storage medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
Computer-readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the scenario involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA), may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to carry out aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowchart and/or block diagram. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowchart and/or block diagram.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device, to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowchart and/or block diagram.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

  1. A video clipping method, characterized in that the method comprises:
    receiving a clip index;
    determining, among a plurality of tags of a video, a tag that matches the clip index, the plurality of tags corresponding to a plurality of segments of the video;
    merging the segments corresponding to the tags that match the clip index, to obtain a finished clip.
  2. The method according to claim 1, characterized in that the clip index comprises at least one of text and a picture.
  3. The method according to claim 1, characterized in that the tag comprises at least one of text and a picture.
  4. The method according to claim 3, characterized in that the tag comprises text;
    the method further comprises:
    splitting the video into the plurality of segments;
    determining a key frame in each of the segments;
    performing image recognition on the key frame to obtain the text in the tag corresponding to the segment that includes the key frame.
  5. The method according to claim 3, characterized in that the tag comprises a picture;
    the method further comprises:
    splitting the video into the plurality of segments;
    determining a key frame in each of the segments, and using the key frame as the picture in the tag corresponding to the segment that includes the key frame.
  6. A video clipping apparatus, characterized in that the apparatus comprises:
    a clip index receiving module, configured to receive a clip index;
    a matching tag determination module, configured to determine, among a plurality of tags of a video, a tag that matches the clip index, the plurality of tags corresponding to a plurality of segments of the video;
    a segment merging module, configured to merge the segments corresponding to the tags that match the clip index, to obtain a finished clip.
  7. The apparatus according to claim 6, characterized in that the clip index comprises at least one of text and a picture.
  8. The apparatus according to claim 6, characterized in that the tag comprises at least one of text and a picture.
  9. The apparatus according to claim 8, characterized in that the tag comprises text;
    the apparatus further comprises:
    a first video splitting module, configured to split the video into the plurality of segments;
    a first key frame determination module, configured to determine a key frame in each of the segments;
    an image recognition module, configured to perform image recognition on the key frame to obtain the text in the tag corresponding to the segment that includes the key frame.
  10. The apparatus according to claim 8, characterized in that the tag comprises a picture;
    the apparatus further comprises:
    a second video splitting module, configured to split the video into the plurality of segments;
    a second key frame determination module, configured to determine a key frame in each of the segments, and use the key frame as the picture in the tag corresponding to the segment that includes the key frame.
  11. An apparatus for video clipping, characterized by comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to perform the method according to any one of claims 1 to 5.
  12. A non-volatile computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 5.
PCT/CN2018/103148 2017-09-04 2018-08-30 Video clipping method and apparatus WO2019042341A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710787710 2017-09-04
CN201710787710.7 2017-09-04

Publications (1)

Publication Number Publication Date
WO2019042341A1 true WO2019042341A1 (zh) 2019-03-07

Family

ID=65526161


Country Status (2)

Country Link
CN (1) CN110019880A (zh)
WO (1) WO2019042341A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110401878A * 2019-07-08 2019-11-01 天脉聚源(杭州)传媒科技有限公司 Video clipping method, system, and storage medium
CN110933460A * 2019-12-05 2020-03-27 腾讯科技(深圳)有限公司 Video splicing method and apparatus, and computer storage medium
CN111538896A * 2020-03-12 2020-08-14 成都云帆数联科技有限公司 Deep-learning-based method for intelligent extraction of fine-grained tags from news video
CN111639228A * 2020-05-29 2020-09-08 北京百度网讯科技有限公司 Video retrieval method, apparatus, device, and storage medium
CN111695505A * 2020-06-11 2020-09-22 北京市商汤科技开发有限公司 Video processing method and apparatus, electronic device, and storage medium
CN113709560A * 2021-03-31 2021-11-26 腾讯科技(深圳)有限公司 Video clipping method, apparatus, device, and storage medium
CN113905274A * 2021-09-30 2022-01-07 安徽尚趣玩网络科技有限公司 EC-identifier-based video material splicing method and apparatus
CN115396627A * 2022-08-24 2022-11-25 易讯科技股份有限公司 Positioning management method and system for screen-recorded video conferences

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110381371B * 2019-07-30 2021-08-31 维沃移动通信有限公司 Video clipping method and electronic device
CN110534113B * 2019-08-26 2021-08-24 深圳追一科技有限公司 Audio data desensitization method, apparatus, device, and storage medium
CN110611846A * 2019-09-18 2019-12-24 安徽石轩文化科技有限公司 Automatic short-video clipping method
CN111182327B * 2020-02-12 2022-04-22 北京达佳互联信息技术有限公司 Video clipping method and apparatus, video distribution server, and terminal
CN111246289A * 2020-03-09 2020-06-05 Oppo广东移动通信有限公司 Video generation method and apparatus, electronic device, and storage medium
CN112423113A * 2020-11-20 2021-02-26 广州欢网科技有限责任公司 Television program marking method and apparatus, and electronic terminal
CN112423115A * 2020-11-20 2021-02-26 广州欢网科技有限责任公司 Bonus-footage video clipping method and system
CN114302253B * 2021-11-25 2024-03-12 北京达佳互联信息技术有限公司 Media data processing method, apparatus, device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101646050A * 2009-09-09 2010-02-10 中国电信股份有限公司 Text annotation method and system, and playback method and system, for video files
CN105144740A * 2013-05-20 2015-12-09 英特尔公司 Elastic cloud video editing and multimedia search
CN105657537A * 2015-12-23 2016-06-08 小米科技有限责任公司 Video clipping method and apparatus
US20170017658A1 * 2015-07-14 2017-01-19 Verizon Patent And Licensing Inc. Automated media clipping and combination system
CN107704525A * 2017-09-04 2018-02-16 优酷网络技术(北京)有限公司 Video search method and apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7447337B2 (en) * 2004-10-25 2008-11-04 Hewlett-Packard Development Company, L.P. Video content understanding through real time video motion analysis
CN101620629A * 2009-06-09 2010-01-06 中兴通讯股份有限公司 Method and apparatus for extracting a video index, and video download system
US9620168B1 (en) * 2015-12-21 2017-04-11 Amazon Technologies, Inc. Cataloging video and creating video summaries
US20170220869A1 (en) * 2016-02-02 2017-08-03 Verizon Patent And Licensing Inc. Automatic supercut creation and arrangement



Also Published As

Publication number Publication date
CN110019880A (zh) 2019-07-16


Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18851578

Country of ref document: EP

Kind code of ref document: A1