WO2020253806A1 - Method, Apparatus, Device and Storage Medium for Generating a Display Video - Google Patents

Method, Apparatus, Device and Storage Medium for Generating a Display Video

Info

Publication number
WO2020253806A1
WO2020253806A1 · PCT/CN2020/096969 · CN2020096969W
Authority
WO
WIPO (PCT)
Prior art keywords
displayed
content
beat
pictures
music
Prior art date
Application number
PCT/CN2020/096969
Other languages
English (en)
French (fr)
Inventor
黄晨婕 (HUANG Chenjie)
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Publication of WO2020253806A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0276 Advertisement creation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Definitions

  • The present disclosure relates to the field of Internet technology, for example, to a method, an apparatus, a device, and a storage medium for generating a display video.
  • Advertising is a means of publicity that conveys information to the public openly and widely, through some form of media, for a specific need.
  • The embodiments of the present disclosure provide a method, an apparatus, a device, and a storage medium for generating a display video, so as to reduce the cost of generating the display video and improve its quality.
  • the embodiment of the present disclosure provides a method for generating a display video, including:
  • acquiring data, where the data includes one of the following: at least two pictures of the content to be displayed; or at least two pictures of the content to be displayed and characteristic information of the content to be displayed;
  • determining, according to the acquired data, a music segment matching the content to be displayed;
  • performing feature extraction on the music segment to obtain beat information of the music segment, where the beat information includes at least two beat points;
  • generating a display video from the at least two pictures of the content to be displayed and the music segment matching the content to be displayed, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
  • the embodiment of the present disclosure also provides a device for generating a display video, including:
  • the characteristic information acquiring module is configured to acquire data, wherein the data includes one of the following: at least two pictures of the content to be displayed; at least two pictures of the content to be displayed and characteristic information of the content to be displayed;
  • a music segment determining module, configured to determine a music segment matching the content to be displayed according to the acquired data;
  • the beat information acquisition module is configured to perform feature extraction on the music fragment to obtain beat information of the music fragment, wherein the beat information includes at least two beat points;
  • a display video generation module, configured to generate a display video from the at least two pictures of the content to be displayed and the music segment matching the content to be displayed, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
  • An embodiment of the present disclosure also provides an electronic device, which includes:
  • one or more processing devices;
  • a storage device configured to store one or more programs;
  • when the one or more programs are executed by the one or more processing devices, the one or more processing devices implement the display video generation method according to the embodiments of the present disclosure.
  • the embodiment of the present disclosure also provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, the method for generating a display video as described in the embodiment of the present disclosure is realized.
  • FIG. 1 is a flowchart of a method for generating a display video provided by Embodiment 1 of the present disclosure
  • FIG. 2 is a schematic structural diagram of an apparatus for generating a display video provided by Embodiment 2 of the present disclosure
  • FIG. 3 is a schematic structural diagram of an electronic device provided in the third embodiment of the present disclosure.
  • FIG. 1 is a flowchart of a method for generating a display video according to Embodiment 1 of the present disclosure. This embodiment is applicable to a situation where a display video is generated based on a picture of a content to be displayed.
  • the method can be executed by a display video generation device
  • the device can be composed of hardware and/or software, and is generally integrated in electronic equipment. As shown in Figure 1, the method includes the following steps:
  • Step 110: Acquire at least two pictures of the content to be displayed and/or characteristic information of the content to be displayed.
  • The content to be displayed may be content that needs to be promoted, such as commodities, concerts, competitions, film and television dramas, and tourist attractions.
  • the characteristic information of the content to be displayed may include category information of the content to be displayed, information about the owner of the content to be displayed, and delivery data of the content to be displayed.
  • The owner information of the content to be displayed may be the producer of the content, such as the manufacturer of a commodity, the organizer of a concert, or the producer of a film or television drama; the delivery data of the content to be displayed may be the consumption volume, delivery volume, click volume, and so on after the content was initially delivered.
  • To promote the content to be displayed, the user uploads at least two pictures of the content and the characteristic information of the content.
  • Step 120 Determine a music segment matching the content to be displayed according to the acquired at least two pictures of the content to be displayed and/or characteristic information of the content to be displayed.
  • the music clip is used as background music for the display video.
  • Features of the content to be displayed are obtained by performing feature extraction on the at least two pictures, and a matching music segment is obtained according to those features. Alternatively, the music segment matching the content to be displayed is determined directly from the uploaded characteristic information, or from both the at least two pictures and the characteristic information.
  • Determining a music segment that matches the content to be displayed may be implemented as follows: perform feature extraction on the at least two pictures to obtain a first feature vector; generate a second feature vector according to the characteristic information; and input the first feature vector and/or the second feature vector into a set neural network model to obtain a music segment matching the content to be displayed.
  • The set neural network may be a deep neural network (DNN) or a convolutional neural network (CNN). In this embodiment, the set neural network is assumed to have the ability to output a music segment matching the content to be displayed according to the input first feature vector and/or second feature vector.
  • the manner of performing feature extraction on at least two pictures may be to input at least two pictures into a feature extraction neural network to perform feature extraction, so as to obtain the first feature vector corresponding to the at least two pictures.
  • the method of generating the second feature vector according to the feature information may be to obtain vector elements corresponding to the feature information, and then form the second feature vector.
  • After the first feature vector and the second feature vector are obtained, both vectors, or one of them, are input into the set neural network to obtain a music segment matching the content to be displayed.
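The matching step above can be sketched in Python. This is a minimal illustration, not the patent's actual model: the set neural network is stood in for by a plain dot-product score against a small library of candidate music embeddings, and every function name, vector, and shape here is an assumption.

```python
def match_music(first_vec, second_vec, music_embeddings):
    """Concatenate the picture feature vector and the characteristic-info
    vector (either may be absent), then return the id of the candidate
    music segment whose embedding scores highest against the query."""
    query = (first_vec or []) + (second_vec or [])

    def score(embedding):
        # A dot product stands in for the trained network's matching score.
        return sum(q * e for q, e in zip(query, embedding))

    return max(music_embeddings, key=lambda mid: score(music_embeddings[mid]))

# Hypothetical 4-dimensional embeddings for two candidate segments.
library = {"calm": [1.0, 0.0, 0.0, 0.0], "upbeat": [0.0, 1.0, 0.0, 0.0]}
best = match_music([0.2, 0.9], [0.0, 0.0], library)  # → "upbeat"
```

In a real system the library and the query encoder would both be learned; here the candidate that aligns best with the query vector simply wins.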
  • Step 130 Perform feature extraction on the music fragment to obtain beat information of the music fragment, where the beat information includes at least two beat points.
  • the feature extraction is performed on the music segment to obtain the beat information of the music segment.
  • The beat information of the music segment may be obtained as follows: use the Mel-Frequency Cepstral Coefficients (MFCC) algorithm to perform feature extraction on the music segment and obtain accent points satisfying a set condition; acquire a group of accent points in which the time interval between adjacent accent points is within a set range; and determine that group of accent points as the beat information of the music segment.
  • the beat information includes at least two beat points, and the at least two beat points have a one-to-one correspondence with the accent points in the group.
  • the accent points satisfying the set condition may be music points whose sound frequency exceeds a preset threshold.
  • a group of accent points whose time intervals between adjacent accent points are within a set range can be understood as the same or similar time intervals between adjacent accent points.
  • In this embodiment, the MFCC algorithm is used to extract the accent points in the music segment; a group of accent points with the same or similar time intervals between adjacent accent points is then obtained, and that group is regarded as the beat information of the music segment.
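The accent-point selection described above (threshold the accent points, then keep a run whose adjacent intervals are nearly uniform) can be sketched as follows. This is an illustrative sketch under stated assumptions: the MFCC feature extraction itself is omitted, and the input is assumed to be a list of (time, frequency) accent candidates that it has already produced.

```python
def select_beat_points(accent_points, freq_threshold, interval_tolerance):
    """Return the longest run of accent times whose sound frequency exceeds
    the preset threshold and whose adjacent time gaps are nearly uniform."""
    # Accent points satisfying the set condition: frequency above the threshold.
    strong = sorted(t for t, freq in accent_points if freq > freq_threshold)
    if len(strong) < 2:
        return strong
    runs = [[strong[0]]]
    for prev, cur in zip(strong, strong[1:]):
        run = runs[-1]
        # Compare this gap against the run's first gap; start a new run on mismatch.
        if len(run) == 1 or abs((cur - prev) - (run[1] - run[0])) <= interval_tolerance:
            run.append(cur)
        else:
            runs.append([prev, cur])
    return max(runs, key=len)  # the group taken as the beat information
```

With accent candidates at 0.0, 0.5, 1.0, 1.5 and 2.0 seconds plus one weak outlier, the function keeps the uniformly spaced group and discards the outlier.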
  • Step 140 Generate a display video based on at least two pictures of the content to be displayed and a music segment matching the content to be displayed.
  • the time point of each picture presented in the display video corresponds to the beat point in the beat information.
  • After the beat information of the music segment is obtained, the at least two pictures are placed on the beat points in the beat information according to a set sequence, and the pictures placed on the beat points are merged with the music segment to obtain the display video.
  • Setting at least two pictures on the beat points in the beat information according to the set sequence can be understood as a picture corresponding to each beat point in the beat information.
  • the setting sequence can be the upload sequence of the pictures or the shooting time sequence marked in the pictures, which is not limited here.
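The placement rule can be sketched as pairing pictures, in their set order, with sorted beat timestamps. The sketch assumes one picture per beat point (the counts have already been equalized); the names and the dict layout are illustrative, not the patent's.

```python
def schedule_pictures(pictures, beat_points):
    """Pair each picture, in the set sequence (e.g. upload order), with a
    beat timestamp; assumes the counts have already been equalized."""
    if len(pictures) != len(beat_points):
        raise ValueError("one picture is expected per beat point")
    return [
        {"picture": pic, "show_at": t}
        for pic, t in zip(pictures, sorted(beat_points))
    ]

timeline = schedule_pictures(["p1.jpg", "p2.jpg"], [1.0, 0.5])
# timeline: p1.jpg shown at 0.5 s, p2.jpg shown at 1.0 s
```

A video compositor would then cut to each picture at its `show_at` time while the music segment plays underneath.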
  • the method further includes the following step: adding a set playing special effect to the at least two pictures.
  • The set playing special effects may include effects such as the picture entering from left to right, rotating into view, or entering from top to bottom.
  • a set playing special effect is added to at least two pictures, so that when the display video is played, the pictures in the display video are played according to the set playing special effect, which increases the interest of the display video.
  • Before placing the at least two pictures on the beat points in the set order, the method further includes: if the number of beat points in the beat information is greater than the number of pictures, cutting the music segment so that the number of beat points equals the number of pictures; if the number of beat points in the beat information is less than the number of pictures, copying a music sub-segment from the music segment and splicing it with the music segment to form a new music segment, so that the number of beat points contained in the new music segment equals the number of pictures.
  • the way of cutting the music segment can be to start cutting from the beginning or the end of the music segment, and the size of the cut segment can be determined according to the number of beat points and the number of pictures.
  • the length of the music sub-segment can be determined according to the number of beats and the number of pictures.
  • The music sub-segment may be copied from the beginning or the end of the music segment, with its length determined by the number of beat points and the number of pictures. The advantage of this is that the number of pictures matches the length of the music segment.
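The count-equalizing step can be sketched at the level of the beat list rather than raw audio: assuming a roughly constant beat interval, cutting drops trailing beats, and extending appends beats continuing at that interval, standing in for splicing a copied sub-segment. The names and these simplifications are mine, not the patent's.

```python
def fit_beats_to_pictures(beat_points, num_pictures, beat_interval):
    """Return a beat list whose length equals the number of pictures."""
    beats = list(beat_points)
    if len(beats) > num_pictures:
        # Cutting the music segment: drop the surplus trailing beats.
        return beats[:num_pictures]
    while len(beats) < num_pictures:
        # Splicing a copied sub-segment: continue at the same interval.
        beats.append(beats[-1] + beat_interval)
    return beats
```

The patent guarantees at least two beat points, so the list is never empty when extension starts.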
  • In the technical solution of this embodiment, at least two pictures of the content to be displayed and/or characteristic information of the content to be displayed are first acquired, and a music segment matching the content to be displayed is determined according to them.
  • Feature extraction is then performed on the matching music segment to obtain its beat information, and finally a display video is generated from the at least two pictures of the content to be displayed and the matching music segment.
  • The display video generation method provided by the embodiments of the present disclosure obtains the beat information of the music segment matching the content to be displayed and generates the display video from at least two pictures and the music segment, which can reduce the cost of display video generation and improve the quality of the display video.
  • Before performing feature extraction on the at least two pictures to obtain the first feature vector, the method further includes: obtaining a display video sample set; extracting, for each display video in the sample set, the first feature vector corresponding to its video frames and/or the second feature vector corresponding to its characteristic information; for each display video, inputting the first feature vector and/or the second feature vector into the set neural network to obtain an initial music segment; and adjusting the parameters of the set neural network according to a loss function between the initial music segment and the music segment in the display video, so as to train the set neural network.
  • the display video in the display video sample set may be a published video.
  • the process of extracting the first feature vector corresponding to the video frame of each display video may be to input all or part of the video frames included in the display video into the feature extraction neural network to obtain the first feature vector of the current display video.
  • the method for extracting the second feature vector corresponding to the feature information of each displayed video may be to generate the second feature vector according to feature information such as category information of the currently displayed video, owner information, and placement data.
  • After the first feature vector and/or second feature vector of the current display video are input into the set neural network, an initial music segment is obtained; the loss function between the initial music segment and the music segment in the current display video is then calculated, back-propagated through the set neural network, and used to adjust the parameters of the set neural network so as to train it.
  • Training the set neural network on the display video sample set can improve its recognition accuracy.
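The training loop just described (forward pass, loss between the predicted and actual music segment, back-propagation, parameter update) can be sketched with a deliberately tiny stand-in model: a single weight, a 1-D feature, and a squared-error loss. This only illustrates the mechanism; the patent does not specify its network, loss function, or embeddings at this level of detail.

```python
def train(samples, lr=0.1, epochs=200):
    """Fit w so that w * feature approximates the target music embedding."""
    w = 0.0  # single parameter standing in for the set neural network
    for _ in range(epochs):
        for x, target in samples:
            pred = w * x                      # forward pass: initial "music segment"
            grad = 2.0 * (pred - target) * x  # gradient of the squared-error loss
            w -= lr * grad                    # back-propagate and adjust the parameter
    return w

# Published-video samples with the hypothetical relation: embedding = 2 * feature.
w = train([(1.0, 2.0), (2.0, 4.0)])
```

After training, `w` converges to 2.0, the relation present in the samples; a real system would do the same update over millions of parameters.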
  • FIG. 2 is a schematic structural diagram of an apparatus for generating a display video provided by the second embodiment of the disclosure.
  • the device includes: a feature information acquisition module 210, a music segment determination module 220, a beat information acquisition module 230, and a display video generation module 240.
  • the feature information obtaining module 210 is configured to obtain at least two pictures of the content to be displayed and/or feature information of the content to be displayed.
  • the music segment determining module 220 is configured to determine a music segment matching the content to be displayed according to the acquired at least two pictures of the content to be displayed and/or characteristic information of the content to be displayed.
  • the beat information acquisition module 230 is configured to perform feature extraction on the music fragment to obtain beat information of the music fragment, and the beat information includes at least two beat points.
  • the display video generation module 240 is configured to generate a display video based on at least two pictures of the content to be displayed and a music clip matching the content to be displayed, wherein the time point at which each picture is presented in the display video and the beat point in the beat information correspond.
  • the characteristic information of the content to be displayed includes category information of the content to be displayed, information about the owner of the content to be displayed, and delivery data of the content to be displayed.
  • the music segment determining module 220 is set to:
  • the beat information acquisition module 230 is set to:
  • the beat information of the music segment includes at least two beat points, and the at least two beat points correspond one-to-one with the accent points in the group.
  • the display video generation module 240 is set to:
  • At least two pictures are set on the beat points in the beat information according to the set sequence; at least two pictures and music clips set on the beat points are merged to obtain a display video.
  • Optionally, the apparatus also includes:
  • a music segment adjustment module, configured to:
  • if the number of beat points in the beat information is greater than the number of pictures, cut the music segment so that the number of beat points equals the number of pictures; if the number of beat points in the beat information is less than the number of pictures, copy a music sub-segment from the music segment and splice it with the music segment to form a new music segment, so that the number of beat points contained in the new music segment equals the number of pictures.
  • Optionally, the apparatus also includes a set neural network training module, configured to:
  • obtain a display video sample set; extract the first feature vector corresponding to the video frames of each display video in the sample set and/or the second feature vector corresponding to its characteristic information; for each display video, input the first feature vector and/or the second feature vector into the set neural network to obtain an initial music segment, where the set neural network includes a deep neural network or a convolutional neural network; and adjust the parameters of the set neural network according to a loss function between the initial music segment and the music segment in the display video, so as to train the set neural network.
  • the foregoing device can execute the methods provided in all the foregoing embodiments of the present disclosure, and has functional modules and effects corresponding to the foregoing methods. For technical details not described in this embodiment, refer to the methods provided in all the foregoing embodiments of the present disclosure.
  • FIG. 3 shows a schematic structural diagram of an electronic device 300 suitable for implementing embodiments of the present disclosure.
  • The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), in-vehicle terminals (for example, in-vehicle navigation terminals), digital televisions (TVs), and desktop computers, as well as servers in multiple forms, such as independent servers or server clusters.
  • The electronic device 300 may include a processing device 301 (such as a central processing unit or a graphics processor), which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 305 into a random access memory (RAM) 303.
  • In the RAM 303, various programs and data required for the operation of the electronic device 300 are also stored.
  • the processing device 301, ROM 302, and RAM 303 are connected to each other through a bus 304.
  • An input/output (Input/Output, I/O) interface 305 is also connected to the bus 304.
  • The following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 307 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 308 including, for example, magnetic tapes and hard disks; and a communication device 309.
  • the communication device 309 may allow the electronic device 300 to perform wireless or wired communication with other devices to exchange data.
  • Although FIG. 3 shows an electronic device 300 having multiple devices, it is not required to implement or provide all of the devices shown; more or fewer devices may alternatively be implemented or provided.
  • An embodiment of the present disclosure includes a computer program product including a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 309, or installed from the storage device 305, or installed from the ROM 302.
  • When the computer program is executed by the processing device 301, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the aforementioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein.
  • This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate or transmit the program for use by or in combination with the instruction execution system, apparatus, or device .
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, radio frequency (RF), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs which, when executed by the processing device, cause the electronic device to: acquire at least two pictures of the content to be displayed and/or characteristic information of the content to be displayed; determine, according to the acquired pictures and/or characteristic information, a music segment matching the content to be displayed; perform feature extraction on the music segment to obtain its beat information, where the beat information includes at least two beat points; and generate a display video from the at least two pictures of the content to be displayed and the matching music segment, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
  • the computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof.
  • The above-mentioned programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams may represent a module, program segment, or part of code, which contains one or more executable instructions for realizing the specified logical function.
  • The functions marked in the blocks may also occur in an order different from that marked in the drawings; for example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments described in the present disclosure may be implemented in software or in hardware. In some cases, the name of a module does not constitute a limitation on the module itself.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Accounting & Taxation (AREA)
  • Acoustics & Sound (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed herein are a method, an apparatus, a device, and a storage medium for generating a display video. The method includes: acquiring data, where the data includes one of the following: at least two pictures of content to be displayed; or at least two pictures of the content to be displayed and characteristic information of the content to be displayed; determining, according to the acquired data, a music segment matching the content to be displayed; performing feature extraction on the music segment to obtain beat information of the music segment, where the beat information includes at least two beat points; and generating a display video from the at least two pictures of the content to be displayed and the music segment matching the content to be displayed, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.

Description

Method, Apparatus, Device and Storage Medium for Generating a Display Video
This application claims priority to Chinese patent application No. 201910532395.2, filed with the Chinese Patent Office on June 19, 2019, the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of Internet technology, for example, to a method, an apparatus, a device, and a storage medium for generating a display video.
Background
Advertising is a means of publicity that conveys information to the public openly and widely, through some form of media, for a specific need. There are many kinds of advertisements, among which video advertisements have attracted wide attention for their good dissemination effect.
In the related art, an advertising video is usually produced by the advertiser shooting a video of the commodity to be promoted. This approach not only costs a great deal of manpower and material resources, but the resulting video may also be unsatisfactory, which indirectly affects the promotion of the commodity.
Summary
The embodiments of the present disclosure provide a method, an apparatus, a device, and a storage medium for generating a display video, so as to reduce the cost of generating the display video and improve its quality.
An embodiment of the present disclosure provides a method for generating a display video, including:
acquiring data, where the data includes one of the following: at least two pictures of content to be displayed; or at least two pictures of the content to be displayed and characteristic information of the content to be displayed;
determining, according to the acquired data, a music segment matching the content to be displayed;
performing feature extraction on the music segment to obtain beat information of the music segment, where the beat information includes at least two beat points;
generating a display video from the at least two pictures of the content to be displayed and the music segment matching the content to be displayed, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
本公开实施例还提供了一种展示视频的生成装置,包括:
特征信息获取模块,设置为获取数据,其中,所述数据包括以下之一:待展示内容的至少两张图片;待展示内容的至少两张图片和待展示内容的特征信 息;
音乐片段确定模块,设置为根据获取的数据确定与所述待展示内容匹配的音乐片段;
节拍信息获取模块,设置为对所述音乐片段进行特征提取,获得所述音乐片段的节拍信息,其中,所述节拍信息包括至少两个节拍点;
展示视频生成模块,设置为根据所述待展示内容的至少两张图片和与所述待展示内容匹配的音乐片段生成展示视频,其中,每张图片在所述展示视频中呈现的时间点与所述节拍信息中的节拍点对应。
本公开实施例还提供了一种电子设备,所述电子设备包括:
一个或多个处理装置;
存储装置,设置为存储一个或多个程序;
当所述一个或多个程序被所述一个或多个处理装置执行,使得所述一个或多个处理装置实现如本公开实施例所述的展示视频的生成方法。
本公开实施例还提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理装置执行时实现如本公开实施例所述的展示视频的生成方法。
附图说明
Fig. 1 is a flowchart of a display video generation method provided in Embodiment One of the present disclosure;
Fig. 2 is a structural schematic diagram of a display video generation apparatus provided in Embodiment Two of the present disclosure;
Fig. 3 is a structural schematic diagram of an electronic device provided in Embodiment Three of the present disclosure.
具体实施方式
The present disclosure is described below with reference to the drawings and embodiments. The specific embodiments described here are intended only to explain the present disclosure, not to limit it. For ease of description, the drawings show only the parts related to the present disclosure rather than the entire structure.
实施例一
Fig. 1 is a flowchart of a display video generation method provided in Embodiment One of the present disclosure. This embodiment is applicable to generating a display video from pictures of content to be displayed. The method may be executed by a display video generation apparatus, which may be composed of hardware and/or software and is generally integrated in an electronic device. As shown in Fig. 1, the method includes the following steps.
Step 110: acquire at least two pictures of the content to be displayed and/or feature information of the content to be displayed.
The content to be displayed may be content that needs promotion, such as a product, a concert, a sports event, a film or TV drama, or a tourist attraction. The feature information of the content to be displayed may include category information, owner information and delivery data of the content. The owner information may identify the party that produced the content, such as the manufacturer of a product, the organizer of a concert, or the producer of a film or TV drama; the delivery data may be the consumption, delivery volume, click volume and the like of the content after an earlier delivery.
To promote the content to be displayed, the user uploads at least two pictures of the content as well as the feature information of the content.
Step 120: determine, according to the acquired at least two pictures of the content to be displayed and/or the feature information of the content to be displayed, a music clip matching the content to be displayed.
The music clip serves as the background music of the display video. Feature extraction is performed on the at least two pictures to obtain features of the content to be displayed, and a matching music clip is acquired according to those features. Alternatively, the matching music clip is determined directly according to the uploaded feature information. Alternatively, the matching music clip is determined according to both the at least two pictures and the feature information.
In this embodiment, determining the music clip matching the content to be displayed according to the acquired at least two pictures and/or the feature information may be implemented as follows: perform feature extraction on the at least two pictures to obtain a first feature vector; generate a second feature vector according to the feature information; and input the first feature vector and/or the second feature vector into a preset neural network model to obtain the music clip matching the content to be displayed.
The preset neural network may be a deep neural network (DNN) or a convolutional neural network (CNN). In this embodiment, the preset neural network is capable of outputting a music clip matching the content to be displayed according to the input first feature vector and/or second feature vector.
In this embodiment, the feature extraction on the at least two pictures may be performed by inputting the at least two pictures into a feature extraction neural network, thereby obtaining the first feature vector corresponding to the at least two pictures. The second feature vector may be generated by acquiring vector elements corresponding to the feature information and assembling them into the second feature vector.
After the first feature vector and the second feature vector are obtained, both vectors, or one of the two, are input into the preset neural network to obtain the music clip matching the content to be displayed.
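The vector assembly and combination described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field encoding (one scalar per feature-information field) and the function names `build_second_feature_vector` and `combine_features` are assumptions for the sketch, and the picture-derived first vector is taken as already computed by a feature extraction network.

```python
def build_second_feature_vector(category_id, owner_id, delivery_stats):
    """Assemble vector elements derived from the feature information
    (category, owner, delivery data) into a single second feature vector.
    Encoding each field as one scalar is a simplification for illustration."""
    return [float(category_id), float(owner_id)] + [float(v) for v in delivery_stats]

def combine_features(first_vec, second_vec=None):
    """Combine the picture-derived first feature vector with the optional
    second feature vector before feeding the matching model; when only the
    pictures are available, the first vector is used alone."""
    if second_vec is None:
        return list(first_vec)
    return list(first_vec) + list(second_vec)
```

In practice the combined vector would be passed to the preset neural network model that outputs the matching music clip.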
Step 130: perform feature extraction on the music clip to obtain beat information of the music clip, where the beat information includes at least two beat points.
In this embodiment, the feature extraction on the music clip to obtain its beat information may be performed as follows: use the Mel-frequency cepstrum coefficients (MFCC) algorithm to extract features from the music clip and obtain accent points satisfying a preset condition; then acquire a group of accent points in which the time intervals between adjacent accent points fall within a preset range, and determine this group of accent points as the beat information of the music clip.
The beat information includes at least two beat points, and the at least two beat points correspond one-to-one to the accent points in the group. An accent point satisfying the preset condition may be a point in the music whose sound frequency exceeds a preset threshold. A group of accent points whose adjacent time intervals fall within a preset range can be understood as accent points whose adjacent intervals are identical or similar.
After the music clip matching the content to be displayed is obtained, the MFCC algorithm is used to extract the accent points in the music clip, a group of accent points whose adjacent time intervals are identical or similar is then acquired, and this group of accent points is taken as the beat information of the music clip.
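The grouping step above, selecting accent points with identical or similar spacing, can be sketched in isolation. This is a hedged illustration: the function name `pick_beat_group`, the tolerance value, and the greedy longest-run strategy are assumptions, and the actual MFCC-based accent detection is taken as already done (the input is just a sorted list of accent timestamps in seconds).

```python
def pick_beat_group(accent_times, tolerance=0.05):
    """Return the longest run of accent timestamps whose adjacent intervals
    differ by at most `tolerance` seconds, i.e. a group of accent points
    with identical or similar spacing, usable as beat information."""
    if len(accent_times) < 2:
        return list(accent_times)
    best = cur = [accent_times[0], accent_times[1]]
    for t in accent_times[2:]:
        prev_gap = cur[-1] - cur[-2]
        if abs((t - cur[-1]) - prev_gap) <= tolerance:
            cur = cur + [t]        # interval matches: extend the current run
        else:
            cur = [cur[-1], t]     # interval breaks: start a new candidate run
        if len(cur) > len(best):
            best = cur
    return best
```

For example, timestamps spaced evenly at 0.5 s survive as one group, while an outlier far from that spacing is excluded.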
Step 140: generate the display video according to the at least two pictures of the content to be displayed and the music clip matching the content to be displayed.
The time point at which each picture is presented in the display video corresponds to a beat point in the beat information. After the beat information of the music clip is obtained, the at least two pictures are placed on the beat points in the beat information in a preset order, and the at least two pictures placed on the beat points are merged with the music clip to obtain the display video. Placing the at least two pictures on the beat points in a preset order can be understood as each beat point in the beat information corresponding to one picture. The preset order may be the upload order of the pictures or the order of the shooting times marked in the pictures, which is not limited here.
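The placement of pictures on beat points can be sketched as building a simple timeline. This is an assumed representation, not the patent's data model: the function name `schedule_pictures` and the dict-per-entry timeline are illustrative, and it assumes the number of pictures already equals the number of beat points (the adjustment step is described below).

```python
def schedule_pictures(picture_ids, beat_times):
    """Place pictures on beat points in a set order: picture i appears at
    beat i and holds until beat i+1; the last picture holds until the end
    of the clip (marked here with end=None)."""
    timeline = []
    for i, (pic, start) in enumerate(zip(picture_ids, beat_times)):
        end = beat_times[i + 1] if i + 1 < len(beat_times) else None
        timeline.append({"picture": pic, "start": start, "end": end})
    return timeline
```

Such a timeline would then be rendered together with the music clip by a video compositing tool to produce the display video.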
Optionally, after placing the at least two pictures on the beat points in the beat information in the preset order, the method further includes the following step: add a preset playback effect to the at least two pictures.
The preset playback effect may include effects such as a picture entering the frame from left to right, entering while rotating, or entering from top to bottom. In this embodiment, adding a preset playback effect to the at least two pictures makes the pictures in the display video play with that effect when the display video is played, making the display video more engaging.
Optionally, before placing the at least two pictures on the beat points in the beat information in the preset order, the method further includes the following steps: if the number of beat points in the beat information is greater than the number of pictures, trim the music clip so that the number of beat points equals the number of pictures; if the number of beat points in the beat information is less than the number of pictures, copy a music sub-clip from the music clip and splice the music sub-clip with the music clip to form a new music clip, so that the number of beat points contained in the new music clip equals the number of pictures.
The music clip may be trimmed starting from its beginning or its end, and the size of the trimmed segment may be determined according to the number of beat points and the number of pictures. The length of the music sub-clip may be determined according to the number of beats and the number of pictures, and the music sub-clip may be copied starting from the beginning or the end of the music clip. The benefit of this is that the number of pictures matches the length of the music clip.
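The trim-or-extend logic above can be sketched at the level of the beat list. This is a simplified stand-in: the function name `match_beats_to_pictures` is assumed, trimming is modelled as dropping surplus beats from the end, and splicing a copied sub-clip is modelled as appending further beats at the clip's regular interval; real audio trimming and splicing would operate on the waveform.

```python
def match_beats_to_pictures(beat_times, n_pictures, beat_interval):
    """Adjust the beat list so its length equals the number of pictures:
    trim surplus beats from the end (music too long), or append beats at
    the fixed interval to mimic splicing a copied sub-clip (music too short)."""
    beats = list(beat_times)
    if len(beats) > n_pictures:
        beats = beats[:n_pictures]           # trim the clip
    while len(beats) < n_pictures:
        beats.append(beats[-1] + beat_interval)  # splice more music on
    return beats
```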
In the technical solution of this embodiment, at least two pictures of the content to be displayed and/or feature information of the content to be displayed are first acquired; a music clip matching the content to be displayed is then determined according to the acquired pictures and/or feature information; feature extraction is then performed on the music clip to obtain its beat information; and finally a display video is generated according to the at least two pictures and the matching music clip. The display video generation method provided by the embodiments of the present disclosure acquires the beat information of the music clip matching the content to be displayed and generates the display video from the at least two pictures and the music clip, which can reduce the cost of generating the display video and improve its quality.
Optionally, before performing feature extraction on the at least two pictures to obtain the first feature vector, the method further includes the following steps: acquire a display video sample set; extract the first feature vector corresponding to the video frames of each display video in the sample set and/or the second feature vector corresponding to the feature information of each display video; for each display video, input the first feature vector and/or the second feature vector into the preset neural network to obtain an initial music clip; and adjust the parameters of the preset neural network according to a loss function between the initial music clip and the music clip in the display video, so as to train the preset neural network.
The display videos in the sample set may be videos that have already been published. The first feature vector corresponding to the video frames of each display video may be extracted by inputting all or some of the video frames of the display video into the feature extraction neural network to obtain the first feature vector of the current display video. The second feature vector corresponding to the feature information of each display video may be generated from feature information such as the category information, owner information and delivery data of the current display video.
After the first feature vector and/or the second feature vector of the current display video are input into the preset neural network, an initial music clip is obtained; the loss function between the initial music clip and the music clip in the current display video is then computed and propagated backward through the preset neural network, and the parameters of the deep neural network are adjusted so as to train the deep neural network.
In this embodiment, training the preset neural network on the display video sample set can improve the recognition accuracy of the preset neural network.
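The forward pass, loss computation, and parameter adjustment described above can be sketched with a toy model. This is an illustration only, not the patent's network: a single linear layer stands in for the preset neural network, the "clips" are reduced to scalar embedding scores, and the function name `train_step` and squared-error loss are assumptions.

```python
def train_step(w, x, target, lr=0.1):
    """One training iteration of a toy linear 'music matching' model:
    the forward pass yields a score for the initial music clip, the squared
    error against the sample video's actual clip score is the loss, and its
    gradient is used to adjust the parameters w."""
    pred = sum(wi * xi for wi, xi in zip(w, x))   # forward: initial clip score
    err = pred - target
    loss = err ** 2                                # loss vs. the real clip
    grad = [2.0 * err * xi for xi in x]            # backward pass
    return [wi - lr * gi for wi, gi in zip(w, grad)], loss
```

Iterating this step over the sample set drives the loss down, which is the sense in which training improves the network's matching accuracy.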
Embodiment Two
Fig. 2 is a structural schematic diagram of a display video generation apparatus provided in Embodiment Two of the present disclosure. As shown in Fig. 2, the apparatus includes: a feature information acquisition module 210, a music clip determination module 220, a beat information acquisition module 230 and a display video generation module 240.
The feature information acquisition module 210 is configured to acquire at least two pictures of the content to be displayed and/or feature information of the content to be displayed.
The music clip determination module 220 is configured to determine, according to the acquired at least two pictures of the content to be displayed and/or the feature information of the content to be displayed, a music clip matching the content to be displayed.
The beat information acquisition module 230 is configured to perform feature extraction on the music clip to obtain beat information of the music clip, where the beat information includes at least two beat points.
The display video generation module 240 is configured to generate a display video according to the at least two pictures of the content to be displayed and the music clip matching the content to be displayed, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
Optionally, the feature information of the content to be displayed includes category information, owner information and delivery data of the content to be displayed.
Optionally, the music clip determination module 220 is configured to:
perform feature extraction on the at least two pictures to obtain a first feature vector; generate a second feature vector according to the feature information; and input the first feature vector and/or the second feature vector into a preset neural network model to obtain the music clip matching the content to be displayed.
Optionally, the beat information acquisition module 230 is configured to:
use the Mel-frequency cepstrum coefficients algorithm to extract features from the music clip and obtain accent points satisfying a preset condition; and acquire a group of accent points in which the time intervals between adjacent accent points fall within a preset range, and determine this group of accent points as the beat information of the music clip, where the beat information includes at least two beat points, and the at least two beat points correspond one-to-one to the accent points in the group.
Optionally, the display video generation module 240 is configured to:
place the at least two pictures on the beat points in the beat information in a preset order; and merge the at least two pictures placed on the beat points with the music clip to obtain the display video.
Optionally, the apparatus further includes:
a preset playback effect adding module, configured to add a preset playback effect to the at least two pictures.
Optionally, the apparatus further includes a music clip adjustment module, configured to:
if the number of beat points in the beat information is greater than the number of pictures, trim the music clip so that the number of beat points equals the number of pictures; and if the number of beat points in the beat information is less than the number of pictures, copy a music sub-clip from the music clip and splice the music sub-clip with the music clip to form a new music clip, so that the number of beat points contained in the new music clip equals the number of pictures.
Optionally, the apparatus further includes a preset neural network training module, configured to:
acquire a display video sample set; extract the first feature vector corresponding to the video frames of each display video in the sample set and/or the second feature vector corresponding to the feature information of each display video; for each display video, input the first feature vector and/or the second feature vector into the preset neural network to obtain an initial music clip, where the preset neural network includes a deep neural network or a convolutional neural network; and adjust the parameters of the preset neural network according to a loss function between the initial music clip and the music clip in the display video, so as to train the preset neural network.
The above apparatus can execute the methods provided in all the foregoing embodiments of the present disclosure, and has the functional modules and effects corresponding to the execution of those methods. For technical details not described in this embodiment, refer to the methods provided in the foregoing embodiments of the present disclosure.
Embodiment Three
Referring now to Fig. 3, which shows a structural schematic diagram of an electronic device 300 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs) and in-vehicle terminals (e.g., in-vehicle navigation terminals), fixed terminals such as digital televisions (TVs) and desktop computers, and servers in various forms, such as stand-alone servers or server clusters. The electronic device shown in Fig. 3 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 3, the electronic device 300 may include a processing apparatus (e.g., a central processing unit, a graphics processing unit) 301, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage apparatus 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300. The processing apparatus 301, the ROM 302 and the RAM 303 are connected to one another through a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following apparatuses can be connected to the I/O interface 305: input apparatuses 306 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; output apparatuses 307 including, for example, a liquid crystal display (LCD), a speaker and a vibrator; storage apparatuses 308 including, for example, a magnetic tape and a hard disk; and a communication apparatus 309. The communication apparatus 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 3 shows the electronic device 300 with multiple apparatuses, it is not required to implement or provide all of the apparatuses shown; more or fewer apparatuses may alternatively be implemented or provided.
According to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the display video generation method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication apparatus 309, or installed from the storage apparatus 308, or installed from the ROM 302. When the computer program is executed by the processing apparatus 301, the above functions defined in the method of the embodiments of the present disclosure are executed.
The computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. Examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program usable by or in combination with an instruction execution system, apparatus or device. In the present disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium; it can send, propagate or transmit a program used by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to a wire, an optical cable, radio frequency (RF) and the like, or any suitable combination of the above.
The above computer-readable medium may be contained in the above electronic device, or may exist independently without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the processing apparatus, cause the electronic device to: acquire at least two pictures of the content to be displayed and/or feature information of the content to be displayed; determine, according to the acquired at least two pictures and/or the feature information of the content to be displayed, a music clip matching the content to be displayed; perform feature extraction on the music clip to obtain beat information of the music clip, the beat information including at least two beat points; and generate a display video according to the at least two pictures of the content to be displayed and the matching music clip, where the time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
Computer program code for executing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code that contains one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented with a dedicated hardware-based system that executes the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented in software or in hardware. The name of a module does not, in one case, constitute a limitation on the module itself.

Claims (11)

  1. A display video generation method, comprising:
    acquiring data, wherein the data comprises one of the following: at least two pictures of content to be displayed; at least two pictures of content to be displayed and feature information of the content to be displayed;
    determining, according to the acquired data, a music clip matching the content to be displayed;
    performing feature extraction on the music clip to obtain beat information of the music clip, wherein the beat information comprises at least two beat points;
    generating a display video according to the at least two pictures of the content to be displayed and the music clip matching the content to be displayed, wherein a time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
  2. The method according to claim 1, wherein the feature information of the content to be displayed comprises category information, owner information and delivery data of the content to be displayed.
  3. The method according to claim 1 or 2, wherein the determining, according to the acquired data, the music clip matching the content to be displayed comprises one of the following:
    performing feature extraction on the at least two pictures to obtain a first feature vector; and inputting the first feature vector into a preset neural network model to obtain the music clip matching the content to be displayed;
    generating a second feature vector according to the feature information; and inputting the second feature vector into a preset neural network model to obtain the music clip matching the content to be displayed;
    performing feature extraction on the at least two pictures to obtain a first feature vector; generating a second feature vector according to the feature information; and inputting the first feature vector and the second feature vector into a preset neural network model to obtain the music clip matching the content to be displayed.
  4. The method according to claim 1, wherein the performing feature extraction on the music clip to obtain the beat information of the music clip comprises:
    performing feature extraction on the music clip by using a Mel-frequency cepstrum coefficients algorithm to obtain accent points satisfying a preset condition;
    acquiring a group of accent points in which time intervals between adjacent accent points fall within a preset range, and determining the group of accent points as the beat information of the music clip, wherein the at least two beat points correspond one-to-one to the accent points in the group.
  5. The method according to claim 1, wherein the generating the display video according to the at least two pictures of the content to be displayed and the music clip matching the content to be displayed comprises:
    placing the at least two pictures on beat points in the beat information in a preset order;
    merging the at least two pictures placed on the beat points with the music clip to obtain the display video.
  6. The method according to claim 5, after the placing the at least two pictures on the beat points in the beat information in the preset order, further comprising:
    adding a preset playback effect to the at least two pictures.
  7. The method according to claim 5, before the placing the at least two pictures on the beat points in the beat information in the preset order, further comprising:
    in a case where the number of beat points in the beat information is greater than the number of the at least two pictures, trimming the music clip so that the number of beat points equals the number of the at least two pictures;
    in a case where the number of beat points in the beat information is less than the number of the at least two pictures, copying a music sub-clip from the music clip and splicing the music sub-clip with the music clip to form a new music clip, so that the number of beat points contained in the new music clip equals the number of the at least two pictures.
  8. The method according to claim 3, before the performing feature extraction on the at least two pictures to obtain the first feature vector, further comprising:
    acquiring a display video sample set, wherein the display video sample set comprises a plurality of display videos;
    extracting at least one of the following: a first feature vector corresponding to video frames of each display video; a second feature vector corresponding to feature information of each display video;
    for each display video, inputting the extracted feature vector into a preset neural network to obtain an initial music clip, wherein the preset neural network comprises a deep neural network or a convolutional neural network;
    adjusting parameters of the preset neural network according to a loss function obtained from the initial music clip and the music clip in the display video, so as to train the preset neural network.
  9. A display video generation apparatus, comprising:
    a feature information acquisition module, configured to acquire data, wherein the data comprises one of the following: at least two pictures of content to be displayed; at least two pictures of content to be displayed and feature information of the content to be displayed;
    a music clip determination module, configured to determine, according to the acquired data, a music clip matching the content to be displayed;
    a beat information acquisition module, configured to perform feature extraction on the music clip to obtain beat information of the music clip, wherein the beat information comprises at least two beat points;
    a display video generation module, configured to generate a display video according to the at least two pictures of the content to be displayed and the music clip matching the content to be displayed, wherein a time point at which each picture is presented in the display video corresponds to a beat point in the beat information.
  10. An electronic device, comprising:
    at least one processing apparatus;
    a storage apparatus, configured to store at least one program;
    wherein, when the at least one program is executed by the at least one processing apparatus, the at least one processing apparatus implements the display video generation method according to any one of claims 1-8.
  11. A computer-readable medium storing a computer program, wherein the program, when executed by a processing apparatus, implements the display video generation method according to any one of claims 1-8.
PCT/CN2020/096969 2019-06-19 2020-06-19 Display video generation method, apparatus, device and storage medium WO2020253806A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910532395.2A CN110278388B (zh) 2019-06-19 2019-06-19 展示视频的生成方法、装置、设备及存储介质 Display video generation method, apparatus, device and storage medium
CN201910532395.2 2019-06-19

Publications (1)

Publication Number Publication Date
WO2020253806A1 true WO2020253806A1 (zh) 2020-12-24

Family

ID=67961271

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096969 WO2020253806A1 (zh) 2019-06-19 2020-06-19 展示视频的生成方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN110278388B (zh)
WO (1) WO2020253806A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114329001A (zh) * 2021-12-23 2022-04-12 游艺星际(北京)科技有限公司 Dynamic picture display method, apparatus, electronic device and storage medium

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110278388B (zh) 2019-06-19 2022-02-22 北京字节跳动网络技术有限公司 Display video generation method, apparatus, device and storage medium
CN112822563A (zh) * 2019-11-15 2021-05-18 北京字节跳动网络技术有限公司 Method and apparatus for generating video, electronic device and computer-readable medium
CN112822541B (zh) * 2019-11-18 2022-05-20 北京字节跳动网络技术有限公司 Video generation method and apparatus, electronic device and computer-readable medium
CN111010611A (zh) * 2019-12-03 2020-04-14 北京达佳互联信息技术有限公司 Electronic photo album acquisition method and apparatus, computer device and storage medium
CN113223487B (zh) * 2020-02-05 2023-10-17 字节跳动有限公司 Information recognition method and apparatus, electronic device and storage medium
CN111432141B (zh) * 2020-03-31 2022-06-17 北京字节跳动网络技术有限公司 Mashup video determination method, apparatus, device and storage medium
CN111756953A (zh) * 2020-07-14 2020-10-09 北京字节跳动网络技术有限公司 Video processing method, apparatus, device and computer-readable medium
CN111813970A (zh) * 2020-07-14 2020-10-23 广州酷狗计算机科技有限公司 Multimedia content display method, apparatus, terminal and storage medium
CN112259062B (zh) * 2020-10-20 2022-11-04 北京字节跳动网络技术有限公司 Special effect display method and apparatus, electronic device and computer-readable medium
CN112489681A (zh) * 2020-11-23 2021-03-12 瑞声新能源发展(常州)有限公司科教城分公司 Beat recognition method, apparatus and storage medium
CN113473177B (zh) * 2021-05-27 2023-10-31 北京达佳互联信息技术有限公司 Music recommendation method and apparatus, electronic device and computer-readable storage medium
CN113438547B (zh) * 2021-05-28 2022-03-25 北京达佳互联信息技术有限公司 Music generation method and apparatus, electronic device and storage medium
CN115695899A (zh) * 2021-07-23 2023-02-03 花瓣云科技有限公司 Video generation method, electronic device and medium thereof
CN113655930B (zh) * 2021-08-30 2023-01-10 北京字跳网络技术有限公司 Information publishing method, information display method and apparatus, electronic device and medium
CN116152393A (zh) * 2021-11-18 2023-05-23 脸萌有限公司 Video generation method, apparatus, device and storage medium
CN116800908A (zh) * 2022-03-18 2023-09-22 北京字跳网络技术有限公司 Video generation method and apparatus, electronic device and storage medium
CN115243101B (zh) * 2022-06-20 2024-04-12 上海众源网络有限公司 Video motion-to-stillness ratio recognition method and apparatus, electronic device and storage medium
CN115243107B (zh) * 2022-07-08 2023-11-21 华人运通(上海)云计算科技有限公司 Short video playback method, apparatus, system, electronic device and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7904815B2 (en) * 2003-06-30 2011-03-08 Microsoft Corporation Content-based dynamic photo-to-video methods and apparatuses
CN104202540A (zh) * 2014-09-28 2014-12-10 北京金山安全软件有限公司 Method and system for generating a video from pictures
CN105072354A (zh) * 2015-07-17 2015-11-18 Tcl集团股份有限公司 Method and system for synthesizing a video stream from multiple photos
CN107743268A (zh) * 2017-09-26 2018-02-27 维沃移动通信有限公司 Video editing method and mobile terminal
CN109618222A (zh) * 2018-12-27 2019-04-12 北京字节跳动网络技术有限公司 Spliced video generation method, apparatus, terminal device and storage medium
CN110278388A (zh) * 2019-06-19 2019-09-24 北京字节跳动网络技术有限公司 Display video generation method, apparatus, device and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7569761B1 (en) * 2007-09-21 2009-08-04 Adobe Systems Inc. Video editing matched to musical beats
CN101727943B (zh) * 2009-12-03 2012-10-17 无锡中星微电子有限公司 Method for matching music to images, image music-matching apparatus and image playback apparatus
CN102256030A (zh) * 2010-05-20 2011-11-23 Tcl集团股份有限公司 Photo album presentation system capable of matching background music and background music matching method thereof
CN102403011A (zh) * 2010-09-14 2012-04-04 北京中星微电子有限公司 Music output method and apparatus
US20140317480A1 (en) * 2013-04-23 2014-10-23 Microsoft Corporation Automatic music video creation from a set of photos
CN105550251A (zh) * 2015-12-08 2016-05-04 小米科技有限责任公司 Picture playback method and apparatus
CN108920648B (zh) * 2018-07-03 2021-06-22 四川大学 Cross-modal matching method based on music-image semantic relations
CN109256146B (zh) * 2018-10-30 2021-07-06 腾讯音乐娱乐科技(深圳)有限公司 Audio detection method, apparatus and storage medium
CN109697236A (zh) * 2018-11-06 2019-04-30 建湖云飞数据科技有限公司 Multimedia data matching information processing method



Also Published As

Publication number Publication date
CN110278388A (zh) 2019-09-24
CN110278388B (zh) 2022-02-22

Similar Documents

Publication Publication Date Title
WO2020253806A1 (zh) Display video generation method, apparatus, device and storage medium
CN110677711B (zh) Video soundtrack matching method, apparatus, electronic device and computer-readable medium
US10182095B2 (en) Method and system for video call using two-way communication of visual or auditory effect
CN109543064B (zh) Lyrics display processing method, apparatus, electronic device and computer storage medium
ES2719586T3 (es) Creation of reference points in a multimedia stream with automated content recognition
WO2021093737A1 (zh) Method, apparatus, electronic device and computer-readable medium for generating video
WO2021008223A1 (zh) Information determination method, apparatus and electronic device
WO2021196903A1 (zh) Video processing method, apparatus, readable medium and electronic device
WO2020113733A1 (zh) Animation generation method, apparatus, electronic device and computer-readable storage medium
WO2020082870A1 (zh) Real-time video display method, apparatus, terminal device and storage medium
WO2020259130A1 (zh) Highlight clip processing method, apparatus, electronic device and readable medium
WO2022152064A1 (zh) Video generation method, apparatus, electronic device and storage medium
JP6971292B2 (ja) Method, apparatus, server, computer-readable storage medium and computer program for aligning paragraphs with video
CN110324718B (zh) Audio/video generation method, apparatus, electronic device and readable medium
CN109640129B (zh) Video recommendation method, apparatus, client device, server and storage medium
WO2020207080A1 (zh) Video shooting method, apparatus, electronic device and storage medium
WO2021012764A1 (zh) Audio/video playback method, apparatus, electronic device and readable medium
WO2021057740A1 (zh) Video generation method, apparatus, electronic device and computer-readable medium
CN107450874B (zh) Dual-screen multimedia data playback method and system
CN113257218B (zh) Speech synthesis method, apparatus, electronic device and storage medium
WO2023103889A1 (zh) Video processing method, apparatus, electronic device and storage medium
US20230131975A1 (en) Music playing method and apparatus based on user interaction, and device and storage medium
WO2020224294A1 (zh) System, method and apparatus for processing information
WO2024078293A1 (zh) Image processing method, apparatus, electronic device and storage medium
WO2023174073A1 (zh) Video generation method, apparatus, device, storage medium and program product

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20827465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20827465

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.03.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20827465

Country of ref document: EP

Kind code of ref document: A1