WO2022089427A1 - Video generation method, apparatus, electronic device, and computer-readable medium - Google Patents

Video generation method, apparatus, electronic device, and computer-readable medium

Info

Publication number
WO2022089427A1
WO2022089427A1 PCT/CN2021/126427 CN2021126427W WO2022089427A1 WO 2022089427 A1 WO2022089427 A1 WO 2022089427A1 CN 2021126427 W CN2021126427 W CN 2021126427W WO 2022089427 A1 WO2022089427 A1 WO 2022089427A1
Authority
WO
WIPO (PCT)
Prior art keywords
product
video
commodity
information
user
Prior art date
Application number
PCT/CN2021/126427
Other languages
English (en)
French (fr)
Inventor
刘晓娟
刘乐
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司
Priority to US18/033,671 (US20230396857A1)
Publication of WO2022089427A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/812 Monomedia components thereof involving advertisement data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces
    • G06Q30/0643 Graphical representation of items or shoppers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04845 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0276 Advertisement creation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/254 Management at additional data server, e.g. shopping server, rights management server
    • H04N21/2542 Management at additional data server, e.g. shopping server, rights management server for selling goods, e.g. TV shopping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Definitions

  • The present disclosure relates to the field of computer technology, specifically computer vision, and in particular to a video generation method, apparatus, electronic device, and computer-readable medium.
  • Video generation refers to editing video clips that conform to the intended semantics into a single video. In e-commerce scenarios, the generated video must display product characteristics from multiple aspects, dimensions, and angles.
  • Embodiments of the present disclosure propose a video generation method, apparatus, electronic device, and computer-readable medium.
  • In a first aspect, an embodiment of the present disclosure provides a method for generating a video.
  • The method includes: determining, based on a user's input instruction, one template from a plurality of category templates as a production template; in response to determining that the category to which the product information input by the user belongs matches the category of the production template, obtaining product pictures or product video material related to the product information; and processing the product pictures or product video material based on the production template to generate a product video.
  • Processing the product pictures or product video material based on the production template to generate the product video includes: converting the music in the production template or in the product video material to the frequency domain, and computing local extrema of the audio energy and a dislocation convolution to determine accent points and beats; generating an initial video from the product pictures and extracting multiple video segments of preset duration from the initial video, or extracting multiple video segments of preset duration from the video material; and, based on the accent points and beats, merging the video segments using transition animations to generate the product video.
  • Obtaining product pictures or product video material related to the product information includes: after the user logs in, determining whether the user has merchant registration information; in response to determining that the user has merchant registration information, obtaining, based on the merchant registration information, basic information of the products the user has on sale, determining the product information input by the user based on the user's operation on the basic information, and obtaining the product pictures or product video material related to the product information; and in response to determining that the user does not have merchant registration information, prompting the user to input product information and, in response to determining that the category of the product information input by the user matches the category of the production template, obtaining the product pictures or product video material related to the product information.
  • The method further includes: obtaining a product detail page related to the product information; extracting key information from the product detail page; applying special effects to the key information and writing the processed key information into the product video; and applying filter and light-and-shadow effects to the product video containing the key information.
  • Extracting the key information from the product detail page includes: extracting the key information using a language model, the language model being trained based on the category of the production template.
  • Processing the product pictures includes: preprocessing the product pictures; identifying the text areas of the preprocessed pictures; and removing the text content of the text areas.
  • The method further includes: binding the product video to the product code; and uploading the product video bound to the product code to the promotional display position of the product's main image.
  • The method further includes: in response to determining that the category to which the product information input by the user belongs does not match the category of the production template, sending prompt information prompting the user to replace the production template.
  • In a second aspect, an embodiment of the present disclosure provides a video generation apparatus. The apparatus includes: a determination unit configured to determine, based on a user's input instruction, one template from a plurality of category templates as a production template; an acquisition unit configured to, in response to determining that the category of the product information input by the user matches the category of the production template, obtain product pictures or product video material related to the product information; and a generation unit configured to process the product pictures or product video material based on the production template to generate a product video.
  • The generation unit includes: a calculation module configured to convert the music in the production template or in the product video material to the frequency domain, and to compute local extrema of the audio energy and a dislocation convolution to determine accent points and beats; an extraction module configured to generate an initial video from the product pictures and extract multiple video segments of preset duration from the initial video, or to extract multiple video segments of preset duration from the video material; and a generation module configured to merge the video segments using transition animations, based on the accent points and beats, to generate the product video.
  • The acquisition unit includes: a judgment module configured to determine, after the user logs in, whether the user has merchant registration information; an obtaining module configured to, in response to determining that the user has merchant registration information, obtain basic information of the products the user has on sale based on the merchant registration information; a determination module configured to determine the product information input by the user based on the user's operation on the basic information; a response module configured to, in response to determining that the category of the product information input by the user matches the category of the production template, obtain the product pictures or product video material related to the product information; and a prompt module configured to, in response to determining that the user does not have merchant registration information, prompt the user to input product information and trigger the response module.
  • The apparatus further includes: a detailing unit configured to obtain a product detail page related to the product information; an extraction unit configured to extract key information from the product detail page; a special effect unit configured to apply special effects to the key information and write the processed key information into the product video; and a processing unit configured to apply filter and light-and-shadow effects to the product video containing the key information.
  • The extraction unit is further configured to extract the key information from the product detail page using a language model, the language model being trained based on the category of the production template.
  • The generation unit includes: a preprocessing module configured to preprocess the product pictures; a recognition module configured to identify the text areas of the preprocessed pictures; and a removal module configured to remove the text content of the text areas.
  • The apparatus further includes: a binding unit configured to bind the product video to the product code; and an uploading unit configured to upload the product video bound to the product code to the promotional display position of the product's main image.
  • The apparatus further includes: a sending unit configured to send prompt information prompting the user to replace the production template, in response to determining that the category of the product information input by the user does not match the category of the production template.
  • In a third aspect, an embodiment of the present disclosure provides an electronic device including: one or more processors; and a storage device storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
  • In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium storing a computer program that, when executed by a processor, implements the method described in any implementation of the first aspect.
  • First, a template is determined from the plurality of category templates as the production template; second, in response to determining that the category to which the product information input by the user belongs matches the category of the production template, product pictures or product video material related to the product information are obtained; finally, the product pictures or product video material are processed based on the production template to generate a product video. The production template is thus determined through interaction with the user, and the product video is generated from the production template and the obtained product pictures or product video material, which simplifies the video production process and improves production efficiency.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure may be applied;
  • FIG. 2 is a flowchart of one embodiment of a video generation method according to the present disclosure;
  • FIG. 3 is a flowchart of a method for obtaining a product picture or product video material related to the product information according to the present disclosure;
  • FIG. 4 is a flowchart of another embodiment of a video generation method according to the present disclosure.
  • FIG. 5 is a schematic structural diagram of an embodiment of a video generation apparatus according to the present disclosure.
  • FIG. 6 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the video generation method of the present disclosure may be applied.
  • The system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, and may typically include wireless communication links and the like.
  • the terminal devices 101, 102, and 103 interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications may be installed on the terminal devices 101, 102, and 103, such as instant messaging tools and email clients.
  • The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be user equipment with communication and control functions that can communicate with the server 105.
  • When the terminal devices 101, 102, and 103 are software, they can be installed in the above-mentioned user equipment, and can be implemented either as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module. There is no specific limitation here.
  • The server 105 may be a server that provides various services, for example, a background server that supports video generation for the image processing systems on the terminal devices 101, 102, and 103.
  • the backend server can analyze and process the relevant information of each online sales commodity in the network, and feed back the processing result (such as the video generation result) to the terminal device.
  • The server may be hardware or software.
  • When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • When the server is software, it can be implemented as multiple pieces of software or software modules (for example, for providing distributed services), or as a single piece of software or software module. There is no specific limitation here.
  • the video generation method provided by the embodiments of the present disclosure is generally executed by the server 105 .
  • The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks, and servers according to implementation needs.
  • FIG. 2 shows a process 200 of an embodiment of a video generation method according to the present disclosure, and the video generation method includes the following steps:
  • Step 201: based on the user's input instruction, determine a template from a plurality of category templates as the production template.
  • The execution body on which the video generation method runs can provide a video generation interface for users who need to produce videos, and display the category templates on that interface.
  • The user inputs an instruction on the video generation interface, and the execution body determines the production template according to that instruction. The category templates distinguish different categories of products and present the distinctive characteristics of each category. The categories include sports, leisure, and so on.
  • For example, sportswear uses a sports template to make the main-image video.
  • The fast-paced audio and lively special effects of the sports template better highlight the characteristics of the product.
  • each category template is a data structure, and the category template defines music to be used in the video, types of various animations, entry methods and times, transitions, text, and special effects.
  • Each category template is the basis of video production. Once the production template is determined, the preset special effects, music, selling points, and other information on it can be reused when making the product video.
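  • The template data structure described above can be pictured with a minimal sketch. The class and field names (`CategoryTemplate`, `AnimationSpec`, `entry_method`, and so on) and the sample values are assumptions made for illustration; the disclosure only lists what a template defines, not how it is laid out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnimationSpec:
    """One animation in a template (hypothetical layout)."""
    kind: str            # type of animation, e.g. "zoom" or "3d_rotate"
    entry_method: str    # how the element enters the frame
    entry_time_s: float  # when it enters, in seconds


@dataclass
class CategoryTemplate:
    """Everything a category template defines, per the description above."""
    category: str
    music: str
    animations: List[AnimationSpec] = field(default_factory=list)
    transitions: List[str] = field(default_factory=list)
    text: List[str] = field(default_factory=list)
    special_effects: List[str] = field(default_factory=list)


# Illustrative sports template; file name and effect names are made up.
sports = CategoryTemplate(
    category="sports",
    music="fast_paced.mp3",
    animations=[AnimationSpec("zoom", "slide_in_left", 0.5)],
    transitions=["fade_through_black"],
    text=["selling point placeholder"],
    special_effects=["flicker"],
)
```

  • Keeping the template as plain data lets the same rendering code reuse the preset music, effects, and text for every product in the category.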
  • Before making a product video, the user can select a category template according to the product category.
  • The execution body can also record the usage count of each category template.
  • Each time a category template is used, its usage count is incremented. The execution body can further recommend category templates according to usage (for example, the three most used), for instance by adding a recommendation label to the three most used templates, so that the user can choose a category template as the production template according to preference or recommendation.
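  • The usage-based recommendation can be sketched as follows. The class `TemplateUsage` and its method names are hypothetical; only the idea of counting uses and labelling the three most used templates comes from the text.

```python
from collections import Counter


class TemplateUsage:
    """Counts template uses and labels the most used ones (illustrative)."""

    def __init__(self):
        self._counts = Counter()

    def record_use(self, template_name):
        # Each use of a template increments its usage count.
        self._counts[template_name] += 1

    def recommended(self, top_n=3):
        # Names of the top_n most used templates, most used first.
        return [name for name, _ in self._counts.most_common(top_n)]

    def label(self, template_name, top_n=3):
        # Label shown next to a template on the selection interface.
        return "recommended" if template_name in self.recommended(top_n) else ""


usage = TemplateUsage()
for name, uses in [("sports", 4), ("leisure", 3), ("home", 2), ("beauty", 1)]:
    for _ in range(uses):
        usage.record_use(name)
```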
  • Step 202: in response to determining that the category to which the product information input by the user belongs matches the category of the production template, obtain product pictures or product video material related to the product information.
  • The execution body on which the video generation method runs can also obtain the product information input by the user through the video generation interface. The product information determines the production material for video generation, which can be product pictures or product video material.
  • the commodity information input by the user includes: commodity code, commodity name, commodity cover image, commodity picture, commodity promotion video, commodity explanation video, commodity display video, and the like.
  • The execution body may obtain product pictures or product video material related to the product information from the product information input by the user (e.g., product code, product name, product cover image).
  • The execution body can also obtain the product detail page on the web from the product code input by the user and, based on the production template, generate a video from the detail page through steps such as intelligent picture selection, erasing of image impurities, intelligent cropping, and selling point extraction.
  • the commodity pictures may be pictures of various dimensions and angles of the commodity;
  • the commodity video materials may be commodity promotion videos, commodity explanation videos, commodity display videos and other materials.
  • The product information input by the user may be a product code, product pictures, or product video material, or may include a product code together with the product pictures or product video material corresponding to that code. In this embodiment, a product video cannot be generated if no product information is obtained.
  • the commodity information input by the user may belong to one category or two categories.
  • The category of the production template needs to be the same as the category of the product information input by the user.
  • Alternatively, the production template can be a general template.
  • A general template is available for all categories of products and carries no category-specific characteristics, whereas a non-general template, such as a sports template, adds sports-related elements or text descriptions.
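  • The category check in step 202 can be sketched as below. The `GENERAL` marker and the function name are assumptions, and treating a product with two categories as matching when either category equals the template's is one possible reading of the text, not something the disclosure states.

```python
GENERAL = "general"  # hypothetical marker for the general template


def template_matches(template_category, product_categories):
    """Return True if the production template may be used for the product.

    product_categories is a set, since the product information may belong
    to one category or two categories.
    """
    if template_category == GENERAL:
        # A general template is available for all categories.
        return True
    return template_category in product_categories
```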
  • Step 203: based on the production template, process the product pictures or product video material to generate a product video.
  • The production template is the template that the product video to be generated refers to. It provides a layout reference for the product video and defines the music and animation types used in it, the entry method and time of the animated elements, and content such as transitions, text, and special effects. The product pictures or product video material are processed according to the content defined in the production template to generate the product video.
  • The processing of product pictures or product video material may consist of simple picture operations, such as translation and zooming, of more complex transformations, such as flicker effects and 3D rotation, or of animations created by a designer, which are usually formed by cutting, splicing, and complex spatial motion of multiple pictures.
  • the selling point of the product may be added directly in the process of generating the video, or the selling point of the product may be added after the product video is generated.
  • the selling point of a product is the language and presentation refined by an enterprise to show the features and advantages of its products.
  • The product video can be processed with filters and special effects, so that it can present different styles and be more engaging to watch.
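  • Processing the product pictures or product video material based on the production template to generate the product video includes: converting the music in the production template or in the product video material to the frequency domain, and computing local extrema of the audio energy and a dislocation convolution to determine accent points and beats; generating an initial video from the product pictures and extracting multiple video segments of preset duration from the initial video, or extracting multiple video segments of preset duration from the video material; and, based on the accent points and beats, merging the video segments using transition animations to generate the product video.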
  • Various animation generation functions can be used to generate an initial video from the product pictures.
  • a video summary extraction model can be used to extract a plurality of video segments of preset duration from the initial video or video material.
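  • As a rough stand-in for the video summary extraction model, the sketch below greedily picks the highest-scoring non-overlapping windows of a preset duration. The per-second importance scores, the function name `pick_segments`, and the greedy rule are all assumptions; the disclosure does not describe how the model selects segments.

```python
def pick_segments(scores, segment_len, count):
    """Pick up to `count` non-overlapping segments of `segment_len` seconds.

    scores[i] is an assumed importance score for second i of the video.
    Returns (start, end) pairs in seconds, end exclusive, in time order.
    """
    windows = [
        (sum(scores[s:s + segment_len]), s)
        for s in range(0, len(scores) - segment_len + 1)
    ]
    chosen = []
    # Highest total score first; earlier start wins ties.
    for _, start in sorted(windows, key=lambda w: (-w[0], w[1])):
        no_overlap = all(
            start + segment_len <= c or c + segment_len <= start
            for c in chosen
        )
        if no_overlap:
            chosen.append(start)
        if len(chosen) == count:
            break
    return sorted((s, s + segment_len) for s in chosen)
```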
  • A transition animation is also simply called a transition.
  • Transitions can be implemented with OpenGL (Open Graphics Library); nearly 100 kinds of video transition effects can be obtained this way.
  • For example, the transitions include a fade-to-black followed by a fade-in effect.
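  • The fade-to-black and fade-in transition can be illustrated without OpenGL. The sketch below works on grayscale frames stored as nested lists, which is an assumption for readability; the disclosure implements its transitions in OpenGL, and the function name is hypothetical.

```python
def fade_through_black(last_frame, first_frame, steps):
    """Build transition frames between two video segments.

    last_frame dims to black over `steps` frames, then first_frame
    brightens from black to full intensity over another `steps` frames.
    Frames are lists of rows of grayscale pixel values.
    """
    def scale(frame, factor):
        return [[round(pixel * factor) for pixel in row] for row in frame]

    fade_out = [scale(last_frame, 1 - (i + 1) / steps) for i in range(steps)]
    fade_in = [scale(first_frame, (i + 1) / steps) for i in range(steps)]
    return fade_out + fade_in
```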
  • The accent points of the music are determined by computing the local extrema of the audio energy, and the beat of the music is determined by the dislocation convolution. The extracted video segments are then merged using transition animations based on the accent points and beats, so that the transition points of the video are consistent with the rhythm of the video soundtrack.
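  • A minimal sketch of this rhythm analysis follows. It makes several assumptions: energy is computed per frame in the time domain (by Parseval's theorem the total equals the frequency-domain energy), "dislocation convolution" is read as correlating the energy curve with a shifted copy of itself, and all function names and the threshold are illustrative rather than taken from the disclosure.

```python
def frame_energy(samples, frame_size):
    """Sum of squared amplitudes for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def accent_points(energy, threshold_ratio=1.5):
    """Frame indices that are local energy maxima above a mean-based threshold."""
    mean = sum(energy) / len(energy)
    return [
        i for i in range(1, len(energy) - 1)
        if energy[i] > energy[i - 1]
        and energy[i] >= energy[i + 1]
        and energy[i] > threshold_ratio * mean
    ]


def beat_period(energy, min_lag, max_lag):
    """Lag (in frames) at which the energy curve best matches a shifted copy."""
    mean = sum(energy) / len(energy)
    centered = [e - mean for e in energy]

    def score(lag):
        return sum(a * b for a, b in zip(centered, centered[lag:]))

    return max(range(min_lag, max_lag + 1), key=score)


def snap_to_beats(cut_frames, period, offset=0):
    """Move each planned cut to the nearest beat position offset + k * period."""
    return [offset + round((c - offset) / period) * period for c in cut_frames]
```

  • Snapping planned cut positions to the estimated beat grid is one simple way to keep transition points aligned with the soundtrack.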
  • the processing of the product image includes: preprocessing the product image; identifying the text area of the preprocessed image, and removing the text content of the text area.
  • the preprocessing includes: splicing of cut pictures, multi-subject picture cutting, picture screening and filtering, picture impurity erasing, intelligent cropping and splicing of pictures, and unified picture size design.
  • Deep-learning OCR (Optical Character Recognition) can be used to recognize the text areas and text content of the preprocessed pictures.
  • A conventional text-erasing model is then used to erase the text content, so that the generated video is clean and tidy.
  • The product pictures are preprocessed first; when there are multiple product pictures, they can be given a uniform size. Identifying the text areas of the preprocessed pictures and removing the text content in those areas keeps the generated product video clean and tidy.
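  • The erasing step can be sketched under strong simplifications. The text boxes are assumed to come from an OCR step that is not shown, images are grayscale nested lists, and each box is filled with the mean of the pixels bordering it. This is a crude stand-in for the text-erasing model, which the disclosure does not specify.

```python
def erase_text_regions(image, boxes):
    """Return a copy of `image` with each text box filled from its surroundings.

    image is a list of rows of grayscale values. Each box is
    (top, left, bottom, right) with bottom and right exclusive, as an
    OCR detector might report (assumed format).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for top, left, bottom, right in boxes:
        # Pixels in the one-pixel ring around the box, clipped to the image.
        ring = [
            image[y][x]
            for y in range(max(top - 1, 0), min(bottom + 1, height))
            for x in range(max(left - 1, 0), min(right + 1, width))
            if not (top <= y < bottom and left <= x < right)
        ]
        fill = round(sum(ring) / len(ring)) if ring else 0
        for y in range(top, bottom):
            for x in range(left, right):
                result[y][x] = fill
    return result
```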
  • a language model may also be used to extract key information in the text content, and the extracted key information may be written into a product video.
  • the key information is the selling point of the product in the form of text, and writing the selling point of the product into the product video can facilitate the user to quickly discover the selling point information of the product in the product video.
  • In response to determining that the category of the product information input by the user does not match the category of the production template, prompt information prompting the user to replace the production template is sent.
  • When the category of the product information input by the user does not match the category of the production template, a prompt message asks the user to re-enter the instruction in time. This ensures that the style of the generated video matches the user's needs as closely as possible, so that the subsequently generated product video achieves the best production effect.
  • In response to determining that the category of the product information input by the user does not match the category of the production template, a matching template may also be recommended to the user, thereby achieving a better production effect.
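  • The mismatch handling can be sketched as follows. The function name, the returned dictionary, and the prompt wording are illustrative assumptions; the text only says that a prompt is sent and that a matching template may be recommended.

```python
def check_template(template_category, product_category, available_categories):
    """Validate the chosen template and, on mismatch, suggest a matching one."""
    if template_category == product_category:
        return {"ok": True, "prompt": None, "suggestion": None}
    # Recommend the template of the product's own category when one exists.
    suggestion = (
        product_category if product_category in available_categories else None
    )
    return {
        "ok": False,
        "prompt": "The selected template does not match the product category; "
                  "please choose another template.",
        "suggestion": suggestion,
    }
```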
  • After the product video is generated, it can also be bound to the product code, and the product video bound to the product code can be uploaded to the promotional display position of the product's main image.
  • The production efficiency of product videos is improved: a single product video can be produced in about 40 s.
  • Product videos can also be produced in batches, which greatly improves efficiency.
  • First, a template is determined from a variety of category templates as the production template; secondly, in response to determining that the category of the commodity information input by the user matches the category of the production template, commodity pictures or commodity video materials related to the commodity information are obtained; finally, based on the production template, the commodity pictures or commodity video materials are processed to generate a commodity video. The production template is thus determined through interaction with the user, and the commodity video is generated from the production template and the obtained commodity pictures or commodity video materials, which simplifies the video production process and improves the efficiency of video production.
  • The commodity information may concern the user's online commodity or the user's commodity on sale, and the execution body on which the video generation method runs may determine, according to the user's merchant registration information, whether to automatically recommend commodity information to the user.
  • the method for obtaining product pictures or product video materials related to product information includes the following steps:
  • Step 301, after the user logs in, determine whether the user has merchant registration information; if the user has merchant registration information, step 302 is executed; if the user does not, step 306 is executed.
  • The user needs to register in the video generation system provided by the execution body and, after successful registration, log in to the video generation system, select a production template through the video generation interface, and enter product information.
  • Merchant registration information means that the user has registered a merchant account in the video generation system. The merchant account makes it possible to determine whether the user has products on sale and, if so, to obtain the basic information of those products.
  • Step 302, based on the merchant registration information, acquire basic information of the commodity being sold by the user, and then execute step 303.
  • the basic information of the commodity on sale refers to information such as the code (SKU, Stock Keeping Unit), commodity name, and commodity cover image of the commodity being sold.
  • Step 303, determine the commodity information input by the user based on the user's operation on the basic information, and then execute step 304.
  • The basic information of the commodity on sale can be displayed directly on the video generation interface, and the commodity information input by the user is determined from the user's operation, such as clicking the displayed content or entering the code, cover image, or name of the commodity on sale.
  • The basic information of the commodity on sale can also be displayed on the operation interface shown after the user logs in. Once the user clicks the basic information of a commodity on sale or enters its code, cover image, name, etc., the operation interface is no longer displayed.
  • The execution body can obtain the merchant's products for sale from the backend server, so that a merchant user making a video can select the product information directly from the basic information of the products for sale instead of typing it in.
  • Obtaining the commodity information input by the user from the basic information of the commodity being sold makes selection more convenient, in particular when the user has forgotten or cannot determine the product information.
  • Step 304, in response to determining that the category of the commodity information input by the user matches the category of the production template, obtain commodity pictures or commodity video materials related to the commodity information, and then execute step 305.
  • When the user does not have a merchant account, the user is required to directly input the commodity information of the commodity being sold in the mall, including the commodity code, commodity link, etc.; a video is then generated based on the commodity information.
  • a video can be generated based on the product SKUID and SKU link.
  • a video may also be generated according to a picture or video material added by the user.
  • Step 305, end.
  • Step 306, the user is prompted to input the commodity information, and then step 304 is executed.
  • preset prompt information may be displayed on the user's login operation interface to prompt the user to input commodity information.
  • the operation interface may also be an interface for the user to input commodity information.
  • The system again determines whether the category to which the product information belongs matches the category of the production template. If it does not match, a matching template is recommended so that the product video achieves a better production effect.
  • The basic information of the products the user has on sale is determined based on the merchant registration information, and the product information input by the user is determined based on the user's operation on that basic information. When interacting with the user, product information is therefore recommended automatically from the basic information of the products the user has on sale, which improves video production efficiency.
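The branch between steps 301 to 303 and step 306 can be sketched as follows. This is a minimal illustration only: the `User` class, the in-memory `ON_SALE` table standing in for the backend server, and the `select` and `prompt_input` callbacks are all hypothetical names, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class User:
    user_id: str
    # None means the user has no merchant registration information (hypothetical field).
    merchant_account: Optional[str] = None


# Hypothetical stand-in for the backend server's records of on-sale products.
ON_SALE: Dict[str, List[dict]] = {
    "merchant-1": [
        {"sku": "1001", "name": "Running shoes", "cover": "1001.jpg"},
        {"sku": "1002", "name": "Yoga mat", "cover": "1002.jpg"},
    ]
}


def resolve_product_info(
    user: User,
    select: Callable[[List[dict]], dict],
    prompt_input: Callable[[], dict],
) -> dict:
    """Decide where the product information comes from (steps 301-303 and 306)."""
    # Step 301: check for merchant registration information.
    if user.merchant_account is not None:
        # Step 302: fetch the basic information of the products on sale.
        basics = ON_SALE.get(user.merchant_account, [])
        # Step 303: the user's operation on the basic information picks the product.
        return select(basics)
    # Step 306: no merchant account, so the user is prompted to type the information.
    return prompt_input()
```

In this sketch the category check of step 304 is deliberately left out; it would run on whatever `resolve_product_info` returns, regardless of which branch produced it.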
  • FIG. 4 shows a process 400 of another embodiment of the video generation method of the present disclosure.
  • the video generation method includes the following steps:
  • Step 401 based on the user's input instruction, determine a template from a variety of category templates as a production template.
  • Step 402 in response to determining that the category to which the commodity information input by the user belongs matches the category of the prepared template, obtain commodity pictures or commodity video materials related to the commodity information.
  • Step 403 based on the production template, process the product image or product video material to generate a product video.
  • Step 404 based on the product information, obtain a product detail page related to the product information.
  • the product detail page is the product detail page made by the product seller.
  • The product detail page describes the product's origin, manufacturer, specification, scope of application, and other detailed description information.
  • The detailed description information may include videos, pictures, and text descriptions of the product. For example, clicking the picture in a product promotion display position on a webpage opens the product detail page.
  • Step 405 extracting key information in the product detail page.
  • the key information may be information representing the characteristics of the commodity, such as text, pictures, and videos, and the selling point of the commodity may be reflected by the extracted key information.
  • When the key information is text, extracting the key information in the product detail page includes: using a language model to extract the key information in the product detail page, where the language model is trained based on the category of the production template.
  • The language model can be used to segment, weight, and understand the text recognized by OCR, and to extract several concise phrases from long passages or multiple sentences to serve as promotional copy for the product's selling points.
  • The weight setting works as follows: part of the product copy in the language model samples is selected as calibration samples; after word segmentation, the samples are labeled by a calibration team according to business needs; the language model is then trained on this data to learn the weight of each word.
  • the key information of the text in the product detail page can be effectively extracted through the language model, which improves the efficiency of extracting the selling point of the product.
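The idea of weighting words and keeping only short, high-scoring phrases can be illustrated with a toy scorer. This is not a language model: `WORD_WEIGHTS` is a hypothetical hand-written table standing in for weights learned from calibrated samples, and splitting on punctuation is an assumed simplification of word segmentation.

```python
import re

# Hypothetical per-word weights; a trained model would supply these.
WORD_WEIGHTS = {"waterproof": 3.0, "lightweight": 2.5, "breathable": 2.0, "free": 0.5}


def split_phrases(text):
    """Split recognized text into candidate phrases at punctuation."""
    return [p.strip() for p in re.split(r"[.,;!?\n]+", text) if p.strip()]


def score(phrase):
    """Average weight of the words in a phrase; unknown words weigh 0."""
    words = phrase.lower().split()
    return sum(WORD_WEIGHTS.get(w, 0.0) for w in words) / len(words)


def extract_selling_points(text, top_k=2, max_words=6):
    """Keep concise phrases only, then return the top_k by score."""
    concise = [p for p in split_phrases(text) if len(p.split()) <= max_words]
    return sorted(concise, key=score, reverse=True)[:top_k]
```

Averaging rather than summing the weights keeps long phrases from winning merely by containing more words, which fits the stated goal of concise copy.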
  • Step 406 perform special effect processing on the key information.
  • the special effect processing can be set according to business requirements specified by the user, so as to achieve the purpose of displaying various effects.
  • the special effect processing adds some light and shadow or particles to key information.
  • Step 407 Write the key information after the special effect processing into the product video.
  • The key information after the special effect processing displays the selling point in a variety of styles.
  • When the key information is text, a variety of text display effects are achieved.
  • Step 408 Perform filter and light and shadow effect processing on the commodity video with the key information written therein.
  • The filter is mainly used to realize various special effects of the image.
  • Light and shadow effect processing gives objects in the image the shadow effect formed by sunlight.
  • Applying several common filters and light and shadow effects to the product video lets it show different styles and makes it more watchable.
  • Step 409 Bind the product video in which the key information is written with the product code.
  • The product video and the product code are bound so that the product corresponding to the product code can be easily found. For example, if product videos for the codes of 5 products are generated together in a batch, the produced product videos are automatically associated with these 5 products.
  • the production efficiency of the product video can be improved, and the production time of a single product video can be about 40s.
  • product videos can be produced in batches, which greatly improves efficiency.
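Batch binding amounts to keeping a lookup from product code to generated video. The sketch below assumes a plain dictionary and a hypothetical `display_slot` field; the disclosure does not specify how bindings are stored.

```python
def bind_videos(videos_by_sku):
    """Bind each generated video to its product code and return the bindings."""
    bindings = {}
    for sku, video in videos_by_sku.items():
        if not sku or not video:
            raise ValueError("both a product code and a video are required")
        # "display_slot" is a hypothetical field naming the upload target.
        bindings[sku] = {"video": video, "display_slot": "main_image"}
    return bindings
```

Keying the bindings by product code means a batch of any size is associated with its products in one pass, and a later upload step can look a video up directly by code.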
  • Step 410 Upload the product video bound with the product code to the publicity display position of the product main image.
  • the generated product video will be uploaded to the publicity display position of the product main image.
  • The product video can be uploaded to the publicity display position of the product main image through a video review system, which checks whether the product video conforms to a preset video specification defined in a dedicated specification file.
  • Once the video review system determines that a product video has been generated, it reviews the video; after passing review, the video is displayed on the product details page (that is, at the publicity display position of the product's main image).
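A review gate of this kind can be sketched as a set of checks against a specification. The contents of `VIDEO_SPEC` are entirely hypothetical, since the text only says that a dedicated specification file exists; the `publish` callback is likewise an assumed stand-in for the upload.

```python
# Hypothetical specification values; the actual specification file is not disclosed.
VIDEO_SPEC = {
    "max_duration_s": 60.0,
    "min_width": 720,
    "min_height": 720,
    "allowed_formats": {"mp4"},
}


def review_video(meta, spec=VIDEO_SPEC):
    """Return the violated rules; an empty list means the video passes review."""
    violations = []
    if meta["duration_s"] > spec["max_duration_s"]:
        violations.append("duration")
    if meta["width"] < spec["min_width"] or meta["height"] < spec["min_height"]:
        violations.append("resolution")
    if meta["format"] not in spec["allowed_formats"]:
        violations.append("format")
    return violations


def upload_if_approved(meta, publish):
    """Publish the video to the display position only when review finds no violations."""
    violations = review_video(meta)
    if not violations:
        publish(meta["sku"])
    return violations
```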
  • The video generation method provided in this embodiment obtains, based on the product information, a product detail page related to the product information, extracts key information in the product detail page, performs special effect processing on the key information, writes the processed key information into the product video, and performs filter and light and shadow effect processing on the product video with the key information written in. The extracted key information captures the selling point of the product, and the special effect processing improves how the selling point is displayed; applying filters and light and shadow effects to the product video with the key information written in improves the overall coordination of the selling point within the video and the display effect of the video itself.
  • The present disclosure provides an embodiment of a video generating apparatus, which corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
  • an embodiment of the present disclosure provides a video generation apparatus 500 , and the apparatus 500 includes: a determination unit 501 , an acquisition unit 502 , and a generation unit 503 .
  • the determining unit 501 may be configured to determine a template from a plurality of category templates as a production template based on an input instruction of the user.
  • the obtaining unit 502 may be configured to obtain a product picture or a product video material related to the product information in response to determining that the category to which the product information input by the user belongs is matched with the category of the production template.
  • the generating unit 503 may be configured to process the product image or product video material based on the production template to generate a product video.
  • For the specific processing of the determining unit 501, the obtaining unit 502, and the generating unit 503 and the technical effects they bring, refer to step 201, step 202, and step 203 in the embodiment corresponding to FIG. 2, respectively.
  • The above generating unit 503 includes: a calculation module (not shown in the figure), an extraction module (not shown in the figure), and a generation module (not shown in the figure).
  • The calculation module can be configured to transform the music in the production template or the music in the product video material to the frequency domain, calculate the local extrema and dislocation convolution of the audio energy, and determine the accent points and the beats.
  • the extraction module may be configured to generate an initial video from the product image and extract a plurality of video segments with a preset duration from the initial video, or extract a plurality of video segments with a preset duration from a video material.
  • the generating module may be configured to combine multiple video segments in a transition animation manner based on accent points and beats to generate a product video.
  • The above obtaining unit 502 includes: a judgment module (not shown in the figure), an acquiring module (not shown in the figure), a determining module (not shown in the figure), a response module (not shown in the figure), and a prompting module (not shown in the figure).
  • The judgment module may be configured to determine whether the user has merchant registration information after the user logs in.
  • the acquiring module may be configured to acquire basic information of the user's products on sale based on the merchant registration information in response to the determination result that the user has the merchant registration information.
  • the determining module may be configured to determine the commodity information input by the user based on the user's operation on the basic information.
  • the response module may be configured to obtain a commodity picture or commodity video material related to the commodity information in response to determining that the category to which the commodity information input by the user belongs matches the category of the production template.
  • the prompting module may be configured to prompt the user to input commodity information in response to the judgment result that the user does not have the merchant registration information, and trigger the response module to work.
  • The above-mentioned apparatus 500 further includes: a detailing unit (not shown in the figure), an extraction unit (not shown in the figure), a special effect unit (not shown in the figure), and a processing unit (not shown in the figure).
  • the detailing unit may be configured to acquire a product detail page related to the product information based on the product information.
  • the extraction unit may be configured to extract key information in the product detail page.
  • the special effect unit can be configured to perform special effect processing on the key information, and write the key information after the special effect processing into the commodity video.
  • the processing unit may be configured to perform filter and light and shadow effect processing on the commodity video written with the key information.
  • The above-mentioned extraction unit is further configured to extract key information in the product detail page by using a language model, where the language model is trained based on the category of the production template.
  • The above-mentioned generating unit 503 includes: a preprocessing module (not shown in the figure), a recognition module (not shown in the figure), and a removal module (not shown in the figure).
  • the preprocessing module can be configured to preprocess the image of the product.
  • the recognition module may be configured to recognize the text region of the preprocessed image.
  • the removal module can be configured to remove the text content of the text area.
  • the above-mentioned apparatus 500 further includes: a binding unit (not shown in the figure), and an uploading unit (not shown in the figure).
  • the binding unit may be configured to bind the video of the product with the code of the product.
  • the uploading unit can be configured to upload the video of the product bound with the code of the product to the publicity display position of the main image of the product.
  • the above-mentioned apparatus 500 further includes: a sending unit (not shown in the figure).
  • The above-mentioned sending unit may be configured to send, in response to determining that the category to which the commodity information input by the user belongs does not match the category of the production template, prompt information prompting the user to replace the production template.
  • First, the determining unit 501 determines a template from a variety of category templates as a production template based on the user's input instruction; secondly, the obtaining unit 502 obtains, in response to determining that the category of the commodity information input by the user matches the category of the production template, the product pictures or product video materials related to the product information; finally, the generating unit 503 processes the product pictures or product video materials based on the production template and generates a product video. The production template is thus determined through interaction with the user, and the product video is generated from the production template and the obtained product pictures or product video materials, which lowers the threshold for video production, gives users a simple and convenient way to operate, and improves user experience.
  • Referring to FIG. 6, a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure is shown.
  • The electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604 .
  • The following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, etc.; output devices 607 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 608 including, for example, magnetic tapes, hard disks, etc.; and communication devices 609.
  • Communication means 609 may allow electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 6 shows electronic device 600 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 6 can represent one device, and can also represent multiple devices as required.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • The computer program may be downloaded and installed from a network via the communication device 609, or installed from the storage device 608 or the ROM 602.
  • When the computer program is executed by the processing device 601, the above-described functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium of the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optics, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal in baseband or propagated as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • The program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned server; or may exist alone without being assembled into the server.
  • The above-mentioned computer-readable medium carries one or more programs which, when executed by the server, cause the server to: determine, based on the input instruction of the user, a template from a variety of category templates as a production template; in response to determining that the category of the commodity information input by the user matches the category of the production template, obtain commodity pictures or commodity video materials related to the commodity information; and, based on the production template, process the commodity pictures or commodity video materials to generate a commodity video.
  • Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or hardware.
  • The described units may also be provided in a processor, which may, for example, be described as: a processor including a determination unit, an acquisition unit, and a generation unit. The names of these units do not in some cases constitute a limitation of the units themselves; for example, the determination unit can also be described as "a unit configured to determine, based on the user's input instruction, a template from a variety of category templates as the production template".

Abstract

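A video generation method and apparatus. The method includes: determining, based on a user's input instruction, one template from a plurality of category templates as a production template (201); in response to determining that the category to which product information input by the user belongs matches the category of the production template, acquiring product images or product video material related to the product information (202); and processing the product images or product video material based on the production template to generate a product video (203). The method simplifies the video production process and improves video production efficiency.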

Description

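Video generation method and apparatus, electronic device, and computer-readable medium
This patent application claims priority to Chinese patent application No. 202011192359.5, filed on October 30, 2020 and entitled "Video generation method, apparatus, electronic device, and computer-readable medium", the entire content of which is incorporated herein by reference.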
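Technical Field
The present disclosure relates to the field of computer technology, specifically to the field of computer vision, and in particular to a video generation method and apparatus, an electronic device, and a computer-readable medium.
Background
Video generation refers to editing video clips that conform to video semantics into a single video, and e-commerce scenarios require the generated video to show product characteristics from multiple aspects, dimensions, and angles.
Summary
Embodiments of the present disclosure propose a video generation method and apparatus, an electronic device, and a computer-readable medium.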
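In a first aspect, embodiments of the present disclosure provide a video generation method, the method including: determining, based on a user's input instruction, one template from a plurality of category templates as a production template; in response to determining that the category to which product information input by the user belongs matches the category of the production template, acquiring product images or product video material related to the product information; and processing the product images or product video material based on the production template to generate a product video.
In some embodiments, processing the product images or product video material based on the production template to generate a product video includes: transforming the music in the production template or the music in the product video material to the frequency domain, computing the local extrema and the dislocation convolution of the audio energy, and determining accent points and beats; generating an initial video from the product images and extracting a plurality of video segments of a preset duration from the initial video, or extracting a plurality of video segments of a preset duration from the video material; and merging the plurality of video segments with transition animations based on the accent points and beats to generate the product video.
In some embodiments, acquiring product images or product video material related to the product information in response to determining that the category to which the product information input by the user belongs matches the category of the production template includes: after the user logs in, judging whether the user has merchant registration information; in response to the judgment result being that the user has merchant registration information, acquiring, based on the merchant registration information, basic information of the products the user has on sale; determining the product information input by the user based on the user's operation on the basic information, and acquiring product images or product video material related to the product information; and in response to the judgment result being that the user does not have merchant registration information, prompting the user to input product information and, in response to determining that the category to which the product information input by the user belongs matches the category of the production template, acquiring product images or product video material related to the product information.
In some embodiments, the method further includes: acquiring, based on the product information, a product detail page related to the product information; extracting key information from the product detail page; performing special effect processing on the key information and writing the processed key information into the product video; and performing filter and light-and-shadow effect processing on the product video into which the key information has been written.
In some embodiments, extracting key information from the product detail page includes: extracting the key information from the product detail page using a language model, the language model being trained based on the category of the production template.
In some embodiments, processing the product images includes: preprocessing the product images; and recognizing the text area of the preprocessed images and removing the text content in the text area.
In some embodiments, the method further includes: binding the product video to the product's code; and uploading the product video bound to the product's code to the publicity display position of the product main image.
In some embodiments, the method further includes: in response to determining that the category to which the product information input by the user belongs does not match the category of the production template, sending prompt information prompting replacement of the production template.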
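In a second aspect, embodiments of the present disclosure provide a video generation apparatus, the apparatus including: a determining unit configured to determine, based on a user's input instruction, one template from a plurality of category templates as a production template; an obtaining unit configured to acquire, in response to determining that the category to which product information input by the user belongs matches the category of the production template, product images or product video material related to the product information; and a generating unit configured to process the product images or product video material based on the production template to generate a product video.
In some embodiments, the generating unit includes: a calculation module configured to transform the music in the production template or the music in the product video material to the frequency domain, compute the local extrema and the dislocation convolution of the audio energy, and determine accent points and beats; an extraction module configured to generate an initial video from the product images and extract a plurality of video segments of a preset duration from the initial video, or extract a plurality of video segments of a preset duration from the video material; and a generation module configured to merge the plurality of video segments with transition animations based on the accent points and beats to generate the product video.
In some embodiments, the obtaining unit includes: a judgment module configured to judge, after the user logs in, whether the user has merchant registration information; an acquiring module configured to acquire, in response to the judgment result being that the user has merchant registration information, basic information of the products the user has on sale based on the merchant registration information; a determining module configured to determine the product information input by the user based on the user's operation on the basic information; a response module configured to acquire, in response to determining that the category to which the product information input by the user belongs matches the category of the production template, product images or product video material related to the product information; and a prompting module configured to prompt, in response to the judgment result being that the user does not have merchant registration information, the user to input product information and to trigger the response module.
In some embodiments, the apparatus further includes: a detailing unit configured to acquire, based on the product information, a product detail page related to the product information; an extraction unit configured to extract key information from the product detail page; a special effect unit configured to perform special effect processing on the key information and write the processed key information into the product video; and a processing unit configured to perform filter and light-and-shadow effect processing on the product video into which the key information has been written.
In some embodiments, the extraction unit is further configured to extract the key information from the product detail page using a language model, the language model being trained based on the category of the production template.
In some embodiments, the generating unit includes: a preprocessing module configured to preprocess the product images; a recognition module configured to recognize the text area of the preprocessed images; and a removal module configured to remove the text content in the text area.
In some embodiments, the apparatus further includes: a binding unit configured to bind the product video to the product's code; and an uploading unit configured to upload the product video bound to the product's code to the publicity display position of the product main image.
In some embodiments, the apparatus further includes: a sending unit configured to send, in response to determining that the category to which the product information input by the user belongs does not match the category of the production template, prompt information prompting replacement of the production template.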
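In a third aspect, embodiments of the present disclosure provide an electronic device including: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium storing a computer program which, when executed by a processor, implements the method described in any implementation of the first aspect.
In the video generation method and apparatus provided by the embodiments of the present disclosure, first, one template is determined from a plurality of category templates as the production template based on the user's input instruction; secondly, in response to determining that the category to which the product information input by the user belongs matches the category of the production template, product images or product video material related to the product information are acquired; finally, the product images or product video material are processed based on the production template to generate a product video. The production template is thus determined through interaction with the user, and the product video is generated from the production template and the acquired product images or product video material, which simplifies the video production process and improves video production efficiency.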
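Brief Description of the Drawings
Other features, objects, and advantages of the present disclosure will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure may be applied;
FIG. 2 is a flowchart of one embodiment of the video generation method according to the present disclosure;
FIG. 3 is a flowchart of the method for acquiring product images or product video material related to product information according to the present disclosure;
FIG. 4 is a flowchart of another embodiment of the video generation method according to the present disclosure;
FIG. 5 is a schematic structural diagram of an embodiment of the video generation apparatus according to the present disclosure;
FIG. 6 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.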
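Detailed Description
The present disclosure is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which the video generation method of the present disclosure may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types and may typically include wireless communication links and the like.
The terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications, such as instant messaging tools and email clients, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be user equipment with communication and control functions, and the user equipment can communicate with the server 105. When they are software, they may be installed in the above user equipment; they may be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a backend video generation server that supports the image processing system on the terminal devices 101, 102, 103. The backend server may analyze and process information about the products sold online in the network and feed the processing result (such as a video generation result) back to the terminal devices.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the video generation method provided by the embodiments of the present disclosure is generally executed by the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.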
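FIG. 2 shows a process 200 of one embodiment of the video generation method according to the present disclosure. The video generation method includes the following steps:
Step 201: based on the user's input instruction, determine one template from a plurality of category templates as the production template.
In this embodiment, the execution body on which the video generation method runs may provide a video generation interface for users who need to make videos and display multiple category templates on that interface. The user inputs an instruction on the video generation interface, and the execution body determines the production template according to the instruction. The category templates distinguish products of different categories and show the distinct characteristics of each category. The categories include sports, leisure, and so on; for example, sportswear uses a sports template to make its main-image video, and the fast-paced audio and lively effects of the sports template better highlight the product's characteristics.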
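In this embodiment, each category template is a data structure that defines the music to be used in the video, the types of the various animations together with how and when they enter, the transitions, the text, the special effects, and so on. The category templates are the basis of video production; once the production template is determined, the transitions, special effects, music, selling points, and other information preset in it can be reused when making the product video.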
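One way to picture such a template is as a small record type. The field names below (`music`, `animations`, `transitions`, `texts`, `effects`) follow the items listed in the text, but the concrete layout, types, and example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Animation:
    kind: str       # type of animation, e.g. "zoom" (illustrative value)
    entry: str      # how the animation enters, e.g. "slide_left" (illustrative value)
    start_s: float  # when the animation enters, in seconds


@dataclass
class CategoryTemplate:
    category: str                      # e.g. "sports" or "leisure"
    music: str                         # music used by the video
    animations: List[Animation] = field(default_factory=list)
    transitions: List[str] = field(default_factory=list)
    texts: List[str] = field(default_factory=list)     # text placeholders such as selling points
    effects: List[str] = field(default_factory=list)   # special effects
```

Because every preset lives in one record, choosing the production template is enough to reuse its music, transitions, text slots, and effects for any product of that category.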
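In this embodiment, before making a product video, the user may select a category template according to the category. Optionally, the execution body may also record the usage count of each category template, incrementing the count each time a template is used. Further, the execution body may recommend category templates according to usage (for example, the three most used); specifically, it may add a recommendation tag to the three most used category templates, so that the user can pick a category template as the production template by preference or by recommendation.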
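The usage counting and top-three recommendation can be sketched with a counter. The class name `TemplateUsage` and its methods are hypothetical; only the behavior (increment on each use, tag the three most used) comes from the text.

```python
from collections import Counter


class TemplateUsage:
    """Track how often each category template is used and recommend the most used."""

    def __init__(self):
        self._counts = Counter()

    def record_use(self, template_id):
        # Each use of a template increments its usage count by one.
        self._counts[template_id] += 1

    def recommended(self, n=3):
        # Templates that would carry the recommendation tag: the n most used.
        return [template_id for template_id, _ in self._counts.most_common(n)]
```

For example, after `record_use` is called four times for one template and fewer times for the others, that template comes first in `recommended()`.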
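Step 202: in response to determining that the category of the product information input by the user matches the category of the production template, acquire product images or product video material related to the product information.
In this embodiment, after the user has input an instruction on the video generation interface and the production template has been determined, the execution body on which the video generation method runs may also obtain, through the video generation interface, the product information input by the user. The product information determines the material from which the video is generated, which may be product images or product video material.
In this embodiment, the product information input by the user includes the product code, product name, product cover image, product pictures, product promotional video, product explanation video, product demonstration video, and so on. Further, the execution body may use the product information input by the user (for example, the product code, product name, or product cover image) to acquire product images or product video material related to that information.
In a practical scenario, the execution body may also fetch the product's detail page from the web based on the product code input by the user and, based on the production template, generate the video by processing the detail page through steps such as intelligent image selection, removal of image clutter, intelligent cropping, and selling-point extraction.
In this embodiment, the product images may show the product in various dimensions and from various angles; the product video material may be product promotional videos, product explanation videos, product demonstration videos, and similar material.
In some optional implementations of this embodiment, the product information input by the user may be a product code, a product image, or product video material; it may also include a product code together with the product image or product video material corresponding to that code. In other words, the product information input by the user may be any one of a product code, a product image, and product video material. In this embodiment, a product video cannot be generated unless product information is obtained.
In this embodiment, the product information input by the user may belong to one category or to two or more categories. When it belongs to a single category, the category of the production template must be the same as that category. When it belongs to two or more categories, the production template may be a universal template. A universal template can be used for products of every category and carries no category-specific features; by contrast, a non-universal template such as a sports template adds sports-related elements or copy.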
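One way to read the matching rule above is sketched below. The `UNIVERSAL` marker and the function name are assumptions, and treating the universal template as acceptable for a single-category product is an interpretation of "usable for products of every category" rather than something the text states outright.

```python
UNIVERSAL = "universal"  # hypothetical marker for the universal template


def template_matches(template_category, product_categories):
    """Return True if the production template may be used for the product."""
    categories = set(product_categories)
    if not categories:
        return False  # no product information, so no video can be generated
    if template_category == UNIVERSAL:
        return True   # a universal template suits every category
    # A category-specific template requires exactly one, identical category.
    return categories == {template_category}
```

For example, `template_matches("sports", ["sports", "leisure"])` is false under this reading, because a product spanning two categories calls for the universal template.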
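Step 203: based on the production template, process the product images or product video material to generate a product video.
In this embodiment, the production template is the template that the product video to be generated follows; it provides a reference layout for the product video. The production template defines the music, the kinds of animation, the entry style and timing of animated characters, the transitions, the text, the special effects, and other content involved in the product video. The product images or product video material are processed according to the content defined by the production template to generate the product video.
In this embodiment, processing the product images or product video material may involve simple image operations such as panning and zooming, more complex transformations such as a high-tech flicker or 3D rotation, or animations created by a designer, which are typically animation formats built from multiple images through cutting, stitching, complex spatial motion, and similar means.
Optionally, when generating the product video based on the production template, product selling points may be added directly during video generation or after the product video has been generated. A product selling point is the wording and demonstration that a business distills to present the features and advantages of its product. Further, to make the product video more attractive, filters and special effects may be applied to it, giving the video different styles and making it more enjoyable to watch.
In some optional implementations of this embodiment, processing the product images or product video material based on the production template to generate a product video includes:
transforming the music in the production template or the music in the product video material into the frequency domain, computing the local extrema and the shifted convolution of the audio energy, and determining the accent points and the beat; generating an initial video from the product images and extracting multiple video segments of preset duration from the initial video, or extracting multiple video segments of preset duration from the video material; and, based on the accent points and the beat, merging the multiple video segments by means of transition animations to generate the product video.
In this optional implementation, various animation generation functions may be used to generate the initial video from one or more product images. A video summarization model may be used to extract multiple video segments of preset duration from the initial video or the video material.
In this optional implementation, a transition animation is also called a transition. Specifically, transitions may be implemented with OpenGL (Open Graphics Library). Transitions implemented with OpenGL in this optional implementation can provide nearly a hundred video transition effects, for example a fade-to-black followed by a fade-in.
In this optional implementation, the accent points of the music are determined by computing the local extrema of the audio energy, and the beat of the music is determined by shifted convolution. Merging the extracted video segments by means of transition animations based on the accent points and the beat ensures that the video's transition points are consistent with the rhythm of its soundtrack.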
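The rhythm analysis above can be illustrated with a small pure-Python sketch. It rests on several assumptions that are not in the disclosure: the audio is a list of mono samples, energy is measured per non-overlapping frame from a naive DFT, "shifted convolution" is read as correlating the energy curve with a shifted copy of itself, and all function names are hypothetical. A production system would more likely use an FFT library.

```python
import cmath
import math


def frame_energies(samples, frame_size):
    """Spectral energy of each non-overlapping frame, via a naive DFT."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        spectrum = [
            sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame))
            for k in range(frame_size)
        ]
        energies.append(sum(abs(c) ** 2 for c in spectrum) / frame_size)
    return energies


def accent_points(energies):
    """Frame indices that are local maxima of energy and above the mean."""
    mean = sum(energies) / len(energies)
    return [
        i for i in range(1, len(energies) - 1)
        if energies[i] > energies[i - 1]
        and energies[i] >= energies[i + 1]
        and energies[i] > mean
    ]


def beat_period(energies, min_lag=2):
    """Lag (in frames) that maximizes the shifted self-product of the energy curve."""
    mean = sum(energies) / len(energies)
    centered = [e - mean for e in energies]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, len(centered) // 2 + 1):
        score = sum(a * b for a, b in zip(centered, centered[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def snap_to_accents(cut_frames, accents):
    """Move each planned cut to the nearest accent frame."""
    return [min(accents, key=lambda a: abs(a - c)) for c in cut_frames]
```

Snapping the planned cut positions to accent frames is one simple way to keep transition points in step with the soundtrack; the beat period could additionally set the length of each transition.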
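In some optional implementations of this embodiment, processing the product images includes: preprocessing the product images; recognizing the text regions of the preprocessed images; and removing the text content of those regions.
In this optional implementation, preprocessing includes stitching images that were cut apart, splitting images that contain multiple subjects, screening and filtering images, removing image clutter, intelligently cropping and stitching images, and unifying image dimensions. Deep-learning OCR (Optical Character Recognition) may be used to recognize the text regions and text content of the preprocessed images. A conventional text-erasing model is applied to the text regions to erase the text content, so that the generated video is clear and tidy.
In this optional implementation, the product images are preprocessed first, so that when there are multiple product images they share a uniform size. Recognizing the text regions of the preprocessed images and removing their text content ensures that the generated product video is clear and tidy.
In other optional implementations of this embodiment, before the text content of the text regions is removed, a language model may also be used to extract key information from the text content, and the extracted key information may be written into the product video. In this optional implementation, the key information is the product's selling points in text form; writing them into the product video makes it easy for users to spot the selling points quickly.
In some optional implementations of this embodiment, in response to determining that the category of the product information input by the user does not match the category of the production template, a prompt suggesting that the production template be changed is sent.
In this optional implementation, when the category of the product information input by the user does not match the category of the production template, the user is promptly asked to input a new instruction. This keeps the style of the generated video as close as possible to the style the user wants, so that the product video generated afterward achieves the best production result.
In other optional implementations of this embodiment, in response to determining that the category of the product information input by the user does not match the category of the production template, matching templates may also be recommended to the user, achieving a better production result.
In some optional implementations of this embodiment, after the product video is generated, it may also be bound to the product code, and the product video bound to the product code may be uploaded to the promotional display slot of the product's main image.
In this optional implementation, binding the product video to the product code improves production efficiency, bringing the production time for a single product video to about 40 s. Product videos can also be produced in batches, which greatly improves efficiency.
The video generation method provided by the embodiments of the present disclosure first determines, based on a user's input instruction, one template from among multiple category templates as the production template; next, in response to determining that the category of the product information input by the user matches the category of the production template, it acquires product images or product video material related to the product information; finally, based on the production template, it processes the product images or product video material to generate a product video. Determining the production template through interaction with the user, and generating the product video from the production template and the acquired product images or product video material, simplifies the video production workflow and improves video production efficiency.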
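In this embodiment, the product information may concern the user's online products or the user's on-sale products. The execution body on which the video generation method runs may decide, based on the user's product registration information, whether to recommend product information to the user automatically. In some optional implementations of this embodiment, the method for acquiring product images or product video material related to the product information includes the following steps:
Step 301: after the user logs in, determine whether the user has merchant registration information. If the user has merchant registration information, proceed to step 302; if the user does not, proceed to step 306.
In this optional implementation, the user needs to register with the video generation system provided by the execution body and, after registering successfully, log in to the video generation system, select a production template through the video generation interface, and input product information.
In this optional implementation, merchant registration information means that the user has a merchant account registered in the video generation system. Through the merchant account it can be determined whether the user has on-sale products and, once that is confirmed, the basic information of those products can be obtained.
Step 302: based on the merchant registration information, obtain the basic information of the user's on-sale products, then proceed to step 303.
In this optional implementation, the basic information of an on-sale product refers to its code (SKU, Stock Keeping Unit), product name, product cover image, and similar information.
Step 303: based on the user's operation on the basic information, determine the product information input by the user, then proceed to step 304.
In this optional implementation, the basic information of the on-sale products may be displayed directly on the video generation interface. Referring to what is shown there, the user either clicks an item directly or enters the on-sale product's code, cover image, product name, and so on, which determines the product information input by the user. The basic information of the on-sale products may also be displayed on the operation interface shown when the user logs in; after the user clicks an item or enters the on-sale product's code, cover image, product name, and so on, that operation interface is no longer displayed.
In this optional implementation, when the user has a merchant account, the execution body may obtain the merchant's on-sale products from the back-end server, so that when producing a video the merchant user can select the required product information directly from the basic information of the on-sale products instead of typing it in.
Further, because the product information input by the user is obtained from the basic information of the on-sale products, selection is more convenient for the user, particularly when the user has forgotten or cannot determine the product information.
Step 304: in response to determining that the category of the product information input by the user matches the category of the production template, acquire product images or product video material related to the product information, then proceed to step 305.
In this optional implementation, when the user does not have a merchant account, the user must directly input the product information of a product on sale in the online mall. This product information includes the product code, the product link, and so on, and the video is then generated from it. Optionally, the video may be generated from the product SKUID or SKU link. Optionally, the video may also be generated from images or video material added by the user.
Step 305: exit.
Step 306: prompt the user to input product information, then proceed to step 304.
In this optional implementation, preset prompt information may be shown on the operation interface displayed when the user logs in, to prompt the user to input product information. That operation interface may also be the interface in which the user inputs the product information.
In this optional implementation, after a user with a merchant account adds product information, the system again checks whether the category of the product information matches the category of the production template; if it does not, matching templates are recommended so that the product video achieves a better production result.
In the method for acquiring product images or product video material related to product information provided by this optional implementation, when the user has merchant registration information, the basic information of the user's on-sale products is determined from that registration information, and the product information input by the user is determined from the user's operation on the basic information. Product information is thus recommended to the user automatically during the interaction, based on the basic information of the user's on-sale products, which improves video production efficiency.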
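The branching in steps 301 to 306 can be sketched as two small functions. Everything named here is hypothetical: the dictionary keys, the callbacks standing in for user interaction, and the lookup table standing in for the material source are assumptions made for illustration.

```python
def resolve_product_info(merchant_registration, choose_from, prompt_user):
    """Steps 301-303 and 306: decide where the product information comes from."""
    if merchant_registration is not None:           # step 301: merchant info exists
        basics = merchant_registration["on_sale"]   # step 302: on-sale basic info
        return choose_from(basics)                  # step 303: user picks an item
    return prompt_user()                            # step 306: user types it in


def acquire_materials(product_info, template_category, material_index):
    """Step 304: fetch materials only when the categories match."""
    if product_info["category"] != template_category:
        return None  # mismatch: the caller may prompt for another template
    return material_index.get(product_info["sku"], [])
```

Passing the interaction in as callbacks keeps the branching logic separate from any particular user interface.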
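To give the generated video a better presentation, reference is further made to FIG. 4, which shows a flow 400 of another embodiment of the video generation method of the present disclosure. The video generation method includes the following steps:
Step 401: based on a user's input instruction, determine one template from among multiple category templates as the production template.
Step 402: in response to determining that the category of the product information input by the user matches the category of the production template, acquire product images or product video material related to the product information.
Step 403: based on the production template, process the product images or product video material to generate a product video.
It should be understood that the operations and features in steps 401 to 403 correspond to those in steps 201 to 203 respectively. The descriptions of the operations and features given above for steps 201 to 203 therefore apply equally to steps 401 to 403 and are not repeated here.
Step 404: based on the product information, acquire the product detail page related to the product information.
In this embodiment, the product detail page is the detail page created by the product's seller. It carries detailed descriptive information such as the product's place of origin, manufacturer, specifications, and scope of application, and this information may include videos, images, and text descriptions of the product. An example is the product detail page that can be viewed by clicking the image in a product's promotional display slot on a web page.
Step 405: extract the key information from the product detail page.
In this embodiment, the key information may be information that characterizes the product, such as text, images, or video. The extracted key information can convey the product's selling points.
In some optional implementations of this embodiment, the key information may be text, and extracting the key information from the product detail page includes: extracting the key information from the product detail page with a language model, the language model being trained on the category of the production template.
A different BERT (Bidirectional Encoder Representations from Transformers) language model may be trained for each product category. The language model can perform word segmentation, weight assignment, and semantic understanding on the text recognized by OCR, distilling several concise phrases from long passages or multiple sentences to serve as selling-point copy for the product. Weight assignment works as follows: a portion of the product copy in the language model's samples is selected as annotation examples; after word segmentation, it is given to an annotation team to label according to business needs; the language model is then trained on this data to obtain a weight for each word.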
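The role of the per-word weights can be illustrated with a deliberately simple stand-in. This sketch is not BERT and is not the disclosed model: it assumes the phrases are already segmented into whitespace-separated words and that a table of learned word weights is available, and it ranks phrases by their average word weight. The function name and sample data are hypothetical.

```python
def top_selling_points(phrases, word_weights, k=2):
    """Rank candidate phrases by mean word weight and keep the best k."""

    def score(phrase):
        words = phrase.lower().split()
        # Words without a learned weight contribute nothing.
        return sum(word_weights.get(w, 0.0) for w in words) / max(len(words), 1)

    return sorted(phrases, key=score, reverse=True)[:k]


# Made-up weights of the kind the annotation step might produce.
weights = {"breathable": 0.9, "mesh": 0.5, "lightweight": 0.8,
           "cushioned": 0.7, "sole": 0.2, "box": 0.1}
candidates = ["breathable mesh upper", "ships in a box",
              "lightweight cushioned sole"]
selling_points = top_selling_points(candidates, weights)
```

Averaging rather than summing keeps long phrases from winning merely because they contain more words.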
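In this optional implementation, the language model can effectively extract the key information from the text of the product detail page, which improves the efficiency of selling-point extraction.
Step 406: apply special effects to the key information.
In this embodiment, the special effects may be configured according to business needs specified by the user so as to achieve a variety of presentation effects. For example, the special effects may add lighting or particles to the key information.
Step 407: write the key information, after special effects have been applied, into the product video.
In this embodiment, the key information with special effects applied presents the selling points in a variety of styles. For example, when the key information is text, a variety of text presentation effects are achieved.
Step 408: apply filter and lighting effects to the product video into which the key information has been written.
In this embodiment, filters are mainly used to achieve various special image effects. Lighting effects give the objects in the image shadows like those formed under sunlight. Applying several common filters and lighting effects to the product video gives it different styles and makes it more enjoyable to watch.
Step 409: bind the product video into which the key information has been written to the product code.
In this embodiment, binding the product video to the product code makes it easy to find the product corresponding to the product code. For example, if product videos are generated in one batch for five product codes, the produced videos are automatically associated with those five products. Binding the product video to the product code improves production efficiency, bringing the production time for a single product video to about 40 s. Product videos can also be produced in batches, which greatly improves efficiency.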
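The batch binding described above amounts to keeping a mapping from product code to video. A minimal sketch, assuming a hypothetical `render_video` callable that produces a video (here just a file name) for one product code:

```python
def generate_and_bind(sku_codes, render_video):
    """Render one video per product code and bind each video to its code."""
    bindings = {}
    for sku in sku_codes:
        bindings[sku] = render_video(sku)
    return bindings
```

With the mapping in hand, a later upload step can look up the video for any product code without further matching.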
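Step 410: upload the product video bound to the product code to the promotional display slot of the product's main image.
In this embodiment, the generated product video is uploaded to the promotional display slot of the product's main image. When browsing products, users can watch the product video to learn about the product's appearance, features, and selling points from every angle. Further, the product video may be uploaded to the promotional display slot of the product's main image through a video review system. The video review system checks whether the product video complies with preset video specifications, which are governed by a dedicated specification document.
Once the video review system determines that video generation is complete, it reviews the product video; if the review passes, the video is shown at its position on the product detail page (that is, the promotional display slot of the product's main image).
In this embodiment, uploading the product video to the promotional display slot of the product's main image strengthens promotion of the product's features, attracts buyers, and drives traffic toward orders through to conversion.
The video generation method provided by this embodiment acquires, based on the product information, the product detail page related to the product information; extracts the key information from the product detail page; applies special effects to the key information and writes the result into the product video; and applies filter and lighting effects to the product video into which the key information has been written. The extracted key information yields the product's selling points; applying special effects to the key information improves how the selling points are displayed; and applying filter and lighting effects to the product video improves both how well the selling points blend into the video as a whole and the display quality of the product video.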
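With further reference to FIG. 5, as an implementation of the methods shown in the figures above, the present disclosure provides an embodiment of a video generation apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
As shown in FIG. 5, an embodiment of the present disclosure provides a video generation apparatus 500, which includes a determination unit 501, an acquisition unit 502, and a generation unit 503. The determination unit 501 may be configured to determine, based on a user's input instruction, one template from among multiple category templates as the production template. The acquisition unit 502 may be configured to acquire, in response to determining that the category of the product information input by the user matches the category of the production template, product images or product video material related to the product information. The generation unit 503 may be configured to process the product images or product video material based on the production template to generate a product video.
In this embodiment, for the specific processing of the determination unit 501, the acquisition unit 502, and the generation unit 503 of the video generation apparatus 500 and the technical effects they bring, refer to steps 201, 202, and 203 of the embodiment corresponding to FIG. 2, respectively.
In some embodiments, the generation unit 503 includes a calculation module (not shown), an extraction module (not shown), and a generation module (not shown). The calculation module may be configured to transform the music in the production template or the music in the product video material into the frequency domain, compute the local extrema and the shifted convolution of the audio energy, and determine the accent points and the beat. The extraction module may be configured to generate an initial video from the product images and extract multiple video segments of preset duration from the initial video, or to extract multiple video segments of preset duration from the video material. The generation module may be configured to merge the multiple video segments by means of transition animations, based on the accent points and the beat, to generate the product video.
In some embodiments, the acquisition unit 502 includes a judgment module (not shown), an acquisition module (not shown), a determination module (not shown), a response module (not shown), and a prompt module (not shown). The judgment module may be configured to determine, after the user logs in, whether the user has merchant registration information. The acquisition module may be configured to obtain, in response to the user having merchant registration information, the basic information of the user's on-sale products based on that registration information. The determination module may be configured to determine the product information input by the user based on the user's operation on the basic information. The response module may be configured to acquire, in response to determining that the category of the product information input by the user matches the category of the production template, product images or product video material related to the product information. The prompt module may be configured to prompt the user, in response to the user not having merchant registration information, to input product information, and to trigger the response module.
In some embodiments, the apparatus 500 further includes a detail unit (not shown), an extraction unit (not shown), a special-effects unit (not shown), and a processing unit (not shown). The detail unit may be configured to acquire, based on the product information, the product detail page related to the product information. The extraction unit may be configured to extract the key information from the product detail page. The special-effects unit may be configured to apply special effects to the key information and write the result into the product video. The processing unit may be configured to apply filter and lighting effects to the product video into which the key information has been written.
In some embodiments, the extraction unit is further configured to extract the key information from the product detail page with a language model, the language model being trained on the category of the production template.
In some embodiments, the generation unit 503 includes a preprocessing module (not shown), a recognition module (not shown), and a removal module (not shown). The preprocessing module may be configured to preprocess the product images. The recognition module may be configured to recognize the text regions of the preprocessed images. The removal module may be configured to remove the text content of the text regions.
In some embodiments, the apparatus 500 further includes a binding unit (not shown) and an upload unit (not shown). The binding unit may be configured to bind the product video to the product code. The upload unit may be configured to upload the product video bound to the product code to the promotional display slot of the product's main image.
In some embodiments, the apparatus 500 further includes a sending unit (not shown). The sending unit may be configured to send, in response to determining that the category of the product information input by the user does not match the category of the production template, a prompt suggesting that the production template be changed.
In the video generation apparatus provided by the embodiments of the present disclosure, the determination unit 501 first determines, based on a user's input instruction, one template from among multiple category templates as the production template; next, the acquisition unit 502, in response to determining that the category of the product information input by the user matches the category of the production template, acquires product images or product video material related to the product information; finally, the generation unit 503 processes the product images or product video material based on the production template to generate a product video. Determining the production template through interaction with the user, and generating the product video from the production template and the acquired product images or product video material, lowers the barrier to video production, gives users a simple and convenient way to operate, makes the apparatus easier to use, and improves the user experience.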
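Reference is now made to FIG. 6, which shows a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 600 may include a processing device (for example, a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data needed for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, and mouse; output devices 607 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 608 including, for example, magnetic tape and hard disks; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 6 shows an electronic device 600 with various devices, it should be understood that not all of the devices shown need to be implemented or present. More or fewer devices may alternatively be implemented or present. Each block shown in FIG. 6 may represent one device or, as needed, multiple devices.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium of the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example and without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these. In the embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wire, optical cable, RF (Radio Frequency), and so on, or any suitable combination of these.
The computer-readable medium may be contained in the server described above, or it may exist separately without being assembled into the server. The computer-readable medium carries one or more programs which, when executed by the server, cause the server to: determine, based on a user's input instruction, one template from among multiple category templates as the production template; in response to determining that the category of the product information input by the user matches the category of the production template, acquire product images or product video material related to the product information; and process the product images or product video material based on the production template to generate a product video.
Computer program code for carrying out the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination of them, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including a determination unit, an acquisition unit, and a generation unit. The names of these units do not, in certain circumstances, limit the units themselves; for example, the determination unit may also be described as a unit "configured to determine, based on a user's input instruction, one template from among multiple category templates as the production template".
The above description covers only the preferred embodiments of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept described above, for example technical solutions formed by interchanging the above features with technical features of similar function disclosed in (but not limited to) the embodiments of the present disclosure.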

Claims (18)

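  1. A video generation method, the method comprising:
    determining, based on a user's input instruction, one template from among multiple category templates as a production template;
    in response to determining that the category of product information input by the user matches the category of the production template, acquiring product images or product video material related to the product information;
    processing the product images or product video material based on the production template to generate a product video.
  2. The method according to claim 1, wherein processing the product images or product video material based on the production template to generate a product video comprises:
    transforming the music in the production template or the music in the product video material into the frequency domain, computing the local extrema and the shifted convolution of the audio energy, and determining accent points and a beat;
    generating an initial video from the product images and extracting multiple video segments of preset duration from the initial video, or extracting multiple video segments of preset duration from the video material;
    merging the multiple video segments by means of transition animations, based on the accent points and the beat, to generate the product video.
  3. The method according to any one of claims 1-2, wherein, in response to determining that the category of the product information input by the user matches the category of the production template, acquiring product images or product video material related to the product information comprises:
    after the user logs in, determining whether the user has merchant registration information;
    in response to the user having merchant registration information, obtaining basic information of the user's on-sale products based on the merchant registration information;
    determining the product information input by the user based on the user's operation on the basic information; in response to determining that the category of the product information input by the user matches the category of the production template, acquiring product images or product video material related to the product information;
    in response to the user not having merchant registration information, prompting the user to input product information, and, in response to determining that the category of the product information input by the user matches the category of the production template, acquiring product images or product video material related to the product information.
  4. The method according to any one of claims 1-3, the method further comprising:
    acquiring, based on the product information, a product detail page related to the product information;
    extracting key information from the product detail page;
    applying special effects to the key information and writing the key information with special effects applied into the product video;
    applying filter and lighting effects to the product video into which the key information has been written.
  5. The method according to claim 4, wherein extracting the key information from the product detail page comprises:
    extracting the key information from the product detail page with a language model, the language model being trained on the category of the production template.
  6. The method according to any one of claims 1-5, wherein processing the product images comprises:
    preprocessing the product images;
    recognizing text regions of the preprocessed images;
    removing the text content of the text regions.
  7. The method according to any one of claims 1-6, the method further comprising:
    binding the product video to a product code;
    and uploading the product video bound to the product code to a promotional display slot of the product's main image.
  8. The method according to any one of claims 1-7, the method further comprising:
    in response to determining that the category of the product information input by the user does not match the category of the production template, sending a prompt suggesting that the production template be changed.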
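  9. A video generation apparatus, the apparatus comprising:
    a determination unit configured to determine, based on a user's input instruction, one template from among multiple category templates as a production template;
    an acquisition unit configured to acquire, in response to determining that the category of product information input by the user matches the category of the production template, product images or product video material related to the product information;
    a generation unit configured to process the product images or product video material based on the production template to generate a product video.
  10. The apparatus according to claim 9, wherein the generation unit comprises:
    a calculation module configured to transform the music in the production template or the music in the product video material into the frequency domain, compute the local extrema and the shifted convolution of the audio energy, and determine accent points and a beat;
    an extraction module configured to generate an initial video from the product images and extract multiple video segments of preset duration from the initial video, or to extract multiple video segments of preset duration from the video material;
    a generation module configured to merge the multiple video segments by means of transition animations, based on the accent points and the beat, to generate the product video.
  11. The apparatus according to any one of claims 9-10, wherein the acquisition unit further comprises:
    a judgment module configured to determine, after the user logs in, whether the user has merchant registration information;
    an acquisition module configured to obtain, in response to the user having merchant registration information, basic information of the user's on-sale products based on the merchant registration information;
    a determination module configured to determine the product information input by the user based on the user's operation on the basic information;
    a response module configured to acquire, in response to determining that the category of the product information input by the user matches the category of the production template, product images or product video material related to the product information; and
    a prompt module configured to prompt the user, in response to the user not having merchant registration information, to input product information, and to trigger the response module.
  12. The apparatus according to any one of claims 9-11, wherein the apparatus further comprises:
    a detail unit configured to acquire, based on the product information, a product detail page related to the product information;
    an extraction unit configured to extract key information from the product detail page;
    a special-effects unit configured to apply special effects to the key information and write the key information with special effects applied into the product video; and
    a processing unit configured to apply filter and lighting effects to the product video into which the key information has been written.
  13. The apparatus according to claim 12, wherein the extraction unit is further configured to extract the key information from the product detail page with a language model, the language model being trained on the category of the production template.
  14. The apparatus according to any one of claims 9-13, wherein the generation unit comprises:
    a preprocessing module configured to preprocess the product images;
    a recognition module configured to recognize text regions of the preprocessed images; and
    a removal module configured to remove the text content of the text regions.
  15. The apparatus according to any one of claims 9-14, wherein the apparatus further comprises:
    a binding unit configured to bind the product video to a product code; and
    an upload unit configured to upload the product video bound to the product code to a promotional display slot of the product's main image.
  16. The apparatus according to any one of claims 9-15, wherein the apparatus further comprises:
    a sending unit configured to send, in response to determining that the category of the product information input by the user does not match the category of the production template, a prompt suggesting that the production template be changed.
  17. An electronic device, comprising:
    one or more processors;
    a storage device on which one or more programs are stored;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-8.
  18. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-8.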
PCT/CN2021/126427 2020-10-30 2021-10-26 Video generation method and apparatus, electronic device, and computer-readable medium WO2022089427A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/033,671 US20230396857A1 (en) 2020-10-30 2021-10-26 Video generation method and apparatus, and electronic device and computer-readable medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011192359.5A CN113781140A (zh) 2020-10-30 2020-10-30 Video generation method and apparatus, electronic device, and computer-readable medium
CN202011192359.5 2020-10-30

Publications (1)

Publication Number Publication Date
WO2022089427A1 true WO2022089427A1 (zh) 2022-05-05

Family

ID=78835160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126427 WO2022089427A1 (zh) 2020-10-30 2021-10-26 Video generation method and apparatus, electronic device, and computer-readable medium

Country Status (3)

Country Link
US (1) US20230396857A1 (zh)
CN (1) CN113781140A (zh)
WO (1) WO2022089427A1 (zh)

Also Published As

Publication number Publication date
US20230396857A1 (en) 2023-12-07
CN113781140A (zh) 2021-12-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21885154

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21885154

Country of ref document: EP

Kind code of ref document: A1