WO2022228557A1 - Clip template search method and device - Google Patents

Clip template search method and device

Info

Publication number: WO2022228557A1 (application PCT/CN2022/090348)
Authority: WO - WIPO (PCT)
Prior art keywords: dimension, template, target, multimedia resource, editing
Other languages: English (en), French (fr)
Inventor
李根
周颖枝
崔冉
张映
Original Assignee
北京字跳网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Priority to JP2023566920A (published as JP2024516836A)
Priority to EP22795026.8A (published as EP4322025A4)
Publication of WO2022228557A1
Priority to US18/484,933 (published as US20240037134A1)

Classifications

    • G06F 16/783 - Retrieval of video data characterised by using metadata automatically derived from the content
    • G06F 16/435 - Querying multimedia data; filtering based on additional data, e.g. user or group profiles
    • G06F 16/48 - Retrieval of multimedia data characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/438 - Querying multimedia data; presentation of query results
    • G06F 16/483 - Retrieval of multimedia data characterised by using metadata automatically derived from the content
    • G11B 27/034 - Electronic editing of digitised analogue information signals, e.g. audio or video signals, on discs

Definitions

  • the present disclosure relates to the field of Internet technologies, and in particular, to a method and device for searching for clip templates.
  • A video editing application usually provides a wealth of clip templates; a user can apply a clip template to selected photos or videos to obtain a composite video.
  • How to quickly and accurately find the clip template a user wants is therefore a pressing problem.
  • the present disclosure provides a clip template search method and apparatus.
  • an embodiment of the present disclosure provides a method for searching for a clip template, including:
  • the search result is used to indicate whether a target editing template matching the first multimedia resource is found, and the target editing template is used to indicate that multimedia material to be edited is edited into a second multimedia resource according to a target editing method.
  • the target editing method is the editing method adopted by the first multimedia resource.
  • the searching according to the first multimedia resource to obtain search results includes:
  • the identification result of the candidate clip template in the target dimension is obtained according to the first multimedia resource and the feature of the candidate clip template in the target dimension.
  • before the search result is acquired according to the identification result of the candidate clip template in the target dimension, the method further includes:
  • the at least one editing template includes the candidate editing template.
  • the target dimension includes one or more of: a music style dimension, an audio fingerprint dimension, a video size dimension, a video segment feature dimension, and a visual effect dimension.
  • the target dimension includes a plurality of the music genre dimension, the audio fingerprint dimension, the video size dimension, the video segment feature dimension, and the visual effects dimension;
  • the obtaining the search result according to the identification result of the candidate clip template in the target dimension includes:
  • the search result is acquired according to the weighted calculation result corresponding to the candidate clip template.
  • the determining of the candidate editing template according to the characteristics of the first multimedia resource and the at least one editing template on the target dimension respectively includes:
  • the first recognition result of each editing template in the first screening result in the current dimension is obtained according to the features, in the current dimension, of the first multimedia resource and of the editing templates in the first screening result;
  • the second screening result includes: one or more editing templates selected from the first screening result;
  • in the initial state, the first screening result includes: the at least one clip template.
  • the obtaining of the first multimedia resource specified by the user includes: obtaining a target link input by the user, and parsing the target link to obtain the first multimedia resource.
  • the method further includes sending the search results to the user.
  • an embodiment of the present disclosure provides a clip template search device, including:
  • an acquisition module for acquiring the first multimedia resource specified by the user
  • a search module configured to perform a search according to the first multimedia resource and obtain a search result; the search result is used to indicate whether a target clip template matching the first multimedia resource is found, the target clip template is used to indicate that multimedia material to be edited is edited into a second multimedia resource according to a target editing method, and the target editing method is the editing method adopted by the first multimedia resource.
  • embodiments of the present disclosure provide an electronic device, including: a memory, a processor, and a computer program;
  • the memory is configured to store the computer program
  • the processor is configured to execute the computer program to implement the method of any one of the first aspects.
  • an embodiment of the present disclosure provides a readable storage medium, including: a computer program; when the computer program is executed by a processor, the method of any one of the first aspects is implemented.
  • an embodiment of the present disclosure further provides a program product, including: a computer program stored in a readable storage medium; at least one processor of an electronic device can read the computer program from the readable storage medium, and execution of the computer program by the at least one processor causes the electronic device to implement the method according to any one of the first aspects.
  • Embodiments of the present disclosure provide a method and apparatus for searching for a clip template, wherein the method includes: in a video editing scenario, the server device obtains a first multimedia resource specified by a user and performs a search according to the first multimedia resource.
  • the target editing template is used to instruct the multimedia material to be edited to be edited into the second multimedia resource according to the target editing method.
  • searching according to the first multimedia resource improves the accuracy of search results, can better meet the needs of users for video creation, and can also improve the utilization rate of the target editing template.
  • FIG. 1 is a schematic diagram of an application scenario of a method for searching for a clip template provided by an embodiment of the present disclosure
  • FIG. 2 is a flowchart of a clip template search method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a search model provided by an embodiment of the present disclosure.
  • FIG. 4 is a flowchart of a method for searching for a clip template provided by another embodiment of the present disclosure
  • FIG. 5 is a schematic structural diagram of a search model provided by another embodiment of the present disclosure.
  • FIG. 6 is a flowchart of a method for searching for a clip template provided by another embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a clip template search method provided by another embodiment of the present disclosure.
  • FIGS. 8A to 8K are schematic diagrams of human-computer interaction interfaces provided by the present disclosure.
  • FIG. 9 is a schematic structural diagram of a clip template search apparatus provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • Existing applications usually support searching for editing templates by keyword.
  • However, the user may not know the correct keyword, and the clip template found with the keyword the user enters may not be the one the user wants, so the user may give up on further video creation.
  • To this end, the present disclosure provides a clip template search method, the core idea of which is: by acquiring the first multimedia resource specified by the user and analyzing the first multimedia resource against the candidate clip templates, the target editing template the user wants is quickly located among the candidate editing templates.
  • This improves the accuracy of the search results, better meets the user's needs for video creation, and also improves the utilization rate of the target editing template.
  • FIG. 1 is a schematic diagram of an application scenario of a method for searching for a clip template provided by an embodiment of the present disclosure.
  • the clip template search method provided in this embodiment can be applied to the scene shown in FIG. 1 .
  • the scenario includes: a server device 101 and a terminal device 102.
  • the server device 101 and the terminal device 102 may be connected through a wired or wireless network.
  • the server device 101 may be implemented in any software and/or hardware manner.
  • the server device 101 may be a server, and the server may be an independent server, a server cluster composed of multiple independent servers, or a cloud server.
  • the server device 101 may also be a software program integrated in the electronic device.
  • the software program is executed by at least one processor of the electronic device, the technical solution executed by the server device in the clip template search method provided by the embodiment of the present disclosure can be executed.
  • the server device can interact with one or more terminal devices at the same time, and send the same or different data to the terminal devices.
  • the terminal device 102 may be implemented in any software and/or hardware manner.
  • the terminal device 102 may be, but is not limited to, a notebook computer, a desktop computer, a smart phone, a portable terminal device, a wearable device, a personal digital assistant (PDA), or a similar device; the present disclosure imposes no restriction on this.
  • the terminal device 102 may also be a software program integrated in the electronic device. When the software program is executed by the processor of the electronic device, the technical solution executed by the terminal device in the clip template search method provided by the embodiment of the present disclosure may be executed.
  • FIG. 1 exemplarily shows a scenario in which one server device interacts with one terminal device.
  • the server device can interact with more terminal devices in parallel.
  • FIG. 2 is a flowchart of a method for searching a clip template according to an embodiment of the present disclosure.
  • the execution body of this embodiment may be a server device. Referring to Figure 2, this embodiment includes:
  • the server device acquires the first multimedia resource designated by the user through the terminal device.
  • the first multimedia resource may be an audio file or a video file.
  • the first multimedia resource may also be a multimedia resource (eg, a short video) obtained by performing video creation according to the target editing template.
  • the first multimedia resource may be actively reported by the user through the terminal device.
  • it may also be specified by the user in other ways; for example, the user reports a target link through the terminal device, and the server device obtains the first multimedia resource by parsing the target link.
  • multiple application programs are installed on the terminal device, wherein the multiple application programs include: a video editing application program and other application programs.
  • the target link may be copied by the user from any one of the above other applications and pasted into the video editing application.
  • the above other applications can be, but are not limited to, music applications, video applications, social applications, and so on.
  • the target link may also be manually input by the user to the video editing application installed on the terminal device.
  • the present disclosure does not limit the source of the target link and the manner in which the terminal device obtains the target link.
  • After the video editing application obtains the target link, it parses the link, retrieves the webpage corresponding to the target link, and obtains from the webpage the uniform resource locator (URL) of the first multimedia resource the link points to;
  • the terminal device then accesses and downloads the first multimedia resource according to its URL, and afterwards uploads the obtained resource to the server device through the communication link established between the terminal device and the server device.
  • Alternatively, when the video editing application obtains the target link, it can send the target link to the server device through the communication link established between the terminal device and the server device; after receiving the target link, the server device parses it, obtains the webpage corresponding to the target link, obtains from the webpage the URL of the first multimedia resource, and then accesses and downloads the first multimedia resource according to that URL.
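As an illustration of the link-parsing step above, a minimal Python sketch is given below. The assumption that the share page exposes the media address in a `<video src="...">` tag is hypothetical; the disclosure does not specify the page layout, and real share pages may embed the address differently.

```python
import re

def extract_media_url(html):
    """Pull the media URL out of the webpage behind a target link.

    Hypothetical sketch: we assume the page carries the address in a
    <video src="..."> tag; real share pages may embed it differently.
    """
    match = re.search(r'<video[^>]+src="([^"]+)"', html)
    return match.group(1) if match else None
```

Whichever side performs the download (terminal device or server device) would fetch the page, apply a helper like this, and then download the resource at the returned URL.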
  • The purpose of this step is to search for a target editing template whose editing method is the same as the editing method used by the first multimedia resource.
  • The search result is used to indicate whether a target clip template matching the first multimedia resource is found.
  • a template library deployed on the server device is searched to obtain a search result.
  • the template library deployed on the server device includes: at least one clip template.
  • Each editing template is used to provide a preset editing mode, and the multimedia material to be edited selected or imported by the user can be edited into a new multimedia resource according to the preset editing mode provided by the editing template.
  • the server device may obtain the search result by performing an analysis on the target dimension on the first multimedia resource and each clip template in the template library.
  • the target dimension may include one or more dimensions of music style dimension, audio fingerprint dimension, video size dimension, video segment feature dimension, and visual effect dimension.
  • a trained search model may be pre-deployed on the server device, and after the server device acquires the first multimedia resource specified by the user, the search model may be used to perform the above search.
  • other methods or algorithms can also be used to perform the above search.
  • In summary, the server device obtains the first multimedia resource specified by the user, searches the template library according to the first multimedia resource, and obtains the target clip template matching it.
  • In this solution, searching by the first multimedia resource itself improves the accuracy of the search results, better meets the user's needs for video creation, and also improves the utilization rate of the target editing template.
  • How the search model searches according to the first multimedia resource and the clip templates included in the template library is described in detail below for two cases: the first multimedia resource being an audio file, and the first multimedia resource being a video file.
  • The server device performing the search and the server device storing the template library may be the same device or different devices, which is not limited in the embodiments of the present disclosure.
  • Case 1: the first multimedia resource is an audio file.
  • FIG. 3 is a schematic structural diagram of a search model provided by an embodiment of the present disclosure
  • FIG. 4 is a flowchart of a method for searching a clip template provided by another embodiment of the present disclosure.
  • the search model 300 includes: a music style identification sub-model 301 and an audio fingerprint identification sub-model 302 .
  • the music style recognition sub-model 301 is used to output, according to the first multimedia resource and the audio file contained in each template video, the first recognition result corresponding to each clip template in the music style dimension; the first recognition result is used to indicate the similarity in music style between the first multimedia resource and the audio file contained in the template video.
  • the music styles can be pre-divided into multiple styles, for example, the music styles include: sad, quiet, popular, cheerful, relaxed, sweet, happy and so on.
  • the specific classification of music styles is not limited in the embodiments of the present disclosure.
  • the music style recognition sub-model 301 obtains, through a specific algorithm, a first feature vector corresponding to the first multimedia resource and a second feature vector for the audio file included in the clip template; it then calculates the distance between the first feature vector and the second feature vector (for example a Euclidean distance, though distances obtained by other algorithms can also be used), and obtains the first recognition result corresponding to the clip template according to the calculated distance.
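The distance computation described above can be sketched as follows. Mapping the distance to a similarity score with 1 / (1 + d) is one possible choice for illustration, not something mandated by the disclosure.

```python
import math

def euclidean(v1, v2):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def style_similarity(v1, v2):
    """Map the distance to a (0, 1] score: identical vectors score 1.0."""
    return 1.0 / (1.0 + euclidean(v1, v2))
```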
  • the audio fingerprint identification sub-model 302 is used to output, according to the first multimedia resource and the audio file contained in each template video, the second identification result corresponding to each clip template in the audio fingerprint dimension; the second identification result is used to indicate the similarity of audio fingerprints between the first multimedia resource and the audio file contained in the template video.
  • the audio fingerprint identification sub-model 302 specifically adopts audio fingerprinting technology to analyze the first multimedia resource and the audio files contained in each clip template.
  • audio fingerprinting technology refers to using a specific algorithm to extract data features of the audio files to be identified, such as spectrogram or spectral features, and comparing the extracted features.
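A toy illustration of the idea, assuming nothing about the sub-model's actual algorithm: quantize per-frame energy coarsely and hash the resulting code sequence, so that identical audio yields identical fingerprints. Production systems use far more robust features, such as spectrogram landmarks.

```python
import hashlib

def fingerprint(samples, frame=1024):
    """Toy audio fingerprint: hash a sequence of coarsely quantized
    per-frame energies. Illustrative only, not the patented method."""
    codes = []
    for i in range(0, len(samples) - frame + 1, frame):
        energy = sum(s * s for s in samples[i:i + frame]) / frame
        codes.append(int(energy) >> 4)  # coarse quantization step
    return hashlib.sha1(str(codes).encode("utf-8")).hexdigest()

def same_audio(a, b):
    """Two clips 'match' when their fingerprints are identical."""
    return fingerprint(a) == fingerprint(b)
```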
  • the clip template search method provided by this embodiment includes:
  • the server device obtains the target audio file.
  • the target audio file in this embodiment is the first multimedia resource.
  • the specific implementation manner of acquiring the target audio file by the server device may refer to the description in the embodiment shown in FIG. 2 , which will not be repeated here.
  • the target dimensions in this step exemplarily include: a music style dimension and an audio fingerprint dimension.
  • Considering that the duration of the first multimedia resource (in this embodiment, the target audio file) may be long, the server device may slice the first multimedia resource and the audio files of the clip templates contained in the template library respectively, obtaining the first audio sub-files corresponding to the first multimedia resource and the second audio sub-files corresponding to the audio file contained in each clip template.
  • The server device may perform slicing at a fixed interval, so that the durations of the first audio sub-files and the second audio sub-files are consistent, which facilitates segment-by-segment analysis by the music style identification sub-model and the audio fingerprint identification sub-model.
  • The durations of the audio files included in different editing templates may differ: some are longer and some are shorter. If the audio file included in a clip template has a long duration, it can be sliced at the fixed interval above; if its duration already matches the fixed interval, slicing may be skipped. Whether to slice can be set flexibly according to actual needs.
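The fixed-interval slicing can be sketched as below; the 5-second interval and the policy of dropping a short tail segment are assumptions for illustration.

```python
def slice_audio(samples, sample_rate, interval_s=5.0):
    """Cut an audio stream into equal fixed-interval segments so that the
    sub-models can analyze it segment by segment."""
    step = int(sample_rate * interval_s)
    chunks = [samples[i:i + step] for i in range(0, len(samples), step)]
    # Keep only full-length segments; a short tail is dropped here.
    return [c for c in chunks if len(c) == step]
```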
  • The first audio sub-files corresponding to the first multimedia resource and the second audio sub-files corresponding to the audio files contained in each clip template can then be input into the music style recognition sub-model and the audio fingerprint recognition sub-model respectively for identification, so as to obtain the first recognition result and the second recognition result corresponding to each clip template.
  • the search result is acquired according to the first recognition result and the second recognition result corresponding to each candidate clip template respectively.
  • the music style recognition sub-model and the audio fingerprint recognition sub-model may perform recognition tasks in parallel, or may perform recognition tasks in sequence.
  • the candidate clip templates are a subset of the at least one clip template.
  • the method further includes: S402', determining the candidate editing template according to the target audio file and the characteristics of at least one editing template in the target dimension respectively.
  • the number of candidate clip templates may be one or multiple.
  • the “at least one editing template” in this step may include part of the editing templates in the template library, and may also include all the editing templates in the template library.
  • the at least one editing template may be determined according to factors such as each editing template's release time, usage count, and number of times it has been favorited; alternatively, it may be determined randomly, or in any other way.
  • For example, screening may be performed in the music style dimension first: a second screening result corresponding to the music style dimension is obtained, and it includes one or more editing templates.
  • Then the first audio sub-files corresponding to the first multimedia resource, and the second audio sub-files corresponding to each clip template included in the second screening result for the music style dimension, are input into the audio fingerprint identification sub-model, so as to obtain the second identification result corresponding to each clip template included in that screening result.
  • The first screening result can then be obtained according to the filter condition and the second identification results corresponding to the clip templates included in the second screening result for the music style dimension.
  • The editing templates included in the first screening result are the aforementioned candidate editing templates.
  • Alternatively, screening may be performed in the audio fingerprint dimension first: a second screening result corresponding to the audio fingerprint dimension is obtained, and it includes one or more editing templates.
  • Then the first audio sub-files corresponding to the first multimedia resource, and the second audio sub-files corresponding to each clip template included in the second screening result for the audio fingerprint dimension, are input into the music style recognition sub-model, so as to obtain the first identification result corresponding to each clip template included in that screening result.
  • The first screening result can then be obtained according to the filter condition and the first identification results corresponding to the clip templates included in the second screening result for the audio fingerprint dimension.
  • The clip templates included in the first screening result are the aforementioned candidate clip templates.
  • the filter condition corresponding to the music style dimension or the audio fingerprint dimension may be empty.
  • Obtaining the search result according to the first recognition result and the second recognition result corresponding to each candidate editing template can be achieved in the following manner:
  • the weighted calculation result corresponding to each candidate editing template is obtained according to the first identification result and the second identification result corresponding to each candidate editing template and the weight coefficients corresponding to the music style dimension and the audio fingerprint dimension respectively.
  • the respective weight coefficients corresponding to the music style dimension and the audio fingerprint dimension can be flexibly configured according to requirements.
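The weighted combination can be sketched as follows; the weight values and the acceptance threshold are hypothetical, since the disclosure leaves the coefficients configurable.

```python
# Hypothetical weight coefficients for the two dimensions.
WEIGHTS = {"music_style": 0.6, "audio_fingerprint": 0.4}

def weighted_score(results):
    """results maps a dimension name to its recognition result in [0, 1]."""
    return sum(WEIGHTS[dim] * score for dim, score in results.items())

def pick_target(candidates, threshold=0.8):
    """Return the best-scoring candidate template, or None when nothing
    scores above the (assumed) acceptance threshold."""
    best = max(candidates, key=lambda c: weighted_score(c["results"]), default=None)
    if best is not None and weighted_score(best["results"]) >= threshold:
        return best
    return None
```

A `None` return corresponds to a search result indicating that no matching target clip template was found.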
  • In this embodiment, the pre-trained search model first analyzes, in the music style dimension and the audio fingerprint dimension, the audio file specified by the user (i.e., the first multimedia resource) and the audio file contained in each clip template in the template library, and then comprehensively evaluates, according to the analysis results in the two dimensions, whether a clip template is the target clip template the user wants to find, which ensures the accuracy of the search results.
  • the above-mentioned search task is performed through the search model, which can improve the search efficiency.
  • Case 2: the first multimedia resource is a video file.
  • FIG. 5 is a schematic structural diagram of a search model provided by an embodiment of the present disclosure
  • FIG. 6 is a flowchart of a method for searching a clip template provided by another embodiment of the present disclosure.
  • the search model 500 includes five sub-models, namely: a music style recognition sub-model 501, a video size recognition sub-model 502, a video segment feature recognition sub-model 503, a visual effect recognition sub-model 504, and an audio fingerprint recognition sub-model 505.
  • The music style recognition sub-model 501 included in the search model 500 provided by the embodiment shown in FIG. 5 is similar to the music style recognition sub-model 301 included in the search model 300 provided by the embodiment shown in FIG. 3, and the audio fingerprint recognition sub-model 505 included in the search model 500 is similar to the audio fingerprint identification sub-model 302 included in the search model 300; for details, refer to the description of the embodiment shown in FIG. 3, which is not repeated here.
  • the video size recognition sub-model 502 is used to output, according to the size features of the first multimedia resource and of each editing template, the third recognition result corresponding to each editing template in the video size dimension; the third recognition result is used to indicate the similarity in video size between the first multimedia resource and the clip template.
  • the above-mentioned size features include: duration and/or aspect ratio of video frames.
  • the size feature of the first multimedia resource includes: the duration of the multimedia file and/or the aspect ratio of the video frames included in the multimedia file.
  • the size characteristics of the clip template include: the duration of the clip template and/or the aspect ratio of the video frame of the clip template.
  • the aspect ratio of the video frame may be obtained by dividing the length of the video frame by the width of the video frame, or may be obtained by dividing the width of the video frame by the length of the video frame.
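A minimal sketch of the video size comparison; the tolerance values are assumptions, and the disclosure only requires that the same aspect-ratio convention be applied to both the resource and the template.

```python
def aspect_ratio(width, height):
    """Width divided by height; dividing height by width works equally
    well as long as resource and template use the same convention."""
    return width / height

def size_match(resource, template, dur_tol=0.1, ratio_tol=0.05):
    """Toy stand-in for the third recognition result: durations within
    10% and aspect ratios within 0.05 count as a match (tolerances assumed)."""
    dur_ok = abs(resource["duration"] - template["duration"]) <= dur_tol * template["duration"]
    ratio_ok = abs(resource["ratio"] - template["ratio"]) <= ratio_tol
    return dur_ok and ratio_ok
```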
  • the video segment feature identification sub-model 503 is used to output, according to the video segment features of the first multimedia resource and of the clip template, the fourth recognition result corresponding to the clip template in the video segment feature dimension; the fourth recognition result is used to indicate the similarity between the video segments of the first multimedia resource and those of the clip template.
  • the video segment feature identification sub-model 503 segments the first multimedia resource according to the transition moment of the first multimedia resource, and obtains a plurality of first video sub-segments corresponding to the first multimedia resource; According to the transition moment of each editing template, the editing template is segmented to obtain a plurality of second video sub-segments corresponding to each editing template.
  • the video segment feature identification sub-model 503 obtains the first video sub-segment corresponding to each video template file according to the sequence of the first video sub-segment, the sequence of the second video sub-segment, and the duration, transition mode and other characteristics of the video sub-segment. 4. Identification results.
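The segmentation by transition moments described above can be sketched as follows. The function name and the representation of transition moments as a list of timestamps (in seconds) are assumptions made for illustration:

```python
def split_by_transitions(duration: float,
                         transition_times: list[float]) -> list[tuple[float, float]]:
    """Split a resource of the given duration into sub-segments at the
    listed transition moments, returning (start, end) pairs in order."""
    bounds = [0.0] + sorted(transition_times) + [duration]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
```

A 10-second resource with transitions at 3 s and 7 s would thus yield three sub-segments, whose durations and transition modes could then be compared against a template's sub-segments.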
  • the visual effect recognition sub-model 504 is used to output a fifth recognition result corresponding to the editing template in the visual effect dimension according to the visual effects of the first multimedia resource and of the editing template, where the fifth recognition result is used to indicate how similar the visual effects of the first multimedia resource are to those of the editing template.
  • the visual effect recognition sub-model 504 specifically identifies one or more of the sticker material style, sticker material size, text material style, text material size, filter effect, and the like used by the first multimedia resource and by the editing template, respectively, to obtain the fifth recognition result in the visual effect dimension.
  • the clip template search method provided by this embodiment includes:
  • the server device obtains the target video file.
  • the above-mentioned target video file is the first multimedia resource.
  • the specific implementation manner of acquiring the target video file by the server device may refer to the description in the embodiment shown in FIG. 2 , which will not be repeated here.
  • the target dimensions include: a music style dimension, an audio fingerprint dimension, a video size dimension, a video clip feature dimension, and a visual effect dimension.
  • the server device may perform slice processing on the audio file of the first multimedia resource and on the audio files of the editing templates contained in the template library, respectively.
  • the first multimedia resource and the at least one editing template can be input into the five sub-models included in the search model 500, respectively, to obtain the first to fifth recognition results corresponding to the editing templates output by the five sub-models.
  • the search result is acquired according to the first identification result to the fifth identification result corresponding to each candidate editing template respectively.
  • the candidate editing template is part of the at least one editing template.
  • the method further includes: S602', determining the candidate editing template according to the target video file and the characteristics of at least one editing template in the target dimension respectively.
  • the number of candidate clip templates may be one or multiple.
  • the “at least one editing template” in this step may include part of the editing templates in the template library, and may also include all the editing templates in the template library.
  • the at least one editing template may be determined according to factors such as the release time, usage count, and number of times the editing template has been favorited; alternatively, it may be determined randomly, or by any other means.
  • S602' may include the following steps:
  • Step a: Determine the current dimension according to the priority order of the above five sub-models.
  • the priority order of the five sub-models included in the search model 500 is the priority order of the corresponding dimensions in the target dimension, and the current dimension is determined in order of priority from high to low.
  • Step b: Obtain a first recognition result of each editing template in the first screening result in the current dimension, according to the features of the first multimedia resource and of the first screening result in the current dimension.
  • the first screening result is the screening result corresponding to the previous dimension.
  • Step c: Obtain a second screening result corresponding to the current dimension according to the first recognition result and the screening condition corresponding to the current dimension; the second screening result includes one or more editing templates, and the initial state of the first screening result includes the at least one editing template.
  • Step d: Determine the second screening result as the first screening result.
  • Returning to step a, the above steps are repeated until the second screening result corresponding to the last current dimension is obtained; the editing templates included in that second screening result are determined as the candidate editing templates.
  • the filter conditions corresponding to some dimensions can be configured to be empty.
  • the priority order of the five sub-models included in the search model is: music style recognition sub-model > video size recognition sub-model > video segment feature recognition sub-model > audio fingerprint recognition sub-model > visual effect recognition sub-model.
  • if the filter condition corresponding to the audio fingerprint dimension is empty, the audio file contained in the first multimedia resource and the audio files of the editing templates contained in the second screening result corresponding to the video segment feature dimension are directly input into the audio fingerprint identification sub-model. That is to say, the editing templates included in the second screening result corresponding to the video segment feature dimension are the same as those included in the second screening result corresponding to the audio fingerprint dimension.
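The priority-ordered screening in steps a through d can be sketched as a cascade filter. The data structures, function names, and the handling of an empty screening condition below are illustrative assumptions rather than the disclosure's implementation:

```python
def cascade_filter(resource, templates, dimensions):
    """dimensions: list of (score_fn, condition) pairs ordered by priority,
    highest first. score_fn(resource, template) returns a similarity score;
    condition(score) returns True to keep the template, or is None for an
    empty screening condition (the screening result passes through
    unchanged, as with the audio fingerprint dimension above)."""
    screening = list(templates)  # initial state of the first screening result
    for score_fn, condition in dimensions:
        scores = [(tpl, score_fn(resource, tpl)) for tpl in screening]
        if condition is not None:
            screening = [tpl for tpl, s in scores if condition(s)]
    return screening  # candidate editing templates
```

For instance, with a music-style dimension that keeps only exact style matches followed by a dimension with an empty condition, only the matching templates survive:

```python
templates = [{"id": 1, "style": "pop"}, {"id": 2, "style": "rock"}]
resource = {"style": "pop"}
dims = [(lambda r, t: 1.0 if r["style"] == t["style"] else 0.0, lambda s: s > 0.5),
        (lambda r, t: 0.0, None)]
candidates = cascade_filter(resource, templates, dims)
```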
  • the priority order of each sub-model included in the search model can be flexibly configured, and is not limited to the above example.
  • the music style identification sub-model and the audio fingerprint identification sub-model included in the search model analyze the audio files included in the multimedia file and in the editing templates. If an audio file is excessively long, the method in the embodiment shown in FIG. 4 can be used to perform slice processing on it. For details, refer to the description of the embodiment shown in FIG. 4, which is not repeated here.
  • Obtaining the search result according to the first to fifth recognition results corresponding to each editing template can be achieved in the following manner:
  • According to the recognition results of each candidate editing template in the music style dimension, the video size dimension, the video segment feature dimension, the audio fingerprint dimension, and the visual effect dimension, together with the weight coefficient corresponding to each dimension, a weighted calculation result corresponding to each candidate editing template is obtained.
  • The search result is obtained according to the weighted calculation results corresponding to the candidate editing templates and a second preset threshold: if the highest score among the weighted calculation results of the candidate editing templates is greater than the second preset threshold, it is determined that a target editing template matching the first multimedia resource has been found in the template library, and the target editing template is the candidate editing template corresponding to the highest score; if the highest score is less than or equal to the second preset threshold, it is determined that no target editing template matching the first multimedia resource is found in the template library.
  • the values of the first preset threshold and the second preset threshold may be the same or different.
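The weighted evaluation against the second preset threshold can be sketched as follows. The dimension names, weight values, and function signature are hypothetical; the disclosure does not specify concrete weights:

```python
def weighted_search(candidate_scores, weights, threshold):
    """candidate_scores: {template_id: {dimension: score}};
    weights: {dimension: weight coefficient}. Returns the template whose
    weighted score is highest if that score exceeds the threshold,
    otherwise None (no matching target editing template found)."""
    totals = {tid: sum(weights[d] * s for d, s in dims.items())
              for tid, dims in candidate_scores.items()}
    best = max(totals, key=totals.get)
    return best if totals[best] > threshold else None
```

With illustrative per-dimension scores and weights, the candidate with the highest weighted score is returned only when it clears the threshold; raising the threshold above every weighted score yields no match.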
  • In this embodiment, the pre-trained search model first analyzes the target video file (i.e., the first multimedia resource specified by the user) and the candidate editing templates in the music style dimension, the audio fingerprint dimension, the video size dimension, the video segment feature dimension, and the visual effect dimension, and then comprehensively evaluates, according to the analysis results in these five dimensions, whether a candidate editing template is the target editing template the user wants, thereby ensuring the accuracy of the search results.
  • the search model may be adjusted based on different scenarios and purposes.
  • the search model may only include a sub-model for music style identification.
  • the embodiments shown in FIG. 3 and FIG. 5 do not limit the specific implementation of the search model.
  • FIG. 7 is a flowchart of a method for searching for a clip template provided by another embodiment of the present disclosure. Referring to FIG. 7 , the method of this embodiment includes:
  • a server device acquires a first multimedia resource.
  • the server device performs a search according to the first multimedia resource, and obtains a search result.
  • the server device returns the search result to the terminal device.
  • the terminal device receives the search result returned by the server device.
  • the server device acquires the target clip template from the template library according to the search result.
  • the server device sends the target editing template to the terminal device.
  • the server device can extract all the data corresponding to the target editing template from the template library according to the identifier of the target editing template, package it, and then send it to the terminal device.
  • the terminal device displays the target editing template.
  • After receiving the data of the target editing template, the terminal device decodes and displays it, so that the user holding the terminal device can view the detailed information of the target editing template.
  • Since the target editing template better matches the user's requirements for video creation, the usage rate of the target editing template can be improved, and at the same time users' enthusiasm for video creation can be increased.
  • the terminal device may display prompt information that the target clip template is not found to the user according to the search result.
  • the clip template search method provided by the embodiments of the present disclosure is introduced in detail with reference to the accompanying drawings and application scenarios.
  • the terminal device as a mobile phone
  • a video editing APP referred to as Application 1
  • a video APP referred to as Application 2
  • mobile phones and other terminal devices provide a smart clipboard reading function, which can automatically fill in copied content.
  • the method for searching for clip templates provided by the present disclosure will be respectively introduced in two scenarios of enabling the smart clipboard reading function and turning off the smart clipboard reading function.
  • the user can copy the target link through another video application or music application; when the smart clipboard reading function of the mobile phone is enabled, application 1 can obtain the target link from the clipboard when the user opens application 1.
  • application 2 can display the user interface 11 exemplarily shown in FIG. 8A on the mobile phone. The user interface 11 is used to display the video playing page of application 2, in which application 2 can execute a set of functions, such as playing multimedia files (e.g., short videos) and sharing multimedia files.
  • the user interface 11 includes: a control 1101, where the control 1101 is used to copy the link of the multimedia file currently being played.
  • After the user exits application 2 and opens application 1 within a preset time period, application 1 sends the target link to the server through the background.
  • the preset duration is, for example, 5 seconds, 10 seconds, and so on.
  • the application 1 may display the user interface 12 exemplarily shown in FIG. 8B on the mobile phone, wherein the user interface 12 is used to display the waiting page of the application 1 .
  • the current search progress information may be displayed in the user interface 12 , for example, “75% in recognition” is exemplarily displayed in the waiting page displayed by the user interface 12 .
  • the user interface 12 also includes a control 1201 for abandoning the current search task.
  • After application 1 receives an operation from the user such as clicking the control 1201 in the user interface 12 shown in FIG. 8B, application 1 can display its default home page on the mobile phone.
  • When application 1 receives the search result returned by the server and the data of the matched target editing template, application 1 can display the user interface 13 exemplarily shown in FIG. 8C on the mobile phone, where the user interface 13 includes a display window 1301.
  • the display window 1301 is used to display the cover of the target editing template; the cover may be any video frame, or a specific video frame, of the video included in the target editing template.
  • the user interface 13 further includes: a control 1302, wherein the control 1302 is used to enter the details page of the target clip template.
  • After application 1 receives an operation from the user such as clicking the control 1302 in the user interface 13 shown in FIG. 8C, application 1 displays the user interface 14 shown in FIG. 8D on the mobile phone, where the user interface 14 is used to display the video playing page. In this page, application 1 can execute a set of functions, such as playing multimedia files and providing a visual entry to the video creation page.
  • the user interface 13 further includes a control 1303 , wherein the control 1303 is used to close the presentation window 1301 .
  • After application 1 receives the user's click operation on the control 1302 in the user interface 13 shown in FIG. 8C, application 1 can send an editing template acquisition request to the server through the mobile phone, where the request is used to acquire all the data of the target editing template.
  • the user interface 14 includes: a control 1401, the control 1401 is used to enter a video creation page using the target clip template as a creation template.
  • application 1 can send a candidate editing template acquisition request to the server through the mobile phone, where the request is used to acquire all the data of the candidate editing template whose weighted score ranks second only to that of the target editing template.
  • Accordingly, the server may send all the data of the candidate editing template whose weighted score ranks second to the mobile phone according to the search result.
  • If the search result returned by the server indicates that the template library does not contain a target editing template matching the target link, application 1 can display the user interface 15 shown in FIG. 8E on the mobile phone, where the user interface 15 includes a window 1501, and the window 1501 includes a text information display area 1502 and a control 1503.
  • the text information display area 1502 can display content related to the search result. For example, the text information display area 1502 displays the text "The same template has not been found, try another link".
  • Control 1503 is used to close window 1501.
  • application 1 may display the default home page of application 1 on the mobile phone.
  • the application 1 can obtain the target link through manual input by the user.
  • the application 1 displays the user interface 16 exemplarily shown in FIG. 8F on the mobile phone, wherein the user interface 16 includes: an input window 1601, wherein the input window 1601 includes a control 1602, and the control 1602 is used to enter the link search page.
  • After application 1 receives an operation from the user such as clicking the control 1602 in the user interface 16 shown in FIG. 8F, application 1 displays the user interface 17 exemplarily shown in FIG. 8G on the mobile phone.
  • the link search page displayed on the user interface 17 includes an input window 1701, and prompt information can be displayed in the input window 1701 to remind the user to paste video links or music links from other applications to find the same template.
  • application 1 can display the user interface 18 shown in FIG. 8H on the mobile phone after receiving an operation from the user such as long-pressing the input window 1701 in the user interface 17 shown in FIG. 8G; the user interface 18 includes a control 1801, which is used to paste the content of the clipboard into the input window 1701.
  • After application 1 receives an operation from the user such as clicking the control 1801 in the user interface 18 shown in FIG. 8H, the target link is displayed in the input window 1701, and application 1 correspondingly displays the user interface 19 shown in FIG. 8I on the mobile phone.
  • the user interface 17 may further include an input method soft keyboard area 1702, and the user can manually input the target link into the input window 1701 by operating the soft keyboard area 1702.
  • the user interface 17 may also include a control 1703 that allows the user to close the link search page.
  • User interface 17 also includes controls 1704 for generating search tasks based on target links.
  • the mobile phone sends the target link to the server according to the search task.
  • the user interface 19 may further include: a control 1705 , where the control 1705 is used to delete all content in the input window 1701 .
  • the application 1 displays the user interface 17 shown in FIG. 8G on the mobile phone.
  • When the user manually operates the input method soft keyboard area 1702 included in the user interface 17 to input part or all of the target link into the input window 1701, the user can also operate the control 1705 to delete all the content in the input window 1701.
  • When application 1 detects that a correct target link has not been entered in the input window 1701 shown in the user interface 17 and the user interface 18, the control 1704 is in a first state; when application 1 detects that a correct target link has been entered in the input window 1701, the control 1704 is in a second state.
  • the first state is an inactive state
  • the second state is an active state.
  • In the inactive state, operating the control 1704 cannot generate a search task; in the activated state, operating the control 1704 can generate a search task according to the target link in the input window 1701.
  • the user interface 18 and the user interface 19 also include the control 1704, where in the user interface 18 the control 1704 is in the first state, and in the user interface 19 the control 1704 is in the second state.
  • After application 1 receives an operation from the user such as clicking the control 1704 in the user interface 19 shown in FIG. 8I, application 1 generates a search task and sends the target link to the server through the mobile phone, so that the server searches according to the target link. After receiving this operation, application 1 can also display the user interface 12 shown in FIG. 8B on the mobile phone.
  • the application 1 can display the user interface 13 to the user interface 15 shown in FIG. 8C to FIG. 8E on the mobile phone according to the search result.
  • a user interface 20 as shown in FIG. 8J is illustratively displayed.
  • the user interface 20 includes a window 2001, wherein the window 2001 is used to display guiding information, for example, the guiding information is "support for finding templates through links".
  • the communication quality between the mobile phone and the server is poor.
  • Even if the server can match the target editing template according to the target link, the mobile phone may fail to obtain the target editing template data from the server due to the poor communication quality between the mobile phone and the server. In this case, application 1 can display the user interface 21 shown in FIG. 8K on the mobile phone, where the user interface 21 includes a window 2101, and the window 2101 is used to display a loading failure page.
  • the loading failure page may include an area 2102, a window 2103, and a control 2104, where the area 2102 is used to display prompt information about the loading failure; the window 2103 contains a control 2105, which is used to generate a new data loading task; and the control 2104 is used to cancel the data loading task.
  • From the schematic diagrams of the human-computer interaction interfaces shown in FIGS. 8A to 8K above, combined with actual application scenarios, it can be seen that the editing template search method provided by the embodiments of the present disclosure makes it convenient for the user to quickly obtain the desired target editing template by operating the controls on the user interface displayed by the terminal device, which can better meet the user's needs for video creation.
  • FIG. 9 is a schematic structural diagram of a clip template search apparatus according to an embodiment of the present disclosure.
  • the clip template search apparatus 900 provided in this embodiment includes: an acquisition module 901 and a search module 902 .
  • the obtaining module 901 is configured to obtain the first multimedia resource designated by the user.
  • a search module 902 configured to perform a search according to the first multimedia resource, and obtain a search result
  • the search result is used to indicate whether a target editing template matching the first multimedia resource is found, and the target editing template is used to indicate that multimedia material to be edited is edited into a second multimedia resource according to the target editing method.
  • the target editing method is the editing method adopted by the first multimedia resource.
  • the search module 902 is specifically configured to obtain the search result according to the identification result of the candidate clip template in the target dimension; wherein, the identification result of the candidate clip template in the target dimension is based on the first The characteristics of the multimedia resource and the candidate clip template in the target dimension are obtained.
  • the search module 902 is further configured to determine the candidate editing template according to the characteristics of the first multimedia resource and the at least one editing template respectively in the target dimension; wherein the at least one editing template is Templates include the candidate clip templates.
  • the target dimension includes one or more of: a music style dimension, an audio fingerprint dimension, a video size dimension, a video segment feature dimension, and a visual effect dimension.
  • the target dimension includes a plurality of the music genre dimension, the audio fingerprint dimension, the video size dimension, the video segment feature dimension, and the visual effects dimension;
  • the search module 902 is specifically configured to obtain, according to the identification results of each dimension of the candidate editing template in the target dimension and the corresponding weight coefficients of each dimension in the target dimension, the corresponding data of the candidate editing template. Weighted calculation result; obtain the search result according to the weighted calculation result corresponding to the candidate clip template.
  • the search module 902 is specifically configured to: determine the current dimension according to the priority order of the dimensions in the target dimension; obtain a first recognition result of each editing template in the first screening result in the current dimension according to the features of the first multimedia resource and of the first screening result in the current dimension; obtain a second screening result corresponding to the current dimension according to the first recognition result and the screening condition corresponding to the current dimension, where the second screening result includes one or more editing templates and the initial state of the first screening result includes the at least one editing template; determine the second screening result as the first screening result; and return to determining the current dimension until the second screening result corresponding to the last current dimension is obtained, determining the editing templates included in that second screening result as the candidate editing templates.
  • the obtaining module 901 is specifically configured to obtain a target link input by a user, and parse the target link to obtain the first multimedia resource.
  • the clip template searching apparatus 900 further includes: a sending module 903 .
  • the sending module 903 is configured to send the search result to the user.
  • the sending module 903 is further configured to send the target clip template to the user according to the search result.
  • the clip template search apparatus provided in this embodiment can be used to execute the technical solution executed by the server device in any of the foregoing embodiments, and its implementation principle and technical effect are similar, and reference may be made to the foregoing detailed description, which will not be repeated here.
  • FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 1000 provided in this embodiment includes: a memory 1001 and a processor 1002 .
  • the memory 1001 may be an independent physical unit, and may be connected to the processor 1002 through a bus 1003 .
  • Alternatively, the memory 1001 and the processor 1002 may be integrated together and implemented in hardware, etc.
  • the memory 1001 is used to store program instructions, and the processor 1002 invokes the program instructions to execute the operations performed by the server device or the terminal device in any of the above method embodiments.
  • the foregoing electronic device 1000 may also include only the processor 1002 .
  • the memory 1001 for storing programs is located outside the electronic device 1000, and the processor 1002 is connected to the memory through circuits/wires for reading and executing the programs stored in the memory.
  • the processor 1002 may be a central processing unit (CPU), a network processor (NP), or a combination of CPU and NP.
  • the processor 1002 may further include hardware chips.
  • the above-mentioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the above-mentioned PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a general array logic (generic array logic, GAL) or any combination thereof.
  • the memory 1001 may include volatile memory, such as random-access memory (RAM); the memory may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory may also include a combination of the above types of memory.
  • The present disclosure further provides a computer-readable storage medium, where the computer-readable storage medium includes computer program instructions that, when executed by at least one processor of an electronic device, cause the electronic device to perform the technical solution executed by the server device or the terminal device in any of the above method embodiments.
  • The present disclosure also provides a program product including a computer program, where the computer program is stored in a readable storage medium, at least one processor of the electronic device can read the computer program from the readable storage medium, and the at least one processor executes the computer program so that the electronic device executes the technical solution executed by the server device or the terminal device in any of the above method embodiments.


Abstract

The present disclosure relates to a clip template search method, apparatus, and readable storage medium. The method includes: in a video editing scenario, a server device obtains a first multimedia resource specified by a user, performs a search according to the first multimedia resource, and obtains a target clip template matching the first multimedia resource, where the target clip template is used to indicate that multimedia material to be edited is clipped into a second multimedia resource according to a target clipping method. In this solution, searching according to the first multimedia resource improves the accuracy of the search results, better meets users' needs for video creation, and also increases the usage rate of the target clip template.

Description

Clip Template Search Method and Apparatus
The present disclosure claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on April 30, 2021, with application number 202110485269.3 and entitled "Clip Template Search Method and Apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of Internet technologies, and in particular, to a clip template search method and apparatus.
Background
With the continuous development of Internet technologies, users often choose applications (APPs) for video creation. Applications usually provide a rich set of clip templates; a user can use a clip template and select favorite photos or videos to obtain a composited video. However, how to quickly and accurately find the clip template a user wants is an urgent problem to be solved.
Summary
To solve the above technical problems, or at least partially solve them, the present disclosure provides a clip template search method and apparatus.
In a first aspect, an embodiment of the present disclosure provides a clip template search method, including:
obtaining a first multimedia resource specified by a user;
performing a search according to the first multimedia resource to obtain a search result;
where the search result is used to indicate whether a target clip template matching the first multimedia resource is found, the target clip template is used to indicate that multimedia material to be edited is clipped into a second multimedia resource according to a target clipping manner, and the target clipping manner is the clipping manner adopted by the first multimedia resource.
In some possible designs, performing a search according to the first multimedia resource to obtain a search result includes:
obtaining the search result according to a recognition result of a candidate clip template in a target dimension;
where the recognition result of the candidate clip template in the target dimension is obtained according to features of the first multimedia resource and of the candidate clip template in the target dimension.
In some possible designs, before obtaining the search result according to the recognition result of the candidate clip template in the target dimension, the method further includes:
determining the candidate clip template according to features of the first multimedia resource and of at least one clip template in the target dimension, respectively;
where the at least one clip template includes the candidate clip template.
In some possible designs, the target dimension includes one or more of: a music style dimension, an audio fingerprint dimension, a video size dimension, a video segment feature dimension, and a visual effect dimension.
In some possible designs, the target dimension includes multiple of the music style dimension, the audio fingerprint dimension, the video size dimension, the video segment feature dimension, and the visual effect dimension;
obtaining the search result according to the recognition result of the candidate clip template in the target dimension includes:
obtaining a weighted calculation result corresponding to the candidate clip template according to the recognition results of the candidate clip template in each dimension of the target dimension and the weight coefficient corresponding to each dimension;
obtaining the search result according to the weighted calculation result corresponding to the candidate clip template.
In some possible designs, determining the candidate clip template according to the features of the first multimedia resource and of at least one clip template in the target dimension includes:
determining a current dimension according to the priority order of the dimensions in the target dimension;
obtaining a first recognition result of each clip template in a first screening result in the current dimension, according to the features of the first multimedia resource and of the first screening result in the current dimension;
obtaining a second screening result corresponding to the current dimension according to the first recognition result and a screening condition corresponding to the current dimension, where the second screening result includes one or more clip templates, and the initial state of the first screening result includes the at least one clip template;
determining the second screening result as the first screening result;
returning to the step of determining the current dimension according to the priority order of the dimensions in the target dimension, until the second screening result corresponding to the last current dimension is obtained, and determining the clip templates included in the second screening result corresponding to the last current dimension as the candidate clip templates.
In some possible designs, obtaining the first multimedia resource specified by the user includes: obtaining a target link input by the user, and parsing the target link to obtain the first multimedia resource.
In some possible designs, the method further includes: sending the search result to the user.
第二方面,本公开实施例提供了一种剪辑模板搜索装置,包括:
获取模块,用于获取用户指定的第一多媒体资源;
搜索模块,用于根据所述第一多媒体资源进行搜索,获取搜索结果;其中,所述搜索结果用于指示是否搜索到与所述第一多媒体资源匹配的目标剪辑模板,所述目标剪辑模板用于指示待剪辑多媒体素材按照目标剪辑方式被剪辑成第二多媒体资源,所述目标剪辑方式为所述第一多媒体资源所采用的剪辑方式。
第三方面,本公开实施例提供了一种电子设备,包括:存储器、处理器以及计算机程序;
所述存储器被配置为存储所述计算机程序;
所述处理器被配置为执行所述计算机程序,以实现如第一方面任一项所述的方法。
第四方面,本公开实施例提供了一种可读存储介质,包括:程序;
所述程序被电子设备的至少一个处理器执行时,以实现如第一方面任一项所述的方法。
第五方面,本公开实施例还提供一种程序产品,包括:计算机程序,所述计算机程序存储在可读存储介质中,电子设备的至少一个处理器可以从所述可读存储介质中读取所述计算机程序,所述至少一个处理器执行所述计算机程序使得所述电子设备实现如第一方面任一项所述的方法。
本公开实施例提供一种剪辑模板搜索方法及装置，其中，该方法包括：在视频编辑场景中，服务端设备通过获取用户指定的第一多媒体资源，并根据用户指定的第一多媒体资源进行搜索，获得与第一多媒体资源匹配的目标剪辑模板，其中，目标剪辑模板用于指示待剪辑多媒体素材按照目标剪辑方式被剪辑成第二多媒体资源。本方案中，根据第一多媒体资源进行搜索，提高了搜索结果的准确性，能够较好地满足用户进行视频创作的需求，同时也能够提高目标剪辑模板的使用率。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。
为了更清楚地说明本公开实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，对于本领域普通技术人员而言，在不付出创造性劳动的前提下，还可以根据这些附图获得其他的附图。
图1为本公开一实施例提供的剪辑模板搜索方法的应用场景示意图;
图2为本公开一实施例提供的剪辑模板搜索方法的流程图;
图3为本公开一实施例提供的搜索模型的结构示意图;
图4为本公开另一实施例提供的剪辑模板搜索方法的流程图;
图5为本公开另一实施例提供的搜索模型的结构示意图;
图6为本公开另一实施例提供的剪辑模板搜索方法的流程图;
图7为本公开另一实施例提供的剪辑模板搜索方法的流程图;
图8A至图8K为本公开提供的人机交互界面示意图;
图9为本公开一实施例提供的剪辑模板搜索装置的结构示意图;
图10为本公开一实施例提供的电子设备的结构示意图。
具体实施方式
为了能够更清楚地理解本公开的上述目的、特征和优点,下面将对本公开的方案进行进一步描述。需要说明的是,在不冲突的情况下,本公开的实施例及实施例中的特征可以相互组合。
在下面的描述中阐述了很多具体细节以便于充分理解本公开,但本公开还可以采用其他不同于在此描述的方式来实施;显然,说明书中的实施例只是本公开的一部分实施例,而不是全部的实施例。
用户使用APP进行视频创作时,现有的APP通常支持通过关键字来查找剪辑模板。然而,一些情况下,用户可能无法获得正确的关键字,则通过用户输入的关键字找到的剪辑模板并不是用户想要的剪辑模板,导致用户可能不想继续进行视频创作。
基于此,本公开提供一种剪辑模板搜索方法,该方法的核心思想是:通过获取用户指定的第一多媒体资源,并对该第一多媒体资源以及候选剪辑模板进行分析,从而快速地从候选剪辑模板中定位用户想要的目标剪辑模板,提高了搜索结果的准确性,能够较好地满足用户进行视频创作的需求,同时也能够提高目标剪辑模板的使用率。
图1为本公开一实施例提供的剪辑模板搜索方法的应用场景示意图。本实施例提供的剪辑模板搜索方法可以应用于图1所示的场景中。参照图1所示，该场景包括：服务端设备101和终端设备102。其中，服务端设备101与终端设备102可以通过有线或者无线网络连接。
其中,服务端设备101可以通过任意的软件和/或硬件的方式实现。例如,服务端设备101可以是服务器,该服务器可以是一个独立的服务器,也可以是由多个独立服务器构成的服务器集群,或者,还可以是云端服务器。服务端设备101也可以是集成在电子设备中的软件程序,当软件程序被电子设备的至少一个处理器执行时,可以执行本公开实施例提供的剪辑模板搜索方法中服务端设备执行的技术方案。在实际应用中,服务端设备可以同时与一个或多个终端设备进行交互,向终端设备发送相同或者不同的数据。
终端设备102可以通过任意的软件和/或硬件的方式实现。例如,终端设备102可以但不限于是笔记本电脑、台式电脑、智能手机、便携式终端设备、可穿戴设备、个人数字助理(personal digital assistant,PDA)等设备,本公开实施例对于终端设备的具体类型不作限制。终端设备102也可以是集成在电子设备中的软件程序,当软件程序被电子设备的处理器执行时,可以执行本公开实施例提供的剪辑模板搜索方法中终端设备执行的技术方案。
其中,图1中示例性地示出了1个服务端设备与1个终端设备进行交互的场景。在实际应用场景中,服务端设备可以并行与更多个终端设备进行交互。
下面通过几个具体实施例对本公开提供的剪辑模板搜索方法进行详细介绍。
图2为本公开一实施例提供的剪辑模板搜索方法的流程图。本实施例的执行主体可以为服务端设备。参照图2所示,本实施例包括:
S201、获取用户指定的第一多媒体资源。
具体地,服务端设备获取用户通过终端设备指定的第一多媒体资源。其中,第一多媒体资源可以为音频文件,也可以为视频文件。
可选地,第一多媒体资源还可以是根据目标剪辑模板进行视频创作获得的多媒体资源(如短视频)。
可选地,第一多媒体资源可以是用户通过终端设备主动上报的。或者,也可以是用户通过其他方式指定的,例如,用户通过终端设备上报目标链接,服务端设备根据目标链接解析获得的。
假设,终端设备上安装有多个应用程序,其中,所述多个应用程序包括:视频编辑应用程序以及其他应用程序。
可选地,目标链接可以是用户从上述其他应用程序中的任一应用程序进行复制,并粘贴至视频编辑应用程序中的。上述其他应用程序可以但不限于为音乐类应用程序、视频类应用程序、社交类应用程序等等。
可选地,目标链接也可以是用户手动输入至终端设备上安装的视频编辑应用程序的。
本公开对于目标链接的来源以及终端设备获取目标链接的方式不作限制。
当视频编辑应用程序获得目标链接后，通过对目标链接进行解析，获取目标链接对应的网页，并从网页中获取所指向的第一多媒体资源的统一资源定位符（uniform resource locator，URL）；终端设备根据第一多媒体资源的URL访问并下载第一多媒体资源；之后，终端设备通过其与服务端设备之间已经建立好的通信链路，将获得的第一多媒体资源上传至服务端设备。
或者，当视频编辑应用程序获得目标链接后，可以通过终端设备与服务端设备之间已经建立好的通信链路，将目标链接发送至服务端设备；服务端设备接收到目标链接后，通过对目标链接进行解析，获取目标链接对应的网页，并从网页中获取所指向的第一多媒体资源的URL；服务端设备根据第一多媒体资源的URL访问并下载第一多媒体资源。
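上述"解析目标链接、提取资源URL"的步骤，可用如下Python片段作一个极简示意（属于假设性实现：仅用正则在网页内容中匹配常见的音视频直链，函数名与匹配规则均为示例，实际服务端的解析逻辑会涉及页面结构、重定向、鉴权等，远比此复杂）：

```python
import re

def extract_media_url(html):
    """从目标链接对应的网页内容中提取第一多媒体资源的URL。
    此处仅用正则匹配常见的音视频直链后缀, 属于示意性实现。"""
    match = re.search(r'https?://[^\s"\']+?\.(?:mp4|mp3|m4a|mov)', html)
    return match.group(0) if match else None
```

找不到媒体直链时返回None，对应"未能解析出第一多媒体资源"的情况。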
S202、根据第一多媒体资源进行搜索,获取搜索结果。
本步骤的目的在于:搜索到与第一多媒体资源使用的剪辑方式相同的目标剪辑模板。
其中,所述搜索结果用于指示是否搜索到与第一多媒体资源匹配的目标剪辑模板。
一种可能的实现方式,根据第一多媒体资源,在服务端设备上部署的模板库中进行搜索,获取搜索结果。
其中,服务端设备上部署的模板库包括:至少一个剪辑模板。其中,每个剪辑模板用于提供一种预设剪辑方式,用户选择或者导入的待剪辑多媒体素材按照剪辑模板提供的预设剪辑方式能够被剪辑成一个新的多媒体资源。
具体地,服务端设备可以通过对第一多媒体资源和模板库中的各剪辑模板进行目标维度上的分析,获得搜索结果。其中,目标维度可以包括音乐风格维度、音频指纹维度、视频尺寸维度、视频片段特征维度以及视觉效果维度中的一个或多个维度。
可选地,服务端设备上可以预先部署已经训练好的搜索模型,当服务端设备获取用户指定的第一多媒体资源后,可利用该搜索模型来执行上述搜索。当然,也可以采用其他方式或者算法来执行上述搜索。
利用搜索模型如何执行搜索,可参照后文中的详细介绍。
本实施例提供的剪辑模板搜索方法,服务端设备获取用户指定的第一多媒体资源,并根据用户指定的第一多媒体资源在模板库中进行搜索,获得与第一多媒体资源匹配的目标剪辑模板。本方案中,通过第一多媒体资源进行搜索,提高了搜索结果的准确性,能够较好地满足用户进行视频创作的需求,同时也能够提高目标剪辑模板的使用率。
接下来,将以第一多媒体资源为音频文件和视频文件两种情况,对搜索模型是如何根据第一多媒体资源以及模板库中包括的各剪辑模板进行搜索分别进行详细介绍。
需要说明的是,在实际应用中,执行搜索的服务端设备与存储模板库的服务端设备可以为相同的设备,也可以为不同的设备,本公开实施例对此不作限制。
情况1:第一多媒体资源为音频文件
其中,图3为本公开一实施例提供的搜索模型的结构示意图;图4为本公开另一实施例提供的剪辑模板搜索方法的流程图。
参照图3所示,搜索模型300包括:音乐风格识别子模型301和音频指纹识别子模型302。
音乐风格识别子模型301用于根据第一多媒体资源和各模板视频分别包含的音频文件,输出音乐风格维度上,各剪辑模板对应的第一识别结果,该第一识别结果用于指示第一多媒体资源和模板视频包含的音频文件之间的音乐风格的相似度。
本方案中，音乐风格可以预先划分为多种，例如，音乐风格包括：伤感、安静、流行、欢快、轻松、甜蜜、开心等等多个风格。音乐风格的具体分类，本公开实施例不作限制。
可选地,针对模板库包括的每个剪辑模板,音乐风格识别子模型301通过特定的算法,获取第一多媒体资源对应的第一特征向量和剪辑模板包含的音频文件的第二特征向量;接着,计算第一特征向量和第二特征向量之间的距离(例如:欧式距离,当然也可以是其他算法获得的距离),并根据计算获得的距离获得所述剪辑模板对应的第一识别结果。
当第一识别结果采用音乐风格分数表示时，音乐风格相似度越高，音乐风格分数越高；音乐风格相似度越低，音乐风格分数越低。
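上述"提取特征向量、计算欧式距离、映射为风格分数"的过程可用如下Python片段示意（特征提取本身不在示意范围内；把距离映射为分数的方式1/(1+d)为示例性假设，正文只要求距离越小分数越高）：

```python
import math

def style_score(vec_a, vec_b):
    """根据两个音乐风格特征向量的欧式距离计算风格分数:
    距离越小分数越高, 分数落在(0, 1]区间。"""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))
    return 1.0 / (1.0 + dist)
```

两个完全相同的特征向量得到最高分1.0，距离越大分数越趋近于0。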
音频指纹识别子模型302用于根据第一多媒体资源和各模板视频分别包含的音频文件,输出音频指纹维度上,各剪辑模板对应的第二识别结果,该第二识别结果用于指示多媒体文件和所述模板视频包含的音频文件之间的音频指纹的相似度。
具体地,音频指纹识别子模型302具体采用音频指纹技术(audio fingerprinting technology)对多媒体文件和各剪辑模板包含的音频文件进行分析。其中,音频指纹技术是指采用特定的算法提取需要被识别的音频文件的数据特征,例如,声谱特征、频谱特征等,并根据需要被识别的音频文件的数据特征与建立的音频指纹数据库进行对比。
当第二识别结果采用音频指纹分数表示时，音频指纹相似度越高，音频指纹分数越高；音频指纹相似度越低，音频指纹分数越低。
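音频指纹比对的思路可用如下极简Python片段示意：对采样序列按固定帧长提取简单特征拼成指纹，再逐位比对得到相似度。需要强调的是，帧长、特征（帧内最大幅值位置）与逐位命中率的比对方式均为示例性假设；真实的音频指纹技术基于声谱特征、频谱特征等更稳健的特征：

```python
def fingerprint(samples, frame=4):
    """极简音频指纹示意: 将采样序列按固定帧长切分,
    每帧取幅值最大的位置作为该帧特征, 拼成指纹序列。"""
    prints = []
    for i in range(0, len(samples) - frame + 1, frame):
        window = samples[i:i + frame]
        prints.append(max(range(frame), key=lambda j: abs(window[j])))
    return prints

def fingerprint_similarity(fp_a, fp_b):
    """两段指纹的逐位命中率, 作为音频指纹分数的示意。"""
    if not fp_a or not fp_b:
        return 0.0
    hits = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return hits / min(len(fp_a), len(fp_b))
```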
在采用图3所示的搜索模型300的基础上,结合图4所示,本实施例提供的剪辑模板搜索方法包括:
S401、服务端设备获取目标音频文件。
应理解,本实施例中的目标音频文件即第一多媒体资源。且服务端设备获取目标音频文件的具体实现方式可参照图2所示实施例中所述,此处不再赘述。
S402、根据候选剪辑模板在目标维度的识别结果,获取所述搜索结果;其中,所述候选剪辑模板在目标维度的识别结果是根据所述目标音频文件和所述候选剪辑模板在所述目标维度的特征获得的。
结合图3所示的搜索模型300可知,本步骤中的目标维度示例性地包括:音乐风格维度和音频指纹维度。
需要说明的是，在一些情况下，若第一多媒体资源（本实施例中第一多媒体资源即为目标音频文件）的时长过长，则服务端设备可以对所述第一多媒体资源和模板库中包含的各所述剪辑模板的音频文件分别进行切片处理，获取所述第一多媒体资源对应的第一音频子文件和各所述剪辑模板包含的音频文件分别对应的第二音频子文件。
本步骤中,服务端设备可以按照固定间隔进行切片处理,以使各第一音频子文件和各第二音频子文件的音频时长保持一致,方便音乐风格识别子模型和音频指纹识别子模型逐段进行分析。
且在实际应用中，由于各剪辑模板中包含的音频文件的时长可能不完全相同，有些剪辑模板包含的音频文件的时长较长，有些剪辑模板包含的音频文件的时长较短。若剪辑模板包含的音频文件的时长较长，则可以按照上述固定间隔进行切片处理。若剪辑模板包含的音频文件的时长满足上述固定间隔，则可以不用进行切片处理。可以根据实际需求灵活设置是否进行切片处理。
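按固定间隔切片的逻辑可用如下Python片段示意（interval即上文的固定间隔；"时长不超过一个间隔则不切片"对应正文中"满足固定间隔可不用切片"的描述，函数签名为示例性假设）：

```python
def slice_audio(duration, interval):
    """按固定间隔把音频时长切成(起点, 终点)子片段列表;
    时长不超过一个间隔时整段保留, 不再切片。"""
    if duration <= interval:
        return [(0, duration)]
    slices = []
    start = 0
    while start < duration:
        slices.append((start, min(start + interval, duration)))
        start += interval
    return slices
```

例如25秒的音频按10秒间隔切片，得到三个子片段，末段不足一个间隔。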
下面以候选剪辑模板为至少一个剪辑模板中的全部和部分两种情况进行说明。
A、假设候选剪辑模板为所述至少一个剪辑模板中的全部
这样的情况下，可将第一多媒体资源对应的第一音频子文件和各所述剪辑模板包含的音频文件分别对应的第二音频子文件，分别输入至音乐风格识别子模型和音频指纹识别子模型，从而获取音乐风格识别子模型输出的各剪辑模板对应的第一识别结果，以及音频指纹识别子模型输出的各剪辑模板对应的第二识别结果。
接着,根据各候选剪辑模板分别对应的第一识别结果和第二识别结果,获取所述搜索结果。
其中,音乐风格识别子模型和音频指纹识别子模型可以并行执行识别任务,也可以按照先后顺序依次执行识别任务。
B、假设候选剪辑模板为所述至少一个剪辑模板中的部分
这样的情况下,S402之前,还包括:S402'、根据所述目标音频文件以及至少一个剪辑模板分别在目标维度上的特征,确定所述候选剪辑模板。
其中,候选剪辑模板的数量可以为一个,也可以为多个。
本步骤中的“至少一个剪辑模板”可以包括模板库中的部分剪辑模板,也可以包括模板库中的全部剪辑模板。当“至少一个剪辑模板”包括模板库中的部分剪辑模板时,至少一个剪辑模板可以是根据剪辑模板的发布时间、使用量、收藏量等因素确定;或者,也可以是随机确定的;或者是通过其他任一种方式确定的。
假设,音乐风格维度的优先级高于音频指纹维度的优先级时:
首先将第一多媒体资源对应的第一音频子文件和所述剪辑模板包含的音频文件分别对应的第二音频子文件,输入至音乐风格识别子模型,获取模板库中各剪辑模板对应的第一识别结果。
接着,按照预先配置的音乐风格维度对应的筛选条件以及各剪辑模板对应的第一识别结果,获得音乐风格维度对应的第二筛选结果。其中,音乐风格维度对应的第二筛选结果包括一个或多个剪辑模板。
再将第一多媒体资源对应的第一音频子文件和音乐风格维度对应的第二筛选结果包含的各剪辑模板分别对应的第二音频子文件,输入至音频指纹识别子模型,获取音乐风格维度对应的第二筛选结果包含的各剪辑模板对应的第二识别结果。
若音频指纹维度预先配置有对应的筛选条件,则可以根据该筛选条件以及音乐风格维度对应的第二筛选结果包含的各剪辑模板对应的第二识别结果,获取第一筛选结果。其中,第一筛选结果包含的剪辑模板即为前述的候选剪辑模板。
假设音频指纹维度的优先级高于音乐风格维度的优先级时:
首先将第一多媒体资源对应的第一音频子文件和所述剪辑模板包含的音频文件分别对应的第二音频子文件,输入至音频指纹识别子模型,获取模板库中各剪辑模板对应的第二识别结果。
接着，按照预先配置的音频指纹维度对应的筛选条件以及各剪辑模板对应的第二识别结果，获得音频指纹维度对应的第二筛选结果。其中，音频指纹维度对应的第二筛选结果包括一个或多个剪辑模板。
再将第一多媒体资源对应的第一音频子文件和音频指纹维度对应的第二筛选结果包含的各剪辑模板分别对应的第二音频子文件,输入至音乐风格识别子模型,获取音频指纹维度对应的第二筛选结果包含的各剪辑模板对应的第一识别结果。
若音乐风格维度预先配置有对应的筛选条件，则可以根据该筛选条件以及音频指纹维度对应的第二筛选结果包含的各剪辑模板对应的第一识别结果，获取第一筛选结果，第一筛选结果包含的剪辑模板即为前述的候选剪辑模板。
在一些情况下,音乐风格维度或者音频指纹维度对应的筛选条件可以为空。
根据各候选剪辑模板分别对应的第一识别结果和第二识别结果,获取所述搜索结果,可以通过下述方式实现:
根据各候选剪辑模板分别对应的第一识别结果和第二识别结果、音乐风格维度和音频指纹维度分别对应的权重系数,获取每个候选剪辑模板对应的加权计算结果。
根据每个候选剪辑模板对应的加权计算结果以及第一预设阈值，获取所述搜索结果；其中，若候选剪辑模板对应的加权计算结果中的最高分数大于所述第一预设阈值，则确定在模板库中搜索到与第一多媒体资源匹配的目标剪辑模板，该目标剪辑模板即为加权计算结果中最高分数对应的候选剪辑模板；若候选剪辑模板对应的加权计算结果中的最高分数小于或等于所述第一预设阈值，则确定在模板库中未搜索到与第一多媒体资源匹配的目标剪辑模板。
在实际应用中,音乐风格维度和音频指纹维度分别对应的权重系数可以根据需求灵活配置。其中,权重系数越大,则该维度所占的比重越大,对搜索结果的影响也越大;权重系数越小,则该维度所占的比重越小,对搜索结果的影响也越小。
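上述"各维度识别结果加权求和，再与预设阈值比较"的搜索判定可用如下Python片段示意（维度名称、权重与阈值取值均为示例，并非正文规定的具体数值）：

```python
def weighted_search(candidates, weights, threshold):
    """candidates: {模板id: {维度: 分数}}; weights: {维度: 权重系数}。
    计算每个候选剪辑模板的加权分数; 最高分大于阈值时返回对应模板id,
    否则视为未搜索到匹配的目标剪辑模板, 返回None。"""
    best_id, best_score = None, float("-inf")
    for tpl_id, scores in candidates.items():
        total = sum(scores.get(dim, 0.0) * w for dim, w in weights.items())
        if total > best_score:
            best_id, best_score = tpl_id, total
    return best_id if best_score > threshold else None
```

权重系数越大的维度对加权分数的影响越大，与正文描述一致。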
本实施例中,首先利用预先训练好的搜索模型对用户指定的音频文件(即第一多媒体资源)与模板库中各剪辑模板包含的音频文件进行音乐风格维度和音频指纹维度上的分析,并根据两个维度上的分析结果综合评价剪辑模板是否是用户想要找到的目标剪辑模板,保证了搜索结果的准确性。且通过搜索模型来执行上述搜索任务,能够提高搜索效率。
另外,若为搜索模型包括的各子模型配置了优先级顺序,则可以进行逐级识别以及筛选,能够减小搜索模型的计算量,进一步提高搜索效率。
情况2:第一多媒体资源为视频文件
其中,图5为本公开一实施例提供的搜索模型的结构示意图;图6为本公开另一实施例提供的剪辑模板搜索方法的流程图。
参照图5所示，搜索模型500包括5个子模型，分别为：音乐风格识别子模型501、视频尺寸识别子模型502、视频片段特征识别子模型503、视觉效果识别子模型504以及音频指纹识别子模型505。
其中，图5所示实施例提供的搜索模型500包括的音乐风格识别子模型501与图3所示实施例提供的搜索模型300包括的音乐风格识别子模型301类似；图5所示实施例提供的搜索模型500包括的音频指纹识别子模型505与图3所示实施例提供的搜索模型300包括的音频指纹识别子模型302类似；详细可参照图3所示实施例中的详细描述，此处不再赘述。
视频尺寸识别子模型502用于根据第一多媒体资源的尺寸特征和所述剪辑模板的尺寸特征,输出视频尺寸维度上,各剪辑模板分别对应的第三识别结果,所述第三识别结果用于指示所述第一多媒体资源与所述剪辑模板之间的视频尺寸的相似度。
其中,上述尺寸特征包括:时长和/或视频帧的长宽比。具体地,第一多媒体资源的尺寸特征包括:多媒体文件的时长和/或多媒体文件包含的视频帧的长宽比。类似地,剪辑模板的尺寸特征包括:剪辑模板的时长和/或剪辑模板的视频帧的长宽比。
应理解,本方案中,视频帧的长宽比可以是视频帧的长除以视频帧的宽获得的,也可以是视频帧的宽除以视频帧的长获得的。
当第三识别结果采用视频尺寸分数表示时，视频时长越接近、视频帧的长宽比越接近，视频尺寸的相似度越高，则视频尺寸分数越高；视频时长差异越大、视频帧的长宽比差异越大，则视频尺寸相似度越低，视频尺寸分数越低。
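视频尺寸维度的相似度计算可用如下Python片段作假设性示意：对时长差与长宽比差分别做相对归一化后取平均，再转成"越接近分数越高"的形式。归一化与合并方式均为假设，正文并未给出具体公式：

```python
def size_score(dur_a, ratio_a, dur_b, ratio_b):
    """视频尺寸分数示意: 时长差、长宽比差分别除以两者较大值
    得到相对差异, 取平均后用1减去, 使越接近分数越高。"""
    dur_diff = abs(dur_a - dur_b) / max(dur_a, dur_b)
    ratio_diff = abs(ratio_a - ratio_b) / max(ratio_a, ratio_b)
    return 1.0 - (dur_diff + ratio_diff) / 2.0
```

时长与长宽比完全一致时得到最高分1.0。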
视频片段特征识别子模型503用于根据第一多媒体资源的视频片段特征和剪辑模板的视频片段特征,输出视频片段特征维度上,剪辑模板对应的第四识别结果,第四识别结果用于指示第一多媒体资源的视频片段与剪辑模板的视频片段之间的相似度。
其中，视频片段特征识别子模型503根据第一多媒体资源的转场时刻，对第一多媒体资源进行分段，获得第一多媒体资源对应的多个第一视频子片段；根据各剪辑模板的转场时刻，对剪辑模板进行分段，获得每个剪辑模板对应的多个第二视频子片段。接着，视频片段特征识别子模型503按照第一视频子片段的先后顺序、第二视频子片段的先后顺序，根据视频子片段的时长、转场方式等特征，获得每个剪辑模板对应的第四识别结果。
当第四识别结果采用视频片段特征分数表示时，视频子片段的时长越接近、转场方式相同，视频片段的相似度越高，则视频片段特征分数越高；视频子片段的时长差异越大、转场方式不同，则视频片段的相似度越低，视频片段特征分数越低。
视觉效果识别子模型504用于根据第一多媒体资源的视觉效果和剪辑模板的视觉效果，输出视觉效果维度上，剪辑模板对应的第五识别结果，第五识别结果用于指示第一多媒体资源的视觉效果与剪辑模板的视觉效果之间的相似度。
其中,视觉效果识别子模型504,具体通过识别第一多媒体资源和剪辑模板各自使用的贴纸素材样式、贴纸素材尺寸、文字素材样式、文字素材尺寸、滤镜效果等等中的一项或多项,获得视觉效果维度上的第五识别结果。
在采用图5所示的搜索模型500的基础上,结合图6所示,本实施例提供的剪辑模板搜索方法包括:
S601、服务端设备获取目标视频文件。
应理解,本实施例中,上述目标视频文件即第一多媒体资源。且服务端设备获取目标视频文件的具体实现方式可参照图2所示实施例中所述,此处不再赘述。
S602、根据候选剪辑模板在目标维度的识别结果,获取所述搜索结果;其中,所述候选剪辑模板在目标维度的识别结果是根据所述目标视频文件和所述候选剪辑模板在所述目标维度的特征获得的。
结合图5所示的搜索模型可知,本实施例中,目标维度包括:音乐风格维度、音频指纹维度、视频尺寸维度、视频片段特征维度以及视觉效果维度。
需要说明的是，在一些情况下，若第一多媒体资源的时长过长，则可以对第一多媒体资源和各剪辑模板均进行切片处理，即服务端设备可以对所述第一多媒体资源和模板库中包含的各所述剪辑模板的音频文件分别进行切片处理，获取所述第一多媒体资源对应的第一音频子文件和各所述剪辑模板包含的音频文件分别对应的第二音频子文件。
切片处理的实现方式可参照图4所示实施例中的详细描述。
A、假设候选剪辑模板为所述至少一个剪辑模板中的全部
这样的情况下,可将第一多媒体资源和所述至少一个剪辑模板,分别输入至搜索模型500包括的5个子模型,获取5个子模型分别输出的各剪辑模板对应的第一识别结果至第五识别结果。
接着,根据各候选剪辑模板分别对应的第一识别结果至第五识别结果,获取所述搜索结果。
B、假设候选剪辑模板为所述至少一个剪辑模板中的部分
这样的情况下,S602之前,还包括:S602'、根据所述目标视频文件以及至少一个剪辑模板分别在目标维度上的特征,确定所述候选剪辑模板。
其中,候选剪辑模板的数量可以为一个,也可以为多个。
本步骤中的“至少一个剪辑模板”可以包括模板库中的部分剪辑模板,也可以包括模板库中的全部剪辑模板。当“至少一个剪辑模板”包括模板库中的部分剪辑模板时,至少一个剪辑模板可以是根据剪辑模板的发布时间、使用量、收藏量等因素确定;或者,也可以是随机确定的;或者是通过其他任一种方式确定的。
一种可能的实现方式,S602'可以包括以下步骤:
步骤a、根据上述5个子模型的优先级顺序,确定当前维度。
具体地，搜索模型500包含的5个子模型的优先级顺序即对应目标维度中各维度的优先级顺序，按照搜索模型500包含的5个子模型的优先级由高到低的顺序，确定当前维度。
步骤b、根据所述第一多媒体资源以及第一筛选结果分别在所述当前维度的特征,获取所述第一筛选结果中每个剪辑模板分别在所述当前维度的第一识别结果。
本步骤中,第一筛选结果即为前一维度对应的筛选结果。
步骤c、根据所述第一识别结果以及所述当前维度对应的筛选条件,获取所述当前维度对应的第二筛选结果;所述第二筛选结果包括:一个或者多个剪辑模板,所述第一筛选结果的初始状态包括:所述至少一个剪辑模板。
步骤d、确定所述第二筛选结果为第一筛选结果。
返回执行步骤a,直到获取最后一个所述当前维度对应的第二筛选结果,确定最后一个所述当前维度对应的第二筛选结果包括的剪辑模板为所述候选剪辑模板。
需要说明的是,在上述过程中,一些维度对应的筛选条件可以配置为空。
示例性地，假设搜索模型包括的5个子模型的优先级顺序为：音乐风格识别子模型>视频尺寸识别子模型>视频片段特征识别子模型>音频指纹识别子模型>视觉效果识别子模型。例如，音频指纹维度对应的筛选条件为空，则直接将第一多媒体资源包含的音频文件、视频片段特征维度对应的第二筛选结果包括的剪辑模板的音频文件，输入至音频指纹识别子模型中。也就是说，视频片段特征维度对应的第二筛选结果包含的剪辑模板与音频指纹维度对应的第二筛选结果包含的剪辑模板相同。
需要说明的是,搜索模型包括的各子模型的优先级顺序可灵活配置,并不限于上述示例,例如搜索模型包括的各子模型的优先级顺序还可以为:音乐风格识别子模型>视频尺寸识别子模型>音频指纹识别子模型>视频片段特征识别子模型>视觉效果识别子模型;或者,优先级顺序还可以为:音乐风格识别子模型>视频尺寸识别子模型>音频指纹识别子模型=视频片段特征识别子模型>视觉效果识别子模型等等。
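上述步骤a至步骤d的逐级筛选流程可用如下Python片段示意（识别函数与筛选条件以可调用对象表示，筛选条件为None对应正文中"筛选条件为空"的情况；这种封装方式为示例性假设）：

```python
def cascade_filter(resource, templates, dims):
    """按维度优先级逐级筛选候选剪辑模板。
    dims: [(识别函数, 筛选条件或None)], 已按优先级从高到低排列;
    识别函数(resource, template)返回该维度的识别分数,
    筛选条件(分数)返回是否保留该剪辑模板。"""
    pool = list(templates)  # 第一筛选结果的初始状态: 至少一个剪辑模板
    for score_fn, keep in dims:
        scored = [(tpl, score_fn(resource, tpl)) for tpl in pool]
        if keep is not None:  # 筛选条件为空时不做筛选, 直接进入下一维度
            scored = [(tpl, s) for tpl, s in scored if keep(s)]
        pool = [tpl for tpl, _ in scored]  # 第二筛选结果作为下一级的第一筛选结果
        if not pool:
            break
    return pool  # 最后一个维度的第二筛选结果即候选剪辑模板
```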
本实施例中，搜索模型包括的音乐风格识别子模型和音频指纹识别子模型对多媒体文件包含的音频文件和剪辑模板包含的音频文件进行分析。若多媒体文件包含的音频文件或剪辑模板包含的音频文件存在时长过长的情况，则可以采用图4所示实施例中的方式，对时长过长的音频文件进行切片处理，详细参照图4所示实施例中的描述，此处不再赘述。
根据各所述剪辑模板分别对应的第一识别结果至第五识别结果,获取所述搜索结果,可以通过下述方式实现:
根据各剪辑模板分别对应的第一识别结果至第五识别结果、音乐风格维度、视频尺寸维度、视频片段特征维度、音频指纹维度以及视觉效果维度分别对应的权重系数,获取每个候选剪辑模板对应的加权计算结果。
根据候选剪辑模板对应的加权计算结果以及第二预设阈值，获取所述搜索结果；其中，若候选剪辑模板对应的加权计算结果中的最高分数大于所述第二预设阈值，则确定在模板库中搜索到与第一多媒体资源匹配的目标剪辑模板，该目标剪辑模板即为加权计算结果中最高分数对应的候选剪辑模板；若候选剪辑模板对应的加权计算结果中的最高分数小于或等于所述第二预设阈值，则确定在模板库中未搜索到与第一多媒体资源匹配的目标剪辑模板。
可选的,上述第一预设阈值和第二预设阈值的取值可以相同,也可以不同。
本实施例中,首先利用预先训练好的搜索模型对用户指定的目标视频文件(即第一多媒体资源)与候选剪辑模板进行音乐风格维度、音频指纹维度、视频尺寸维度、视频片段特征维度以及视觉效果维度上的分析,并根据上述五个维度上的分析结果综合评价候选剪辑模板是否是用户想要找到的目标剪辑模板,保证了搜索结果的准确性。
另外,若为搜索模型包括的各子模型配置了优先级顺序,则可以进行逐级识别以及筛选,能够减小搜索模型的计算量,进一步提高搜索效率。
在实际应用中,基于不同的场景以及目的,可以对搜索模型进行调整,例如,在一些情况下,搜索模型可以仅包括音乐风格识别子模型。图3以及图5所示实施例并不是对搜索模型的具体实现方式的限制。
图7为本公开另一实施例提供的剪辑模板搜索方法的流程图。参照图7所示,本实施例的方法包括:
S701、服务端设备获取第一多媒体资源。
S702、服务端设备根据所述第一多媒体资源进行搜索,获取搜索结果。
S703、服务端设备向终端设备返回所述搜索结果。相应地,终端设备接收服务端设备返回的搜索结果。
S704、服务端设备根据所述搜索结果,从所述模板库中获取所述目标剪辑模板。
S705、服务端设备将所述目标剪辑模板发送给所述终端设备。
具体地,搜索结果指示搜索到与第一多媒体资源匹配的目标剪辑模板时,服务端设备可根据目标剪辑模板的标识从模板库中提取目标剪辑模板对应的所有数据,并进行打包,之后发送给终端设备。
S706、终端设备展示所述目标剪辑模板。
终端设备接收目标剪辑模板的数据之后,进行解码显示,以使持有终端设备的用户能够查看目标剪辑模板的详细信息。
本实施例中,通过向用户展示根据用户指定的第一多媒体资源匹配到的目标剪辑模板,该目标剪辑模板更加符合用户进行视频创作的需求,能够提高目标剪辑模板的使用率,同时还提高用户进行视频创作的积极性。
当然,在另一些情况下,若搜索结果指示未搜索到与第一多媒体资源匹配的目标剪辑模板时,终端设备可根据搜索结果向用户展示未搜索到目标剪辑模板的提示信息。
基于前述描述，结合附图以及应用场景，对本公开实施例提供的剪辑模板搜索方法进行详细介绍。为了便于说明，下面以终端设备为手机，且手机中安装有视频编辑APP（简称为应用1）和视频类APP（简称为应用2）为例进行说明。
目前，手机以及其他终端设备提供了智能读剪贴板功能，可以对复制的内容进行自动填充。下面以手机开启智能读剪贴板功能和关闭智能读剪贴板功能两种场景，分别对本公开提供的剪辑模板搜索方法进行介绍。
场景一
用户可在其他视频类应用程序或者音乐类应用程序中复制目标链接，且手机的智能读剪贴板功能开启的情况下，当用户打开应用1时，应用1可从剪贴板中获取目标链接。
示例性地,应用2可在手机上显示图8A示例性所示的用户界面11,用户界面11用于显示应用2的视频播放页面,应用2可在视频播放页面中执行一些功能集合,如播放多媒体文件(如短视频)、分享多媒体文件。其中,用户界面11包括:控件1101,控件1101用于复制当前正在播放的多媒体文件的链接。
应理解,当前正在播放的多媒体文件的链接即前文中描述的目标链接。
当用户退出应用2之后,且在预设时长内打开应用1,应用1通过后台将目标链接发送给服务器。其中,预设时长例如为5秒,10秒等等。应用1在等待服务器返回搜索结果之前,应用1可在手机上显示图8B示例性地所示的用户界面12,其中,用户界面12用于显示应用1的等待页面。
用户界面12中可以显示当前的搜索进度信息，如用户界面12显示的等待页面中示例性地显示“识别中75%”。用户界面12中还包括控件1201，控件1201用于放弃当前搜索任务。
在应用1接收到用户在图8B所示的用户界面12中执行如点击控件1201的操作后,应用1可以在手机上显示应用1的默认主页面。
在应用1接收到服务器返回的搜索结果以及匹配到的目标剪辑模板的数据时,应用1可以在手机上显示图8C示例性地所示的用户界面13,其中,用户界面13中包括:展示窗口1301,展示窗口用于展示目标剪辑模板的封面,该封面可以是目标剪辑模板包含的视频中的任一视频帧,或者,特定视频帧。
用户界面13还包括：控件1302，其中，控件1302用于进入目标剪辑模板的详情页面。应用1接收到用户在图8C所示的用户界面13中执行如点击控件1302的操作后，应用1在手机上显示图8D所示的用户界面14，其中，用户界面14用于显示应用1的视频播放页面，应用1可在视频播放页面中执行一些功能集合，如播放多媒体文件、提供进入视频创作页面的可视化入口等等。
用户界面13还包括:控件1303,其中,控件1303用于关闭展示窗口1301。
应用1接收到用户在图8C所示的用户界面13中执行针对控件1302的点击操作后,应用1可以通过手机向服务器发送剪辑模板获取请求,剪辑模板获取请求用于请求获取目标剪辑模板的所有数据。
用户界面14包括:控件1401,控件1401用于进入以该目标剪辑模板为创作模板的视频创作页面。
另外,当应用1接收到用户在图8C所示的用户界面13中执行的滑动操作后,应用1可以通过手机向服务器发送候选剪辑模板获取请求,候选剪辑模板获取请求用于请求加权分数仅次于目标剪辑模板的候选剪辑模板的所有数据。
服务器接收到候选剪辑模板获取请求后，服务器可根据搜索结果，将加权分数仅次于目标剪辑模板的候选剪辑模板的所有数据发送给手机。
若服务器返回的搜索结果指示模板库中未包含与目标链接相匹配的目标剪辑模板,则应用1在手机上显示如图8B所示的等待页面之后,应用1可在手机上显示如图8E所示的用户界面15,其中,用户界面15中包含窗口1501,其中,窗口1501包含文字信息显示区域1502以及控件1503。
其中,文字信息显示区域1502可以显示搜索结果的相关内容,例如,文字信息显示区域1502中显示文字“暂无找到同款模板,换个链接试试吧~”。
控件1503用于关闭窗口1501。
示例性地,应用1接收到用户在图8E所示的用户界面15中执行如点击控件1503的操作后,应用1可以在手机上显示应用1的默认主页面。
场景二
手机的智能读剪贴板功能关闭的情况下,可以通过用户手动输入的方式,使应用1获得目标链接。
应用1在手机上显示如图8F示例性地所示的用户界面16,其中,用户界面16包括:输入窗口1601,其中,输入窗口1601中包括控件1602,控件1602用于进入链接搜索页面。
应用1接收用户在图8F所示的用户界面16中执行如点击控件1602的操作后,应用1在手机上显示如图8G示例性地所示的用户界面17。其中,用户界面17显示的链接搜索页面中包括输入窗口1701,输入窗口1701中可显示提示信息以提醒用户可以粘贴来自其他应用的视频链接或者音乐链接找同款模板。
若用户已在其他视频类应用程序或者音乐类应用程序中复制目标链接，则应用1在接收到用户在图8G所示的用户界面17中执行如长按输入窗口1701的操作后，在手机上显示如图8H所示的用户界面18，用户界面18中包含控件1801，控件1801用于将剪切板中的内容粘贴至输入窗口1701中。
应用1接收用户在图8H所示的用户界面18中执行如点击控件1801的操作后,在输入窗口1701中显示目标链接,应用1相应的在手机上显示如图8I所示的用户界面19。
在另一些情况下，用户界面17中还可以包括：输入法软键盘区域1702，用户可通过操作输入法软键盘区域1702向输入窗口1701中手动输入目标链接。
用户界面17中还可以包括控件1703，控件1703用于关闭链接搜索页面。
用户界面17中还包括控件1704,控件1704用于根据目标链接生成搜索任务。手机根据搜索任务将目标链接发送至服务器。
可选地,用户界面19中还可以包括:控件1705,其中,控件1705用于删除输入窗口1701中的所有内容。例如,应用1在接收到用户在图8I所示的用户界面19中执行如点击控件1705的操作后,应用1在手机上显示如图8G所示的用户界面17。又如,用户通过手动操作用户界面17包含的输入法软键盘区域1702向输入窗口1701中输入目标链接的部分内容或者全部内容时,也可以通过操作控件1705,以删除输入窗口1701中的所有内容。
当应用1检测到用户界面17以及用户界面18中所示的输入窗口1701中未输入正确的目标链接时,控件1704处于第一状态;当应用1检测到输入窗口1701中输入了正确的目标链接时,控件1704处于第二状态。
其中,第一状态为未激活状态,第二状态为激活状态。未激活状态下,操作控件1704无法生成搜索任务;激活状态下,操作控件1704可根据输入窗口1701中的目标链接生成搜索任务。
例如,用户界面18和用户界面19中也包括控件1704,在用户界面18中控件1704处于第一状态,在用户界面19中控件1704处于第二状态。
应用1接收到用户在如图8I所示的用户界面19中执行如点击控件1704的操作后，应用1生成搜索任务，将目标链接通过手机发送给服务器，以使服务器根据目标链接进行搜索。且应用1接收用户在用户界面19中执行如点击控件1704的操作后，应用1可以在手机上显示如图8B所示的用户界面12。
在场景二中,若服务器返回搜索结果,应用1可根据搜索结果在手机显示如图8C至图8E中所示的用户界面13至用户界面15。
在一些情况下，用户可能不清楚当前应用1支持通过链接搜索同款模板的功能，因此，针对未使用过通过链接搜索目标剪辑模板的用户，当用户打开应用1时，应用1可在手机上示例性地显示如图8J所示的用户界面20。用户界面20包括：窗口2001，其中，窗口2001用于显示引导信息，引导信息例如为“支持通过链接找模板”。
在一些情况下，例如手机与服务器之间通信质量较差时，虽然服务器能够根据目标链接匹配到目标剪辑模板，但由于通信质量较差，手机无法从服务器获得目标剪辑模板的数据。应用1可在手机上显示如图8K所示的用户界面21，其中，用户界面21中包含窗口2101，窗口2101用于显示加载失败页面。加载失败页面中可以包括区域2102、窗口2103以及控件2104，其中，区域2102用于显示加载失败的提示信息；窗口2103中包含控件2105，控件2105用于生成新的数据加载任务；控件2104用于取消数据加载任务。
通过上述图8A至图8K所示的人机交互界面示意图,并结合实际应用场景可知,本公开实施例提供的剪辑模板搜索方法,方便用户通过操作终端设备显示的用户界面上的控件,即可快速获得想要的目标剪辑模板,能够较好地满足用户进行视频创作的需求。
图9为本公开一实施例提供的剪辑模板搜索装置的结构示意图。参照图9所示,本实施例提供的剪辑模板搜索装置900包括:获取模块901和搜索模块902。
其中,获取模块901,用于获取用户指定的第一多媒体资源。
搜索模块902,用于根据所述第一多媒体资源进行搜索,获取搜索结果;
其中,所述搜索结果用于指示是否搜索到与所述第一多媒体资源匹配的目标剪辑模板,所述目标剪辑模板用于指示待剪辑多媒体素材按照目标剪辑方式被剪辑成第二多媒体资源,所述目标剪辑方式为所述第一多媒体资源所采用的剪辑方式。
在一些可能的设计中,搜索模块902,具体用于根据候选剪辑模板在目标维度的识别结果,获取所述搜索结果;其中,所述候选剪辑模板在目标维度的识别结果是根据所述第一多媒体资源和所述候选剪辑模板在所述目标维度的特征获得的。
在一些可能的设计中,搜索模块902,还用于根据所述第一多媒体资源以及至少一个剪辑模板分别在目标维度上的特征,确定所述候选剪辑模板;其中,所述至少一个剪辑模板包括所述候选剪辑模板。
在一些可能的设计中,所述目标维度包括:音乐风格维度、音频指纹维度、视频尺寸维度、视频片段特征维度以及视觉效果维度中的一个或多个。
在一些可能的设计中,所述目标维度包括所述音乐风格维度、所述音频指纹维度、所述视频尺寸维度、所述视频片段特征维度以及所述视觉效果维度中的多个维度;
则所述搜索模块902,具体用于根据所述候选剪辑模板分别在所述目标维度中各维度的识别结果、以及所述目标维度中各维度对应的权重系数,获取所述候选剪辑模板对应的加权计算结果;根据所述候选剪辑模板对应的加权计算结果,获取所述搜索结果。
所述搜索模块902，具体用于按照目标维度中的各维度的优先级顺序，确定当前维度；根据所述第一多媒体资源以及第一筛选结果分别在所述当前维度的特征，获取所述第一筛选结果中每个剪辑模板分别在所述当前维度的第一识别结果；根据所述第一识别结果以及所述当前维度对应的筛选条件，获取所述当前维度对应的第二筛选结果；所述第二筛选结果包括：一个或者多个剪辑模板，所述第一筛选结果的初始状态包括：所述至少一个剪辑模板；确定所述第二筛选结果为第一筛选结果；返回执行所述按照目标维度中的各维度的优先级顺序，确定当前维度，直到获取最后一个所述当前维度对应的第二筛选结果，确定最后一个所述当前维度对应的第二筛选结果包括的剪辑模板为所述候选剪辑模板。
在一些可能的设计中,获取模块901,具体用于获取用户输入的目标链接,并对所述目标链接进行解析,获取所述第一多媒体资源。
在一些可能的设计中,剪辑模板搜索装置900,还包括:发送模块903。发送模块903,用于向用户发送所述搜索结果。
在一些可能的设计中,发送模块903,还用于根据搜索结果,向用户发送目标剪辑模板。
本实施例提供的剪辑模板搜索装置可以用于执行前述任一实施例中服务端设备执行的技术方案,其实现原理以及技术效果类似,可参照前文中的详细描述,此处不再赘述。
图10为本公开一实施例提供的电子设备的结构示意图。参照图10所示,本实施例提供的电子设备1000包括:存储器1001和处理器1002。
其中,存储器1001可以是独立的物理单元,与处理器1002可以通过总线1003连接。存储器1001、处理器1002也可以集成在一起,通过硬件实现等。
存储器1001用于存储程序指令,处理器1002调用该程序指令,执行以上任一方法实施例中服务端设备或者终端设备执行的操作。
可选地,当上述实施例的方法中的部分或全部通过软件实现时,上述电子设备1000也可以只包括处理器1002。用于存储程序的存储器1001位于电子设备1000之外,处理器1002通过电路/电线与存储器连接,用于读取并执行存储器中存储的程序。
处理器1002可以是中央处理器(central processing unit,CPU),网络处理器(network processor,NP)或者CPU和NP的组合。
处理器1002还可以进一步包括硬件芯片。上述硬件芯片可以是专用集成电路(application-specific integrated circuit,ASIC),可编程逻辑器件(programmable logic device,PLD)或其组合。上述PLD可以是复杂可编程逻辑器件(complex programmable logic device,CPLD),现场可编程逻辑门阵列(field-programmable gate array,FPGA),通用阵列逻辑(generic array logic,GAL)或其任意组合。
存储器1001可以包括易失性存储器(volatile memory),例如随机存取存储器(random-access memory,RAM);存储器也可以包括非易失性存储器(non-volatile memory),例如快闪存储器(flash memory),硬盘(hard disk drive,HDD)或固态硬盘(solid-state drive,SSD);存储器还可以包括上述种类的存储器的组合。
本公开还提供一种计算机可读存储介质,计算机可读存储介质中包括计算机程序指令,所述计算机程序指令在被电子设备的至少一个处理器执行时,以执行以上任一方法实施例中服务端设备或者终端设备执行的技术方案。
本公开还提供一种程序产品,所述程序产品包括计算机程序,所述计算机程序存储在可读存储介质中,所述电子设备的至少一个处理器可以从所述可读存储介质中读取所述计算机程序,所述至少一个处理器执行所述计算机程序使得所述电子设备执行如上任一方法实施例中服务端设备或者终端设备执行的技术方案。
需要说明的是,在本文中,诸如“第一”和“第二”等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。
以上所述仅是本公开的具体实施方式,使本领域技术人员能够理解或实现本公开。对这些实施例的多种修改对本领域的技术人员来说将是显而易见的,本文中所定义的一般原理可以在不脱离本公开的精神或范围的情况下,在其它实施例中实现。因此,本公开将不会被限制于本文所述的这些实施例,而是要符合与本文所公开的原理和新颖特点相一致的最宽的范围。

Claims (11)

  1. 一种剪辑模板搜索方法,其特征在于,包括:
    获取用户指定的第一多媒体资源;
    根据所述第一多媒体资源进行搜索,获取搜索结果;
    其中,所述搜索结果用于指示是否搜索到与所述第一多媒体资源匹配的目标剪辑模板,所述目标剪辑模板用于指示待剪辑多媒体素材按照目标剪辑方式被剪辑成第二多媒体资源,所述目标剪辑方式为所述第一多媒体资源所采用的剪辑方式。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一多媒体资源进行搜索,获取搜索结果,包括:
    根据候选剪辑模板在目标维度的识别结果,获取所述搜索结果;
    其中,所述候选剪辑模板在目标维度的识别结果是根据所述第一多媒体资源和所述候选剪辑模板在所述目标维度的特征获得的。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述候选剪辑模板在目标维度的识别结果,获取所述搜索结果之前,还包括:
    根据所述第一多媒体资源以及至少一个剪辑模板分别在目标维度上的特征,确定所述候选剪辑模板;
    其中,所述至少一个剪辑模板包括所述候选剪辑模板。
  4. 根据权利要求2或3所述的方法,其特征在于,所述目标维度包括:音乐风格维度、音频指纹维度、视频尺寸维度、视频片段特征维度以及视觉效果维度中的一个或多个。
  5. 根据权利要求4所述的方法,其特征在于,所述目标维度包括所述音乐风格维度、所述音频指纹维度、所述视频尺寸维度、所述视频片段特征维度以及所述视觉效果维度中的多个维度;
    所述根据所述候选剪辑模板在所述目标维度的识别结果,获取所述搜索结果,包括:
    根据所述候选剪辑模板分别在所述目标维度中各维度的识别结果、以及所述目标维度中各维度对应的权重系数,获取所述候选剪辑模板对应的加权计算结果;
    根据所述候选剪辑模板对应的加权计算结果,获取所述搜索结果。
  6. 根据权利要求4所述的方法,其特征在于,所述根据所述第一多媒体资源以及至少一个剪辑模板分别在目标维度上的特征,确定所述候选剪辑模板,包括:
    按照目标维度中的各维度的优先级顺序,确定当前维度;
    根据所述第一多媒体资源以及第一筛选结果分别在所述当前维度的特征,获取所述第一筛选结果中每个剪辑模板分别在所述当前维度的第一识别结果;
    根据所述第一识别结果以及所述当前维度对应的筛选条件,获取所述当前维度对应的第二筛选结果;所述第二筛选结果包括:一个或者多个剪辑模板,所述第一筛选结果的初始状态包括:所述至少一个剪辑模板;
    确定所述第二筛选结果为第一筛选结果;
返回执行所述按照目标维度中的各维度的优先级顺序，确定当前维度，直到获取最后一个所述当前维度对应的第二筛选结果，确定最后一个所述当前维度对应的第二筛选结果包括的剪辑模板为所述候选剪辑模板。
  7. 根据权利要求1至3任一项所述的方法,其特征在于,所述获取用户指定的第一多媒体资源,包括:
    获取用户输入的目标链接,并对所述目标链接进行解析,获取所述第一多媒体资源。
  8. 根据权利要求1至3任一项所述的方法,其特征在于,所述方法还包括:
    向所述用户发送所述搜索结果。
  9. 根据权利要求8所述的方法,其特征在于,所述方法还包括:
    根据所述搜索结果,向所述用户发送所述目标剪辑模板。
  10. 一种电子设备,其特征在于,包括:存储器、处理器以及计算机程序;
    所述存储器被配置为存储所述计算机程序;
    所述处理器被配置为执行所述计算机程序,以实现如权利要求1至9任一项所述的方法。
  11. 一种程序产品,其特征在于,包括:计算机程序,所述计算机程序存储在可读存储介质中,电子设备的至少一个处理器可以从所述可读存储介质中读取所述计算机程序,所述至少一个处理器执行所述计算机程序使得所述电子设备实现如权利要求1至9任一项所述的方法。




