CN112188307B - Video resource synthesis method and device, storage medium and electronic device - Google Patents

Video resource synthesis method and device, storage medium and electronic device

Info

Publication number
CN112188307B
Authority
CN
China
Prior art keywords
audio
video
resource
clip
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910595978.XA
Other languages
Chinese (zh)
Other versions
CN112188307A (en)
Inventor
潘文婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910595978.XA priority Critical patent/CN112188307B/en
Publication of CN112188307A publication Critical patent/CN112188307A/en
Application granted granted Critical
Publication of CN112188307B publication Critical patent/CN112188307B/en

Classifications

    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/233 Processing of audio elementary streams (server side)
    • H04N21/23424 Processing of video elementary streams involving splicing one content stream with another, e.g. for inserting or substituting an advertisement
    • H04N21/439 Processing of audio elementary streams (client side)
    • H04N21/44016 Processing of video elementary streams involving splicing one content stream with another, e.g. for substituting a video clip
    • H04N21/47205 End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses a video resource synthesis method and device, a storage medium, and an electronic device. The method comprises: obtaining an audio segmentation request on a client, where the audio segmentation request is used to request that a target audio resource be divided into a plurality of audio segments; dividing the target audio resource into the plurality of audio segments on the client in response to the audio segmentation request; when a video shooting request is received, sequentially shooting, on the client in response to the request, a plurality of video clips in one-to-one correspondence with the plurality of audio segments, where the playing duration of each video clip is the same as that of its corresponding audio segment; and synthesizing the target audio resource, or the plurality of audio segments, with the plurality of video clips through the client to obtain a target video resource. The invention solves the technical problem of low video shooting efficiency in the related art.

Description

Video resource synthesis method and device, storage medium and electronic device
Technical Field
The invention relates to the field of computers, and in particular to a video resource synthesis method and device, a storage medium, and an electronic device.
Background
In existing applications, when a user shoots a video, the user first selects a song and then enters a shooting interface on which a recording button is displayed. Clicking the recording button starts recording the video and simultaneously starts playing the soundtrack. During recording, clicking the button again pauses both the video recording and the soundtrack. When the whole piece of music has finished playing, recording ends, yielding a continuous video with the soundtrack.
In this existing scheme, because the soundtrack is played intermittently, the user cannot clearly hear the rhythm points of the music, cannot pause and resume accurately on those rhythm points, and therefore struggles to produce a high-quality rhythm-synchronized video. Shooting efficiency is low.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a video resource synthesis method, a video resource synthesis device, a storage medium and an electronic device, and at least solves the technical problem of low video shooting efficiency in the related art.
According to an aspect of the embodiments of the present invention, there is provided a method for synthesizing a video resource, including:
obtaining an audio segmentation request on a client, where the audio segmentation request is used to request that a target audio resource be divided into a plurality of audio segments;
dividing the target audio resource into the plurality of audio segments on the client in response to the audio segmentation request;
in a case where a video shooting request is received, sequentially shooting, on the client in response to the video shooting request, a plurality of video clips in one-to-one correspondence with the plurality of audio segments, where the playing duration of each of the plurality of video clips is the same as the playing duration of the audio segment corresponding to that video clip;
and synthesizing the target audio resource, or the plurality of audio segments, with the plurality of video clips through the client to obtain a target video resource.
According to another aspect of the embodiments of the present invention, there is also provided a video resource composition apparatus, including:
a first obtaining module, configured to obtain an audio segmentation request on a client, where the audio segmentation request is used to request that a target audio resource be divided into a plurality of audio segments;
a partitioning module for partitioning the target audio resource into the plurality of audio segments on the client in response to the audio segmentation request;
the shooting module is used for responding to the video shooting request on the client to sequentially shoot a plurality of video clips corresponding to the plurality of audio clips one by one under the condition that the video shooting request is received, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to each video clip;
and the synthesis module is used for synthesizing the target audio resource or the plurality of audio clips and the plurality of video clips through the client to obtain the target video resource.
Optionally, the first obtaining module includes:
the first display unit is used for displaying the playing information of the target audio resource on a first interface of a client, wherein the playing information is used for indicating the playing time of the target audio resource;
a first determining unit, configured to determine, when a segmentation operation performed at N positions on the play information is detected, time points corresponding to the N positions as N segmentation points of the target audio resource, where N is an integer greater than or equal to 1;
a generating unit, configured to generate the audio segmentation request carrying the N segmentation points, where the audio segmentation request is used to request that the target audio resource is divided into N +1 audio segments at the N segmentation points.
Optionally, the first determining unit includes:
the playing subunit is used for responding to the received playing instruction and playing the target audio resource;
a detecting subunit, configured to detect, during the playing of the target audio resource, N times of the segmentation operations performed on the playing information;
a first determining subunit, configured to determine, as the N positions, the playing positions of the target audio resource when the N times of the segmenting operations are performed;
and the second determining subunit is configured to determine time points corresponding to the N positions as N segment points of the target audio resource.
Optionally, the dividing module includes:
a dividing unit, configured to divide the target audio resource into the N +1 audio segments at the N segmentation points in response to the audio segmentation request;
a second determining unit, configured to determine a time order of the N +1 audio clips in the target audio resource as a video shooting order corresponding to the N +1 audio clips.
Optionally, the apparatus further comprises:
a first display module, configured to display an audio resource set on a second interface before displaying the playing information of the target audio resource on the first interface, where an audio resource included in the audio resource set is used as background music for shooting a video resource, and the audio resource set includes the target audio resource;
a second obtaining module, configured to obtain the playing information of the target audio resource when a selection operation performed on the target audio resource is detected, where the playing information includes at least one of a time axis and a sound wave diagram.
Optionally, the photographing module includes:
the second display unit is used for responding to the received first shooting request on the client to display a third interface and displaying a first button on the third interface, wherein the first button is used for controlling shooting of a video clip;
a first shooting unit, configured to, when the trigger operation performed on the first button is detected, play the audio clip corresponding to the current video clip on the client and simultaneously start shooting the current video clip;
the processing unit is used for stopping shooting the current video clip when the audio clip corresponding to the current video clip is played, and displaying a second button on the third interface, wherein the second button is used for controlling shooting of the next video clip;
and the second shooting unit is used for playing the audio clip corresponding to the next video clip and starting shooting the next video clip under the condition that the triggering operation executed on the second button is detected, wherein the audio clip corresponding to the next video clip is an audio clip arranged behind the audio clip corresponding to the current video clip according to the time sequence.
Optionally, the second display unit comprises:
a first display subunit, configured to display a video shooting button on the third interface when the playing time of the audio clip corresponding to the current video clip is longer than a target time, where the video shooting button is used to control to shoot a video, and the first button includes the video shooting button;
and the second display subunit is configured to display a photo shooting button on the third interface when the playing time of the audio clip corresponding to the current video clip is less than or equal to a target time, where the photo shooting button is used to control to shoot a photo, and the first button includes the photo shooting button.
Optionally, the first photographing unit includes:
the shooting subunit is configured to, when the trigger operation performed on the photo shooting button is detected, play an audio clip corresponding to the current video clip on the client, and simultaneously shoot a photo corresponding to the audio clip corresponding to the current video clip;
and the generating subunit is configured to generate the current video clip whose playing content is the photo and whose playing time is the playing time of the audio clip corresponding to the current video clip.
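This long/short branching can be sketched as follows. The threshold value and all names here are hypothetical illustrations (the claims only name an unspecified "target time"): segments longer than the threshold get a video-shooting button, shorter segments get a photo button, and a shot photo is turned into a still clip whose playing time equals that of its audio segment.

```python
TARGET_TIME = 3.0  # illustrative threshold; the claims only say "target time"

def choose_button(segment_duration):
    """Show the video-shooting button for long segments, the photo button otherwise."""
    return "video" if segment_duration > TARGET_TIME else "photo"

def photo_to_clip(photo, segment_duration):
    """Turn a shot photo into a still 'video clip' whose playing time
    equals that of the corresponding audio segment."""
    return {"content": photo, "duration": segment_duration}

btn_long = choose_button(5.0)              # segment longer than the target time
btn_short = choose_button(2.0)             # segment at or below the target time
still = photo_to_clip("photo.jpg", 2.0)    # still clip matching the segment
```

Either way, the resulting clip has the same duration as its audio segment, which is what keeps the later splicing step trivial.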
Optionally, the apparatus further comprises:
the second display module is used for displaying prompt information on a third interface after responding to the received first shooting request and displaying the third interface, wherein the prompt information is used for prompting at least one of the following information: the time length of the audio clip corresponding to the current video clip, the number of the plurality of audio clips, the sequence of the audio clip corresponding to the current video clip in the plurality of audio clips, the shooting progress of the current video clip, and the shooting progress of the plurality of video clips.
Optionally, the synthesis module comprises:
the first splicing unit is used for splicing the plurality of video clips into a video file according to the time sequence of the plurality of audio clips;
and the first synthesis unit is used for synthesizing the video file and the target audio resource into the target video resource.
Optionally, the synthesis module comprises:
the second synthesis unit is used for synthesizing each video clip in the video clips and each audio clip in the audio clips corresponding to each video clip into a plurality of video resources;
and the second splicing unit is used for splicing the plurality of video resources into the target video resource according to the time sequence of the plurality of audio clips.
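The two claimed composition orders can be sketched side by side; the function names are hypothetical and durations stand in for real media data. Strategy 1 splices the video clips first and then merges the full audio; strategy 2 merges each clip with its own audio segment and then splices the per-segment resources. Both yield the same total duration.

```python
def splice_then_merge(video_clip_durations, full_audio_duration):
    """Strategy 1: splice all video clips into one file in time order,
    then synthesize that file with the full target audio resource."""
    video_file = sum(video_clip_durations)
    return {"video": video_file, "audio": full_audio_duration}

def merge_then_splice(video_clip_durations, audio_segment_durations):
    """Strategy 2: synthesize each clip with its own audio segment into
    a per-segment video resource, then splice the resources in time order."""
    resources = [{"video": v, "audio": a}
                 for v, a in zip(video_clip_durations, audio_segment_durations)]
    return sum(r["video"] for r in resources)

a = splice_then_merge([10.0, 8.0, 12.0], 30.0)
b = merge_then_splice([10.0, 8.0, 12.0], [10.0, 8.0, 12.0])
```

In a real implementation both strategies would operate on encoded streams rather than durations, but the ordering of the splice and merge operations is the substantive difference between the two units.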
According to another aspect of the embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, where the computer program is configured to perform any one of the methods described above when run.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory and a processor, wherein the memory stores therein a computer program, and the processor is configured to execute the method described in any one of the above through the computer program.
In the embodiment of the invention, an audio segmentation request is acquired from a client, wherein the audio segmentation request is used for requesting to divide a target audio resource into a plurality of audio segments; dividing the target audio resource into a plurality of audio segments in response to the audio segmentation request on the client; under the condition that a video shooting request is received, responding to the video shooting request on a client to sequentially shoot a plurality of video clips corresponding to a plurality of audio clips one by one, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to each video clip; the method comprises the steps of synthesizing a target audio resource or a plurality of audio clips and a plurality of video clips through a client to obtain a target video resource, dividing the selected target audio resource into a plurality of audio clips according to an audio segmentation request before shooting a video, shooting a plurality of video clips corresponding to the plurality of audio clips in sequence in a video shooting stage, and synthesizing the plurality of video clips into the target video resource.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic diagram of an alternative method for composition of a video asset in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of an application environment of an alternative video asset composition method according to an embodiment of the present invention;
FIG. 3 is a first schematic diagram of an alternative video asset compositing method according to an alternative embodiment of the invention;
FIG. 4 is a second schematic diagram of an alternative video asset compositing method according to an alternative embodiment of the invention;
FIG. 5 is a third schematic diagram of an alternative video asset compositing method according to an alternative embodiment of the invention;
FIG. 6 is a fourth schematic diagram of an alternative video asset compositing method according to an alternative embodiment of the present invention;
FIG. 7 is a fifth schematic diagram of an alternative video asset compositing method according to an alternative embodiment of the invention;
fig. 8 is a schematic diagram of an alternative video asset compositing apparatus according to an embodiment of the invention;
fig. 9 is a schematic view of an application scenario of an alternative video asset composition method according to an embodiment of the present invention; and
FIG. 10 is a schematic diagram of an alternative electronic device according to embodiments of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of an embodiment of the present invention, there is provided a method for synthesizing a video resource, as shown in fig. 1, the method including:
s102, an audio segmentation request is obtained from a client, wherein the audio segmentation request is used for requesting to divide a target audio resource into a plurality of audio segments;
s104, responding to the audio segmentation request on the client to divide the target audio resource into a plurality of audio segments;
s106, under the condition that a video shooting request is received, responding to the video shooting request on a client to sequentially shoot a plurality of video clips corresponding to a plurality of audio clips one by one, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to each video clip;
and S108, synthesizing the target audio resource or the plurality of audio clips and the plurality of video clips through the client to obtain the target video resource.
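Steps S102 to S108 can be modelled with the following minimal sketch. All function names and data structures are hypothetical simplifications, not part of the claimed implementation; durations are in seconds.

```python
def split_audio(total_duration, segment_points):
    """S102/S104: divide an audio resource at the requested segment
    points into consecutive (start, duration) segments."""
    bounds = [0.0] + sorted(segment_points) + [total_duration]
    return [(start, end - start) for start, end in zip(bounds, bounds[1:])]

def shoot_clips(audio_segments):
    """S106: shoot one video clip per audio segment; each clip's playing
    duration equals that of its corresponding segment."""
    return [{"index": i, "duration": duration}
            for i, (_, duration) in enumerate(audio_segments)]

def compose(total_duration, video_clips):
    """S108: splice the clips in order and pair them with the full
    target audio resource."""
    spliced = sum(clip["duration"] for clip in video_clips)
    return {"video_duration": spliced, "audio_duration": total_duration}

segments = split_audio(30.0, [10.0, 18.0])   # two points -> three segments
clips = shoot_clips(segments)
target = compose(30.0, clips)
```

The video and audio durations match by construction, which is what lets the client synthesize the final resource without any trimming.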
Optionally, in this embodiment, the above method for synthesizing a video resource may be applied to a hardware environment formed by the client 202 shown in fig. 2. As shown in fig. 2, a client 202 is installed on a terminal device, and after the client 202 is started, an audio segmentation request is obtained on the client 202, where the audio segmentation request is used to request that a target audio resource is divided into a plurality of audio segments; dividing the target audio resource into a plurality of audio segments in response to the audio segmentation request at the client 202; under the condition that a video shooting request is received, responding to the video shooting request on a client 202 to shoot a plurality of video clips corresponding to a plurality of audio clips one by one, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to each video clip; the target audio resource or the plurality of audio clips and the plurality of video clips are synthesized by the client 202 to obtain the target video resource.
Optionally, in this embodiment, the above method for synthesizing a video resource may be applied to, but not limited to, a scene of video shooting. The client may be, but not limited to, various applications having a video shooting function, such as a photographing application, an instant messaging application, a community space application, a video shooting application, a shopping application, a browser application, a multimedia editing application, a multimedia application, a live broadcast application, and the like. In particular, the method can be applied to, but not limited to, scenes in which videos are shot in the above video shooting application, or can also be applied to, but not limited to, scenes in which videos are shot in the above live broadcast application, so as to improve the efficiency of video shooting. The above is only an example, and this is not limited in this embodiment.
Alternatively, in the present embodiment, the target audio resource is used as the background sound of the target video resource, and may include, but is not limited to, music, the soundtrack of a program such as a film, television show, variety show or sports broadcast, an audio resource such as a crosstalk or drama recording, or the audio track of a particular video.
Optionally, in this embodiment, the audio segmentation request is used to request that the target audio resource be divided into a plurality of audio segments. The audio segmentation request may be, but is not limited to, a request triggered after a user performs a segmentation operation on the target audio resource, with the client performing the partitioning operation in response to the request. Alternatively, the audio segmentation request may also be, but is not limited to, a request triggered after the client automatically recognizes segmentation positions in the target audio resource, with the client then dividing the audio segments according to that recognition.
Optionally, in this embodiment, the plurality of video clips are shot in sequence on the client; the shooting order may be, but is not limited to, the time order of the plurality of audio clips in the target audio resource. Since the audio clips are already divided before shooting, the user does not need to repeatedly cut the music, which makes video shooting more fluent.
Alternatively, in this embodiment, in step S108 the client may perform the audio/video synthesis operation itself. Or the client may, but is not limited to, send the captured video clips and the target audio resource (or the audio clips) to a server, request the server to perform the synthesis operation, and then acquire the synthesized target video resource from the server.
In an alternative embodiment, as shown in fig. 3, an audio segmentation request is obtained on the client, where the request is used to ask that the target audio resource be divided into audio clip 1, audio clip 2 and audio clip 3; the target audio resource is divided accordingly on the client in response to the request. When a video shooting request is received, video clip A, video clip B and video clip C, in one-to-one correspondence with audio clips 1, 2 and 3, are shot in sequence on the client in response to the request, where the playing duration of video clip A is the same as that of audio clip 1, the playing duration of video clip B is the same as that of audio clip 2, and the playing duration of video clip C is the same as that of audio clip 3. The client then either synthesizes audio clip 1 with video clip A into video resource A1, audio clip 2 with video clip B into video resource B2, and audio clip 3 with video clip C into video resource C3, and splices video resources A1, B2 and C3 into the target video resource; or it splices video clips A, B and C into a video file and synthesizes that video file with the target audio resource into the target video resource.
It can be seen that, through the above steps, the selected target audio resource is divided into a plurality of audio segments according to the audio segmentation request before the video is shot; in the shooting stage, the video clips corresponding to those segments are shot in sequence and then synthesized into the target video resource. The user can therefore divide the target audio resource as needed, accurately master the rhythm of the music during shooting, and shoot more smoothly. This achieves the technical effect of improving video shooting efficiency and thereby solves the technical problem of low video shooting efficiency in the related art.
As an alternative, obtaining the audio segment request at the client comprises:
s1, displaying the playing information of the target audio resource on a first interface of the client, wherein the playing information is used for indicating the playing time of the target audio resource;
s2, when detecting the segment operation executed at N positions on the playing information, determining the time points corresponding to the N positions as N segment points of the target audio resource, wherein N is an integer greater than or equal to 1;
s3, generating an audio segmentation request carrying N segmentation points, wherein the audio segmentation request is used for requesting to divide the target audio resource into N +1 audio segments at the N segmentation points.
Optionally, in this embodiment, the first interface may be, but is not limited to, a segmented interface that is a target audio asset. Displaying the playing information of the target audio resource, which is used for indicating the playing time of the target audio resource, on the first interface, so that a user can clearly know the playing information of the target audio resource, and audio segments can be divided according to the needs of the user, for example: the user may divide the audio segments according to the time each audio segment is desired to be played.
Optionally, in this embodiment, the segmentation operations performed at the N positions on the playing information may include, but are not limited to, click operations performed by the user at the N positions, move operations performed by the user on icons used for marking segmentation positions, and the like.
As an alternative, in a case that the segmentation operation performed at the N positions on the playing information is detected, determining the time points corresponding to the N positions as the N segmentation points of the target audio resource includes:
S1, playing the target audio resource in response to the received playing instruction;
S2, detecting N segmentation operations performed on the playing information while the target audio resource is being played;
S3, determining the playing positions of the target audio resource at the moments the N segmentation operations are performed as the N positions;
S4, determining the time points corresponding to the N positions as the N segmentation points of the target audio resource.
Optionally, in this embodiment, the target audio resource may be segmented while it is being played; through the playback, the user can learn information such as the rhythm and beat of the target audio resource, and divide the audio clips as required.
Optionally, in this embodiment, the segmentation operations performed during the playing of the target audio resource are detected. For example, a segment button may be displayed on the first interface; when the user clicks the segment button, a segmentation operation is determined to have been detected, and the position to which the target audio resource has been played at the moment the click is detected is taken as a segmentation point.
Optionally, in this embodiment, the shooting order may be displayed on the playing information during the segmentation stage of the audio resource. For example, as shown in fig. 4, in the segment setting stage, the system reads the duration information of the music. When the system detects that the user clicks the "segment" button, it records a segment mark with a sequence number (numbered 1, 2, 3, …) at the current time point of the soundtrack. When it detects that the user clicks the "shooting" button, it stops playing the music, divides the music into N segments of corresponding durations according to the time positions of the segment marks, and sorts the segments by their sequence numbers.
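The marking behavior described above can be sketched in Python as follows (a minimal illustration; the class and method names are hypothetical and not taken from the patent's implementation):

```python
class SegmentRecorder:
    """Records sequence-numbered segment marks at the current playback position."""

    def __init__(self, duration):
        self.duration = duration  # total duration of the soundtrack, in seconds
        self.marks = []           # ordered list of (sequence_number, time_point)

    def on_segment_button(self, current_position):
        """Called when the user taps the 'segment' button during playback."""
        if 0 < current_position < self.duration:
            self.marks.append((len(self.marks) + 1, current_position))

    def on_shoot_button(self):
        """Stops marking and returns the segmentation points in marking order."""
        return [t for _, t in self.marks]


recorder = SegmentRecorder(duration=10.0)
recorder.on_segment_button(2.9)
recorder.on_segment_button(3.7)
points = recorder.on_shoot_button()  # [2.9, 3.7]
```

A real client would drive `on_segment_button` from the media player's reported playback position rather than from a hard-coded value.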
In an alternative embodiment, as shown in fig. 5, after the user clicks the "start" button on the segmentation interface, the music starts playing (a), and the time position to which the music has currently played is displayed on the interface (b). When the user clicks the "segment" button, a segment marker is placed at the current playing time point of the music (c). After the user has placed at least one marker, the user can click "shooting" to enter the shooting stage (d), and the system automatically completes the music segmentation according to the markers placed by the user (e).
As an alternative, dividing the target audio resource into a plurality of audio clips in response to the audio segmentation request on the client comprises:
S1, dividing the target audio resource into N+1 audio clips at the N segmentation points in response to the audio segmentation request;
S2, determining the time order of the N+1 audio clips in the target audio resource as the video shooting order corresponding to the N+1 audio clips.
Optionally, in this embodiment, in the audio clip dividing stage, the client divides the target audio resource into N+1 audio clips according to the N segmentation points indicated by the audio segmentation request, and determines the time order of the N+1 audio clips in the target audio resource as the corresponding video shooting order. The shooting order may be recorded by assigning sequence labels.
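The division of a track into N+1 clips at N segmentation points, already in shooting order, can be sketched as follows (a hypothetical helper; the patent does not prescribe this code):

```python
def split_audio(duration, points):
    """Divide a track of `duration` seconds into N+1 (start, end) intervals
    at the N segmentation points, returned in chronological (shooting) order."""
    bounds = [0.0] + sorted(points) + [duration]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


segments = split_audio(10.0, [2.9, 3.7])
# three clips: (0.0, 2.9), (2.9, 3.7), (3.7, 10.0)
```

The index of each tuple in the returned list doubles as the sequence label that fixes the shooting order.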
As an optional scheme, before displaying the playing information of the target audio resource on the first interface, the method further includes:
S1, displaying an audio resource set on a second interface, wherein the audio resources in the set are used as background music for shooting video resources, and the set includes the target audio resource;
S2, in a case where a selection operation performed on the target audio resource is detected, acquiring the playing information of the target audio resource, wherein the playing information includes at least one of a time axis and a sound wave diagram.
Optionally, in this embodiment, the second interface may be, but is not limited to, a selection interface for the target audio resource. The user can browse the selectable audio resources on this interface, or upload an audio resource as the target audio resource for the shooting.
Optionally, in this embodiment, the playing information of the target audio resource may include, but is not limited to, a time axis and/or a sound wave diagram. Displaying the sound wave diagram can prompt the user with information such as the rhythm of the target audio resource.
In an alternative embodiment, as shown in fig. 6, the user first selects a piece of music provided by the system on the selection interface of the audio resource (a), and then selects the free segmentation mode (b), and enters the music segmentation interface (c).
As an alternative, in a case where a video shooting request is received, sequentially shooting, on the client, a plurality of video clips corresponding one-to-one to the plurality of audio clips in response to the video shooting request includes:
S1, displaying a third interface on the client in response to the received first shooting request, and displaying a first button on the third interface, wherein the first button is used for controlling the shooting of a video clip;
S2, in a case where a triggering operation performed on the first button is detected, starting to play the audio clip corresponding to the current video clip on the client while starting to shoot the current video clip;
S3, when the audio clip corresponding to the current video clip finishes playing, stopping shooting the current video clip, and displaying a second button on the third interface, wherein the second button is used for controlling the shooting of the next video clip;
S4, in a case where a triggering operation performed on the second button is detected, starting to play the audio clip corresponding to the next video clip while starting to shoot the next video clip, wherein the audio clip corresponding to the next video clip is the audio clip arranged, in time order, after the audio clip corresponding to the current video clip.
Optionally, in this embodiment, the third interface may be, but is not limited to, a video shooting interface, and the first button may be, but is not limited to, a shooting control button. The user can control the start and pause of the current video clip shooting process by performing an operation on the first button.
Optionally, in this embodiment, the triggering operation performed on the first button may include, but is not limited to, a single click, a double click, multiple clicks, a long press, a slide operation, and the like.
Optionally, in this embodiment, the second button may be, but is not limited to, a button for controlling entry into the shooting process of the next video clip. The triggering operation performed on the second button may likewise include, but is not limited to, a single click, a double click, multiple clicks, a long press, a slide operation, and the like.
As an alternative, displaying the first button on the third interface includes:
S1, when the playing time of the audio clip corresponding to the current video clip is longer than the target time, displaying a video shooting button on the third interface, wherein the video shooting button is used for controlling the shooting of videos, and the first button comprises the video shooting button;
S2, when the playing time of the audio clip corresponding to the current video clip is less than or equal to the target time, displaying a photo shooting button on the third interface, wherein the photo shooting button is used for controlling the shooting of photos, and the first button comprises the photo shooting button.
Optionally, in this embodiment, the client may display, but is not limited to displaying, different kinds of buttons for controlling the shooting of the video clip according to the playing time of the corresponding audio clip. For example, if the audio clip is short, a photo can be taken; if the audio clip is long enough, a video can be shot.
Optionally, in this embodiment, the target time may be set to a short duration, such as 0.5 second, 0.75 second, 1 second, or 1.5 seconds.
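The choice between the two button types can be sketched as follows (the function name and the 1-second default are illustrative assumptions; the patent only requires that some target time be configured):

```python
TARGET_TIME = 1.0  # threshold in seconds; 1 s is one of the example values above


def button_for_segment(segment_duration, target_time=TARGET_TIME):
    """Return which capture button the third interface should show:
    a video shooting button for clips longer than the target time,
    otherwise a photo shooting button."""
    return "video" if segment_duration > target_time else "photo"


button_for_segment(2.9)  # "video"
button_for_segment(0.8)  # "photo"
```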
As an alternative, in a case where the triggering operation performed on the first button is detected, starting to play the audio clip corresponding to the current video clip on the client, and simultaneously starting to capture the current video clip includes:
S1, in a case where a triggering operation performed on the photo shooting button is detected, playing the audio clip corresponding to the current video clip on the client, and shooting the photo corresponding to that audio clip;
S2, generating the current video clip whose playing content is the photo and whose playing duration is the playing duration of the audio clip corresponding to the current video clip.
Optionally, in this embodiment, if a photo is taken for the video clip corresponding to a certain audio clip, the photo may be used as the playing content of that video clip, with the playing duration of the video clip equal to that of the audio clip, so that the photo is displayed while the audio clip plays.
Optionally, in this embodiment, each time the user finishes shooting a segment, the system marks the shot content with a sequence number label (numbered 1, 2, 3, …) and stores the content of the segment on the terminal. If the shot content is a video, it is stored directly; if it is a photo, the system reads the duration of the music segment with the same sequence number, converts the photo into a still video of the same duration, and stores the still video.
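The photo-to-still-video conversion can be sketched by repeating one frame for the duration of the matching music segment (a simplified illustration; a real client would hand the frames to a video encoder, and the function name and frame rate are assumptions):

```python
def photo_to_still_video_frames(photo_frame, segment_duration, fps=30):
    """Repeat a single photo frame so the resulting still video lasts
    as long as the music segment with the same sequence number.
    `photo_frame` stands in for decoded image data."""
    frame_count = round(segment_duration * fps)
    return [photo_frame] * frame_count


frames = photo_to_still_video_frames("IMG_0001", segment_duration=0.8, fps=30)
# 24 identical frames → a 0.8 s still video at 30 fps
```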
As an optional scheme, after displaying the third interface in response to the received first shooting request, the method further includes:
S1, displaying prompt information on the third interface, wherein the prompt information is used for prompting at least one of the following: the shooting progress of the current video clip and the shooting progress of the plurality of video clips.
Optionally, in order to facilitate the user's control of video shooting, one or more of the following may be displayed on the video shooting interface: the duration of the audio clip corresponding to the current video clip, the number of audio clips, the position of the current audio clip among the plurality of audio clips, the shooting progress of the current video clip, and the shooting progress of the plurality of video clips.
In an optional embodiment, when the user enters the shooting stage, the camera is turned on immediately, the interface displays the content captured by the camera in real time, and the shooting button for the first segment appears. There are two types of shooting buttons: a video shooting button and a photo shooting button. The system checks the duration of the current music segment; when the segment is longer than 1 s, the video shooting button is displayed, and when it is 1 s or shorter, the photo shooting button is displayed. For example, as shown in fig. 7, the duration of the first segment is 2.9 s, so a video shooting button is displayed on the interface (a); the user clicks it to start shooting video while the system synchronously plays the music of that segment (b), shooting stops when the segment's music finishes playing, and the shooting button for the second segment then appears. The duration of the second segment is less than 1 s, so the photo shooting button is displayed (c); the user takes a photo by clicking the button, and the shooting button for the third segment is then displayed. By analogy, when the user has finished shooting all the segments, the work preview interface is entered (d), and the system synthesizes the videos and photos into a complete video in the order of the soundtrack segments. The user clicks the "next" button to enter the publishing interface (e).
As an optional scheme, synthesizing, by the client, the target audio resource and the plurality of video clips to obtain the target video resource includes:
S1, splicing the plurality of video clips into one video file according to the time order of the plurality of audio clips;
S2, synthesizing the video file and the target audio resource into the target video resource.
Optionally, in this embodiment, when the user enters the work preview interface, the system immediately synthesizes the stored video clips: it sorts them according to their sequence number labels, stitches them into one complete video, then overlays and synthesizes the complete music as the audio track, finally generating a complete video with the soundtrack.
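The sort-then-splice step can be sketched as follows (the file names and dict-based store are hypothetical; real stitching and audio overlay would be done by a media framework, not string handling):

```python
def splice_order(stored_clips):
    """Order the stored clips by their sequence-number labels and return
    the splice order; the spliced video is then overlaid with the full
    soundtrack. `stored_clips` maps sequence number -> stored file name."""
    return [stored_clips[n] for n in sorted(stored_clips)]


order = splice_order({2: "seg2.mp4", 1: "seg1.mp4", 3: "seg3.mp4"})
# ["seg1.mp4", "seg2.mp4", "seg3.mp4"]
```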
As an optional scheme, synthesizing, by the client, the plurality of audio segments and the plurality of video segments to obtain the target video resource includes:
S1, synthesizing each video clip of the plurality of video clips with its corresponding audio clip of the plurality of audio clips, to obtain a plurality of video resources;
S2, splicing the plurality of video resources into the target video resource according to the time order of the plurality of audio clips.
Optionally, in this embodiment, each of the plurality of video clips may first be synthesized with its corresponding audio clip, and the resulting video resources may then be spliced together.
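This alternative, pairing each video clip with its audio clip before splicing, can be sketched as follows (the string operations merely stand in for real per-segment muxing and concatenation):

```python
def mux_then_splice(video_clips, audio_clips):
    """Pair each video clip with the audio clip of the same index into a
    per-segment video resource, then splice the resources in time order.
    Joining strings stands in for muxing and concatenation."""
    assert len(video_clips) == len(audio_clips), "one audio clip per video clip"
    muxed = [f"{v}+{a}" for v, a in zip(video_clips, audio_clips)]
    return "|".join(muxed)  # clips are already in the audio time order


result = mux_then_splice(["v1", "v2"], ["a1", "a2"])
# "v1+a1|v2+a2"
```

Compared with the previous scheme, this one never needs the full soundtrack at composition time, at the cost of one mux operation per segment.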
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
According to another aspect of the embodiments of the present invention, there is also provided a video asset composing apparatus for implementing the above video asset composing method, as shown in fig. 8, the apparatus including:
a first obtaining module 82, configured to obtain an audio segmentation request on a client, where the audio segmentation request is used to request that a target audio resource is divided into multiple audio segments;
a partitioning module 84 for partitioning the target audio resource into a plurality of audio segments on the client in response to the audio segmentation request;
the shooting module 86 is configured to, in a case where a video shooting request is received, sequentially shoot, on the client, a plurality of video clips that correspond to the plurality of audio clips one to one in response to the video shooting request, where a play duration of each of the plurality of video clips is the same as a play duration of an audio clip corresponding to each of the plurality of video clips;
and a synthesizing module 88, configured to synthesize, through the client, the target audio resource or the plurality of audio clips with the plurality of video clips, so as to obtain the target video resource.
Optionally, the obtaining module includes:
the first display unit is used for displaying the playing information of the target audio resource on a first interface of the client, wherein the playing information is used for indicating the playing time of the target audio resource;
a first determining unit, configured to determine, when a segmenting operation performed at N positions on the playing information is detected, time points corresponding to the N positions as N segmenting points of the target audio resource, where N is an integer greater than or equal to 1;
the device comprises a generating unit, a generating unit and a processing unit, wherein the generating unit is used for generating an audio segmentation request carrying N segmentation points, and the audio segmentation request is used for requesting to divide a target audio resource into N +1 audio segments at the N segmentation points.
Optionally, the first determination unit includes:
the playing subunit is used for responding to the received playing instruction and playing the target audio resource;
the detection subunit is used for detecting N times of segmentation operations executed on the playing information in the process of playing the target audio resource;
a first determining subunit, configured to determine the playing positions of the target audio resource at the moments the N segmentation operations are performed as the N positions;
and the second determining subunit is used for determining the time points corresponding to the N positions as N segmentation points of the target audio resource.
Optionally, the dividing module includes:
a dividing unit for dividing the target audio resource into N +1 audio segments at N segmentation points in response to the audio segmentation request;
and the second determining unit is used for determining the time sequence of the N +1 audio clips in the target audio resource as the video shooting sequence corresponding to the N +1 audio clips.
Optionally, the apparatus further comprises:
the first display module is used for displaying an audio resource set on a second interface before displaying the playing information of a target audio resource on a first interface, wherein the audio resource included in the audio resource set is used as background music for shooting a video resource, and the audio resource set includes the target audio resource;
the second obtaining module is configured to obtain playing information of the target audio resource when a selection operation performed on the target audio resource is detected, where the playing information includes at least one of a time axis and a sound wave diagram.
Optionally, the photographing module includes:
the second display unit is used for responding to the received first shooting request on the client to display a third interface and displaying a first button on the third interface, wherein the first button is used for controlling the shooting of the video clip;
the first shooting unit is used for, in a case where the triggering operation performed on the first button is detected, starting to play the audio clip corresponding to the current video clip on the client while starting to shoot the current video clip;
the processing unit is used for stopping shooting the current video clip when the audio clip corresponding to the current video clip is played, and displaying a second button on a third interface, wherein the second button is used for controlling shooting of the next video clip;
and the second shooting unit is used for playing the audio clip corresponding to the next video clip and simultaneously starting shooting the next video clip under the condition that the triggering operation executed on the second button is detected, wherein the audio clip corresponding to the next video clip is one audio clip which is arranged behind the audio clip corresponding to the current video clip according to the time sequence.
Optionally, the second display unit comprises:
the first display subunit is used for displaying a video shooting button on the third interface when the playing time of the audio clip corresponding to the current video clip is greater than the target time, wherein the video shooting button is used for controlling video shooting, and the first button comprises a video shooting button;
and the second display subunit is used for displaying a photo shooting button on the third interface when the playing time of the audio clip corresponding to the current video clip is less than or equal to the target time, wherein the photo shooting button is used for controlling the shooting of photos, and the first button comprises a photo shooting button.
Optionally, the first photographing unit includes:
the shooting sub-unit is used for playing the audio clip corresponding to the current video clip on the client and shooting the photo corresponding to the audio clip corresponding to the current video clip under the condition that the triggering operation executed on the photo shooting button is detected;
and the generating subunit is used for generating the current video clip of which the playing content is a photo and the playing time is the playing time of the audio clip corresponding to the current video clip.
Optionally, the apparatus further comprises:
the second display module is used for displaying prompt information on a third interface after responding to the received first shooting request and displaying the third interface, wherein the prompt information is used for prompting at least one of the following information: the shooting progress of the current video clip and the shooting progress of the plurality of video clips.
Optionally, the synthesis module comprises:
the first splicing unit is used for splicing the video clips into a video file according to the time sequence of the audio clips;
and the first synthesis unit is used for synthesizing the video file and the target audio resource into the target video resource.
Optionally, the synthesis module comprises:
the second synthesis unit is used for synthesizing each video clip in the video clips and each audio clip in a plurality of audio clips corresponding to each video clip into a plurality of video resources;
and the second splicing unit is used for splicing the plurality of video resources into the target video resource according to the time sequence of the plurality of audio clips.
The application environment of this embodiment of the present invention may refer to, but is not limited to, the application environment in the above embodiments, and is not described again here. This embodiment provides an optional specific application example of the above video resource synthesis method.
As an alternative embodiment, the above video resource synthesis method can be applied, but is not limited, to a scene in which user A shoots a video using a client installed on a terminal, as shown in fig. 9. In this scenario: S902, user A starts the video shooting function of the client installed on the terminal, selects a soundtrack on the audio resource selection interface, and chooses to set segments; the terminal reads the duration information of the soundtrack and displays it on the time axis of the segmentation interface. S904, user A clicks the "start" button to start playing the soundtrack, and during playback user A can click the "segment" button to set segments. S906, the terminal identifies the current time point at which the "segment" button is clicked and generates segment marks with sequence numbers on the time axis. S908, after user A finishes segmenting the soundtrack and clicks the shooting button, the terminal divides the soundtrack according to the segment marks and orders the segments by their sequence numbers. S910, the terminal analyzes the duration of each soundtrack segment and determines whether to display a video shooting button or a photo shooting button. S912, user A shoots videos or photos in sequence, and the terminal marks the shot content with sequence number labels. S914, the terminal identifies the type of the shot content: if it is a video, it is stored directly; if it is a photo, the terminal reads the duration of the soundtrack segment with the same sequence number, converts the photo into a still video of the same duration, and stores the still video. S916, user A clicks the "finish" button, and the terminal sorts the stored videos according to the sequence number labels and stitches them together. S918, the stitched video is overlaid and synthesized with the complete soundtrack. S920, user A can preview and publish the work.
In this scenario, the video shooting process is divided into a setting stage and a shooting stage. The setting stage allows the user to freely segment the soundtrack, dividing a piece of music into N sections. When the user enters the shooting stage, the system analyzes the soundtrack sections and, according to the duration of each section, guides the user to shoot a video or a photo. Finally, a complete video is composed according to the order of the sections, so that the user can easily achieve a "rhythm snapshot" effect.
In the setting stage, the user segments the soundtrack and can accurately place segmentation points on the rhythm points of the music while listening to it; during shooting, the shooting flow is then automatically segmented according to the marked points.
In this way, during the shooting of the whole video, the user can accurately change the shot content at a given time point of the music to achieve the rhythm-snapshot effect, which makes it convenient for ordinary users to complete rhythm snapshots on a mobile terminal and improves their ability to produce high-quality videos.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic apparatus for implementing the above video resource synthesis, as shown in fig. 10, the electronic apparatus including: one or more processors 1002 (only one of which is shown in the figure), a memory 1004 in which a computer program is stored, a sensor 1006, an encoder 1008, and a transmission device 1010, wherein the processor is arranged to execute the steps of any of the above method embodiments by means of the computer program.
Optionally, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, obtaining an audio segmentation request on the client, wherein the audio segmentation request is used for requesting that the target audio resource be divided into a plurality of audio clips;
S2, dividing the target audio resource into the plurality of audio clips on the client in response to the audio segmentation request;
S3, in a case where a video shooting request is received, shooting, on the client, a plurality of video clips corresponding one-to-one to the plurality of audio clips in sequence in response to the video shooting request, wherein the playing duration of each of the plurality of video clips is the same as the playing duration of the corresponding audio clip;
S4, synthesizing the target audio resource or the plurality of audio clips with the plurality of video clips through the client to obtain the target video resource.
Alternatively, as can be understood by those skilled in the art, the structure shown in fig. 10 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 does not limit the structure of the electronic device; for example, the electronic device may include more or fewer components (e.g., a network interface or a display device) than shown in fig. 10, or have a different configuration from that shown in fig. 10.
The memory 1004 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video resource synthesis method and apparatus in the embodiments of the present invention; the processor 1002 executes various functional applications and data processing by running the software programs and modules stored in the memory 1004, that is, implements the above video resource synthesis method. The memory 1004 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1004 may further include memory located remotely from the processor 1002, which may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 1010 is used for receiving or transmitting data via a network. Examples of the network may include wired and wireless networks. In one example, the transmission device 1010 includes a network adapter (Network Interface Controller, NIC) that can be connected to a router and other network devices via a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 1010 is a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
Among other things, the memory 1004 is used to store application programs.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
S1, obtaining an audio segmentation request on the client, wherein the audio segmentation request is used to request that a target audio resource be divided into a plurality of audio clips;
S2, dividing the target audio resource into the plurality of audio clips on the client in response to the audio segmentation request;
S3, in a case where a video shooting request is received, responding to the video shooting request on the client by sequentially shooting a plurality of video clips corresponding to the plurality of audio clips, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to that video clip;
and S4, synthesizing the target audio resource, or the plurality of audio clips, with the plurality of video clips through the client to obtain a target video resource.
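The flow of steps S1 to S4 can be sketched as a minimal, self-contained Python model. All names here (`Clip`, `split_audio`, `shoot_video_clips`, `synthesize`) are hypothetical illustrations rather than the patent's actual implementation; a real client would drive a camera and a media muxer, whereas this sketch only models the durations and the N-cut-points-into-N+1-clips bookkeeping:

```python
from dataclasses import dataclass

@dataclass
class Clip:
    start: float      # seconds from the beginning of the audio resource
    duration: float   # playing time length in seconds

def split_audio(total_duration: float, cut_points: list[float]) -> list[Clip]:
    """S1/S2: divide a target audio resource at N cut points into N+1 audio clips."""
    points = [0.0] + sorted(cut_points) + [total_duration]
    return [Clip(start=a, duration=b - a) for a, b in zip(points, points[1:])]

def shoot_video_clips(audio_clips: list[Clip]) -> list[Clip]:
    """S3: shoot one video clip per audio clip; each video clip's playing
    time length equals that of its corresponding audio clip."""
    return [Clip(start=a.start, duration=a.duration) for a in audio_clips]

def synthesize(audio_clips: list[Clip], video_clips: list[Clip]) -> float:
    """S4: splice the video clips in the time order of the audio clips and
    return the total duration of the resulting target video resource."""
    assert all(abs(a.duration - v.duration) < 1e-9
               for a, v in zip(audio_clips, video_clips))
    return sum(v.duration for v in video_clips)

audio = split_audio(30.0, [10.0, 18.0])   # N=2 cut points -> 3 audio clips
video = shoot_video_clips(audio)
assert [c.duration for c in audio] == [10.0, 8.0, 12.0]
assert synthesize(audio, video) == 30.0   # target video matches the audio length
```

The `split_audio` helper mirrors the claims' accounting: segmentation operations at N positions yield N segmentation points and therefore N+1 audio clips, and each shot video clip inherits the duration of its paired audio clip so that the final spliced video lines up with the background music.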
Optionally, the storage medium is further configured to store a computer program for executing the steps included in the method in the foregoing embodiment, which is not described in detail in this embodiment.
Alternatively, in this embodiment, a person skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing the relevant hardware of the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a division by logical function, and there may be other divisions in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims (15)

1. A method for synthesizing a video asset, comprising:
acquiring an audio segmentation request during playing of music by a client, wherein the audio segmentation request is used to request that a target audio resource be divided into a plurality of audio segments;
dividing, on the client, the target audio resource into the plurality of audio segments in response to the audio segmentation request, wherein the dividing comprises: marking segmentation points on rhythm points of the target audio resource, and dividing the target audio resource at the marked segmentation points;
in a case where a video shooting request is received, guiding a user to sequentially shoot a plurality of video segments in one-to-one correspondence with the plurality of audio segments according to the time length of each audio segment in the target audio resource, wherein the playing time length of each video segment in the plurality of video segments is the same as the playing time length of the audio segment corresponding to that video segment, the plurality of video segments comprise videos and/or static videos, and a static video is a video obtained by converting one or more shot photos;
and synthesizing, through the client, the target audio resource or the plurality of audio segments with the plurality of video segments to obtain a target video resource.
2. The method of claim 1, wherein obtaining the audio segmentation request during the playing of the music by the client comprises:
displaying playing information of the target audio resource on a first interface of a client, wherein the playing information is used for indicating the playing time of the target audio resource;
when the segmentation operation executed at N positions on the playing information is detected, determining time points corresponding to the N positions as N segmentation points of the target audio resource, wherein N is an integer greater than or equal to 1;
and generating the audio segmentation request carrying the N segmentation points, wherein the audio segmentation request is used to request that the target audio resource be divided into N+1 audio segments at the N segmentation points.
3. The method of claim 2, wherein, when detecting a segmentation operation performed at N positions on the playback information, determining time points corresponding to the N positions as N segmentation points of the target audio resource comprises:
responding to the received playing instruction, and playing the target audio resource;
detecting N times of the segmentation operation executed on the playing information in the process of playing the target audio resource;
determining the playing positions of the target audio resource when the N times of the segmentation operation are executed as the N positions;
and determining the time points corresponding to the N positions as N segmentation points of the target audio resource.
4. The method of claim 2, wherein dividing the target audio resource into the plurality of audio segments in response to the audio segmentation request on the client comprises:
in response to the audio segmentation request, dividing the target audio resource into the N+1 audio segments at the N segmentation points;
and determining the time sequence of the N+1 audio segments in the target audio resource as the video shooting sequence corresponding to the N+1 audio segments.
5. The method of claim 2, wherein prior to displaying the playback information for the target audio asset on the first interface, the method further comprises:
displaying an audio resource set on a second interface, wherein the audio resources included in the audio resource set are used as background music for shooting video resources, and the audio resource set includes the target audio resource;
and under the condition that the selection operation performed on the target audio resource is detected, acquiring the playing information of the target audio resource, wherein the playing information comprises at least one of a time axis and a sound wave diagram.
6. The method of claim 1, wherein, in a case where a video shooting request is received, guiding a user to sequentially shoot a plurality of video segments corresponding to the plurality of audio segments in a one-to-one correspondence according to a time length of each audio segment in the target audio resource comprises:
displaying a third interface on the client in response to a received first shooting request, and displaying a first button on the third interface, wherein the first button is used to control shooting of a video clip;
under the condition that the triggering operation executed on the first button is detected, shooting a current video clip on the client, and simultaneously playing an audio clip corresponding to the current video clip;
when the playing of the audio clip corresponding to the current video clip is finished, stopping shooting the current video clip, and displaying a second button on the third interface, wherein the second button is used for controlling the shooting of the next video clip;
and under the condition that the triggering operation executed on the second button is detected, playing an audio clip corresponding to the next video clip, and simultaneously starting shooting the next video clip, wherein the audio clip corresponding to the next video clip is an audio clip arranged behind the audio clip corresponding to the current video clip according to the time sequence.
7. The method of claim 6, wherein displaying the first button on the third interface comprises:
when the playing time of the audio clip corresponding to the current video clip is longer than the target time, displaying a video shooting button on the third interface, wherein the video shooting button is used for controlling video shooting, and the first button comprises the video shooting button;
and when the playing time of the audio clip corresponding to the current video clip is less than or equal to the target time, displaying a photo shooting button on the third interface, wherein the photo shooting button is used for controlling to shoot a photo, and the first button comprises the photo shooting button.
8. The method of claim 7, wherein in a case where the triggering operation performed on the first button is detected, starting to play an audio clip corresponding to the current video clip on the client and starting to capture the current video clip comprises:
under the condition that the triggering operation executed on the photo shooting button is detected, playing an audio clip corresponding to the current video clip on the client, and shooting a photo corresponding to the audio clip corresponding to the current video clip;
and generating the current video clip with the playing content of the photo and the playing time of the audio clip corresponding to the current video clip.
9. The method of claim 6, wherein after displaying the third interface in response to the received first capture request, the method further comprises:
displaying prompt information on the third interface, wherein the prompt information is used for prompting at least one of the following information: the time length of the audio clip corresponding to the current video clip, the number of the plurality of audio clips, the sequence of the audio clip corresponding to the current video clip in the plurality of audio clips, the shooting progress of the current video clip, and the shooting progress of the plurality of video clips.
10. The method according to any one of claims 1 to 9, wherein synthesizing, by the client, the target audio resource with the plurality of video segments to obtain a target video resource comprises:
splicing the plurality of video clips into a video file according to the time sequence of the plurality of audio clips;
and synthesizing the video file and the target audio resource into the target video resource.
11. The method of any one of claims 1 to 9, wherein synthesizing, by the client, the plurality of audio segments and the plurality of video segments to obtain a target video resource comprises:
synthesizing each video clip of the plurality of video clips and each audio clip of the plurality of audio clips corresponding to each video clip into a plurality of video resources;
and splicing the plurality of video resources into the target video resource according to the time sequence of the plurality of audio clips.
12. An apparatus for synthesizing a video asset, comprising:
the device comprises a first acquisition module, a first processing module and a first processing module, wherein the first acquisition module is used for acquiring an audio segmentation request in the process of playing music by a client, and the audio segmentation request is used for requesting to divide a target audio resource into a plurality of audio segments;
a partitioning module, configured to partition the target audio resource into the plurality of audio segments in response to the audio segmentation request on the client, where the partitioning is: marking segmentation points on rhythm points of the target audio resource, and dividing the target audio resource by the marked segmentation points;
the shooting module is used for guiding a user to sequentially shoot a plurality of video clips corresponding to the plurality of audio clips one by one according to the time length of each audio clip in the target audio resource under the condition of receiving a video shooting request, wherein the playing time length of each video clip in the plurality of video clips is the same as the playing time length of the audio clip corresponding to each video clip, the plurality of video clips comprise videos and/or static videos, and the static videos are videos obtained by converting one or more shot photos;
and the synthesis module is used for synthesizing the target audio resource or the plurality of audio clips and the plurality of video clips through the client to obtain the target video resource.
13. The apparatus of claim 12, wherein the first acquisition module comprises:
the first display unit is used for displaying the playing information of the target audio resource on a first interface of a client, wherein the playing information is used for indicating the playing time of the target audio resource;
a first determining unit, configured to determine, when a segmentation operation performed at N positions on the play information is detected, time points corresponding to the N positions as N segmentation points of the target audio resource, where N is an integer greater than or equal to 1;
a generating unit, configured to generate the audio segmentation request carrying the N segmentation points, where the audio segmentation request is used to request that the target audio resource be divided into N+1 audio segments at the N segmentation points.
14. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 11 when executed.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 11 by means of the computer program.
CN201910595978.XA 2019-07-03 2019-07-03 Video resource synthesis method and device, storage medium and electronic device Active CN112188307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910595978.XA CN112188307B (en) 2019-07-03 2019-07-03 Video resource synthesis method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910595978.XA CN112188307B (en) 2019-07-03 2019-07-03 Video resource synthesis method and device, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN112188307A CN112188307A (en) 2021-01-05
CN112188307B true CN112188307B (en) 2022-07-01

Family

ID=73914470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910595978.XA Active CN112188307B (en) 2019-07-03 2019-07-03 Video resource synthesis method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112188307B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583972B (en) * 2020-05-28 2022-03-25 北京达佳互联信息技术有限公司 Singing work generation method and device and electronic equipment
CN113114925B (en) 2021-03-09 2022-08-26 北京达佳互联信息技术有限公司 Video shooting method and device, electronic equipment and storage medium
CN113347471B (en) * 2021-06-01 2023-05-02 咪咕文化科技有限公司 Video playing method, device, equipment and storage medium
CN114390356A (en) * 2022-01-19 2022-04-22 维沃移动通信有限公司 Video processing method, video processing device and electronic equipment
CN117241105A (en) * 2022-06-08 2023-12-15 中兴通讯股份有限公司 Media information processing method and device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9554049B2 (en) * 2012-12-04 2017-01-24 Ebay Inc. Guided video capture for item listings
CN107483843B (en) * 2017-08-16 2019-11-15 成都品果科技有限公司 Audio-video matches clipping method and device
CN107481739B (en) * 2017-08-16 2021-04-02 成都品果科技有限公司 Audio cutting method and device
CN108668164A (en) * 2018-07-12 2018-10-16 北京微播视界科技有限公司 Select method, apparatus, terminal device and the medium of background music shooting video
CN108600825B (en) * 2018-07-12 2019-10-25 北京微播视界科技有限公司 Select method, apparatus, terminal device and the medium of background music shooting video

Also Published As

Publication number Publication date
CN112188307A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN112188307B (en) Video resource synthesis method and device, storage medium and electronic device
US10939069B2 (en) Video recording method, electronic device and storage medium
US9852768B1 (en) Video editing using mobile terminal and remote computer
US11915723B2 (en) Automated audio-video content generation
CN104581380B (en) A kind of method and mobile terminal of information processing
KR101680714B1 (en) Method for providing real-time video and device thereof as well as server, terminal device, program, and recording medium
JP6385139B2 (en) Information processing apparatus, information processing method, and program
KR102081214B1 (en) Method and device for stitching multimedia files
CN108965706B (en) Video shooting method and device, terminal equipment and storage medium
CN112822563A (en) Method, device, electronic equipment and computer readable medium for generating video
US10242712B2 (en) Video synchronization based on audio
US9143835B2 (en) Preview and playback method of video streams and system thereof
CN106982368B (en) Video response speed detection method and system
JP2005341064A (en) Information sender, information sending method, program, recording medium, display controller, and displaying method
US9883243B2 (en) Information processing method and electronic apparatus
US11164604B2 (en) Video editing method and apparatus, computer device and readable storage medium
WO2017157135A1 (en) Media information processing method, media information processing device and storage medium
US9773524B1 (en) Video editing using mobile terminal and remote computer
KR101915792B1 (en) System and Method for Inserting an Advertisement Using Face Recognition
CN112287771A (en) Method, apparatus, server and medium for detecting video event
CN105227355B (en) A kind of multimedia play list management method, apparatus and system
CN108616768B (en) Synchronous playing method and device of multimedia resources, storage position and electronic device
JP2019012533A (en) Information processing apparatus, information processing method, and program
CN115119069A (en) Multimedia content processing method, electronic device and computer storage medium
CN114745506A (en) Video processing method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40037763

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant