CN114827753A - Video index information generation method and device and computer equipment - Google Patents

Video index information generation method and device and computer equipment

Info

Publication number
CN114827753A
Authority
CN
China
Prior art keywords
video
intercepted
index information
target
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110089220.6A
Other languages
Chinese (zh)
Other versions
CN114827753B (en)
Inventor
杜正中
李育中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Beijing Co Ltd
Original Assignee
Tencent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Beijing Co Ltd
Priority to CN202110089220.6A
Publication of CN114827753A
Application granted
Publication of CN114827753B
Legal status: Active (current)
Anticipated expiration

Classifications

    All under H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD] (H: ELECTRICITY; H04: ELECTRIC COMMUNICATION TECHNIQUE; H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION)
    • H04N21/8456: Structuring of content, e.g. decomposing content into time segments, by decomposing the content in the time domain
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/47205: End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N21/8455: Structuring of content involving pointers to the content, e.g. pointers to the I-frames of the video stream

Abstract

The application discloses a method, an apparatus and computer equipment for generating video index information. The method comprises the following steps: acquiring interception information for a target video, the interception information indicating the relation between the video to be intercepted and the target video; and generating, based on the interception information and the video resource corresponding to the target video, video index information corresponding to the video to be intercepted, where the video resource corresponding to the target video is obtained by transcoding the target video and the video index information is used to index the video resource corresponding to the video to be intercepted. In this way, generating the video index information for an intercepted video requires neither uploading nor transcoding the intercepted video; it takes little time, the video index information is generated efficiently, the time needed to bring the intercepted video online is shortened, and functions that distribute the related information of the intercepted video become available sooner.

Description

Video index information generation method and device and computer equipment
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for generating video index information and computer equipment.
Background
With the development of computer technology, more and more applications and web pages provide a video playing function. To play a video, the playing terminal first acquires the video index information corresponding to the video from the server, then downloads the corresponding video resource according to the storage information recorded in the video index information, and plays the video from the downloaded resource. The playing terminal can play a full video, or an intercepted video clipped from one. To enable the playing terminal to play an intercepted video, the server needs to generate the video index information corresponding to that intercepted video.
In the related art, the server generates the video index information corresponding to an intercepted video as follows: the server receives the uploaded intercepted video, transcodes it, and generates the video index information based on the video resource obtained after transcoding.
In this process, generating the video index information corresponding to an intercepted video requires uploading and then transcoding the intercepted video, which is time-consuming, so the video index information is generated inefficiently. In addition, since the intercepted video can only go online after its video index information has been generated, bringing the intercepted video online takes a long time.
Disclosure of Invention
The embodiments of the application provide a method, an apparatus and computer equipment for generating video index information, which can improve the efficiency of generating the video index information corresponding to an intercepted video. The technical scheme is as follows:
in one aspect, an embodiment of the present application provides a method for generating video index information, where the method includes:
acquiring interception information for a target video, the interception information indicating the relation between the video to be intercepted and the target video;
and generating, based on the interception information and the video resource corresponding to the target video, video index information corresponding to the video to be intercepted, where the video resource corresponding to the target video is obtained by transcoding the target video and the video index information is used to index the video resource corresponding to the video to be intercepted.
There is also provided a method for generating video index information, the method including:
in response to a video interception request, displaying a video interception editing page;
acquiring, on the video interception editing page, interception information for the target video, the interception information indicating the relation between the video to be intercepted and the target video;
and sending the interception information for the target video to a server, the server being configured to generate, based on the interception information and the video resource corresponding to the target video, video index information corresponding to the video to be intercepted, where the video resource corresponding to the target video is obtained by transcoding the target video and the video index information is used to index the video resource corresponding to the video to be intercepted.
In another aspect, an apparatus for generating video index information is provided, the apparatus including:
an acquisition unit, configured to acquire interception information for a target video, the interception information indicating the relation between the video to be intercepted and the target video;
a generating unit, configured to generate, based on the interception information and the video resource corresponding to the target video, video index information corresponding to the video to be intercepted, where the video resource corresponding to the target video is obtained by transcoding the target video and the video index information is used to index the video resource corresponding to the video to be intercepted.
In one possible implementation, the interception information includes an interception start time point and an interception end time point; the video resource corresponding to the target video includes at least one video slice, and different video slices correspond to different slice time periods. The generating unit is configured to determine, based on the interception start time point and the interception end time point, at least one target video slice corresponding to the video to be intercepted among the at least one video slice, where the slice time period corresponding to any target video slice lies partly or wholly within the interception time period formed by the interception start time point and the interception end time point; and to generate, based on the at least one target video slice, video index information corresponding to the video to be intercepted.
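The overlap test described above can be sketched in Python. This is a hypothetical model for illustration only; the patent does not prescribe data structures, and the `VideoSlice` fields and `target_slices` name are invented here:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VideoSlice:
    url: str         # storage information of the slice
    begin: float     # start of the slice time period, in seconds
    duration: float  # slice duration, in seconds

def target_slices(slices: List[VideoSlice], start: float, end: float) -> List[VideoSlice]:
    """Keep every slice whose slice time period lies partly or wholly
    within the interception time period [start, end]."""
    return [s for s in slices if s.begin + s.duration > start and s.begin < end]
```

For example, with 10-second slices starting at 0 s, 10 s, 20 s and 30 s, an interception period from 12 s to 25 s selects the slices covering 10-20 s and 20-30 s, the first and last of which are only partly inside the period.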
In one possible implementation, the generating unit is further configured to, in response to the slice time periods corresponding to the at least one target video slice all lying within the interception time period, generate the slice index information corresponding to the at least one target video slice based on the storage information and the duration of the at least one target video slice; and to generate, based on the slice index information corresponding to the at least one target video slice, video index information corresponding to the video to be intercepted.
In one possible implementation, the generating unit is further configured to, in response to the slice time period corresponding to a reference video slice among the at least one target video slice lying only partly within the interception time period while the slice time periods corresponding to the other target video slices lie wholly within it, generate the slice index information corresponding to the reference video slice based on the storage information of the reference video slice and the reference time period, namely the part of its slice time period that falls within the interception time period; generate the slice index information corresponding to the other target video slices based on their storage information and durations; and generate, based on the slice index information corresponding to the reference video slice and to the other target video slices, video index information corresponding to the video to be intercepted. The reference video slice includes at least one of the first and the last target video slice, the at least one target video slice being arranged in order of their corresponding slice time periods.
In one possible implementation, the generating unit is further configured to determine, based on the reference time period, a reference duration, start marker information and end marker information, the start and end marker information being used to locate, within the reference video slice, the partial video slice that matches the video to be intercepted; and to generate the slice index information corresponding to the reference video slice based on the reference duration, the start marker information, the end marker information and the storage information of the reference video slice.
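As a sketch, the slice index information can be written as HLS-style playlist lines. The `#X-CLIP-START`/`#X-CLIP-END` tag names are invented for illustration; the patent only requires that start and end marker information be carried for a partly covered reference slice:

```python
def slice_entry(url: str, begin: float, duration: float,
                cap_start: float, cap_end: float) -> str:
    """Emit one playlist entry for a video slice.  A slice lying only partly
    inside the interception period gets start/end markers so the player can
    trim it; a fully covered slice is indexed with its whole duration."""
    seg_start = max(begin, cap_start)
    seg_end = min(begin + duration, cap_end)
    lines = [f"#EXTINF:{seg_end - seg_start:.3f},"]  # reference duration
    if seg_start > begin or seg_end < begin + duration:
        # offsets are relative to the start of the slice itself
        lines.append(f"#X-CLIP-START:{seg_start - begin:.3f}")
        lines.append(f"#X-CLIP-END:{seg_end - begin:.3f}")
    lines.append(url)  # storage information of the slice
    return "\n".join(lines)
```

With a slice covering 10-20 s and an interception period of 12-25 s, the entry carries a reference duration of 8 s and markers at 2 s and 10 s into the slice.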
In one possible implementation, the interception information includes an interception start time point, an interception end time point and an even number of interception intermediate time points; the video resource corresponding to the target video includes at least one video slice, and different video slices correspond to different slice time periods. The generating unit is configured to determine at least two sub-interception time periods based on the interception start time point, the interception end time point and the even number of interception intermediate time points; determine, from the at least one video slice, the video slice group corresponding to each sub-interception time period, the group for any sub-interception time period consisting of the video slices whose slice time periods lie partly or wholly within that period; generate the reference index information corresponding to each of the at least two video slice groups, the reference index information for any group including the slice index information of the video slices that make it up; and generate, based on the reference index information and splicing marker information corresponding to the at least two video slice groups, video index information corresponding to the video to be intercepted.
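A minimal sketch of this multi-segment case, modeling the splicing marker information on HLS's `#EXT-X-DISCONTINUITY` tag (an assumption; the patent does not name a concrete format, and the tuple-based slice representation is invented here):

```python
from typing import List, Tuple

Slice = Tuple[str, float, float]  # (storage url, slice period start, duration)

def sub_periods(points: List[float]) -> List[Tuple[float, float]]:
    """points = [start, mid1, mid2, ..., end]; an even number of intermediate
    points pairs off into at least two sub-interception time periods
    [t0, t1], [t2, t3], ... with the gaps between pairs cut out."""
    assert len(points) >= 4 and len(points) % 2 == 0
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

def spliced_index(slices: List[Slice], points: List[float]) -> str:
    """One video slice group per sub-interception period; the groups'
    reference index information is joined with a splicing marker."""
    groups = []
    for a, b in sub_periods(points):
        urls = [u for (u, begin, dur) in slices if begin + dur > a and begin < b]
        groups.append("\n".join(urls))
    return "\n#EXT-X-DISCONTINUITY\n".join(groups)
```

For instance, time points [0, 15, 25, 40] keep the ranges 0-15 s and 25-40 s and cut out 15-25 s, and the two slice groups are joined by one splicing marker.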
In one possible implementation, the target video corresponds to a reference number of video resources, different video resources having different definitions; the video to be intercepted correspondingly has the reference number of pieces of video index information, each generated based on the interception information and the video resource of one definition.
In one possible implementation, the reference number of video resources corresponding to the target video are obtained by transcoding in a reference-frame-aligned manner, so that video slices at the same position in different video resources have the same duration.
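Because reference-frame-aligned transcoding makes slice boundaries coincide across definitions, the interception computation done once applies to every definition's resource. A small check of that invariant, under the assumption (invented here) that each resource is represented simply as its list of slice durations:

```python
from typing import List

def frame_aligned(resources: List[List[float]]) -> bool:
    """resources[i] is the list of slice durations of the i-th definition's
    video resource; alignment means slices at the same position in every
    definition share a duration."""
    return all(r == resources[0] for r in resources[1:])
```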
There is provided an apparatus for generating video index information, the apparatus including:
a display unit, configured to display a video interception editing page in response to a video interception request;
an acquisition unit, configured to acquire, on the video interception editing page, interception information for the target video, the interception information indicating the relation between the video to be intercepted and the target video;
a sending unit, configured to send the interception information for the target video to a server, the server being configured to generate, based on the interception information and the video resource corresponding to the target video, video index information corresponding to the video to be intercepted, where the video resource corresponding to the target video is obtained by transcoding the target video and the video index information is used to index the video resource corresponding to the video to be intercepted.
In one possible implementation, the video interception editing page includes a video input control, a dotting control and a confirmation control, the dotting control being used to specify time points on a time axis. The acquisition unit is configured to display selectable candidate videos in response to a trigger instruction for the video input control; display, in response to a selection instruction for a target video among the candidate videos, the target video and its time axis in the video interception editing page; determine specified time points on the time axis based on trigger operations on the dotting control; and acquire, in response to a trigger instruction for the confirmation control, the interception information for the target video based on the specified time points.
In one possible implementation, the sending unit is further configured to send a playing request for the video to be intercepted to the server, the server being configured to return, based on the playing request, the video index information corresponding to the video to be intercepted;
the acquisition unit is further configured to acquire, based on the video index information returned by the server, the video resource corresponding to the video to be intercepted;
the device further comprises:
a playing unit, configured to play the intercepted video based on the video resource corresponding to it.
In one possible implementation, the video index information corresponding to the video to be intercepted includes the slice index information corresponding to at least one associated video slice related to the video to be intercepted, and the video resource corresponding to the video to be intercepted consists of the sub-video resources corresponding to the at least one associated video slice. The acquisition unit is further configured to acquire the at least one associated video slice based on the video index information; for any associated video slice, in response to its slice index information including no start and end marker information, take the whole associated video slice as the corresponding sub-video resource; and in response to its slice index information including start and end marker information, determine, based on the start and end marker information, the partial video slice within the associated video slice that matches the video to be intercepted, and take that partial video slice as the corresponding sub-video resource.
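On the playing side, that decision can be sketched as follows (the dict-shaped slice index entry and its field names are hypothetical, invented for illustration):

```python
def resolve_sub_resource(entry: dict) -> tuple:
    """Return (url, start, end) of the sub-video resource for one associated
    video slice.  Without start/end marker information the whole slice is the
    sub-resource; with markers, only the marked part of the slice is used."""
    if "clip_start" not in entry:  # no markers: use the whole slice
        return (entry["url"], 0.0, entry["duration"])
    return (entry["url"], entry["clip_start"], entry["clip_end"])
```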
In another aspect, a computer device is provided, which includes a processor and a memory, where at least one computer program is stored in the memory, and the at least one computer program is loaded and executed by the processor to implement any one of the above-mentioned methods for generating video index information.
In another aspect, a computer-readable storage medium is provided, in which at least one computer program is stored, and the at least one computer program is loaded and executed by a processor to implement any one of the above methods for generating video index information.
In another aspect, a computer program product or a computer program is also provided, comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to make the computer device execute any one of the above-mentioned video index information generation methods.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
in the embodiments of the application, the video index information corresponding to an intercepted video is generated directly from the interception information for the target video and the video resource corresponding to the target video. In this way, generating the video index information requires neither uploading nor transcoding the intercepted video; it takes little time, the video index information is generated efficiently, and the time needed to bring the intercepted video online is shortened.
Drawings
To illustrate the technical solutions in the embodiments of the application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of a method for generating video index information according to an embodiment of the present application;
fig. 2 is a flowchart of a method for generating video index information according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a video capture editing page provided by an embodiment of the present application;
fig. 4 is a flowchart of a process of acquiring capture information for a target video on a video capture editing page according to an embodiment of the present application;
fig. 5 is a flowchart of a process of generating video index information corresponding to an intercepted video to be intercepted based on the interception information and a video resource corresponding to a target video, provided in an embodiment of the present application;
fig. 6 is a schematic diagram of a video resource corresponding to a target video according to an embodiment of the present application;
fig. 7 is a schematic diagram of video index information corresponding to a target video according to an embodiment of the present application;
fig. 8 is a schematic diagram of video index information corresponding to an intercepted video to be intercepted according to an embodiment of the present application;
fig. 9 is a schematic diagram of video index information corresponding to another intercepted video to be intercepted according to an embodiment of the present application;
fig. 10 is a schematic diagram of video index information corresponding to another intercepted video to be intercepted according to an embodiment of the present application;
FIG. 11 is a schematic diagram of an example of an intercept provided by an embodiment of the present application;
fig. 12 is a schematic diagram of video index information corresponding to another intercepted video to be intercepted according to an embodiment of the present application;
fig. 13 is a flowchart of another process of generating video index information corresponding to an intercepted video to be intercepted based on the interception information and a video resource corresponding to a target video, provided in an embodiment of the present application;
FIG. 14 is a schematic diagram of another example of interception provided by embodiments of the present application;
fig. 15 is a schematic diagram of video index information corresponding to another intercepted video to be intercepted according to an embodiment of the present application;
fig. 16 is a schematic diagram of a video resource corresponding to another target video provided in an embodiment of the present application;
fig. 17 is a flowchart of a process of playing an intercepted video to be intercepted according to an embodiment of the present application;
fig. 18 is a flowchart of another method for generating video index information according to an embodiment of the present application;
fig. 19 is a flowchart of another method for generating video index information according to an embodiment of the present application;
fig. 20 is a schematic diagram of an apparatus for generating video index information according to an embodiment of the present application;
fig. 21 is a schematic diagram of another apparatus for generating video index information according to an embodiment of the present application;
fig. 22 is a schematic diagram of another apparatus for generating video index information according to an embodiment of the present application;
fig. 23 is a schematic structural diagram of a server provided in an embodiment of the present application;
fig. 24 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
In some embodiments, the method for generating the video index information provided by the embodiments of the present application can be applied in the field of cloud technology, and the video resources obtained after transcoding, the generated video index information, and the like are stored based on the cloud technology. Before introducing the embodiments of the present application, some basic concepts in the cloud technology field need to be introduced.
Cloud technology (Cloud Technology): a hosting technology that unifies hardware, software, network and other resources in a wide-area or local-area network to realize the computation, storage, processing and sharing of data. It is a general term for the network, information, integration, management-platform and application technologies applied under the cloud computing business model; it can form a resource pool to be used on demand, flexibly and conveniently. Cloud computing technology will become an important support of the cloud technology field. The background services of technical network systems, such as video websites, picture websites and other web portals, require large amounts of computing and storage resources. With the rapid development and application of the internet industry, each article may carry its own identification mark that needs to be transmitted to a background system for logical processing; data of different levels are processed separately, and all kinds of industry data need strong system background support, which is realized through cloud computing.
Cloud storage (Cloud Storage): a distributed cloud storage system (hereinafter referred to as a storage system) integrates, through functions such as cluster application, grid technology and distributed storage file systems, a large number of storage devices of various types (also referred to as storage nodes) in a network, making them work together through application software or application interfaces, and externally provides data storage and service access functions.
Certainly, the method for generating video index information provided in the embodiment of the present application can also be applied to other fields besides the field of cloud technology, and the embodiments of the present application are not listed here one by one.
Referring to fig. 1, a schematic diagram of an implementation environment of a method for generating video index information according to an embodiment of the present application is shown. The implementation environment includes: a terminal 11 and a server 12.
In an exemplary embodiment, an application or a web page providing a video interception function runs on the terminal 11, and the terminal 11 can display a video interception editing page on which interception information for a transcoded video is acquired. After acquiring the interception information for the video, the terminal 11 sends it to the server 12. The server 12 obtains the interception information sent by the terminal 11 and then generates, based on that information and the transcoded video resource corresponding to the video, the video index information corresponding to the video to be intercepted. In this case, the method for generating video index information is performed jointly by the terminal 11 and the server 12.
In an exemplary embodiment, after the video capture editing page obtains capture information for a certain transcoded video, the terminal 11 can directly generate video index information corresponding to the captured video to be captured based on the capture information and a video resource obtained after transcoding corresponding to the video. In this case, the generation method of the video index information is performed by the terminal 11.
In an exemplary embodiment, the interception information for a certain transcoded video may be stored in the server 12 in advance. The server 12 can directly extract this interception information and then generate the video index information corresponding to the video to be intercepted based on the interception information and the transcoded video resource corresponding to the video. In this case, the method for generating video index information is performed by the server 12.
In an exemplary embodiment, after the server 12 generates the video index information corresponding to the intercepted video, the intercepted video can be selected by the user for playing, and the terminal 11 can send a playing request for the intercepted video to the server 12. Based on the playing request, the server 12 sends the video index information corresponding to the intercepted video to the terminal 11, so that the terminal 11 obtains the video resource corresponding to the intercepted video based on the video index information and then plays the intercepted video based on that video resource.
In one possible implementation, the terminal 11 may be, but is not limited to, a smartphone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a smart television, a smart in-vehicle device, or the like. The server 12 may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms. The terminal 11 and the server 12 may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
It should be understood by those skilled in the art that the above-mentioned terminal 11 and server 12 are only examples, and other existing or future terminals or servers may be suitable for the present application and are included within the scope of the present application and are herein incorporated by reference.
As can be seen from the description of the implementation environment shown in fig. 1, the method for generating video index information may be executed by the terminal, by the server, or jointly by the terminal and the server. The embodiments of the present application describe the method using joint execution by a terminal and a server as an example. As shown in fig. 2, the method for generating video index information provided in the embodiment of the present application includes the following steps 201 to 205:
in step 201, the terminal displays a video capture editing page in response to a video capture request.
The video interception request is used for indicating that a user needs to intercept a video, and the method for acquiring the video interception request by the terminal is not limited in the embodiment of the application. Illustratively, the terminal is provided with an application program or a webpage capable of providing a video interception function, a video interception entry is displayed in the application program or the webpage capable of providing the video interception function, and the terminal responds to the triggering operation of a user on the video interception entry to acquire a video interception request.
And after the terminal acquires the video intercepting request, responding to the video intercepting request and displaying a video intercepting and editing page. The video intercepting and editing page is used for a user to intercept videos according to requirements. The layout mode of the video intercepting and editing page is set according to experience or flexibly adjusted according to an application scene, and the layout mode is not limited in the embodiment of the application. Illustratively, a video capture editing page is shown in FIG. 3.
In step 202, the terminal acquires interception information for the target video on the video interception editing page, where the interception information is used to indicate the relationship between the video to be intercepted and the target video.
The method for generating video index information is applied to a scenario in which a target video is intercepted to obtain an intercepted video, where the target video refers to a video that has been transcoded and whose video resource has been stored. The format of the transcoded video resource is related to the transcoding mode, which is not limited in the embodiments of the present application. It should be noted that, in one interception process, the target video may refer to one video or to multiple videos, which is not limited in the embodiments of the present application. That is to say, the embodiments of the present application also support application scenarios in which multiple videos are intercepted to obtain one intercepted video. The embodiments of the present application take one target video as an example for explanation.
The interception information for the target video is used to indicate the relationship between the video to be intercepted and the target video. From the interception information for the target video, it can be known which part or parts of the target video the intercepted video consists of. The terminal acquires the interception information for the target video on the video interception editing page.
After the video interception editing page is displayed, the user can perform an interception operation on the target video in the page displayed by the terminal, so that the terminal can acquire the interception information for the target video on the video interception editing page.
The interception operation performed by the user on the video interception editing page indicates which part or parts of the target video the user needs to intercept, that is, which part or parts of the target video the intercepted video should consist of. The embodiments of the present application do not limit the type of interception operation the user can perform on the target video; exemplarily, the interception operation includes specifying, on the time axis of the target video, an interception start time point and an interception end time point of the video to be intercepted. The interception start time point indicates the starting position of the video to be intercepted, and the interception end time point indicates the ending position of the video to be intercepted.
In an exemplary embodiment, for the case where the user needs to intercept multiple portions of the target video so that the intercepted video is composed of those portions, the user can also specify an even number of interception intermediate time points on the time axis of the target video. In this case, the interception start time point, the even number of interception intermediate time points, and the interception end time point are arranged in chronological order; the time points are grouped pairwise in order, each pair forming a sub-interception time period, and each sub-interception time period corresponds to one portion of the target video to be intercepted.
The above interception start time point, interception intermediate time points, and interception end time point are all relative to the target video; for example, the interception start time point is the 1st second of the target video, the interception intermediate time points are the 2nd and 3rd seconds of the target video, and the interception end time point is the 4th second of the target video. In an exemplary embodiment, the user can specify the interception time points by performing a dotting operation on the time axis of the target video.
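The pairwise grouping of interception time points into sub-interception time periods described above can be sketched as follows (a minimal illustration; the function name and sample values are not from the original):

```python
def build_sub_periods(time_points):
    """Pair an ordered list of interception time points (seconds on the
    target video's time axis) into sub-interception time periods.

    The list holds the interception start point, an even number of
    intermediate points, and the end point, so its length is always
    even; consecutive points are grouped pairwise."""
    points = sorted(time_points)
    if len(points) % 2 != 0:
        raise ValueError("an even number of interception time points is required")
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

# The example from the text: start at second 1, intermediate points at
# seconds 2 and 3, end at second 4 -> two portions to intercept.
print(build_sub_periods([1, 2, 3, 4]))  # → [(1, 2), (3, 4)]
```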
In one possible implementation, the video interception editing page includes a video input control, a dotting control, and a confirmation control. The dotting control is used to specify a time point on the time axis. In this case, referring to fig. 4, the process by which the terminal acquires the interception information for the target video on the video interception editing page includes the following steps 2021 to 2024:
Step 2021: display the selectable candidate videos in response to a triggering instruction of the video input control.
When a triggering operation by the user on the video input control in the video interception editing page is detected, a triggering instruction for the video input control is obtained, which indicates that the user needs to select a source video for the video to be intercepted. The terminal then displays the candidate videos available for selection. Illustratively, the number of candidate videos is one or more, which is not limited in the embodiments of the present application. In an exemplary embodiment, the candidate videos all refer to videos whose transcoded video resources are stored in the server.
Step 2022: in response to a selection instruction for a target video among the candidate videos, display the target video and the time axis corresponding to the target video in the video interception editing page.
After the selectable candidate videos are displayed, a user can perform selection operation in the candidate videos, the candidate videos selected by the user are used as target videos, and the terminal responds to a selection instruction of the target videos and displays the target videos and time axes corresponding to the target videos in a video intercepting and editing page. The time axis corresponding to the target video is used for indicating the total duration of the target video and the time point corresponding to each video picture of the target video.
In one possible implementation manner, in response to a selection instruction of a target video, a process of displaying the target video in a video capture editing page by a terminal is as follows: and the terminal responds to the selected instruction of the target video, acquires the video resource corresponding to the target video from the server, and decodes and displays the video resource corresponding to the target video. In an exemplary embodiment, in presenting the target video, each video picture of the target video is presented in turn. When a certain video picture is displayed, marking a time point corresponding to the video picture on a time axis corresponding to a target video.
In an exemplary embodiment, the video capture editing page includes a video screen preview area in which each video screen of the target video is presented in turn.
Step 2023: determine the specified time points on the time axis corresponding to the target video based on trigger operations on the dotting control.
After the target video and the time axis corresponding to the target video are displayed, a user can trigger the dotting control in the process of watching the target video. The terminal can determine the appointed time point on the time axis corresponding to the target video based on the trigger operation of the user for the dotting control.
In one possible implementation manner, the dotting control includes a dotting determination control and a dotting cancellation control, and the process of determining the specified time point on the time axis corresponding to the target video by the terminal based on the trigger operation for the dotting control is as follows: when the terminal detects the trigger operation aiming at the determined dotting control each time, the time point corresponding to the current video picture marked on the time axis is used as a candidate time point; when the terminal detects the trigger operation aiming at the dotting cancellation control each time, the time point corresponding to the current video picture marked on the time axis is used as a cancellation time point; and deleting the candidate time points overlapped with the cancel time points in all the candidate time points, and taking the rest candidate time points as the appointed time points on the time axis corresponding to the target video.
Step 2024: in response to a trigger instruction of the confirmation control, acquire the interception information for the target video based on the specified time points.
When the trigger instruction of the confirmation control is obtained, it is indicated that the user has finished the intercepting operation for the target video, and at this time, the terminal obtains the intercepting information for the target video based on the specified time point.
In one possible implementation manner, the manner of acquiring, by the terminal, the interception information for the target video based on the specified time point is as follows: and arranging the appointed time points according to the time sequence, taking the first appointed time point as an interception starting time point, taking the last appointed time point as an interception ending time point, taking other appointed time points as interception middle time points, and taking the interception starting time point, the interception middle time point and the interception ending time point as interception information aiming at the target video. It should be noted that the number of the middle time points of the capturing is even, so as to ensure that the relationship between the captured video to be captured and the target video can be accurately obtained by using the capturing information. In this case, the cut information for the target video includes a cut start time point, a cut end time point, and an even number of cut intermediate time points.
In the exemplary embodiment, the number of the designated time points is two, and in this case, the designated time point that is earlier in time is taken as the cut-out start time point, the designated time point that is later in time is taken as the cut-out termination time point, and the cut-out start time point and the cut-out termination time point are taken as cut-out information for the target video. At this time, the interception information for the target video includes an interception start time point and an interception end time point, and does not include an interception intermediate time point.
In an exemplary embodiment, a video picture control is further included in the video capture editing page, and the video picture control is used for controlling the video picture displayed in the video capture editing page. Illustratively, the video picture control comprises at least one sub-control, and different sub-controls are used for performing different control on the video picture displayed in the video capture editing page.
For example, the video picture control controls include a start sub-control, a pause sub-control, a previous frame sub-control, and a next frame sub-control. The starting sub-control is used for controlling the starting of displaying the video picture; the pause sub-control is used for controlling the pause of the video picture; the previous frame of sub-control is used for controlling the currently displayed video picture in the video capturing and editing page to jump to the previous frame of video picture; and the next frame of sub-control is used for controlling the video picture currently displayed in the video capturing and editing page to jump to the next frame of video picture.
Illustratively, a video capture editing page displayed by the terminal is as shown in fig. 3, and in the video capture editing page shown in fig. 3, the video capture editing page includes a video input control 301 marked with a text "input video", a determination dotting control 302 marked with a text "dotting", a cancellation dotting control 303 marked with a text "cancellation dotting", a confirmation control 304 marked with a text "capture generation", and a video picture control 305, and the video picture control 305 includes a start sub-control marked with a text "start", a pause sub-control marked with a text "pause", a previous frame control marked with a text "previous frame", and a next frame sub-control marked with a text "next frame".
The user can trigger the video input control 301, the terminal acquires video resources of the target video based on the trigger operation of the user on the video input control 301, then the target video is displayed in the video picture preview area 306, and the time axis corresponding to the target video is displayed below the video picture preview area 306. During the process of watching the displayed target video, the user can control the display process by triggering one or more sub-controls marked with the characters "start", "pause", "previous frame" and "next frame", that is, the video picture displayed in the video picture preview area.
The user can also specify or cancel a time point on the time axis of the target video by triggering the dotting determining control 302 and the dotting canceling control 303, and the terminal determines the specified time point on the time axis corresponding to the target video based on the triggering operation of the user on the dotting determining control 302 and the dotting canceling control 303. When the triggering operation of the user on the confirmation control 304 is detected, the terminal acquires interception information for the target video based on the specified time point.
In step 203, the terminal sends interception information for the target video to the server.
After acquiring the interception information for the target video, the terminal sends it to the server, so that the server generates the video index information corresponding to the video to be intercepted based on the interception information for the target video.
In step 204, the server obtains interception information for the target video.
After the terminal sends the interception information for the target video to the server, the server obtains the interception information for the target video sent by the terminal, and then executes step 205.
In step 205, the server generates video index information corresponding to the video to be intercepted based on the interception information and the video resource corresponding to the target video, where the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video.

The interception information is generated for the target video and indicates the relationship between the video to be intercepted and the target video, so after the interception information is obtained, the video index information corresponding to the video to be intercepted can be generated directly based on the interception information and the video resource corresponding to the target video. It should be noted that the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video; that is to say, according to the video index information, the video resource corresponding to the intercepted video can be obtained. Because the video resource corresponding to the target video is already stored before the video index information needs to be acquired, the video index information can be generated directly based on the interception information and the stored video resource corresponding to the target video, so the generation efficiency of the video index information is high.
The embodiments of the present application do not limit the type of the video index information; for example, the video index information is an M3U8 file. An M3U8 file is an M3U (Moving Picture Experts Group Audio Layer 3 Uniform Resource Locator) file in the UTF-8 (8-bit Unicode Transformation Format) encoding format. M3U8 is a common streaming media format that mainly exists in the form of a file list and supports both live broadcast and video on demand. An M3U8 file is a plain-text file recording an index: when an M3U8 file is opened, the playing software does not play the file itself but acquires the video resource according to the file and then plays the video resource online. The M3U8 file is the basis of the HLS (HTTP Live Streaming) protocol format; HLS is a streaming media transport protocol based on HTTP (Hyper Text Transfer Protocol) that supports live broadcast and on-demand playback of streaming media.
The video resources corresponding to the target video are video resources obtained after transcoding the target video, and exemplarily, the video resources corresponding to the target video are stored in a CDN module having a content distribution function in a server, and the CDN module having the content distribution function is used to implement distribution of the video resources corresponding to the target video.
In one possible implementation, the video resource corresponding to the target video includes at least one video slice corresponding to the target video, and different video slices correspond to different slice time periods. At least one video fragment corresponding to the target video is a fragment obtained by segmenting the target video in the transcoding process, and different video fragments correspond to different fragment time periods.
It should be noted that the at least one video slice is arranged in order of the corresponding slice time periods. The slice time period corresponding to a video slice is relative to the target video; for example, the slice time period corresponding to the first video slice is the period between the 0th and 6th seconds of the target video, and the slice time period corresponding to the second video slice is the period between the 6th and 12th seconds of the target video. The duration of a video slice can be determined from its slice time period; for example, if the slice time period corresponding to a video slice is the period between the 0th and 6th seconds of the target video, the duration of that slice is 6 seconds. The durations of different video slices may be the same or different, which is not limited in the embodiments of the present application.
In an exemplary embodiment, the video segments are segmented according to video I-frames, and the first frame in each video segment is a video I-frame, so as to ensure that each video segment can be decoded and played by the playing terminal.
In some embodiments, depending on the content of the interception information, the process of generating the video index information corresponding to the video to be intercepted based on the interception information and the video resource corresponding to the target video also differs. The content of the interception information includes, but is not limited to, the following two cases:
Case one: the interception information includes an interception start time point and an interception end time point, and does not include an interception intermediate time point.
Case one indicates that the intercepted video consists of a single portion of the target video. In one possible implementation, in this case, referring to fig. 5, the process of generating the video index information corresponding to the video to be intercepted based on the interception information and the video resource corresponding to the target video includes the following steps 205a and 205b:

Step 205a: determine, among the at least one video fragment, at least one target video fragment corresponding to the video to be intercepted based on the interception start time point and the interception end time point, where part or all of the fragment time period corresponding to any target video fragment lies within the interception time period formed by the interception start time point and the interception end time point.

The interception time period formed by the interception start time point and the interception end time point is the time period of the target video that needs to be intercepted. By comparing the fragment time period corresponding to each video fragment with the interception time period, at least one target video fragment whose corresponding fragment time period lies partly or wholly within the interception time period can be determined among the at least one video fragment; this at least one target video fragment corresponds to the video to be intercepted.

It should be noted that a time period in the embodiments of the present application refers to the period between a start time point and an end time point; if only the start time point or only the end time point of a fragment time period lies within the interception time period, the video fragment corresponding to that fragment time period is not considered a target video fragment corresponding to the video to be intercepted. It should further be noted that the number of target video fragments corresponding to the video to be intercepted may be one or more, depending on the actual situation, which is not limited in the embodiments of the present application.
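The selection rule of step 205a (a fragment qualifies when its fragment time period overlaps the interception time period by more than a single endpoint) can be sketched as follows; the fragment names and times are hypothetical:

```python
def select_target_fragments(fragments, clip_start, clip_end):
    """Each fragment is (name, start, end), with start/end in seconds on
    the target video's time axis. A fragment qualifies when part or all
    of its fragment time period lies inside [clip_start, clip_end]; a
    fragment whose period merely touches the interception period at a
    single endpoint does not qualify."""
    return [f for f in fragments if f[1] < clip_end and f[2] > clip_start]

# Hypothetical 6-second fragments of a target video:
fragments = [("1.ts", 0, 6), ("2.ts", 6, 12), ("3.ts", 12, 18), ("4.ts", 18, 24)]
# Intercepting seconds 6-18 selects 2.ts and 3.ts; 1.ts and 4.ts only
# touch the interception period at one endpoint, so they are excluded.
print(select_target_fragments(fragments, 6, 18))
```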
Step 205 b: and generating video index information corresponding to the intercepted video to be intercepted based on at least one target video fragment.
After the at least one target video fragment corresponding to the video to be intercepted is determined among the at least one video fragment corresponding to the target video, the video index information corresponding to the intercepted video is generated based on the at least one target video fragment. The fragment time periods corresponding to the at least one target video fragment determined based on the interception start time point and the interception end time point are consecutive; that is, the at least one target video fragment is one or more consecutive video fragments among the at least one video fragment corresponding to the target video. The fragment time periods corresponding to the at least one target video fragment fall into the following two cases:
Case 1: all the fragment time periods corresponding to the at least one target video fragment lie within the interception time period.
Case 1 indicates that the video resource of the intercepted video is constituted by the entirety of each target video fragment. In one possible implementation, in case 1, the process of generating the video index information corresponding to the video to be intercepted based on the at least one target video fragment includes the following two steps:
Step 1: generate the fragment index information corresponding to the at least one target video fragment based on the storage information of the at least one target video fragment and the duration of the at least one target video fragment.
The time length of any target video fragment is determined based on the fragment time period corresponding to any target video fragment. The storage information of any target video fragment is used for indicating the storage position of any target video fragment, and the server can determine the storage information of at least one target video fragment according to the storage position of at least one target video fragment. The embodiment of the present application does not limit the representation form of the storage information, and for example, the representation form of the storage information is a URL (Uniform Resource Locator).
The process of generating the fragment index information corresponding to the at least one target video fragment based on the storage information and the duration of the at least one target video fragment is as follows: for each target video fragment, generate the fragment index information corresponding to that fragment based on its storage information and its duration. That is, the number of pieces of fragment index information is the same as the number of target video fragments.
The embodiment of the application does not limit the specific manner of generating the fragment index information corresponding to the target video fragment based on the storage information of the target video fragment and the duration of the target video fragment, as long as it is ensured that the fragment index information corresponding to the target video fragment includes the storage information of the target video fragment and information for indicating the duration of the target video fragment.
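As one hedged sketch of step 1 (not the patent's prescribed format), the per-fragment index entries can be written in the M3U8 style: an EXTINF tag carrying the duration derived from the fragment time period, followed by the fragment's storage information as a URL. The URL scheme here is an assumption:

```python
def fragment_index_lines(target_fragments, base_url):
    """For each (name, start, end) fragment, emit an #EXTINF line whose
    duration is derived from the fragment time period, followed by the
    fragment's storage URL (base_url is a hypothetical CDN prefix)."""
    lines = []
    for name, start, end in target_fragments:
        lines.append(f"#EXTINF:{float(end - start):.1f},")
        lines.append(f"{base_url}/{name}")
    return lines

print(fragment_index_lines([("2.ts", 6, 12)], "https://cdn.example.com/vid1"))
# → ['#EXTINF:6.0,', 'https://cdn.example.com/vid1/2.ts']
```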
Step 2: and generating video index information corresponding to the intercepted video to be intercepted based on the fragment index information corresponding to at least one target video fragment.
Based on the fragment index information corresponding to the at least one target video fragment, the manner of generating the video index information corresponding to the video to be intercepted is related to the format of the video index information, which is not limited in the embodiments of the present application. Illustratively, the video index information is composed of three parts: the first part is header information, the second part is detail index information, and the third part is trailer information. In this case, the fragment index information corresponding to the at least one target video fragment serves as the detail index information, and the video index information corresponding to the video to be intercepted is obtained by adding the header information and the trailer information on the basis of the fragment index information corresponding to the at least one target video fragment.

In an exemplary embodiment, the video index information may also consist of only the first part and the second part; in this case, the video index information corresponding to the video to be intercepted can be obtained by adding the header information on the basis of the fragment index information corresponding to the at least one target video fragment.
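A minimal sketch of step 2 under the three-part layout (header information, detail index information, trailer information). The header tags follow the fields described for fig. 7; the trailer tag is an assumption based on the standard HLS end-of-playlist marker:

```python
def build_index_file(detail_lines, max_fragment_duration):
    """Assemble header information, detail index information (the
    per-fragment index entries), and trailer information into one
    M3U8-style index text."""
    header = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        "#EXT-X-MEDIA-SEQUENCE:0",
        f"#EXT-X-TARGETDURATION:{max_fragment_duration}",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    trailer = ["#EXT-X-ENDLIST"]  # marks the end of a VOD playlist
    return "\n".join(header + detail_lines + trailer)

index_text = build_index_file(["#EXTINF:6.0,", "2.ts"], 6)
print(index_text.splitlines()[0])  # → #EXTM3U
```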
In this embodiment of the application, video resources corresponding to a target video all need to be stored in a server, for example, the video resources corresponding to the target video are as shown in fig. 6, where 1.ts, 2.ts, and the like are all used to identify one video fragment corresponding to the target video. As can be seen from fig. 6, the video slices corresponding to the target video (assuming that the video is identified as vid1) are: video segment 1.ts, video segment 2.ts, video segment 3.ts, video segment 4.ts, …, video segment 10.ts, video segment 11.ts, video segment 12.ts, …, video segment 20.ts, video segment 21.ts, video segment 22.ts, and other video segments not shown.
In a possible implementation, after the server acquires the interception information, it allocates a new video identifier to the video to be intercepted corresponding to the interception information. The video identifier allocated to the video to be intercepted differs from the video identifier of the target video, and the video identifiers allocated to different videos to be intercepted also differ from one another, so that the videos can be distinguished conveniently.
For example, as shown in fig. 6, the video identifier of the target video is vid1, and the video identifiers of the three videos to be intercepted from the target video are vid2, vid3, and vid4. The at least one target video slice corresponding to each video to be intercepted is different. As can be seen from fig. 6, the at least one target video slice corresponding to the video to be intercepted vid2 consists of video slice 2.ts, video slice 3.ts, and video slice 4.ts; that corresponding to vid3 consists of video slice 10.ts, video slice 11.ts, and video slice 12.ts; and that corresponding to vid4 consists of video slice 20.ts, video slice 21.ts, and video slice 22.ts.
Different videos have different video index information, and different video contents are played through their respective video index information. The embodiments of the present application are described by taking the case in which the video index information is an M3U8 file as an example. The video index information corresponding to the target video vid1 is shown in fig. 7; it includes header information 701, slice index information 702 corresponding to at least one video slice, and trailer information 703.
In the header information 701, "EXTM3U" indicates that the video index information is an M3U8 file; "EXT-X-VERSION:3" indicates that the HLS protocol version number is 3; "EXT-X-MEDIA-SEQUENCE:0" indicates that the sequence number of the first video slice in the playlist is 0; "EXT-X-TARGETDURATION:6" indicates that the maximum duration of each video slice is 6 seconds; "EXT-X-PLAYLIST-TYPE:VOD" indicates that the play type is VOD (Video On Demand).
In the slice index information 702 corresponding to at least one video slice, the slice index information corresponding to any video slice includes information indicating the duration of that video slice and the storage information of that video slice. For example, "EXTINF:6.000" in the slice index information 7021 corresponding to video slice 1.ts indicates that the duration of video slice 1.ts is 6 seconds, and "1.ts" represents the storage information of video slice 1.ts. The trailer information 703 is represented by "EXT-X-ENDLIST" and indicates that the video index information corresponding to the target video vid1 is complete.
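As a minimal sketch of the three-part structure just described (header information, per-slice index information, trailer information), the following hypothetical helper assembles an M3U8 playlist in the shape of fig. 7; the helper name and parameters are illustrative assumptions, not part of the patent.

```python
# Hypothetical helper: assemble M3U8 video index information from header
# information, per-slice index information, and trailer information.
def build_m3u8(slice_durations, slice_uris, target_duration=6):
    header = [
        "#EXTM3U",                                   # the file is an M3U8 file
        "#EXT-X-VERSION:3",                          # HLS protocol version 3
        "#EXT-X-MEDIA-SEQUENCE:0",                   # first slice sequence number
        f"#EXT-X-TARGETDURATION:{target_duration}",  # maximum slice duration (s)
        "#EXT-X-PLAYLIST-TYPE:VOD",                  # play type is video on demand
    ]
    detail = []                                      # detail index information
    for duration, uri in zip(slice_durations, slice_uris):
        detail.append(f"#EXTINF:{duration:.3f},")    # slice duration information
        detail.append(uri)                           # slice storage information
    trailer = ["#EXT-X-ENDLIST"]                     # the index is complete
    return "\n".join(header + detail + trailer)

playlist = build_m3u8([6.0, 6.0], ["1.ts", "2.ts"])
```

Dropping the trailer list reproduces the two-part variant in which only the header information is added to the detail index information.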
Assuming that the at least one target video slice corresponding to the video to be intercepted vid2 (i.e., video slice 2.ts, video slice 3.ts, and video slice 4.ts) satisfies condition 1, the video index information corresponding to vid2 is as shown in fig. 8. The video index information shown in fig. 8 includes header information 801, slice index information 802 corresponding to the at least one target video slice, and trailer information 803. The header information 801 is consistent with the header information 701 in fig. 7, and the trailer information 803 with the trailer information 703 in fig. 7, and they are not described herein again. The slice index information 802 includes the slice index information corresponding to video slice 2.ts, video slice 3.ts, and video slice 4.ts respectively.
Assuming that the at least one target video slice corresponding to the video to be intercepted vid3 (i.e., video slice 10.ts, video slice 11.ts, and video slice 12.ts) satisfies condition 1, the video index information corresponding to vid3 is as shown in fig. 9. It includes header information 901, slice index information 902 corresponding to the at least one target video slice, and trailer information 903. The header information 901 is consistent with the header information 701 in fig. 7, and the trailer information 903 with the trailer information 703 in fig. 7. The slice index information 902 includes the slice index information corresponding to video slice 10.ts, video slice 11.ts, and video slice 12.ts respectively.
Assuming that the at least one target video slice corresponding to the video to be intercepted vid4 (i.e., video slice 20.ts, video slice 21.ts, and video slice 22.ts) satisfies condition 1, the video index information corresponding to vid4 is as shown in fig. 10. It includes header information 1001, slice index information 1002 corresponding to the at least one target video slice, and trailer information 1003. The header information 1001 is consistent with the header information 701 in fig. 7, and the trailer information 1003 with the trailer information 703 in fig. 7. The slice index information 1002 includes the slice index information corresponding to video slice 20.ts, video slice 21.ts, and video slice 22.ts respectively.
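Under condition 1 (each target video slice entirely within the interception time period), selecting the target video slices can be sketched as follows; the fixed 6-second slice duration, the slice naming, and the vid2 boundaries are assumptions taken from the examples above.

```python
# Sketch: pick the video slices whose slice time periods lie entirely
# within the interception time period (condition 1). Assumes fixed
# 6-second slices named 1.ts, 2.ts, ... as in fig. 6.
def select_whole_slices(start, end, slice_dur=6.0, total=30):
    targets = []
    for i in range(total):
        s, e = i * slice_dur, (i + 1) * slice_dur  # slice time period
        if s >= start and e <= end:                # entirely inside the period
            targets.append(f"{i + 1}.ts")
    return targets

# vid2 covers seconds 6-24 of vid1 -> slices 2.ts, 3.ts, 4.ts
whole = select_whole_slices(6.0, 24.0)
```

A partially covered slice (e.g. for an interception period starting at 8.2 seconds) is excluded here and is instead handled as a reference video slice under condition 2 below.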
Condition 2: the slice time period corresponding to a reference video slice among the at least one target video slice is partially within the interception time period, and the slice time periods corresponding to the other target video slices are entirely within the interception time period. The reference video slice includes at least one of the first target video slice and the last target video slice when the at least one target video slice is ordered according to its corresponding slice time periods.
That is, condition 2 means that, when the at least one target video slice is arranged in order of the corresponding slice time periods: the slice time period corresponding to the first target video slice is partially within the interception time period while the slice time periods corresponding to the other target video slices are entirely within it; or the slice time period corresponding to the last target video slice is partially within the interception time period while the slice time periods corresponding to the other target video slices are entirely within it; or the slice time periods corresponding to both the first and the last target video slices are partially within the interception time period while the slice time periods corresponding to the remaining target video slices are entirely within it.
Condition 2 describes the case in which the video resource corresponding to the video to be intercepted is formed by a part of the reference video slice together with the whole of the other target video slices.
In a possible implementation, under condition 2, the process of generating the video index information corresponding to the video to be intercepted based on the at least one target video slice includes the following steps I to III:
step I: and generating fragment index information corresponding to the reference video fragments based on the reference time periods in the intercepting time periods in the fragment time periods corresponding to the reference video fragments and the storage information of the reference video fragments.
Slice index information corresponding to the reference video slice is generated based on the reference time period and the storage information of the reference video slice. The reference time period refers to a time period within the capture time period in the fragmentation time period corresponding to the reference video fragmentation, and since the fragmentation time period part corresponding to the reference video fragmentation is within the capture time period, the reference time period is a part of the time period in the fragmentation time period corresponding to the reference video fragmentation.
In a possible implementation, the slice index information corresponding to the reference video slice is generated as follows: determine a reference duration, start marker information, and end marker information based on the reference time period; then generate the slice index information corresponding to the reference video slice based on the reference duration, the start marker information, the end marker information, and the storage information of the reference video slice. The start marker information and the end marker information are used to determine the partial video slice within the reference video slice that matches the video to be intercepted.
The reference duration is the duration of the reference time period and can be determined directly from it. The start marker information and the end marker information precisely mark which part of the reference video slice belongs to the video resource corresponding to the video to be intercepted, i.e., which partial video slice matches the video to be intercepted, thereby ensuring the interception precision. The start marker information indicates the start position, and the end marker information indicates the end position, of the partial video slice within the reference video slice that matches the video to be intercepted.
In a possible implementation, the start marker information and the end marker information are determined based on the reference time period as follows: calculate a first difference between the start time point of the reference time period and the start time point of the slice time period corresponding to the reference video slice, and determine the start marker information based on the first difference; calculate a second difference between the end time point of the reference time period and the start time point of the slice time period corresponding to the reference video slice, and determine the end marker information based on the second difference.
In a possible implementation, the slice index information corresponding to the reference video slice is generated as index information including information indicating the reference duration, the start marker information, the end marker information, and the storage information of the reference video slice.
Illustratively, assume that the slice time period corresponding to the reference video slice is the time period between the 6th and 12th seconds of the target video, and that the reference time period is the time period between the 8.2nd and 12th seconds. Then the start time point of the reference time period is the 8.2nd second of the target video, the end time point of the reference time period is the 12th second, and the start time point of the slice time period corresponding to the reference video slice is the 6th second. The first difference between the start time point of the reference time period and the start time point of the slice time period corresponding to the reference video slice is 2.2 seconds, and the second difference between the end time point of the reference time period and that start time point is 6 seconds. The start marker information determined based on the first difference is denoted by "start=2200", and the end marker information determined based on the second difference is denoted by "end=6000". Based on the start marker information "start=2200" and the end marker information "end=6000", it can be determined that the partial video slice between the 2.2nd and 6th seconds of the reference video slice matches the video to be intercepted.
Since the reference time period is the time period between the 8.2nd and 12th seconds of the target video, the reference duration is 3.8 seconds. Assuming that the reference video slice is video slice 2.ts, the slice index information generated based on the reference duration, the start marker information, the end marker information, and the storage information of the reference video slice is as follows:
#EXTINF:3.800,
2.ts?start=2200&end=6000
wherein "extinn: 3.800" is used to indicate that the reference duration is 3.8 seconds, "2. ts" represents the storage information of the reference video slice, and "start ═ 2200" represents the start mark information; "end 6000" indicates the end marker information.
It should be noted that the number of reference video slices may be one or two; when it is two, slice index information is generated for each of the two reference video slices.
Step II: generate the slice index information corresponding to the other target video slices based on the storage information and the durations of the other target video slices.
The implementation of step II refers to the process of generating the slice index information corresponding to the at least one target video slice in step 1 under condition 1, and is not described herein again.
It should be noted that the number of other target video slices may be one or more; when it is more than one, slice index information is generated for each of the other target video slices.
Step III: generate the video index information corresponding to the video to be intercepted based on the slice index information corresponding to the reference video slice and the slice index information corresponding to the other target video slices.
In a possible implementation, step III is implemented as follows: arrange the slice index information corresponding to the reference video slice and the slice index information corresponding to the other target video slices in a reference order to form the detail index information, and then generate the video index information corresponding to the video to be intercepted based on the detail index information. The reference order is the arrangement order of the target video slices to which the slice index information corresponds, which is determined by the order of the slice time periods corresponding to the target video slices.
In a possible implementation, when the video index information corresponding to the video to be intercepted includes header information, detail index information, and trailer information, the video index information is obtained by adding the header information and the trailer information to the detail index information.
For example, as shown in fig. 11, assume that the duration of each video slice corresponding to the target video is 6 seconds, and that the interception information includes an interception start time point at the 8.2nd second of the target video and an interception end time point at the 26.5th second, so that the interception time period is the time period between the 8.2nd and 26.5th seconds of the target video. That is, the interception manner indicated by the interception information is to take the video content from the 8.2nd to the 26.5th second of the target video as a new video (i.e., the video to be intercepted). In fig. 11, the video to be intercepted, composed of the video content from the 8.2nd to the 26.5th second of the target video, is identified by "intercepted vid".
In fig. 11, 1.ts, 2.ts, 3.ts, 4.ts, and 5.ts indicate the first to fifth video slices corresponding to the target video, respectively. Because the duration of each video slice is 6 seconds, the slice time period corresponding to video slice 1.ts is the time period between the 0th and 6th seconds of the target video, that of video slice 2.ts is between the 6th and 12th seconds, that of video slice 3.ts is between the 12th and 18th seconds, that of video slice 4.ts is between the 18th and 24th seconds, and that of video slice 5.ts is between the 24th and 30th seconds.
In the case corresponding to fig. 11, the at least one target video slice corresponding to the video to be intercepted consists of video slice 2.ts, video slice 3.ts, video slice 4.ts, and video slice 5.ts. The slice time periods corresponding to video slice 2.ts and video slice 5.ts are partially within the interception time period (i.e., the time period between the 8.2nd and 26.5th seconds of the target video), while the slice time periods corresponding to video slice 3.ts and video slice 4.ts are entirely within it. That is, video slice 2.ts and video slice 5.ts are reference video slices, and video slice 3.ts and video slice 4.ts are other target video slices. In this case, the generated video index information corresponding to the video to be intercepted is as shown in fig. 12.
The video index information shown in fig. 12 includes header information 1201, slice index information 1202 corresponding to the reference video slices, slice index information 1203 corresponding to the other target video slices, and trailer information 1204. The slice index information 1202 and the slice index information 1203 together constitute the detail index information.
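The fig. 11 example can be reproduced with the following sketch, which emits whole entries for slices entirely within the interception time period and marker-bearing entries for the reference video slices; the 6-second slices, the slice naming, and the "?start=…&end=…" marker form are assumptions carried over from the examples above.

```python
# Sketch: build the detail index entries for a clip of the target video.
# Slices entirely inside the interception time period get plain entries;
# reference slices (partially inside) get start/end marker information.
def clip_entries(clip_start, clip_end, slice_dur=6.0, total=30):
    entries = []
    for i in range(total):
        s, e = i * slice_dur, (i + 1) * slice_dur
        if s >= clip_end or e <= clip_start:
            continue                                   # no overlap with the clip
        rs, re = max(s, clip_start), min(e, clip_end)  # reference time period
        if rs == s and re == e:                        # whole slice inside
            entries.append(f"#EXTINF:{e - s:.3f},\n{i + 1}.ts")
        else:                                          # reference video slice
            start_ms = int(round((rs - s) * 1000))
            end_ms = int(round((re - s) * 1000))
            entries.append(
                f"#EXTINF:{re - rs:.3f},\n{i + 1}.ts?start={start_ms}&end={end_ms}"
            )
    return entries

entries = clip_entries(8.2, 26.5)  # the clip of fig. 11
```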
According to the implementation process under condition 2, when the slice time period corresponding to a target video slice is only partially within the interception time period, that target video slice is not used in its entirety as part of the video resource corresponding to the video to be intercepted, so the problem of interception precision is avoided. Illustratively, the start marker information and the end marker information determined based on the reference time period are represented by start and end tags in the M3U8 file, so as to accurately determine which part of the target video slice matches the video to be intercepted. In this way, the interception precision problem is eliminated without reducing encoding compression efficiency or increasing cost.
It should be noted that condition 2 above is described by taking as an example the case in which the at least one target video slice includes both a reference video slice and other target video slices, but the embodiments of the present application are not limited thereto. Illustratively, the at least one target video slice may include only the reference video slice and no other target video slices; in this case, the video index information corresponding to the video to be intercepted includes only the slice index information corresponding to the reference video slice.
Case two: the interception information includes an interception start time point, an interception end time point, and an even number of interception intermediate time points.
Case two describes the situation in which the video to be intercepted is formed by splicing at least two discontinuous parts of the target video. In a possible implementation, under case two and referring to fig. 13, the process of generating the video index information corresponding to the video to be intercepted based on the interception information and the video resource corresponding to the target video includes the following steps 205A to 205D:
step 205A: at least two sub-clipping time periods are determined based on the clipping start time point, the clipping end time point and the even number of clipping intermediate time points.
Arrange the interception start time point, the even number of interception intermediate time points, and the interception end time point in chronological order, and group the arranged interception time points in pairs starting from the first interception time point; the time period between the two interception time points in each pair is one sub-interception time period.
For example, assume the interception start time point is the 8.2nd second of the target video, the interception intermediate time points are the 16.2nd and 20.5th seconds, and the interception end time point is the 26.5th second. Arranged in chronological order, the interception time points are the 8.2nd, 16.2nd, 20.5th, and 26.5th seconds of the target video. After grouping them in pairs from the first interception time point, two sub-interception time periods are obtained: the time period between the 8.2nd and 16.2nd seconds of the target video, and the time period between the 20.5th and 26.5th seconds.
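Step 205A can be sketched as follows; the function is an illustrative assumption, not the patent's implementation.

```python
# Sketch of step 205A: sort the interception time points chronologically
# and pair them off; each pair is one sub-interception time period.
def sub_periods(start, end, middles):
    points = sorted([start, end, *middles])
    # group the ordered points two by two from the first one
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

# start 8.2 s, intermediate points 16.2 s and 20.5 s, end 26.5 s
periods = sub_periods(8.2, 26.5, [16.2, 20.5])
```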
Step 205B: determine, based on the at least one video slice, the video slice group corresponding to each of the at least two sub-interception time periods, where the video slice group corresponding to any sub-interception time period consists of the video slices whose corresponding slice time periods are partially or entirely within that sub-interception time period.
For any sub-interception time period, the video slices whose slice time periods are partially or entirely within it can be determined by comparing the sub-interception time period with the slice time period corresponding to each video slice; these video slices form the video slice group corresponding to that sub-interception time period. In this way, the video slice groups corresponding to the at least two sub-interception time periods can be determined, after which step 205C is performed.
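The overlap comparison of step 205B can be sketched as follows, again assuming fixed 6-second slices named as in fig. 6.

```python
# Sketch of step 205B: a video slice belongs to a sub-interception time
# period's group when its slice time period overlaps that sub-period
# partially or entirely. Assumes fixed 6-second slices named 1.ts, 2.ts, ...
def slice_group(sub_start, sub_end, slice_dur=6.0, total=30):
    group = []
    for i in range(total):
        s, e = i * slice_dur, (i + 1) * slice_dur  # slice time period
        if s < sub_end and e > sub_start:          # the time periods overlap
            group.append(f"{i + 1}.ts")
    return group

group_a = slice_group(8.2, 16.2)   # sub-interception time period A
group_b = slice_group(20.5, 26.5)  # sub-interception time period B
```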
Step 205C: generate the reference index information corresponding to each of the at least two video slice groups, where the reference index information corresponding to any video slice group includes the slice index information corresponding to the video slices constituting that group.
Since the reference index information corresponding to any video slice group includes the slice index information corresponding to the video slices constituting it, the reference index information can be obtained by generating that slice index information. The video slices constituting any video slice group are one or more consecutive video slices among the at least one video slice corresponding to the target video. The process of generating the slice index information corresponding to the video slices constituting a video slice group refers to the process of generating the slice index information corresponding to the at least one target video slice corresponding to the video to be intercepted, and is not repeated here.
After the slice index information corresponding to the video slices constituting any video slice group is generated, it is arranged according to the order of the slice time periods corresponding to those video slices, yielding the reference index information corresponding to that video slice group. In this way, the reference index information corresponding to the at least two video slice groups can be generated, after which step 205D is performed.
Step 205D: generate the video index information corresponding to the video to be intercepted based on the reference index information corresponding to each of the at least two video slice groups and splicing marker information.
Since the at least two sub-interception time periods corresponding to the at least two video slice groups are discontinuous, splicing marker information needs to be added to the reference index information corresponding to the at least two video slice groups in the process of generating the video index information corresponding to the video to be intercepted, so that a playback terminal can play the video to be intercepted by splicing the discontinuous parts according to that video index information.
In a possible implementation, the video index information corresponding to the video to be intercepted is generated as follows: add splicing marker information between the reference index information corresponding to every two adjacent video slice groups to obtain the detail index information; then generate the video index information corresponding to the video to be intercepted based on the detail index information. Adjacent video slice groups are the video slice groups corresponding to two adjacent sub-interception time periods.
In a possible implementation, when the video index information corresponding to the video to be intercepted includes header information, detail index information, and trailer information, the video index information is obtained by adding the header information and the trailer information to the detail index information.
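Step 205D can be sketched as follows; the helper is a hypothetical illustration, with the splicing marker information represented by the EXT-X-DISCONTINUITY tag.

```python
# Sketch of step 205D: join the reference index information of adjacent
# video slice groups with splicing marker information, then wrap the
# detail index information with header and trailer information.
def splice_groups(header, group_infos, trailer):
    detail = []
    for i, info in enumerate(group_infos):
        if i > 0:
            detail.append("#EXT-X-DISCONTINUITY")  # splicing marker information
        detail.extend(info)
    return "\n".join(header + detail + trailer)

index = splice_groups(
    ["#EXTM3U", "#EXT-X-VERSION:3"],
    [["#EXTINF:3.800,", "2.ts"], ["#EXTINF:3.500,", "4.ts"]],
    ["#EXT-X-ENDLIST"],
)
```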
Illustratively, as shown in fig. 14, assume that the duration of each video slice corresponding to the target video is 6 seconds, the interception start time point included in the interception information is the 8.2nd second of the target video, there are two interception intermediate time points, at the 16.2nd and 20.5th seconds, and the interception end time point is the 26.5th second. In this case there are two sub-interception time periods: the time period between the 8.2nd and 16.2nd seconds of the target video (denoted sub-interception time period A) and the time period between the 20.5th and 26.5th seconds (denoted sub-interception time period B). In the interception example shown in fig. 14, the video to be intercepted is obtained by splicing the sub-video intercepted according to sub-interception time period A with the sub-video intercepted according to sub-interception time period B; in fig. 14, the video to be intercepted is represented by "intercepted vid".
In the interception example shown in fig. 14, the video slice group A corresponding to sub-interception time period A consists of video slice 2.ts and video slice 3.ts, and the video slice group B corresponding to sub-interception time period B consists of video slice 4.ts and video slice 5.ts. In the process of generating the reference index information A corresponding to video slice group A, since the slice time periods corresponding to video slice 2.ts and video slice 3.ts are each partially within sub-interception time period A, the slice index information corresponding to video slice 2.ts and to video slice 3.ts in the generated reference index information A each include start marker information and end marker information. Similarly, in the process of generating the reference index information B corresponding to video slice group B, since the slice time periods corresponding to video slice 4.ts and video slice 5.ts are each partially within sub-interception time period B, the slice index information corresponding to video slice 4.ts and to video slice 5.ts each include start marker information and end marker information.
Illustratively, in the interception example shown in fig. 14, the video index information corresponding to the video to be intercepted is as shown in fig. 15. It includes header information 1501, reference index information A 1502 corresponding to video slice group A (composed of video slice 2.ts and video slice 3.ts), splicing marker information 1503, reference index information B 1504 corresponding to video slice group B (composed of video slice 4.ts and video slice 5.ts), and trailer information 1505. The splicing marker information 1503 is represented by "EXT-X-DISCONTINUITY".
It should be noted that the above description takes the case in which the number of video resources corresponding to the target video is one as an example, but the embodiments of the present application are not limited thereto. In a possible implementation, the number of video resources corresponding to the target video is a reference number; illustratively, the reference number is a number not less than 1. Different video resources have different definitions. That is, the number of video resources corresponding to the target video is the same as the number of definitions into which the target video needs to be transcoded. The types and number of definitions to be transcoded are set according to experience or flexibly adjusted according to the application scenario, which is not limited in the embodiments of the present application.
For example, suppose the definitions required for transcoding are 1080P (a video display format), 720P, 540P and 360P, in this case, the number of video resources corresponding to the target video is 4. As shown in fig. 16, the video assets corresponding to the target video include a video asset a having a definition of 1080P, a video asset B having a definition of 720P, a video asset C having a definition of 540P, and a video asset D having a definition of 360P.
And for the condition that the number of the video resources corresponding to the target video is the reference number, generating video index information corresponding to the intercepted video to be intercepted based on the interception information and each video resource corresponding to the target video, wherein different video index information is used for indexing the video resources with different definitions corresponding to the intercepted video to be intercepted. That is to say, the number of the video index information corresponding to the intercepted video to be intercepted is also the reference number, and any video index information is generated based on the intercepted information and the video resource with any definition corresponding to the target video.
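The per-definition generation described above can be sketched as a simple loop over the transcoded assets. This is an illustrative sketch only: `generate_index` is a stand-in for the single-asset procedure of step 205 (here reduced to an overlap test), and the asset labels, paths, and slice time periods are assumed.

```python
def generate_index(clip_period, slices):
    # Stand-in for the single-asset index generation of step 205:
    # keep every slice whose slice time period overlaps the clip period.
    start, end = clip_period
    return [uri for uri, t0, t1 in slices if t0 < end and t1 > start]

def build_indexes_per_definition(clip_period, assets):
    # One piece of video index information per definition; each is built
    # from the same interception information and that definition's slices.
    return {d: generate_index(clip_period, s) for d, s in assets.items()}

assets = {  # illustrative paths and slice time periods (seconds)
    "1080P": [("a/1.ts", 0.0, 5.2), ("a/2.ts", 5.2, 11.2)],
    "720P":  [("b/1.ts", 0.0, 5.2), ("b/2.ts", 5.2, 11.2)],
}
indexes = build_indexes_per_definition((2.2, 8.0), assets)
```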
It should be noted that, for the video resource of any definition corresponding to the target video, the process of generating the corresponding video index information based on the interception information and that video resource is the same as the process described above for the case where the number of video resources corresponding to the target video is one, and details are not repeated here.
In a possible implementation manner, for a case that the number of video resources corresponding to a target video is a reference number, the reference number of video resources corresponding to the target video is obtained by transcoding based on a reference frame alignment manner, and video segments located at the same arrangement position and included in different video resources have the same duration.
In an exemplary embodiment, for the case that the video slices in the video resources corresponding to the target video are obtained by dividing based on video I frames, reference frame alignment refers to video I frame alignment. After the target video is transcoded in a video-I-frame-aligned manner, the positions of the video I frames in the video resources with different definitions are the same, so that the video slices at the same arrangement position in different video resources have the same duration. This ensures that the total playing durations of the videos intercepted from the video resources with different definitions corresponding to the target video are consistent.
For example, as shown in fig. 16, the video slice 1.ts ranked at the 1st position, the video slice 2.ts ranked at the 2nd position, the video slice 40.ts ranked at the 40th position, and the video slice 41.ts ranked at the 41st position, which are included in each of the 4 video resources with different definitions corresponding to the target video, have durations of 5.2 seconds, 6.0 seconds, 4.9 seconds, and 5 seconds, respectively. It should be noted that the video slices at the same arrangement position included in the video resources with different definitions have different definitions: for example, the definition of the video slice 1.ts ranked at the 1st position included in the video resource A is 1080P, that included in the video resource B is 720P, that included in the video resource C is 540P, and that included in the video resource D is 360P.
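The alignment property can be checked mechanically: slices at the same arrangement position must have identical durations in every definition. A minimal sketch, using the first two slice durations from the fig. 16 example; the function name and data layout are assumptions.

```python
def is_frame_aligned(assets):
    # Collect each definition's list of slice durations; alignment holds
    # when every definition yields the same duration list.
    duration_rows = [[dur for _, dur in slices] for slices in assets.values()]
    return all(row == duration_rows[0] for row in duration_rows)

aligned = {   # first two slice durations from the fig. 16 example
    "1080P": [("1.ts", 5.2), ("2.ts", 6.0)],
    "360P":  [("1.ts", 5.2), ("2.ts", 6.0)],
}
misaligned = {  # transcoded without reference frame alignment
    "1080P": [("1.ts", 5.2), ("2.ts", 6.0)],
    "360P":  [("1.ts", 5.0), ("2.ts", 6.2)],
}
```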
In a possible implementation manner, after video index information corresponding to an intercepted video to be intercepted is generated, the video index information and a video identifier of the intercepted video to be intercepted are correspondingly stored, so that when a playing request which is sent by a playing terminal and carries the video identifier of the intercepted video to be intercepted is obtained, the video index information corresponding to the intercepted video to be intercepted is directly extracted based on the video identifier of the intercepted video to be intercepted, the video index information corresponding to the intercepted video to be intercepted is sent to the playing terminal, and the efficiency is improved. It should be noted that the video identifier of the captured video to be captured is different from the video identifier of the target video, so as to facilitate distinguishing.
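The store-then-look-up flow above amounts to keying the generated index by the clip's own video identifier, which differs from the target video's identifier. A minimal in-memory sketch; the store and function names are illustrative, not from the patent.

```python
# In-memory store mapping a clip's video identifier to its index information.
index_store = {}

def save_index(clip_video_id, index_info):
    # Correspondingly store the generated video index information.
    index_store[clip_video_id] = index_info

def handle_play_request(clip_video_id):
    # Directly extract the pre-generated index instead of rebuilding it,
    # which is what improves efficiency at play time.
    return index_store.get(clip_video_id)

save_index("clip-001", "#EXTM3U\n...")
```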
In a possible implementation manner, after the server generates the video index information corresponding to the intercepted video to be intercepted, the intercepted video to be intercepted can be brought online, and the terminal can play it. Referring to fig. 17, the process of playing the intercepted video to be intercepted by the terminal includes the following steps 206a to 206d:
step 206 a: and the terminal sends a playing request aiming at the intercepted video to be intercepted to the server.
When the intercepted video to be intercepted needs to be played, the terminal sends a playing request aiming at the intercepted video to be intercepted to the server. In an exemplary embodiment, the playing request for the intercepted video to be intercepted carries the video identifier of the intercepted video to be intercepted, so that the server can quickly query the video index information corresponding to the intercepted video to be intercepted according to the video identifier of the intercepted video to be intercepted.
Step 206 b: and the server returns video index information corresponding to the intercepted video to be intercepted based on the playing request.
After receiving a playing request aiming at the intercepted video to be intercepted, sent by the terminal, the server inquires video index information corresponding to the intercepted video to be intercepted based on the playing request, and then returns the video index information corresponding to the intercepted video to be intercepted to the terminal.
Step 206 c: and the terminal acquires video resources corresponding to the intercepted video to be intercepted based on the video index information corresponding to the intercepted video to be intercepted, which is returned by the server.
After the server returns the video index information corresponding to the intercepted video to be intercepted to the terminal, the terminal acquires the video index information corresponding to the intercepted video to be intercepted. The video index information corresponding to the intercepted video to be intercepted is used for indexing the video resources corresponding to the intercepted video to be intercepted, so that the terminal can acquire the video resources corresponding to the intercepted video to be intercepted based on the video index information corresponding to the intercepted video to be intercepted. And the video resource corresponding to the intercepted video to be intercepted is the playing basis of the intercepted video to be intercepted.
In a possible implementation manner, the video index information corresponding to the intercepted video to be intercepted includes slice index information corresponding to at least one associated video slice related to the intercepted video to be intercepted, and the video resource corresponding to the intercepted video to be intercepted includes a sub video resource corresponding to the at least one associated video slice. That is to say, the process of obtaining the video resource corresponding to the intercepted video to be intercepted is a process of obtaining the sub-video resource corresponding to at least one associated video fragment.
The at least one associated video slice refers to the video slices, among the at least one video slice corresponding to the target video, that are related to the intercepted video to be intercepted. Which video slices constitute the at least one associated video slice differs according to the content included in the interception information. Illustratively, when the content included in the interception information is the first case in step 205, that is, the interception information includes an interception start time point and an interception end time point and does not include an interception intermediate time point, the at least one associated video slice is the at least one target video slice whose corresponding slice time period is partially or completely within the interception time period formed by the interception start time point and the interception end time point.
Exemplarily, when the content included in the interception information is the second case in step 205, that is, the interception information includes an interception start time point, an interception end time point, and an even number of interception intermediate time points, at least one associated video slice is each video slice constituting a video slice group corresponding to each of the at least two sub-interception time periods.
According to the content of step 205, no matter what the at least one associated video slice is, the slice index information corresponding to any associated video slice in the at least one associated video slice includes storage information of the any associated video slice. Therefore, the terminal can acquire the at least one associated video fragment according to the storage information in the fragment index information corresponding to the at least one associated video fragment included in the video index information corresponding to the intercepted video to be intercepted.
After obtaining the at least one associated video slice, the terminal further obtains the sub-video resource corresponding to the at least one associated video slice, thereby obtaining the video resource of the intercepted video to be intercepted. Illustratively, for any associated video slice, the sub-video resource corresponding to that associated video slice refers to the part of that associated video slice that matches the intercepted video to be intercepted. As can be seen from the content in step 205, for any associated video slice, the slice index information corresponding to that associated video slice may or may not include start marker information and end marker information.
In one possible implementation manner, the manner of obtaining the sub-video resource corresponding to any associated video fragment is as follows: in response to that the fragment index information corresponding to any associated video fragment does not include the start marker information and the end marker information, taking any associated video fragment as a whole as a sub-video resource corresponding to any associated video fragment; and in response to that the fragment index information corresponding to any associated video fragment comprises the start mark information and the end mark information, determining a part of video fragments matched with the intercepted video to be intercepted in any associated video fragment based on the start mark information and the end mark information, and taking the part of video fragments as sub-video resources corresponding to any associated video fragment.
When the fragment index information corresponding to any associated video fragment does not include the start marker information and the end marker information, it is indicated that the whole associated video fragment is matched with the intercepted video to be intercepted, and at the moment, the whole associated video fragment is directly used as a sub-video resource corresponding to any associated video fragment. When the fragment index information corresponding to any associated video fragment includes the start mark information and the end mark information, it is indicated that only part of the video fragments in any associated video fragment are matched with the intercepted video to be intercepted, and the part of the video fragments determined based on the start mark information and the end mark information is used as the sub-video resources corresponding to any associated video fragment.
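The two cases above can be sketched as a single selection function over a slice's decoded frames. This is an assumed sketch: constant frame rate is presumed for the frame-index arithmetic, and the function and parameter names are not from the patent.

```python
def sub_resource(decoded_frames, fps, start=None, end=None):
    """Select the part of a decoded slice that belongs to the clip.

    `start`/`end` are the times (in seconds, relative to the slice) carried
    by the start/end marker information; both are None when the slice index
    carries no markers, in which case the whole slice is the sub-video
    resource.  A constant frame rate is assumed.
    """
    if start is None and end is None:
        return decoded_frames                        # whole slice matches
    i = 0 if start is None else int(start * fps)
    j = len(decoded_frames) if end is None else int(end * fps)
    return decoded_frames[i:j]                       # partial video slice

frames = list(range(150))                  # a 6-second slice at 25 fps
whole = sub_resource(frames, 25)           # no markers: whole slice
part = sub_resource(frames, 25, 2.2, 6.0)  # markers at 2.2 s and 6 s
```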
Based on the mode, the sub-video resources corresponding to the associated video fragments can be obtained, and then the video resources corresponding to the intercepted video to be intercepted are formed by the sub-video resources corresponding to the associated video fragments. It should be noted that, in the process of forming the video resource corresponding to the intercepted video to be intercepted by the sub-video resources corresponding to the associated video fragments, the sub-video resources corresponding to the associated video fragments are sequentially arranged according to the sequence of the fragment time periods corresponding to the associated video fragments, so as to obtain the video resource corresponding to the intercepted video to be intercepted.
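The concatenation-in-order step can be shown in two lines; the start times and frame labels below are purely illustrative.

```python
# Sub-video resources paired with their slices' period start times,
# possibly obtained out of order.
sub_resources = [
    (5.2, ["frame-3", "frame-4"]),  # sub-resource of the later slice
    (0.0, ["frame-1", "frame-2"]),  # sub-resource of the earlier slice
]
# Arrange sequentially by slice time period, then flatten into the
# video resource corresponding to the intercepted video.
clip_resource = [frame
                 for _, frames in sorted(sub_resources)
                 for frame in frames]
```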
Step 206 d: and the terminal plays the intercepted video to be intercepted based on the video resource corresponding to the intercepted video to be intercepted.
The process of playing the intercepted video to be intercepted based on the video resource corresponding to the intercepted video to be intercepted refers to a process of sequentially rendering and outputting video pictures of the sub-video resources corresponding to the associated video fragments. In a possible implementation manner, since the associated video slice is obtained after transcoding, the video picture of the sub-video resource corresponding to the associated video slice is a video picture obtained after decoding. In one possible implementation, after obtaining the at least one associated video slice, the at least one associated video slice is decoded. If the sub-video resource corresponding to any associated video fragment is the whole associated video fragment, the process of rendering and outputting the video picture of the sub-video resource corresponding to any associated video fragment is as follows: and sequentially rendering and outputting each frame of video picture obtained after decoding any one of the associated video slices.
If the sub-video resource corresponding to any associated video fragment is a part of the video fragments in any associated video fragment, the process of rendering and outputting the video picture of the sub-video resource corresponding to any associated video fragment is as follows: and determining each frame target video picture belonging to the sub video resource corresponding to any one associated video fragment from each frame video picture obtained after decoding any one associated video fragment, and rendering and outputting each frame target video picture in sequence. The sub-video resources corresponding to any one of the associated video slices are determined based on the start marker information and the end marker information, and the process of rendering and outputting the target video pictures of each frame can be regarded as rendering and outputting the video pictures from the time point indicated by the start marker information until the time point indicated by the end marker information stops rendering and outputting the video pictures.
Exemplarily, for the interception example shown in fig. 11, in the process of playing the intercepted video to be intercepted, the playing terminal starts to render and output video pictures at the 2.2nd second of the video slice 2.ts and stops at the 6th second of the video slice 2.ts; it then renders and outputs all the video pictures corresponding to the video slice 3.ts and the video slice 4.ts; finally, it renders and outputs video pictures from the 0th second of the video slice 5.ts until the 2.5th second of the video slice 5.ts. In this way, the complete playing of the intercepted video to be intercepted is finished.
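The rendered span of each slice in the example above adds up to the clip's total playing duration. A small sanity-check sketch: the 6-second duration of 2.ts follows from its end marker, while the 5-second durations assumed for 3.ts, 4.ts, and 5.ts are illustrative, not stated by the patent.

```python
def clip_duration(segments):
    """Sum the rendered span of each associated slice.

    Each entry is (slice_duration, start, end), where start/end come from
    the start/end marker information and default to the slice boundaries
    when the corresponding marker is absent.
    """
    total = 0.0
    for dur, start, end in segments:
        total += (dur if end is None else end) - (0.0 if start is None else start)
    return total

total = clip_duration([
    (6.0, 2.2, 6.0),    # 2.ts: render from 2.2 s to 6 s
    (5.0, None, None),  # 3.ts: whole slice (5 s assumed)
    (5.0, None, None),  # 4.ts: whole slice (5 s assumed)
    (5.0, 0.0, 2.5),    # 5.ts: render from 0 s to 2.5 s
])
```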
The method for generating video index information can be applied to the application scenario of intercepting short videos from long videos. The short-video information stream is one of the most popular internet products at present, with active users reaching 800 million in 2019, and a large number of new short videos are produced every day. Many short videos come from clips intercepted from long videos. Based on the method provided by the embodiment of the application, an efficient and flexible scheme for generating short videos by intercepting long videos online can be realized without re-transcoding or re-auditing; the resource consumption caused by repeatedly uploading, transcoding and storing short videos related to long videos can be avoided, bandwidth resources can be saved, and problems such as the long time required to bring short videos online can be solved.
In the embodiment of the application, video index information corresponding to the intercepted video is generated directly based on the intercepted information aiming at the target video and the video resource corresponding to the target video. Based on the mode, the generation process of the video index information corresponding to the intercepted video does not need uploading of the intercepted video, transcoding of the intercepted video is not needed, time consumption is short, the generation efficiency of the video index information corresponding to the intercepted video is high, and the time required for online of the intercepted video is favorably shortened.
In addition, after the intercepted video is uploaded, the intercepted video is often required to be manually checked again, and in the mode that the video index information corresponding to the intercepted video is generated directly based on the intercepted information of the target video and the video resource corresponding to the target video, the intercepted video is not required to be uploaded, so that extra manual checking is not required, and the labor cost is favorably saved.
The embodiment of the present application provides a method for generating video index information, where the method may be executed by a terminal, or may be executed by a server, or may be executed jointly by the terminal and the server, which is not limited in the embodiment of the present application. In the embodiment of the present application, the method for generating video index information is described by taking execution by a computer device as an example, where the computer device is a server or a terminal. As shown in fig. 18, the method for generating video index information according to the embodiment of the present application includes the following steps 1801 and 1802:
in step 1801, the computer device obtains interception information for the target video, where the interception information is used to indicate a relationship between an intercepted video to be intercepted and the target video.
For the case that the computer device is a terminal, the capturing information for the target video in step 1801 may be the capturing information obtained by the terminal in real time, or may be the capturing information stored in advance by the terminal, which is not limited in this embodiment of the present application. For the condition that the interception information aiming at the target video is the interception information obtained by the terminal in real time, the process that the terminal obtains the interception information aiming at the target video is as follows: responding to the video interception request, and displaying a video interception editing page; and acquiring interception information aiming at the target video on the video interception editing page. The implementation of this process is shown in step 201 and step 202 in the embodiment shown in fig. 2, and is not described here again. In the case where the interception information for the target video is interception information stored in advance by the terminal, the terminal can directly extract the interception information for the target video from the storage space.
In the case that the computer device is a server, the interception information of the target video in step 1801 may be the interception information sent by the terminal to the server, or may be the interception information stored in advance by the server, which is not limited in this embodiment of the present application. In the case where the interception information for the target video is the interception information transmitted from the terminal to the server, the terminal acquires the interception information for the target video according to the implementation manner described in step 201 to step 203 in the embodiment shown in fig. 2 and transmits the interception information for the target video to the server, and thus, the server acquires the interception information for the target video. In the case where the interception information for the target video is the interception information stored in advance by the server, the server can directly extract the interception information for the target video from the storage space.
In step 1802, the computer device generates video index information corresponding to the intercepted video to be intercepted, based on the interception information and the video resource corresponding to the target video, where the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video to be intercepted.
The video resource corresponding to the target video may be stored locally in the computer device, or may be stored in other devices, which is not limited in this embodiment of the present application. For the condition that the video resources corresponding to the target video are stored in the local computer, the computer equipment can directly extract the video resources corresponding to the target video from the local storage; for the case that the video resource corresponding to the target video is stored in other devices, the computer device can acquire the video resource corresponding to the target video by communicating with other devices.
After the video resources corresponding to the target video are obtained, the computer equipment generates video index information corresponding to the intercepted video to be intercepted based on the intercepting information and the video resources corresponding to the target video. The implementation manner of generating, by the computer device, video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video refers to step 205 in the embodiment shown in fig. 2, which is not described herein again.
In the embodiment of the application, the computer device directly generates the video index information corresponding to the intercepted video based on the intercepted information for the target video and the video resource corresponding to the target video. Based on the mode, the generation process of the video index information corresponding to the intercepted video does not need uploading of the intercepted video, transcoding of the intercepted video is not needed, time consumption is short, the generation efficiency of the video index information corresponding to the intercepted video is high, and the time required for online of the intercepted video is favorably shortened.
The embodiment of the application provides a method for generating video index information, which is executed by a terminal. As shown in fig. 19, the method for generating video index information provided in the embodiment of the present application includes the following steps 1901 to 1903:
in step 1901, the terminal displays a video capture edit page in response to the video capture request.
The implementation of step 1901 refers to step 201 in the embodiment shown in fig. 2, and is not described here again.
In step 1902, the terminal obtains, at a video capture editing page, capture information for a target video, where the capture information is used to indicate a relationship between a captured video to be captured and the target video.
The implementation of this step 1902 is shown in step 202 in the embodiment shown in fig. 2, and will not be described herein again.
In step 1903, the terminal sends the interception information for the target video to the server, and the server is configured to generate video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video, where the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video to be intercepted.
The implementation of step 1903 refers to step 203 in the embodiment shown in fig. 2, and is not described here again.
After the terminal sends the interception information aiming at the target video to the server, the server acquires the interception information aiming at the target video, and then generates video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video. The implementation manner of the processing procedure of the server refers to step 204 and step 205 in the embodiment shown in fig. 2, and is not described here again.
In the embodiment of the application, after the terminal sends the interception information aiming at the target video to the server, the server directly generates the video index information corresponding to the intercepted video based on the interception information aiming at the target video and the video resource corresponding to the target video. Based on the mode, the generation process of the video index information corresponding to the intercepted video does not need uploading of the intercepted video, transcoding of the intercepted video is not needed, time consumption is short, the generation efficiency of the video index information corresponding to the intercepted video is high, and the time required for online of the intercepted video is favorably shortened.
Referring to fig. 20, an embodiment of the present application provides an apparatus for generating video index information, where the apparatus includes:
an obtaining unit 2001, configured to obtain interception information for a target video, where the interception information is used to indicate a relationship between an intercepted video to be intercepted and the target video;
the generating unit 2002 is configured to generate video index information corresponding to the intercepted video to be intercepted, based on the interception information and the video resource corresponding to the target video, where the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video to be intercepted.
In one possible implementation manner, the capture information includes a capture start time point and a capture end time point, the video resource corresponding to the target video includes at least one video slice corresponding to the target video, and different video slices correspond to different slice time periods; a generating unit 2002, configured to determine, based on the capture start time point and the capture termination time point, at least one target video segment corresponding to the captured video to be captured in at least one video segment, where a segment time period corresponding to any target video segment is partially or completely within a capture time period formed by the capture start time point and the capture termination time point; and generating video index information corresponding to the intercepted video to be intercepted based on at least one target video fragment.
In a possible implementation manner, the generating unit 2002 is further configured to, in response to that all the segment time periods corresponding to the at least one target video segment are within the capturing time period, generate segment index information corresponding to the at least one target video segment based on the storage information of the at least one target video segment and the time length of the at least one target video segment; and generating video index information corresponding to the intercepted video to be intercepted based on the fragment index information corresponding to at least one target video fragment.
In a possible implementation manner, the generating unit 2002 is further configured to, in response to that a slice time period portion corresponding to a reference video slice in the at least one target video slice is within the capturing time period and all slice time periods corresponding to other target video slices are within the capturing time period, generate slice index information corresponding to the reference video slice based on the reference time period within the capturing time period in the slice time period corresponding to the reference video slice and the storage information of the reference video slice; generating fragment index information corresponding to other target video fragments based on the storage information of the other target video fragments and the time lengths of the other target video fragments; generating video index information corresponding to the intercepted video to be intercepted based on the fragment index information corresponding to the reference video fragments and the fragment index information corresponding to the other target video fragments; the reference video slices comprise at least one of a first target video slice and a last target video slice in at least one target video slice, and the at least one target video slice is arranged in sequence according to corresponding slice time periods.
In a possible implementation manner, the generating unit 2002 is further configured to determine, based on the reference time period, a reference duration, start marker information, and end marker information, where the start marker information and the end marker information are used to determine, in the reference video slice, a partial video slice that matches the captured video to be captured; and generating fragment index information corresponding to the reference video fragments based on the reference duration, the start mark information, the end mark information and the storage information of the reference video fragments.
In one possible implementation manner, the interception information includes an interception start time point, an interception end time point and an even number of interception intermediate time points, the video resource corresponding to the target video includes at least one video slice corresponding to the target video, and different video slices correspond to different slice time periods; a generating unit 2002 for determining at least two sub-interception time periods based on the interception start time point, the interception end time point, and the even number of interception intermediate time points; determining video fragment groups corresponding to at least two sub-interception time periods respectively based on at least one video fragment, wherein the video fragment group corresponding to any sub-interception time period consists of video fragments of which the corresponding fragment time period is partially or completely in any sub-interception time period; generating reference index information corresponding to at least two video slice groups respectively, wherein the reference index information corresponding to any video slice group comprises slice index information corresponding to video slices forming any video slice group; and generating video index information corresponding to the intercepted video to be intercepted based on the reference index information and the splicing mark information respectively corresponding to the at least two video slice groups.
In one possible implementation manner, the number of video resources corresponding to the target video is a reference number, and different video resources have different definitions; the number of pieces of video index information corresponding to the intercepted video to be intercepted is also the reference number, and any one piece of video index information is generated based on the interception information and the video resource of one definition corresponding to the target video.
In a possible implementation manner, a reference number of video resources corresponding to a target video are obtained by transcoding based on a reference frame alignment manner, and video slices located at the same arrangement position in different video resources have the same duration.
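The practical consequence of frame-aligned transcoding is that slice durations line up across definitions, so time-based index information computed once applies to every definition. A small sketch of that invariant, with an assumed mapping from definition name to slice-duration list:

```python
def is_frame_aligned(resources):
    """Verify the alignment property across definitions.

    `resources` is an assumed mapping from definition name (e.g.
    "720p") to the ordered list of slice durations of that video
    resource. Frame-aligned transcoding guarantees that slices at the
    same arrangement position have the same duration, which is what
    lets one set of time-based index information serve every
    definition.
    """
    duration_lists = list(resources.values())
    return all(d == duration_lists[0] for d in duration_lists[1:])
```

If this check failed, the same interception time points would select different media across definitions, and per-definition index information could no longer share one time computation.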
In the embodiment of the application, video index information corresponding to the intercepted video is generated directly based on the interception information for the target video and the video resource corresponding to the target video. In this manner, generating the video index information for the intercepted video requires neither uploading nor transcoding the intercepted video, so little time is consumed, the video index information is generated efficiently, and the time required to bring the intercepted video online can be shortened.
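To make the overall generation step concrete: the sketch below builds an HLS-style playlist for the intercepted video directly from the target video's existing slices, with no media uploaded or re-transcoded. The `#EXT-X-CLIP` tag carrying the start/end marker offsets is invented for illustration, since the patent does not name a concrete playlist syntax:

```python
def build_clip_playlist(slices, clip_start, clip_end):
    """Build HLS-style video index information for the intercepted video.

    `slices` is a hypothetical list of (slice_start, slice_end, url)
    tuples describing the target video's already-transcoded slices.
    Fully covered slices are referenced whole; boundary slices get a
    hypothetical #EXT-X-CLIP tag with offsets from the slice start.
    """
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for s_start, s_end, url in slices:
        ref_start = max(s_start, clip_start)
        ref_end = min(s_end, clip_end)
        if ref_start >= ref_end:
            continue  # slice lies outside the interception time period
        if ref_start > s_start or ref_end < s_end:
            # Only part of this slice belongs to the clip: emit marker
            # offsets relative to the slice's own start.
            lines.append("#EXT-X-CLIP:START=%.3f,END=%.3f"
                         % (ref_start - s_start, ref_end - s_start))
        lines.append("#EXTINF:%.3f," % (ref_end - ref_start))
        lines.append(url)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines)
```

Because only this small text file is produced, the clip can go online as soon as the playlist is written, regardless of the clip's length.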
Referring to fig. 21, an embodiment of the present application provides an apparatus for generating video index information, where the apparatus includes:
a display unit 2101 configured to display a video interception editing page in response to a video interception request;
an obtaining unit 2102 configured to obtain, on the video interception editing page, interception information for a target video, where the interception information is used to indicate a relationship between an intercepted video to be intercepted and the target video;
a sending unit 2103, configured to send the interception information for the target video to a server, where the server is configured to generate video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video, where the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used to index the video resource corresponding to the intercepted video to be intercepted.
In one possible implementation manner, the video interception editing page includes a video input control, a dotting control and a confirmation control, where the dotting control is used to specify a time point on a time axis; the obtaining unit 2102 is configured to display selectable candidate videos in response to a trigger instruction of the video input control; display, in response to a selection instruction of a target video in the candidate videos, the target video and the time axis corresponding to the target video in the video interception editing page; determine a designated time point on the time axis corresponding to the target video based on a trigger operation on the dotting control; and acquire, in response to a trigger instruction of the confirmation control, interception information for the target video based on the designated time point.
In a possible implementation manner, the sending unit 2103 is further configured to send a play request for the intercepted video to be intercepted to a server, and the server is configured to return video index information corresponding to the intercepted video to be intercepted based on the play request;
the obtaining unit 2102 is further configured to obtain a video resource corresponding to the intercepted video to be intercepted based on the video index information corresponding to the intercepted video to be intercepted, which is returned by the server;
referring to fig. 22, the apparatus further comprises:
the playing unit 2104 is configured to play the intercepted video to be intercepted based on a video resource corresponding to the intercepted video to be intercepted.
In one possible implementation manner, the video index information corresponding to the intercepted video to be intercepted includes slice index information corresponding to at least one associated video slice related to the intercepted video to be intercepted, and the video resource corresponding to the intercepted video to be intercepted includes a sub video resource corresponding to the at least one associated video slice; the obtaining unit 2102 is further configured to acquire the at least one associated video slice based on the video index information corresponding to the intercepted video to be intercepted; for any associated video slice, in response to the slice index information corresponding to the associated video slice not including start marker information and end marker information, take the whole of the associated video slice as the sub video resource corresponding to the associated video slice; and in response to the slice index information corresponding to the associated video slice including start marker information and end marker information, determine, based on the start marker information and the end marker information, a partial video slice in the associated video slice that matches the intercepted video to be intercepted, and take the partial video slice as the sub video resource corresponding to the associated video slice.
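A sketch of this client-side resolution step, modelling a downloaded slice as a list of (timestamp, frame) pairs with timestamps relative to the slice start; the entry shape with optional `start_marker`/`end_marker` offsets is an assumption:

```python
def resolve_sub_resources(entries, download):
    """Turn slice index entries into the clip's sub video resources.

    `entries` is a hypothetical list of dicts with a "url" and optional
    "start_marker"/"end_marker" offsets (seconds from the slice start);
    `download(url)` returns a slice modelled as a list of
    (timestamp, frame) pairs. Entries without markers contribute the
    whole slice; entries with markers contribute only the matching
    partial slice.
    """
    resources = []
    for entry in entries:
        frames = download(entry["url"])
        if "start_marker" not in entry and "end_marker" not in entry:
            resources.append(frames)  # whole slice is the sub resource
            continue
        start = entry.get("start_marker", 0.0)
        end = entry.get("end_marker", float("inf"))
        resources.append([(t, f) for t, f in frames if start <= t < end])
    return resources
```

In a real player the trimming would happen at decode time rather than on frame lists, but the marker semantics are the same.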
In the embodiment of the application, after the terminal sends the interception information for the target video to the server, the server generates the video index information corresponding to the intercepted video directly based on the interception information and the video resource corresponding to the target video. In this manner, generating the video index information for the intercepted video requires neither uploading nor transcoding the intercepted video, so little time is consumed, the video index information is generated efficiently, and the time required to bring the intercepted video online is shortened.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments, which are not described herein again.
In an exemplary embodiment, a computer device is also provided, the computer device comprising a processor and a memory, the memory having at least one computer program stored therein. The at least one computer program is loaded and executed by one or more processors to implement any of the above-described methods for generating video index information. The computer device may be a server or a terminal. Next, the structures of the server and the terminal will be described separately.
Fig. 23 is a schematic structural diagram of a server according to an embodiment of the present application. The server may vary considerably in configuration and performance, and may include one or more processors (CPUs) 2301 and one or more memories 2302, where the one or more memories 2302 store at least one computer program, and the at least one computer program is loaded and executed by the one or more processors 2301 to implement the video index information generation method provided by the foregoing method embodiments. Of course, the server may also have a wired or wireless network interface, an input/output interface, and other components to facilitate input and output, and the server may also include other components for implementing the functions of the device, which are not described herein again.
Fig. 24 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal may be: smart phones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, smart televisions, smart car-mounted devices, and the like. A terminal may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, etc.
Generally, a terminal includes: a processor 2401 and a memory 2402.
Processor 2401 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 2401 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also referred to as a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 2401 may be integrated with a GPU (Graphics Processing Unit) for rendering and drawing content required to be displayed by the display screen.
Memory 2402 may include one or more computer-readable storage media, which may be non-transitory. In some embodiments, a non-transitory computer readable storage medium in the memory 2402 is used to store at least one computer program for execution by the processor 2401 to implement the method of generating video index information provided by the method embodiments of the present application.
In some embodiments, the terminal may further include: a peripheral interface 2403 and at least one peripheral. The processor 2401, memory 2402 and peripheral interface 2403 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 2403 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 2404, a display screen 2405, a camera assembly 2406, an audio circuit 2407, a positioning assembly 2408 and a power supply 2409.
The peripheral interface 2403 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 2401 and the memory 2402. The radio frequency circuit 2404 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The display screen 2405 is used to display a UI (User Interface). The camera assembly 2406 is used to capture images or video. The audio circuitry 2407 may include a microphone and a speaker. The positioning component 2408 is used to determine the current geographic location of the terminal to implement navigation or LBS (Location Based Service). The power supply 2409 is used to supply power to the various components in the terminal. The power supply 2409 may use alternating current, direct current, disposable batteries, or rechargeable batteries.
In some embodiments, the terminal further includes one or more sensors 2410. The one or more sensors 2410 include, but are not limited to: acceleration sensor 2411, gyro sensor 2412, pressure sensor 2413, fingerprint sensor 2414, optical sensor 2415, and proximity sensor 2416.
Those skilled in the art will appreciate that the configuration shown in fig. 24 is not intended to be limiting, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
In an exemplary embodiment, there is also provided a computer-readable storage medium having at least one computer program stored therein, the at least one computer program being loaded and executed by a processor of a computer device to implement any one of the above-mentioned methods for generating video index information.
In one possible implementation, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or computer program is also provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute any one of the above-mentioned video index information generation methods.
It should be noted that the terms "first", "second", and the like in the description of the present application are used for distinguishing similar objects, and are not necessarily used for describing a particular order or sequence. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. The implementations described in the above exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
It should be understood that reference to "a plurality" herein means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method for generating video index information, the method comprising:
acquiring interception information aiming at a target video, wherein the interception information is used for indicating the relation between an intercepted video to be intercepted and the target video;
and generating video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video, wherein the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used for indexing the video resource corresponding to the intercepted video to be intercepted.
2. The method according to claim 1, wherein the interception information includes an interception start time point and an interception end time point, the video resource corresponding to the target video includes at least one video slice corresponding to the target video, and different video slices correspond to different slice time periods;
the generating video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video includes:
determining, in the at least one video slice based on the interception start time point and the interception end time point, at least one target video slice corresponding to the intercepted video to be intercepted, wherein the slice time period corresponding to any target video slice is partially or completely within an interception time period formed by the interception start time point and the interception end time point;
and generating video index information corresponding to the intercepted video to be intercepted based on the at least one target video slice.
3. The method according to claim 2, wherein the generating video index information corresponding to the intercepted video to be intercepted based on the at least one target video slice comprises:
in response to all the slice time periods corresponding to the at least one target video slice being within the interception time period, generating slice index information corresponding to the at least one target video slice based on the storage information of the at least one target video slice and the duration of the at least one target video slice;
and generating video index information corresponding to the intercepted video to be intercepted based on the slice index information corresponding to the at least one target video slice.
4. The method according to claim 2, wherein the generating video index information corresponding to the intercepted video to be intercepted based on the at least one target video slice comprises:
in response to the slice time period corresponding to a reference video slice in the at least one target video slice being partially within the interception time period and all the slice time periods corresponding to the other target video slices being within the interception time period, generating slice index information corresponding to the reference video slice based on the reference time period within the interception time period in the slice time period corresponding to the reference video slice and the storage information of the reference video slice;
generating slice index information corresponding to the other target video slices based on the storage information of the other target video slices and the durations of the other target video slices;
and generating video index information corresponding to the intercepted video to be intercepted based on the slice index information corresponding to the reference video slice and the slice index information corresponding to the other target video slices;
wherein the reference video slice comprises at least one of a first target video slice and a last target video slice in the at least one target video slice, and the at least one target video slice is arranged in sequence according to the corresponding slice time periods.
5. The method according to claim 4, wherein the generating slice index information corresponding to the reference video slice based on the reference time period within the interception time period in the slice time period corresponding to the reference video slice and the storage information of the reference video slice comprises:
determining a reference duration, start marker information and end marker information based on the reference time period, wherein the start marker information and the end marker information are used for determining, in the reference video slice, a partial video slice that matches the intercepted video to be intercepted;
and generating slice index information corresponding to the reference video slice based on the reference duration, the start marker information, the end marker information and the storage information of the reference video slice.
6. The method according to claim 1, wherein the interception information includes an interception start time point, an interception end time point, and an even number of interception intermediate time points, the video resource corresponding to the target video includes at least one video slice corresponding to the target video, and different video slices correspond to different slice time periods;
the generating video index information corresponding to the intercepted video to be intercepted based on the interception information and the video resource corresponding to the target video includes:
determining at least two sub-interception time periods based on the interception start time point, the interception end time point and the even number of interception intermediate time points;
determining video slice groups corresponding to the at least two sub-interception time periods respectively based on the at least one video slice, wherein the video slice group corresponding to any sub-interception time period consists of video slices of which the corresponding slice time periods are partially or completely in any sub-interception time period;
generating reference index information corresponding to the at least two video slice groups respectively, wherein the reference index information corresponding to any video slice group comprises slice index information corresponding to video slices forming the any video slice group;
and generating video index information corresponding to the intercepted video to be intercepted based on the reference index information and the splicing mark information respectively corresponding to the at least two video slice groups.
7. The method according to any one of claims 1-6, wherein the number of video resources corresponding to the target video is a reference number, and different video resources have different definitions; the number of the video index information corresponding to the intercepted video to be intercepted is the reference number, and any video index information is generated based on the intercepted information and the video resource with any definition corresponding to the target video.
8. The method according to claim 7, wherein a reference number of video resources corresponding to the target video are obtained by transcoding based on a reference frame alignment method, and video slices located at the same arrangement position in different video resources have the same duration.
9. A method for generating video index information, the method comprising:
responding to the video interception request, and displaying a video interception editing page;
acquiring interception information aiming at a target video on the video interception editing page, wherein the interception information is used for indicating the relation between an intercepted video to be intercepted and the target video;
sending the interception information aiming at the target video to a server, wherein the server is used for generating video index information corresponding to the intercepted video to be intercepted based on the interception information and video resources corresponding to the target video, the video resources corresponding to the target video are obtained by transcoding the target video, and the video index information is used for indexing the video resources corresponding to the intercepted video to be intercepted.
10. The method of claim 9, wherein the video interception editing page comprises a video input control, a dotting control and a confirmation control, wherein the dotting control is used for specifying a time point on a time axis; the acquiring, on the video interception editing page, interception information aiming at a target video comprises:
displaying alternative candidate videos in response to a triggering instruction of the video input control;
responding to a selection instruction of a target video in the candidate videos, and displaying the target video and a time axis corresponding to the target video in the video interception editing page;
determining a designated time point on a time axis corresponding to the target video based on the trigger operation aiming at the dotting control;
and responding to a trigger instruction of the confirmation control, and acquiring interception information aiming at the target video based on the designated time point.
11. The method according to claim 9 or 10, wherein after sending the interception information for the target video to a server, the method further comprises:
sending a playing request aiming at the intercepted video to be intercepted to the server, wherein the server is used for returning video index information corresponding to the intercepted video to be intercepted based on the playing request;
acquiring video resources corresponding to the intercepted video to be intercepted based on the video index information corresponding to the intercepted video to be intercepted, which is returned by the server;
and playing the intercepted video to be intercepted based on the video resource corresponding to the intercepted video to be intercepted.
12. The method according to claim 11, wherein the video index information corresponding to the intercepted video to be intercepted includes slice index information corresponding to at least one associated video slice related to the intercepted video to be intercepted, and the video resource corresponding to the intercepted video to be intercepted includes a sub video resource corresponding to the at least one associated video slice; the acquiring the video resource corresponding to the intercepted video to be intercepted based on the video index information corresponding to the intercepted video to be intercepted, which is returned by the server, includes:
acquiring the at least one associated video slice based on the video index information corresponding to the intercepted video to be intercepted;
for any associated video slice, in response to the slice index information corresponding to the associated video slice not including start marker information and end marker information, taking the associated video slice as a whole as the sub video resource corresponding to the associated video slice;
and in response to the slice index information corresponding to any associated video slice including start marker information and end marker information, determining, based on the start marker information and the end marker information, a partial video slice in the associated video slice that matches the intercepted video to be intercepted, and taking the partial video slice as the sub video resource corresponding to the associated video slice.
13. An apparatus for generating video index information, the apparatus comprising:
an acquisition unit, used for acquiring interception information aiming at a target video, wherein the interception information is used for indicating the relation between an intercepted video to be intercepted and the target video;
the generating unit is used for generating video index information corresponding to the intercepted video to be intercepted based on the intercepting information and the video resource corresponding to the target video, the video resource corresponding to the target video is obtained by transcoding the target video, and the video index information is used for indexing the video resource corresponding to the intercepted video to be intercepted.
14. An apparatus for generating video index information, the apparatus comprising:
the display unit is used for responding to the video interception request and displaying a video interception editing page;
an obtaining unit, configured to obtain, on the video interception editing page, interception information for the target video, where the interception information is used to indicate a relationship between an intercepted video to be intercepted and the target video;
the sending unit is used for sending the interception information aiming at the target video to a server, the server is used for generating video index information corresponding to the intercepted video to be intercepted based on the interception information and video resources corresponding to the target video, the video resources corresponding to the target video are obtained by transcoding the target video, and the video index information is used for indexing the video resources corresponding to the intercepted video to be intercepted.
15. A computer device comprising a processor and a memory, wherein at least one computer program is stored in the memory, and wherein the at least one computer program is loaded and executed by the processor to implement the method for generating video index information according to any one of claims 1 to 8 or the method for generating video index information according to any one of claims 9 to 12.
CN202110089220.6A 2021-01-22 2021-01-22 Video index information generation method and device and computer equipment Active CN114827753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110089220.6A CN114827753B (en) 2021-01-22 2021-01-22 Video index information generation method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110089220.6A CN114827753B (en) 2021-01-22 2021-01-22 Video index information generation method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN114827753A true CN114827753A (en) 2022-07-29
CN114827753B CN114827753B (en) 2023-10-27

Family

ID=82524602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110089220.6A Active CN114827753B (en) 2021-01-22 2021-01-22 Video index information generation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN114827753B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140082054A1 (en) * 2012-09-14 2014-03-20 Canon Kabushiki Kaisha Method and device for generating a description file, and corresponding streaming method
CN104244028A (en) * 2014-06-18 2014-12-24 华为技术有限公司 Content distribution method, device and system based on code stream self-adaptation technology
CN108989885A (en) * 2017-06-05 2018-12-11 腾讯科技(深圳)有限公司 Video file trans-coding system, dividing method, code-transferring method and device
CN110213672A (en) * 2019-07-04 2019-09-06 腾讯科技(深圳)有限公司 Video generation, playback method, system, device, storage medium and equipment
CN110381382A (en) * 2019-07-23 2019-10-25 腾讯科技(深圳)有限公司 Video takes down notes generation method, device, storage medium and computer equipment
CN110392277A (en) * 2019-07-30 2019-10-29 杭州雅顾科技有限公司 A kind of content recording method, device, equipment and the storage medium of live video


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CAN YANG; YONGYAN LI; JIONGLONG CHEN: "A New Mobile Streaming System Base-On Http Live Streaming Protocol", 2011 7th International Conference on Wireless Communications, Networking and Mobile Computing *
Luo Shuzhen; Geng Hengshan; Xu Xiangnan; Sun Haosai; Gao Yan; Li Qin; Xie Yin: "Research and improvement of a live streaming media system based on the HLS protocol", Journal of Zhengzhou University (Engineering Science), vol. 33, no. 5, pages 36-39 *
Huang Runhuai; Chen Ge: "Optimization method for playing existing MP4 videos on a CDN", Guangdong Communication Technology, pages 30-33 *

Also Published As

Publication number Publication date
CN114827753B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
US11736749B2 (en) Interactive service processing method and system, device, and storage medium
US9589063B2 (en) Two-dimensional code processing method and terminal
US10929460B2 (en) Method and apparatus for storing resource and electronic device
CN111277869B (en) Video playing method, device, equipment and storage medium
CN109981695B (en) Content pushing method, device and equipment
US20190066730A1 (en) System and method for creating group videos
WO2015087609A1 (en) Content distribution server, program and method
CN112492372B (en) Comment message display method and device, electronic equipment, system and storage medium
US20190102938A1 (en) Method and Apparatus for Presenting Information
CN112073753B (en) Method, device, equipment and medium for publishing multimedia data
CN112995721A (en) Video delivery method, delivery method and device of rich media content and storage medium
CN112969093A (en) Interactive service processing method, device, equipment and storage medium
CN114374853A (en) Content display method and device, computer equipment and storage medium
CN106412492B (en) Video data processing method and device
CN114827753B (en) Video index information generation method and device and computer equipment
CN112235592B (en) Live broadcast method, live broadcast processing method, device and computer equipment
CN110188833B (en) Method and apparatus for training a model
US10296532B2 (en) Apparatus, method and computer program product for providing access to a content
CN115209215A (en) Video processing method, device and equipment
CN110505253B (en) Method, device and storage medium for requesting webpage information
CN111212269A (en) Unmanned aerial vehicle image display method and device, electronic equipment and storage medium
CN114630138B (en) Configuration information issuing method and system
US20240107105A1 (en) Qr attribution
CN111246313A (en) Video association method and device, server, terminal equipment and storage medium
US20210352124A1 (en) Custom generated real-time media on demand

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40070350
Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant