CN114885188A - Video processing method, device, equipment and storage medium - Google Patents

Video processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN114885188A
CN114885188A
Authority
CN
China
Prior art keywords
video
videos
information
playing
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210438249.5A
Other languages
Chinese (zh)
Inventor
林晓春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd filed Critical Baidu Online Network Technology Beijing Co Ltd
Priority to CN202210438249.5A priority Critical patent/CN114885188A/en
Publication of CN114885188A publication Critical patent/CN114885188A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/232 Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection
    • H04N21/4828 End-user interface for program selection for searching program descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/654 Transmission by server directed to the client
    • H04N21/6543 Transmission by server directed to the client for forcing some client operations, e.g. recording

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a video processing method, apparatus, device, and storage medium. The method relates to the field of artificial intelligence, and in particular to video retrieval, video playing, intelligent interaction, intelligent recommendation, and the like. The specific implementation scheme is as follows: determining a plurality of videos based on search information indicating a search for videos related to a target object; determining playing parameters of the plurality of videos based on meta information of the plurality of videos; generating video aggregation information of the target object based on storage addresses and the playing parameters of the plurality of videos; and sending the video aggregation information, wherein the video aggregation information instructs a terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play those video clips. According to the technical solution of the disclosure, the switching delay between different videos can be reduced and playing efficiency improved.

Description

Video processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to the fields of video retrieval, video playing, intelligent interaction, intelligent recommendation, and the like.
Background
With the rise of online video, video resources of various forms, such as long videos, short videos, and small videos, are produced rapidly and can be played on a terminal. Although video resources are now widespread, a video player treats a single video file as its natural unit of playback. In scenarios such as video retrieval and video playing, a user therefore has to click the retrieval results one by one to play them, which increases the switching time between different videos and lowers playing efficiency.
Disclosure of Invention
The disclosure provides a video processing method, apparatus, device and storage medium.
According to a first aspect of the present disclosure, there is provided a video processing method applied to a server, including:
determining a plurality of videos based on search information indicating a search for videos related to a target object;
determining playing parameters of the plurality of videos based on meta information of the plurality of videos;
generating video aggregation information of the target object based on storage addresses and the playing parameters of the plurality of videos;
and sending the video aggregation information, wherein the video aggregation information instructs the terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips.
According to a second aspect of the present disclosure, there is provided a video processing method applied to a terminal, including:
sending search information indicating a search for videos related to a target object;
receiving video aggregation information generated by a server based on the search information;
parsing the video aggregation information to obtain playing parameters and storage addresses of a plurality of videos included in the video aggregation information, wherein each of the plurality of videos includes a video clip of the target object;
and loading the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters, and playing the video clips.
According to a third aspect of the present disclosure, there is provided a video processing apparatus applied to a server, including:
a first determining unit configured to determine a plurality of videos based on search information indicating a search for videos related to a target object;
a second determining unit configured to determine playing parameters of the plurality of videos based on meta information of the plurality of videos;
a generating unit configured to generate video aggregation information of the target object based on storage addresses and the playing parameters of the plurality of videos;
and a first sending unit configured to send the video aggregation information, wherein the video aggregation information instructs the terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips.
According to a fourth aspect of the present disclosure, there is provided a video processing apparatus applied to a terminal, including:
a second sending unit configured to send search information indicating a search for videos related to a target object;
a second receiving unit configured to receive video aggregation information generated by the server based on the search information;
a parsing unit configured to parse the video aggregation information to obtain playing parameters and storage addresses of a plurality of videos included in the video aggregation information, wherein each of the plurality of videos includes a video clip of the target object;
and a playing unit configured to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first and second aspects.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method provided by the first and second aspects.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method provided by the first and second aspects above.
According to the technical solution of the present disclosure, the switching delay between different videos can be reduced and playing efficiency improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a first flowchart illustrating a video processing method according to an embodiment of the present disclosure;
fig. 2 is a second flowchart of a video processing method according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of a video playback interface according to an embodiment of the present disclosure;
FIG. 4 is an architectural diagram of video processing according to an embodiment of the present disclosure;
FIG. 5 is a first schematic diagram of a video processing apparatus according to an embodiment of the present disclosure;
FIG. 6 is a second schematic structural diagram of a video processing apparatus according to an embodiment of the present disclosure;
FIG. 7 is a scene schematic of video processing according to an embodiment of the present disclosure;
fig. 8 is a block diagram of an electronic device for implementing a video processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The terms "first," "second," "third," and the like in the description, embodiments, and claims of the present disclosure and in the above drawings are used to distinguish similar elements, and not necessarily to describe a particular sequence or chronological order. Furthermore, the terms "comprises" and "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion. A method, system, article, or apparatus is not necessarily limited to the steps or elements explicitly listed, but may include other steps or elements not explicitly listed or inherent to such a method, system, article, or apparatus.
The embodiment of the disclosure provides a video processing method, which can be applied to a server having a video retrieval or video recommendation function. The server includes, but is not limited to, a general server, a cloud server, and the like. As shown in fig. 1, the video processing method includes:
S101: determining a plurality of videos based on search information indicating a search for videos related to a target object;
S102: determining playing parameters of the plurality of videos based on meta information of the plurality of videos;
S103: generating video aggregation information of the target object based on storage addresses and the playing parameters of the plurality of videos;
S104: sending the video aggregation information, wherein the video aggregation information instructs the terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips.
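The four steps above can be sketched as a minimal server-side pipeline. This is an illustrative sketch only: every function name, dictionary field, and the tag-based matching rule are assumptions, not details fixed by the patent.

```python
def process_search(search_info, video_index):
    """Illustrative sketch of steps S101-S104 (all names are assumptions)."""
    # S101: determine a plurality of videos matching the search information
    videos = [v for v in video_index if search_info["target"] in v["tags"]]
    # S102: derive playing parameters from each video's meta information
    for v in videos:
        v["play_params"] = {"start": v["meta"]["segment_start"],
                            "end": v["meta"]["segment_end"]}
    # S103: aggregate storage addresses and playing parameters per video
    aggregation = {"target": search_info["target"],
                   "items": [{"address": v["address"],
                              "params": v["play_params"]} for v in videos]}
    # S104: this aggregation info would then be sent to the terminal
    return aggregation
```

In a real system the matching in S101 would be backed by retrieval (e.g. an inverted index) rather than a linear scan, but the data flow between the four steps is the same.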
In the embodiment of the present disclosure, the search information is determined by the terminal; specifically, the terminal determines it according to information input by the user. In some embodiments, the server directly receives the search information sent by the terminal. In other embodiments, the server receives search information that was determined by the terminal and forwarded by other devices. The present disclosure does not limit the communication method between the terminal and the server.
In the embodiment of the present disclosure, the video may be a long video, a short video, or a small video. The above is merely exemplary and not intended as a limitation on all possible types of video, and is not exhaustive.
In the embodiment of the present disclosure, the target object is a main body of the video content, for example, the target object may be a person, a place, an event, a subject, and the like. Taking the search information as "actor XXX" as an example, the target object is "XXX". Taking the search information as "tourist attraction YYY" as an example, the target object is "YYY". Taking the search information as "video for cooking pork in brown sauce" as an example, the target object is "pork in brown sauce". Taking the search information as "eye makeup video" as an example, the target object is "eye makeup". The above is merely an exemplary illustration, and is not intended to limit all possible types of the target object, which is not exhaustive here.
In the embodiment of the present disclosure, the meta information is related information characterizing features of the video. For example, meta-information includes, but is not limited to: coding format, channel number, resolution, frame rate, code rate, fingerprint characteristics and other information. The fingerprint features can be understood as multi-dimensional feature vectors formed according to the extracted key information of the video. If the similarity between the feature vectors of the two videos reaches a preset ratio, for example, 99%, the fingerprint features of the two videos are considered to be the same.
In the embodiment of the disclosure, the server can acquire a large number of video resources, identify them in advance to obtain the meta information of each video, and store that meta information. In this way, when search information sent by the terminal is received, the videos matching the search information and their meta information can be found quickly, which improves the speed at which the server provides videos to the terminal.
In the embodiment of the present disclosure, the video aggregation information is information obtained by aggregating storage addresses and playing parameters of a plurality of videos. The embodiment of the present disclosure does not limit the representation form of the video aggregation information. For example, the video aggregation information may be represented in a list form, or may be represented in a text form. The above is merely an exemplary illustration, and is not a limitation on all possible forms of video aggregation information, which is not exhaustive here.
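As one illustrative possibility, the video aggregation information could be serialized as a JSON document in list form; the patent does not fix a concrete format, so every field name, URL, and value below is an assumption.

```python
import json

# Hypothetical list-form video aggregation information for a target object.
# Field names (order, start, preload_seconds, crop, pad, ...) are illustrative.
aggregation = {
    "target_object": "XXX",
    "videos": [
        {"address": "https://store.example/v1.mp4",   # storage address
         "order": 1, "start": 10, "end": 20,          # clip playing parameters
         "preload_seconds": 10, "buffer_seconds": 5,
         "crop": None, "pad": [80, 80]},              # edge-supplement margins
        {"address": "https://store.example/v2.mp4",
         "order": 2, "start": 41, "end": 70,
         "preload_seconds": 1, "buffer_seconds": 0.5,
         "crop": [0, 120], "pad": None},              # cropping margins
    ],
}
payload = json.dumps(aggregation)  # serialized form sent to the terminal
```

A text (JSON) representation like this and a list representation are interchangeable here; the terminal only needs to recover, per video, a storage address plus the playing parameters.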
According to the technical solution of the embodiment of the disclosure, the server determines a plurality of videos based on the search information, determines the playing parameters of the plurality of videos based on their meta information, generates video aggregation information of the target object based on the storage addresses and the playing parameters of the plurality of videos, and sends the video aggregation information to the terminal. The terminal can then load the video clips of the target object in the plurality of videos from the storage addresses based on the video aggregation information, and play them according to the playing parameters. Because the video aggregation information contains the storage addresses and the playing parameters of the plurality of videos, the video clips of the target object can be played in sequence on the same playing interface, and the user does not need to manually switch between, or locate, the target-object clips in different videos. This greatly reduces the switching delay between different videos and improves playing efficiency. In addition, the video aggregation information not only makes playback of the target-object clips in the plurality of videos smoother, but also provides a playing experience similar to that of a single long video about the target object.
In some embodiments, determining the plurality of videos based on the search information includes: acquiring videos related to the search information; and performing de-duplication processing on the videos related to the search information to obtain the plurality of videos.
In some embodiments, obtaining the videos related to the search information includes: querying the database for videos related to the search information to obtain the related videos. In practical applications, an inverted index is pre-established in the database, that is, titles are retrieved through keywords. Illustratively, if the index entries are arranged alphabetically from A to Z, then when the search information is Z, the videos corresponding to Z are located immediately. In this way, the videos related to the search information can be queried quickly.
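A minimal inverted index of this kind, mapping keywords to video identifiers, might look like the following sketch; the video identifiers and keyword extraction are assumptions for illustration.

```python
from collections import defaultdict

def build_inverted_index(videos):
    """Map each title keyword to the set of video ids containing it,
    so a keyword query locks the matching videos directly."""
    index = defaultdict(set)
    for vid, title_keywords in videos.items():
        for kw in title_keywords:
            index[kw].add(vid)
    return index

def query(index, keyword):
    """Return the ids of all videos whose titles contain `keyword`."""
    return sorted(index.get(keyword, set()))
```

Because the index is built offline, each query is a single dictionary lookup rather than a scan of the whole video library.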
In some embodiments, obtaining the videos related to the search information includes: determining a preset number of videos related to the search information, and querying the database based on the search information to obtain that number of related videos. In practical applications, there may be a huge amount of video resources related to the search information; for example, the retrieval result may contain videos with different sources (source station addresses) and different resolutions, frame rates, or code rates but identical content. To improve the response speed, a certain number of videos are selected from the massive video resources for processing. This makes it easier for the server to quickly aggregate the information of the target-object clips in the videos, and further improves the speed at which the terminal switches between the target-object clips of different videos.
Therefore, by performing de-duplication on the videos related to the search information, the target-object video clips contained in the resulting videos differ in content, which avoids playing clips with repeated content and improves the playing effect.
In some embodiments, the performing the deduplication processing on the plurality of videos related to the search information to obtain a plurality of videos includes: determining a pre-reserved video from the videos with repeated contents in the case that the videos with repeated contents exist in a plurality of videos related to the search information; and obtaining a plurality of videos based on the pre-reserved videos and the videos with non-repeated contents in the videos related to the search information.
In some embodiments, among the videos related to the search information, it is judged whether two or more videos have the same video fingerprint features; if two or more videos with the same video fingerprint features exist, it is judged that videos with repeated content exist among the videos related to the search information.
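As an illustrative sketch of this judgment, fingerprint feature vectors can be compared by cosine similarity against the preset ratio (99% here); keeping the first video of each duplicate cluster is a stand-in for a real filtering rule, not the patent's method.

```python
import math

def cosine(a, b):
    """Cosine similarity between two fingerprint feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def deduplicate(videos, threshold=0.99):
    """Drop videos whose fingerprint is >= threshold similar to an
    already-kept video; the keep-first rule is an illustrative assumption."""
    kept = []
    for v in videos:
        if all(cosine(v["fingerprint"], k["fingerprint"]) < threshold
               for k in kept):
            kept.append(v)
    return kept
```

In practice the fingerprints are multi-dimensional vectors extracted from key video information, and nearest-neighbour search would replace the pairwise loop at scale.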
In some embodiments, determining the pre-reserved video from the videos with repeated content includes: determining the pre-reserved video from the videos with repeated content according to a preset filtering condition. For example, the preset filtering condition may be to preferentially retain the video whose format matches; here, format matching includes, but is not limited to, resolution matching, frame-rate matching, or encoding-format matching. For another example, the preset filtering condition may be to preferentially retain the video with higher definition.
Therefore, the pre-reserved video is determined from the video with repeated content, the matching degree of the finally determined multiple videos can be improved, and the subsequent aggregation processing speed of the multiple videos is improved.
In some embodiments, determining playback parameters for a plurality of videos based on meta-information for the plurality of videos includes: and respectively determining the starting position and the ending position of the video segment of the target object contained in the videos based on the labeling information of the videos.
Here, the meta information includes annotation information, which is labeled in advance; the labeling is completed before S101. The server may receive a large amount of video resources at once, and these resources can be labeled offline for subsequent retrieval or recommendation. For example, target object C appears from the 10th to the 20th second of video 1, target object D appears from the 21st to the 40th second of video 1, and target object C appears again from the 41st to the 70th second of video 1.
Here, the playing parameters include a start position and an end position. The same video may contain multiple video clips of the target object, and each clip needs its own start and end positions. Continuing the example above, if the target object is C, then for video 1, start position 1 is the 10th second and end position 1 is the 20th second, while start position 2 is the 41st second and end position 2 is the 70th second.
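Given per-video annotation information of the kind in the example above, extracting the start and end positions for one target object is a simple filter; the `(start, end, object)` tuple layout is an assumption for illustration.

```python
def segments_for(annotations, target):
    """Return (start, end) pairs for every clip of `target` in one video.
    `annotations` is pre-labelled meta information as (start, end, object)."""
    return [(s, e) for s, e, obj in annotations if obj == target]
```

With the worked example (C from seconds 10-20, D from 21-40, C again from 41-70), the target object C yields two clips, each with its own start and end position.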
Therefore, the terminal can conveniently and quickly lock the to-be-played segment of each video, and the switching time delay among different video segments can be reduced while the video segments with the playing contents as the target objects are met.
In some embodiments, determining the playing parameters of the plurality of videos based on the meta information of the plurality of videos includes: determining a target resolution of the plurality of videos based on their resolutions, the target resolution being the resolution that accounts for the highest proportion among the plurality of videos; and determining cropping information or edge supplement information for the videos that do not meet the target resolution, so that the videos processed based on the cropping information or edge supplement information meet the target resolution.
Here, determining the resolution with the highest ratio of the plurality of videos as the target resolution contributes to reducing the workload of the server for calculating the cropping information or the edge supplement information, and also contributes to reducing the workload of the terminal for performing the cropping or the edge supplement.
For example, if landscape (horizontal-screen) pictures form the majority of the plurality of videos, edge supplement processing needs to be performed on the portrait (vertical-screen) pictures.
For example, if portrait pictures form the majority of the plurality of videos, the landscape pictures need to be cropped.
It should be noted that the present disclosure does not limit the calculation method of the cropping information and the edge supplement information.
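Since the calculation method is left open, one simple centred crop-or-pad scheme could be sketched as follows; the majority-vote target resolution and symmetric margins are illustrative assumptions, not the patent's prescribed method.

```python
from collections import Counter

def target_resolution(resolutions):
    """Pick the resolution with the highest proportion among the videos."""
    return Counter(resolutions).most_common(1)[0][0]

def crop_or_pad(resolution, target):
    """Per axis: crop if larger than the target, pad (edge supplement) if
    smaller. Returns symmetric margins; a real system might centre on
    detected content instead."""
    (w, h), (tw, th) = resolution, target
    adjust = {}
    for axis, size, tsize in (("x", w, tw), ("y", h, th)):
        if size > tsize:
            adjust[axis] = ("crop", (size - tsize) // 2)
        elif size < tsize:
            adjust[axis] = ("pad", (tsize - size) // 2)
    return adjust
```

Choosing the majority resolution as the target keeps the number of videos that need any cropping or edge supplement to a minimum, matching the workload argument above.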
Therefore, the server determines the clipping information or the edge supplement information of some videos in advance, so that the terminal can clip or supplement the edges quickly when playing the videos, the image resolution of the played videos is enabled to be consistent, and the playing experience similar to that of a single long video is provided.
In some embodiments, determining the playing parameters of the plurality of videos based on the meta information of the plurality of videos includes: determining the advance loading duration and the video buffer amount of the plurality of videos based on the code rates of the plurality of videos.
Here, the advance loading duration is the length of time for which the terminal needs to load a video before playing it.
Here, the video buffer amount is at least the amount of video content that needs to be buffered after the terminal loads the video.
In some embodiments, determining the advance loading duration and the video buffer amount of the plurality of videos based on their code rates includes: for a video whose code rate is greater than a first code-rate value, the advance loading duration is a first time value and the buffered duration is a second time value smaller than the first time value; for a video whose code rate is smaller than a second code-rate value, the advance loading duration is a third time value and the buffered duration is a fourth time value, where the second code-rate value is greater than the first code-rate value, the third time value is smaller than the first time value, and the fourth time value is smaller than both the third time value and the second time value. For example, for a video with a high code rate, pre-reading starts 10 seconds in advance and 5 seconds of content are buffered; for a video with a low code rate, pre-reading can start 1 second in advance with 0.5 second buffered.
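The rule above can be sketched as a simple policy function. The code-rate thresholds and the middle band are illustrative assumptions; only the 10 s/5 s and 1 s/0.5 s values come from the example in the text.

```python
def preload_policy(bitrate_kbps, high=4000, low=1000):
    """Map a video's code rate to (advance loading duration, buffer amount).
    Thresholds are illustrative; the constraint is only that higher-bitrate
    videos get longer preload and buffer times than lower-bitrate ones."""
    if bitrate_kbps > high:
        return {"preload_s": 10, "buffer_s": 5}     # first/second time values
    if bitrate_kbps < low:
        return {"preload_s": 1, "buffer_s": 0.5}    # third/fourth time values
    return {"preload_s": 5, "buffer_s": 2}          # assumed middle band
```

The server would evaluate this once per video when building the aggregation information, so the terminal never has to inspect code rates itself.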
Therefore, since the server determines the advance loading duration and the video buffer amount of each video in advance, the terminal can load each video according to its advance loading duration and buffer it according to its video buffer amount, which further improves the fluency of video playback on the terminal.
In some embodiments, generating the video aggregation information of the target object based on the storage addresses and the playing parameters of the plurality of videos includes: determining the playing order, the advance loading duration, the video buffer amount, and the cropping information or edge supplement information in the playing parameters of the plurality of videos; and generating the video aggregation information of the target object based on the storage addresses of the plurality of videos together with these parameters.
Here, the playing order may be determined according to the ordering of the plurality of videos in S101, and can be set or adjusted according to factors such as the continuity of the video content, the shooting time of the video content, and the popularity of the video content.
Here, the video aggregation information includes at least the storage addresses, the playing order, the advance loading duration, the video buffer amount, and the cropping information or edge supplement information of the plurality of videos.
Therefore, since the server generates the video aggregation information in advance, it can guide the terminal in playing the video clips of the target object in the plurality of videos, and the terminal can quickly locate and play those clips. Compared with a processing mode in which the server edits the videos into a single long video, this improves the speed at which the terminal responds to the user's playing request while still providing a playing experience similar to that of a single long video.
In some embodiments, the video processing method may further include: receiving a search request, wherein the search request is used for requesting to continuously acquire videos related to the search information; generating new video aggregation information based on the search request; the new video aggregation information is transmitted.
Here, "new" video aggregation information is defined relative to the original video aggregation information: the plurality of videos used to generate the new video aggregation information differ from those used to generate the original video aggregation information.
In this way, the server provides video aggregation information to the terminal based on the plurality of videos, and only generates new video aggregation information when a request to continue acquiring videos related to the search information is received. This improves the speed at which the terminal responds to the user's playing request and reduces the workload of the server, without affecting the terminal's ability to continue acquiring videos related to the search information.
The embodiment of the disclosure provides a video processing method, which can be applied to a terminal, wherein the terminal has a video playing function and supports video searching and video recommendation. In practical applications, the terminal includes, but is not limited to, a mobile phone, a tablet computer, a wearable device, or a personal computer. As shown in fig. 2, the video processing method may include:
s201: transmitting search information indicating a video related to a search target object;
s202: receiving video aggregation information generated by the server based on the search information;
s203: analyzing the video aggregation information to obtain the playing parameters and the storage addresses of a plurality of videos included in the video aggregation information;
s204: and loading the video clips of the target objects in the videos from the storage address according to the playing parameters, and playing the video clips of the target objects in the videos.
In the embodiment of the present disclosure, the search information is determined by the terminal based on the input information of the user. The present disclosure does not limit the input method of the input information. For example, the input mode may be text input, voice input, image input, and the like.
Here, the playing parameters at least include the playing order, the advance loading duration, the video buffer amount, and the cropping information or edge-filling information.
In this way, the terminal parses the video aggregation information provided by the server, loads the video clips of the target object in the plurality of videos from the storage addresses included in the video aggregation information, and plays those clips according to the playing parameters included in the video aggregation information. Therefore, the video clips related to the target object in the plurality of videos can be played in the same playing interface, achieving unbounded playback of those clips and providing a viewing experience similar to playing a single long video in one playing interface.
In some embodiments, loading the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters includes: determining a loading order according to the playing order corresponding to each video in the playing parameters; and loading the video clips of the target object in the plurality of videos in combination with the loading order, according to the advance loading duration and video buffer amount corresponding to each video in the playing parameters, together with the start position and end position of the video clip of the target object included in each video.
Here, for the same video, a plurality of video clips regarding the same target object may be included.
Therefore, the video clips of the target object in the plurality of videos can be quickly located and loaded in order, so that those clips can be played unboundedly while avoiding occupying a large amount of terminal memory.
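The ordered-loading idea above can be sketched as a small scheduling helper. This is an illustrative sketch under assumed field names, not the disclosed implementation: each clip carries the labeled head/tail positions of the target-object segment in its source video and that video's advance loading duration.

```python
def loading_schedule(clips):
    """Compute, for each clip in playing order, when it begins on the
    aggregated timeline and when its loading should start.

    Each clip is a dict with hypothetical keys 'start_s'/'end_s' (the
    labeled head and tail of the target-object segment within its
    source video) and 'preload_s' (that video's advance loading
    duration).
    """
    schedule, t = [], 0.0
    for c in clips:  # clips are already sorted by playing order
        duration = c["end_s"] - c["start_s"]
        schedule.append({
            "play_at": t,                             # position on the aggregated timeline
            "load_at": max(0.0, t - c["preload_s"]),  # begin loading this early
        })
        t += duration
    return schedule
```

Loading each clip only `preload_s` seconds before it is needed, rather than all clips at once, is what keeps terminal memory usage bounded.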
In some embodiments, playing a video clip of a target object in a plurality of videos includes: displaying the current playing time length and the total time length at a first position of a playing interface; the current playing time length is the played time length of the video clips of the target objects in the videos, and the total time length is the total time length of the video clips of the target objects in the videos.
Here, the first position may be an arbitrary position on the play interface. For example, the first position may be a tail of the playing progress bar on the playing interface, and specifically may be located directly above or below the tail of the playing progress bar.
Fig. 3 is a schematic diagram of a play interface, and as shown in fig. 3, a play progress bar is displayed below the play interface, and a current play time length and a total time length are displayed right below a tail portion of the play progress bar.
Therefore, the total duration of the playable video clips of the target object and the currently played duration can be prompted, improving the playing experience.
In some embodiments, playing a video clip of a target object in a plurality of videos includes: in response to the operation of displaying the video list, displaying the video list at a second position of the playing interface, wherein the video list comprises a plurality of videos; and hiding the video list in response to the operation of hiding the video list.
Here, the video list is a list of a plurality of videos included in the video aggregation information, the video list being used at least to cue the play order of the respective videos. The video list may also be used to prompt the number of currently searched videos.
In some embodiments, the video list specifically includes file names of a plurality of videos and a play time point, which is used to locate the videos on the play progress bar.
Here, the operation of displaying the video list may be set or adjusted according to design requirements. For example, clicking the second position or hovering the cursor over the second position may trigger the operation of displaying the video list. In fig. 3, a video list is displayed on the right side of the playing interface, and the video list shows the file name and playing time point of each video. For example, video 2 starts playing from 1 minute 20 seconds, and video 3 starts playing from 3 minutes 20 seconds.
Therefore, relevant information about the videos can be prompted, allowing a video to be located quickly.
In some embodiments, playing a video clip of a target object in a plurality of videos includes: and responding to the operation of dragging or clicking the playing progress bar, determining the video to be played indicated by the operation, and playing the video to be played.
In some embodiments, determining the video to be played indicated by the operation includes: determining the position of the operation on the playing progress bar; determining the length between the starting position of the playing progress bar and the position; determining the ratio of the length to the total length of the playing progress bar; and determining the video to be played according to the ratio.
In some embodiments, determining the video to be played indicated by the operation includes: determining the position of the operation on the playing progress bar, and determining the duration indicated by that position on the playing progress bar; and determining the video to be played according to the duration. Continuing with the playing interface shown in fig. 3, when it is detected that the mouse drags the progress bar to the 1 minute 25 second position, the video 2 corresponding to 1 minute 25 seconds is found; since video 2 starts playing from 1 minute 20 seconds, playback should start from the 5th second of video 2. Note that the 5th second of video 2 here refers to the 5th second of the video clip of the target object included in video 2.
Therefore, switching between videos is supported through control of the playing progress bar, providing an experience similar to playing a single long video in the same playing interface.
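The duration-based lookup above can be sketched with a sorted list of clip start points on the aggregated timeline; the function name and list representation are illustrative, not from the disclosure.

```python
import bisect

def locate_video(start_points_s, position_s):
    """Map a progress-bar position to (video index, offset inside that
    video's target-object clip).

    start_points_s holds each clip's start time on the aggregated
    timeline, e.g. [0, 80, 200] when video 2 starts at 1 min 20 s and
    video 3 at 3 min 20 s, as in the fig. 3 example.
    """
    i = bisect.bisect_right(start_points_s, position_s) - 1
    return i, position_s - start_points_s[i]
```

With the fig. 3 numbers, a drag to 1 minute 25 seconds (85 s) resolves to video 2 at its 5th second.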
In some embodiments, the video processing method may further include: sending a search request under the condition that the playing completion ratio of the videos is larger than a first threshold, wherein the search request is used for requesting to continuously acquire the videos related to the search information; or, the search request is sent when the remaining playing time of the plurality of videos is less than the second threshold.
Here, the first threshold and the second threshold may be set or adjusted according to design requirements or user requirements.
Therefore, when the playing progress of the videos meets a certain condition, the request for continuously acquiring the videos relevant to the search information is sent to the server, so that the requirement of the terminal for continuously acquiring the videos relevant to the search information can be met, and the search pressure of the server can be relieved.
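The two trigger conditions above can be sketched as a single predicate. The default threshold values here are illustrative assumptions; as the text notes, the actual first and second thresholds are set according to design or user requirements.

```python
def should_request_more(played_s, total_s,
                        ratio_threshold=0.9, remaining_threshold_s=30.0):
    """Decide whether to send the search request for more videos.

    ratio_threshold plays the role of the first threshold (playing
    completion ratio) and remaining_threshold_s the second threshold
    (remaining playing duration); both defaults are assumed examples.
    """
    return (played_s / total_s > ratio_threshold
            or total_s - played_s < remaining_threshold_s)
```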
In some embodiments, the video processing method may further include: receiving new video aggregation information returned by the server based on the search request; analyzing the new video aggregation information to obtain a plurality of videos included in the new video aggregation information; adding a plurality of videos included in the new video aggregation information to the tail parts of the plurality of videos included in the video aggregation information; and changing the total duration in the playing interface based on the total duration of the video clips of the target object included in the new video aggregation information and the total duration of the video clips of the target object included in the video aggregation information.
Therefore, new videos can be added under the condition that playing experience is not influenced so as to meet the continuous watching requirement, and playing experience can be further improved.
Taking a video retrieval scene as an example, fig. 4 shows an architecture diagram of video processing. As shown in fig. 4, the architecture includes four major parts, a video understanding layer, a video retrieval layer, a preprocessing layer, and an unbounded player.
The video understanding layer is mainly responsible for deeply understanding, in multiple dimensions, the video resources obtained from crawlers and resource partners, decomposing video content that cannot be directly read by a computer into structured information, and storing it in the form of Meta information.
The video retrieval layer mainly utilizes Meta information of videos to create an inverted index, so that videos related to search terms (query) can be quickly retrieved. In order to improve the video retrieval quality, the retrieval results can be subjected to relevance sorting.
The preprocessing layer is mainly responsible for carrying out a series of back-end processing on a plurality of retrieved videos.
Specifically, the back-end processing mainly includes the following contents:
Deduplication processing: the video retrieval results often contain many videos with different sources (source station addresses) and different resolutions/frame rates/bit rates but identical content. Deduplication determines content-duplicated videos according to their fingerprint features, filters them according to certain filtering conditions, and keeps only one video among the duplicates. Here, the filtering conditions include, but are not limited to, sharpness-first, format (resolution/frame rate/encoding format) matching-first, and the like.
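A minimal sketch of deduplication under the sharpness-first condition follows; the record keys 'fingerprint' and 'height' are illustrative stand-ins for the actual fingerprint features and sharpness measure.

```python
def deduplicate(videos):
    """Keep one video per content fingerprint.

    Applies a sharpness-first filtering condition (highest resolution
    wins); 'fingerprint' and 'height' are assumed record keys.
    """
    best = {}
    for v in videos:
        kept = best.get(v["fingerprint"])
        if kept is None or v["height"] > kept["height"]:
            best[v["fingerprint"]] = v
    return list(best.values())
```

A format-matching-first condition would simply swap the comparison for one that prefers the resolution/frame rate/encoding format chosen for the final list.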
Video head-and-tail labeling processing: the video retrieval results are sorted and displayed at video granularity. However, a search term may match only a certain section of a video, so the valid segments included in each video need to be screened according to the video's Meta information. The same video may contain multiple valid segments, and head and tail information needs to be labeled for each one. It should be noted that, to better link multiple valid segments in the same video, a fusion strategy may be used to fuse adjacent valid segments so that they are labeled as one complete segment. Here, fusion strategies include, but are not limited to: when the interval between two valid segments is less than a certain threshold, fusing the two valid segments together with the content between them.
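The threshold-based fusion strategy above amounts to interval merging; a sketch with an assumed threshold value:

```python
def fuse_segments(segments, gap_threshold_s=3.0):
    """Fuse adjacent valid segments of one video when the gap between
    them is below gap_threshold_s; segments are (start_s, end_s) pairs.
    The default threshold value is an assumed example.
    """
    fused = []
    for start, end in sorted(segments):
        if fused and start - fused[-1][1] < gap_threshold_s:
            # Close enough: absorb this segment (and the gap) into the
            # previous one so they are labeled as one complete segment.
            fused[-1] = (fused[-1][0], max(fused[-1][1], end))
        else:
            fused.append((start, end))
    return fused
```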
Format recognition processing: the purpose of format recognition is to provide the necessary information for the player's pre-reading. The video sources in the retrieval results are diverse, so parameters such as encoding format, number of audio channels, resolution, frame rate, and bit rate may differ. The format recognition stage can parse these parameters from the Meta information of the videos and from the video source files. This parameter information can be used for the following functions:
Format-matching priority strategy in deduplication: the encoding format, aspect ratio, resolution, color/monochrome format, and the like of the videos in the finally generated video list are unified as much as possible, so that differences in the playing experience of videos in different formats are reduced and the final playing experience is as uniform as possible.
Video cropping/edge-filling strategy: non-uniform video formats may still exist after deduplication. To further unify the playing experience, the most reasonable uniform resolution needs to be selected according to the formats of the remaining videos, usually the resolution that occurs most frequently among them. Videos of other resolutions are cropped or edge-filled during playback. The calculated cropping and edge-filling parameters are stored in the final video list and are parsed and executed by the player.
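The resolution selection and edge-filling computation can be sketched as follows. The padding math is a standard letterbox/pillarbox fit given here as an assumed concrete form, since the disclosure does not specify the formula.

```python
from collections import Counter

def pick_target_resolution(resolutions):
    """Choose the resolution occurring most frequently among the videos."""
    return Counter(resolutions).most_common(1)[0][0]

def edge_fill_params(src, target):
    """Compute scale and symmetric padding (letterbox/pillarbox) so a
    src (w, h) video fits the target resolution without cropping."""
    scale = min(target[0] / src[0], target[1] / src[1])
    w, h = round(src[0] * scale), round(src[1] * scale)
    return {"scaled": (w, h),
            "pad_x": (target[0] - w) // 2,   # black bars left/right
            "pad_y": (target[1] - h) // 2}   # black bars top/bottom
```

The resulting parameters would be stored in the video list so the player can apply them at playback time, as the text describes.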
Pre-reading strategy: different encoding formats and bit rates require different amounts of data to be read in advance, so a pre-reading strategy is arranged for each video according to its specific format. For example, a video with a high bit rate may be pre-read 10 seconds in advance and buffered 5 seconds in advance, while a video with a low bit rate may be pre-read only 1 second in advance and buffered 0.5 seconds in advance.
Time axis reorganization: after the video head-and-tail labeling, the cropping/edge-filling strategy, and the pre-reading strategy are complete, the time axis can be rewritten from the resulting strategy data. The time axis is mainly used to tell the player when to pre-read which video, and from where and how to start playing it. An example code segment of the time axis is as follows:
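The 10 s / 5 s and 1 s / 0.5 s figures come directly from the example in the text; a sketch only needs a cutoff separating "large" from "small" bit rates, and that cutoff value is an assumption here.

```python
def preread_strategy(bitrate_kbps, cutoff_kbps=4000):
    """Arrange per-video pre-read timings from the bit rate.

    The duration figures mirror the example in the text; the cutoff
    separating large from small bit rates is an assumed value.
    """
    if bitrate_kbps >= cutoff_kbps:
        return {"preread_s": 10.0, "prebuffer_s": 5.0}
    return {"preread_s": 1.0, "prebuffer_s": 0.5}
```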
(The time-axis code listing is reproduced as drawings in the original publication.)
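Since the original time-axis listing survives only as drawings, the following is a rough sketch of what one such entry might contain, combining the head/tail labels, crop/edge-fill parameters, and pre-read timings computed by the preceding steps. Every field name and value is an assumption, not the disclosed format.

```python
# Hypothetical time-axis entry telling the player when to pre-read
# which video and from where to start playing it; all fields assumed.
timeline_segment = {
    "timeline_start_s": 80,          # where this clip begins on the aggregated timeline
    "source_url": "https://example.com/video2.mp4",  # placeholder storage address
    "clip_start_s": 35,              # labeled head of the valid segment in the source
    "clip_end_s": 130,               # labeled tail of the valid segment
    "preread_s": 10,                 # begin pre-reading this many seconds early
    "prebuffer_s": 5,                # buffer ahead by this many seconds
    "crop_or_pad": {"pad_x": 240, "pad_y": 0},  # edge-filling parameters
}
```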
It should be noted that the preprocessing is not a one-off process but is performed dynamically. That is, as playback proceeds and the current video list is about to finish, the player sends a request in advance; the preprocessing layer then asks the video retrieval layer for the next page of retrieval results and preprocesses them on the basis of the previous preprocessing result (keeping the time axis continuous and the resolution the same as before).
The unbounded player is mainly responsible for playing videos.
Specifically, a video list may be displayed on the playing interface. For example, the video list shows each video's original title plus its starting position in the list; e.g., video 3 starts at 2 minutes 20 seconds of the current video list, and when the row for video 3 is clicked, the playing progress bar jumps to 00:02:20 and the valid segment in video 3 starts playing.
Specifically, the current playing time and the total time may be displayed on the playing interface. For example, the current playing time on the right side of the progress bar indicates the current playing time in the aggregation list, and the total time indicates the total time of all valid segments of the aggregation list that have been loaded currently.
Specifically, as playback approaches the end of the current video list, the player automatically requests the next video list; at this point the video list on the right side is automatically extended and the total duration on the right of the progress bar automatically increases, while the current playing time does not change. That is, after the user inputs a search term, only one list is computed at first; subsequent lists are generated incrementally on that basis.
Specifically, the video list displayed on the playing interface can be automatically hidden and automatically appear when the mouse is hovered.
Specifically, by dragging or clicking the progress bar, the user can quickly browse or seek across multiple video source files, with an experience consistent with playing a single complete video.
It should be understood that the architecture diagram shown in fig. 4 is merely exemplary and not limiting, and is extensible, and that various obvious changes and/or substitutions may be made by those skilled in the art based on the example of fig. 4, and still fall within the scope of the disclosure of the embodiments of the disclosure.
The video processing method provided by the present disclosure can be used in scenarios such as video retrieval, video aggregation, and video playing. Illustratively, the execution subject of the method may be an electronic device, such as a search engine device or a query engine server.
The embodiment of the present disclosure provides a video processing apparatus, which is applied to a server, and as shown in fig. 5, the video processing apparatus may include: a first determination unit 501 for determining a plurality of videos based on search information indicating related videos of a search target object; a second determining unit 502 for determining playing parameters of the plurality of videos based on meta information of the plurality of videos; a generating unit 503, configured to generate video aggregation information of the target object based on the storage addresses and the playing parameters of the plurality of videos; a first sending unit 504, configured to send video aggregation information, so that the terminal loads a video clip of a target object in the multiple videos from the storage address based on the video aggregation information, and plays the video clip of the target object in the multiple videos according to the playing parameter.
In some embodiments, the first determining unit 501 includes: an acquisition subunit configured to acquire a plurality of videos related to the search information; and the duplication removing subunit is used for carrying out duplication removing processing on the videos related to the search information to obtain a plurality of videos.
In some embodiments, the duplication removal subunit is configured to determine, in a case where there is a content-duplicated video among the plurality of videos related to the search information, a pre-reserved video from the content-duplicated videos; and obtaining a plurality of videos based on the pre-reserved videos and the videos with non-repeated contents in the videos related to the search information.
In some embodiments, the second determining unit 502 includes: the first determining subunit is configured to determine, based on the annotation information of the plurality of videos, a start position and an end position of a video segment of a target object included in the plurality of videos, respectively.
In some embodiments, the second determining unit 502 includes: a second determining subunit configured to determine a target resolution of the plurality of videos based on resolutions of the plurality of videos, the target resolution being a resolution at which the plurality of videos occupy the highest ratio; and determining cropping information or edge supplement information for the video which does not meet the target resolution in the plurality of videos, so that the video processed based on the cropping information or the edge supplement information meets the target resolution.
In some embodiments, the second determining unit 502 includes: a third determining subunit configured to determine the advance loading durations and video buffer amounts of the plurality of videos based on the bit rates of the plurality of videos.
In some embodiments, the generating unit 503 includes: the fourth determining subunit is used for determining the playing sequence, the advanced loading duration, the video buffer size and the cutting information or the edge supplement information in the playing parameters of the plurality of videos; and the generating subunit is used for generating the video aggregation information of the target object based on the storage addresses of the videos, the playing sequence of the videos, the loading time in advance, the video buffer amount and the cutting information or the edge supplement information.
In some embodiments, the video processing apparatus further includes: the first receiving unit 505 (not shown in fig. 5) is configured to receive a search request, where the search request is used to request that video related to search information is continuously acquired. The generating unit 503 is further configured to generate new video aggregation information based on the search request; the first sending unit is further configured to send new video aggregation information.
It should be understood by those skilled in the art that the functions of the processing modules in the video processing apparatus according to the embodiments of the present disclosure may be understood by referring to the foregoing description of the video processing method applied to the server, and the processing modules in the video processing apparatus according to the embodiments of the present disclosure may be implemented by analog circuits that implement the functions described in the embodiments of the present disclosure, or by running software that performs the functions described in the embodiments of the present disclosure on electronic devices.
The video processing device of the embodiment of the disclosure can provide video aggregation information for the terminal, so that the terminal can load video clips of target objects in a plurality of videos from a storage address based on the video aggregation information and play the video clips of the target objects in the plurality of videos according to the playing parameters. The video aggregation information comprises the storage addresses and the playing parameters of the videos, so that the video clips of the target objects in the videos can be played in sequence on the same playing interface, a user does not need to manually switch or position the video clips of the target objects in different videos, the switching time delay among different videos is greatly reduced, and the playing efficiency is improved.
The embodiment of the present disclosure provides a video processing apparatus, which is applied to a terminal, and as shown in fig. 6, the video processing apparatus may include: a second transmitting unit 601 for transmitting search information indicating a video related to a search target object; a second receiving unit 602, configured to receive video aggregation information generated by the server based on the search information; the parsing unit 603 is configured to parse the video aggregation information to obtain play parameters and storage addresses of multiple videos included in the video aggregation information, where each of the multiple videos includes a video segment of the target object; the playing unit 604 is configured to load the video segments of the target object in the multiple videos from the storage address according to the playing parameter, and play the video segments of the target object in the multiple videos.
In some embodiments, the playing unit 604 includes: a fifth determining subunit, configured to determine a loading order according to the playing order corresponding to each video in the playing parameters; and the loading subunit is used for loading the video clips of the target object in the plurality of videos according to the loading duration and the video buffer amount in advance corresponding to each video in the playing parameters, and the starting position and the ending position of the video clip of the target object included in each video in the playing parameters by combining the loading sequence.
In some embodiments, the playing unit 604 includes: the first display subunit is used for displaying the current playing time length and the total time length at a first position of the playing interface; the current playing time length is the played time length of the video clips of the target objects in the videos, and the total time length is the total time length of the video clips of the target objects in the videos.
In some embodiments, the playing unit 604 includes: the second display subunit is used for responding to the operation of displaying the video list, displaying the video list at a second position of the playing interface, wherein the video list comprises file names of a plurality of videos and playing time points, and the playing time points are used for positioning the videos on the playing progress bar; and the hiding subunit is used for responding to the operation of hiding the video list and hiding the video list.
In some embodiments, the playing unit 604 includes: and the control subunit is used for responding to the operation of dragging or clicking the playing progress bar, determining the video to be played indicated by the operation, and playing the video to be played.
In some embodiments, the second sending unit 601 is further configured to: and sending a search request for requesting to continuously acquire videos related to the search information under the condition that the playing completion ratio of the videos is larger than a first threshold value.
In some embodiments, the second sending unit 601 is further configured to send the search request if the remaining playing time lengths of the plurality of videos are less than the second threshold.
In some embodiments, the second receiving unit 602 is further configured to receive new video aggregation information returned by the server based on the search request; the parsing unit 603 is further configured to parse the new video aggregation information to obtain a plurality of videos included in the new video aggregation information; the playing unit 604 is further configured to add the plurality of videos included in the new video aggregation information to the tails of the plurality of videos included in the video aggregation information; and changing the total duration in the playing interface based on the total duration of the video clips of the target object included by the new video aggregation information and the total duration of the video clips of the target object included by the video aggregation information.
It should be understood by those skilled in the art that the functions of each processing module in the video processing apparatus according to the embodiments of the present disclosure may be understood by referring to the foregoing description of the video processing method applied to the terminal, and each processing module in the video processing apparatus according to the embodiments of the present disclosure may be implemented by an analog circuit that implements the functions described in the embodiments of the present disclosure, or may be implemented by running software that performs the functions described in the embodiments of the present disclosure on an electronic device.
The video processing device of the embodiment of the disclosure can play the video clips of the target objects in the plurality of videos based on the video aggregation information, not only can improve the playing fluency of the video clips of the target objects in the plurality of videos, but also can provide a playing experience similar to that of a single long video related to the target objects.
Fig. 7 is a schematic diagram of an unbounded video playing scene. As shown in fig. 7, an electronic device such as a cloud server receives search information from each terminal; acquires a plurality of videos related to the search information from a plurality of video resource libraries, and acquires their meta information from a meta information database; performs video aggregation processing according to the videos and their meta information, generating video aggregation information for each piece of search information, where the video aggregation information includes the storage addresses and playing parameters of the videos; and the terminal loads the video clips of the target object in the plurality of videos from the storage addresses and plays them according to the playing parameters. In this way, the terminal can achieve drag-to-seek playback spanning multiple videos on the same playing interface, providing the user with a playing experience similar to that of a single long video. This matters because the videos returned by retrieval are typically short, while users expect progress-bar seeking across multiple videos within one playing interface. During playback, the terminal can also automatically skip duplicated content, and automatically record the playing position each time so that the next session resumes from there.
Several intelligent content creation scenarios are listed below. For example, multiple related videos of the same trending event can be aggregated into one long video whose event threads are connected in series; or the day's trending short videos can be aggregated into a daily news-digest long video. As another example, all videos of an artist can be played continuously from a single retrieval, saving fans the time spent searching. As another example, building on the skip-intro/skip-outro function, recap segments repeated across the episodes of a series can be reduced, better meeting the needs of binge-watchers.
It should be understood that the scene diagram shown in fig. 7 is only illustrative and not restrictive, and those skilled in the art may make various obvious changes and/or substitutions based on the example of fig. 7, and the obtained technical solutions still belong to the disclosure scope of the embodiments of the present disclosure.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the provisions of relevant laws and regulations, and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801, which can perform various appropriate actions and processes in accordance with a computer program stored in a Read-Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An Input/Output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the various methods and processes described above, such as the video processing method. For example, in some embodiments, the video processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the video processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the video processing method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory, a read-only memory, an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a Display device (e.g., a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (33)

1. A video processing method, applied to a server, comprising:
determining a plurality of videos based on search information, wherein the search information indicates searching for videos related to a target object;
determining playing parameters of the plurality of videos based on meta information of the plurality of videos;
generating video aggregation information of the target object based on the storage addresses of the plurality of videos and the playing parameters;
and sending the video aggregation information, wherein the video aggregation information is used for instructing a terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips of the target object in the plurality of videos.
2. The method of claim 1, wherein the determining a plurality of videos based on search information comprises:
acquiring a plurality of videos related to the search information;
and carrying out duplication removal processing on the videos related to the search information to obtain a plurality of videos.
3. The method of claim 2, wherein the performing the de-duplication process on the plurality of videos related to the search information to obtain a plurality of videos comprises:
in a case that videos with repeated content exist among the plurality of videos related to the search information, determining a video to be retained from the videos with repeated content;
and obtaining the plurality of videos based on the retained video and the videos with non-repeated content among the plurality of videos related to the search information.
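A minimal sketch of the de-duplication in claim 3, under the assumption that each candidate video carries a content fingerprint identifying duplicates (the `fingerprint` key and the keep-first policy are illustrative choices, not mandated by the claim):

```python
# De-duplication sketch: among videos sharing a content fingerprint,
# keep one representative (here: the first seen) and drop the rest.

def dedupe(videos):
    seen = set()
    kept = []
    for v in videos:
        fp = v["fingerprint"]  # assumed duplicate-detection key
        if fp not in seen:
            seen.add(fp)
            kept.append(v)
    return kept

result = dedupe([
    {"id": "a", "fingerprint": "x"},
    {"id": "b", "fingerprint": "x"},  # duplicate of "a"
    {"id": "c", "fingerprint": "y"},
])
print([v["id"] for v in result])  # ['a', 'c']
```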
4. The method of claim 1, wherein the determining playback parameters for the plurality of videos based on meta information for the plurality of videos comprises:
and respectively determining the starting position and the ending position of the video segments of the target object contained in the videos based on the labeling information of the videos.
5. The method of claim 1, wherein the determining playback parameters for the plurality of videos based on meta information for the plurality of videos comprises:
determining a target resolution of the plurality of videos based on resolutions of the plurality of videos, the target resolution being the highest resolution among the plurality of videos;
and determining cutting information or edge supplement information for the video which does not meet the target resolution in the plurality of videos, so that the video processed based on the cutting information or the edge supplement information meets the target resolution.
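One way the resolution normalization of claim 5 could work, sketched under the assumptions that "highest resolution" means largest pixel count and that symmetric edge padding is used; a real implementation would also handle aspect-ratio mismatches with cropping:

```python
# Resolution-normalization sketch for claim 5: pick the highest
# resolution among the videos as the target, then compute symmetric
# edge padding for any video below it. Cropping is analogous;
# padding alone is shown for brevity.

def padding_plan(resolutions):
    target = max(resolutions, key=lambda wh: wh[0] * wh[1])
    plan = {}
    for w, h in resolutions:
        if (w, h) != target:
            plan[(w, h)] = {
                "pad_left_right": (target[0] - w) // 2,
                "pad_top_bottom": (target[1] - h) // 2,
            }
    return target, plan

target, plan = padding_plan([(1920, 1080), (1280, 720)])
print(target)             # (1920, 1080)
print(plan[(1280, 720)])  # {'pad_left_right': 320, 'pad_top_bottom': 180}
```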
6. The method of claim 1, wherein the determining playback parameters for the plurality of videos based on meta information for the plurality of videos comprises:
and determining the loading-in-advance duration and the video buffer amount of the videos based on the code rates of the videos.
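Claim 6 ties the advance loading duration and buffer amount to the videos' code rates; one plausible mapping is sketched below, where the scaling constants and the assumed network bandwidth are illustrative only:

```python
# Sketch for claim 6: higher-bit-rate videos get a longer preload
# window and a larger buffer. The constants are illustrative only.

def preload_params(bitrate_kbps, network_kbps=5000):
    # Seconds of video to load ahead: scale with how close the
    # bit rate is to the assumed available bandwidth.
    preload_s = max(2.0, 10.0 * bitrate_kbps / network_kbps)
    # Buffer sized to hold the preload window, in kilobits.
    buffer_kb = preload_s * bitrate_kbps
    return preload_s, buffer_kb

print(preload_params(2500))  # (5.0, 12500.0)
```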
7. The method of claim 1, wherein the generating video aggregation information for the target object based on the storage addresses of the plurality of videos and the playback parameters comprises:
determining the playing sequence, the advance loading duration, the video buffer amount, and the cutting information or edge supplement information in the playing parameters of the plurality of videos;
generating the video aggregation information of the target object based on the storage addresses of the videos, the playing sequence of the videos, the advanced loading duration, the video buffer size, and the clipping information or the edge supplement information.
8. The method of claim 1, further comprising:
receiving a search request, wherein the search request is used for requesting to continue to acquire videos related to the search information;
generating new video aggregation information based on the search request;
and sending the new video aggregation information.
9. A video processing method, applied to a terminal, comprising:
sending search information, wherein the search information is used for indicating related videos of a search target object;
receiving video aggregation information generated by a server based on the search information;
analyzing the video aggregation information to obtain playing parameters and storage addresses of a plurality of videos included in the video aggregation information, wherein each video in the plurality of videos includes a video clip of the target object;
and loading the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters, and playing the video clips of the target object in the plurality of videos.
10. The method of claim 9, wherein said loading a video clip of said target object in said plurality of videos from said storage address according to said playback parameters comprises:
determining a loading sequence according to the playing sequence corresponding to each video in the playing parameters;
and loading the video clips of the target object in the plurality of videos in combination with the loading sequence, according to the advance loading duration and the video buffer amount corresponding to each video in the playing parameters, and the start position and end position of the video clip of the target object included in each video in the playing parameters.
11. The method of claim 9, wherein said playing a video clip of said target object in said plurality of videos comprises:
displaying the current playing time length and the total time length at a first position of a playing interface; wherein the current playing time length is the played time length of the video clips of the target object in the videos, and the total time length is the total time length of the video clips of the target object in the videos.
12. The method of claim 9, wherein said playing a video clip of said target object in said plurality of videos comprises:
in response to an operation of displaying a video list, displaying the video list at a second position of a playing interface, wherein the video list comprises the plurality of videos;
and hiding the video list in response to an operation of hiding the video list.
13. The method of claim 9, wherein said playing a video clip of said target object in said plurality of videos comprises:
and responding to the operation of dragging or clicking the playing progress bar, determining the video to be played indicated by the operation, and playing the video to be played.
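The fixed-point playing of claim 13 implies mapping a position on the unified progress bar back to a particular video and an offset within its clip; assuming the clip durations are known from the playing parameters, this is a lookup over cumulative durations:

```python
import bisect

# Sketch for claim 13: translate a drag position on the unified
# progress bar into (video index, offset within that video's clip).

def locate(drag_seconds, clip_durations):
    # Cumulative end time of each clip on the unified timeline.
    ends = []
    total = 0.0
    for d in clip_durations:
        total += d
        ends.append(total)
    idx = bisect.bisect_right(ends, drag_seconds)
    if idx >= len(ends):
        idx = len(ends) - 1  # clamp to the last clip
    start_of_clip = ends[idx] - clip_durations[idx]
    return idx, drag_seconds - start_of_clip

print(locate(75.0, [60.0, 30.0, 45.0]))  # (1, 15.0)
```

Dragging to 75 s on a 60 s + 30 s + 45 s playlist lands in the second clip, 15 s in.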
14. The method of claim 9, further comprising:
sending a search request in a case that a playing completion ratio of the plurality of videos is greater than a first threshold, wherein the search request is used for requesting to continue to acquire videos related to the search information; or
sending the search request in a case that a remaining playing duration of the plurality of videos is less than a second threshold.
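The two trigger conditions of claim 14 — completion ratio above a first threshold, or remaining duration below a second threshold — can be sketched as a single predicate; the threshold values shown are illustrative assumptions:

```python
# Sketch for claim 14: decide whether the terminal should request
# more videos, based on playback completion ratio or remaining time.
# The threshold values are illustrative assumptions.

def should_fetch_more(played_s, total_s,
                      ratio_threshold=0.8, remaining_threshold=30.0):
    completion = played_s / total_s
    remaining = total_s - played_s
    return completion > ratio_threshold or remaining < remaining_threshold

print(should_fetch_more(170.0, 200.0))  # True: 0.85 > 0.8
print(should_fetch_more(100.0, 200.0))  # False
```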
15. The method of claim 14, further comprising:
receiving new video aggregation information returned by the server based on the search request;
analyzing the new video aggregation information to obtain a plurality of videos included in the new video aggregation information;
adding the plurality of videos included in the new video aggregation information to the tail of the plurality of videos included in the video aggregation information;
and changing the total duration in the playing interface based on the total duration of the video clips of the target object included in the new video aggregation information and the total duration of the video clips of the target object included in the video aggregation information.
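Claim 15's handling of new video aggregation information — appending the new videos to the tail of the current list and extending the displayed total duration — can be sketched as follows (the dictionary layout is an assumption of this sketch):

```python
# Sketch for claim 15: append newly aggregated videos to the tail of
# the current playlist and extend the displayed total duration.

def merge_aggregation(current, new):
    current["items"].extend(new["items"])
    current["total_duration"] += new["total_duration"]
    return current

playlist = {"items": [{"id": "a"}], "total_duration": 60.0}
update = {"items": [{"id": "b"}], "total_duration": 45.0}
merged = merge_aggregation(playlist, update)
print(merged["total_duration"])            # 105.0
print([i["id"] for i in merged["items"]])  # ['a', 'b']
```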
16. A video processing apparatus, applied to a server, comprising:
a first determination unit configured to determine a plurality of videos based on search information indicating related videos of a search target object;
a second determining unit configured to determine playing parameters of the plurality of videos based on meta information of the plurality of videos;
a generating unit, configured to generate video aggregation information of the target object based on the storage addresses of the plurality of videos and the playing parameter;
and a first sending unit, configured to send the video aggregation information, wherein the video aggregation information is used for instructing a terminal to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters and to play the video clips of the target object in the plurality of videos.
17. The apparatus of claim 16, wherein the first determining unit comprises:
an acquisition subunit configured to acquire a plurality of videos related to the search information;
and the duplication removing subunit is used for carrying out duplication removing processing on the videos related to the search information to obtain a plurality of videos.
18. The apparatus of claim 17, wherein the duplication removing subunit is configured to:
in a case that videos with repeated content exist among the plurality of videos related to the search information, determine a video to be retained from the videos with repeated content;
and obtain the plurality of videos based on the retained video and the videos with non-repeated content among the plurality of videos related to the search information.
19. The apparatus of claim 16, wherein the second determining unit comprises:
a first determining subunit, configured to determine, based on the annotation information of the multiple videos, a start position and an end position of a video segment of the target object included in the multiple videos, respectively.
20. The apparatus of claim 16, wherein the second determining unit comprises:
a second determining subunit, configured to determine a target resolution of the plurality of videos based on resolutions of the plurality of videos, the target resolution being the highest resolution among the plurality of videos; and determine cutting information or edge supplement information for a video that does not meet the target resolution among the plurality of videos, so that the video processed based on the cutting information or the edge supplement information meets the target resolution.
21. The apparatus of claim 16, wherein the second determining unit comprises:
and the third determining subunit is used for determining the loading-in-advance time length and the video buffering amount of the videos based on the code rates of the videos.
22. The apparatus of claim 16, wherein the generating unit comprises:
the fourth determining subunit is configured to determine a playing sequence, a loading-ahead duration, a video buffer amount, and clipping information or edge-filling information in the playing parameters of the multiple videos;
a generating subunit, configured to generate the video aggregation information of the target object based on the storage addresses of the multiple videos, the playing order of the multiple videos, the advanced loading duration, the video buffer size, and the clipping information or the edge supplement information.
23. The apparatus of claim 16, further comprising:
a first receiving unit, configured to receive a search request, where the search request is used to request to continue to acquire a video related to the search information;
the generating unit is further used for generating new video aggregation information based on the search request;
the first sending unit is further configured to send the new video aggregation information.
24. A video processing apparatus, applied to a terminal, comprising:
a second transmitting unit configured to transmit search information indicating a video related to a search target object;
a second receiving unit, configured to receive video aggregation information generated by the server based on the search information;
the analysis unit is used for analyzing the video aggregation information to obtain playing parameters and storage addresses of a plurality of videos included in the video aggregation information, wherein each video in the plurality of videos includes a video clip of the target object;
and a playing unit, configured to load the video clips of the target object in the plurality of videos from the storage addresses according to the playing parameters, and play the video clips of the target object in the plurality of videos.
25. The apparatus of claim 24, wherein the playback unit comprises:
a fifth determining subunit, configured to determine a loading order according to the playing order corresponding to each video in the playing parameters;
and a loading subunit, configured to load the video clips of the target object in the plurality of videos in combination with the loading sequence, according to the advance loading duration and the video buffer amount corresponding to each video in the playing parameters, and the start position and end position of the video clip of the target object included in each video in the playing parameters.
26. The apparatus of claim 24, wherein the playback unit comprises:
the first display subunit is used for displaying the current playing time length and the total time length at a first position of the playing interface; wherein the current playing time length is the played time length of the video clips of the target object in the videos, and the total time length is the total time length of the video clips of the target object in the videos.
27. The apparatus of claim 24, wherein the playback unit comprises:
a second display subunit, configured to display, in response to an operation of displaying a video list, the video list at a second position of a play interface, where the video list includes the plurality of videos;
a hiding subunit, configured to hide the video list in response to an operation of hiding the video list.
28. The apparatus of claim 24, wherein the playback unit comprises:
and the control subunit is used for responding to the operation of dragging or clicking the playing progress bar, determining the video to be played indicated by the operation, and playing the video to be played.
29. The apparatus of claim 24, wherein the second transmitting unit is further configured to:
send a search request in a case that a playing completion ratio of the plurality of videos is greater than a first threshold, wherein the search request is used for requesting to continue to acquire videos related to the search information; or
send the search request in a case that a remaining playing duration of the plurality of videos is less than a second threshold.
30. The apparatus of claim 29, wherein,
the second receiving unit is further configured to receive new video aggregation information returned by the server based on the search request;
the analysis unit is further configured to analyze the new video aggregation information to obtain a plurality of videos included in the new video aggregation information;
the playing unit is further configured to add the plurality of videos included in the new video aggregation information to the tail of the plurality of videos included in the video aggregation information; and change the total duration in the playing interface based on the total duration of the video clips of the target object included in the new video aggregation information and the total duration of the video clips of the target object included in the video aggregation information.
31. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-15.
32. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-15.
33. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-15.
CN202210438249.5A 2022-04-20 2022-04-20 Video processing method, device, equipment and storage medium Pending CN114885188A (en)

Publications (1)

Publication Number Publication Date
CN114885188A true CN114885188A (en) 2022-08-09


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104954811A (en) * 2015-07-17 2015-09-30 杭州当贝网络科技有限公司 Method for loading network video by video polymerization application and intelligent television terminal
CN108464007A (en) * 2016-04-13 2018-08-28 谷歌有限责任公司 Video metadata correlation recommendation
CN107547912A (en) * 2017-09-01 2018-01-05 深圳创维数字技术有限公司 A kind of method for processing resource, system and the storage medium of full matchmaker's money
CN107704525A (en) * 2017-09-04 2018-02-16 优酷网络技术(北京)有限公司 Video searching method and device
CN112291585A (en) * 2019-07-25 2021-01-29 百度在线网络技术(北京)有限公司 Multimedia resource searching method and device, electronic equipment and storage medium
CN111405318A (en) * 2020-03-24 2020-07-10 聚好看科技股份有限公司 Video display method and device and computer storage medium
WO2021189697A1 (en) * 2020-03-24 2021-09-30 聚好看科技股份有限公司 Video display method, terminal, and server
CN111400553A (en) * 2020-04-26 2020-07-10 Oppo广东移动通信有限公司 Video searching method, video searching device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination