CN112399261B - Video data processing method and device - Google Patents

Video data processing method and device

Info

Publication number: CN112399261B
Application number: CN202110070879.7A
Authority: CN (China)
Prior art keywords: video, video data, data, auxiliary information, editing
Legal status: Active (granted)
Assignee (current and original): Zhejiang Koubei Network Technology Co Ltd
Inventor: 褚天颖
Original language: Chinese (zh)
Other versions: CN112399261A (application publication)
Application filed by Zhejiang Koubei Network Technology Co Ltd; priority to CN202110070879.7A

Classifications

    • H04N 21/47205 - End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally (hierarchy: H Electricity; H04 Electric communication technique; H04N Pictorial communication, e.g. television; H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]; H04N 21/40 Client devices, e.g. set-top-box [STB]; H04N 21/47 End-user applications; H04N 21/472 End-user interface for requesting or manipulating content)
    • H04N 21/466 - Learning process for intelligent management, e.g. learning user preferences for recommending movies (hierarchy: H04N 21/45 Management operations performed by the client)

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a video data processing method and device. The method includes: in response to an instruction to launch a video data editing page, displaying the video data editing page; providing first video data to a server; providing the server with video auxiliary information that the user inputs for the first video data; receiving and displaying, from the server, a video clip of second video data that shows the playing style of the video auxiliary information, or data matching that playing style, where the second video data is obtained by editing the first video data with its corresponding video auxiliary information based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples; and displaying the second video data at a client of the access platform. The method solves the problem of low video quality caused by existing video editing's dependence on personal aesthetic judgment.

Description

Video data processing method and device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a video data processing method and device. The application also relates to another video data processing method and device.
Background
With the development of the internet, video production has become increasingly common. To enrich what a video displays, post-editing, such as adding text, is often performed on the shot video.
Existing video editing relies on the personal aesthetic judgment and experience of the video editor, so the quality of text-captioned videos differs greatly between editors. For example, a manually chosen text color may blend into the video's background image and become hard to read; a manually chosen font may render the text too small or too large; a manually set text position may block an inappropriate part of the picture. All of these degrade video quality.
Therefore, how to edit video so as to reduce the influence of personal aesthetic judgment on video quality is a problem to be solved.
Disclosure of Invention
The video data processing method provided by the embodiments of the application performs intelligent editing on video data, and thereby solves the problem of low video quality caused by existing video editing's dependence on personal aesthetic judgment.
An embodiment of the application provides a video data processing method, including: in response to an instruction to launch a video data editing page, displaying the video data editing page, where the page receives editing behavior information for video data and operation information for clipping the video data; acquiring video data of a user, and providing the video data, or the video data after clipping, to a server as first video data; receiving editing behavior information of the user, and acquiring, according to the editing behavior information, video auxiliary information that the user inputs for the first video data; in response to an editing processing instruction to intelligently edit the first video data using the video auxiliary information, providing the video auxiliary information to the server; receiving and displaying a video clip, provided by the server, that shows the playing style of the video auxiliary information in second video data, or data matching that playing style, where the second video data is obtained by editing the first video data with its corresponding video auxiliary information based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples; and displaying the second video data at a client of the access platform.
Optionally, the second video data is video data generated by determining, based on the learning result, style data of the video auxiliary information corresponding to the first video data and inserting the video auxiliary information into the video stream of the first video data according to the style data.
Optionally, the video auxiliary information corresponding to the first video data is text information; the style data includes at least one of: the text size, the text color, the text font and the display position of the text in the video frame corresponding to the second video data; and/or the style data is determined based on the learning result and at least one of the following feature data of the first video data: video scale, video style, video hue, aspect ratio of the subject visual element, style of the subject visual element, hue of the subject visual element.
Optionally, the method further includes: in response to a style-setting trigger for the input video auxiliary information, presenting operation options for setting the style of the input video auxiliary information. Responding to the editing processing instruction to intelligently edit the first video data using the video auxiliary information includes: receiving a selection (tick) of the intelligent editing option among the operation options, which triggers the editing processing instruction to intelligently edit the first video data using the input video auxiliary information.
Optionally, the method further includes: receiving the display duration of the video auxiliary information corresponding to the first video data and input by the user; the display duration is used for controlling the display duration of the inserted video auxiliary information in the second video data playing; providing the display duration to the server; the second video data provided by the server is the second video data with the display duration of the video auxiliary information set; and/or acquiring first processing state information for editing first video data and/or first processing result data corresponding to the first processing state information; the first processing result data is characteristic data of first video data, and is used for determining style characteristics adopted in the process of editing the first video data by using the video auxiliary information; and displaying the first processing state information and/or the first processing result data.
Optionally, the method further includes: acquiring user characteristics of the user; the second video data is video data generated by determining style data of video auxiliary information corresponding to the first video data based on the user characteristics and inserting the video auxiliary information into a video stream of the first video data according to the style data.
An embodiment of the present application further provides a video data processing method, including: receiving first video data which are required to be edited and provided by a video data editing page and video auxiliary information corresponding to the first video data; editing the first video data by using the video auxiliary information corresponding to the first video data based on the learning result of the video data sample and the characteristic data of the video auxiliary information corresponding to the video data sample to obtain second video data; and providing the video data editing page with a video clip containing the playing style of the video auxiliary information in the second video data or data matched with the playing style.
Optionally, the editing the first video data by using the video auxiliary information corresponding to the first video data based on the learning result of the feature data of the video auxiliary information corresponding to the video data sample and the video data sample to obtain the second video data includes: determining style data of video auxiliary information corresponding to the first video data based on the learning result; inserting the video auxiliary information into a video stream of the first video data according to the style data to generate second video data; and displaying the inserted video auxiliary information according to the style data in the second video data playing.
An embodiment of the present application further provides an electronic device, including: a memory, and a processor; the memory is used for storing a computer program, and the computer program is executed by the processor to execute the method provided by the embodiment of the application.
The embodiment of the present application further provides a storage device in which a computer program is stored; when executed by a processor, the computer program performs the method provided in the embodiments of the application.
Compared with the prior art, the method has the following advantages:
according to the video data processing method and device of the application, the first video data is provided to the server; video auxiliary information input for the first video data is received; in response to an editing processing instruction to intelligently edit the first video data using the video auxiliary information, the auxiliary information is provided to the server; a video clip, provided by the server, that shows the playing style of the video auxiliary information in the second video data, or data matching that playing style, is received and displayed, where the second video data is obtained by editing the first video data with its corresponding auxiliary information based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples; and the second video data is displayed at a client of the access platform. Because the video data is edited based on the automatic learning result over the video samples, intelligent editing is realized, dependence on personal aesthetic judgment in video editing is avoided, and the problem of low quality of edited video data is solved.
According to the other video data processing method and device of the application, first video data to be edited and its corresponding video auxiliary information are received from a video data editing page, and the first video data is edited with that auxiliary information, based on the learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples, to obtain second video data. Intelligently editing the first video data based on the automatic learning result over the video samples avoids dependence on personal aesthetic judgment in video editing and solves the problem of low quality of edited video data.
Drawings
Fig. 1 is a processing flow chart of a video data processing method according to a first embodiment of the present application.
Fig. 1A is a schematic view of a video data editing page according to a first embodiment of the present application.
Fig. 1B is a schematic view of an interaction flow of a video data editing interface according to a first embodiment of the present application.
Fig. 2 is a flowchart of a video captioning process according to a first embodiment of the present application.
Fig. 3 is a processing flow chart of another video data processing method according to a second embodiment of the present application.
Fig. 4 is a schematic diagram of a video data processing apparatus according to a third embodiment of the present application.
Fig. 5 is a schematic diagram of a video data processing apparatus according to a fourth embodiment of the present application.
Fig. 6 is a schematic diagram of an electronic device provided herein.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application, however, can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
The embodiment of the application provides a video data processing method and device, electronic equipment and storage equipment. The embodiment of the application also provides another video data processing method and device, electronic equipment and storage equipment. The following examples are individually set forth.
According to the video data processing method provided by the embodiments of the application, first video data and video auxiliary information input for the first video data are provided to a server, and second video data obtained by the server editing the first video data with that auxiliary information is received. The second video data is generated by the server, based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples, by editing the first video data (the original video) with the video auxiliary information, and can be displayed at a client of the access platform. In this interactive way the server intelligently edits the first video data based on the automatic learning result over the video samples, avoiding dependence on personal aesthetic judgment in video editing and improving the quality of the video data.
A video data processing method according to a first embodiment of the present application is described below with reference to fig. 1 and 2. The video data processing method shown in fig. 1 includes: step S101 to step S106.
Step S101, responding to a video data editing page starting instruction, and displaying the video data editing page, wherein the video data editing page is used for receiving editing behavior information of video data and operation information for performing clipping processing on the video data.
In practical application, a user can launch the video data editing page through its entry. The page provides an interactive way to edit video data, guiding the user to clip the video to be edited and to add video auxiliary information. Editing behaviors include, but are not limited to, adding text information to the video data: the video data and the text to be added are uploaded to the server, the server intelligently determines style data for the text, and the text is added to the video data according to that style data. Editing behaviors may also include adding other auxiliary information to the video, such as a soundtrack, special effects, watermarks, or filters. Clipping a video means cutting and merging it, for example selecting a video segment, cropping and/or rotating video frames, adjusting the aspect ratio of the playback picture, or adding opening or closing segments to the video. One or more interface elements may be presented in the video data editing page, for example elements for editing video data, such as a clip button presented on the interface. As one example, a first interface element for shooting or uploading video data is displayed, and in response to trigger information on that element, the video data to be uploaded to the server for editing is acquired; a typical first interface element is a button for shooting or picking a local video. As another example, a second interface element for triggering the addition of text information to the video data is displayed; in response to trigger information on that element, the text input for the video is acquired and sent to the server, and the server, based on the feature information of the video to which text is to be added, adds the text into the video stream in a style matching those features. In this way, the intelligent layout of the text is determined from the video's feature data and the captioned video is generated according to that layout, reducing dependence on personal aesthetics.
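By way of illustration only (the patent prescribes no concrete API), a minimal client-side sketch of this flow, uploading the clipped video and submitting its text for intelligent layout, might look as follows in Python; the server URL, endpoint paths, and field names are all assumptions.

    import requests  # assumed HTTP client; every endpoint and field below is hypothetical

    SERVER = "https://example-server/api"

    def upload_first_video(path: str) -> str:
        """Upload the (clipped) source video as the first video data; returns a video id."""
        with open(path, "rb") as f:
            resp = requests.post(f"{SERVER}/videos", files={"video": f})
        resp.raise_for_status()
        return resp.json()["video_id"]

    def submit_auxiliary_text(video_id: str, text: str) -> None:
        """Send the user's text as video auxiliary information and request intelligent layout."""
        resp = requests.post(
            f"{SERVER}/videos/{video_id}/auxiliary",
            json={"type": "text", "content": text, "layout": "intelligent"},
        )
        resp.raise_for_status()

    if __name__ == "__main__":
        vid = upload_first_video("clip.mp4")
        submit_auxiliary_text(vid, "Fresh skewers, hot off the grill")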
Referring to fig. 1A, the example video data editing page shown includes: a clip control 101a, a text control 102a, a music control 103a, and a filter control 104a. Each control may be implemented as a button, and each button triggers the receipt of different input. For example, clicking the clip button opens a second-level editing page on which the video data can be cut, merged, and so on. As another example, clicking the text button receives text content input by the user, and clicking that entered text content triggers setting of its playing style after it is inserted into the video data.
In this embodiment, the instruction to launch the video editing page may be triggered through the page's entry, with the video data to be edited then selected inside the launched page. Alternatively, the video data to be edited is determined first, and the launch instruction is then triggered through an operation option on that video data: for example, right-clicking, double-clicking, or long-pressing the selected video data displays operation options, and detecting that the edit option is selected launches the video editing page. The order between launching the editing page and selecting the video data to be edited is not limited.
Step S102, video data of a user is obtained, and the video data and/or the clipped video data are provided to a server as first video data.
In this step, the user's video data, or the video data after clipping, is uploaded to the server, and the server automatically processes the first video data when it needs intelligent editing. For example, after shooting is triggered or a source video is selected through the video data editing page, the source video is cut, and a selected segment serves as the first video data to be intelligently edited using video auxiliary information and is uploaded to the server. Multiple source videos may also be merged, with the merged video uploaded as the first video data. Performing the intelligent editing at the server effectively relieves the existing editing process of its over-reliance on personal aesthetic judgment and reduces the cost of video editing.
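As a sketch of the clipping step only (the patent does not name a tool), the cut-and-merge operations could be performed with ffmpeg before upload; the file paths and timings below are assumptions.

    import os
    import subprocess
    import tempfile

    def cut_segment(src: str, start: float, duration: float, dst: str) -> None:
        """Select one segment of the source video (stream copy, no re-encode)."""
        subprocess.run(
            ["ffmpeg", "-y", "-ss", str(start), "-t", str(duration),
             "-i", src, "-c", "copy", dst],
            check=True,
        )

    def merge_segments(segments: list[str], dst: str) -> None:
        """Merge several source clips into one first-video-data file (concat demuxer)."""
        with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
            for seg in segments:
                f.write(f"file '{os.path.abspath(seg)}'\n")
            list_path = f.name
        subprocess.run(
            ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
             "-i", list_path, "-c", "copy", dst],
            check=True,
        )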
In this embodiment, the time at which the first video data is uploaded to the server is not limited. It can be uploaded before intelligent editing, with the user then selecting already-uploaded first video data to edit. It can also be uploaded according to the editing behavior information: for example, after the user shoots a video, or selects one stored locally on the terminal, the video is clipped and the upload is then triggered. As another example, after shooting or selecting a video, the user inputs video auxiliary information for it, such as text to be added; when selection of the intelligent style for that auxiliary information is detected, the video and its auxiliary information are uploaded to the server, which sets the intelligent style. Because the upload time of the first video data and its auxiliary information is unrestricted, the user is given more room to operate.
Step S103, receiving the editing behavior information of the user, and acquiring the video auxiliary information corresponding to the first video data, which is input by the user, according to the editing behavior information.
This step determines the video auxiliary information input by the user according to the user's editing behavior. The video auxiliary information is information that needs to be added to the first video data and displayed. Preferably, the video auxiliary information corresponding to the first video data is text information: for example, the first video data is captioned, that is, text information is added, and the added text is displayed while the captioned first video data plays.
In this embodiment, the method further includes: displaying a second interface element for triggering the addition of the text information into the first video data on the video editing page; and responding to the trigger information of the second interface element, and acquiring the text information input aiming at the first video data needing to be added with the text information. In the subsequent step, the server automatically sets style data of the text information based on the characteristic information of the first video data, and inserts the text information into the video stream of the first video data according to the style data to generate second video data.
Step S104, responding to an editing processing instruction for intelligently editing the first video data by using the video auxiliary information corresponding to the first video data, and providing the video auxiliary information corresponding to the first video data to the server.
In this embodiment, the method further includes displaying, on the video editing page, interface elements for layout-editing the video data, with which the user can layout-edit the first video data. Specifically: in response to a style-setting trigger for the input video auxiliary information, operation options for setting the style of that auxiliary information are presented; responding to the editing processing instruction to intelligently edit the first video data then includes receiving a selection (tick) of the intelligent editing option among those options, which triggers the instruction to intelligently edit the first video data using the input video auxiliary information. Intelligent editing means that the server automatically sets the style data of the video auxiliary information to be displayed in the first video data.
Step S105, receiving and displaying a video clip, provided by the server, that shows the playing style of the video auxiliary information in the second video data, or data matching that playing style; the second video data is obtained by editing the first video data with its corresponding video auxiliary information, based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples.
This step displays the playing style that the server automatically generates for the video auxiliary information, taking the first video data as the initial video. Either the video clip of the second video data showing that playing style (and/or the second video data itself) can be received and displayed, or data matching the playing style can be obtained and the playback effect of inserting the video auxiliary information into the first video data rendered.
In this embodiment, the second video data is specifically video data generated by determining, based on the learning result, style data for the video auxiliary information corresponding to the first video data, and inserting the video auxiliary information into the video stream of the first video data according to that style data. Preferably, the video auxiliary information is text information, and the style data includes at least one of: text size, text color, text font, and the display position of the text in the video frames of the second video data. For example, the text in the video auxiliary information is added as subtitles to the first video data according to the style data to obtain the second video data, and the subtitles are presented according to the style data while the second video data plays. Further, the style data is determined based on the learning result and at least one of the following feature data of the first video data: video aspect ratio, video style, video hue, and the proportion, style, and hue of the subject visual element. In implementation, the server may extract the feature data of the first video data and, according to the learning result, determine from those features the style data of the video auxiliary information to be inserted.
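For illustration, and assuming feature extraction has already produced a dictionary of the features listed above, the style decision can be sketched as a mapping from feature data to style data; the lookup tables stand in for the model learned from video samples, and every value in them is an assumption.

    from dataclasses import dataclass

    @dataclass
    class StyleData:
        font: str
        size_px: int
        color: str                  # hex color
        position: tuple[int, int]   # (x, y) anchor inside the video frame

    # Hypothetical lookups standing in for the learned model.
    FONT_BY_STYLE = {"fresh": "light-sans", "comedic": "cartoon", "default": "sans"}
    COLOR_BY_HUE = {"warm": "#FFFFFF", "cool": "#FFD700", "default": "#FFFFFF"}

    def decide_style(features: dict) -> StyleData:
        """Map feature data of the first video data (video style, hue, frame size,
        subject element box) to style data for the auxiliary text."""
        w, h = features["frame_size"]
        font = FONT_BY_STYLE.get(features.get("video_style"), FONT_BY_STYLE["default"])
        color = COLOR_BY_HUE.get(features.get("video_hue"), COLOR_BY_HUE["default"])
        size = max(16, h // 18)                  # text size scales with frame height
        y = h - 2 * size                         # default placement: near the bottom edge
        box = features.get("subject_box")        # (x, y, w, h) of the subject visual element
        if box and box[1] + box[3] > y:          # subject occupies the bottom region,
            y = max(size, box[1] - 2 * size)     # so lift the text above the subject
        return StyleData(font=font, size_px=size, color=color, position=(w // 2, y))

For instance, decide_style({"frame_size": (720, 1280), "video_style": "fresh", "video_hue": "warm"}) yields a bottom-anchored, white, light-sans caption sized to the frame.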
In this embodiment, the method further includes setting the display duration of the video auxiliary information corresponding to the first video data. Specifically: receiving the display duration input by the user for the video auxiliary information, the display duration controlling how long the inserted auxiliary information is shown while the second video data plays; and providing the display duration to the server, so that the second video data provided by the server has the display duration of the video auxiliary information set. Correspondingly, receiving and displaying the second video data provided by the server includes receiving second video data in which that display duration is set, and displaying the second video data at the client of the access platform includes playing, at the client, the second video data with the display duration set. For example, the video auxiliary information corresponding to the first video data is caption content; correspondingly, in the second video data played at the client of the access platform, the inserted caption is shown according to its display duration and its style data.
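A minimal sketch of how the client might hand the display duration to the server; the endpoint and field names are assumptions, not part of the patent.

    import requests  # hypothetical endpoint; request shape is an assumption

    def set_display_duration(video_id: str, start_s: float, duration_s: float) -> None:
        """Provide the user-entered display duration so the server can set it on the
        video auxiliary information in the second video data."""
        requests.post(
            f"https://example-server/api/videos/{video_id}/auxiliary/display",
            json={"start_s": start_s, "duration_s": duration_s},
        ).raise_for_status()

    set_display_duration("v123", start_s=1.0, duration_s=4.0)  # caption shown for 4 s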
In this embodiment, the method further includes: information of an editing process for the first video data is presented to a user. The method specifically comprises the following steps: acquiring first processing state information for editing first video data and/or first processing result data corresponding to the first processing state information; the first processing result data is characteristic data of first video data, and is used for determining style characteristics adopted in the process of editing the first video data by using the video auxiliary information; and displaying the first processing state information and/or the first processing result data. The style feature may be a feature represented using style data. Wherein the first processing state information is any one of the following states: the video style analysis state, the video tone analysis state and the video main body visual element position analysis state. For example, the analysis processing of the feature data of the first video data is divided into three stages, a video style analysis stage, a video tone analysis stage, and a position analysis stage. When the video style analysis is carried out, displaying the video style analysis in progress; when video tone analysis is carried out, displaying that the video tone analysis is in progress; when the video subject visual element position analysis is performed, the presentation position analysis is in progress. And displaying the analysis results at the end of each analysis phase: video style result data, video hue result data, position result data. The pattern of video auxiliary information to be inserted into the first video data is determined based on the result data.
In this embodiment, the style data of the video auxiliary information added to the first video data may also be determined according to user characteristics. Specifically: acquiring the user characteristics of the user; the second video data is then video data generated by determining, based on those user characteristics, the style data of the video auxiliary information and inserting the auxiliary information into the video stream of the first video data accordingly. In practice, the style data may be determined from both the learning result and the user characteristics, or from the user characteristics alone without relying on the learning result. The user characteristics may be characteristics of the video uploader, such as industry or tag characteristics. For example, if the uploader is a merchant in the barbecue business, the subtitles in the video auxiliary information can adopt a smoky, fiery flair; if the uploader is an influencer whose tag is the food category, some words in the subtitles may be replaced by corresponding food symbols.
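Continuing the StyleData sketch above, a user-characteristic adjustment might look like the following; the industry and tag values, and the tweaks tied to them, are invented for illustration.

    def adjust_for_user(style: "StyleData", user: dict) -> "StyleData":
        """Bias the automatically chosen style by uploader characteristics
        (StyleData is the dataclass from the earlier sketch)."""
        if user.get("industry") == "barbecue":
            style.color = "#FF6B35"      # warm, smoky accent for a barbecue merchant
        if "food" in user.get("tags", []):
            style.font = "rounded-sans"  # assumed friendlier font for food influencers
        return style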
Referring to FIG. 1B, the interface interaction flow shown includes: S101b, the text control receives text input, and the user can trigger display of operation options for the entered text; for example, clicking the interface's text button triggers receipt of text input, and clicking the entered text content reveals its operation options. The options triggered by clicking the entered text include at least an intelligent-layout selection, and may also include options for editing, setting timing, deleting, and so on. Detecting selection of the intelligent layout triggers the editing processing instruction to intelligently edit the first video data using the video auxiliary information; detecting selection of the timing option triggers setting of the playback start time and play duration of the video auxiliary information in the edited second video data; detecting selection of the edit option triggers editing of the entered text content. S102b, when selection of the intelligent layout is detected, the server determines the playing style and position of the text; during the server's computation, the computation steps are visualized, and the processing state and result of the intelligent editing are displayed, including but not limited to video style analysis, video hue analysis, and shooting-subject position computation. S103b, the automatically set playing style and position are displayed and prompted; they can be shown schematically or via the video clip, and the prompt disappears once its display time reaches the set duration.
And step S106, displaying the second video data on the client side of the access platform.
In this step, the second video data obtained by editing the first video data is pushed to a client of the access platform and displayed there. The client is the client through which users access the platform, in the ordinary sense of browsing videos. Preferably, the second video data with the display duration of the video auxiliary information set is played at the client. For example, the video auxiliary information corresponding to the first video data is caption content; the second video data is generated by determining the caption's style data based on the learning result and inserting the caption into the video stream of the first video data according to that style data; and in the second video data played at the client of the access platform, the inserted caption is shown according to its display duration and its style data.
In this embodiment, a video editor or video creator is guided interactively to intelligently edit video data, and the intelligently edited video is displayed at a client of the access platform; specifically, the second video data can be pushed to the client and played. In practice, the scheme can be applied to a platform, or a platform column, with video at its core. A video editor or creator inputs the video auxiliary information for a shot video, and the server learns, from higher-quality video samples with auxiliary information inserted, video features such as video style (for example fresh, forest-style, or comedic), hue, text color, and font style; it can also identify the subject in a video sample, obtain the video frame proportions, and analyze features such as text size and placement. An intelligent layout for the video auxiliary information is provided based on this learning result. Videos edited according to the intelligent layout are pushed to clients of the access platform, providing an environment for video distribution and operation. By spreading higher-quality videos, the platform improves user traffic and retention.
Referring to fig. 2, the video captioning process shown includes: S201, a user shoots/uploads a video through the video editing client page, adds text to the video, and selects the intelligent layout; the user is the video creator or editor who clips and/or edits the video. S202, the video is uploaded to the server. S203, the style data of the text added to the video, such as its size, color, font, and placement in the video, is set based on the learning result over the video pictures and text style features of the video samples. S204, the style effect of the added text is displayed in the video editing page. S205, the display duration of the text in the video is set. S206, the text is inserted into the video stream according to the display duration and the style data. S207, the captioned video is pushed to the user client for display; the user client is the client of an ordinary user browsing videos.
It should be noted that, where no conflict arises, the features of this embodiment and of the other embodiments of the application may be combined with each other, and labels such as step S101 and step S102 do not require the steps to be executed in that order.
So far, the method provided by this embodiment has been explained. The method provides the first video data to the server; receives video auxiliary information input for the first video data; in response to an editing processing instruction to intelligently edit the first video data using that auxiliary information, provides the auxiliary information to the server; receives and displays a video clip, provided by the server, that shows the playing style of the video auxiliary information in the second video data, or data matching that playing style, where the second video data is obtained by editing the first video data with its corresponding auxiliary information based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples; and displays the second video data at a client of the access platform. In this interactive way the server intelligently edits the first video data based on the automatic learning result over the video samples, avoiding dependence on personal aesthetic judgment in video editing and solving the problem of low quality of edited video data.
Second embodiment: based on the above embodiments, a second embodiment of the present application provides another video data processing method, described below with reference to fig. 3. The video data processing method shown in fig. 3 includes: step S301 to step S303.
Step S301, receiving first video data that needs to be edited and video auxiliary information corresponding to the first video data, which are provided by a video data editing page.
The steps in this embodiment may be performed by the server. The server obtains the first video data and the video auxiliary information, determines the style data of the auxiliary information from the feature data of the first video data, and inserts the auxiliary information into the video stream of the first video data according to that style data to obtain the second video data; the second video data can be pushed to a client of the access platform for playback, during which the inserted auxiliary information is displayed according to the style data. This realizes an intelligent layout scheme that automatically determines the style of video auxiliary information from the video's feature data, for example intelligent captioning of a video. Video data generated by editing according to the intelligent layout scheme reduces dependence on personal aesthetics.
This step can include receiving first video data uploaded by a user through the video editing page. In implementation, the user's level is determined; users at a particular level have the intelligent video editing right. If the user uploading the first video data has that right, the intelligent editing of the subsequent steps is performed.
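The level gate described here reduces to a one-line check; the threshold value is an assumption.

    INTELLIGENT_EDIT_MIN_LEVEL = 2  # assumed level granting the intelligent editing right

    def may_intelligently_edit(user_level: int) -> bool:
        """Only users at or above the qualifying level get intelligent editing."""
        return user_level >= INTELLIGENT_EDIT_MIN_LEVEL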
The video auxiliary information is information that needs to be added to the first video data and displayed. Preferably, the video auxiliary information is text information: for example, the first video data is captioned, that is, text information is added, and the added text is displayed while the captioned first video data plays.
In this embodiment, the method further includes: receiving an editing processing request to intelligently edit the first video data using the video auxiliary information, parsing from the request the information that the style of the video auxiliary information is to be set intelligently, and intelligently editing the first video data using the auxiliary information input by the user. Intelligent editing means that the server automatically sets the style data of the video auxiliary information to be played and displayed in the video obtained by editing the first video data as the original video.
Step S302, based on the learning result for the video data sample and the feature data of the video auxiliary information corresponding to the video data sample, edit the first video data using the video auxiliary information corresponding to the first video data, and obtain second video data.
In this embodiment, the first video data and its corresponding video auxiliary information are analyzed by machine learning, which determines the style data that the auxiliary information will have during playback once inserted into the first video data. Specifically: determining the style data of the video auxiliary information based on the learning result; inserting the auxiliary information into the video stream of the first video data according to that style data to generate the second video data; and displaying the inserted auxiliary information according to the style data while the second video data plays. Preferably, the video auxiliary information is text information, and the style data includes at least one of: text size, text color, text font, and the display position of the text in the video frames of the second video data.
Further, determining the style data of the video auxiliary information based on the learning result includes: acquiring at least one of the following feature data of the first video data: video proportion, video style, video hue, and the proportion, style, and hue of the subject visual element; and determining the style data of the video auxiliary information from the learning result and these features. For example, the server may determine the font of the inserted text from the style of the first video data, adopting a cartoon font for a comedic style, and may determine the text size and placement from the subject elements of the first video data and the video frame proportions.
In this embodiment, the method further includes: receiving the display duration of the video auxiliary information corresponding to the first video data, the display duration controlling how long the inserted auxiliary information is shown while the second video data plays; and adding that display duration into the video stream of the second video data. Further, the second video data with the display duration added is pushed to a client of the access platform, and the inserted auxiliary information, such as the video's caption content, is displayed according to the display duration and the style data while the second video data plays at the client.
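On the server side, one concrete (assumed) way to insert the text into the video stream according to the style data and display duration is ffmpeg's drawtext filter, with an enable window enforcing the duration; the patent itself does not name an insertion mechanism, and the font path is hypothetical.

    import subprocess

    def insert_caption(src: str, dst: str, text: str, style: "StyleData",
                       start_s: float, duration_s: float) -> None:
        """Burn the auxiliary text into the first video data's video stream,
        honoring the style data and display duration (StyleData as sketched earlier)."""
        x, y = style.position
        drawtext = (
            f"drawtext=text='{text}':fontfile=/fonts/{style.font}.ttf:"
            f"fontsize={style.size_px}:fontcolor={style.color}:"
            f"x={x}:y={y}:enable='between(t,{start_s},{start_s + duration_s})'"
        )
        subprocess.run(
            ["ffmpeg", "-y", "-i", src, "-vf", drawtext, "-c:a", "copy", dst],
            check=True,
        )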
Step S303, providing a video clip that shows the playing style of the video auxiliary information in the second video data, or data matching that playing style, to the video data editing page.
This step provides the video data editing page with the playing style that the server automatically generates for the video auxiliary information, taking the first video data as the initial video. The page can be given the video clip of the second video data showing that playing style (and/or the second video data itself) for display, or data matching the playing style from which the playback effect of inserting the auxiliary information into the first video data can be rendered.
Displaying to the user the second video data obtained after the video auxiliary information is inserted into the first video data lets the user see the current editing effect and decide whether further adjustment is needed.
In this embodiment, the method further includes: determining first processing state information of the editing of the first video data and/or first processing result data corresponding to that state information, where the first processing result data is feature data of the first video data and is used to determine the style features adopted when editing the first video data with the video auxiliary information; and providing the first processing state information and/or the first processing result data to the video editing page. For example, the analysis of the feature data of the first video data is divided into three stages: video style analysis, video hue analysis, and position analysis. While each stage runs, a corresponding "in progress" state is displayed, and at the end of each stage its result is displayed: video style result data, video hue result data, and position result data. The style data of the video auxiliary information to be inserted into the first video data is determined from these results. Presenting the analysis process lets the user follow the progress and contents of the intelligent editing.
In this embodiment, the method further includes: and pushing the second video data to a client of an access platform. And pushing second video data obtained by editing the first video data to a client of the access platform, and displaying the second video data on the client. For example, the client displays the inserted video auxiliary information, such as the text content of the video, according to the set display duration and the style data automatically set by the server.
In this embodiment, the style data of the video auxiliary information added to the first video data may also be determined according to user characteristics. Specifically: acquiring the user characteristics of the user, and determining, based on the user characteristics and/or the learning result, the style data of the video auxiliary information; the video data is then generated by inserting the auxiliary information into the video stream of the first video data according to that style data. The user characteristics may be characteristics of the video uploader, such as industry or tag characteristics. For example, if the uploader is a merchant in the barbecue business, the subtitles in the video auxiliary information can adopt a smoky, fiery flair; if the uploader is an influencer whose tag is the food category, some words in the subtitles may be replaced by corresponding food symbols.
Thus, the method provided by the present embodiment is described, which receives first video data that needs to be edited and provided by a video data editing page, and video auxiliary information corresponding to the first video data; and editing the first video data by using the video auxiliary information corresponding to the first video data based on the learning result of the video data sample and the characteristic data of the video auxiliary information corresponding to the video data sample to obtain second video data. The first video data are intelligently edited based on the automatic learning result aiming at the video sample, dependence on personal aesthetic level in video editing is avoided, and the problem of low quality of the video data after editing processing is solved.
A third embodiment corresponds to the first embodiment, and provides a video data processing apparatus. The device is described below with reference to fig. 4. The video data processing apparatus shown in fig. 4 includes:
an edit enabling unit 401, configured to respond to a video data edit page enabling instruction, and display the video data edit page, where the video data edit page is used to receive edit behavior information of video data and operation information for performing clipping processing on the video data;
a video uploading unit 402, configured to acquire video data of a user, and provide the video data or the clipped video data as first video data to a server;
an auxiliary information input unit 403, configured to receive editing behavior information of the user, and obtain video auxiliary information corresponding to the first video data, which is input by the user, according to the editing behavior information;
an intelligent editing unit 404, configured to provide video auxiliary information corresponding to the first video data to the server in response to an editing processing instruction for intelligently editing the first video data using the video auxiliary information corresponding to the first video data;
an editing result receiving unit 405, configured to receive and display a video clip, provided by the server, that shows the playing style of the video auxiliary information in the second video data, or data matching that playing style; the second video data is obtained by editing the first video data with its corresponding video auxiliary information based on a learning result over video data samples and the feature data of the video auxiliary information corresponding to those samples;
a presentation unit 406, configured to present the second video data at a client of the access platform.
Optionally, the second video data is video data generated by determining, based on the learning result, style data of the video auxiliary information corresponding to the first video data and inserting the video auxiliary information into the video stream of the first video data according to the style data.
Optionally, the video auxiliary information corresponding to the first video data is text information; the style data includes at least one of: the text size, the text color, the text font and the display position of the text in the video frame corresponding to the second video data; and/or the style data is determined based on the learning result and at least one of the following feature data of the first video data: video scale, video style, video hue, aspect ratio of the subject visual element, style of the subject visual element, hue of the subject visual element.
Optionally, the intelligent editing unit 404 is specifically configured to: in response to a style-setting trigger for the input video auxiliary information, present operation options for setting the style of the input video auxiliary information; and receive a selection (tick) of the intelligent editing option among the operation options, triggering the editing processing instruction to intelligently edit the first video data using the input video auxiliary information.
Optionally, the intelligent editing unit 404 is specifically configured to: receiving the display duration of the video auxiliary information corresponding to the first video data and input by the user; the display duration is used for controlling the display duration of the inserted video auxiliary information in the second video data playing; providing the display duration to the server; the second video data provided by the server is the second video data with the display duration of the video auxiliary information set; and/or the intelligent editing unit 404 is specifically configured to: acquiring first processing state information for editing first video data and/or first processing result data corresponding to the first processing state information; the first processing result data is characteristic data of first video data, and is used for determining style characteristics adopted in the process of editing the first video data by using the video auxiliary information; and displaying the first processing state information and/or the first processing result data.
Optionally, the intelligent editing unit 404 is specifically configured to acquire user characteristics of the user; in that case, the second video data is generated by determining the style data of the video auxiliary information corresponding to the first video data based on those user characteristics, and inserting the video auxiliary information into the video stream of the first video data according to the style data.
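A short, hypothetical continuation of the same idea: the uploader's characteristics nudge the style data before insertion. All field names here are invented for illustration.

```python
def adjust_style_for_user(style: dict, user: dict) -> dict:
    """Bias the style data toward characteristics of the uploader
    (the user characteristic of this embodiment)."""
    if user.get("content_category") == "food":
        style["text_color"] = "#FFD54F"  # warmer accent for food videos
    if user.get("typical_video_style") == "minimalist":
        style["text_font"] = "light-sans"
        style["text_size"] = min(style.get("text_size", 48), 40)
    return style
```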
Corresponding to the second embodiment, a fourth embodiment provides another video data processing apparatus, described below with reference to fig. 5. The video data processing apparatus shown in fig. 5 includes:
a video and auxiliary information receiving unit 501, configured to receive, from a video data editing page, first video data to be edited and the video auxiliary information corresponding to the first video data;
an editing unit 502, configured to edit the first video data using the video auxiliary information corresponding to the first video data, based on a learning result for video data samples and the feature data of the video auxiliary information corresponding to those samples, to obtain second video data;
an output unit 503, configured to provide, to the video data editing page, a video clip containing the play style of the video auxiliary information in the second video data, or data matching that play style.
Optionally, the editing unit 502 is specifically configured to: determine, based on the learning result, style data for the video auxiliary information corresponding to the first video data; and insert the video auxiliary information into the video stream of the first video data according to that style data to generate the second video data; the inserted video auxiliary information is then displayed according to the style data during playback of the second video data.
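The embodiments do not fix a particular insertion mechanism. One concrete way to realize "inserting the video auxiliary information into the video stream according to the style data" is to burn the text in with ffmpeg's drawtext filter, mapping the style fields onto drawtext options and the display duration onto its enable expression. The sketch below assumes style keys text_size, text_color, and font_file; it is an assumption-laden example, not the claimed method itself.

```python
import subprocess

def insert_auxiliary_text(src: str, dst: str, text: str, style: dict,
                          start: float, duration: float) -> None:
    """Re-encode src into dst with the auxiliary text drawn over the
    video stream for the given time window."""
    draw = (
        f"drawtext=text='{text}'"
        f":fontsize={style['text_size']}"
        f":fontcolor={style['text_color']}"
        f":fontfile={style['font_file']}"
        ":x=(w-text_w)/2:y=h-text_h-40"  # horizontally centered, near bottom
        f":enable='between(t,{start},{start + duration})'"  # display duration
    )
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-vf", draw, "-c:a", "copy", dst],
        check=True,
    )

# Example: show the text for 5 seconds starting at t = 2 s.
# insert_auxiliary_text("first.mp4", "second.mp4", "Fresh out of the oven",
#                       {"text_size": 48, "text_color": "white",
#                        "font_file": "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"},
#                       2.0, 5.0)
```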
Based on the above embodiments, a fifth embodiment of the present application provides an electronic device; for related parts, refer to the corresponding descriptions of the above embodiments. Referring to fig. 6, the electronic device includes a memory 601 and a processor 602. The memory stores a computer program which, when executed by the processor, performs the method provided by the embodiments of the present application.
Based on the foregoing embodiments, a sixth embodiment of the present application provides a storage device; for related parts, refer to the corresponding descriptions of the foregoing embodiments. The schematic diagram of the storage device is similar to fig. 6. The storage device stores a computer program which, when executed by a processor, performs the method provided by the embodiments of the present application.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
Although the present application has been described with reference to preferred embodiments, they are not intended to limit it. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of the present application should be determined by the claims that follow.

Claims (10)

1. A method of processing video data, comprising:
in response to an instruction to open a video data editing page, displaying the video data editing page, wherein the video data editing page is used to receive editing behavior information for video data and operation information for clipping the video data;
acquiring video data of a user, and providing the video data or the clipped video data as first video data to a server;
receiving editing behavior information of the user, and acquiring, according to the editing behavior information, the video auxiliary information corresponding to the first video data that the user inputs; in response to an editing processing instruction for intelligently editing the first video data using the video auxiliary information corresponding to the first video data, providing the video auxiliary information corresponding to the first video data to the server;
receiving and displaying a video clip, provided by the server, which contains the play style of the video auxiliary information in second video data, or data matching that play style; wherein the second video data is obtained by editing the first video data with the video auxiliary information corresponding to the first video data, based on a learning result for video data samples and the feature data of the video auxiliary information corresponding to the video data samples;
displaying the second video data at a client of an access platform;
wherein, for the second video data, the style data of the video auxiliary information corresponding to the first video data is further determined based on a user characteristic of the user, the user characteristic being a characteristic of the video uploader.
2. The method of claim 1, wherein
the second video data is video data generated by determining, based on the learning result, style data for the video auxiliary information corresponding to the first video data and inserting the video auxiliary information into a video stream of the first video data according to the style data.
3. The method according to claim 2, wherein the video auxiliary information corresponding to the first video data is text information;
the style data includes at least one of: text size, text color, text font, and the display position of the text within the video frames of the second video data; and/or,
the style data is determined based on the learning result and at least one of the following feature data of the first video data: video scale, video style, video hue, aspect ratio of the subject visual element, style of the subject visual element, and hue of the subject visual element.
4. The method of claim 1, further comprising:
in response to a style-setting trigger for the input video auxiliary information, presenting operation options for setting the style of the input video auxiliary information;
wherein responding to the editing processing instruction for intelligently editing the first video data using the video auxiliary information comprises:
receiving a selection of the intelligent-editing option among the operation options, thereby triggering an editing processing instruction for intelligently editing the first video data using the input video auxiliary information.
5. The method of claim 1, further comprising:
receiving the display duration, input by the user, of the video auxiliary information corresponding to the first video data, wherein the display duration is used to control how long the inserted video auxiliary information is shown during playback of the second video data; and providing the display duration to the server, the second video data provided by the server being second video data with the display duration of the video auxiliary information set; and/or,
acquiring first processing state information for editing the first video data and/or first processing result data corresponding to the first processing state information, wherein the first processing result data is feature data of the first video data and is used to determine the style features adopted when editing the first video data with the video auxiliary information; and displaying the first processing state information and/or the first processing result data.
6. The method of claim 1, further comprising:
acquiring user characteristics of the user;
wherein the second video data is video data generated by determining, based on the user characteristics, style data for the video auxiliary information corresponding to the first video data and inserting the video auxiliary information into a video stream of the first video data according to the style data.
7. A method of processing video data, comprising:
receiving, from a video data editing page, first video data to be edited and video auxiliary information corresponding to the first video data;
editing the first video data using the video auxiliary information corresponding to the first video data, based on a learning result for video data samples and the feature data of the video auxiliary information corresponding to the video data samples, to obtain second video data;
providing, to the video data editing page, a video clip containing the play style of the video auxiliary information in the second video data, or data matching that play style;
wherein, for the second video data, the style data of the video auxiliary information corresponding to the first video data is further determined based on a user characteristic of a user, the user characteristic being a characteristic of the video uploader.
8. The method according to claim 7, wherein editing the first video data using the video auxiliary information corresponding to the first video data, based on the learning result for the video data samples and the feature data of the video auxiliary information corresponding to the video data samples, to obtain the second video data comprises:
determining style data of video auxiliary information corresponding to the first video data based on the learning result;
inserting the video auxiliary information into a video stream of the first video data according to the style data to generate the second video data, wherein the inserted video auxiliary information is displayed according to the style data during playback of the second video data.
9. An electronic device, comprising:
a memory and a processor; the memory is adapted to store a computer program which, when executed by the processor, performs the method of any one of claims 1 to 8.
10. A storage device, characterized in that it stores a computer program which, when executed by a processor, performs the method of any one of claims 1 to 8.
CN202110070879.7A (priority 2021-01-19, filed 2021-01-19): Video data processing method and device, Active, granted as CN112399261B (en)

Priority Applications (1)

CN202110070879.7A (priority date 2021-01-19, filing date 2021-01-19): Video data processing method and device

Publications (2)

CN112399261A (en): published 2021-02-23
CN112399261B (en): published 2021-05-14

Family

ID=74625125

Family Applications (1)

CN202110070879.7A (priority 2021-01-19, filed 2021-01-19): Video data processing method and device, Active

Country Status (1)

CN: CN112399261B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party

CN113343027A * (priority 2021-06-03, published 2021-09-03) Beijing Youzhuju Network Technology Co., Ltd.: Interactive video editing and interactive video display method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party

US9467750B2 * (priority 2013-05-31, published 2016-10-11) Adobe Systems Incorporated: Placing unobtrusive overlays in video content
CN105120336A * (priority 2015-09-23, published 2015-12-02) Lenovo (Beijing) Co., Ltd.: Information processing method and electronic instrument
CN109121009B * (priority 2018-08-17, published 2021-08-27) Baidu Online Network Technology (Beijing) Co., Ltd.: Video processing method, client and server
CN111327968A * (priority 2020-02-27, published 2020-06-23) Beijing Baidu Netcom Science and Technology Co., Ltd.: Short video generation method, short video generation platform, electronic equipment and storage medium

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant