WO2022042157A1 - Method and apparatus for manufacturing video data, and computer device and storage medium - Google Patents


Info

Publication number
WO2022042157A1
Authority
WO
WIPO (PCT)
Prior art keywords
video data
summary information
video
audiovisual
client
Prior art date
Application number
PCT/CN2021/108174
Other languages
French (fr)
Chinese (zh)
Inventor
裴得利
Original Assignee
百果园技术(新加坡)有限公司
裴得利
Priority date
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司, 裴得利
Publication of WO2022042157A1 publication Critical patent/WO2022042157A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/85: Assembly of content; Generation of multimedia applications
    • H04N 21/854: Content authoring
    • H04N 21/8549: Creating video summaries, e.g. movie trailer

Definitions

  • the present application relates to the technical field of multimedia, for example, to a method, apparatus, computer equipment and storage medium for producing video data.
  • the elements added by users to the video data are mostly templates provided by the platform.
  • the platform provides relatively few templates while many users produce video data, which leads to obvious homogeneity among the video data produced with these templates. Therefore, many users manually collect elements to personalize the elements and thereby personalize the video data, for example, by downloading data from the Internet as elements, parsing data out of other video data as elements, and so on.
  • the present application proposes a method, device, computer equipment and storage medium for producing video data, so as to solve the problem of how to reduce the cost of producing video data under the condition of keeping the individuality of video data.
  • the application provides a method for producing video data, including:
  • element summary information is displayed, wherein the element summary information is used to represent audiovisual elements included in the first video data;
  • third video data is collected, and audiovisual elements corresponding to the element summary information are added to the third video data.
  • the present application also provides a method for producing video data, including:
  • the client is configured to display element summary information when the first video data is played, and the element summary information is used to indicate that the first video data contains audiovisual elements;
  • the audiovisual element corresponding to the element summary information is sent to the client, and the client is further configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a device for producing video data, including:
  • a display screen configured to display element summary information when the first video data is played, wherein the element summary information is used to represent audiovisual elements included in the first video data;
  • a touch screen configured to receive a first operation acting on the element summary information
  • a display screen further configured to jointly display video summary information and production controls of second video data in response to the first operation, wherein the second video data includes the audiovisual element;
  • a touch screen further configured to receive a second operation acting on the production control
  • a camera configured to collect third video data in response to the second operation
  • the processor is configured to add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a device for producing video data, including:
  • the first video data sending module is configured to send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information is used to represent the audiovisual elements contained in the first video data;
  • a second video data search module configured to search for the second video data containing the audiovisual element in the case of receiving a request triggered by the client based on the element summary information
  • a video summary information sending module configured to send the video summary information of the second video data to the client, wherein the client is also set to jointly display the video summary information and the production control;
  • the audiovisual element sending module is configured to send the audiovisual element corresponding to the element summary information to the client when receiving the request triggered by the client based on the production control, wherein the client is further configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a computer device, the computer device comprising:
  • one or more processors;
  • memory arranged to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned method for producing video data.
  • the present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the above-mentioned method for producing video data is implemented.
  • FIG. 1 is a flowchart of a method for producing video data according to Embodiment 1 of the present application
  • FIG. 2A is an exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2B is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2C is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2D is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2E is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2F is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 3 is a flowchart of a method for producing video data according to Embodiment 2 of the present application.
  • FIG. 5 is a schematic structural diagram of a multi-task learning model provided in Embodiment 3 of the present application.
  • FIG. 6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application.
  • FIG. 7 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 5 of the present application.
  • FIG. 8 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 6 of the present application.
  • FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application.
  • video platforms need to maintain a good ecology of consumption and production. In terms of consumption, video platforms strive to push video data in line with users' interests and preferences so as to obtain higher consumption time and satisfaction; in terms of production, video platforms should also encourage users to shoot more video data and upload it to the platform for release, enriching the platform's content. Richer content in turn makes it easier for users to obtain video data matching their interests and preferences, forming a virtuous circle.
  • the video recommendation algorithm is mainly aimed at the consumption mechanism.
  • these implicit feedbacks are used to construct the positive and negative samples of the training data; the training data is used to train a ranking model, the ranking model is used to calculate the user's score for the video data, and the video data that best matches the user's interests and preferences is then selected and pushed to the user.
  • the optimization goal of the ranking model has also developed from a single playback duration (or completion rate) to a multi-task learning model that has both consumption indicators such as duration, and satisfaction indicators such as likes, comments and forwarding.
  • the user can reuse the audio-visual elements of interest more concisely when consuming video data, and quickly produce video data.
  • the multi-task learning model is used to introduce a goal of conversion from consumption to production, so as to predict which video data is more likely to arouse users' interest in producing new video data.
  • a factor is added, that is, whether the current video data will arouse users’ interest.
  • This type of video data is preferentially pushed to increase users' willingness to produce, which can guide more users from consumption alone to consumption plus further production, improve the conversion ratio from consumption to production, and thereby enrich the closed ecological loop of video platform content.
  • the video data production device can be implemented by software and/or hardware, and can be configured in computer equipment, for example, mobile terminals (such as mobile phones, tablet computers, etc.), smart wearable devices (such as smart watches, smart glasses, etc.), personal computers, etc. The method includes the following steps:
  • Step 101 When playing the first video data, display element summary information.
  • the operating system of the computer device may include Android, iOS (the mobile operating system developed by Apple), Windows, etc. These operating systems support running a client that plays and produces video data, for example, short video applications, instant messaging tools, online video applications, and so on.
  • the client may request the server to play video data in the form of a Uniform Resource Locators (URL), etc.
  • the video data is referred to as first video data in this embodiment; after the server receives the request, it searches for the first video data in a personalized or non-personalized manner, and sends part or all of the first video data to the client.
  • the first video data is video data that has been produced offline, and the form may include short videos, micro-movies, performance programs, and so on. This embodiment does not limit the form of the first video data.
  • the so-called personalization can refer to the adaptation of the video data to the user currently logged in on the client (represented by an identifier (ID), etc.), that is, video data adapted to the user is screened based on a multi-objective optimization algorithm, a collaborative filtering algorithm, etc.
  • For the matching method, refer to Embodiment 3; this embodiment does not describe the matching method in detail.
  • the video data can be regarded as the first video data.
  • non-personalization means that the screening of video data does not depend on the user currently logged in on the client (indicated by an ID, etc.); instead, video data can be screened based on non-personalized factors such as video quality (combining clarity, play count, likes, comments, etc.) and popularity, and the screened video data is used as the first video data.
  • the server can query one or more pieces of element summary information marked on the first video data, where the element summary information indicates the audiovisual elements contained in the first video data, and send the element summary information of the one or more audiovisual elements to the client.
  • the so-called audiovisual elements can include visual elements (that is, elements that users can see) and audible elements (that is, elements that users can hear), and the form of the audiovisual elements can be set according to actual conditions, such as audio data, video data, beauty special effects, filters, etc.; the form of the audiovisual elements is not limited in this embodiment.
  • the element summary information may include text data, image data, etc., for example, the name of the audio data, the cover of the audio data, the name of the video data, the cover of the video data, the size, the author, the publisher, the number of users, and so on.
  • the element summary information may also carry other information, for example, the ID of the audiovisual element, etc., which is not limited in this embodiment.
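  • As a purely illustrative sketch (not part of the patent), such element summary information might be represented by a data structure along the following lines, with field names assumed for the example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ElementSummaryInfo:
    """Hypothetical structure for the element summary information of one audiovisual element."""
    element_id: str            # identifier (ID) of the audiovisual element
    element_type: str          # e.g. "audio" (audible element) or "video" (visual element)
    name: str                  # e.g. the name of the audio data or video data
    cover_url: Optional[str]   # cover image of the audio/video data
    author: Optional[str] = None
    publisher: Optional[str] = None
    usage_count: int = 0       # number of users who have used this element

# Example: the song shown in the lower corners of the first user interface in FIG. 2A
quiet_night = ElementSummaryInfo(
    element_id="a-123", element_type="audio", name="Quiet Night",
    cover_url="https://example.com/cover.jpg", publisher="Little Red", usage_count=2270)
```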
  • the client can call the video player provided by the operating system to play the first video data, that is, the client generates a first user interface and displays it in the first user interface The picture of the first video data, and the speaker is driven to play the audio of the first video data.
  • the first user interface has a play area, and the play area is used to display a picture of the first video data.
  • the play area is a partial area of the first user interface.
  • the other information of the video data can be displayed in the area outside the play area.
  • the play area is the entire area of the first user interface; at this time, other information for the first video data can be displayed in floating form above this display area.
  • Other information for the first video data may include controls for expressing positive emotions (such as "like"), comment information, controls for sharing, a field for inputting comment information, and the like.
  • components such as VideoView, MediaPlayer, SurfaceView, Vitamio, and JCPlayer can be called to play the first video data.
  • the element summary information can be presented in the first user interface in a first data structure, either in a static manner (such as displaying the name of the audiovisual element as text) or in a dynamic manner (such as rotating the cover of the audiovisual element), so that the element summary information is displayed in the first data structure.
  • the one or more element summary information can be displayed above the play area in a floating form, so that the one or more element summary information is displayed in the first video data on the screen.
  • the picture of the first video data is displayed in the entire area, and the first video data has an audible element, that is, the song "Quiet Night".
  • the lower left corner of the first user interface 210 displays the title of the song and the publisher "Quiet Night-Little Red" (element summary information 211), and the cover of the song (element summary information 212) is displayed in the lower right corner of the first user interface 210.
  • in the first user interface 240, a picture of the first video data is displayed in a partial area, and the first video data has an audible element and a visual element, that is, the song "Exciting" and the video of Xiaogang stepping on a tightrope.
  • the name of the song and the publisher "Exciting-Xiao Ming" (element summary information)
  • the introductory text of the video, which guides the user to use the video to produce new video data
  • the number of users of the video, "Duet with Xiaogang (2.27K)" (element summary information 241)
  • the cover of the song is displayed in the lower right corner of the first user interface 240.
  • Step 102 Receive a first operation acting on the element summary information.
  • the user can trigger, through the human-computer interaction tool provided by the computer device, a first operation on the element summary information corresponding to an audiovisual element, so as to select the audiovisual element represented by that element summary information.
  • the human-computer interaction tools provided by them are different, and correspondingly, the ways of triggering the first operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the first action is not limited.
  • if the human-computer interaction tool provided by the computer device is a touch screen, when a touch operation (such as a click operation, a long-press operation, a hard-press operation, etc.) occurring on a piece of element summary information is detected, it is determined that a first operation acting on the element summary information is received.
  • if the human-computer interaction tool provided by the computer device is an external device, after receiving a key event (such as a single-click event, a double-click event, a long-press event, etc.) sent by the external device and occurring on a piece of element summary information, it is determined that the first operation acting on the element summary information is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 103 In response to the first operation, jointly display video summary information and production controls of the second video data.
  • the server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in a local database of the server.
  • a specific visual element can be detected in the picture of the video data by calling an object detection model with a specific visual element as a target, If the specific visual element is detected, the video data is marked with the specific visual element.
  • the target detection model includes a first-order (One Stage) target detection model and a second-order (Two Stage) target detection model.
  • a target detection model that generates a series of candidate boxes as samples, and then classifies the samples through a convolutional neural network is called a second-order target detection model, for example, a regional convolutional neural network (Region-CNN, R-CNN), Spatial Pyramid Pooling Network (SPP-Net), Fast-RCNN, Faster-RCNN, etc.
  • a target detection model that does not generate candidate frames and directly converts the problem of target frame positioning into a regression problem is called a first-order target detection model, for example, Generalized Congruence Neural Network (GCNN), YOLO ( You Only Look Once), first-order multi-box prediction (Single Shot Mutibox Detector, SSD), etc.
  • the audio contained in the video data can be extracted, and the features of the audio can be extracted; if the features of the audio are the same as or similar to the features of a specific audible element, the video data is marked with that specific audible element.
  • the custom audiovisual element is compared with the original audiovisual elements; if the custom element is the same as or similar to an original element, the video data is marked with that original element; if the custom element is neither the same as nor similar to any original element, the custom element is assigned a new identifier (such as a new ID), and the video data is marked with the custom element (represented by the new identifier).
  • the visual element can be marked for the new video data.
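  • A minimal sketch of this marking flow, assuming hypothetical helpers detect_visual_elements, extract_audio_features and a similarity function (none of these names come from the patent; the detection and matching models themselves are outside the sketch):

```python
def mark_audiovisual_elements(video, known_elements, detect_visual_elements,
                              extract_audio_features, similarity, threshold=0.9):
    """Return the set of element IDs to mark on `video` (illustrative only)."""
    marks = set()
    # 1. Visual elements: run a target detection model over the frames of the video data.
    for element_id in detect_visual_elements(video.frames):
        marks.add(element_id)
    # 2. Audible elements: compare the extracted audio features with features of known elements.
    audio_features = extract_audio_features(video.audio)
    for element in known_elements:
        if element.kind == "audio" and similarity(audio_features, element.features) >= threshold:
            marks.add(element.element_id)
    # 3. Custom elements: if nothing matched a known element, assign a new identifier
    #    (simplified stand-in for the comparison of custom elements described above).
    if not marks:
        marks.add(f"custom-{video.video_id}")
    return marks
```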
  • the client can send a request carrying the identifier (such as an ID) of the audiovisual element to the server, requesting the server to search for video data containing the audiovisual element (represented by the ID, etc.).
  • when the server receives the request, it parses the identifier of the audiovisual element from the request, uses the identifier as a search condition in the server's local database, searches for the video data marked with the identifier, and writes the video data into a video collection.
  • the video data is referred to as the second video data in this embodiment
  • the video collection is referred to as the first video collection in this embodiment
  • the server extracts, from the local database, the video summary information (such as the cover, name, producer, etc.) of the second video data in the first video set, and sends the video summary information to the client.
  • the so-called mark means that the video data contains the audiovisual element corresponding to the mark, that is, the plurality of second video data contains the same audiovisual element.
  • the second video data may be sorted according to a preset sorting method, and each time the top n (n is a positive integer) second video data of the sorting are selected and sent to the client.
  • the sorting method may include non-personalized sorting methods such as descending sorting according to video quality, descending sorting according to video popularity, etc., so as to reduce the processing complexity and improve the processing speed.
  • a personalized sorting manner such as collaborative filtering may also be used, which is not limited in this embodiment.
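  • The recall-and-sort step might look roughly as follows; this is a sketch under assumptions (the field names and the concrete quality metric are made up; the patent only states that a non-personalized descending sort by video quality or popularity, with top-n paging, may be used):

```python
def recall_second_videos(db, element_id, page, n=10, sort_key="quality"):
    """Search the local database for videos marked with `element_id` and return one top-n page."""
    # First video set: all second video data marked with the requested audiovisual element.
    first_video_set = [v for v in db if element_id in v["marked_elements"]]
    # Non-personalized descending sort, e.g. by video quality or popularity.
    first_video_set.sort(key=lambda v: v[sort_key], reverse=True)
    # Return the video summary information for the requested page of n items.
    start = page * n
    return [
        {"video_id": v["video_id"], "cover": v["cover"], "name": v["name"], "producer": v["producer"]}
        for v in first_video_set[start:start + n]
    ]
```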
  • video summary information of the second video data sent by the server is received and cached locally on the client.
  • a second user interface is generated.
  • the element summary information includes element image data (such as the cover of audio data, the thumbnail of video data, etc.)
  • the element image can be displayed in the form of background data, that is, the image data in the element summary information is set as the background, and when setting, the image data can be blurred.
  • one or more information areas may be displayed in a waterfall or the like at a position below the element summary information, wherein the area of the information area matches the type of audiovisual element, that is, according to the type of audiovisual element Sets the size of the information area.
  • if the type of the audiovisual element is a visual element, an area whose size is a first value can be set as a first area, and the first area is displayed as an information area; if the type of the audiovisual element is an audible element, an area whose size is a second value can be set as a second area, and the second area is displayed as an information area, where the first value is greater than the second value, that is, the area of the first area is larger than the area of the second area. The display area is enlarged for visual elements, so that visual elements retain more detail when displayed and users can browse them more clearly.
  • when second video data is recalled into a first video set for a visual element, the video summary information of that second video data is displayed in the first area; when second video data is recalled into another first video set for an audible element, the video summary information of that second video data is displayed in the second area.
  • the video summary information of the second video data in the first video set is sequentially loaded into the plurality of information areas, so that the video summary information of the second video data is displayed in the information area.
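  • The sizing rule above amounts to a simple conditional; a trivial sketch (the concrete pixel values are invented purely for illustration):

```python
def information_area_size(element_type):
    """Pick the information-area size by audiovisual element type (illustrative values only)."""
    FIRST_VALUE = (360, 480)   # larger area for visual elements, keeps more detail visible
    SECOND_VALUE = (180, 240)  # smaller area for audible elements
    return FIRST_VALUE if element_type == "video" else SECOND_VALUE
```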
  • the production control is displayed above the information areas in a floating manner, and the user can trigger operations such as a sliding operation or a page-turning operation on the element summary information corresponding to the audiovisual element through the human-computer interaction tool provided by the computer device, so that the second user interface switches to display the video summary information of other second video data.
  • the position of the video summary information of the second video data changes, but the production control keeps its position and does not change with operations such as sliding and page turning.
  • the client can continue to request other second video data in the first video set from the server and display them in the second user interface, until all of the other second video data in the first video set have been requested.
  • if the user triggers a click operation (the first operation) on the title of the song and the publisher "Quiet Night-Little Red" (element summary information 211) displayed in the lower left corner of the first user interface 210, or triggers a click operation (the first operation) on the cover of the song (element summary information 212) displayed in the lower right corner of the first user interface 210, the second user interface 220 shown in FIG. 2B is displayed.
  • the cover of the song is blurred and set as the background, on which the cover of the song, the name of the song, the publisher of the song and the number of users of the song are displayed centrally, and nine smaller information areas are displayed.
  • the video summary information of the second video data corresponding to the song is loaded in sequence into these areas.
  • the second video data ranked third (i.e., "No.3"), in addition to the song, also includes audiovisual elements from other video data; since the user selected a song, the video summary information of this third-ranked ("No.3") second video data is displayed in a smaller information area.
  • the cover of the video is blurred and set as a background, on which the cover, the producer, and the background of the video are collectively displayed.
  • Step 104 Receive a second operation acting on the production control.
  • the user can trigger, through the human-computer interaction tool provided by the computer device, the second operation on the current production control, so as to produce new video data including the audiovisual element.
  • the provided human-computer interaction tools are different, and correspondingly, the ways of triggering the second operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the second action is not limited.
  • if the human-computer interaction tool provided by the computer device is a touch screen, when a touch operation (such as a click operation, a long-press operation, a hard-press operation, etc.) occurring on the production control is detected, it is determined that the second operation acting on the production control is received.
  • if the human-computer interaction tool provided by the computer device is an external device, when a key event (such as a single-click event, a double-click event, a long-press event, etc.) sent by the external device and occurring on the production control is received, it is determined that the second operation acting on the production control is received.
  • the external device includes but is not limited to a mouse, a remote control, and the like.
  • Step 105 In response to the second operation, collect third video data, and add audiovisual elements corresponding to the element summary information to the third video data.
  • the client sends a request for downloading the audiovisual element (represented by an ID, etc.) to the server in response to the user triggering the second operation on the production control; after receiving the request, the server searches for the independent audiovisual element (represented by an ID, etc.) and sends the audiovisual element to the client.
  • the so-called independent can mean that the audiovisual element is an independent file and does not depend on the first video data and the second video data.
  • the format of the audiovisual element (such as resolution, sampling rate, size, etc.) conforms to the production specification, and the client can directly use the audiovisual element to produce new video data.
  • the client can generate a third user interface, generate a control for producing video data in the third user interface, call the camera of the computer device to preview video data in the third user interface, and, upon receiving an operation on the control, collect video data; for convenience of distinction, this video data is referred to as third video data in this embodiment.
  • the audiovisual elements corresponding to the element summary information are added to the third video data as the produced material.
  • the third video data is kept synchronized on the time axis with the audiovisual element.
  • the audiovisual element is started to be played, so that the user can preview the effect of adding the audiovisual element.
  • the collection of the third video data may be stopped.
  • the collection of the third video data may also be continued, which is not limited in this embodiment.
  • if the audiovisual element includes audio data, then in this example the audio data starts to be played at the same time as the third video data is collected, so that the audio data corresponding to the element summary information is set as the background music of the third video data.
  • if the audiovisual element includes video data, then for convenience of distinction, this video data may be referred to as fourth video data in this embodiment.
  • the fourth video data is played at the same time when the third video data is collected.
  • the third video data is displayed on the left and the fourth video data on the right, or the third video data is displayed on the right and the fourth video data on the left, or the third video data and the fourth video data are displayed in picture-in-picture form, so that the fourth video data corresponding to the element summary information and the third video data are synthesized in a split-screen manner.
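  • The two synthesis modes could, for example, be realized by invoking ffmpeg from the client or from a transcoding service; the following is only a sketch of one possible realization under that assumption, not the patent's implementation:

```python
import subprocess

def set_background_music(third_video, audio_element, out_path):
    """Use the audio element as the soundtrack of the collected third video (background music)."""
    subprocess.run([
        "ffmpeg", "-y", "-i", third_video, "-i", audio_element,
        "-map", "0:v", "-map", "1:a", "-c:v", "copy", "-shortest", out_path,
    ], check=True)

def split_screen(third_video, fourth_video, out_path):
    """Place the collected third video on the left and the fourth video on the right.
    Assumes both inputs have the same frame height (required by hstack)."""
    subprocess.run([
        "ffmpeg", "-y", "-i", third_video, "-i", fourth_video,
        "-filter_complex", "[0:v][1:v]hstack=inputs=2[v]",
        "-map", "[v]", "-map", "0:a?", out_path,
    ], check=True)
```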
  • As shown in FIG. 2E, if the user triggers a click operation (the second operation) on the "Join" production control 251 displayed at the bottom of the second user interface 250, then, as shown in FIG. 2F, the camera is called to preview in the third user interface 260; when the confirmation operation triggered through the circular control is received, the third video data is collected and displayed on the left, and the video of Xiaogang stepping on the tightrope is played on the right, so that the video of Xiaogang stepping on the tightrope is merged with the third video data.
  • the third video data can be sent to the server; the server receives the third video data sent by the client and marks the third video data with the audiovisual element (represented by an ID, etc.); after the marking is completed, the server publishes the third video data, so the client publishes the third video data marked with the audiovisual element, and other clients can download the third video data from the server for playback and browsing by their users.
  • In this embodiment, when the first video data is played, element summary information is displayed, where the element summary information indicates the audiovisual elements contained in the first video data; a first operation acting on the element summary information is received; in response to the first operation, video summary information and production controls of second video data are jointly displayed, where the second video data includes the audiovisual element; a second operation acting on the production control is received; in response to the second operation, third video data is collected, and the audiovisual element corresponding to the element summary information is added to the third video data.
  • users can use the audiovisual elements contained in the existing video data to create new video data.
  • the audiovisual elements do not depend on the template of the system, and the channels are diversified, which can maintain the individualization of the audiovisual elements.
  • the audiovisual elements provided by the system can ensure that the format of the audiovisual elements conforms to the production specifications, so that they can be directly used to produce new video data, avoiding the need for users to revise the elements with professional applications, which greatly lowers the technical threshold, reduces the time consumed, and thus reduces the cost of producing video data.
  • Step 301 When playing the first video data, display element summary information.
  • the element summary information indicates audiovisual elements contained in the first video data.
  • Step 302 Receive a first operation acting on the element summary information.
  • Step 303 In response to the first operation, jointly display video summary information and production controls of the second video data.
  • the second video data contains the audiovisual element.
  • Step 304 Receive a third operation acting on the video summary information.
  • the user can trigger, through a human-computer interaction tool provided by the computer device, a third operation on the corresponding video summary information, thereby selecting the second video data to which the video summary information belongs.
  • the provided human-computer interaction tools are different, and correspondingly, the ways of triggering the third operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the third action is not limited.
  • if the human-computer interaction tool provided by the computer device is a touch screen, when a touch operation (such as a click operation, a long-press operation, a hard-press operation, etc.) occurring on a piece of video summary information is detected, it is determined that a third operation acting on the video summary information is received.
  • if the human-computer interaction tool provided by the computer device is an external device, after receiving a key event (such as a single-click event, a double-click event, a long-press event, etc.) sent by the external device and occurring on a piece of video summary information, it is determined that a third operation acting on the video summary information is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 305 In response to the third operation, play the second video data to which the video summary information belongs.
  • the client may request the server to play the second video data in the form of a URL (carrying the identifier of the second video data, such as ID, etc.), and after the server receives the request , part or all of the second video data can be sent to the client.
  • the client can call the video player provided by the operating system to play the second video data, that is, the client generates a first user interface and displays it in the first user interface A picture of the second video data, and driving a speaker to play the audio of the second video data.
  • the second video data that the user may like can be pushed in a concentrated manner, which reduces the user's operations of searching for similar video data through keywords, page turning, etc., and reduces the occupation of resources such as processor resources, memory resources and bandwidth resources caused by searching for similar video data.
  • Step 306 Receive a fourth operation acting on the first video data.
  • the user can trigger, through a human-computer interaction tool provided by the computer device, a fourth operation on the first video data, so that the first user interface switches to play other first video data.
  • the human-computer interaction tools provided by them are different, and accordingly, the ways of triggering the fourth operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the fourth operation is not limited.
  • if the human-computer interaction tool provided by the computer device is a touch screen, when the touch screen detects a touch operation (such as a sliding operation, etc.) occurring in the blank area of the first user interface (the area other than controls, element summary information and other operable data), it is determined that a fourth operation acting on the first video data is received.
  • if the human-computer interaction tool provided by the computer device is an external device, after receiving a key event (such as a drag event, etc.) sent by the external device and occurring in the blank area of the first user interface (the area other than controls, element summary information and other operable data), it is determined that a fourth operation acting on the first video data is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 307 In response to the fourth operation, play other first video data adapted to the current user, or other first video data including other audiovisual elements.
  • in response to the fourth operation triggered by the user to switch the first video data, the client can request the server, in the form of a URL, etc., to play other first video data adapted to the current user; after receiving the request, the server can send part or all of the other first video data adapted to the current user to the client.
  • the client can call the video player provided by the operating system to play the other first video data adapted to the current user, that is, in the first user interface, the picture is switched to display the other first video data adapted to the current user, and the speaker is driven to switch to playing the audio of the other first video data adapted to the current user.
  • if the currently playing first video data is non-personalized pushed video data and contains other audiovisual elements, the other first video data is video data in the video set corresponding to those other audiovisual elements, that is, the video set whose video data is marked as containing the other audiovisual elements; for convenience of distinction, this video set is referred to as the second video set in this embodiment.
  • the client may request the server to play other first video data in the second video set, and after receiving the request, the server may send the first video to the client. Part or all of the other first video data in the second video set.
  • the client can call the video player provided by the operating system to play the other first video data in the second video set, that is, in the first user interface, the picture is switched to display the other first video data in the second video set, and the speaker is driven to switch to playing the audio of the other first video data in the second video set.
  • the user can trigger a return operation for the return control in the first user interface through a touch operation or the like
  • the client receives the return operation acting on the return control, and, in response to the return operation, displays the second user interface, Video summary information of the first video data in the second video set is displayed in the second user interface.
  • the types of the first video data are distinguished, and other first video data adapted to the user and other first video data in the second video set are respectively pushed for personalized and non-personalized service scenarios, which can ensure the accuracy of switching the first video data and meet the requirements of business scenarios.
  • the video data production device can be implemented by software and/or hardware, and can be configured in computer equipment, such as a server, a workstation, etc., and includes the following steps:
  • Step 401 Send the first video data to the client.
  • the operating system of the computer device may include Unix, Linux, Windows Server, Netware, etc. These operating systems support running a server, and the server is configured to provide video services to multiple clients, such as pushing video data, publishing video data, and so on.
  • the server may determine the first video data in a personalized or non-personalized manner, and send part or all of the first video data to the client.
  • the server may send element summary information of one or more audiovisual elements to the client, where the element summary information represents the first video The audiovisual elements that the data contains.
  • the client is configured to display the one or more element summary information on the first user interface when playing the first video data.
  • step 401 may include the following steps:
  • Step 4011 Acquire historical data recorded when the user browses the video data.
  • the user browses the video data on the client side, and the server side records the information during the browsing process in a log file and stores it in the database.
  • the server can query, in the log file of the database, the historical data recorded when the user browses video data, in order to subsequently screen video data suitable for the user.
  • Step 4012 Extract features from historical data as behavior features.
  • the behavioral characteristic may include at least one of the following:
  • characteristics of the user may be collected from the historical data as user features.
  • the user characteristics include characteristics inherent to the user, eg, ID (ie, User ID (UID)), gender, age, country, and the like.
  • the user features include user dynamic features, for example, viewing behaviors in a recent period of time, interaction behaviors in a recent period of time, preferences for multiple types of video data in a recent period of time, and so on.
  • features of video data may be collected from historical data as video features.
  • the video features include features inherent to the video data, such as ID (ie, Video ID (VID)), length, tag, UID of the photographer (the user who made the video data), and the like.
  • the video features include dynamic features of the video data, for example, the number of times pushed to users in a recent period of time, the number of times it was viewed in a recent period of time, the number of times it was liked in a recent period of time, and so on.
  • the characteristics of the environment where the user browses the video data can be collected from the historical data, as the context characteristics, for example, the time of requesting to browse the video data, the location of the request to browse the video data, the network status of the request to browse the video data, etc.
  • At least two of the user feature, the video feature, and the context feature may be combined to obtain a cross feature, thereby increasing the dimension of the feature.
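  • A sketch of assembling the behavior features described above (the concrete field names are illustrative assumptions; the patent only enumerates the feature categories):

```python
def build_behavior_features(history, user, video, context):
    """Combine user, video and context features, plus simple cross features, into one dict."""
    user_features = {
        "uid": user["uid"], "gender": user["gender"], "age": user["age"], "country": user["country"],
        "recent_watch_count": len(history["watched"]),          # dynamic user feature
    }
    video_features = {
        "vid": video["vid"], "length": video["length"], "tag": video["tag"],
        "author_uid": video["author_uid"], "recent_push_count": video["recent_push_count"],
    }
    context_features = {
        "request_time": context["time"], "location": context["location"], "network": context["network"],
    }
    # Cross features: combine at least two of the above to increase the feature dimension.
    cross_features = {
        "uid_x_tag": f'{user["uid"]}_{video["tag"]}',
        "country_x_network": f'{user["country"]}_{context["network"]}',
    }
    return {**user_features, **video_features, **context_features, **cross_features}
```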
  • Step 4013 Use the behavior feature to predict multiple probabilities corresponding to the user performing multiple target behaviors on the video data.
  • a multi-task learning model can be set, and the multi-task learning model can be used to calculate the probabilities that the user performs multiple (two or more) target behaviors (such as clicking, playing duration, liking, commenting, sharing, favoriting, following, etc.) on the video data; the probability is expressed as follows:
  • where u_i represents the i-th user, v_j represents the j-th video data, and t is the current moment, so the probability is abbreviated as p_{i,j}.
  • a target behavior of conversion from consumption to production is added, that is, requesting other video data containing the same audiovisual elements as the video data in order to produce new video data; for this target behavior, reference may be made to steps 101-105.
  • video data whose audiovisual elements were triggered can be set as positive samples, and the negative samples are video data that has been viewed without the audiovisual elements being triggered.
  • the multi-task learning model can be a neural network, such as a deep neural network (Deep Neural Networks, DNN), etc., or can be other machine learning models, such as a logistic regression (Logistics Regression, LR) model, user click-through rate (Click-Through- Rate, CTR) model, etc., the type of the multi-task learning model is not limited in this embodiment.
  • the multi-task learning model can be trained based on multi-task learning.
  • Multi-task learning is a learning method derived from transfer learning. Multiple goals (such as the target behaviors in this embodiment) are put together to learn from each other; the information shared among related goals (such as the target behaviors in this embodiment) and the noise introduced by unrelated goals can improve the generalization ability of the multi-task learning model to a certain extent.
  • Multi-task learning belongs to the category of transfer learning.
  • the main difference between multi-task learning and transfer learning is that multi-task learning improves the effect of the model through multiple targets (such as the target behaviors in this example), while usual transfer learning uses other targets to improve the learning effect of a single target.
  • a model based on parameter sharing can be used as a multi-task learning model.
  • the multi-task learning model receives the same input (Input), the underlying network shares model parameters, multiple target behaviors (such as Task1, Task2, Task3, Task4, etc.) learn from each other, and the gradients are back-propagated at the same time, which can improve the generalization ability of the multi-task learning model.
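  • A minimal shared-bottom multi-task model of the kind FIG. 5 suggests, sketched in PyTorch; the layer sizes and the particular set of task heads (including the added consumption-to-production head named "produce" here) are assumptions for illustration:

```python
import torch
import torch.nn as nn

class SharedBottomMultiTask(nn.Module):
    """Shared underlying network with one sigmoid head per target behavior."""
    def __init__(self, input_dim, hidden_dim=128,
                 tasks=("click", "duration", "like", "comment", "share", "produce")):
        super().__init__()
        # Shared bottom: all target behaviors share these model parameters.
        self.shared = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One small head per target behavior (Task1, Task2, ...).
        self.heads = nn.ModuleDict({t: nn.Linear(hidden_dim, 1) for t in tasks})

    def forward(self, x):
        h = self.shared(x)
        # Probability p_{i,j} of the user performing each target behavior on the video data.
        return {t: torch.sigmoid(head(h)).squeeze(-1) for t, head in self.heads.items()}

# Gradients of all task losses back-propagate through the shared bottom at the same time, e.g.:
# loss = sum(nn.functional.binary_cross_entropy(preds[t], labels[t]) for t in preds)
```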
  • Step 4014 fuse the multiple probabilities into a quality value of the video data for the user.
  • the quality value of the video data for the user can be evaluated, and the quality value can be used to indicate the degree of the user's preference for the video data under the target dimension.
  • the quality value is positively correlated with the probability, that is, the higher the probability, the greater the quality value, and the lower the probability, the smaller the quality value.
  • multiple probabilities can be fused into a quality value of video data for the user by means of linear fusion, and feature weights are configured for each probability.
  • the larger the feature weight the more important the target behavior is.
  • the product between each probability and the feature weight corresponding to each probability is calculated as the feature value, and the sum of all the feature values is calculated as the quality value of the video data for the user.
  • that is, the quality value can be written as q_{i,j} = Σ_l w_l · p_{i,j,l}, where w_l is the feature weight of the l-th target behavior.
  • Step 4015 If the quality value satisfies the preset recall condition, set the video data to which the quality value belongs as the first video data adapted to the user.
  • recall conditions can be preset, for example, the n (n is a positive integer) highest quality values, quality values greater than a threshold, the top m% (m is a positive number) of quality values, and so on.
  • the video data to which the quality value belongs is set as the first video data adapted to the user, and at this time, the association between the user's identifier and the identifier of the first video data can be recorded.
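  • Steps 4014-4015 reduce to a weighted sum followed by a recall filter; a short sketch, with the weights and the top-n cutoff being made-up values for illustration only:

```python
def quality_value(probabilities, weights):
    """Linear fusion: the sum of each probability times its feature weight, q = sum_l w_l * p_l."""
    return sum(weights[task] * p for task, p in probabilities.items())

def recall_first_videos(scored_videos, n=100):
    """Recall condition: keep the n videos with the highest quality values for this user."""
    return sorted(scored_videos, key=lambda item: item["quality"], reverse=True)[:n]

# Example with hypothetical weights that emphasize the consumption-to-production behavior:
weights = {"click": 0.1, "duration": 0.3, "like": 0.15, "comment": 0.1, "share": 0.1, "produce": 0.25}
probs = {"click": 0.8, "duration": 0.6, "like": 0.2, "comment": 0.05, "share": 0.1, "produce": 0.3}
q = quality_value(probs, weights)
```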
  • Step 4016 Send the first video data to the client.
  • steps 4011 to 4015 can be performed offline.
  • when a user (represented by an ID, etc.) requests video data online, the user's identifier can be used as a search condition to find the identifier of the first video data associated with the user's identifier; the first video data is then found based on that identifier, and the first video data is sent to the client.
  • Step 402 When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
  • when receiving the first operation acting on the element summary information, the client generates a request, sends the request to the server, and requests the server to push the second video data including the audiovisual element corresponding to the element summary information.
  • the server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in the local database of the server.
  • the server can search for the video data marked with the audiovisual element as the second video data, and write the second video data into the first video set.
  • Step 403 Send the video summary information of the second video data to the client.
  • the server extracts the video summary information (such as the cover, name, producer, etc.) of the second video data in the first video set from the local database, and sends the video summary information to the client.
  • the client may be configured to jointly display the video summary information and production controls on the second user interface.
  • Step 404 When a request triggered by the client based on the production control is received, send the audiovisual element corresponding to the element summary information to the client.
  • when receiving the second operation acting on the production control, the client generates a request, sends the request to the server, and requests the server to push the audiovisual element (represented by an ID, etc.) corresponding to the element summary information.
  • after receiving the request, the server searches for the independent audiovisual element (represented by an ID, etc.) corresponding to the element summary information, and sends the audiovisual element to the client.
  • the client may be configured to collect third video data, and add the audiovisual element corresponding to the element summary information to the third video data.
  • the audiovisual element includes audio data
  • the audio data corresponding to the element summary information may be sent to the client, and the client may be configured to set the audio data corresponding to the element summary information as the background music of the third video data.
  • the audiovisual element includes fourth video data
  • the fourth video data corresponding to the element summary information may be sent to the client, and the client may be configured to synthesize, in a split-screen manner, the fourth video data corresponding to the element summary information and the third video data.
  • the client can upload the third video data, and the server can receive the third video data sent by the client and mark the third video data with the audiovisual element (represented by an ID, etc.); after the marking is completed, the third video data is published, so that other clients can browse the third video data.
  • the first video data is sent to the client, and the client is set to display element summary information when the first video data is played, and the element summary information indicates the audiovisual elements contained in the first video data.
  • in the case of receiving a request triggered by the client based on the element summary information, the server searches for the second video data containing the audiovisual element, and sends the video summary information of the second video data to the client.
  • the client terminal is set to jointly display the video summary information and make controls.
  • in the case of receiving a request triggered by the client based on the production control, the server sends the audiovisual element corresponding to the element summary information to the client; the client is configured to collect the third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • users can use the audiovisual elements contained in the existing video data to create new video data.
  • the audiovisual elements do not depend on the template of the system, and the channels are diversified, which can maintain the personalization of the audiovisual elements, thereby ensuring the individuality of the newly produced video data.
  • the system provides audio-visual elements, which can ensure that the format of the audio-visual elements conforms to the production specifications, and can be directly used to produce new video data, avoiding the need for users to use professional applications to revise the elements, greatly reducing the technical threshold, Time consuming is reduced, thereby reducing the cost of producing video data.
  • FIG. 6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application. Based on the foregoing embodiments, this embodiment adds operations of switching the first video data and playing the second video data.
  • the method includes the following steps: step:
  • Step 601 Send the first video data to the client.
  • the client is configured to display element summary information when the first video data is played, where the element summary information indicates audiovisual elements included in the first video data.
  • Step 602 When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
  • Step 603 Send the video summary information of the second video data to the client.
  • the client is set to jointly display the video summary information and the production controls.
  • Step 604 When a request triggered by the client based on the video summary information is received, send the second video data to which the video summary information belongs to the client for playback.
  • when receiving the third operation acting on the video summary information, the client generates a request, sends the request to the server, and requests the server to push the second video data corresponding to the video summary information.
  • the server can search for the second video data corresponding to the video summary information, and send part or all of the second video data to the client.
  • after buffering part or all of the second video data, the client can call the video player provided by the operating system to play the second video data.
  • the second video data that the user may like can be pushed in a concentrated manner, which reduces the user's operations of searching for similar video data through keywords, page turning, etc., and reduces the occupation of resources such as processor resources, memory resources and bandwidth resources caused by searching for similar video data.
  • Step 605 When receiving a request triggered by the client based on the first video data, send other first video data adapted to the user to the client for playback, or send other first video data containing other audiovisual elements to the client for playback.
  • when receiving the fourth operation acting on the first video data, the client generates a request and sends it to the server, requesting the server to push other first video data.
  • when receiving the request, the server identifies the type of the first video data, so as to distinguish between and push different first video data.
  • for the personalized case, the server can send to the client part or all of the data of other first video data adapted to the current user.
  • after buffering part or all of the other first video data adapted to the current user, the client can call the video player provided by the operating system to play that first video data.
  • for the non-personalized case, the server can send to the client part or all of the data of other first video data in the second video set.
  • after buffering part or all of the other first video data in the second video set, the client can call the video player provided by the operating system to play that first video data.
  • in this way, the types of the first video data are distinguished, and other first video data adapted to the user and other first video data in the second video set are respectively pushed for personalized and non-personalized service scenarios, which can ensure that the accuracy of switching the first video data meets the requirements of the business scenarios; a rough sketch of this branching is given below.
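  • The sketch below is illustrative only; the type value, helper names, and return shape are assumptions and not part of the original disclosure.

```python
# Rough sketch of the step 605 branching (assumed type values and helpers).
def recommend_for_user(user_id):
    # Placeholder for a personalized recommendation / ranking-model call.
    return {"video_id": f"rec-for-{user_id}"}

def next_from_second_video_set(set_id):
    # Placeholder for cursor-style iteration over a non-personalized video set.
    return {"video_id": f"{set_id}-next"}

def handle_switch_request(user_id, current_video):
    if current_video.get("type") == "personalized":
        nxt = recommend_for_user(user_id)          # other first video data adapted to the user
    else:
        nxt = next_from_second_video_set(current_video["set_id"])  # from the second video set
    return {"video_id": nxt["video_id"], "chunk": b"buffered segment ..."}

print(handle_switch_request("u1", {"type": "personalized"}))
print(handle_switch_request("u1", {"type": "set", "set_id": "s7"}))
```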
  • FIG. 7 is a structural block diagram of an apparatus for producing video data according to Embodiment 5 of the present application, which may include the following modules:
  • the display screen 701 is configured to display element summary information when the first video data is played, where the element summary information represents the audiovisual element contained in the first video data; the touch screen 702 is configured to receive a first operation acting on the element summary information; the display screen 701 is further configured to, in response to the first operation, jointly display video summary information of second video data and a production control, where the second video data includes the audiovisual element; the touch screen 702 is further configured to receive a second operation acting on the production control; the camera 703 is configured to collect third video data in response to the second operation; and the processor 704 is configured to add the audiovisual element corresponding to the element summary information to the third video data.
  • the apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
  • FIG. 8 is a structural block diagram of an apparatus for producing video data according to Embodiment 6 of the present application, which may include the following modules:
  • the first video data sending module 801 is configured to send the first video data to the client, where the client is configured to display element summary information when the first video data is played, and the element summary information indicates the audiovisual element contained in the first video data;
  • the second video data search module 802 is configured to search for second video data containing the audiovisual element when receiving a request triggered by the client based on the element summary information;
  • the video summary information sending module 803 is configured to send the video summary information of the second video data to the client, where the client is configured to jointly display the video summary information and a production control;
  • the audiovisual element sending module 804 is configured to, when a request triggered by the client based on the production control is received, send the audiovisual element corresponding to the element summary information to the client, where the client is configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
  • FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application.
  • Figure 9 shows a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present application.
  • computer device 12 takes the form of a general-purpose computing device.
  • Components of computer device 12 may include, but are not limited to, one or more processors or processing units 16 , system memory 28 , and a bus 18 connecting various system components including system memory 28 and processing unit 16 .
  • System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32 .
  • Storage system 34 may be configured to read and write to non-removable, non-volatile magnetic media (not shown in Figure 9, commonly referred to as a "hard disk drive").
  • a program/utility 40 having a set (at least one) of program modules 42 may be stored in memory 28, for example.
  • Computer device 12 may also communicate with one or more external devices 14 (eg, keyboard, pointing device, display 24, etc.). Such communication may take place through an input/output (I/O) interface 22 . Also, computer device 12 may communicate with one or more networks (eg, Local Area Network (LAN), Wide Area Network (WAN), and/or public networks such as the Internet) through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18 .
  • the processing unit 16 executes a variety of functional applications and data processing by running the programs stored in the system memory 28, for example, implementing the video data production method provided by the embodiments of the present application.
  • Embodiment 8 of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the method for producing video data described above is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Disclosed are a method and apparatus for manufacturing video data, and a computer device and a storage medium. The method for manufacturing video data comprises: when first video data is played, displaying element digest information, wherein the element digest information is used for representing an audiovisual element included in the first video data; receiving a first operation that acts on the element digest information; in response to the first operation, jointly displaying video digest information of second video data and a manufacturing control, wherein the second video data includes the audiovisual element; receiving a second operation that acts on the manufacturing control; and in response to the second operation, collecting third video data, and adding, to the third video data, the audiovisual element corresponding to the element digest information.

Description

视频数据的制作方法、装置、计算机设备和存储介质Video data production method, device, computer equipment and storage medium
本申请要求在2020年08月31日提交中国专利局、申请号为202010896513.0的中国专利申请的优先权,该申请的全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application with application number 202010896513.0 filed with the China Patent Office on August 31, 2020, the entire contents of which are incorporated herein by reference.
技术领域technical field
本申请涉及多媒体的技术领域,例如涉及一种视频数据的制作方法、装置、计算机设备和存储介质。The present application relates to the technical field of multimedia, for example, to a method, apparatus, computer equipment and storage medium for producing video data.
背景技术Background technique
随着移动终端的广泛普及,用户可以随时随地使用移动终端制作视频数据,如短视频等,并发布到互联网上的平台。With the widespread popularity of mobile terminals, users can use mobile terminals to create video data, such as short videos, anytime, anywhere, and publish them to platforms on the Internet.
在制作视频数据时,用户通常向视频数据中添加多种元素,从而提高视频数据的精彩程度。When producing video data, users usually add various elements to the video data, so as to improve the splendor of the video data.
用户向视频数据添加的元素,在先多为平台提供的模板,但是,平台提供的模板较少,而制作视频数据的用户较多,导致使用这些模板制作的视频数据同质化较为明显,因此,许多用户手动搜集元素,以实现元素的个性化,从而实现视频数据的个性化,例如,从网上下载数据作为元素,从其他视频数据中解析数据作为元素,等等。The elements added by users to the video data are mostly templates provided by the platform. However, the platform provides fewer templates, and more users produce video data, which leads to the obvious homogeneity of the video data produced using these templates. Therefore, , many users manually collect elements to realize element personalization, thereby realizing the personalization of video data, for example, downloading data from the Internet as elements, parsing data from other video data as elements, and so on.
由于该元素的格式(如分辨率、采样率、体积大小等)可能不符合制作的规范,往往需要用户使用专业的应用对元素进行修正,如裁剪、压缩等,技术门槛较高,耗时较多,导致制作视频数据的成本较高。Since the format of the element (such as resolution, sampling rate, size, etc.) may not meet the production specifications, users are often required to use professional applications to correct the element, such as cropping, compression, etc., which requires a high technical threshold and takes a long time. The cost of producing video data is relatively high.
发明内容SUMMARY OF THE INVENTION
本申请提出了一种视频数据的制作方法、装置、计算机设备和存储介质,以解决在保持视频数据个性化的条件下,如何降低制作视频数据的成本的问题。The present application proposes a method, device, computer equipment and storage medium for producing video data, so as to solve the problem of how to reduce the cost of producing video data under the condition of keeping the individuality of video data.
本申请提供了一种视频数据的制作方法,包括:The application provides a method for producing video data, including:
在播放第一视频数据的情况下,显示元素摘要信息,其中,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;In the case of playing the first video data, element summary information is displayed, wherein the element summary information is used to represent audiovisual elements included in the first video data;
接收作用于所述元素摘要信息的第一操作;receiving a first operation acting on the element summary information;
响应于所述第一操作,共同显示第二视频数据的视频摘要信息以及制作控件,其中,所述第二视频数据包含所述视听元素;in response to the first operation, collectively displaying video summary information and production controls for second video data, wherein the second video data includes the audiovisual element;
接收作用于所述制作控件的第二操作;receiving a second operation acting on the production control;
响应于所述第二操作,采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。In response to the second operation, third video data is collected, and audiovisual elements corresponding to the element summary information are added to the third video data.
本申请还提供了一种视频数据的制作方法,包括:The present application also provides a method for producing video data, including:
将第一视频数据发送至客户端,其中,所述客户端设置为在播放所述第一视频数据的情况下,显示元素摘要信息,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;Send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information is used to indicate that the first video data contains audiovisual elements;
在接收到所述客户端基于所述元素摘要信息触发的请求的情况下,查找包含所述视听元素的第二视频数据;in the case of receiving a request triggered by the client based on the element summary information, searching for second video data containing the audiovisual element;
将所述第二视频数据的视频摘要信息发送至所述客户端,其中,所述客户端还设置为共同显示所述视频摘要信息以及制作控件;sending the video summary information of the second video data to the client, wherein the client is further configured to jointly display the video summary information and production controls;
在接收到所述客户端基于所述制作控件触发的请求的情况下,将所述元素摘要信息对应的视听元素发送至所述客户端,所述客户端还设置为采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。In the case of receiving the request triggered by the client based on the production control, the audiovisual element corresponding to the element summary information is sent to the client, and the client is further configured to collect third video data, The audiovisual element corresponding to the element summary information is added to the third video data.
本申请还提供了一种视频数据的制作装置,包括:The present application also provides a device for producing video data, including:
显示屏,设置为在播放第一视频数据的情况下,显示元素摘要信息,其中,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;a display screen, configured to display element summary information when the first video data is played, wherein the element summary information is used to represent audiovisual elements included in the first video data;
触控屏,设置为接收作用于所述元素摘要信息的第一操作;a touch screen, configured to receive a first operation acting on the element summary information;
显示屏,还设置为响应于所述第一操作,共同显示第二视频数据的视频摘要信息以及制作控件,其中,所述第二视频数据包含所述视听元素;a display screen, further configured to jointly display video summary information and production controls of second video data in response to the first operation, wherein the second video data includes the audiovisual element;
触控屏,还设置为接收作用于所述制作控件的第二操作;a touch screen, further configured to receive a second operation acting on the production control;
摄像头,设置为响应于所述第二操作,采集第三视频数据;a camera, configured to collect third video data in response to the second operation;
处理器,设置为将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。The processor is configured to add the audiovisual element corresponding to the element summary information to the third video data.
本申请还提供了一种视频数据的制作装置,包括:The present application also provides a device for producing video data, including:
第一视频数据发送模块,设置为将第一视频数据发送至客户端,其中,所述客户端设置为在播放所述第一视频数据的情况下,显示元素摘要信息,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;The first video data sending module is configured to send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information uses to represent the audiovisual elements contained in the first video data;
第二视频数据查找模块,设置为在接收到所述客户端基于所述元素摘要信息触发的请求的情况下,查找包含所述视听元素的第二视频数据;A second video data search module, configured to search for the second video data containing the audiovisual element in the case of receiving a request triggered by the client based on the element summary information;
视频摘要信息发送模块,设置为将所述第二视频数据的视频摘要信息发送 至所述客户端,其中,所述客户端还设置为共同显示所述视频摘要信息以及制作控件;A video summary information sending module, configured to send the video summary information of the second video data to the client, wherein the client is also set to jointly display the video summary information and the production control;
视听元素发送模块,设置为在接收到所述客户端基于所述制作控件触发的请求的情况下,将所述元素摘要信息对应的视听元素发送至所述客户端,其中,所述客户端还设置为采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。The audiovisual element sending module is configured to send the audiovisual element corresponding to the element summary information to the client when receiving the request triggered by the client based on the production control, wherein the client also further Setting is to collect third video data, and add audiovisual elements corresponding to the element summary information to the third video data.
本申请还提供了一种计算机设备,所述计算机设备包括:The present application also provides a computer device, the computer device comprising:
一个或多个处理器;one or more processors;
存储器,设置为存储一个或多个程序;memory, arranged to store one or more programs;
当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现上述的视频数据的制作方法。When the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned method for producing video data.
本申请还提供了一种计算机可读存储介质,所述计算机可读存储介质上存储计算机程序,所述计算机程序被处理器执行时实现上述的视频数据的制作方法。The present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the above-mentioned method for producing video data is implemented.
附图说明Description of drawings
图1为本申请实施例一提供的一种视频数据的制作方法的流程图;1 is a flowchart of a method for producing video data according to Embodiment 1 of the present application;
图2A为本申请实施例一提供的一种制作视频数据的示例图;FIG. 2A is an exemplary diagram of producing video data according to Embodiment 1 of the present application;
图2B为本申请实施例一提供的另一种制作视频数据的示例图;FIG. 2B is another exemplary diagram of producing video data according to Embodiment 1 of the present application;
图2C为本申请实施例一提供的另一种制作视频数据的示例图;FIG. 2C is another exemplary diagram of producing video data according to Embodiment 1 of the present application;
图2D为本申请实施例一提供的另一种制作视频数据的示例图;FIG. 2D is another exemplary diagram of producing video data according to Embodiment 1 of the present application;
图2E为本申请实施例一提供的另一种制作视频数据的示例图;FIG. 2E is another exemplary diagram of producing video data according to Embodiment 1 of the present application;
图2F为本申请实施例一提供的另一种制作视频数据的示例图;FIG. 2F is another exemplary diagram of producing video data according to Embodiment 1 of the present application;
图3为本申请实施例二提供的一种视频数据的制作方法的流程图;3 is a flowchart of a method for producing video data according to Embodiment 2 of the present application;
图4为本申请实施例三提供的一种视频数据的制作方法的流程图;4 is a flowchart of a method for producing video data according to Embodiment 3 of the present application;
图5为本申请实施例三提供的一种多任务学习模型的结构示意图;5 is a schematic structural diagram of a multi-task learning model provided in Embodiment 3 of the present application;
图6为本申请实施例四提供的一种视频数据的制作方法的流程图;6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application;
图7为本申请实施例五提供的一种视频数据的制作装置的结构示意图;7 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 5 of the present application;
图8为本申请实施例六提供的一种视频数据的制作装置的结构示意图;8 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 6 of the present application;
图9为本申请实施例七提供的一种计算机设备的结构示意图。FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application.
具体实施方式detailed description
下面结合附图和实施例对本申请进行说明。The present application will be described below with reference to the accompanying drawings and embodiments.
一般情况,视频平台需要维护一个良好的消费和生产的生态,即,对于消费方面,视频平台竭力为用户推送符合用户兴趣偏好的视频数据,获取用户更高的消费时长和满意度;对于生产方面,视频平台也要激励用户更多地拍摄视频数据并上传至视频平台发布,丰富视频平台的内容,而更丰富的内容能够使得用户更容易获得符合其兴趣偏好的视频数据,形成良性循环。In general, video platforms need to maintain a good ecology of consumption and production, that is, in terms of consumption, video platforms strive to push video data in line with users’ interests and preferences to obtain users’ higher consumption time and satisfaction; in terms of production , the video platform should also encourage users to shoot more video data and upload it to the video platform for release, enrich the content of the video platform, and richer content can make it easier for users to obtain video data that meets their interests and preferences, forming a virtuous circle.
视频推荐算法主要是针对消费的机制,通过记录用户对视频数据的隐式反馈,例如,观看时长、是否点赞、是否评论、是否转发,等等,使用这些隐式反馈来构造训练数据的正负样本,使用训练数据训练排序模型,使用该排序模型计算用户对视频数据的评分,进而选择最符合用户兴趣偏好的视频数据,并推送给用户。The video recommendation algorithm is mainly aimed at the consumption mechanism. By recording the user's implicit feedback on the video data, such as the viewing time, whether to like, whether to comment, whether to forward, etc., these implicit feedbacks are used to construct the positive feedback of the training data. Negative samples, use the training data to train a ranking model, use the ranking model to calculate the user's score on the video data, and then select the video data that best meets the user's interests and preferences, and push it to the user.
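A rough sketch of this consumption-side flow, under assumed feature names and labelling thresholds, might look like the following; scikit-learn's LogisticRegression merely stands in for whatever ranking model is actually used.

```python
# Sketch only: implicit feedback -> positive/negative samples -> ranking model -> scores.
from sklearn.linear_model import LogisticRegression

def make_samples(feedback_log):
    """feedback_log: iterable of dicts with watch_ratio, liked, commented, shared."""
    X, y = [], []
    for f in feedback_log:
        X.append([f["watch_ratio"], int(f["liked"]), int(f["commented"]), int(f["shared"])])
        # Assumed labelling rule: mostly-watched or explicitly endorsed videos are positives.
        y.append(1 if f["watch_ratio"] > 0.8 or f["liked"] or f["shared"] else 0)
    return X, y

log = [
    {"watch_ratio": 0.95, "liked": True, "commented": False, "shared": False},
    {"watch_ratio": 0.10, "liked": False, "commented": False, "shared": False},
]
model = LogisticRegression().fit(*make_samples(log))

def rank(candidate_features):
    scores = model.predict_proba(candidate_features)[:, 1]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

print(rank([[0.9, 1, 0, 0], [0.2, 0, 0, 0]]))  # indices of candidates, best first
```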
此外,在生产激励方面,常规的手段大都以运营机制为主,例如,以等级、贡献值、现金奖励等手段,直接对用户的生产视频数据的行为进行正向激励。In addition, in terms of production incentives, most of the conventional means are based on operation mechanisms, such as grades, contribution values, cash rewards and other means to directly motivate users' behavior of producing video data.
但是,这类方法需要人工进行运营、反作弊等,成本较大,而且,一旦停止激励,用户生产的意愿会迅速下降。However, such methods require manual operations, anti-cheating, etc., which are costly. Moreover, once the incentives are stopped, the willingness of users to produce will drop rapidly.
由于业务方面的需求,排序模型的优化目标也从单一的播放时长(或播完率),发展到既有时长等消费指标,也有点赞评论转发等满意度指标的多任务学习模型。Due to business requirements, the optimization goal of the ranking model has also developed from a single playback duration (or completion rate) to a multi-task learning model that has both consumption indicators such as duration, and satisfaction indicators such as likes, comments and forwarding.
在本实施例中,配合适当的产品形态,已经使得用户在消费视频数据时,对感兴趣的视听元素能够更加简洁地复用,快速生产视频数据,与此同时,利用多任务学习模型,引入一个从消费到生产转换的目标,来预测哪些视频数据更容易引发用户的兴趣进而生产新的视频数据,在最终对召回的视频数据排序时,增加一个因素,即当前视频数据是否会引起用户的兴趣进行生产,在消费和满意度条件接近的情况下,优先推送该类视频数据,提高用户生产的意愿,这样能够引导更多的用户从单一消费变为消费的同时,进一步进行生产,提高从消费到生产的转化比例,进而丰富视频平台内容的生态闭环。In this embodiment, with the appropriate product form, the user can reuse the audio-visual elements of interest more concisely when consuming video data, and quickly produce video data. At the same time, the multi-task learning model is used to introduce A goal of conversion from consumption to production, to predict which video data is more likely to arouse users’ interest and then produce new video data. When finally sorting the recalled video data, a factor is added, that is, whether the current video data will arouse users’ interest. Interested in production, when the consumption and satisfaction conditions are close, this type of video data will be preferentially pushed to increase the willingness of users to produce, which can guide more users from single consumption to consumption, while further production and improve The conversion ratio from consumption to production, thereby enriching the ecological closed loop of video platform content.
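One hedged way to picture the extra ranking factor described here is to combine the multi-task predictions, including a predicted consumption-to-production conversion probability, into a single score. The task names and weights below are illustrative assumptions, not values from the disclosure.

```python
# Illustrative score mixing: consumption/satisfaction predictions plus a
# predicted probability of converting the viewer into a producer ("produce").
def final_score(preds, weights=None):
    weights = weights or {"watch_time": 0.5, "like": 0.2, "share": 0.1, "produce": 0.2}
    return sum(w * preds.get(task, 0.0) for task, w in weights.items())

candidates = [
    {"video_id": "a", "preds": {"watch_time": 0.7, "like": 0.3, "share": 0.1, "produce": 0.4}},
    {"video_id": "b", "preds": {"watch_time": 0.7, "like": 0.3, "share": 0.1, "produce": 0.1}},
]
ranked = sorted(candidates, key=lambda c: final_score(c["preds"]), reverse=True)
# With comparable consumption and satisfaction scores, the video more likely to
# trigger production ("a") is pushed first.
print([c["video_id"] for c in ranked])  # ['a', 'b']
```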
实施例一 Embodiment 1
图1为本申请实施例一提供的一种视频数据的制作方法的流程图,本实施例可适用于提供已有视频数据的视听元素、制作新的视频数据的情况,该方法可以由视频数据的制作装置来执行,该视频数据的制作装置可以由软件和/或硬 件实现,可配置在计算机设备中,例如,移动终端(如手机、平板电脑等)、智能穿戴设备(如智能手表、智能眼镜等)、个人电脑,等等,包括如下步骤:1 is a flowchart of a method for producing video data according to Embodiment 1 of the present application. This embodiment is applicable to the situation of providing audiovisual elements of existing video data and producing new video data. The video data production device can be implemented by software and/or hardware, and can be configured in computer equipment, for example, mobile terminals (such as mobile phones, tablet computers, etc.), smart wearable devices (such as smart watches, smart watches, etc.) glasses, etc.), personal computer, etc., including the following steps:
步骤101、当播放第一视频数据时,显示元素摘要信息。Step 101: When playing the first video data, display element summary information.
在本实施例中,计算机设备的操作系统可以包括安卓(Android)、由苹果公司开发的移动操作系统(IOS)、Windows等等,在这些操作系统中支持运行播放、制作视频数据的客户端,例如,短视频应用、即时通讯工具、在线视频应用,等等。In this embodiment, the operating system of the computer device may include Android (Android), a mobile operating system (IOS) developed by Apple, Windows, etc., in these operating systems, a client for running and playing and producing video data is supported, For example, short video applications, instant messaging tools, online video applications, and so on.
该客户端可以以统一资源定位器(Uniform Resource Locators,URL)等形式,向服务端请求播放视频数据,该视频数据在本实施例中称之为第一视频数据,服务端接收到该请求之后,通过个性化或非个性化的方式查找第一视频数据,可向该客户端发送该第一视频数据的部分或全部数据。The client may request the server to play video data in the form of a Uniform Resource Locators (URL), etc. The video data is referred to as first video data in this embodiment, and after the server receives the request , searching for the first video data in a personalized or non-personalized manner, and sending part or all of the first video data to the client.
该第一视频数据为已离线完成制作的视频数据,其形式可以包括短视频、微电影、表演节目,等等,本实施例对第一视频数据的形式不加以限制。The first video data is video data that has been produced offline, and the form may include short videos, micro-movies, performance programs, and so on. This embodiment does not limit the form of the first video data.
所谓个性化,可以指视频数据与当前在客户端登录的用户(以标识(Identifier,ID)等表示)适配,即基于多目标优化算法、协同过滤算法等个性化的方式对视频数据与用户进行匹配,匹配方式可参见实施例三,本实施例对匹配方式不加以详述,在匹配成功时,该视频数据可视为第一视频数据。The so-called personalization can refer to the adaptation of the video data to the user currently logged in on the client (represented by an identifier (ID), etc.), that is, based on the multi-objective optimization algorithm, collaborative filtering algorithm, etc. For matching, refer to Embodiment 3 for the matching method. This embodiment does not describe the matching method in detail. When the matching is successful, the video data can be regarded as the first video data.
所谓非个性化,可以指视频数据的筛选并不依赖于当前在客户端登录的用户(以ID等表示),可以基于视频质量(综合清晰度、播放量、点赞量、评论量等进行评价)、热度等非个性化的因素筛选视频数据,将筛选出的视频数据作为第一视频数据。The so-called non-personalization means that the screening of video data does not depend on the user currently logged in on the client (indicated by ID, etc.), and can be evaluated based on video quality (integrated definition, playback volume, likes, comments, etc.) ), popularity and other non-personalized factors to screen video data, and use the screened video data as the first video data.
若第一视频数据为一视频数据与一个或多个视听元素混合制作而成,则服务端可以查询该第一视频数据标记的一个或多个元素摘要信息,该元素摘要信息表示第一视频数据包含的视听元素,以及,将该一个或多个视听元素的元素摘要信息发送至该客户端。If the first video data is made by mixing a video data with one or more audiovisual elements, the server can query one or more element summary information marked in the first video data, and the element summary information indicates the first video data containing audiovisual elements, and sending element summary information for the one or more audiovisual elements to the client.
所谓视听元素,可以包括可视元素(即用户可以看到的元素)、可听元素(即用户可以听到的元素),视听元素的形式可以根据实际情况进行设置,例如,音频数据、视频数据、美颜特效、滤镜,等等,本实施例对视听元素的形式不加以限制。The so-called audiovisual elements can include visual elements (that is, elements that users can see), audible elements (that are, elements that users can hear), and the form of audio-visual elements can be set according to actual conditions, such as audio data, video data. , beauty special effects, filters, etc., the form of audiovisual elements is not limited in this embodiment.
此外,该元素摘要信息可以包括文本数据、图像数据等形式,例如,音频数据的名称、音频数据的封面、视频数据的名称、视频数据的封面、体积大小、作者、发布者、使用人数,等等。元素摘要信息除了表示视听元素之外,还可以携带其他信息,例如,视听元素的ID,等等,本实施例对此不加以限制。In addition, the element summary information may include text data, image data, etc., for example, the name of the audio data, the cover of the audio data, the name of the video data, the cover of the video data, the size, the author, the publisher, the number of users, etc. Wait. In addition to representing the audiovisual element, the element summary information may also carry other information, for example, the ID of the audiovisual element, etc., which is not limited in this embodiment.
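As an illustration only, one possible shape for a piece of element summary information is sketched below; the field names are hypothetical, since the paragraph above only enumerates the kinds of content such information may carry.

```python
from dataclasses import dataclass

@dataclass
class ElementSummary:                 # hypothetical field names
    element_id: str                   # identifier carried along with the summary
    element_type: str                 # "audible" (e.g. a song) or "visual" (e.g. a clip, a filter)
    name: str                         # e.g. the name of the audio data
    cover_url: str                    # cover image or thumbnail
    author: str
    publisher: str
    size_bytes: int
    used_by: int                      # number of users who have used the element

summary = ElementSummary("e42", "audible", "Quiet Night", "cover.jpg",
                         "Little Red", "Little Red", 3_200_000, 2270)
print(summary.name, summary.used_by)
```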
客户端在缓存该第一视频数据的部分或全部数据之后,可调用操作系统提供的视频播放器播放该第一视频数据,即,客户端生成第一用户界面,在该第一用户界面中显示第一视频数据的画面,以及,驱动扬声器播放第一视频数据的音频。After buffering part or all of the first video data, the client can call the video player provided by the operating system to play the first video data, that is, the client generates a first user interface and displays it in the first user interface The picture of the first video data, and the speaker is driven to play the audio of the first video data.
该第一用户界面中具有播放区域,该播放区域用于显示第一视频数据的画面,在一些情况下,该播放区域为该第一用户界面的部分区域,此时,针对该第一视频数据的其他信息,可显示在该播放区域之外的区域,在一些情况下,该播放区域为该第一用户界面的全部区域,此时,针对该第一视频数据的其他信息,可以以悬浮的形式显示在该显示区域之上。The first user interface has a play area, and the play area is used to display a picture of the first video data. In some cases, the play area is a partial area of the first user interface. In this case, for the first video data The other information of the video data can be displayed in the area outside the play area. In some cases, the play area is the entire area of the first user interface. At this time, other information for the first video data can be displayed in the floating The form is displayed above this display area.
针对该第一视频数据的其他信息可以包括表达正向情感的控件(如“点赞”、“喜欢”)、评论信息、用于分享的控件、用于输入评论信息的栏目,等等。Other information for the first video data may include controls for expressing positive emotions (such as "like", "like"), comment information, controls for sharing, fields for inputting comment information, and the like.
以Android系统为例,可调用VideoView,MediaPlayer与SurfaceView,Vitamio,JCPlayer等组件播放第一视频数据。Taking the Android system as an example, components such as VideoView, MediaPlayer, SurfaceView, Vitamio, and JCPlayer can be called to play the first video data.
此外,若客户端缓存有该第一视频数据的一个或多个元素摘要信息,则可以在第一用户界面中,将元素摘要信息转换为第一数据结构,以静态的方式(如以文本的方式显示视听元素的名称)或动态的方式(如转动视听元素的封面)显示该第一数据结构下的元素摘要信息。In addition, if the client has cached one or more element summary information of the first video data, the element summary information can be converted into the first data structure in the first user interface, and the element summary information can be converted into the first data structure in a static manner (such as textual The name of the audiovisual element is displayed in a dynamic manner (such as rotating the cover of the audiovisual element) to display the element summary information under the first data structure.
若播放区域为第一用户界面的全部区域,则该一个或多个元素摘要信息可以以悬浮的形式显示在该播放区域之上,使得该一个或多个元素摘要信息显示在第一视频数据的画面之上。If the play area is the entire area of the first user interface, the one or more element summary information can be displayed above the play area in a floating form, so that the one or more element summary information is displayed in the first video data on the screen.
例如,如图2A所示,在第一用户界面210中,以全部区域显示第一视频数据的画面,该第一视频数据具有可听元素,即歌曲“宁静的夜晚”,此时,可在第一用户界面210的左下角显示该歌曲的名称与发布者“宁静的夜晚-小红”(元素摘要信息211),在第一用户界面210的右下角显示该歌曲的封面(元素摘要信息212)。For example, as shown in FIG. 2A, in the first user interface 210, the screen of the first video data is displayed in all areas, and the first video data has audible elements, that is, the song "Quiet Night". The lower left corner of the first user interface 210 displays the title of the song and the publisher "Quiet Night-Little Red" (element summary information 211), and the cover of the song (element summary information 212) is displayed in the lower right corner of the first user interface 210. ).
又例如,如图2D所示,在第一用户界面240中,以部分区域显示第一视频数据的画面,该第一视频数据具有可听元素、可视元素,即歌曲“激动人心”、小刚表演踩钢丝的视频,此时,可在第一用户界面240的左下角显示该歌曲的名称与发布者“激动人心-小明”(元素摘要信息)、该视频的引导语(引导用户使用该视频制作新的视频数据)及该视频的使用人数“与小刚二重唱(2.27K)”(元素摘要信息241),在第一用户界面240的右下角显示该歌曲的封面(元素摘要信息)。For another example, as shown in FIG. 2D, in the first user interface 240, a picture of the first video data is displayed in a partial area, and the first video data has audible elements and visual elements, that is, the song "Exciting", small The video of stepping on a tightrope has just been performed. At this time, the name of the song and the publisher "Exciting-Xiao Ming" (element summary information), the introductory language of the video (to guide the user to use the Video production new video data) and the number of users of the video "Due with Xiaogang (2.27K)" (element summary information 241), the cover of the song (element summary information) is displayed in the lower right corner of the first user interface 240.
步骤102、接收作用于元素摘要信息的第一操作。Step 102: Receive a first operation acting on the element summary information.
在播放第一视频数据的过程中,若用户对该第一视频数据包含的一视听元素感兴趣,则可以通过计算机设备提供的人机交互工具,针对该视听元素对应的元素摘要信息触发第一操作,从而选定该元素摘要信息表示的可视元素。In the process of playing the first video data, if the user is interested in an audio-visual element contained in the first video data, the human-computer interaction tool provided by the computer device can trigger the first audio-visual element summary information corresponding to the audio-visual element. action to select the visual element represented by the element's summary information.
在实现中,针对不同类型的计算机设备,其所提供的人机交互工具有所不同,相应地,通过该人机交互工具触发第一操作的方式也有所不同,本实施例对通过人机交互工具触发第一操作的方式不加以限制。In implementation, for different types of computer equipment, the human-computer interaction tools provided by them are different, and correspondingly, the ways of triggering the first operation through the human-computer interaction tools are also different. The manner in which the tool triggers the first action is not limited.
例如,若计算机设备提供的人机交互工具为触控屏,则在触控屏检测到发生在一个元素摘要信息内的触控操作(如点击操作、长按操作、重按操作等)时,确定接收作用于该元素摘要信息的第一操作。For example, if the human-computer interaction tool provided by the computer equipment is a touch screen, when the touch screen detects a touch operation (such as a click operation, long-press operation, re-press operation, etc.) It is determined to receive a first operation that acts on the element digest information.
又例如,若计算机设备提供的人机交互工具为外置设备,则在接收到外置设备发送的、发生在一个元素摘要信息内的按键事件(如单击事件、双击事件、长按事件等)时,确定接收作用于该元素摘要信息的第一操作。其中,该外置设备包括但不限定于鼠标、遥控器等。For another example, if the human-computer interaction tool provided by the computer device is an external device, after receiving a key event (such as a single-click event, double-click event, long-press event, etc.) that occurs in an element summary information sent by the external device ), it is determined to receive the first operation acting on the element summary information. Wherein, the external device includes, but is not limited to, a mouse, a remote control, and the like.
步骤103、响应于第一操作,共同显示第二视频数据的视频摘要信息、制作控件。Step 103: In response to the first operation, jointly display video summary information and production controls of the second video data.
应用本实施例,服务端可通过多种方式收集视频数据并对该视频数据标记其包含的视听元素(以ID等表示),存储在服务端本地的数据库中。Applying this embodiment, the server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in a local database of the server.
在一种标记视听元素的方式中,针对已有的视频数据,可以通过以特定的可视元素作为目标,调用目标检测(object detection)模型在该视频数据的画面中检测特定的可视元素,如果检测到该特定的可视元素,则对该视频数据标记该特定的可视元素。In a method of marking audiovisual elements, for existing video data, a specific visual element can be detected in the picture of the video data by calling an object detection model with a specific visual element as a target, If the specific visual element is detected, the video data is marked with the specific visual element.
该目标检测模型包括一阶(One Stage)目标检测模型和二阶(Two Stage)目标检测模型。The target detection model includes a first-order (One Stage) target detection model and a second-order (Two Stage) target detection model.
生成一系列作为样本的候选框,再通过卷积神经网络(Convolutional Neural Network,CNN)进行样本分类的目标检测模型被称为二阶目标检测模型,例如,区域卷积神经网络(Region-CNN,R-CNN)、空间金字塔池网络(Spatial Pyramid Pooling Network,SPP-Net)、Fast-RCNN、Faster-RCNN,等等。A target detection model that generates a series of candidate boxes as samples, and then classifies the samples through a convolutional neural network (CNN) is called a second-order target detection model, for example, a regional convolutional neural network (Region-CNN, R-CNN), Spatial Pyramid Pooling Network (SPP-Net), Fast-RCNN, Faster-RCNN, etc.
不生成候选框,直接将目标边框定位的问题转化为回归问题进行处理的目标检测模型则被称为一阶目标检测模型,例如,广义同余神经网络(Generalized Congruence Neural Network,GCNN)、YOLO(You Only Look Once)、一阶多框预测(Single Shot Mutibox Detector,SSD),等等。A target detection model that does not generate candidate frames and directly converts the problem of target frame positioning into a regression problem is called a first-order target detection model, for example, Generalized Congruence Neural Network (GCNN), YOLO ( You Only Look Once), first-order multi-box prediction (Single Shot Mutibox Detector, SSD), etc.
在另一种标记视听元素的方式中,针对已有的视频数据,可以提取其包含的音频,提取该音频的特征,若该音频的特征与特定的可视元素的特征相同或 相似,则对该视频数据标记该特定的可视元素。In another way of marking audiovisual elements, for existing video data, the audio contained in the audio can be extracted, and the features of the audio can be extracted. If the features of the audio are the same as or similar to the features of a specific visual element, then The video data marks the specific visual element.
在又一种标记视听元素的方式中,若用户使用自定义的可视元素制作视频数据,则将该自定义的可视元素与原有的可视元素进行对比,若该自定义的可视元素与原有的可视元素相同或相似,则对该视频数据标记原有的可视元素,若该自定义的可视元素与原有的可视元素不同或不相似,则对自定义的可视元素设置新的标识(如新的ID),并对该视频数据标记自定义的可视元素(以新的标识表示)。In yet another method of marking audiovisual elements, if the user uses a custom visual element to create video data, the custom visual element is compared with the original visual element, if the custom visual element is If the element is the same as or similar to the original visual element, the video data will be marked with the original visual element. If the custom visual element is different or not similar to the original visual element, The visual element is set with a new identifier (such as a new ID), and the video data is marked with a custom visual element (represented by the new identifier).
在再一种标记视听元素的方式中,若用户使用其他视频数据已标记的可视元素制作新的视频数据,则可以对该新的视频数据标记该可视元素。In yet another way of marking the audiovisual element, if the user makes new video data by using the marked visual element of other video data, the visual element can be marked for the new video data.
上述标记视听元素的方式只是作为示例,在实施本申请实施例时,可以根据实际情况设置其他标记视听元素的方式,本申请实施例对标记视听元素的方式不加以限制。The above manner of marking audiovisual elements is only an example. When implementing the embodiments of the present application, other manners of marking audiovisual elements may be set according to actual conditions, and the embodiments of the present application do not limit the manners of marking audiovisual elements.
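A compact sketch of how such marking might be wired together is given below. The detector, feature extractor, and similarity function are placeholders injected by the caller, not real APIs, and the threshold is an assumed value.

```python
# Sketch of element marking: a visual target detector is run over sampled frames,
# and extracted audio features are matched against known audible elements.
def mark_elements(video, known_elements, detector, audio_features, similarity, threshold=0.9):
    tags = set()
    for frame in video["frames"]:                      # visual elements via object detection
        for element in known_elements["visual"]:
            if detector(frame, element["id"]):
                tags.add(element["id"])
    feature = audio_features(video["audio"])           # audible elements via feature matching
    for element in known_elements["audible"]:
        if similarity(feature, element["feature"]) >= threshold:
            tags.add(element["id"])
    return tags

# Toy usage with trivial stand-ins for the detector and feature functions:
video = {"frames": ["frame-1"], "audio": "audio-1"}
known = {"visual": [{"id": "vis-1"}], "audible": [{"id": "aud-1", "feature": "audio-1"}]}
print(mark_elements(video, known,
                    detector=lambda frame, eid: True,
                    audio_features=lambda a: a,
                    similarity=lambda x, y: 1.0 if x == y else 0.0))
```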
响应于用户选定一元素摘要信息表示的视听元素的第一操作,客户端可向服务端发送携带该视听元素的标识(如ID)的请求,请求服务端搜索包含该视听元素(以ID等表示)的视频数据。In response to the user's first operation of selecting an audiovisual element represented by the element summary information, the client can send a request to the server carrying the identification (such as ID) of the audiovisual element, requesting the server to search for the audiovisual element (with ID, etc.) containing the audiovisual element. representation) of the video data.
服务端在接收到该请求时,从该请求中解析视听元素的标识,在服务端本地的数据库中以该标识作为搜索的条件,搜索标记该标识的视频数据,并将该视频数据写入视频集合中,为便于区分,该视频数据在本实施例中称之为第二视频数据,该视频集合在本实施例中称之为第一视频集合,服务端从本地的数据库中提取该第一视频集合中的第二视频数据的视频摘要信息(如封面、名称、制作者等),并将该视频摘要信息发送至客户端。When the server receives the request, it parses the identification of the audiovisual element from the request, uses the identification as a search condition in the local database of the server, searches for the video data marked with the identification, and writes the video data into the video In the collection, for the convenience of distinction, the video data is referred to as the second video data in this embodiment, the video collection is referred to as the first video collection in this embodiment, and the server extracts the first video data from the local database. Video summary information (such as cover, name, producer, etc.) of the second video data in the video set, and send the video summary information to the client.
所谓标记,表示该视频数据包含该标识对应的视听元素,即多个第二视频数据包含相同的视听元素。The so-called mark means that the video data contains the audiovisual element corresponding to the mark, that is, the plurality of second video data contains the same audiovisual element.
由于第二视频数据的数量较多,可按照预设的排序方式对该第二视频数据进行排序,每次选择排序最前的n(n为正整数)个第二视频数据发送至客户端。Due to the large quantity of the second video data, the second video data may be sorted according to a preset sorting method, and each time the top n (n is a positive integer) second video data of the sorting are selected and sent to the client.
一般情况下,该排序方式可以包括按照视频质量进行降序排序、按照视频热度降序排序等非个性化的排序方式,降低处理的复杂度,从而提高处理的速度。In general, the sorting method may include non-personalized sorting methods such as descending sorting according to video quality, descending sorting according to video popularity, etc., so as to reduce the processing complexity and improve the processing speed.
除了非个性化的排序方式之外,还可以使用协同过滤等个性化的排序方式,本实施例对该排序方式不加以限制。In addition to a non-personalized sorting manner, a personalized sorting manner such as collaborative filtering may also be used, which is not limited in this embodiment.
对于客户端而言,一方面,接收服务端发送的第二视频数据的视频摘要信息,并缓存在客户端本地。For the client, on the one hand, video summary information of the second video data sent by the server is received and cached locally on the client.
另一方面,生成第二用户界面,在第二用户界面中,若元素摘要信息包括元素图像数据(如音频数据的封面、视频数据的缩略图等),则可以以背景的形式显示该元素图像数据,即,将元素摘要信息中的图像数据设置为背景,在设置时,可以对该图像数据进行模糊处理。On the other hand, a second user interface is generated. In the second user interface, if the element summary information includes element image data (such as the cover of audio data, the thumbnail of video data, etc.), the element image can be displayed in the form of background data, that is, the image data in the element summary information is set as the background, and when setting, the image data can be blurred.
将元素摘要信息转换为第二数据结构,以标题的形式显示该第二数据结构中的元素摘要信息,从而表示视频摘要信息对应的第二视频数据包含该元素摘要信息对应的视听元素,例如,在第二用户界面的顶部独立显示可视元素的元素摘要信息。Convert the element summary information into a second data structure, and display the element summary information in the second data structure in the form of a title, thereby indicating that the second video data corresponding to the video summary information contains the audiovisual element corresponding to the element summary information, for example, Element summary information for the visual element is independently displayed at the top of the second user interface.
在第二用户界面中,可在元素摘要信息之下的位置,以瀑布流等方式显示一个或多个信息区域,其中,该信息区域的面积与视听元素的类型匹配,即根据视听元素的类型设置信息区域的面积。In the second user interface, one or more information areas may be displayed in a waterfall or the like at a position below the element summary information, wherein the area of the information area matches the type of audiovisual element, that is, according to the type of audiovisual element Sets the size of the information area.
在一个示例中,考虑到可听元素并不占用显示的空间,而可视元素占用显示的空间,若视听元素的类型为可视元素,则可以设置面积为第一数值的区域为第一区域,显示该第一区域、作为信息区域,若视听元素的类型为可听元素,则可以设置面积为第二数值的区域为第二区域,显示该第二区域、作为信息区域,其中,第一数值大于第二数值,即第一区域的面积大于第二区域的面积,对可视元素增大显示区域的面积,使得可视元素在显示时保留更多的细节,用户可以更清晰地浏览到可视元素。In an example, considering that the audible element does not occupy the display space, but the visual element occupies the display space, if the type of the audiovisual element is a visual element, the area with the area of the first value can be set as the first area , display the first area as an information area, if the type of the audiovisual element is an audible element, you can set the area with the second value as the second area, and display the second area as an information area, where the first area If the value is greater than the second value, that is, the area of the first area is larger than the area of the second area, and the area of the display area is increased for the visual element, so that the visual element retains more details when displayed, and the user can browse more clearly. visual elements.
在本示例中,若同一个第二视频数据既包含可视元素、又包含可听元素,当针对可视元素召回该第二视频数据至一第一视频集合时,以第一区域显示该第二视频数据的视频摘要,当针对可听元素召回该第二视频数据至另一第一视频集合时,以第二区域显示该第二视频数据的视频摘要。In this example, if the same second video data contains both visual elements and audible elements, when the second video data is recalled to a first video set for the visual elements, the second video data is displayed in the first area. Two video summaries of the video data, when recalling the second video data for the audible element to another first video set, displaying the video summaries of the second video data in the second area.
将第一视频集合中的第二视频数据的视频摘要信息按照顺序加载至多个信息区域中,从而在信息区域中显示第二视频数据的视频摘要信息。The video summary information of the second video data in the first video set is sequentially loaded into the plurality of information areas, so that the video summary information of the second video data is displayed in the information area.
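For illustration, the sizing rule and the ordered loading of video summary information could be sketched as follows; the concrete pixel values and function names are assumptions.

```python
# Sketch of the layout rule: visual elements get the larger (first) information
# area, audible elements the smaller (second) one; summaries are loaded in order.
FIRST_AREA = {"w": 340, "h": 480}     # assumed sizes, first value > second value
SECOND_AREA = {"w": 220, "h": 300}

def area_for(element_type):
    return FIRST_AREA if element_type == "visual" else SECOND_AREA

def layout(video_summaries, element_type):
    area = area_for(element_type)
    return [{"summary": s, "area": area} for s in video_summaries]

print(layout(["summary-1", "summary-2"], "audible"))
```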
此外,以悬浮的方式,在信息区域之上显示制作控件,用户可以通过计算机设备提供的人机交互工具,针对该视听元素对应的元素摘要信息触发滑动操作、翻页操作等操作,使得在第二用户界面切换显示第二视频数据的视频摘要信息,在切换显示第二视频数据的视频摘要信息的过程中,第二视频数据的视频摘要信息的位置发生变化,而制作控件保持位置,该位置并不随着滑动操作、翻页操作等操作而发生变化。In addition, the production control is displayed on the information area in a floating manner, and the user can trigger operations such as sliding operation and page turning operation for the element summary information corresponding to the audio-visual element through the human-computer interaction tool provided by the computer equipment, so that in the first 2. The user interface switches to display the video summary information of the second video data. During the process of switching to display the video summary information of the second video data, the position of the video summary information of the second video data changes, but the production control maintains the position. It does not change with operations such as sliding operations and page turning operations.
在切换显示第二视频数据的视频摘要信息的过程中,客户端可以继续向服务端请求该第一视频集合中的其他第二视频数据,显示至第二用户界面中,直至请求该第一视频集合中的其他第二视频数据完毕。In the process of switching and displaying the video summary information of the second video data, the client can continue to request the server for other second video data in the first video set, and display them in the second user interface until the first video is requested. The other second video data in the set is complete.
例如,如图2A所示,若用户针对在第一用户界面210的左下角显示的歌曲的名称与发布者“宁静的夜晚-小红”(元素摘要信息211)触发点击操作(第一操作),或者,针对在第一用户界面210的右下角显示的歌曲的封面(元素摘要信息212)触发点击操作(第一操作),则如图2B所示,在第二用户界面220中,对歌曲的封面进行模糊处理并设置为背景,在该背景上,集中显示歌曲的封面、歌曲的名称、歌曲的发布者、歌曲的使用人数,以及,显示九个较小的信息区域,在每个信息区域中按照顺序加载包含该歌曲对应的第二视频数据的视频摘要信息。For example, as shown in FIG. 2A , if the user triggers a click operation (the first operation) for the title of the song displayed in the lower left corner of the first user interface 210 and the publisher “Quiet Night-Little Red” (element summary information 211 ) , or, a click operation (the first operation) is triggered for the cover (element summary information 212 ) of the song displayed in the lower right corner of the first user interface 210 , then as shown in FIG. 2B , in the second user interface 220 , The cover of the song is blurred and set as the background, on which the cover of the song, the name of the song, the publisher of the song, the number of users of the song are displayed in a concentrated manner, and nine smaller information areas are displayed. The video summary information including the second video data corresponding to the song is loaded in sequence in the area.
排序第三(即“N0.3”)的第二视频数据,除了包含该歌曲之外,还包括其他视频数据的视听元素,在用户选定歌曲时,排序第三(即“N0.3”)的第二视频数据使用较小的信息区域显示其视频摘要信息。The second video data in the third order (ie "N0.3"), in addition to the song, also includes audiovisual elements of other video data, when the user selects a song, the third order (ie "N0.3") ) of the second video data uses a smaller information area to display its video summary information.
此外,在第二用户界面220的下方,显示制作控件221“加入”。In addition, below the second user interface 220, the authoring control 221 "Join" is displayed.
例如,如图2D所示,若用户针对在第一用户界面240的左下角显示的视频的引导语及该视频的使用人数“与小刚二重唱(2.27K)”(元素摘要信息241)触发点击操作(第一操作),则如图2E所示,在第二用户界面250中,对视频的封面进行模糊处理并设置为背景,在该背景上,集中显示视频的封面、制作者、视频的使用人数,以及,显示四个较大的信息区域,在每个信息区域中按照顺序加载包含该视频的第二视频数据的视频摘要信息。For example, as shown in FIG. 2D , if the user clicks on the introductory phrase of the video displayed in the lower left corner of the first user interface 240 and the number of users of the video “Due to Xiaogang (2.27K)” (element summary information 241 ) operation (the first operation), then as shown in FIG. 2E, in the second user interface 250, the cover of the video is blurred and set as a background, on which the cover, the producer, and the background of the video are collectively displayed. The number of users, and, displays four larger information areas, in each information area sequentially loading video summary information containing the second video data for the video.
此外,在第二用户界面250的下方,显示制作控件251“加入”。In addition, below the second user interface 250, an authoring control 251 "Join" is displayed.
步骤104、接收作用于制作控件的第二操作。Step 104: Receive a second operation for making the control.
在显示第二视频数据的视频摘要信息的过程中,若用户对当前的视听元素感兴趣,则可以通过计算机设备提供的人机交互工具,针对当前的制作控件触发第二操作,从而制作包含该视听元素的、新的视频数据。In the process of displaying the video summary information of the second video data, if the user is interested in the current audio-visual element, the human-computer interaction tool provided by the computer device can trigger the second operation for the current production control, so that the production includes the New video data for audiovisual elements.
在实现中,针对不同类型的计算机设备,其所提供的人机交互工具有所不同,相应地,通过该人机交互工具触发第二操作的方式也有所不同,本实施例对通过人机交互工具触发第二操作的方式不加以限制。In implementation, for different types of computer equipment, the provided human-computer interaction tools are different, and correspondingly, the ways of triggering the second operation through the human-computer interaction tools are also different. The manner in which the tool triggers the second action is not limited.
例如,若计算机设备提供的人机交互工具为触控屏,则在触控屏检测到发生在制作控件内的触控操作(如点击操作、长按操作、重按操作等)时,确定接收作用于该制作控件的第一操作。For example, if the human-computer interaction tool provided by the computer equipment is a touch screen, when the touch screen detects a touch operation (such as a click operation, long press operation, repress operation, etc.) The first operation that acts on the make control.
又例如,若计算机设备提供的人机交互工具为外置设备,则在接收到外置设备发送的、发生在制作控件内按键事件(如单击事件、双击事件、长按事件等)时,确定接收作用于该制作控件的第一操作。其中,该外置设备包括但不限定于鼠标、遥控器等。For another example, if the human-computer interaction tool provided by the computer device is an external device, when receiving a key event (such as a single-click event, double-click event, long-press event, etc.) It is determined to receive a first operation acting on the authoring control. Wherein, the external device includes but is not limited to a mouse, a remote control, and the like.
步骤105、响应于第二操作,采集第三视频数据,将元素摘要信息对应的视听元素添加至第三视频数据中。Step 105: In response to the second operation, collect third video data, and add audiovisual elements corresponding to the element summary information to the third video data.
在本实施例中,客户端响应于用户触发制作控件的第二操作,向服务端发送下载该视听元素(以ID等表示)的请求,服务端接收到该请求之后,查找独立的视听元素(以ID等表示),并将该视听元素发送至客户端。In this embodiment, the client sends a request for downloading the audiovisual element (represented by ID, etc.) to the server in response to the user triggering the second operation of making the control, and after receiving the request, the server searches for the independent audiovisual element ( represented by an ID, etc.) and send the audiovisual element to the client.
所谓独立,可以指视听元素是一个独立的文件,并不依赖第一视频数据、第二视频数据。The so-called independent can mean that the audiovisual element is an independent file and does not depend on the first video data and the second video data.
视听元素的格式(如分辨率、采样率、体积大小等)符合制作的规范,客户端可直接使用该视听元素制作新的视频数据。The format of the audiovisual element (such as resolution, sampling rate, size, etc.) conforms to the production specification, and the client can directly use the audiovisual element to produce new video data.
客户端在接收该视听元素完毕之后,可生成第三用户界面,在该第三用户界面中生成用于制作视频数据的控件,调用计算机设备的摄像头、在第三用户界面预览视频数据,在接收到确定操作时,采集视频数据,为便于区分,该视频数据在本实施例中称之为第三视频数据。After receiving the audiovisual element, the client can generate a third user interface, generate a control for making video data in the third user interface, call the camera of the computer device, preview the video data on the third user interface, and then receive the video data. When the operation is determined, video data is collected, and for convenience of distinction, the video data is referred to as third video data in this embodiment.
在采集第三视频数据的同时,将元素摘要信息对应的视听元素作为制作的素材,添加至第三视频数据中。While collecting the third video data, the audiovisual elements corresponding to the element summary information are added to the third video data as the produced material.
在添加视听元素的过程中,保持第三视频数据与视听元素在时间轴上同步。During the process of adding the audiovisual element, the third video data is kept synchronized on the time axis with the audiovisual element.
在本实施例中,在开始采集第三视频数据的同时,开始播放视听元素,以便用户预览添加视听元素的效果。In this embodiment, when the third video data is started to be collected, the audiovisual element is started to be played, so that the user can preview the effect of adding the audiovisual element.
一般情况下,在视听元素结束时,可停止采集第三视频数据,当然,在视听元素结束时,也可以继续采集第三视频数据,本实施例对此不加以限制。Generally, when the audiovisual element ends, the collection of the third video data may be stopped. Of course, when the audiovisual element ends, the collection of the third video data may also be continued, which is not limited in this embodiment.
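A pseudocode-style sketch of this synchronized capture is given below, with small fake camera and player objects standing in for real platform APIs; it only illustrates the shared-timeline idea, not any particular implementation.

```python
# Sketch: recording and the audiovisual element start together, share one
# timeline, and recording may stop when the element ends.
def record_with_element(camera, element_player, stop_when_element_ends=True):
    camera.start()
    element_player.play()                  # element playback starts at t = 0 of the recording
    clips = []
    while camera.is_recording():
        t, frame = camera.read_frame()     # frame stamped on the shared timeline
        clips.append((t, frame))
        if stop_when_element_ends and element_player.finished():
            camera.stop()
    return clips

class _FakeCamera:                         # minimal stand-in for demonstration only
    def __init__(self, frames): self._frames, self._i, self._on = frames, 0, False
    def start(self): self._on = True
    def stop(self): self._on = False
    def is_recording(self): return self._on and self._i < len(self._frames)
    def read_frame(self):
        t, self._i = self._i, self._i + 1
        return t, self._frames[t]

class _FakePlayer:                         # "finishes" after a fixed number of ticks
    def __init__(self, length): self._len, self._t = length, 0
    def play(self): self._t = 0
    def finished(self):
        self._t += 1
        return self._t > self._len

print(record_with_element(_FakeCamera(["f0", "f1", "f2"]), _FakePlayer(2)))
```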
在实际应用中,由于视听元素的类型不同,将视听元素添加至第三视频数据的方式也有所不同。In practical applications, due to the different types of audiovisual elements, the ways of adding audiovisual elements to the third video data are also different.
在一个示例中,视听元素包括可听元素中的音频数据,则在本示例中,在开始采集第三视频数据的同时,开始播放音频元素,从而将元素摘要信息对应的音频数据设置为第三视频数据的背景音乐。In an example, the audiovisual element includes audio data in the audiovisual element, then in this example, the audio element starts to be played at the same time when the third video data is collected, so that the audio data corresponding to the element summary information is set as the third video data. Background music for video data.
例如,如图2B所示,若用户针对在第二用户界面220的下方显示的显示制作控件221“加入”触发点击操作(第二操作),则如图2C所示,在第三用户界面230中调用摄像头进行预览,在接收到针对圆形控件触发的确定操作时,采集第三视频数据并播放歌曲“宁静的夜晚”,使得歌曲“宁静的夜晚”作为第三视频数据的背景音乐。For example, as shown in FIG. 2B , if the user “joins” the click operation (second operation) for the display creation control 221 displayed below the second user interface 220 , then as shown in FIG. 2C , in the third user interface 230 Call the camera to preview in the middle, when receiving the confirmation operation triggered by the circular control, collect the third video data and play the song "Quiet Night", so that the song "Quiet Night" is used as the background music of the third video data.
在另一个示例中,视听元素包括可视元素中的视频数据,为便于区分,该视频元素在本实施例中可称之为第四视频数据。In another example, the audiovisual element includes video data in the visual element, and for convenience of distinction, the video element may be referred to as fourth video data in this embodiment.
在本示例中,在开始采集第三视频数据的同时,开始播放第四视频数据,在同一个画面中,第三视频数据居左、第四视频数据居右显示,或者,第三视频数据居右、第四视频数据居左显示,第三视频数据与第四视频数据以画中画的形式显示,从而以分屏的方式合成元素摘要信息对应的第四视频数据与第三视频数据。In this example, the fourth video data is played at the same time when the third video data is collected. In the same screen, the third video data is displayed on the left and the fourth video data is displayed on the right, or the third video data is displayed on the right. . The fourth video data is displayed on the left, and the third video data and the fourth video data are displayed in the form of picture-in-picture, so that the fourth video data and the third video data corresponding to the element summary information are synthesized in a split-screen manner.
例如,如图2E所示,若用户针对在第二用户界面250的下方显示的显示制作控件251“加入”触发点击操作(第二操作),则如图2F所示,在第三用户界面260中调用摄像头进行预览,在接收到针对圆形控件触发的确定操作时,采集第三视频数据并显示在左侧,以及,在右侧播放小刚踩钢丝的视频,使得小刚踩钢丝的视频与第三视频数据合并。For example, as shown in FIG. 2E , if the user triggers a click operation (second operation) for “joining” the display creation control 251 displayed below the second user interface 250 , then as shown in FIG. 2F , in the third user interface 260 Call the camera to preview in the middle, and when receiving the confirmation operation triggered by the circular control, collect the third video data and display it on the left, and play the video of Xiaogang stepping on the wire on the right, so that the video of Xiaogang stepping on the wire Merged with the third video data.
上述添加视听元素的方式只是作为示例,在实施本申请实施例时,可以根据实际情况设置其他添加视听元素的方式,本申请实施例对添加视听元素的方式不加以限制。The above manner of adding audiovisual elements is only an example. When implementing the embodiments of the present application, other manners of adding audiovisual elements may be set according to actual conditions, and the embodiments of the present application do not limit the manners of adding audiovisual elements.
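The type-dependent composition could be sketched as follows; the dictionary layout and helper name are illustrative assumptions rather than the actual composition pipeline.

```python
# Illustrative branch on element type when composing the third video data.
def add_element(third_video, element):
    if element["type"] == "audio":
        # The audio data becomes the background music of the third video data.
        return {"video": third_video, "audio": element["data"], "layout": "single"}
    if element["type"] == "video":
        # The fourth video data is shown side by side (or picture-in-picture).
        return {"video": [third_video, element["data"]], "layout": "split_or_pip"}
    raise ValueError("unsupported element type")

print(add_element("capture.mp4", {"type": "audio", "data": "quiet_night.aac"}))
print(add_element("capture.mp4", {"type": "video", "data": "tightrope.mp4"}))
```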
将元素摘要信息对应的视听元素添加至第三视频数据之后,可将第三视频数据发送至服务端,服务端接收客户端发送的第三视频数据,对第三视频数据标记包含该视听元素(以ID等表示),若标记完成,则服务端发布第三视频数据,客户端从而发布标记包含视听元素的第三视频数据,其他客户端可从服务端下载该第三视频数据进行播放,供用户浏览。After adding the audiovisual element corresponding to the element summary information to the third video data, the third video data can be sent to the server, and the server receives the third video data sent by the client, and marks the third video data to include the audiovisual element ( Represented by ID, etc.), if the marking is completed, the server publishes the third video data, the client thus publishes the third video data marked with audiovisual elements, and other clients can download the third video data from the server for playback. User browses.
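For illustration, the publish-and-tag step might be sketched as follows, with an assumed in-memory store standing in for the server-side database.

```python
# Sketch of publishing: the uploaded third video data is marked with the IDs of
# the audiovisual elements it contains, so it can be recalled later by element.
PUBLISHED = {}

def publish_third_video(video_id, media, element_ids):
    PUBLISHED[video_id] = {"media": media, "elements": set(element_ids)}
    return video_id

def videos_containing(element_id):
    return [vid for vid, rec in PUBLISHED.items() if element_id in rec["elements"]]

publish_third_video("v900", b"...", ["e42"])
print(videos_containing("e42"))  # ['v900']
```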
在本实施例中,当播放第一视频数据时,显示元素摘要信息,元素摘要信息表示第一视频数据包含的视听元素,接收作用于元素摘要信息的第一操作,响应于第一操作,共同显示第二视频数据的视频摘要信息、制作控件,第二视频数据包含视听元素,接收作用于制作控件的第二操作,响应于第二操作,采集第三视频数据,将元素摘要信息对应的视听元素添加至第三视频数据中,一方面,用户可以使用已有视频数据包含的视听元素制作新的视频数据,视听元素并不依赖于系统的模板,渠道多样化,可以保持视听元素的个性化,从而保证新制作的视频数据的个性化,另一方面,由系统提供视听元素,可以保证该视听元素的格式符合制作的规范,直接用于制作新的视频数据,避免用户使用专业的应用对元素进行修正,大大降低了技术门槛,减少了耗时,从而降低了制作视频数据的成本。In this embodiment, when the first video data is played, the element summary information is displayed, and the element summary information indicates the audiovisual elements contained in the first video data, the first operation acting on the element summary information is received, and in response to the first operation, the Displaying video summary information and production controls of the second video data, where the second video data includes audiovisual elements, receiving a second operation acting on the production controls, collecting third video data in response to the second operation, and converting the audiovisual elements corresponding to the element summary information The element is added to the third video data. On the one hand, users can use the audiovisual elements contained in the existing video data to create new video data. The audiovisual elements do not depend on the template of the system, and the channels are diversified, which can maintain the individualization of the audiovisual elements. , so as to ensure the personalization of the newly produced video data. On the other hand, the audio-visual elements provided by the system can ensure that the format of the audio-visual elements conforms to the production specifications, and can be directly used to produce new video data, preventing users from using professional applications. Elements are revised, which greatly reduces the technical threshold, reduces the time-consuming, and thus reduces the cost of producing video data.
实施例二 Embodiment 2
FIG. 3 is a flowchart of a method for producing video data according to Embodiment 2 of the present application. Based on the foregoing embodiment, this embodiment adds the operations of switching the first video data and playing the second video data. The method includes the following steps:
步骤301、当播放第一视频数据时,显示元素摘要信息。Step 301: When playing the first video data, display element summary information.
元素摘要信息表示第一视频数据包含的视听元素。The element summary information indicates audiovisual elements contained in the first video data.
步骤302、接收作用于元素摘要信息的第一操作。Step 302: Receive a first operation acting on the element summary information.
步骤303、响应于第一操作,共同显示第二视频数据的视频摘要信息、制作控件。Step 303: In response to the first operation, jointly display video summary information and production controls of the second video data.
第二视频数据包含该视听元素。The second video data contains the audiovisual element.
步骤304、接收作用于视频摘要信息的第三操作。Step 304: Receive a third operation acting on the video summary information.
While the video summary information of the second video data is displayed, if the user is interested in one piece of second video data, the user may trigger a third operation on the corresponding video summary information through a human-computer interaction tool provided by the computer device, thereby selecting the second video data to which that video summary information belongs.
In implementation, different types of computer devices provide different human-computer interaction tools, and accordingly the manner of triggering the third operation through the human-computer interaction tool also differs; this embodiment does not limit the manner in which the third operation is triggered through the human-computer interaction tool.
For example, if the human-computer interaction tool provided by the computer device is a touch screen, then when the touch screen detects a touch operation (such as a tap, a long press, or a hard press) within one piece of video summary information, it is determined that a third operation acting on that video summary information is received.
For another example, if the human-computer interaction tool provided by the computer device is an external device, then when a key event (such as a single-click, double-click, or long-press event) sent by the external device and occurring within one piece of video summary information is received, it is determined that a third operation acting on that video summary information is received. The external device includes, but is not limited to, a mouse, a remote control, and the like.
步骤305、响应于第三操作,播放视频摘要信息所属的第二视频数据。Step 305: In response to the third operation, play the second video data to which the video summary information belongs.
客户端响应于用户选择视频摘要信息的第三操作,可以以URL(携带第二视频数据的标识,如ID等)等形式,向服务端请求播放第二视频数据,服务端接收到该请求之后,可向该客户端发送该第二视频数据的部分或全部数据。In response to the user's third operation of selecting the video summary information, the client may request the server to play the second video data in the form of a URL (carrying the identifier of the second video data, such as ID, etc.), and after the server receives the request , part or all of the second video data can be sent to the client.
After buffering part or all of the second video data, the client may call the video player provided by the operating system to play the second video data; that is, the client generates a first user interface, displays the pictures of the second video data in the first user interface, and drives the speaker to play the audio of the second video data.
In this embodiment, aggregating the second video data by audiovisual element makes it possible to push, in a concentrated manner, second video data the user is likely to enjoy, which reduces the operations the user needs to perform to search for similar video data by keywords, page turning and the like, reduces the time spent on such searches, and reduces the client-side and server-side resources (such as processor, memory and bandwidth resources) occupied by those search operations, thereby improving the efficiency with which the user browses the second video data.
步骤306、接收作用于第一视频数据的第四操作。Step 306: Receive a fourth operation acting on the first video data.
While the first video data is being played, if the user is not interested in the first video data, the user may trigger a fourth operation on the first video data through a human-computer interaction tool provided by the computer device, so that the first user interface switches to playing other first video data.
In implementation, different types of computer devices provide different human-computer interaction tools, and accordingly the manner of triggering the fourth operation through the human-computer interaction tool also differs; this embodiment does not limit the manner in which the fourth operation is triggered through the human-computer interaction tool.
For example, if the human-computer interaction tool provided by the computer device is a touch screen, then when the touch screen detects a touch operation (such as a sliding operation) in a blank area of the first user interface (an area other than operable data such as controls and element summary information), it is determined that a fourth operation acting on the first video data is received.
For another example, if the human-computer interaction tool provided by the computer device is an external device, then when a key event (such as a drag event) sent by the external device and occurring in a blank area of the first user interface (an area other than operable data such as controls and element summary information) is received, it is determined that a fourth operation acting on the first video data is received. The external device includes, but is not limited to, a mouse, a remote control, and the like.
步骤307、响应于第四操作,播放与当前用户适配的其他第一视频数据,或者,包含其他视听元素的其他第一视频数据。Step 307: In response to the fourth operation, play other first video data adapted to the current user, or other first video data including other audiovisual elements.
If the first video data currently being played is video data pushed in a personalized manner, that is, the first video data is adapted to the current user (represented by an ID or the like), then in response to the user's fourth operation of switching the first video data, the client may request, for example in the form of a URL, that the server play other first video data adapted to the current user; after receiving the request, the server may send part or all of the other first video data adapted to the current user to the client.
After buffering part or all of the other first video data adapted to the current user, the client may call the video player provided by the operating system to play it; that is, the client switches the first user interface to display the pictures of the other first video data adapted to the current user and drives the speaker to switch to playing its audio.
If the first video data currently being played is video data pushed in a non-personalized manner, the current first video data contains another audiovisual element, and the first video data belongs to the video set corresponding to that audiovisual element, that is, the set of video data containing the same audiovisual element; for ease of distinction, this set is referred to as the second video set in this embodiment.
In response to the user's fourth operation of switching the first video data, the client may request the server to play other first video data in the second video set; after receiving the request, the server may send part or all of the other first video data in the second video set to the client.
After buffering part or all of the other first video data in the second video set, the client may call the video player provided by the operating system to play it; that is, the client switches the first user interface to display the pictures of the other first video data in the second video set and drives the speaker to switch to playing its audio.
At this time, the user may trigger a return operation on the return control in the first user interface through a touch operation or the like; the client receives the return operation acting on the return control and, in response to the return operation, displays the second user interface, in which the video summary information of the first video data in the second video set is displayed.
In this embodiment, the type of the first video data is distinguished, and other first video data adapted to the user or other first video data in the second video set is pushed for personalized and non-personalized business scenarios respectively, which ensures the accuracy of switching the first video data and meets the requirements of the business scenarios.
实施例三 Embodiment 3
FIG. 4 is a flowchart of a method for producing video data according to Embodiment 3 of the present application. This embodiment is applicable to providing audiovisual elements of existing video data for producing new video data. The method may be executed by an apparatus for producing video data; the apparatus may be implemented in software and/or hardware and may be configured in a computer device, for example, a server, a workstation, and so on. The method includes the following steps:
步骤401、将第一视频数据发送至客户端。Step 401: Send the first video data to the client.
In this embodiment, the operating system of the computer device may include Unix, Linux, Windows Server, NetWare, and the like. These operating systems support running a server, which is configured to provide video services to multiple clients, such as pushing video data, publishing video data, and so on.
服务端可通过个性化或非个性化的方式确定第一视频数据,向客户端发送该第一视频数据的部分或全部数据。The server may determine the first video data in a personalized or non-personalized manner, and send part or all of the first video data to the client.
If the first video data is produced by mixing a piece of video data with one or more audiovisual elements, the server may send element summary information of the one or more audiovisual elements to the client, the element summary information indicating the audiovisual elements contained in the first video data.
对于第一视频数据、元素摘要信息,客户端设置为在播放第一视频数据时,在第一用户界面显示该一个或多个元素摘要信息。For the first video data and element summary information, the client is configured to display the one or more element summary information on the first user interface when playing the first video data.
在本申请的一个实施例中,通过多目标优化算法向当前客户端推送个性化的第一视频数据,则在本实施例中,步骤401可以包括如下步骤:In an embodiment of the present application, the personalized first video data is pushed to the current client through a multi-objective optimization algorithm. In this embodiment, step 401 may include the following steps:
步骤4011、获取用户浏览视频数据时记录的历史数据。Step 4011: Acquire historical data recorded when the user browses the video data.
用户(以ID等表示)在客户端浏览视频数据,服务端将在这个浏览过程中的信息记录在日志文件中,并存储在数据库中。The user (represented by ID, etc.) browses the video data on the client side, and the server side records the information during the browsing process in a log file and stores it in the database.
服务端针对该用户,可以在数据库的日志文件中查询该用户浏览视频数据时记录的历史数据,等待筛选与该用户适配的视频数据。For the user, the server can query the historical data recorded when the user browses the video data in the log file of the database, and wait to filter the video data suitable for the user.
步骤4012、从历史数据中提取特征,作为行为特征。Step 4012: Extract features from historical data as behavior features.
在特征的维度下,可从用户的历史数据中提取多种类型的特征,作为行为特征。Under the dimension of features, various types of features can be extracted from the user's historical data as behavioral features.
在一个示例中,该行为特征可以包括如下的至少一种:In one example, the behavioral characteristic may include at least one of the following:
1、用户特征1. User characteristics
在本示例中,可从历史数据中采集用户的特征,作为用户特征。In this example, user characteristics may be collected from historical data as user characteristics.
一方面,该用户特征包括用户固有的特征,例如,ID(即用户标识(User ID,UID))、性别、年龄、国家,等等。In one aspect, the user characteristics include characteristics inherent to the user, eg, ID (ie, User ID (UID)), gender, age, country, and the like.
另一方面,该用户特征包括用户动态的特征,例如,最近一段时间内的观看行为,最近一段时间内的互动行为,最近一段时间内对多种类型的视频数据的偏好度,等等。On the other hand, the user features include user dynamic features, for example, viewing behaviors in a recent period of time, interaction behaviors in a recent period of time, preferences for multiple types of video data in a recent period of time, and so on.
2、视频特征2. Video Features
在本示例中,可从历史数据中采集视频数据的特征,作为视频特征。In this example, features of video data may be collected from historical data as video features.
一方面,该视频特征包括视频数据固有的特征,例如,ID(即视频标识(Video ID,VID))、长度、标签、拍客(制作该视频数据的用户)的UID,等等。In one aspect, the video features include features inherent to the video data, such as ID (ie, Video ID (VID)), length, tag, UID of the photographer (the user who made the video data), and the like.
另一方面,该视频特征包括视频数据动态的特征,例如,最近一段时间内推送给用户的次数,最近一段时间内被观看的次数,最近一段时间内被点赞的次数,等等。On the other hand, the video features include dynamic features of the video data, for example, the number of times pushed to users in a recent period of time, the number of times it was viewed in a recent period of time, the number of times it was liked in a recent period of time, and so on.
3、上下文特征3. Contextual Features
在本示例中,可从历史数据中采集用户浏览视频数据时所处环境的特征,作为上下文特征,例如,请求浏览视频数据的时间,请求浏览视频数据的地点,请求浏览视频数据的网络状况,等等。In this example, the characteristics of the environment where the user browses the video data can be collected from the historical data, as the context characteristics, for example, the time of requesting to browse the video data, the location of the request to browse the video data, the network status of the request to browse the video data, etc.
4、交叉特征4. Cross feature
在本示例中,可将用户特征、视频特征与上下文特征中的至少两者进行组合,获得交叉特征,从而增加特征的维度。In this example, at least two of the user feature, the video feature, and the context feature may be combined to obtain a cross feature, thereby increasing the dimension of the feature.
例如,将用户的UID与视频数据的标签组合为交叉特征,将用户的ID与拍客的UID组合为交叉特征,等等。For example, combine the user's UID with the tag of the video data as a cross feature, combine the user's ID with the photographer's UID as a cross feature, and so on.
上述行为特征只是作为示例,在实施本申请实施例时,可以根据实际情况设置其他行为特征,本申请实施例对此不加以限制。The above behavioral features are only examples. When implementing the embodiments of the present application, other behavioral features may be set according to actual situations, which are not limited in the embodiments of the present application.
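The feature families above are described only informally; the sketch below shows one plausible way to assemble them for a single (user, video, context) triple. All field names are hypothetical and the string-keyed layout is an assumption, not something this document specifies.

```python
def build_behavior_features(user, video, context):
    """Assemble user, video, context and cross features for one (user, video) pair.

    `user`, `video` and `context` are plain dicts with hypothetical field names;
    the real feature schema is not specified in this document.
    """
    return {
        # user features: inherent attributes plus recent behaviour statistics
        "uid": user["uid"],
        "user_age": user["age"],
        "user_country": user["country"],
        "user_recent_watch_count": user["recent_watch_count"],
        # video features: inherent attributes plus recent exposure statistics
        "vid": video["vid"],
        "video_length": video["length"],
        "video_tag": video["tag"],
        "video_recent_like_count": video["recent_like_count"],
        # context features: environment at the time of the browse request
        "request_hour": context["hour"],
        "request_network": context["network"],
        # cross features: combinations that add interaction dimensions
        "uid_x_video_tag": f'{user["uid"]}_{video["tag"]}',
        "uid_x_author_uid": f'{user["uid"]}_{video["author_uid"]}',
    }
```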
步骤4013、使用行为特征预测用户对视频数据执行多个目标行为分别对应的多个概率。Step 4013: Use the behavior feature to predict multiple probabilities corresponding to the user performing multiple target behaviors on the video data.
In the dimension of targets, according to business requirements, attention is paid not only to whether the user clicks on the video data, but also to how long the video data is played after the click and whether there is further interaction with the video data, such as liking, commenting, and sharing.
In this embodiment, a multi-task learning model may be set up. The multi-task learning model can be used to calculate the probabilities that the user performs multiple (two or more) target behaviors (such as clicking, play duration, liking, commenting, sharing, favoriting, following, etc.) on the video data. The probability is expressed as follows:
$p(u_i, v_j, t)$

where $u_i$ denotes the $i$-th user, $v_j$ denotes the $j$-th video data, and $t$ is the current moment; this probability is abbreviated as $p_{i,j}$.
In this embodiment, to encourage the business of users producing content (that is, making new video data), a target behavior representing the conversion from consumption to production is added: requesting other video data that contains the same audiovisual element as the video data, and producing new video data containing that audiovisual element. For this target behavior, reference may be made to steps 101-105.
Then, when training the multi-task learning model, if the user requested other video data through an audiovisual element of a piece of video data and produced new video data containing that audiovisual element, that video data can be set as a positive sample, while negative samples are video data that were browsed without the audiovisual element being triggered.
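As a minimal illustration of this labelling rule (record field names are assumptions of the sketch), a browsing record becomes a positive sample only if the user went on to create a new video through the audiovisual element:

```python
def label_training_samples(browse_records):
    """Label browsing records for the consumption-to-production target behaviour.

    A record is positive if the user requested other videos through the
    audiovisual element and produced a new video containing it; records that
    were merely viewed are negative. Field names are illustrative only.
    """
    samples = []
    for record in browse_records:
        label = 1 if record.get("created_video_with_element") else 0
        samples.append((record["uid"], record["vid"], label))
    return samples
```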
The multi-task learning model may be a neural network, such as a deep neural network (Deep Neural Networks, DNN), or another machine learning model, such as a logistic regression (LR) model or a click-through-rate (CTR) model; this embodiment does not limit the type of the multi-task learning model.
The multi-task learning model can be trained based on multi-task learning. Multi-task learning is a learning method based on inductive transfer, in which multiple targets (such as the target behaviors in this embodiment) are learned together; the information shared among related targets, and even the noise introduced by unrelated targets, can improve the generalization ability of the multi-task learning model to a certain extent.
Multi-task learning belongs to the category of transfer learning. Its main difference from ordinary transfer learning is that it improves the model by learning multiple targets (such as the target behaviors in this embodiment) together, whereas ordinary transfer learning uses other targets to improve the learning effect on a single target.
In implementation, a model based on parameter sharing may be adopted as the multi-task learning model. Taking a neural network as an example, as shown in FIG. 5, the multi-task learning model receives the same input (Input), the bottom layers of the network share model parameters, the multiple target behaviors (such as Task1, Task2, Task3, Task4) learn from one another, and the gradients are back-propagated simultaneously, which can improve the generalization ability of the multi-task learning model.
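A minimal sketch of the parameter-sharing structure of FIG. 5, assuming PyTorch (the document names no framework): a shared bottom network feeds one small head per target behavior, and the per-task losses are summed so the gradients back-propagate through the shared layers together.

```python
import torch
from torch import nn

class SharedBottomMultiTaskModel(nn.Module):
    """Shared-bottom multi-task model: one shared encoder, one head per target behaviour."""

    def __init__(self, input_dim, hidden_dim=64, num_tasks=4):
        super().__init__()
        # bottom layers whose parameters are shared by every task
        self.shared = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # one small head per target behaviour (click, like, share, create-from-element, ...)
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, 1) for _ in range(num_tasks)])

    def forward(self, x):
        shared = self.shared(x)
        # each head outputs the probability of one target behaviour
        return [torch.sigmoid(head(shared)).squeeze(-1) for head in self.heads]

def training_step(model, x, labels, optimizer):
    """Joint step: per-task losses are added so all gradients flow through the shared bottom."""
    preds = model(x)
    loss = sum(nn.functional.binary_cross_entropy(p, y) for p, y in zip(preds, labels))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```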
步骤4014、将多个概率融合为视频数据对于用户的质量值。Step 4014 , fuse the multiple probabilities into a quality value of the video data for the user.
参考用户对同一视频数据执行多个目标行为的多个概率,可以评估该视频数据对于用户的质量值,该质量值可用于表示在目标的维度下,用户喜好该视频数据的程度。With reference to multiple probabilities of the user performing multiple target actions on the same video data, the quality value of the video data for the user can be evaluated, and the quality value can be used to indicate the degree of the user's preference for the video data under the target dimension.
一般情况下,该质量值与概率正相关,即概率越高,则质量值越大,概率越低,则质量值越小。In general, the quality value is positively correlated with the probability, that is, the higher the probability, the greater the quality value, and the lower the probability, the smaller the quality value.
在一个示例中,可以通过线性融合的方式将多个概率融合为视频数据对于用户的质量值,对每个概率配置特征权重,此时,特征权重越大、该目标行为越重要。In one example, multiple probabilities can be fused into a quality value of video data for the user by means of linear fusion, and feature weights are configured for each probability. In this case, the larger the feature weight, the more important the target behavior is.
计算每个概率与所述每个概率对应的特征权重之间的乘积,作为特征值,计算所有特征值的和值,作为视频数据对于用户的质量值。The product between each probability and the feature weight corresponding to each probability is calculated as the feature value, and the sum of all the feature values is calculated as the quality value of the video data for the user.
Denote the set of target behaviors as $O = \{O_1, O_2, \ldots, O_k\}$, and denote the predicted probability that user $u_i$ performs target behavior $O_l$ on video data $v_j$ as $p_{i,j,l}$. Then the quality value of video data $v_j$ for user $u_i$ is

$$q_{i,j} = \sum_{l=1}^{k} w_l \cdot p_{i,j,l}$$

where $w_l$ is the feature weight.
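Read literally, the fused quality value $q_{i,j}$ is just the weighted sum above; a minimal sketch, with purely hypothetical weights, could look like:

```python
def quality_value(probabilities, weights):
    """Fuse per-behaviour probabilities p_{i,j,l} into a single quality value q_{i,j}.

    `probabilities[l]` is the predicted probability that the user performs
    target behaviour O_l on the video; `weights[l]` is its feature weight w_l.
    """
    return sum(w * p for w, p in zip(weights, probabilities))

# example with hypothetical weights for click / finish / like / create-from-element
q = quality_value([0.62, 0.40, 0.08, 0.03], [0.3, 0.3, 0.2, 0.2])
```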
步骤4015、若质量值满足预设的召回条件,则将质量值所属的视频数据设置为与用户适配的第一视频数据。Step 4015: If the quality value satisfies the preset recall condition, set the video data to which the quality value belongs as the first video data adapted to the user.
在本实施例中,可以预先设置召回条件,例如,数值最高的n(n为正整数)个质量值,质量值大于阈值,数值最高的m%(m为正数)个质量值,等等。In this embodiment, recall conditions can be preset, for example, n (n is a positive integer) quality values with the highest numerical value, the quality value is greater than the threshold, m% (m is a positive number) quality values with the highest numerical value, etc. .
If the quality value of the current video data meets the recall condition, the video data to which the quality value belongs is set as the first video data adapted to the user; at this time, the association between the user's identifier and the identifier of the first video data may be recorded.
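As a hedged sketch of this recall step, assuming the "top-n quality values" form of the condition and an in-memory mapping in place of whatever storage the real service uses:

```python
def recall_first_videos(quality_by_video, n=100):
    """Pick the n candidate videos with the highest quality values for one user.

    `quality_by_video` maps video id -> quality value q_{i,j}; the returned ids
    would be stored as the user's adapted "first video data" associations.
    """
    ranked = sorted(quality_by_video.items(), key=lambda item: item[1], reverse=True)
    return [vid for vid, _ in ranked[:n]]

# the user id -> recalled video ids mapping can then be persisted for online lookup
user_recall_table = {"user_42": recall_first_videos({"v1": 0.91, "v2": 0.35, "v3": 0.77}, n=2)}
```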
步骤4016、将第一视频数据发送至客户端。Step 4016: Send the first video data to the client.
In this embodiment, steps 4011-4015 may be performed offline. For the online case, if a user (represented by an ID or the like) is currently logged in to a client, the user's identifier can be used as the search condition to search for the identifiers of the first video data associated with that user's identifier; the first video data is then looked up based on those identifiers and sent to the client.
步骤402、当接收到客户端基于元素摘要信息触发的请求时,查找包含视听元素的第二视频数据。Step 402: When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
客户端在接收到作用于元素摘要信息的第一操作时,生成请求,并将该请求发送至服务端,请求服务端推送包含该元素摘要信息对应的视听元素的第二视频数据。When receiving the first operation acting on the element summary information, the client generates a request, sends the request to the server, and requests the server to push the second video data including the audiovisual element corresponding to the element summary information.
服务端可通过多种方式收集视频数据并对该视频数据标记其包含的视听元素(以ID等表示),存储在服务端本地的数据库中。The server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in the local database of the server.
服务端在接收到客户端的请求时,可搜索标记该视听元素的视频数据,作为第二视频数据,并将该第二视频数据写入第一视频集合中。When receiving the request from the client, the server can search for the video data marked with the audiovisual element as the second video data, and write the second video data into the first video set.
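A toy sketch of this server-side bookkeeping, with in-memory dictionaries standing in for whatever database the real service uses: videos are stored together with the IDs of the audiovisual elements they are marked with, and a request carrying an element ID returns the matching first video set.

```python
from collections import defaultdict

# element id -> ids of videos marked as containing that audiovisual element
videos_by_element = defaultdict(set)

def mark_video(video_id, element_ids):
    """Record which audiovisual elements a stored video is marked as containing."""
    for element_id in element_ids:
        videos_by_element[element_id].add(video_id)

def find_second_videos(element_id):
    """Return the first video set: all videos tagged with the requested element."""
    return set(videos_by_element.get(element_id, set()))

mark_video("video_001", ["music_77"])
mark_video("video_002", ["music_77", "clip_12"])
assert find_second_videos("music_77") == {"video_001", "video_002"}
```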
步骤403、将第二视频数据的视频摘要信息发送至客户端。Step 403: Send the video summary information of the second video data to the client.
The server extracts, from its local database, the video summary information (such as the cover, the name, the producer, etc.) of the second video data in the first video set, and sends the video summary information to the client.
对于视频摘要信息,客户端可设置为在第二用户界面共同显示视频摘要信息、制作控件。For the video summary information, the client may be configured to jointly display the video summary information and production controls on the second user interface.
步骤404、当接收到客户端基于制作控件触发的请求时,将元素摘要信息对应的视听元素发送至客户端。Step 404: When a request triggered by the client based on the production control is received, send the audiovisual element corresponding to the element summary information to the client.
客户端在接收到作用于制作控件的第二操作时,生成请求,并将该请求发送至服务端,请求服务端推送该元素摘要信息(以ID等表示)对应的视听元素。When receiving the second operation acting on the production control, the client generates a request, sends the request to the server, and requests the server to push the audiovisual element corresponding to the element abstract information (represented by ID, etc.).
服务端接收到该请求之后,查找该元素摘要信息(以ID等表示)对应的、且独立的视听元素,并将该视听元素发送至客户端。After receiving the request, the server searches for an independent audiovisual element corresponding to the element abstract information (represented by ID, etc.), and sends the audiovisual element to the client.
对于该视听元素,客户端可设置为采集第三视频数据,将元素摘要信息对应的视听元素添加至第三视频数据中。For the audiovisual element, the client may be configured to collect third video data, and add the audiovisual element corresponding to the element summary information to the third video data.
In one example, the audiovisual element includes audio data. In this example, the audio data corresponding to the element summary information may be sent to the client, and the client may be configured to set the audio data corresponding to the element summary information as the background music of the third video data.
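One concrete, though not prescribed, way to realise "set the audio data as background music" is to remux the captured clip with the element's audio track; the ffmpeg invocation below is an assumption of this sketch, not something the document mandates.

```python
import subprocess

def set_background_music(video_path, music_path, output_path):
    """Mux the element's audio onto the third video data as background music.

    Keeps the original video stream, takes the audio stream from the music
    file, and stops at the shorter of the two.
    """
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", video_path,   # captured third video data
            "-i", music_path,   # audio data of the audiovisual element
            "-map", "0:v:0",    # video from the first input
            "-map", "1:a:0",    # audio from the second input
            "-c:v", "copy",
            "-c:a", "aac",
            "-shortest",
            output_path,
        ],
        check=True,
    )
```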
In another example, the audiovisual element includes fourth video data. In this example, the fourth video data corresponding to the element summary information may be sent to the client, and the client may be configured to combine the fourth video data corresponding to the element summary information with the third video data in a split-screen manner.
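For the split-screen case, a rough sketch using OpenCV (again an assumption rather than the document's own method): each pair of frames is scaled to a common height and placed side by side, echoing FIG. 2F where the captured video sits on the left and the element's video on the right.

```python
import cv2

def compose_split_screen(left_path, right_path, output_path, height=720):
    """Place the captured third video (left) and the element's fourth video (right) side by side."""
    left = cv2.VideoCapture(left_path)
    right = cv2.VideoCapture(right_path)
    fps = left.get(cv2.CAP_PROP_FPS) or 30.0
    writer = None
    while True:
        ok_left, frame_left = left.read()
        ok_right, frame_right = right.read()
        if not (ok_left and ok_right):
            break  # stop at the end of the shorter clip
        # scale both frames to a common height, preserving aspect ratio
        frame_left = cv2.resize(frame_left, (frame_left.shape[1] * height // frame_left.shape[0], height))
        frame_right = cv2.resize(frame_right, (frame_right.shape[1] * height // frame_right.shape[0], height))
        combined = cv2.hconcat([frame_left, frame_right])
        if writer is None:
            writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*"mp4v"), fps,
                                     (combined.shape[1], combined.shape[0]))
        writer.write(combined)
    left.release()
    right.release()
    if writer is not None:
        writer.release()
```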
If the client finishes producing the third video data, the client may upload the third video data; the server then receives the third video data sent by the client and marks the third video data as containing the audiovisual element (represented by an ID or the like). Once the marking is completed, the third video data is published, so that other clients can browse the third video data.
由于本实施例与实施例一的应用基本相似,所以描述的比较简单,相关之处参见实施例一的部分说明即可,本实施例在此不加以详述。Since the application of this embodiment is basically similar to that of the first embodiment, the description is relatively simple, and the relevant parts may refer to the partial description of the first embodiment, and this embodiment will not be described in detail here.
In this embodiment, the first video data is sent to the client; the client is configured to display element summary information when playing the first video data, the element summary information indicating the audiovisual elements contained in the first video data. When a request triggered by the client based on the element summary information is received, second video data containing the audiovisual element is searched for, and the video summary information of the second video data is sent to the client; the client is configured to display the video summary information and a production control together. When a request triggered by the client based on the production control is received, the audiovisual element corresponding to the element summary information is sent to the client; the client is configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data. On the one hand, users can produce new video data with audiovisual elements taken from existing video data; such elements do not depend on templates provided by the system and come from diverse channels, so the individuality of the audiovisual elements, and therefore of the newly produced video data, is preserved. On the other hand, because the system supplies the audiovisual element, its format is guaranteed to meet the production specifications and it can be used directly to produce new video data, so users do not have to correct the element with professional applications. This greatly lowers the technical threshold and the time required, and thus reduces the cost of producing video data.
实施例四 Embodiment 4
FIG. 6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application. Based on the foregoing embodiments, this embodiment adds the operations of switching the first video data and playing the second video data. The method includes the following steps:
步骤601、将第一视频数据发送至客户端。Step 601: Send the first video data to the client.
对于第一视频数据,客户端设置为在播放第一视频数据时,显示元素摘要信息,其中,元素摘要信息表示第一视频数据包含的视听元素。For the first video data, the client is configured to display element summary information when the first video data is played, where the element summary information indicates audiovisual elements included in the first video data.
步骤602、当接收到客户端基于元素摘要信息触发的请求时,查找包含视听元素的第二视频数据。Step 602: When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
步骤603、将第二视频数据的视频摘要信息发送至客户端。Step 603: Send the video summary information of the second video data to the client.
对于第二视频数据的视频摘要信息,客户端设置为共同显示视频摘要信息、制作控件。For the video summary information of the second video data, the client is set to jointly display the video summary information and the production controls.
步骤604、当接收到客户端基于视频摘要信息触发的请求时,将视频摘要信息所属的第二视频数据发送至客户端进行播放。Step 604: When a request triggered by the client based on the video summary information is received, send the second video data to which the video summary information belongs to the client for playback.
客户端在接收到作用于视频摘要信息的第三操作时,生成请求,并将该请求发送至服务端,请求服务端推送包含该视频摘要信息对应的第二视频数据。When receiving the third operation acting on the video summary information, the client generates a request, sends the request to the server, and requests the server to push the second video data corresponding to the video summary information.
服务端在接收到客户端的请求时,可搜索该视频摘要信息对应的第二视频数据,并将该第二视频数据的部分或全部数据发送至客户端。When receiving the request from the client, the server can search for the second video data corresponding to the video summary information, and send part or all of the second video data to the client.
客户端在缓存该第二视频数据的部分或全部数据之后,可调用操作系统提供的视频播放器播放该第二视频数据。After buffering part or all of the second video data, the client can call the video player provided by the operating system to play the second video data.
In this embodiment, aggregating the second video data by audiovisual element makes it possible to push, in a concentrated manner, second video data the user is likely to enjoy, which reduces the operations the user needs to perform to search for similar video data by keywords, page turning and the like, reduces the time spent on such searches, and reduces the client-side and server-side resources (such as processor, memory and bandwidth resources) occupied by those search operations, thereby improving the efficiency with which the user browses the second video data.
步骤605、当接收到客户端基于第一视频数据触发的请求时,将与用户适配的其他第一视频数据发送至客户端进行播放,或者,将包含其他视听元素的其他第一视频数据发送至客户端进行播放。Step 605: When receiving a request triggered by the client based on the first video data, send other first video data adapted to the user to the client for playback, or send other first video data containing other audiovisual elements to the client for playback.
客户端在接收到作用于第一视频数据的第四操作时,生成请求,并将该请求发送至服务端,请求服务端推送其他第一视频数据。When receiving the fourth operation acting on the first video data, the client generates a request, sends the request to the server, and requests the server to push other first video data.
服务端在接收到该请求时,识别第一视频数据的类型,从而区分推送不同的第一视频数据。When receiving the request, the server identifies the type of the first video data, so as to distinguish and push different first video data.
If the first video data currently being played is video data pushed in a personalized manner, that is, it is adapted to the current user (represented by an ID or the like), the server may send part or all of other first video data adapted to the current user to the client.
客户端在缓存与当前用户适配的其他第一视频数据的部分或全部数据之后,可调用操作系统提供的视频播放器播放与当前用户适配的其他第一视频数据。After buffering part or all of the other first video data adapted to the current user, the client can call the video player provided by the operating system to play the other first video data adapted to the current user.
If the first video data currently being played is video data pushed in a non-personalized manner, the current first video data contains another audiovisual element, and the first video data comes from the second video set corresponding to that audiovisual element, the server may send part or all of other first video data in the second video set to the client.
客户端在缓存该第二视频集合中的其他第一视频数据的部分或全部数据之后,可调用操作系统提供的视频播放器播放该第二视频集合中的其他第一视频数据。After buffering part or all of the other first video data in the second video set, the client can call the video player provided by the operating system to play the other first video data in the second video set.
In this embodiment, the type of the first video data is distinguished, and other first video data adapted to the user or other first video data in the second video set is pushed for personalized and non-personalized business scenarios respectively, which ensures the accuracy of switching the first video data and meets the requirements of the business scenarios.
由于本实施例与实施例二的应用基本相似,所以描述的比较简单,相关之处参见实施例二的部分说明即可,本实施例在此不加以详述。Since the application of this embodiment is basically similar to that of the second embodiment, the description is relatively simple, and the relevant parts may refer to the partial description of the second embodiment, and this embodiment will not be described in detail here.
The method embodiments are described as a series of action combinations for simplicity of description, but the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in other orders or simultaneously. Moreover, the embodiments described herein are all examples, and the actions involved are not necessarily required by the embodiments of the present application.
实施例五Embodiment 5
图7为本申请实施例五提供的一种视频数据的制作装置的结构框图,可以包括如下模块:7 is a structural block diagram of an apparatus for producing video data according to Embodiment 5 of the present application, which may include the following modules:
A display screen 701, configured to display element summary information when first video data is played, the element summary information indicating audiovisual elements contained in the first video data; a touch screen 702, configured to receive a first operation acting on the element summary information; the display screen 701 is further configured to, in response to the first operation, display video summary information of second video data and a production control together, the second video data containing the audiovisual element; the touch screen 702 is further configured to receive a second operation acting on the production control; a camera 703, configured to collect third video data in response to the second operation; and a processor 704, configured to add the audiovisual element corresponding to the element summary information to the third video data.
本申请实施例所提供的视频数据的制作装置可执行本申请任意实施例所提供的视频数据的制作方法,具备执行方法相应的功能模块和效果。The apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
实施例六Embodiment 6
图8为本申请实施例六提供的一种视频数据的制作装置的结构框图,可以包括如下模块:8 is a structural block diagram of an apparatus for producing video data according to Embodiment 6 of the present application, which may include the following modules:
A first video data sending module 801, configured to send first video data to a client, the client being configured to display element summary information when playing the first video data, the element summary information indicating audiovisual elements contained in the first video data; a second video data search module 802, configured to, upon receiving a request triggered by the client based on the element summary information, search for second video data containing the audiovisual element; a video summary information sending module 803, configured to send video summary information of the second video data to the client, the client being configured to display the video summary information and a production control together; and an audiovisual element sending module 804, configured to, upon receiving a request triggered by the client based on the production control, send the audiovisual element corresponding to the element summary information to the client, the client being configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
本申请实施例所提供的视频数据的制作装置可执行本申请任意实施例所提供的视频数据的制作方法,具备执行方法相应的功能模块和效果。The apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
实施例七Embodiment 7
图9为本申请实施例七提供的一种计算机设备的结构示意图。图9示出了适于用来实现本申请实施方式的示例性计算机设备12的框图。FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application. Figure 9 shows a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present application.
如图9所示,计算机设备12以通用计算设备的形式表现。计算机设备12的组件可以包括但不限于:一个或者多个处理器或者处理单元16,系统存储器28,连接不同系统组件(包括系统存储器28和处理单元16)的总线18。As shown in FIG. 9, computer device 12 takes the form of a general-purpose computing device. Components of computer device 12 may include, but are not limited to, one or more processors or processing units 16 , system memory 28 , and a bus 18 connecting various system components including system memory 28 and processing unit 16 .
系统存储器28可以包括易失性存储器形式的计算机系统可读介质,例如随机存取存储器(Random Access Memory,RAM)30和/或高速缓存存储器32。存储系统34可以设置为读写不可移动的、非易失性磁介质(图9未显示,通常称为“硬盘驱动器”)。 System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32 . Storage system 34 may be configured to read and write to non-removable, non-volatile magnetic media (not shown in Figure 9, commonly referred to as a "hard disk drive").
具有一组(至少一个)程序模块42的程序/实用工具40,可以存储在例如存储器28中。A program/utility 40 having a set (at least one) of program modules 42 may be stored in memory 28, for example.
计算机设备12也可以与一个或多个外部设备14(例如键盘、指向设备、显示器24等)通信。这种通信可以通过输入/输出(Input/Output,I/O)接口22进行。并且,计算机设备12还可以通过网络适配器20与一个或者多个网络(例如局域网(Local Area Network,LAN),广域网(Wide Area Network,WAN)和/或公共网络,例如因特网)通信。如图所示,网络适配器20通过总线18与计算机设备12的其它模块通信。 Computer device 12 may also communicate with one or more external devices 14 (eg, keyboard, pointing device, display 24, etc.). Such communication may take place through an input/output (I/O) interface 22 . Also, computer device 12 may communicate with one or more networks (eg, Local Area Network (LAN), Wide Area Network (WAN), and/or public networks such as the Internet) through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18 .
处理单元16通过运行存储在系统存储器28中的程序,从而执行多种功能应用以及数据处理,例如实现本申请实施例所提供的视频数据的制作方法。The processing unit 16 executes a variety of functional applications and data processing by running the programs stored in the system memory 28, for example, implementing the video data production method provided by the embodiments of the present application.
实施例八 Embodiment 8
Embodiment 8 of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the processes of the above method for producing video data are implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.

Claims (17)

  1. 一种视频数据的制作方法,包括:A method for producing video data, comprising:
    在播放第一视频数据的情况下,显示元素摘要信息,其中,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;In the case of playing the first video data, element summary information is displayed, wherein the element summary information is used to represent audiovisual elements included in the first video data;
    接收作用于所述元素摘要信息的第一操作;receiving a first operation acting on the element summary information;
    响应于所述第一操作,共同显示第二视频数据的视频摘要信息以及制作控件,其中,所述第二视频数据包含所述视听元素;in response to the first operation, collectively displaying video summary information and production controls for second video data, wherein the second video data includes the audiovisual element;
    接收作用于所述制作控件的第二操作;receiving a second operation acting on the production control;
    响应于所述第二操作,采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。In response to the second operation, third video data is collected, and audiovisual elements corresponding to the element summary information are added to the third video data.
  2. 根据权利要求1所述的方法,其中,所述共同显示第二视频数据的视频摘要信息以及制作控件,包括:The method according to claim 1, wherein the jointly displaying the video summary information and the production control of the second video data comprises:
    显示信息区域,其中,所述信息区域的面积与所述视听元素的类型匹配;displaying an information area, wherein the area of the information area matches the type of the audiovisual element;
    在所述信息区域中显示所述第二视频数据的视频摘要信息;displaying video summary information of the second video data in the information area;
    显示所述制作控件,其中,所述制作控件悬浮于所述信息区域上。The authoring control is displayed, wherein the authoring control is suspended on the information area.
  3. 根据权利要求2所述的方法,其中,所述显示信息区域,包括:The method of claim 2, wherein the displaying the information area comprises:
    在所述视听元素的类型为可视元素的情况下,显示第一区域,将所述第一区域作为所述信息区域;In the case that the type of the audiovisual element is a visual element, display a first area, and use the first area as the information area;
    在所述视听元素的类型为可听元素的情况下,显示第二区域,将所述第二区域作为所述信息区域;In the case that the type of the audiovisual element is an audible element, display a second area, and use the second area as the information area;
    其中,所述第一区域的面积大于所述第二区域的面积。Wherein, the area of the first region is larger than that of the second region.
  4. 根据权利要求2所述的方法,其中,所述元素摘要信息包括元素图像数据;The method of claim 2, wherein the element summary information includes element image data;
    所述共同显示第二视频数据的视频摘要信息以及制作控件,还包括:The jointly displaying the video summary information and the production control of the second video data further includes:
    以背景的形式显示所述元素图像数据;displaying the element image data in the form of a background;
    以标题的形式显示所述元素摘要信息。The element summary information is displayed in the form of a title.
  5. 根据权利要求1所述的方法,其中,所述视听元素包括音频数据;The method of claim 1, wherein the audiovisual element comprises audio data;
    所述将所述元素摘要信息对应的视听元素添加至所述第三视频数据中,包括:The adding the audiovisual element corresponding to the element summary information to the third video data includes:
    将所述元素摘要信息对应的音频数据设置为所述第三视频数据的背景音乐。The audio data corresponding to the element summary information is set as the background music of the third video data.
  6. 根据权利要求1所述的方法,其中,所述视听元素包括第四视频数据;The method of claim 1, wherein the audiovisual element comprises fourth video data;
    所述将所述元素摘要信息对应的视听元素添加至所述第三视频数据中,包括:The adding the audiovisual element corresponding to the element summary information to the third video data includes:
    以分屏的方式合成所述元素摘要信息对应的第四视频数据与所述第三视频数据。The fourth video data and the third video data corresponding to the element summary information are synthesized in a split-screen manner.
  7. 根据权利要求1所述的方法,还包括:The method of claim 1, further comprising:
    接收作用于所述视频摘要信息的第三操作;receiving a third operation acting on the video summary information;
    响应于所述第三操作,播放所述视频摘要信息所属的第二视频数据。In response to the third operation, second video data to which the video summary information belongs is played.
  8. 根据权利要求1-7中任一项所述的方法,还包括:The method according to any one of claims 1-7, further comprising:
    接收作用于所述第一视频数据的第四操作;receiving a fourth operation acting on the first video data;
    响应于所述第四操作,播放与当前用户适配的其他第一视频数据,或者,在所述第一视频数据包含其他视听元素的情况下播放包含所述其他视听元素的其他第一视频数据。In response to the fourth operation, play other first video data adapted to the current user, or play other first video data including the other audiovisual elements if the first video data includes other audiovisual elements .
  9. 根据权利要求1-7中任一项所述的方法,在所述将所述元素摘要信息对应的视听元素添加至所述第三视频数据中之后,还包括:The method according to any one of claims 1-7, after adding the audiovisual element corresponding to the element summary information to the third video data, further comprising:
    publishing the third video data marked as containing the audiovisual element.
  10. 一种视频数据的制作方法,包括:A method for producing video data, comprising:
    将第一视频数据发送至客户端,其中,所述客户端设置为在播放所述第一视频数据的情况下,显示元素摘要信息,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;Send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information is used to indicate that the first video data contains audiovisual elements;
    在接收到所述客户端基于所述元素摘要信息触发的请求的情况下,查找包含所述视听元素的第二视频数据;in the case of receiving a request triggered by the client based on the element summary information, searching for second video data containing the audiovisual element;
    将所述第二视频数据的视频摘要信息发送至所述客户端,其中,所述客户端还设置为共同显示所述视频摘要信息以及制作控件;sending the video summary information of the second video data to the client, wherein the client is further configured to jointly display the video summary information and production controls;
    在接收到所述客户端基于所述制作控件触发的请求的情况下,将所述元素摘要信息对应的视听元素发送至所述客户端,其中,所述客户端还设置为采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。In the case of receiving the request triggered by the client based on the production control, send the audiovisual element corresponding to the element summary information to the client, wherein the client is further configured to collect third video data , adding the audiovisual element corresponding to the element summary information to the third video data.
  11. 根据权利要求10所述的方法,其中,所述将第一视频数据发送至客户端,包括:The method according to claim 10, wherein the sending the first video data to the client comprises:
    获取用户浏览视频数据的情况下记录的历史数据,其中,所述用户当前登 录于一客户端;Obtain the historical data recorded under the situation that the user browses the video data, wherein, the user is currently logged in to a client;
    从所述历史数据中提取特征,将提取的特征作为行为特征;Extract features from the historical data, and use the extracted features as behavioral features;
    使用所述行为特征预测所述用户对所述视频数据执行多个目标行为分别对应的多个概率,其中,所述目标行为包括请求与所述视频数据包含相同视听元素的其他视频数据,且制作新的、包含所述视听元素的视频数据;Using the behavior feature to predict a plurality of probabilities respectively corresponding to the user performing a plurality of target actions on the video data, wherein the target actions include requesting other video data containing the same audiovisual elements as the video data, and making new video data containing the audiovisual element;
    将所述多个概率融合为所述视频数据对于所述用户的质量值;fusing the plurality of probabilities into a quality value of the video data for the user;
    在所述质量值满足预设的召回条件的情况下,将所述质量值所属的视频数据设置为与所述用户适配的第一视频数据;In the case that the quality value satisfies a preset recall condition, setting the video data to which the quality value belongs as the first video data adapted to the user;
    将所述第一视频数据发送至所述客户端。Sending the first video data to the client.
  12. 根据权利要求11所述的方法,其中,所述行为特征包括用户特征、视频特征、上下文特征、交叉特征中的至少一种;The method according to claim 11, wherein the behavioral features include at least one of user features, video features, contextual features, and cross-features;
    所述从所述历史数据中提取特征,作为行为特征,包括:The extracting features from the historical data, as behavioral features, include:
    从所述历史数据中采集所述用户的特征,作为所述用户特征;Collect the user's characteristics from the historical data as the user's characteristics;
    从所述历史数据中采集所述视频数据的特征,作为所述视频特征;The features of the video data are collected from the historical data as the video features;
    从所述历史数据中采集所述用户浏览所述视频数据的情况下所处环境的特征,作为所述上下文特征;Collect the characteristics of the environment in which the user browses the video data from the historical data, as the context characteristics;
    将所述用户特征、所述视频特征与所述上下文特征中的至少两者进行组合,获得所述交叉特征。The intersection feature is obtained by combining at least two of the user feature, the video feature, and the context feature.
  13. 根据权利要求11-12中任一项所述的方法,其中,所述将所述多个概率融合为所述视频数据对于所述用户的质量值,包括:The method according to any one of claims 11-12, wherein the fusing the plurality of probabilities into a quality value of the video data for the user comprises:
    对每个概率配置特征权重;Configure feature weights for each probability;
    计算每个概率与所述每个概率对应的特征权重之间的乘积,将所述乘积作为所述每个概率对应的特征值;Calculate the product between each probability and the feature weight corresponding to each probability, and use the product as the feature value corresponding to each probability;
    计算所有特征值的和值,作为所述视频数据对于所述用户的质量值。The sum of all feature values is calculated as the quality value of the video data for the user.
  14. 一种视频数据的制作装置,包括:A device for producing video data, comprising:
    显示屏,设置为在播放第一视频数据的情况下,显示元素摘要信息,其中,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;a display screen, configured to display element summary information when the first video data is played, wherein the element summary information is used to represent audiovisual elements included in the first video data;
    触控屏,设置为接收作用于所述元素摘要信息的第一操作;a touch screen, configured to receive a first operation acting on the element summary information;
    显示屏,还设置为响应于所述第一操作,共同显示第二视频数据的视频摘要信息以及制作控件,其中,所述第二视频数据包含所述视听元素;a display screen, further configured to jointly display video summary information and production controls of second video data in response to the first operation, wherein the second video data includes the audiovisual element;
    触控屏,还设置为接收作用于所述制作控件的第二操作;a touch screen, further configured to receive a second operation acting on the production control;
    摄像头,设置为响应于所述第二操作,采集第三视频数据;a camera, configured to collect third video data in response to the second operation;
    处理器,设置为将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。The processor is configured to add the audiovisual element corresponding to the element summary information to the third video data.
  15. 一种视频数据的制作装置,包括:A device for producing video data, comprising:
    第一视频数据发送模块,设置为将第一视频数据发送至客户端,其中,所述客户端设置为在播放所述第一视频数据的情况下,显示元素摘要信息,所述元素摘要信息用于表示所述第一视频数据包含的视听元素;The first video data sending module is configured to send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information uses to represent the audiovisual elements contained in the first video data;
    第二视频数据查找模块,设置为在接收到所述客户端基于所述元素摘要信息触发的请求的情况下,查找包含所述视听元素的第二视频数据;A second video data search module, configured to search for the second video data containing the audiovisual element in the case of receiving a request triggered by the client based on the element summary information;
    视频摘要信息发送模块,设置为将所述第二视频数据的视频摘要信息发送至所述客户端,其中,所述客户端还设置为共同显示所述视频摘要信息以及制作控件;a video summary information sending module, configured to send the video summary information of the second video data to the client, wherein the client is further configured to jointly display the video summary information and the production control;
    视听元素发送模块,设置为在接收到所述客户端基于所述制作控件触发的请求的情况下,将所述元素摘要信息对应的视听元素发送至所述客户端,其中,所述客户端还设置为采集第三视频数据,将所述元素摘要信息对应的视听元素添加至所述第三视频数据中。The audiovisual element sending module is configured to send the audiovisual element corresponding to the element summary information to the client when receiving the request triggered by the client based on the production control, wherein the client also further Setting is to collect third video data, and add audiovisual elements corresponding to the element summary information to the third video data.
  16. A computer device, comprising:
    at least one processor;
    a memory, configured to store at least one program;
    wherein, when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the method for producing video data according to any one of claims 1-13.
  17. A computer-readable storage medium, configured to store a computer program, wherein, when the computer program is executed by a processor, the method for producing video data according to any one of claims 1-13 is implemented.
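
The quality value recited at the end of claim 13 above is a weighted sum of per-feature probabilities. The following minimal sketch illustrates the two arithmetic steps; it is not part of the claims, and the function name, variable names and example numbers are assumptions introduced here for illustration only:

    # Weighted sum as recited above: multiply each probability by its feature weight,
    # then sum the resulting feature values into a single quality value.
    def quality_value(probabilities, weights):
        return sum(p * w for p, w in zip(probabilities, weights))

    # Hypothetical example: three per-feature probabilities for one candidate video.
    quality_value([0.8, 0.3, 0.6], [0.5, 0.2, 0.3])  # ≈ 0.64 = 0.8*0.5 + 0.3*0.2 + 0.6*0.3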
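Claim 14 describes the client-side flow: a first operation on the element summary information brings up second video data and a production control, and a second operation on that control records third video data and attaches the audiovisual elements. A minimal sketch of that flow, again illustrative only, with every class and method name being a hypothetical placeholder:

    # Hypothetical client-side flow for the device of claim 14.
    class VideoClient:
        def __init__(self, display, camera, server):
            self.display = display  # display screen: shows summaries and controls
            self.camera = camera    # camera: collects the third video data
            self.server = server    # backend corresponding to the device of claim 15

        def on_element_summary_tapped(self, element_summary):
            # First operation: fetch second videos containing the element and show
            # their summaries together with a production control.
            second_videos = self.server.find_videos(element_summary)
            self.display.show(second_videos, production_control=True)

        def on_production_control_tapped(self, element_summary):
            # Second operation: collect third video data and add the audiovisual element.
            element = self.server.fetch_element(element_summary)
            third_video = self.camera.record()
            third_video.add_element(element)
            return third_video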
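Claim 15 mirrors the same interaction on the server side as four sending and searching modules. A corresponding sketch, with the storage objects and method names likewise assumed for illustration:

    # Hypothetical server-side counterpart to the device of claim 15;
    # each method mirrors one of the four modules.
    class VideoServer:
        def __init__(self, video_index, element_store):
            self.video_index = video_index      # maps element summaries to videos
            self.element_store = element_store  # holds the audiovisual elements

        def send_first_video(self, client, first_video):
            # First video data sending module.
            client.play(first_video)

        def find_videos(self, element_summary):
            # Second video data search module: second videos containing the element.
            return self.video_index.lookup(element_summary)

        def send_video_summaries(self, client, second_videos):
            # Video summary information sending module.
            client.show_summaries([v.summary for v in second_videos])

        def fetch_element(self, element_summary):
            # Audiovisual element sending module.
            return self.element_store.get(element_summary)
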
PCT/CN2021/108174 2020-08-31 2021-07-23 Method and apparatus for manufacturing video data, and computer device and storage medium WO2022042157A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010896513.0A CN112040339A (en) 2020-08-31 2020-08-31 Method and device for making video data, computer equipment and storage medium
CN202010896513.0 2020-08-31

Publications (1)

Publication Number Publication Date
WO2022042157A1 (en)

Family

ID=73586447

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108174 WO2022042157A1 (en) 2020-08-31 2021-07-23 Method and apparatus for manufacturing video data, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112040339A (en)
WO (1) WO2022042157A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112040339A (en) * 2020-08-31 2020-12-04 广州市百果园信息技术有限公司 Method and device for making video data, computer equipment and storage medium
CN113011566A (en) * 2021-03-30 2021-06-22 北京深演智能科技股份有限公司 Data processing method, electronic device and computer readable storage medium
CN114268815B (en) * 2021-12-15 2024-08-13 北京达佳互联信息技术有限公司 Video quality determining method, device, electronic equipment and storage medium
CN115103219A (en) * 2022-07-01 2022-09-23 抖音视界(北京)有限公司 Audio distribution method, device and computer readable storage medium
CN116916082B (en) * 2023-09-12 2023-12-08 华光影像科技有限公司 Film and television making interface switching system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109874053B (en) * 2019-02-21 2021-10-22 南京航空航天大学 Short video recommendation method based on video content understanding and user dynamic interest

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004200811A (en) * 2002-12-16 2004-07-15 Canon Inc Moving picture photographing apparatus
CN106375782A (en) * 2016-08-31 2017-02-01 北京小米移动软件有限公司 Video playing method and device
CN107959873A (en) * 2017-11-02 2018-04-24 深圳天珑无线科技有限公司 Method, apparatus, terminal and the storage medium of background music are implanted into video
CN108600825A (en) * 2018-07-12 2018-09-28 北京微播视界科技有限公司 Select method, apparatus, terminal device and the medium of background music shooting video
CN108668164A (en) * 2018-07-12 2018-10-16 北京微播视界科技有限公司 Select method, apparatus, terminal device and the medium of background music shooting video
CN108900768A (en) * 2018-07-12 2018-11-27 北京微播视界科技有限公司 Video capture method, apparatus, terminal, server and storage medium
CN112040339A (en) * 2020-08-31 2020-12-04 广州市百果园信息技术有限公司 Method and device for making video data, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DREAM IN SHENXIANG: "How to use other people's video background music to shoot video in Douyin?", CN, pages 1 - 5, XP009534600, Retrieved from the Internet <URL:https://www.kafan.cn/edu/20089681.html> *
DREAM IN THE LANE: "TikTok How to Use Video Background Music for a Video in TikTok?", KAFAN NETWORK, 25 December 2018 (2018-12-25) *
NAO DONG DA KAI: "How to Shoot with the Same Special Effects in TikTok?", CN, XP009534750, Retrieved from the Internet <URL:https://jingyan.baidu.com/article/335530dad3f6ec19ca41c36d.html> *

Also Published As

Publication number Publication date
CN112040339A (en) 2020-12-04

Similar Documents

Publication Publication Date Title
WO2022042157A1 (en) Method and apparatus for manufacturing video data, and computer device and storage medium
JP6967059B2 (en) Methods, devices, servers, computer-readable storage media and computer programs for producing video
US9332315B2 (en) Timestamped commentary system for video content
US10277696B2 (en) Method and system for processing data used by creative users to create media content
US8296797B2 (en) Intelligent video summaries in information access
KR101944469B1 (en) Estimating and displaying social interest in time-based media
JP6930041B1 (en) Predicting potentially relevant topics based on searched / created digital media files
WO2022052749A1 (en) Message processing method, apparatus and device, and storage medium
CN112948708B (en) Short video recommendation method
CN109165302A (en) Multimedia file recommendation method and device
CN111078939A (en) Method, system and recording medium for extracting and providing highlight image in video content
CN111368141B (en) Video tag expansion method, device, computer equipment and storage medium
JP7240505B2 (en) Voice packet recommendation method, device, electronic device and program
US20210117471A1 (en) Method and system for automatically generating a video from an online product representation
RU2714594C1 (en) Method and system for determining parameter relevance for content items
WO2022134689A1 (en) Multimedia resource display method and device
CN112765373A (en) Resource recommendation method and device, electronic equipment and storage medium
CN110413894A (en) The training method of commending contents model, the method for commending contents and relevant apparatus
WO2024021687A1 (en) Search result reordering method and apparatus, device, storage medium, and program product
CN100397401C (en) Method for multiple resources pools integral parallel search in open websites
Su et al. Classification and interaction of new media instant music video based on deep learning under the background of artificial intelligence
US20240121485A1 (en) Method, apparatus, device, medium and program product for obtaining text material
CN114817692A (en) Method, device and equipment for determining recommended object and computer storage medium
CN116980665A (en) Video processing method, device, computer equipment, medium and product
Jin et al. Personalized micro-video recommendation based on multi-modal features and user interest evolution

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859997

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21859997

Country of ref document: EP

Kind code of ref document: A1