WO2022042157A1 - Method and apparatus for producing video data, computer device and storage medium - Google Patents

Method and apparatus for producing video data, computer device and storage medium

Info

Publication number
WO2022042157A1
WO2022042157A1 · PCT/CN2021/108174 · CN2021108174W
Authority
WO
WIPO (PCT)
Prior art keywords
video data
summary information
video
audiovisual
client
Prior art date
Application number
PCT/CN2021/108174
Other languages
English (en)
Chinese (zh)
Inventor
裴得利
Original Assignee
百果园技术(新加坡)有限公司
裴得利
Priority date
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司, 裴得利
Publication of WO2022042157A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/85 Assembly of content; Generation of multimedia applications
    • H04N 21/854 Content authoring
    • H04N 21/8549 Creating video summaries, e.g. movie trailer

Definitions

  • the present application relates to the technical field of multimedia, for example, to a method, apparatus, computer equipment and storage medium for producing video data.
  • the elements added by users to video data are mostly templates provided by the platform.
  • because the platform provides few templates while many users produce video data, video data produced with these templates is noticeably homogeneous. Therefore, many users manually collect elements to personalize them, and thereby personalize their video data, for example, by downloading data from the Internet as elements, parsing elements out of other video data, and so on.
  • the present application proposes a method, device, computer equipment and storage medium for producing video data, so as to solve the problem of how to reduce the cost of producing video data while keeping the video data personalized.
  • the present application provides a method for producing video data, including:
  • displaying element summary information when first video data is played, wherein the element summary information is used to represent audiovisual elements included in the first video data;
  • receiving a first operation acting on the element summary information;
  • in response to the first operation, jointly displaying video summary information and a production control of second video data, wherein the second video data includes the audiovisual element;
  • receiving a second operation acting on the production control; and
  • in response to the second operation, collecting third video data, and adding the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a method for producing video data, including:
  • sending first video data to a client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information is used to represent the audiovisual elements contained in the first video data;
  • in a case of receiving a request triggered by the client based on the element summary information, searching for second video data containing the audiovisual element, and sending video summary information of the second video data to the client; and
  • in a case of receiving a request triggered by the client based on a production control, sending the audiovisual element corresponding to the element summary information to the client, wherein the client is further configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a device for producing video data, including:
  • a display screen configured to display element summary information when first video data is played, wherein the element summary information is used to represent audiovisual elements included in the first video data;
  • a touch screen configured to receive a first operation acting on the element summary information;
  • the display screen is further configured to jointly display video summary information and a production control of second video data in response to the first operation, wherein the second video data includes the audiovisual element;
  • the touch screen is further configured to receive a second operation acting on the production control;
  • a camera configured to collect third video data in response to the second operation; and
  • a processor configured to add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a device for producing video data, including:
  • the first video data sending module is configured to send the first video data to the client, wherein the client is configured to display element summary information when the first video data is played, and the element summary information is used to represent the audiovisual elements contained in the first video data;
  • a second video data search module configured to search for the second video data containing the audiovisual element in the case of receiving a request triggered by the client based on the element summary information
  • a video summary information sending module configured to send the video summary information of the second video data to the client, wherein the client is also set to jointly display the video summary information and the production control;
  • the audiovisual element sending module is configured to send the audiovisual element corresponding to the element summary information to the client when receiving the request triggered by the client based on the production control, wherein the client is further configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the present application also provides a computer device, the computer device comprising:
  • one or more processors; and
  • a memory configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned method for producing video data.
  • the present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the above-mentioned method for producing video data is implemented.
  • FIG. 1 is a flowchart of a method for producing video data according to Embodiment 1 of the present application.
  • FIG. 2A is an exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2B is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2C is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2D is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2E is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 2F is another exemplary diagram of producing video data according to Embodiment 1 of the present application.
  • FIG. 3 is a flowchart of a method for producing video data according to Embodiment 2 of the present application.
  • FIG. 5 is a schematic structural diagram of a multi-task learning model provided in Embodiment 3 of the present application.
  • FIG. 6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application.
  • FIG. 7 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 5 of the present application.
  • FIG. 8 is a schematic structural diagram of an apparatus for producing video data according to Embodiment 6 of the present application.
  • FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application.
  • video platforms need to maintain a healthy ecology of consumption and production. In terms of consumption, video platforms strive to push video data in line with users' interests and preferences, to obtain higher consumption time and satisfaction; in terms of production, video platforms should encourage users to shoot more video data and upload it to the platform for release, enriching the platform's content. Richer content in turn makes it easier for users to obtain video data matching their interests and preferences, forming a virtuous circle.
  • the video recommendation algorithm is mainly aimed at the consumption mechanism.
  • these implicit feedbacks are used to construct the positive and negative samples of the training data; the training data is used to train a ranking model, the ranking model is used to calculate the user's score on the video data, and the video data that best matches the user's interests and preferences is then selected and pushed to the user.
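  • As an illustration of this mechanism (a minimal sketch under assumed field names such as watch_ratio and liked; the patent does not fix a schema or model), implicit feedback can be turned into labeled samples, and a trivial linear scorer can stand in for the ranking model:

```python
# Minimal, self-contained sketch (not the patent's implementation) of
# ranking-based push: implicit feedback -> labeled samples -> linear scorer
# -> top-n selection. All field names are illustrative assumptions.

def build_samples(watch_logs):
    """Positives: watched to the end or liked; negatives: everything else."""
    samples = []
    for log in watch_logs:
        features = [log["watch_ratio"], float(log["liked"])]
        label = 1 if (log["watch_ratio"] > 0.9 or log["liked"]) else 0
        samples.append((features, label))
    return samples

def score(weights, features):
    """Linear ranking score, standing in for a trained ranking model."""
    return sum(w * f for w, f in zip(weights, features))

def push_top_videos(weights, candidate_features, n=10):
    """Select the n candidate videos with the highest scores for this user."""
    ranked = sorted(candidate_features.items(),
                    key=lambda kv: score(weights, kv[1]), reverse=True)
    return [video_id for video_id, _ in ranked[:n]]
```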
  • the optimization goal of the ranking model has also developed from a single objective of playback duration (or completion rate) to a multi-task learning model that covers both consumption indicators, such as duration, and satisfaction indicators, such as likes, comments and forwarding.
  • the user can reuse the audio-visual elements of interest more concisely when consuming video data, and quickly produce video data.
  • the multi-task learning model is used to introduce a goal of conversion from consumption to production, to predict which video data is more likely to arouse users' interest in producing new video data.
  • in other words, a factor is added, namely whether the current video data will arouse the user's interest in production.
  • this type of video data is pushed preferentially to increase users' willingness to produce, which can guide more users from consumption alone to consumption plus production, improving the conversion ratio from consumption to production and thereby enriching the ecological closed loop of video platform content.
  • the video data production device can be implemented by software and/or hardware, and can be configured in computer equipment, for example, mobile terminals (such as mobile phones, tablet computers, etc.), smart wearable devices (such as smart watches, smart glasses, etc.), personal computers, etc., and the method includes the following steps:
  • Step 101 When playing the first video data, display element summary information.
  • the operating system of the computer device may include Android, Apple's mobile operating system iOS, Windows, etc.; these operating systems support running a client for playing and producing video data, for example, short video applications, instant messaging tools, online video applications, and so on.
  • the client may request the server to play video data in the form of a Uniform Resource Locator (URL), etc.; for convenience of distinction, the video data is referred to as first video data in this embodiment. After the server receives the request, it searches for the first video data in a personalized or non-personalized manner and sends part or all of the first video data to the client.
  • the first video data is video data that has been produced offline, and the form may include short videos, micro-movies, performance programs, and so on. This embodiment does not limit the form of the first video data.
  • the so-called personalization can refer to screening video data adapted to the user currently logged in on the client (represented by an identifier (ID), etc.), that is, screening based on a multi-objective optimization algorithm, a collaborative filtering algorithm, etc.; the screened video data is regarded as the first video data. For the matching method, refer to Embodiment 3; this embodiment does not describe the matching method in detail.
  • non-personalization means that the screening of video data does not depend on the user currently logged in on the client (indicated by ID, etc.); video data can be screened based on non-personalized factors such as video quality (integrating definition, playback volume, likes, comments, etc.) and popularity, and the screened video data is used as the first video data.
  • in addition, the server can query one or more pieces of element summary information marked in the first video data, where the element summary information represents the audiovisual elements contained in the first video data, and send the element summary information of the one or more audiovisual elements to the client.
  • the so-called audiovisual elements can include visual elements (that is, elements that users can see) and audible elements (that is, elements that users can hear). The form of the audiovisual elements can be set according to actual conditions, such as audio data, video data, beauty special effects, filters, etc.; the form of the audiovisual elements is not limited in this embodiment.
  • the element summary information may include text data, image data, etc., for example, the name of the audio data, the cover of the audio data, the name of the video data, the cover of the video data, the size, the author, the publisher, the number of users, and so on.
  • the element summary information may also carry other information, for example, the ID of the audiovisual element, etc., which is not limited in this embodiment.
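  • For illustration only (the patent does not fix the exact fields), a piece of element summary information might be carried as a structure like the following, covering the items mentioned above:

```python
# Illustrative payload shape for element summary information; every field
# name and value here is a hypothetical example, not a specified format.
element_summary = {
    "element_id": "audio_93721",    # ID of the underlying audiovisual element
    "element_type": "audible",      # "audible" or "visual"
    "name": "Quiet Night",          # name of the audio data
    "cover_url": "https://example.com/covers/quiet_night.jpg",
    "publisher": "Little Red",
    "user_count": 2270,             # number of users who produced with it
}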
  • the client can call the video player provided by the operating system to play the first video data, that is, the client generates a first user interface and displays it in the first user interface The picture of the first video data, and the speaker is driven to play the audio of the first video data.
  • the first user interface has a play area, and the play area is used to display a picture of the first video data.
  • the play area is a partial area of the first user interface.
  • the other information of the video data can be displayed in the area outside the play area.
  • in another example, the play area is the entire area of the first user interface; at this time, other information for the first video data can be displayed in floating form above this display area.
  • other information for the first video data may include controls for expressing positive emotions (such as "like"), comment information, controls for sharing, fields for inputting comment information, and the like.
  • components such as VideoView, MediaPlayer, SurfaceView, Vitamio, and JCPlayer can be called to play the first video data.
  • in the first user interface, the element summary information can be displayed in a first data structure, either in a static manner (such as showing the name of the audiovisual element as text) or in a dynamic manner (such as rotating the cover of the audiovisual element).
  • the one or more pieces of element summary information can be displayed in floating form above the play area, so that the element summary information is displayed over the picture of the first video data.
  • as shown in FIG. 2A, the picture of the first video data is displayed in the entire area, and the first video data has an audible element, namely the song "Quiet Night".
  • the lower left corner of the first user interface 210 displays the title of the song and the publisher, "Quiet Night - Little Red" (element summary information 211), and the cover of the song (element summary information 212) is displayed in the lower right corner of the first user interface 210.
  • in the first user interface 240, a picture of the first video data is displayed in a partial area, and the first video data has an audible element and a visual element, namely the song "Exciting" and a just-performed video of Xiaogang walking a tightrope.
  • the lower left corner displays the name of the song and the publisher, "Exciting - Xiao Ming" (element summary information), an introductory line for the video guiding the user to use it to produce new video data, and the number of users of the video, "Duet with Xiaogang (2.27K)" (element summary information 241).
  • the cover of the song is displayed in the lower right corner of the first user interface 240.
  • Step 102 Receive a first operation acting on the element summary information.
  • when browsing the first video data, the user can, through the human-computer interaction tool provided by the computer device, trigger the first operation on the element summary information corresponding to an audiovisual element, so as to select the audiovisual element represented by the element summary information.
  • different computer devices provide different human-computer interaction tools, and correspondingly, the ways of triggering the first operation through the human-computer interaction tools also differ; this embodiment does not limit the manner in which the tool triggers the first operation.
  • if the human-computer interaction tool provided by the computer equipment is a touch screen, then upon detecting a touch operation (such as a click operation, long-press operation, re-press operation, etc.) acting on a piece of element summary information, it is determined that a first operation acting on the element summary information is received.
  • if the human-computer interaction tool provided by the computer device is an external device, then upon receiving a key event (such as a single-click event, double-click event, long-press event, etc.) that occurred on a piece of element summary information and was sent by the external device, it is determined that a first operation acting on the element summary information is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 103 In response to the first operation, jointly display video summary information and production controls of the second video data.
  • the server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in a local database of the server.
  • in one example, a specific visual element can be detected in the picture of the video data by calling an object detection model that takes the specific visual element as a target; if the specific visual element is detected, the video data is marked with the specific visual element.
  • target detection models include first-order (one-stage) target detection models and second-order (two-stage) target detection models.
  • a target detection model that first generates a series of candidate boxes as samples and then classifies the samples through a convolutional neural network is called a second-order target detection model, for example, Region-CNN (R-CNN), Spatial Pyramid Pooling Network (SPP-Net), Fast R-CNN, Faster R-CNN, etc.
  • a target detection model that does not generate candidate boxes but directly converts the problem of target box positioning into a regression problem is called a first-order target detection model, for example, Generalized Congruence Neural Network (GCNN), You Only Look Once (YOLO), Single Shot MultiBox Detector (SSD), etc.
  • in another example, the audio contained in the video data can be extracted and its features computed; if the features of the audio are the same as or similar to the features of a specific audible element, the video data is marked with that audible element.
  • in yet another example, if a user customizes an audiovisual element when producing video data, the custom element is compared with the original elements: if the custom element is the same as or similar to an original element, the video data is marked with the original element; if the custom element is different from and not similar to the original elements, the custom element is assigned a new identifier (such as a new ID), and the video data is marked with the custom element (represented by the new identifier).
  • in this way, when new video data is subsequently produced using the custom element, that element can be marked for the new video data, as in the sketch below.
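  • A rough sketch of this marking flow, under stated assumptions: detect_visual_elements (an SSD/YOLO-style detector wrapped to return element IDs) and audio_features are hypothetical callables supplied by the caller, not APIs from the patent:

```python
# Hypothetical marking flow: visual elements via an object detector, audible
# elements via feature similarity ("same as or similar to", hence a threshold).

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def mark_video(video, known_elements, detect_visual_elements, audio_features):
    """Attach to the video the IDs of the audiovisual elements it contains."""
    marks = set(detect_visual_elements(video["frames"]))  # visual elements
    feats = audio_features(video["audio"])                # audio fingerprint
    for element in known_elements:
        if element["type"] != "audible":
            continue
        # Accept near matches, not only exact ones.
        if cosine_similarity(feats, element["fingerprint"]) > 0.9:
            marks.add(element["id"])
    video["element_ids"] = sorted(marks)
    return video
```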
  • when receiving the first operation, the client can send the server a request carrying the identifier (such as the ID) of the audiovisual element, requesting the server to search for video data containing the audiovisual element (represented by ID, etc.).
  • when the server receives the request, it parses the identifier of the audiovisual element from the request, uses the identifier as a search condition in the server's local database, searches for the video data marked with the identifier, and writes the video data into a video collection.
  • for convenience of distinction, the video data is referred to as second video data, and the video collection is referred to as the first video set, in this embodiment.
  • the server extracts the video summary information (such as cover, name, producer, etc.) of the second video data in the first video set from the local database, and sends the video summary information to the client.
  • the so-called mark means that the video data contains the audiovisual element corresponding to the mark; that is, the plurality of second video data all contain the same audiovisual element.
  • the second video data may be sorted according to a preset sorting method, and each time the top n (n is a positive integer) second video data of the sorting are selected and sent to the client.
  • the sorting method may include non-personalized sorting methods such as descending sorting according to video quality, descending sorting according to video popularity, etc., so as to reduce the processing complexity and improve the processing speed.
  • a personalized sorting manner such as collaborative filtering may also be used, which is not limited in this embodiment.
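  • A server-side sketch of this recall, sort and page flow; the in-memory "database" and field names are illustrative assumptions for brevity:

```python
# Find videos marked with the requested element ID, sort non-personally by
# quality then popularity, and return one page of top-n results so the
# client can keep requesting more.

def search_second_videos(database, element_id, page=0, n=20):
    """Return the n second videos marked with element_id for the given page."""
    first_video_set = [v for v in database if element_id in v["element_ids"]]
    first_video_set.sort(key=lambda v: (v["quality"], v["popularity"]),
                         reverse=True)
    return first_video_set[page * n:(page + 1) * n]
```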
  • video summary information of the second video data sent by the server is received and cached locally on the client.
  • a second user interface is generated.
  • if the element summary information includes element image data (such as the cover of audio data or the thumbnail of video data), the image data can be displayed as the background, that is, the image data in the element summary information is set as the background; when it is set, the image data can be blurred.
  • one or more information areas may be displayed in a waterfall layout or the like below the element summary information, where the size of an information area matches the type of the audiovisual element, that is, the size of the information area is set according to the type of the audiovisual element.
  • if the type of the audiovisual element is a visual element, an area whose size is a first value can be set as the first area and displayed as an information area; if the type of the audiovisual element is an audible element, an area whose size is a second value can be set as the second area and displayed as an information area. The first value is greater than the second value, that is, the first area is larger than the second area: the display area is enlarged for visual elements, so that visual elements retain more detail when displayed and users can browse them more clearly.
  • correspondingly, when the second video data recalled for a visual element forms a first video set, the video summary information of the second video data is displayed in the first area; when the second video data recalled for an audible element forms another first video set, the video summary information of the second video data is displayed in the second area.
  • the video summary information of the second video data in the first video set is sequentially loaded into the plurality of information areas, so that the video summary information of the second video data is displayed in the information area.
  • the production control is displayed over the information areas in floating form; the user can trigger operations such as sliding and page turning on the element summary information corresponding to the audiovisual element through the human-computer interaction tool provided by the computer equipment, so that the second user interface switches the video summary information of the second video data that it displays.
  • during such switching, the position of the video summary information of the second video data changes, but the production control keeps its position; it does not move with operations such as sliding and page turning.
  • if there are too many pieces of second video data in the first video set to display at one time, the client can continue to request the other second video data in the first video set from the server and display them in the second user interface, until all the second video data in the first video set have been requested.
  • as shown in FIG. 2A, the user triggers a click operation (the first operation) on the title of the song and the publisher, "Quiet Night - Little Red" (element summary information 211), displayed in the lower left corner of the first user interface 210, or on the cover of the song (element summary information 212) displayed in its lower right corner; the client then displays the second user interface 220 shown in FIG. 2B.
  • in the second user interface 220, the cover of the song is blurred and set as the background, on which the cover of the song, the name of the song, the publisher of the song, and the number of users of the song are displayed together, and nine smaller information areas are displayed; the video summary information of the second video data corresponding to the song is loaded into these areas in sequence.
  • the second video data ranked third (i.e., "No.3"), in addition to the song, also contains audiovisual elements of other video data; when the user selects the song, the second video data ranked third uses a smaller information area to display its video summary information.
  • similarly, for a visual element, the cover of the video is blurred and set as the background, on which the cover, the producer, and other information of the video are displayed together.
  • Step 104 Receive a second operation acting on the production control.
  • when browsing the video summary information of the second video data, the user can trigger the second operation on the production control through the human-computer interaction tool provided by the computer device, so as to produce new video data containing the audiovisual element.
  • different computer devices provide different human-computer interaction tools, and correspondingly, the ways of triggering the second operation through the human-computer interaction tools also differ; this embodiment does not limit the manner in which the tool triggers the second operation.
  • if the human-computer interaction tool provided by the computer equipment is a touch screen, then upon detecting a touch operation (such as a click operation, long-press operation, re-press operation, etc.) acting on the production control, it is determined that a second operation acting on the production control is received.
  • if the human-computer interaction tool provided by the computer device is an external device, then upon receiving a key event (such as a single-click event, double-click event, long-press event, etc.) that occurred on the production control and was sent by the external device, it is determined that a second operation acting on the production control is received.
  • the external device includes but is not limited to a mouse, a remote control, and the like.
  • Step 105 In response to the second operation, collect third video data, and add audiovisual elements corresponding to the element summary information to the third video data.
  • the client, in response to the user triggering the second operation on the production control, sends the server a request for downloading the audiovisual element (represented by ID, etc.); after receiving the request, the server searches for the independent audiovisual element (represented by an ID, etc.) and sends the audiovisual element to the client.
  • the so-called independent can mean that the audiovisual element is an independent file and does not depend on the first video data and the second video data.
  • the format of the audiovisual element (such as resolution, sampling rate, size, etc.) conforms to the production specification, and the client can directly use the audiovisual element to produce new video data.
  • the client can generate a third user interface, generate a control for producing video data in the third user interface, call the camera of the computer device, and preview video data in the third user interface; upon receiving a confirmation operation, video data is collected. For convenience of distinction, the video data is referred to as third video data in this embodiment.
  • the audiovisual elements corresponding to the element summary information are added to the third video data as the produced material.
  • during collection, the third video data is kept synchronized with the audiovisual element on the time axis.
  • when collection of the third video data starts, the audiovisual element starts to be played, so that the user can preview the effect of adding the audiovisual element.
  • when the audiovisual element finishes playing, the collection of the third video data may be stopped, or may be continued; this is not limited in this embodiment.
  • in one example, the audiovisual element includes audio data; in this example, the audio data starts to be played at the same time the third video data is collected, so that the audio data corresponding to the element summary information is set as the background music of the third video data.
  • in another example, the audiovisual element includes video data among its visual elements; for convenience of distinction, that video data may be referred to as fourth video data in this embodiment.
  • the fourth video data is played at the same time the third video data is collected.
  • the third video data may be displayed on the left and the fourth video data on the right, or the third video data on the right and the fourth video data on the left, or the third and fourth video data may be displayed in picture-in-picture form, so that the fourth video data and the third video data corresponding to the element summary information are synthesized in a split-screen manner, as sketched below.
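  • One possible way to realize such split-screen synthesis (a sketch, not the patent's implementation) is ffmpeg's hstack filter; this assumes ffmpeg is installed, and the file paths are illustrative:

```python
# Compose the collected third video (left) and the reused fourth video
# (right) side by side with ffmpeg.
import subprocess

def compose_split_screen(third_video, fourth_video, output):
    """Place third_video on the left and fourth_video on the right."""
    subprocess.run([
        "ffmpeg", "-y",
        "-i", third_video,           # collected third video data
        "-i", fourth_video,          # fourth video data (the reused element)
        # note: hstack needs equal frame heights; scale inputs first if not
        "-filter_complex", "[0:v][1:v]hstack=inputs=2[v]",
        "-map", "[v]",               # the stacked picture
        "-map", "0:a?",              # keep the third video's audio if present
        output,
    ], check=True)

compose_split_screen("third.mp4", "fourth.mp4", "duet.mp4")
```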
  • as shown in FIG. 2E, if the user triggers a click operation (the second operation) on the "Join" production control 251 displayed below the second user interface 250, then, as shown in FIG. 2F, the camera is called in the third user interface 260 for previewing; upon receiving the confirmation operation triggered on the circular control, the third video data is collected and displayed on the left, and the video of Xiaogang walking a tightrope is played on the right, so that the video of Xiaogang walking a tightrope is merged with the third video data.
  • after the production is completed, the client can send the third video data to the server; the server receives the third video data sent by the client and marks the third video data with the audiovisual element it contains (represented by ID, etc.). When the marking is completed, the server publishes the third video data; the client thus publishes the third video data marked with the audiovisual element, and other clients can download the third video data from the server for playback and browsing.
  • in this embodiment, when the first video data is played, the element summary information is displayed, where the element summary information represents the audiovisual elements contained in the first video data; the first operation acting on the element summary information is received; in response to the first operation, the video summary information and production control of the second video data are jointly displayed, where the second video data contains the audiovisual element; the second operation acting on the production control is received; and in response to the second operation, the third video data is collected and the audiovisual element corresponding to the element summary information is added to the third video data.
  • users can use the audiovisual elements contained in the existing video data to create new video data.
  • the audiovisual elements do not depend on the template of the system, and the channels are diversified, which can maintain the individualization of the audiovisual elements.
  • on the other hand, the audio-visual elements provided in this way conform to production specifications in format and can be directly used to produce new video data, sparing users from revising elements with professional applications, which greatly lowers the technical threshold and reduces time consumption, thereby reducing the cost of producing video data.
  • Step 301 When playing the first video data, display element summary information.
  • the element summary information indicates audiovisual elements contained in the first video data.
  • Step 302 Receive a first operation acting on the element summary information.
  • Step 303 In response to the first operation, jointly display video summary information and production controls of the second video data.
  • the second video data contains the audiovisual element.
  • Step 304 Receive a third operation acting on the video summary information.
  • when browsing the video summary information of the second video data, the user can trigger the third operation on the corresponding video summary information through the human-computer interaction tool provided by the computer device, thereby selecting the second video data to which the video summary information belongs.
  • the provided human-computer interaction tools are different, and correspondingly, the ways of triggering the third operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the third action is not limited.
  • if the human-computer interaction tool provided by the computer equipment is a touch screen, then upon detecting a touch operation (such as a click operation, long-press operation, re-press operation, etc.) acting on a piece of video summary information, it is determined that a third operation acting on the video summary information is received.
  • if the human-computer interaction tool provided by the computer device is an external device, then upon receiving a key event (such as a single-click event, double-click event, long-press event, etc.) that occurred on a piece of video summary information and was sent by the external device, it is determined that a third operation acting on the video summary information is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 305 In response to the third operation, play the second video data to which the video summary information belongs.
  • the client may request the server to play the second video data in the form of a URL (carrying the identifier of the second video data, such as its ID); after the server receives the request, part or all of the second video data can be sent to the client.
  • after buffering part or all of the second video data, the client can call the video player provided by the operating system to play the second video data, that is, the client generates the first user interface, displays the picture of the second video data in the first user interface, and drives the speaker to play the audio of the second video data.
  • in this way, the second video data that the user may like can be pushed in a concentrated manner, which reduces the user's operations of searching for similar video data through keywords, page turning, etc., and reduces the occupation of resources, such as processor resources, memory resources, and bandwidth resources, consumed by searching for similar video data.
  • Step 306 Receive a fourth operation acting on the first video data.
  • when browsing the first video data, the user can trigger the fourth operation on the first video data through the human-computer interaction tool provided by the computer device, so that the first user interface switches to playing other first video data.
  • the human-computer interaction tools provided by them are different, and accordingly, the ways of triggering the fourth operation through the human-computer interaction tools are also different.
  • the manner in which the tool triggers the fourth operation is not limited.
  • if the human-computer interaction tool provided by the computer equipment is a touch screen, then when the touch screen detects a touch operation (such as a sliding operation) occurring in the blank area of the first user interface (the area other than controls, element summary information and other operable data), it is determined that a fourth operation acting on the first video data is received.
  • if the human-computer interaction tool provided by the computer device is an external device, then upon receiving a key event (such as a drag event) sent by the external device and occurring in the blank area of the first user interface (the area other than controls, element summary information and other operable data), it is determined that a fourth operation acting on the first video data is received.
  • the external device includes, but is not limited to, a mouse, a remote control, and the like.
  • Step 307 In response to the fourth operation, play other first video data adapted to the current user, or other first video data including other audiovisual elements.
  • if the currently playing first video data is personalized push video data, the client, in response to the user's fourth operation of switching the first video data, can request the server, in the form of a URL, etc., to play other first video data adapted to the current user; after receiving the request, the server can send part or all of the other first video data adapted to the current user to the client.
  • after buffering part or all of the other first video data adapted to the current user, the client can call the video player provided by the operating system to play it, that is, in the first user interface, the picture switches to display the other first video data adapted to the current user, and the speaker switches to play its audio.
  • if the currently playing first video data is non-personalized push video data and the current first video data contains other audiovisual elements, the other first video data is video data in the video set corresponding to those other audiovisual elements, that is, the video set of video data containing the other audiovisual elements; for convenience of distinction, this video set is referred to as the second video set in this embodiment.
  • at this time, the client may request the server to play other first video data in the second video set; after receiving the request, the server may send part or all of the other first video data in the second video set to the client.
  • after buffering part or all of the other first video data in the second video set, the client can call the video player provided by the operating system to play it, that is, in the first user interface, the picture switches to display the other first video data in the second video set, and the speaker switches to play its audio.
  • the user can trigger a return operation on the return control in the first user interface through a touch operation or the like; the client receives the return operation acting on the return control and, in response to the return operation, displays the second user interface, in which the video summary information of the first video data in the second video set is displayed.
  • in this embodiment, the types of the first video data are distinguished, and other first video data adapted to the user and other first video data in the second video set are pushed for personalized and non-personalized service scenarios respectively, which ensures the accuracy of first video data switching and meets the requirements of the business scenarios.
  • the video data production device can be implemented by software and/or hardware, and can be configured in computer equipment, such as a server, a workstation, etc., and includes the following steps:
  • Step 401 Send the first video data to the client.
  • the operating system of the computer device may include Unix, Linux, Windows Server, Netware, etc., in these operating systems, a server is supported, and the server is configured to provide video services to multiple clients, such as push Video data, publish video data, etc.
  • the server may determine the first video data in a personalized or non-personalized manner, and send part or all of the first video data to the client.
  • in addition, the server may send element summary information of one or more audiovisual elements to the client, where the element summary information represents the audiovisual elements that the first video data contains.
  • the client is configured to display the one or more element summary information on the first user interface when playing the first video data.
  • step 401 may include the following steps:
  • Step 4011 Acquire historical data recorded when the user browses the video data.
  • the user browses the video data on the client side, and the server side records the information during the browsing process in a log file and stores it in the database.
  • the server can query, in the log files of the database, the historical data recorded when the user browses video data, in order to screen video data suitable for the user.
  • Step 4012 Extract features from historical data as behavior features.
  • the behavioral characteristic may include at least one of the following:
  • characteristics of the user may be collected from the historical data as user features.
  • the user features include features inherent to the user, e.g., ID (i.e., User ID (UID)), gender, age, country, and the like.
  • the user features include user dynamic features, for example, viewing behaviors in a recent period of time, interaction behaviors in a recent period of time, preferences for multiple types of video data in a recent period of time, and so on.
  • features of video data may be collected from historical data as video features.
  • the video features include features inherent to the video data, such as ID (i.e., Video ID (VID)), length, tags, the UID of the photographer (the user who made the video data), and the like.
  • the video features include dynamic features of the video data, for example, the number of times pushed to users in a recent period of time, the number of times it was viewed in a recent period of time, the number of times it was liked in a recent period of time, and so on.
  • characteristics of the environment in which the user browses video data can be collected from the historical data as context features, for example, the time of the request to browse video data, the location of the request, the network status of the request, etc.
  • At least two of the user feature, the video feature, and the context feature may be combined to obtain a cross feature, thereby increasing the dimension of the feature.
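  • A small sketch of this feature assembly, with illustrative field names (the patent does not fix a schema), including cross features that combine at least two of the feature groups:

```python
# Assemble user, video and context features, plus cross features that
# combine two groups to raise the feature dimension. Names are illustrative.

def build_behavior_features(user, video, context):
    features = {
        # user features (inherent + dynamic)
        "uid": user["uid"], "gender": user["gender"], "age": user["age"],
        "recent_likes": user["recent_likes"],
        # video features (inherent + dynamic)
        "vid": video["vid"], "length": video["length"], "tag": video["tag"],
        "recent_views": video["recent_views"],
        # context features
        "hour": context["hour"], "network": context["network"],
    }
    # cross features: combinations of at least two of the above
    features["gender_x_tag"] = f'{user["gender"]}_{video["tag"]}'
    features["age_x_hour"] = f'{user["age"]}_{context["hour"]}'
    return features
```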
  • Step 4013 Use the behavior feature to predict multiple probabilities corresponding to the user performing multiple target behaviors on the video data.
  • in this embodiment, a multi-task learning model can be set, and the multi-task learning model is used to calculate the probability that the user performs each of multiple (two or more) target behaviors (such as clicking, playing duration, like, comment, share, collection, attention, etc.) on the video data. The probability may be written as p(u_i, v_j, t), where u_i represents the i-th user, v_j represents the j-th video data, and t is the current moment; the probability is abbreviated as p_{i,j}.
  • in this embodiment, a target behavior of conversion from consumption to production is added, that is, requesting other video data containing the same audiovisual element as the current video data in order to produce new video data; for this target behavior, refer to steps 101-105.
  • when training for this target behavior, video data in which the user triggered the audiovisual element can be set as positive samples, and the negative samples are video data that were viewed without triggering audiovisual elements.
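  • This labeling rule can be sketched as follows (field names such as triggered_element and finished_viewing are illustrative assumptions):

```python
# Build labeled samples for the consumption-to-production target: a view in
# which the user triggered the audiovisual element (steps 101-105) is a
# positive sample; a completed view with no trigger is a negative sample.

def label_conversion_samples(view_logs):
    samples = []
    for log in view_logs:
        if log["triggered_element"]:    # user tapped the element summary
            samples.append((log["uid"], log["vid"], 1))   # positive
        elif log["finished_viewing"]:   # viewed, but never triggered
            samples.append((log["uid"], log["vid"], 0))   # negative
    return samples
```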
  • the multi-task learning model can be a neural network, such as a deep neural network (DNN), or another machine learning model, such as a logistic regression (LR) model, a click-through rate (CTR) model, etc.; the type of the multi-task learning model is not limited in this embodiment.
  • the multi-task learning model can be trained based on multi-task learning.
  • multi-task learning is a learning method that derives from transfer learning: multiple goals (such as the target behaviors in this embodiment) are put together to learn from each other, and the information shared among related goals, together with the noise introduced by unrelated goals, can improve the generalization ability of the multi-task learning model to a certain extent.
  • Multi-task learning belongs to the category of transfer learning.
  • the main difference from transfer learning is that multi-task learning improves the model's effect through multiple goals (such as the target behaviors in this example), while typical transfer learning uses other goals to improve the learning effect on a single goal.
  • as shown in FIG. 5, a model based on parameter sharing can be used as the multi-task learning model.
  • the multi-task learning model receives the same input, the underlying network shares model parameters, and the multiple target behaviors (such as Task1, Task2, Task3, Task4, etc.) learn from each other, with gradients back-propagated simultaneously, which can improve the generalization ability of the multi-task learning model.
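  • In the spirit of FIG. 5, a minimal shared-bottom model might look as follows (an illustrative PyTorch sketch, not the patent's exact architecture; layer sizes and the number of tasks are assumptions):

```python
# One shared underlying network, one small head per target behavior;
# gradients from all tasks back-propagate through the shared parameters.
import torch
import torch.nn as nn

class SharedBottomModel(nn.Module):
    def __init__(self, input_dim, num_tasks=4, hidden=64):
        super().__init__()
        # Underlying network whose parameters are shared by every task.
        self.shared = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One head per target behavior (click, duration, like, produce, ...).
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, 1) for _ in range(num_tasks)]
        )

    def forward(self, x):
        h = self.shared(x)
        # One probability per target behavior for each (user, video) input.
        return [torch.sigmoid(head(h)) for head in self.heads]

model = SharedBottomModel(input_dim=32)
probs = model(torch.randn(8, 32))   # 8 samples -> 4 probability tensors
```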
  • Step 4014 Fuse the multiple probabilities into a quality value of the video data for the user.
  • the quality value of the video data for the user can be evaluated, and the quality value can be used to indicate the degree of the user's preference for the video data under the target dimension.
  • the quality value is positively correlated with the probability, that is, the higher the probability, the greater the quality value, and the lower the probability, the smaller the quality value.
  • in an implementation, the multiple probabilities can be fused into the quality value of the video data for the user by means of linear fusion, with a feature weight configured for each probability; the larger the feature weight, the more important the corresponding target behavior.
  • the product of each probability and its corresponding feature weight is calculated as a feature value, and the sum of all feature values is the quality value of the video data for the user, that is, q_{i,j} = Σ_l w_l · p_{i,j,l}, where w_l is the feature weight of the l-th target behavior and p_{i,j,l} is the corresponding probability.
  • Step 4015 If the quality value satisfies the preset recall condition, set the video data to which the quality value belongs as the first video data adapted to the user.
  • recall conditions can be preset, for example, the n (n is a positive integer) quality values with the highest numerical value, quality values greater than a threshold, the m% (m is a positive number) of quality values with the highest numerical value, etc.
  • if a quality value satisfies the recall condition, the video data to which the quality value belongs is set as first video data adapted to the user; at this time, the association between the user's identifier and the identifier of the first video data can be recorded.
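  • Steps 4014-4015 can be sketched as follows (a top-n recall condition and the weights are illustrative assumptions):

```python
# Linear fusion (q = sum_l w_l * p_l) followed by a top-n recall; the
# association between the user's ID and the recalled video IDs is recorded.

def quality_value(probs, weights):
    """A larger weight marks a more important target behavior."""
    return sum(w * p for w, p in zip(weights, probs))

def recall_top_n(uid, videos_with_probs, weights, n=100):
    """Keep the n videos with the highest quality values for this user."""
    scored = [(quality_value(p, weights), vid) for vid, p in videos_with_probs]
    scored.sort(reverse=True)
    return {"uid": uid, "first_video_ids": [vid for _, vid in scored[:n]]}
```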
  • Step 4016 Send the first video data to the client.
  • steps 4011 to 4015 can be performed offline.
  • when a user (represented by an ID, etc.) logs in to the client and the client requests the first video data, the user's ID can be used as the search condition to look up the identifiers of the first video data associated with the user's identifier; the first video data is found based on those identifiers and sent to the client.
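  • The online half of this offline/online split might then be as simple as a key-value lookup (sketch only; the storage layer shown here is a plain dict for brevity, where a real system would use a database or cache):

```python
# Offline recall results (user ID -> first-video IDs) are written to a
# key-value table; an online request just looks them up by the user's ID.

recall_table = {}   # uid -> list of first-video IDs, written offline

def on_client_request(uid, video_store, k=10):
    """Use the user's ID as the search condition and return first video data."""
    video_ids = recall_table.get(uid, [])[:k]
    return [video_store[vid] for vid in video_ids if vid in video_store]
```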
  • Step 402 When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
  • when receiving the first operation acting on the element summary information, the client generates a request and sends it to the server, requesting the server to push the second video data containing the audiovisual element corresponding to the element summary information.
  • the server can collect video data in various ways, mark the video data with audiovisual elements (represented by ID, etc.), and store the video data in the local database of the server.
  • the server can search for the video data marked with the audiovisual element as the second video data, and write the second video data into the first video set.
  • Step 403 Send the video summary information of the second video data to the client.
  • the server extracts the video summary information (such as cover, name, producer, etc.) of the second video data in the first video set from the local database, and sends the video summary information to the client.
  • the client may be configured to jointly display the video summary information and production controls on the second user interface.
  • Step 404 When a request triggered by the client based on the production control is received, send the audiovisual element corresponding to the element summary information to the client.
  • when receiving the second operation acting on the production control, the client generates a request and sends it to the server, requesting the server to push the audiovisual element (represented by ID, etc.) corresponding to the element summary information.
  • after receiving the request, the server searches for the independent audiovisual element corresponding to the element summary information (represented by ID, etc.), and sends the audiovisual element to the client.
  • the client may be configured to collect third video data, and add the audiovisual element corresponding to the element summary information to the third video data.
  • the audiovisual element includes audio data
  • the audio data corresponding to the element summary information may be sent to the client, and the client may be configured to set the audio data corresponding to the element summary information as the background music of the third video data.
  • the audiovisual element includes fourth video data
  • the fourth video data corresponding to the element summary information may be sent to the client, and the client may be configured to synthesize, in a split-screen manner, the fourth video data corresponding to the element summary information and the third video data.
  • after production is completed, the client can upload the third video data; the server receives the third video data sent by the client and marks the third video data with the audiovisual element (represented by ID, etc.). When the marking is completed, the third video data is published, so that other clients can browse the third video data.
  • in this embodiment, the first video data is sent to the client, and the client is configured to display element summary information when the first video data is played, where the element summary information represents the audiovisual elements contained in the first video data.
  • in a case of receiving a request triggered by the client based on the element summary information, the server searches for the second video data containing the audiovisual element, and sends the video summary information of the second video data to the client.
  • the client is configured to jointly display the video summary information and the production control.
  • in a case of receiving a request triggered by the client based on the production control, the server sends the audiovisual element corresponding to the element summary information to the client; the client is configured to collect the third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • users can use the audiovisual elements contained in the existing video data to create new video data.
  • the audiovisual elements do not depend on the template of the system, and the channels are diversified, which can maintain the personalization of the audiovisual elements, thereby ensuring the individuality of the newly produced video data.
  • on the other hand, the audio-visual elements provided in this way conform to production specifications in format and can be directly used to produce new video data, avoiding the need for users to revise elements with professional applications, which greatly lowers the technical threshold and reduces time consumption, thereby reducing the cost of producing video data.
  • FIG. 6 is a flowchart of a method for producing video data according to Embodiment 4 of the present application. Based on the foregoing embodiments, this embodiment adds operations of switching the first video data and playing the second video data.
  • the method includes the following steps:
  • Step 601 Send the first video data to the client.
  • the client is configured to display element summary information when the first video data is played, where the element summary information indicates audiovisual elements included in the first video data.
  • Step 602 When a request triggered by the client based on the element summary information is received, search for second video data containing audiovisual elements.
  • Step 603 Send the video summary information of the second video data to the client.
  • the client is set to jointly display the video summary information and the production controls.
  • Step 604 When a request triggered by the client based on the video summary information is received, send the second video data to which the video summary information belongs to the client for playback.
  • when receiving the third operation acting on the video summary information, the client generates a request and sends it to the server, requesting the server to push the second video data corresponding to the video summary information.
  • the server can search for the second video data corresponding to the video summary information, and send part or all of the second video data to the client.
  • after buffering part or all of the second video data, the client can call the video player provided by the operating system to play the second video data.
  • in this way, the second video data that the user may like can be pushed in a concentrated manner, which reduces the user's operations of searching for similar video data through keywords, page turning, etc., and reduces the occupation of resources, such as processor resources, memory resources, and bandwidth resources, consumed by searching for similar video data.
  • Step 605 When receiving a request triggered by the client based on the first video data, send other first video data adapted to the user to the client for playback, or send other first video data containing other audiovisual elements to the client for playback.
  • when receiving the fourth operation acting on the first video data, the client generates a request and sends the request to the server, requesting the server to push other first video data.
  • when receiving the request, the server identifies the type of the first video data, so as to distinguish and push different first video data.
  • the server can send to the client part or all of the data of other first video data adapted to the current user.
  • after buffering part or all of the other first video data adapted to the current user, the client can call the video player provided by the operating system to play the other first video data adapted to the current user.
  • alternatively, the server can send to the client part or all of the data of other first video data in the second video set.
  • after buffering part or all of the other first video data in the second video set, the client can call the video player provided by the operating system to play the other first video data in the second video set.
  • in this way, the types of the first video data are distinguished, and other first video data adapted to the user and other first video data in the second video set are pushed respectively for personalized and non-personalized service scenarios, which ensures that the accuracy of switching the first video data meets the requirements of the business scenarios. A dispatch sketch follows.
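  • step 605's type-based dispatch can be sketched as follows; the type tags and lookup tables are assumptions standing in for the server's real recommendation and indexing logic.

```python
# Sketch of step 605: distinguish how the current first video data was
# reached and push the matching set; tags and tables are illustrative.
VIDEO_TYPE = {"v1": "feed", "v9": "second_set"}   # video id -> how it was reached
USER_FEED = {"alice": ["v2", "v3"]}               # first video data adapted to the user
SECOND_SET = {"v9": ["v10", "v11"]}               # other first video data sharing the elements

def switch_first_video(user_id: str, video_id: str) -> list[str]:
    """Return the next first video data to push: same-element videos when the
    current one came from the second video set, personalized recommendations
    otherwise."""
    if VIDEO_TYPE.get(video_id) == "second_set":
        return SECOND_SET.get(video_id, [])   # non-personalized scenario
    return USER_FEED.get(user_id, [])         # personalized scenario
```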
  • FIG. 7 is a structural block diagram of an apparatus for producing video data according to Embodiment 5 of the present application, which may include the following modules:
  • the display screen 701 is configured to display element summary information when the first video data is played, where the element summary information represents the audiovisual elements contained in the first video data; the touch screen 702 is configured to receive the first operation acting on the element summary information; the display screen 701 is further configured to, in response to the first operation, jointly display the video summary information of the second video data and a production control, where the second video data includes the audiovisual elements; the touch screen 702 is further configured to receive a second operation acting on the production control; the camera 703 is configured to collect third video data in response to the second operation; and the processor 704 is configured to add the audiovisual elements corresponding to the element summary information to the third video data. A sketch of how these modules cooperate follows.
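  • a minimal sketch of how the four modules of FIG. 7 cooperate; all hardware interactions are stand-ins (prints and byte strings), not a real display, touch or camera API.

```python
# Illustrative wiring of modules 701-704; hardware calls are stand-ins.
class VideoProductionDevice:
    def play_first_video(self, element_summary: str) -> None:
        # display screen 701: show element summary info during playback
        print("701 display:", element_summary)

    def on_first_operation(self, video_summaries: list[str]) -> None:
        # touch screen 702 received the first operation; display screen 701
        # jointly shows the video summaries and a production control
        print("701 display:", video_summaries, "+ [Production] control")

    def on_second_operation(self, element: bytes) -> bytes:
        # touch screen 702 received the second operation on the production control
        third_video = b"<captured frames>"  # camera 703 collects third video data
        return third_video + element        # processor 704 adds the audiovisual element
```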
  • the apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
  • FIG. 8 is a structural block diagram of an apparatus for producing video data according to Embodiment 6 of the present application, which may include the following modules:
  • the first video data sending module 801 is configured to send the first video data to the client, where the client is configured to display element summary information when the first video data is played, and the element summary information indicates the audiovisual elements contained in the first video data;
  • the second video data search module 802 is configured to search for second video data containing the audiovisual elements when receiving a request triggered by the client based on the element summary information;
  • the video summary information sending module 803 is configured to send the video summary information of the second video data to the client, where the client is configured to jointly display the video summary information and a production control;
  • the audiovisual element sending module 804 is configured to, when receiving a request triggered by the client based on the production control, send the audiovisual element corresponding to the element summary information to the client, where the client is configured to collect third video data and add the audiovisual element corresponding to the element summary information to the third video data.
  • the apparatus for producing video data provided by the embodiment of the present application can execute the method for producing video data provided by any embodiment of the present application, and has functional modules and effects corresponding to the execution method.
  • FIG. 9 is a schematic structural diagram of a computer device according to Embodiment 7 of the present application.
  • Figure 9 shows a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present application.
  • computer device 12 takes the form of a general-purpose computing device.
  • Components of computer device 12 may include, but are not limited to, one or more processors or processing units 16 , system memory 28 , and a bus 18 connecting various system components including system memory 28 and processing unit 16 .
  • System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32 .
  • Storage system 34 may be configured to read and write to non-removable, non-volatile magnetic media (not shown in Figure 9, commonly referred to as a "hard disk drive").
  • a program/utility 40 having a set (at least one) of program modules 42 may be stored in memory 28, for example.
  • Computer device 12 may also communicate with one or more external devices 14 (eg, keyboard, pointing device, display 24, etc.). Such communication may take place through an input/output (I/O) interface 22 . Also, computer device 12 may communicate with one or more networks (eg, Local Area Network (LAN), Wide Area Network (WAN), and/or public networks such as the Internet) through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18 .
  • the processing unit 16 executes a variety of functional applications and data processing by running the programs stored in the system memory 28, for example, implementing the video data production method provided by the embodiments of the present application.
  • the eighth embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the method for producing video data provided by the embodiments of the present application is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Disclosed are a method and apparatus for producing video data, a computer device and a storage medium. The method for producing video data comprises: when first video data is played, displaying element summary information, wherein the element summary information is used to represent an audiovisual element included in the first video data; receiving a first operation acting on the element summary information; in response to the first operation, jointly displaying video summary information of second video data and a production control, wherein the second video data includes the audiovisual element; receiving a second operation acting on the production control; and in response to the second operation, collecting third video data, and adding, to the third video data, the audiovisual element corresponding to the element summary information.
PCT/CN2021/108174 2020-08-31 2021-07-23 Method and apparatus for producing video data, computer device and storage medium WO2022042157A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010896513.0A 2020-08-31 2020-08-31 Method and apparatus for producing video data, computer device and storage medium
CN202010896513.0 2020-08-31

Publications (1)

Publication Number Publication Date
WO2022042157A1 true WO2022042157A1 (fr) 2022-03-03

Family

ID=73586447

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108174 WO2022042157A1 (fr) 2020-08-31 2021-07-23 Method and apparatus for producing video data, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112040339A (fr)
WO (1) WO2022042157A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112040339A (zh) * 2020-08-31 2020-12-04 Guangzhou Baiguoyuan Information Technology Co., Ltd. Method and apparatus for producing video data, computer device and storage medium
CN114268815A (zh) * 2021-12-15 2022-04-01 Beijing Dajia Internet Information Technology Co., Ltd. Video quality determination method and apparatus, electronic device and storage medium
CN115103219A (zh) * 2022-07-01 2022-09-23 Douyin Vision (Beijing) Co., Ltd. Audio publishing method and apparatus, and computer-readable storage medium
CN116916082B (zh) * 2023-09-12 2023-12-08 Huaguang Imaging Technology Co., Ltd. Film and television production interface switching system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109874053B (zh) * 2019-02-21 2021-10-22 Nanjing University of Aeronautics and Astronautics Short video recommendation method based on video content understanding and user dynamic interest

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004200811A (ja) * 2002-12-16 2004-07-15 Canon Inc Moving image capturing apparatus
CN106375782A (zh) * 2016-08-31 2017-02-01 Beijing Xiaomi Mobile Software Co., Ltd. Video playing method and apparatus
CN107959873A (zh) * 2017-11-02 2018-04-24 Shenzhen Tinno Wireless Technology Co., Ltd. Method, apparatus, terminal and storage medium for embedding background music in a video
CN108600825A (zh) * 2018-07-12 2018-09-28 Beijing Microlive Vision Technology Co., Ltd. Method, apparatus, terminal device and medium for selecting background music and shooting a video
CN108668164A (zh) * 2018-07-12 2018-10-16 Beijing Microlive Vision Technology Co., Ltd. Method, apparatus, terminal device and medium for selecting background music and shooting a video
CN108900768A (zh) * 2018-07-12 2018-11-27 Beijing Microlive Vision Technology Co., Ltd. Video shooting method, apparatus, terminal, server and storage medium
CN112040339A (zh) * 2020-08-31 2020-12-04 Guangzhou Baiguoyuan Information Technology Co., Ltd. Method and apparatus for producing video data, computer device and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DREAM IN SHENXIANG: "How to use other people's video background music to shoot video in Douyin?", CN, pages 1 - 5, XP009534600, Retrieved from the Internet <URL:https://www.kafan.cn/edu/20089681.html> *
DREAM IN THE LANE: "TikTok How to Use Video Background Music for a Video in TikTok?", KAFAN NETWORK, 25 December 2018 (2018-12-25) *
NAO DONG DA KAI: "How to Shoot with the Same Special Effects in TikTok?", CN, XP009534750, Retrieved from the Internet <URL:https://jingyan.baidu.com/article/335530dad3f6ec19ca41c36d.html> *

Also Published As

Publication number Publication date
CN112040339A (zh) 2020-12-04

Similar Documents

Publication Publication Date Title
WO2022042157A1 (fr) Method and apparatus for producing video data, computer device and storage medium
US9332315B2 (en) Timestamped commentary system for video content
US9372926B2 (en) Intelligent video summaries in information access
US10277696B2 (en) Method and system for processing data used by creative users to create media content
US20160188997A1 (en) Selecting a High Valence Representative Image
JP6930041B1 (ja) Prediction of potentially relevant topics based on searched/created digital media files
CN111078939A (zh) Method, system and recording medium for extracting and providing highlight images from video content
WO2022052749A1 (fr) Message processing method, apparatus and device, and storage medium
CN112948708B (zh) Short video recommendation method
CN109165302A (zh) Multimedia file recommendation method and apparatus
KR20180005277A (ko) Computer-implemented method, system and computer-readable medium
US20210117471A1 (en) Method and system for automatically generating a video from an online product representation
JP7240505B2 (ja) Voice packet recommendation method, apparatus, electronic device and program
RU2714594C1 (ru) Method and system for determining a relevance parameter for content items
CN112765373A (zh) Resource recommendation method and apparatus, electronic device and storage medium
CN116975615A (zh) Task prediction method and apparatus based on multimodal video information
CN100397401C (zh) Method for unified parallel retrieval across multiple resource repositories on a portal website
CN114817692A (zh) Method, apparatus and device for determining a recommended object, and computer storage medium
CN114727143A (zh) Multimedia resource display method and apparatus
WO2024021687A1 (fr) Search result re-ranking method and apparatus, device, storage medium and program product
Jin et al. Personalized micro-video recommendation based on multi-modal features and user interest evolution
CN112445921A (zh) Summary generation method and apparatus
US20240012861A1 (en) Method and a server for generating a machine learning model
RU2778382C2 (ru) Method of training a machine learning algorithm to generate a predicted collaborative embedding for a digital item
US20220083614A1 (en) Method for training a machine learning algorithm (mla) to generate a predicted collaborative embedding for a digital item

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859997

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21859997

Country of ref document: EP

Kind code of ref document: A1