WO2021136363A1 - Video data processing and display methods and apparatuses, electronic device, and storage medium - Google Patents

Video data processing and display methods and apparatuses, electronic device, and storage medium

Info

Publication number
WO2021136363A1
Authority
WO
WIPO (PCT)
Prior art keywords
link data
video
content object
display
information
Prior art date
Application number
PCT/CN2020/141337
Other languages
French (fr)
Chinese (zh)
Inventor
王楚天
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2021136363A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G06Q30/0256 User search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/8405 Generation or processing of descriptive data, e.g. content descriptors represented by keywords
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL

Definitions

  • The embodiments of the present invention relate to the field of computer technology, and in particular to a method and apparatus for processing and displaying video data, an electronic device, and a storage medium.
  • Video advertising is an advertising method that introduces products in the form of videos.
  • Existing video advertisements only push advertisement information to the audience and cannot interact with the audience.
  • If audience members want more detailed information, they have to search for it themselves on a device at hand, such as a mobile phone or computer, based on the information in the video advertisement they are watching.
  • In view of this, embodiments of the present invention provide a video data processing solution to solve some or all of the above-mentioned problems.
  • A method for processing video data, including: acquiring a video to be played and link data, wherein the video includes information of a preset keyword and the link data corresponds to the target content object indicated by the preset keyword; during playback of the video, detecting the information of the preset keyword in at least part of the image frames and/or at least part of the audio data of the played video; and, if the detection result indicates that the information of the preset keyword is detected, displaying the corresponding link data based on the played video.
  • A display method, including: during video playback, when information of a preset keyword is detected, displaying, in the video playback interface, the link data corresponding to the target content object indicated by the detected information of the preset keyword; obtaining a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and, according to the trigger operation, jumping from the video playback interface to the page that is linked by the link data and used to display the target content object.
  • A method for processing video data, including: acquiring and playing a live video stream; during playback of the live video stream, performing content detection on image frames in the live video stream and/or on the audio in the live video stream to obtain the content objects contained in the live video stream; looking up whether each content object has corresponding link data; and taking a content object that has corresponding link data as the target content object, and displaying the link data corresponding to the target content object in the playback interface of the live video stream.
  • A video data processing device, including: a first acquisition module, configured to acquire a video to be played and link data, wherein the video includes information of a preset keyword and the link data corresponds to the target content object indicated by the preset keyword; a first detection module, configured to detect, during playback of the video, the information of the preset keyword in at least part of the image frames and/or at least part of the audio data of the played video; and a first display module, configured to display the link data of the corresponding target content object based on the played video if the detection result indicates that the information of the preset keyword is detected.
  • A display device, including: a video playback module, configured to display, in a video playback interface during video playback, when information of a preset keyword is detected, the link data corresponding to the target content object indicated by the detected information; a trigger acquisition module, configured to obtain a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and an interface jump module, configured to jump, according to the trigger operation, from the video playback interface to the page that is linked by the link data and used to display the target content object.
  • A video data processing device, including: a second acquisition module, configured to acquire and play a live video stream; a second detection module, configured to perform, during playback of the live video stream, content detection on image frames in the live video stream and/or on the audio in the live video stream to obtain the content objects contained in the live video stream; a matching module, configured to look up whether each content object has corresponding link data; and a second display module, configured to take a content object that has corresponding link data as the target content object and display the link data corresponding to the target content object in the playback interface of the live video stream.
  • An electronic device, including: a processor, a memory, a communication interface, and a communication bus.
  • The processor, the memory, and the communication interface communicate with each other through the communication bus.
  • The memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the video data processing method described in the first aspect or the third aspect, or the operations corresponding to the display method described in the second aspect.
  • A computer storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the video data processing method described in the first or third aspect, or the display method described in the second aspect.
  • The video used to promote the target content object includes the information of the preset keywords. During playback of the video, if the information of the preset keywords is detected, the link data corresponding to the target content object indicated by that information is displayed based on the played video. In this way, the video not only draws traffic to the page corresponding to the link data but also interacts well with the audience, giving the audience a way to learn the details of the target content object.
  • Fig. 1a is a flowchart of the steps of a method for processing video data according to the first embodiment of the present invention.
  • Fig. 1b is a schematic diagram of the interaction between a terminal device and a server in a usage scenario according to the first embodiment of the present invention.
  • Fig. 1c is a schematic diagram of interface changes in a terminal device in a usage scenario according to the first embodiment of the present invention.
  • Fig. 2a is a flowchart of the steps of a method for processing video data according to the second embodiment of the present invention.
  • Fig. 2b is a schematic diagram of interface changes in a usage scenario according to the second embodiment of the present invention.
  • Fig. 3 is a flowchart of the steps of a display method according to the third embodiment of the present invention.
  • Fig. 4a is a flowchart of the steps of a method for processing video data according to the fourth embodiment of the present invention.
  • Fig. 4b is a schematic diagram of interface changes in a usage scenario according to the fourth embodiment of the present invention.
  • Fig. 5 is a structural block diagram of a video data processing device according to the fifth embodiment of the present invention.
  • Fig. 6 is a structural block diagram of a display device according to the sixth embodiment of the present invention.
  • Fig. 7 is a structural block diagram of a video data processing device according to the seventh embodiment of the present invention.
  • Fig. 8 is a schematic structural diagram of an electronic device according to the eighth embodiment of the present invention.
  • One way of advertising through video is to play a pre-shot advertisement video on the interface of a webpage or application for the audience (that is, the people watching the advertisement video) to watch.
  • This advertising method has two problems. On the one hand, a video must be shot in advance specifically for each product to be promoted, which makes advertisement production slow and costly. On the other hand, advertisement videos are usually short, so the product information they can present is limited. If audience members want to know more about the product, they have to search for it themselves based on the product name, model, and similar information, which makes for poor interactivity.
  • Referring to Fig. 1a, a flowchart of the steps of a method for processing video data according to the first embodiment of the present invention is shown.
  • In this embodiment, the video data processing method is described taking execution by a terminal device as an example.
  • The video data processing method may also be executed by a server side (a server or the cloud); this embodiment does not limit this.
  • The video data processing method includes the following steps:
  • Step S102: Obtain a video to be played and link data, where the video includes information of a preset keyword, and the link data corresponds to the target content object indicated by the preset keyword.
  • The video to be played may be a video used to explain or display the target content object.
  • The target content object may be any suitable object, such as a commodity, a person, or a location.
  • Commodities can be tangible or intangible (such as services or virtual commodities).
  • In addition to image frame sequence data and audio data, the video includes the information of the preset keywords.
  • The image frame sequence data may include images of the target content object, or may not include any image of the target content object.
  • The information of the preset keywords includes at least one of the following: a preset voice keyword and a preset text keyword.
  • The preset voice keywords and preset text keywords may be the name, model, etc. of the target content object, or other preset keywords determined as needed, such as the category of the target content object.
  • For example, the voice keyword indicated by the information of the preset keywords is "**mobile phone".
  • As another example, the text keyword indicated by the information of the preset keywords is "**thermos cup", and so on.
  • The information of the preset keywords may indicate one or more preset keywords.
  • For example, the preset keywords indicated by the information of the preset keywords include "**lipstick", "**pen", etc.
  • Each preset keyword can correspond to one piece of link data, and the link data corresponding to different preset keywords can be the same or different.
  • After an audience member (a person watching the video, hereinafter "the audience") performs a trigger operation on the link data, the interface jumps to the corresponding page, so that the video directs traffic to the page corresponding to the link data.
  • The link data can be, for example, the URL or IP address of the page.
  • The link data can be determined by the application provider. For example, if link data A corresponds to the purchase page of product A, the preset keyword corresponding to link data A is the name of product A. While the video is playing, if an image of product A is detected in the video, or the name of product A is mentioned in the subtitles or audio, link data A is determined to match the video.
  • In this way, any video containing the preset keywords can serve as a promotion carrier for the link data, promoting the link data and the target content object, so that any suitable video can serve advertisers and meet their need to promote target content objects through video.
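  • The correspondence between preset keywords and link data described above can be sketched as a simple lookup table. This is only an illustration, not the patent's implementation: the `LinkData` class, the `link_for_keyword` helper, the keywords, and the `shop.example.com` URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LinkData:
    """Link data for one target content object (all values are made up)."""
    target_object: str
    url: str


CUP_LINK = LinkData("example cup", "https://shop.example.com/items/cup")
PEN_LINK = LinkData("example pen", "https://shop.example.com/items/pen")

# Each preset keyword maps to one piece of link data; several keywords
# may share the same link data, as the text allows.
KEYWORD_TO_LINK: Dict[str, LinkData] = {
    "example cup": CUP_LINK,
    "example thermos cup": CUP_LINK,
    "example pen": PEN_LINK,
}


def link_for_keyword(keyword: str) -> Optional[LinkData]:
    """Return the link data for a preset keyword, or None if there is none."""
    return KEYWORD_TO_LINK.get(keyword.strip().lower())
```

  • Normalizing the keyword before the lookup (trimming and lower-casing) is a design choice of this sketch; the source does not say how keywords are compared.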
  • Step S104: During playback of the video, detect the information of the preset keywords in at least part of the image frames and/or at least part of the audio data of the played video.
  • Those skilled in the art can use any suitable detection method. For example, if the preset keyword information is the name of commodity A, then when detecting image frames, an image recognition algorithm or a trained neural network model able to recognize commodity A (such as a convolutional neural network model) can be used to detect the image frames in the video and determine whether any image frame contains commodity A.
  • A neural network model capable of character recognition can also be used to detect the image frames in the video and determine whether any text in an image frame contains the name of commodity A.
  • When detecting audio data, a speech recognition algorithm (such as an ASR, Automatic Speech Recognition, algorithm) can be used.
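  • Once character recognition or speech recognition has produced text, checking it for preset keywords can be as simple as the sketch below. It assumes the recognizer's output is already available as a string; the recognition models themselves are out of scope, and the `detect_keywords` helper and the sample keywords are hypothetical.

```python
from typing import List


def _normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def detect_keywords(recognized_text: str, preset_keywords: List[str]) -> List[str]:
    """Return the preset keywords that occur in the recognized text.

    `recognized_text` stands for the output of a character-recognition or
    speech-recognition step (an assumption of this sketch).
    """
    haystack = _normalize(recognized_text)
    return [kw for kw in preset_keywords if _normalize(kw) in haystack]
```

  • Plain substring matching is the simplest possible choice here; a real system might need fuzzy matching to tolerate recognition errors.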
  • Step S106: If the detection result indicates that the information of the preset keyword is detected, display the corresponding link data based on the played video.
  • For example, if the information of the preset keyword is the name of commodity A, and the name of commodity A is recognized in the subtitles contained in an image frame, it is determined that the information of the preset keyword is detected, and the corresponding link data is displayed.
  • The link data improves interactivity with the audience, making it easier for the audience to learn about the target content object of interest.
  • The flexibility of link data display is also improved, so that the link data integrates better into the video and the approach can be applied to any video.
  • the time information of the voice keyword in the audio data that is, the playing time, such as 1 minute and 20 seconds, etc.
  • the preset voice keyword is played at the current moment.
  • A control corresponding to the link data (such as a floating window, mask, or pop-up window) can be displayed at an appropriate position in the video playback interface, so that the link data is displayed through the control and operations on the link data can be received through the control.
  • The link data is not displayed in the video all the time. Instead, it is displayed intelligently according to whether the information of the preset keywords is detected during playback, so that its display integrates well into the video and does not appear obtrusive or forced. This improves the promotion effect of video promotion while preserving the viewing experience.
  • Those skilled in the art can use any appropriate method to display the link data based on the played video, which is not limited in this embodiment. For example, when the video starts to play, the control corresponding to the link data is drawn in advance with its attribute set to "hidden"; when the preset keyword information is detected, the attribute of the pre-drawn control is changed to "display"; and so on.
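  • The pre-drawn control example above can be sketched as a small state holder. The `LinkControl` class and its method names are hypothetical; a real player would toggle the visibility attribute of an actual UI widget.

```python
class LinkControl:
    """A pre-drawn control for link data that starts out hidden."""

    def __init__(self, url: str) -> None:
        self.url = url
        # Drawn in advance when playback starts, but not yet visible.
        self.visibility = "hidden"

    def on_detection_result(self, keyword_detected: bool) -> None:
        """Reveal the control once the preset keyword has been detected."""
        if keyword_detected:
            self.visibility = "display"


control = LinkControl("https://shop.example.com/items/cup")
control.on_detection_result(False)   # still hidden
control.on_detection_result(True)    # now displayed
```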
  • In this way, the video can interact with the audience: when the audience operates on the link data, the interface jumps to the corresponding page, which provides more detailed information and makes it convenient for the audience to learn more or make a purchase.
  • In one usage scenario, as shown in Fig. 1b, a terminal device (such as a mobile phone, personal computer, or personal mobile computer) interacts with a server side (a server or the cloud).
  • The terminal device obtains webpage data from the server side.
  • The webpage data contains a video to be played (for example, the video may be played in the small window that often appears in the lower right corner of a webpage).
  • The video contains not only the image frame sequence data and audio data of a conventional video, but also the information of the preset keywords.
  • The link data of the target content object indicated by the preset keyword can also be obtained.
  • The link data can be included in the video or exist independently of it.
  • When the video is played in a small window, the interface is as shown in interface 1 in Fig. 1c.
  • When the information of the preset keywords is detected during playback, as shown in interface 2 in Fig. 1c, the corresponding link data is displayed in the video playback interface; the interface displaying the link data is shown in interface 3 in Fig. 1c.
  • For example, the voice keyword or text keyword indicated by the information of the preset keyword is the name of product A (such as "**cup"), and the target content object is the **cup.
  • One way to detect the information of the preset keyword is to perform image recognition on the current image frame to determine the content objects it contains. If product A is among the content objects (for example, the **cup is present), it is determined that the information of the preset keyword is detected, and the corresponding link data can be displayed.
  • Another detection method is to perform speech recognition on the currently played audio data. If a voice keyword (such as the name of product A) is recognized, it is determined that the information of the preset keyword is detected, and the corresponding link data can be displayed.
  • Other methods can also be used for detection, which will not be described here.
  • With this embodiment, the video used to promote the target content object includes the information of the preset keywords. During playback, if the information of the preset keywords is detected, the link data corresponding to the target content object indicated by that information is displayed based on the played video. In this way, the video not only draws traffic to the page corresponding to the link data but also interacts well with the audience, giving the audience a way to learn the details of the target content object.
  • Referring to Fig. 2a, a flowchart of the steps of a method for processing video data according to the second embodiment of the present invention is shown.
  • In this embodiment, the terminal device is still the executing entity, and the description focuses on the display process of the link data in the video data processing method.
  • The video data processing method of this embodiment includes the following steps:
  • Step S100: Obtain the identifier of the target content object, and generate the preset keyword information corresponding to the target content object according to the identifier.
  • The target content object can be selected by the advertiser or determined based on big data analysis.
  • The target content object may be a physical commodity, such as a cup or mobile phone, or a non-physical commodity, such as a cleaning service or virtual currency.
  • The identifier can be the name, model, category, or preset code of the target content object, another word selected by the advertiser, and so on.
  • The generated preset keyword information may indicate only one preset keyword, or more than one.
  • For example, the information of the preset keywords includes "**cup", "**lipstick", and "**phone".
  • Step S102: Obtain the video to be played and the link data.
  • Step S102 can adopt the same implementation as step S102 in the first embodiment, so it is not repeated here.
  • Step S104: During playback of the video, detect the information of the preset keywords in at least part of the image frames and/or at least part of the audio data of the played video.
  • The detection of the information of the preset keywords can adopt the implementation in the first embodiment, so it is not repeated here.
  • Step S106: If the detection result indicates that the information of the preset keyword is detected, display the corresponding link data based on the played video.
  • Step S106 can be implemented as in the first embodiment.
  • Alternatively, step S106 can be implemented as: matching the copywriting data corresponding to the link data from the copywriting data input by the application provider of the target content object; generating the link data to be displayed based on the link data and the matched copywriting data; and displaying the link data to be displayed based on the video playback interface.
  • In this way, the application provider can input personalized copywriting data as needed, and the copywriting data is automatically matched with the link data, so that the corresponding copywriting data is displayed together with the link data, increasing the audience's interest in clicking it.
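  • A minimal sketch of matching copywriting data to link data follows. The source does not say how the match is made; keying the provider's copy by link URL, the `build_display_link` helper, and the fallback text are assumptions of this sketch.

```python
from typing import Dict


def build_display_link(link_url: str,
                       copy_by_link: Dict[str, str],
                       default_copy: str = "View details") -> Dict[str, str]:
    """Combine link data with the copy matched to it.

    `copy_by_link` stands for copywriting data entered by the provider,
    keyed by link URL (an assumed matching rule). If nothing matches,
    a hypothetical default text is used.
    """
    copy_text = copy_by_link.get(link_url, default_copy)
    return {"url": link_url, "copy": copy_text}


provider_copy = {"https://shop.example.com/items/cup": "Keeps drinks hot all day"}
to_display = build_display_link("https://shop.example.com/items/cup", provider_copy)
```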
  • A display duration may be set for the link data, that is, how long it stays in the playback interface of the video.
  • In this case, displaying the corresponding link data based on the played video may be implemented as: displaying the link data for a preset display duration, so that the audience of the target content object can operate on the link data within the preset display duration.
  • The preset display duration, that is, how long the link data stays on display, can be determined as needed, such as 20 seconds or 1 minute, which is not limited in this embodiment.
  • If no trigger operation on the link data is received from the audience by the time the preset display duration is reached, the link data is hidden or destroyed.
  • If a trigger operation is received, the method can also jump to the page corresponding to the link data.
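  • The display-duration behaviour above can be sketched with explicit timestamps, which keeps the example deterministic. The `TimedLink` class and its method names are hypothetical; times are in seconds.

```python
from typing import Optional


class TimedLink:
    """Link data that stays visible for a preset display duration."""

    def __init__(self, url: str, display_seconds: float) -> None:
        self.url = url
        self.display_seconds = display_seconds
        self.shown_at: Optional[float] = None  # None means hidden

    def show(self, now: float) -> None:
        """Start displaying the link data at time `now`."""
        self.shown_at = now

    def is_visible(self, now: float) -> bool:
        """True while the preset display duration has not yet elapsed."""
        return (self.shown_at is not None
                and now - self.shown_at < self.display_seconds)

    def trigger(self, now: float) -> Optional[str]:
        """Return the page to jump to, or None if the link has expired."""
        if self.is_visible(now):
            return self.url
        self.shown_at = None  # hide (or destroy) the expired link
        return None
```

  • For instance, a link shown at t = 100 s with a 20-second duration can be triggered at t = 110 s but not at t = 121 s.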
  • To display the link data, the following method can be used: in the video playback interface, add a display control to display the corresponding link data, where the control includes at least one of the following: a floating window, a mask, and a pop-up window.
  • Because the display position of the display control can be adjusted easily, the link data can be laid out over the video to suit the positions of the content objects in the image frames, ensuring that the link data is displayed at a more appropriate position.
  • Adding a display control to display the corresponding link data includes the following sub-steps:
  • Sub-step S1061: Based on when the information of the preset keyword is played, perform image recognition on a preset number of image frames after the current image frame, and determine the position information of the content objects in the image frames according to the recognition result.
  • For example, a neural network model, foreground/background segmentation, or the like can be used to perform image recognition on a preset number of image frames after the current image frame, and the position information of the content objects in the image frames is determined from the recognition result. Based on the position information, the blank areas in an image frame, or the areas that do not block the main content objects in the image frame, can subsequently be determined, and these areas are taken as suitable display positions for the display control.
  • The content objects in an image frame may be people, objects, buildings, text, etc.
  • The preset number can be determined as needed, which is not limited in this embodiment. For example, if the preset number is 5, the preset number of image frames after the current image frame may be 5 consecutive image frames after the current image frame, or 5 image frames taken at intervals. In the latter case, the number of image frames skipped between two adjacent selected image frames can be determined as required.
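  • The two ways of picking the preset number of image frames (consecutive, or at intervals) can be sketched as index arithmetic. The `frames_to_inspect` helper and its `interval` parameter (the number of frames skipped between two selected frames) are hypothetical names for this illustration.

```python
from typing import List


def frames_to_inspect(current_index: int, preset_number: int,
                      interval: int = 0) -> List[int]:
    """Indices of the image frames to recognize after the current frame.

    interval=0 selects consecutive frames; interval=k skips k frames
    between two adjacent selected frames.
    """
    step = interval + 1
    return [current_index + step * k for k in range(1, preset_number + 1)]


consecutive = frames_to_inspect(100, 5)            # frames 101..105
spaced = frames_to_inspect(100, 5, interval=2)     # every third frame
```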
  • Sub-step S1062: Determine the display position of the link data according to the position information of the content objects in the image frames.
  • For example, sub-step S1062 may be implemented as: determining the blank positions in each image frame according to the position information of the content objects in that image frame; and determining the display position of the link data according to the blank positions in each image frame. This reduces occlusion of content objects when the link data is displayed.
  • For example, the information of the preset keywords starts to be played at the 20th second. Image recognition is performed on the 5 image frames after the image frame corresponding to the 20th second, and the blank positions in each image frame are determined. On this basis, the blank position with the highest coincidence rate is determined and used as the display position of the link data, so that the link data does not block, or blocks less of, the target content object in the image frames when it is displayed.
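  • One possible reading of "the blank position with the highest coincidence rate" is the candidate region that is blank in the largest share of the inspected frames. The sketch below follows that reading; naming candidate regions with labels such as "bottom-right" and the `best_display_position` helper are assumptions, and the per-frame blank regions are taken as given rather than computed from images.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def best_display_position(
        blank_positions_per_frame: List[Iterable[str]]
) -> Optional[Tuple[str, float]]:
    """Pick the region that is blank in the most frames.

    Returns (region, coincidence_rate), where the rate is the fraction of
    inspected frames in which the region is blank, or None if no frame
    has any blank region.
    """
    counts: Counter = Counter()
    for blanks in blank_positions_per_frame:
        counts.update(set(blanks))  # count each region once per frame
    if not counts:
        return None
    # Iterating in sorted order makes ties deterministic.
    best = max(sorted(counts), key=lambda region: counts[region])
    return best, counts[best] / len(blank_positions_per_frame)


frames = [
    {"top-left", "bottom-right"},
    {"bottom-right"},
    {"bottom-right", "top-right"},
    {"top-left", "bottom-right"},
    {"top-right"},
]
choice = best_display_position(frames)
```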
  • As another example, the target content object corresponding to the link data is determined from the content objects, and a position at a certain distance from the target content object is determined as a suitable display position.
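  • Placing the control at a certain distance from the target content object can be sketched with bounding boxes. The source does not specify the distance or the direction; the 16-pixel `gap`, the right-then-left preference, and the `position_beside_target` helper are assumptions of this sketch. Boxes are (x, y, width, height) in pixels.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def position_beside_target(target_box: Box,
                           control_size: Tuple[int, int],
                           frame_size: Tuple[int, int],
                           gap: int = 16) -> Optional[Tuple[int, int]]:
    """Top-left corner for a control placed `gap` pixels from the target.

    Tries the right side of the target first, then the left side, and
    returns None if the control fits on neither side.
    """
    x, y, w, _h = target_box
    control_w, control_h = control_size
    frame_w, frame_h = frame_size
    if x + w + gap + control_w <= frame_w:
        pos_x = x + w + gap
    elif x - gap - control_w >= 0:
        pos_x = x - gap - control_w
    else:
        return None
    # Keep the control vertically inside the frame.
    pos_y = max(0, min(y, frame_h - control_h))
    return pos_x, pos_y
```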
  • Sub-step S1063: Display the corresponding link data through the display control at the display position.
  • The display control can be displayed in different ways depending on its structure.
  • For example, sub-step S1063 may be implemented as: displaying the display control at the display position, and displaying a first sub-control and a second sub-control in the display control.
  • The first sub-control is used to display the text and/or image information corresponding to the target content object indicated by the information of the preset keywords; the second sub-control includes a trigger control corresponding to the link data, which, when triggered, causes a jump from the video playback interface to the page linked by the link data.
  • The text and/or image information corresponding to the target content object can be preset or added by the business owner.
  • In this way, the related information of the target content object and the trigger control of the link data can be displayed at the same time, so that the audience can readily identify the target content object and its corresponding trigger control. This also allows multiple different display controls to be displayed in one interface.
  • Different display controls can have different functions and be displayed in different ways, thereby reducing implementation costs.
  • Alternatively, the display control can be displayed directly at the display position, which is not limited in this embodiment.
  • the method further includes:
  • Step S108 Receive an operation on the displayed link data, and jump from the playback interface of the video to the page linked by the link data according to the operation.
  • receiving an operation on the displayed link data indicates that the audience wants to further understand the target content object or view more information related to it, so according to this operation, the video playback interface jumps to the page linked by the link data to display more information corresponding to the target content object.
  • the video data processing method can be applied to any appropriate use scenario. For example, it can be used in an e-commerce website to add a video playback window to the homepage, product display page, search display page, etc., and to display related link data during video playback, so that the audience can click on the link data to jump to the corresponding page and view it, thereby driving traffic to that page.
  • the terminal device obtains webpage data from the server side (which includes a server or the cloud) through the network (refer to Figure 1b for a schematic diagram of the connection between the terminal device and the server), where the webpage data includes a video, and the video includes image frame sequence data, audio data, and preset keyword information.
  • the web page data can also include link data.
  • the interface for playing the video in the web interface is as shown in interface 1 in Figure 2b.
  • the video may be a video that introduces a certain target content object (such as a product).
  • the text keywords are displayed in the subtitles in the interface.
  • a semi-transparent mask is displayed in the interface, and the semi-transparent first and second sub-controls are displayed in the mask.
  • the first sub-control is used to display the name of the target content object 1 (such as XXX hand cream).
  • the second sub-control is a trigger control for the link data (such as a trigger button, a trigger pop-up window, etc.).
  • the interface jumps to the product introduction interface of the target content object, which is used to display the detailed information of the target content object (taking hand cream as an example, the detailed information can be the appearance, volume, composition, etc. of the hand cream).
  • the aforementioned video may be automatically generated using a video generation tool.
  • the material video is obtained in advance, and the material video is analyzed and processed to obtain the target content object and preset keywords corresponding to the material video.
  • the algorithm automatically selects the material video that matches the search information, associates the material video with the link data provided by the business owner, and then generates the video from the material video and the link data.
  • the image frame sequence data included in the video can be the image frame sequence data in the material video; the audio data can be the audio data in the material video or audio data automatically generated from the material copy; and the link data is the link data provided by the business owner, so that advertising videos can be produced more accurately.
  • the video used to promote the target content object includes the information of the preset keywords.
  • when the information of the preset keywords is detected, the link data corresponding to the target content object indicated by that information is displayed based on the played video. In this way, not only can the video be used to draw traffic to the page corresponding to the link data, but it can also interact well with the audience and provide the audience with a way to learn the detailed information of the target content object.
  • the intelligent display of the link data can be realized, and the display of the link data can be well integrated into the video.
  • videos can be automatically generated, so that business owners (such as advertisers) who are not familiar with video production tools or have no video production capabilities can also generate the videos they need.
  • the link data with better integration is displayed, so that the display position of the link data changes with the image changes in the video, and the intelligence and adaptability are better, which can directly increase the conversion rate of the video.
  • Referring to FIG. 3, there is shown a schematic flowchart of the steps of a display method according to the third embodiment of the present invention.
  • the display method is described as follows.
  • the display method of this embodiment includes the following steps:
  • Step S300 During the video playing process, when the information of the preset keyword is detected, the link data corresponding to the target content object indicated by the detected information of the preset keyword is displayed in the video playing interface.
  • the process of displaying the link data can be the process described in the foregoing first embodiment or second embodiment, so it will not be repeated here.
  • Step S302 Acquire a trigger operation on the link data corresponding to the target content object displayed in the video playback interface.
  • the link data may indicate a page associated with the target content object, and the link data may be a URL (Uniform Resource Locator), an IP address, etc.
  • the link data is the link data corresponding to the information of the preset keyword that is triggered to be displayed when the video is played to the information of the preset keyword.
  • the target content object can be any suitable object such as commodities, characters, scenic spots, and so on.
  • Commodities can be tangible or intangible.
  • the trigger operation may be a click operation, a long press operation, a sliding operation, a double-click operation, etc. on the link data (for example, a display control displaying the link data) by the audience.
  • Step S304 According to the trigger operation, jump from the video playback interface to the page linked by the link data for displaying the target content object.
  • a request for accessing the page indicated by the link data is generated and sent to the corresponding server to obtain the data of the page corresponding to the link data for display.
  • the corresponding link data is displayed for the audience to trigger, and if the trigger operation is received, the page corresponding to the link data is displayed, thereby presenting the detailed information of the target content object (such as a product) for the audience to view.
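Steps S300 to S304 can be sketched as a small session object. This is a hedged illustration rather than the embodiment's implementation: `PlaybackSession`, its method names, and the example URL are hypothetical, keyword detection is assumed to happen elsewhere, and the page request is only constructed here (with the standard-library `urllib.request.Request`), not sent.

```python
from typing import Dict, Optional
from urllib.request import Request


class PlaybackSession:
    """Tracks which link data is currently shown in the playback interface."""

    def __init__(self, keyword_links: Dict[str, str]) -> None:
        # Maps preset keyword information to the link data of its target object.
        self.keyword_links = keyword_links
        self.displayed_link: Optional[str] = None

    def on_keyword_detected(self, keyword: str) -> None:
        """Step S300: show the link data for a detected preset keyword."""
        link = self.keyword_links.get(keyword)
        if link is not None:
            self.displayed_link = link

    def on_trigger(self) -> Optional[Request]:
        """Steps S302/S304: build the request for the linked page.

        Returns None when no link data is displayed, so a stray click
        outside the display window does nothing.
        """
        if self.displayed_link is None:
            return None
        return Request(self.displayed_link, method="GET")


session = PlaybackSession({"hand cream": "https://shop.example.com/item/1001"})
session.on_keyword_detected("hand cream")
request = session.on_trigger()
```

A real player would pass the resulting request to its networking layer and render the returned page in place of the playback interface.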
  • Referring to FIG. 4a, there is shown a schematic flow chart of the steps of a method for processing video data according to the fourth embodiment of the present invention.
  • the method for processing video data is described in conjunction with a live video sales scenario.
  • the live broadcaster can recommend and introduce products to viewers through the live broadcast, and can also display the effects of trials in the live broadcast, thereby realizing online sales of the products.
  • the video data processing method can be executed by a terminal device as a playback terminal.
  • Step S402 Acquire and play the live video stream.
  • the live video stream can be a real-time video obtained from the live broadcast server, or it can be a real-time video obtained directly from the live broadcast terminal.
  • the live video stream may be a video in which the live broadcaster introduces the product, but is not limited to this, and it may be a video of any other content.
  • Step S404 During the playback of the live video stream, perform content detection on the image frames in the live video stream, and/or perform content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream.
  • the content object can be the people, objects, buildings, etc. in the image; or the people, objects, buildings, etc. indicated by the text in the image frame or by the text keywords appearing in the subtitles; or the people, objects, buildings, etc. indicated by the voice keywords appearing in the audio.
  • performing content detection on the image frames in the live video stream to obtain the content objects contained in the live video stream may be implemented as follows: during the playback of the live video stream, perform image recognition at a preset position in the image frame in the live video stream, and obtain, according to the recognition result, the content object indicated by the text keyword in the image frame and/or the content object indicated by the image in the image frame.
  • a neural network model with a corresponding content object recognition function is used to detect the preset position in the image frame to identify the content object contained in the preset position of the image frame.
  • the preset position may be a position configured by default, such as the entire image frame; or, it may also be a part or all of the image frames selected by the live broadcast host through a selection box during the live broadcast.
  • the degree of freedom of detection can be improved, so as to improve the adaptability.
  • Performing content detection on the audio in the live video stream to obtain the content objects contained in the live video stream includes: performing audio recognition on the audio in the live video stream during the playback of the live video stream, and obtaining the content object indicated by the voice keyword in the audio.
  • voice keywords such as product names and product models mentioned in the audio can be detected to determine the content objects indicated by these voice keywords.
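Voice-keyword detection of this kind can be sketched as a lookup over recognized speech. The sketch below assumes the audio has already been converted to text by a speech-recognition step that is not shown; the function name, the keyword table, and the object identifiers are hypothetical.

```python
from typing import Dict, List


def detect_content_objects(transcript: str,
                           keyword_to_object: Dict[str, str]) -> List[str]:
    """Return the content objects whose voice keywords occur in the transcript.

    Matching is a case-insensitive substring test, which is the simplest
    possible choice; a production system would likely need tokenization
    and fuzzy matching.
    """
    text = transcript.casefold()
    found: List[str] = []
    for keyword, content_object in keyword_to_object.items():
        if keyword.casefold() in text and content_object not in found:
            found.append(content_object)
    return found


# Usage with a hypothetical keyword table (product name -> content object id).
keywords = {"xxx hand cream": "hand-cream-001", "lipstick": "lipstick-002"}
objects = detect_content_objects(
    "Today I will show you the XXX Hand Cream", keywords)
```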
  • At least some of the content objects contained in the live video stream can be obtained in this way, and it can subsequently be determined whether there is a target content object among these content objects, so as to determine whether link data needs to be displayed.
  • Step S406 Find out whether the content object has corresponding link data.
  • step S406 includes: searching a preset commodity database to determine whether the content object has corresponding link data.
  • the commodity database stores commodity identifiers (such as commodity names) and its corresponding link data (commodity purchase links).
  • the preset keywords corresponding to the content object can be matched against the commodity identifiers to determine whether there is corresponding link data in the commodity database. If there is, the content object is the target content object, and the link data can be displayed in the playback interface for the viewer to operate on as needed.
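The commodity-database lookup of step S406 can be sketched with an in-memory table. This is an assumption-laden illustration: the description does not say how the database is stored or how identifiers are normalized, so the dictionary, the normalization rule, and the example URLs are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical commodity database: commodity identifier (name) -> purchase link.
COMMODITY_DB: Dict[str, str] = {
    "xxx hand cream": "https://shop.example.com/item/1001",
    "yyy lipstick": "https://shop.example.com/item/1002",
}


def find_link_data(keyword: str,
                   db: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the link data for a content object's keyword, or None.

    A non-None result means the content object is a target content object
    whose link data should be displayed in the playback interface.
    """
    table = COMMODITY_DB if db is None else db
    return table.get(keyword.strip().casefold())
```

Returning `None` for an unknown keyword lets the caller skip display without treating the miss as an error.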
  • the live video stream includes information indicating preset keywords of the target content object to be identified and corresponding link data.
  • the information and link data of the preset keywords of the target content object to be recognized may be selected by the live broadcaster at the live broadcast end.
  • the live broadcast terminal is equipped with a setting interface, through which the live broadcast host can configure the information of the preset keywords to indicate the target content object and configure the corresponding link data, which improves autonomy and enables the live broadcast host to control the displayed link data as needed.
  • step S406 can be implemented as: determining whether the detected content object includes a content object matching the target content object to be identified, and if it exists, determining that there is corresponding link data.
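When the target content objects and their link data travel with the live video stream, step S406 reduces to filtering the detected content objects against that configuration. A minimal sketch, assuming the configuration arrives as a mapping from content object to link data (the function name and data shapes are hypothetical):

```python
from typing import Dict, List


def select_target_objects(detected: List[str],
                          configured: Dict[str, str]) -> Dict[str, str]:
    """Keep the detected content objects that the broadcaster configured.

    The result maps each target content object to the link data that
    should be displayed for it; an empty result means nothing is shown.
    """
    return {obj: configured[obj] for obj in detected if obj in configured}


targets = select_target_objects(
    ["hand cream", "table"],
    {"hand cream": "https://shop.example.com/item/1001"},
)
```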
  • Step S408 Use the content object with corresponding link data as the target content object, and display the link data corresponding to the target content object in the play interface of the live video stream.
  • the corresponding link data is displayed, so that while watching the live broadcast the viewer can easily jump to the page corresponding to the link data by operating the displayed link data, in order to view the content or perform operations such as purchasing the product corresponding to the target content object.
  • the live broadcast function is enriched, and the viewer can conveniently view the information of the target content object.
  • in addition to being displayed on the playback side, the link data can also be displayed on the live broadcast side simultaneously, so that the live broadcaster knows in a timely manner whether the link data is displayed and how it looks, and can more easily monitor the live broadcast effect.
  • an animation can be set to realize the effect of prompting the link data, so that the viewer can more easily notice the link data.
  • the live broadcast host can configure at least one preset keyword and the corresponding link data through the live broadcast terminal. According to the configured preset keywords, the information of the preset keywords indicating the target content object to be recognized can be generated, and the live video stream, the information of the preset keywords, and the link data are sent to the playback terminal.
  • the interface 1 of the player terminal in FIG. 4b shows the interface of the player terminal to play the live video stream.
  • content recognition can be performed on the image frames and/or audio therein to determine the content objects contained in the live video stream. Then find out whether the detected content object has corresponding link data. If there is corresponding link data, the link data will be displayed on the playback interface of the player end and the playback interface of the live broadcast end.
  • the interface 2 of the player end shows a schematic diagram of the interface displaying the link data.
  • the viewer can jump to the corresponding page by operating the link data to view the content on the page (as shown in the player interface 3 in Figure 4b).
  • Referring to FIG. 5, there is shown a structural block diagram of a video data processing apparatus according to the fifth embodiment of the present invention.
  • the video data processing device of this embodiment includes: a first acquisition module 502, used to acquire a video to be played and link data, where the video includes information of preset keywords, and the link data corresponds to the target content object indicated by the information of the preset keywords; a first detection module 504, used to detect, during the playback of the video, the information of the preset keywords in at least part of the image frames and/or at least part of the audio data of the played video; and a first display module 506, used to display the corresponding link data based on the played video if it is determined according to the detection result that the information of the preset keywords is detected.
  • the information of the preset keyword includes at least one of the following: a preset voice keyword and a preset text keyword.
  • the device further includes: an information generating module 500, configured to obtain the identifier corresponding to a target content object and generate the information of the preset keyword corresponding to the target content object according to the identifier, wherein the identifier includes at least one of the following: the name, model, and category of the target content object.
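Generating preset keyword information from an identifier can be sketched as follows. The description only states that the identifier includes at least one of name, model, and category; how keywords are derived from them is not specified, so the normalization below (trimmed, case-folded, de-duplicated values) and all names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContentObjectId:
    """Identifier of a target content object; at least one field should be set."""
    name: Optional[str] = None
    model: Optional[str] = None
    category: Optional[str] = None


def generate_preset_keywords(identifier: ContentObjectId) -> List[str]:
    """Derive preset keywords from whichever identifier fields are present."""
    keywords: List[str] = []
    for value in (identifier.name, identifier.model, identifier.category):
        if not value:
            continue
        keyword = value.strip().casefold()
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords


preset = generate_preset_keywords(
    ContentObjectId(name="XXX Hand Cream", category="Skin Care"))
```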
  • the device further includes: a receiving module 508, configured to receive an operation on the displayed link data, and jump from the playing interface of the video to the page linked by the link data according to the operation.
  • the first display module 506 is configured to add a display control to display the corresponding link data on the video playback interface, where the display control includes at least one of the following: a floating window, a mask, and a pop-up window.
  • the first display module 506 is configured to: perform image recognition on a preset number of image frames after the current image frame, based on the information of the preset keyword being played, and determine the position information of the content object in the image frame according to the recognition result; determine the display position of the link data according to the position information of the content object in the image frame; and display the corresponding link data through the display control at the display position.
  • the first display module 506 is configured to, when displaying the corresponding link data through the display control at the display position, display the display control at the display position, and display the first sub-control and the second sub-control in the display control; wherein the first sub-control is used to display the text and/or image information corresponding to the target content object indicated by the information of the preset keywords, and the second sub-control includes a trigger control corresponding to the link data, which, when triggered, causes the video playback interface to jump to the page linked by the link data.
  • the first display module 506 is configured to, when determining the display position of the link data according to the position information of the content object in the image frame, determine the blank position in each image frame according to the position information of the content object in each image frame indicated by the recognition result, and determine the display position of the link data according to the blank position in each image frame.
  • the first display module 506 is configured to display the link data with a preset display duration when displaying the corresponding link data based on the played video, so that the audience of the target content object can operate on the link data within the preset display duration .
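The preset display duration amounts to a time-window check against the playback clock. A minimal sketch, assuming times are measured in seconds of playback and the window starts at the moment the keyword information is detected (both assumptions; the function name is hypothetical):

```python
def link_visible(now_s: float, detected_at_s: float,
                 display_duration_s: float) -> bool:
    """Return True while the link data should still be displayed.

    The window is half-open: the link appears at the detection time and
    disappears exactly when the preset display duration has elapsed.
    """
    return detected_at_s <= now_s < detected_at_s + display_duration_s
```

For a keyword detected at 20 s with a 5 s display duration, the link is shown from 20 s up to, but not including, 25 s.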
  • the first display module 506 is configured to, when displaying the corresponding link data based on the played video, match the copy data corresponding to the link data from the copy data input by the application provider of the target content object; generate the link data to be displayed according to the link data and the matched copy data; and display the link data to be displayed based on the playback interface of the video.
  • the video data processing apparatus of this embodiment is used to implement the corresponding video data processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • the functional realization of each module in the video data processing apparatus of this embodiment can refer to the description of the corresponding part in the foregoing method embodiment, and will not be repeated here.
  • Referring to FIG. 6, there is shown a structural block diagram of a display device according to the sixth embodiment of the present invention.
  • the display device of this embodiment includes: a link data display module 600, used to display, in the video playback interface, the link data corresponding to the target content object indicated by the detected information of the preset keyword when the information of the preset keyword is detected during video playback; a trigger obtaining module 602, used to obtain a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and an interface jump module 604, used to jump, according to the trigger operation, from the video playback interface to the page that is linked by the link data and used to display the target content object.
  • the display device of this embodiment is used to implement the corresponding display methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • the functional realization of each module in the display device of this embodiment can refer to the description of the corresponding part in the foregoing method embodiment, and will not be repeated here.
  • Referring to FIG. 7, there is shown a structural block diagram of a video data processing device according to the seventh embodiment of the present invention.
  • the device for processing video data in this embodiment includes: a second acquisition module 702, configured to acquire and play a live video stream; a second detection module 704, configured to, during the playback of the live video stream, perform content detection on the image frames in the live video stream and/or perform content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream; a matching module 706, configured to find whether the content objects have corresponding link data; and a second display module 708, configured to use a content object that has corresponding link data as the target content object, and display the link data corresponding to the target content object in the playback interface of the live video stream.
  • the matching module 706 is specifically configured to search a preset commodity database to determine whether the content object has corresponding link data.
  • the live video stream includes information of preset keywords indicating the target content object to be identified, and the corresponding link data; the matching module 706 is specifically configured to determine whether the detected content objects include a content object matching the target content object to be identified, and if so, determine that there is corresponding link data.
  • the second detection module 704 is specifically configured to, when performing content detection on the image frames in the live video stream during the playback of the live video stream to obtain the content objects contained in the live video stream, perform image recognition on the preset position in the image frame in the live video stream, and obtain, according to the recognition result, the content object indicated by the text keyword in the image frame and/or the content object indicated by the image in the image frame.
  • the second detection module 704 is specifically configured to, when performing content detection on the audio in the live video stream during the playback of the live video stream to obtain the content objects contained in the live video stream, perform audio recognition on the audio in the live video stream during the playback of the live video stream, and obtain the content object indicated by the voice keyword in the audio.
  • the video data processing apparatus of this embodiment is used to implement the corresponding video data processing methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • the functional realization of each module in the video data processing apparatus of this embodiment can refer to the description of the corresponding part in the foregoing method embodiment, and will not be repeated here.
  • Referring to FIG. 8, there is shown a schematic structural diagram of an electronic device according to the eighth embodiment of the present invention.
  • the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
  • the electronic device may include: a processor 802, a communication interface 804, a memory 806, and a communication bus 808.
  • the processor 802, the communication interface 804, and the memory 806 communicate with each other through the communication bus 808.
  • the communication interface 804 is used to communicate with other electronic devices such as terminal devices or servers.
  • the processor 802 is configured to execute a program 810, and specifically can execute related steps in the foregoing video data processing or display method embodiments.
  • the program 810 may include program code, and the program code includes computer operation instructions.
  • the processor 802 may be a central processing unit CPU, or an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the one or more processors included in the electronic device may be the same type of processor, such as one or more CPUs, or different types of processors, such as one or more CPUs and one or more ASICs.
  • the memory 806 is used to store the program 810.
  • the memory 806 may include a high-speed RAM memory, and may also include a non-volatile memory, for example, at least one disk memory.
  • the program 810 can specifically be used to make the processor 802 perform the following operations: obtain the video to be played and link data, where the video includes information of preset keywords, and the link data corresponds to the target content object indicated by the information of the preset keywords; during the playback of the video, detect the information of the preset keywords in at least part of the image frames and/or at least part of the audio data of the played video; and if it is determined according to the detection result that the information of the preset keywords is detected, display the corresponding link data based on the played video.
  • the preset keyword information includes at least one of the following: a preset voice keyword and a preset text keyword.
  • the program 810 is further configured to enable the processor 802 to obtain the identifier corresponding to the target content object, and generate the information of the preset keywords corresponding to the target content object according to the identifier, wherein the identifier includes at least one of the following: the name, model, and category of the target content object.
  • the program 810 is further configured to enable the processor 802 to receive an operation on the displayed link data, and according to the operation, jump from the playing interface of the video to the page linked by the link data.
  • the program 810 is further configured to enable the processor 802 to, when displaying the corresponding link data based on the played video, add a display control to the video playback interface to display the corresponding link data, where the display control includes at least one of the following: a floating window, a mask, and a pop-up window.
  • the program 810 is also used to enable the processor 802 to, when adding a display control to the video playback interface to display the corresponding link data, perform image recognition on a preset number of image frames after the current image frame based on the information of the preset keywords being played, and determine the position information of the content object in the image frame according to the recognition result; determine the display position of the link data according to the position information of the content object in the image frame; and display the corresponding link data through the display control at the display position.
  • the program 810 is further configured to cause the processor 802 to, when displaying the corresponding link data through the display control at the display position, display the display control at the display position, and display the first sub-control and the second sub-control in the display control; wherein the first sub-control is used to display the text and/or image information corresponding to the target content object indicated by the information of the preset keywords, and the second sub-control includes a trigger control corresponding to the link data, which, when triggered, causes the video playback interface to jump to the page linked by the link data.
  • the program 810 is further configured to make the processor 802, when determining the display position of the link data according to the position information of the content object in the image frame, determine the blank position in each image frame according to the position information of the content object in each image frame indicated by the recognition result, and determine the display position of the link data according to the blank position in each image frame.
  • the program 810 is further configured to cause the processor 802 to, when displaying the corresponding link data based on the played video, display the link data for a preset display duration, so that the audience of the target content object can operate on the link data within the preset display duration.
  • the program 810 is further configured to enable the processor 802 to, when displaying the corresponding link data based on the played video, match the copy data corresponding to the link data from the copy data input by the application provider of the target content object; generate the link data to be displayed according to the link data and the matched copy data; and display the link data to be displayed based on the playback interface of the video.
  • the program 810 may specifically be used to cause the processor 802 to perform the following operations: during video playback, when the information of the preset keyword is detected, display in the video playback interface the link data corresponding to the target content object indicated by the detected information of the preset keyword; obtain a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and according to the trigger operation, jump from the video playback interface to the page that is linked by the link data and used to display the target content object.
  • the program 810 may specifically be used to cause the processor 802 to perform the following operations: obtain and play a live video stream; during the playback of the live video stream, perform content detection on the image frames in the live video stream, and/or perform content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream; find whether the content objects have corresponding link data; and use a content object that has corresponding link data as the target content object, and display the link data corresponding to the target content object in the playback interface of the live video stream.
  • the program 810 is further configured to enable the processor 802 to, when finding whether the content object has corresponding link data, search a preset commodity database to determine whether the content object has corresponding link data.
  • the live video stream includes information of preset keywords indicating the target content object to be recognized, and the corresponding link data; the program 810 is also used to make the processor 802, when finding whether the content object has corresponding link data, determine whether the detected content objects include a content object matching the target content object to be identified, and if so, determine that there is corresponding link data.
  • the program 810 is further configured to cause the processor 802 to perform content detection on the image frames in the live video stream during playback of the live video stream, so as to obtain the content objects contained in the live video stream.
  • the program 810 is further configured to cause the processor 802, during playback of the live video stream, to perform content detection on the audio in the live video stream and obtain the content object indicated by a voice keyword in the audio.
  • each component/step described in the embodiments of the present invention can be split into more components/steps, or two or more components/steps or partial operations of components/steps can be combined into new components/steps, to achieve the purpose of the embodiments of the present invention.
  • the above method according to the embodiments of the present invention can be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and then stored in a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware (such as an ASIC or FPGA).
  • a computer, a processor, a microprocessor controller, or programmable hardware includes a storage component (for example, RAM, ROM, or flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the video data processing or display method described herein is implemented.
  • when a general-purpose computer accesses the code used to implement the video data processing or display method shown here, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.

Abstract

Video data processing and display methods and apparatuses, an electronic device, and a storage medium. The video data processing method comprises: acquiring a video to be played back and link data, the video comprising information about preset keywords, and the link data corresponding to a target content object indicated by the preset keywords (S102); during playback of the video, performing detection of the information about the preset keywords on at least part of image frames and/or at least part of audio data of the video that is played back (S104); and if it is determined, according to the detection result, that the information about the preset keywords is detected, displaying the corresponding link data on the basis of the video that is played back (S106). By means of said method, the video is more interactive.

Description

Video data processing and display methods, apparatus, electronic device, and storage medium
This application claims priority to Chinese patent application No. 201911412396.X, filed on December 31, 2019 and entitled "Video data processing and display methods, apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present invention relate to the field of computer technology, and in particular to a video data processing method, a display method, an apparatus, an electronic device, and a storage medium.
Background
With the development of Internet technology, people's daily lives increasingly depend on electronic devices. Shopping, payment, socializing, and more can all be done through electronic devices. This has brought a new form of video interaction, interactive video, of which video advertising is one of the more important types.
Video advertising introduces products in the form of video. Existing video advertisements only push advertising information to the audience and cannot interact with it. In particular, when viewers want further details, they can only look them up with a tool at hand, such as a mobile phone or computer, based on the information in the video advertisement they watched.
As can be seen, existing video advertising lacks interaction with the audience and does not readily give the audience a way to learn detailed information.
Summary of the invention
In view of this, embodiments of the present invention provide a video data processing solution to solve some or all of the above problems.
According to a first aspect of the embodiments of the present invention, a video data processing method is provided, including: acquiring a video to be played and link data, wherein the video includes information of a preset keyword, and the link data corresponds to a target content object indicated by the preset keyword; during playback of the video, detecting the information of the preset keyword in at least some image frames and/or at least some audio data of the played video; and, if it is determined from the detection result that the information of the preset keyword is detected, displaying the corresponding link data based on the played video.
According to a second aspect of the embodiments of the present invention, a display method is provided, including: during video playback, when information of a preset keyword is detected, displaying, in the video playback interface, link data corresponding to the target content object indicated by the detected information of the preset keyword; obtaining a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and, according to the trigger operation, jumping from the video playback interface to the page that the link data links to and that is used to display the target content object.
According to a third aspect of the embodiments of the present invention, a video data processing method is provided, including: acquiring and playing a live video stream; during playback of the live video stream, performing content detection on image frames in the live video stream and/or on audio in the live video stream to obtain the content objects contained in the live video stream; looking up whether each content object has corresponding link data; and taking a content object for which corresponding link data exists as a target content object, and displaying the link data corresponding to the target content object in the playback interface of the live video stream.
According to a fourth aspect of the embodiments of the present invention, a video data processing apparatus is provided, including: a first acquisition module, configured to acquire a video to be played and link data, wherein the video includes information of a preset keyword, and the link data corresponds to a target content object indicated by the preset keyword; a first detection module, configured to detect, during playback of the video, the information of the preset keyword in at least some image frames and/or at least some audio data of the played video; and a first display module, configured to display, based on the played video, the link data of the corresponding target content object if it is determined from the detection result that the information of the preset keyword is detected.
According to a fifth aspect of the embodiments of the present invention, a display apparatus is provided, including: a video playback module, configured to display, in the video playback interface during video playback, link data corresponding to the target content object indicated by the detected information of a preset keyword when that information is detected; a trigger acquisition module, configured to obtain a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and an interface jump module, configured to jump, according to the trigger operation, from the video playback interface to the page that the link data links to and that is used to display the target content object.
According to a sixth aspect of the embodiments of the present invention, a video data processing apparatus is provided, including: a second acquisition module, configured to acquire and play a live video stream; a second detection module, configured to perform, during playback of the live video stream, content detection on image frames in the live video stream and/or on audio in the live video stream to obtain the content objects contained in the live video stream; a matching module, configured to look up whether each content object has corresponding link data; and a second display module, configured to take a content object for which corresponding link data exists as a target content object and to display the link data corresponding to the target content object in the playback interface of the live video stream.
According to a seventh aspect of the embodiments of the present invention, an electronic device is provided, including: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the video data processing method of the first aspect or the third aspect, or operations corresponding to the display method of the second aspect.
According to an eighth aspect of the embodiments of the present invention, a computer storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the video data processing method of the first aspect or the third aspect, or the display method of the second aspect.
According to the video data processing solution provided by the embodiments of the present invention, the video used to promote a target content object includes information of a preset keyword; during playback of the video, when the information of the preset keyword is detected, link data corresponding to the target content object indicated by that information is displayed based on the played video. This not only lets the video drive traffic to the page corresponding to the link data, but also interacts well with the audience and gives the audience a way to learn detailed information about the target content object.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them.
Fig. 1a is a flowchart of the steps of a video data processing method according to Embodiment 1 of the present invention;
Fig. 1b is a schematic diagram of the interaction between a terminal device and a server side in a usage scenario according to Embodiment 1 of the present invention;
Fig. 1c is a schematic diagram of interface changes in a terminal device in a usage scenario according to Embodiment 1 of the present invention;
Fig. 2a is a flowchart of the steps of a video data processing method according to Embodiment 2 of the present invention;
Fig. 2b is a schematic diagram of interface changes in a usage scenario according to Embodiment 2 of the present invention;
Fig. 3 is a flowchart of the steps of a display method according to Embodiment 3 of the present invention;
Fig. 4a is a flowchart of the steps of a video data processing method according to Embodiment 4 of the present invention;
Fig. 4b is a schematic diagram of interface changes in a usage scenario according to Embodiment 4 of the present invention;
Fig. 5 is a structural block diagram of a video data processing apparatus according to Embodiment 5 of the present invention;
Fig. 6 is a structural block diagram of a display apparatus according to Embodiment 6 of the present invention;
Fig. 7 is a structural block diagram of a video data processing apparatus according to Embodiment 7 of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device according to Embodiment 8 of the present invention.
Detailed description
To enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, these technical solutions are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention shall fall within the protection scope of the embodiments of the present invention.
The specific implementation of the embodiments of the present invention is further described below with reference to the drawings of the embodiments.
Taking video advertising as the application scenario, one prior-art way of advertising through video is to play a pre-shot advertisement video in the interface of a webpage or application for the audience (that is, the people watching the advertisement video). This approach has two problems. First, a video must be shot in advance specifically for each product to be promoted, so producing the advertisement takes a long time and costs a lot. Second, advertisement videos are usually short, so they can convey only limited product information; viewers who want more details must search for them on their own using the product name, model, and similar information, which makes for poor interactivity.
Embodiment 1
Referring to Fig. 1a, a flowchart of the steps of a video data processing method according to Embodiment 1 of the present invention is shown.
In this embodiment, the video data processing method is described taking execution by a terminal device as an example. Of course, in other embodiments the method may also be executed by a server side (a server or the cloud), which this embodiment does not limit.
The video data processing method includes the following steps:
Step S102: Acquire a video to be played and link data, wherein the video includes information of a preset keyword, and the link data corresponds to the target content object indicated by the preset keyword.
The video to be played may be a video that explains or shows the target content object, and the target content object may be any suitable object, such as a commodity, a person, or a place. A commodity, for example, may be tangible or intangible (such as a service or a virtual commodity).
The video includes image frame sequence data and audio data, and in addition includes the information of the preset keyword. Note that the image frame sequence data may include images of the target content object, or may include no image of it at all.
The information of the preset keyword includes at least one of the following: a preset voice keyword and a preset text keyword.
Specifically, the preset voice keyword and the preset text keyword may be, for example, the name or model of the target content object, or another preset keyword determined as needed, such as the category of the target content object. For example, the information of the preset keyword may indicate the voice keyword "** mobile phone", or the text keyword "** thermos cup".
Within one video, the information of the preset keyword may indicate one or more preset keywords, for example "** lipstick" and "** pen". Each preset keyword may correspond to one piece of link data, and the link data corresponding to different preset keywords may be the same or different.
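The relationship between preset keywords and link data described above can be sketched as a small lookup structure. This is only an illustrative sketch: the `PresetKeyword` class, its field names, the `build_link_index` helper, and the example keywords and URLs are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresetKeyword:
    """One preset keyword carried with a video (hypothetical structure)."""
    text: str  # keyword as spoken or shown, e.g. a product name
    kind: str  # "voice" or "text"
    link: str  # link data, e.g. the URL of the page to jump to


def build_link_index(keywords):
    """Map each keyword text to its single piece of link data."""
    return {kw.text: kw.link for kw in keywords}


# Placeholder keywords and URLs, for illustration only.
keywords = [
    PresetKeyword("Acme lipstick", "voice", "https://example.com/item/1"),
    PresetKeyword("Acme pen", "text", "https://example.com/item/2"),
    PresetKeyword("Acme fountain pen", "text", "https://example.com/item/2"),
]
link_index = build_link_index(keywords)
```

Here two different keywords share the same link data while a third has its own, which mirrors the statement that link data for different keywords may be the same or different.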
After the audience of the target content object (hereinafter "the audience", which may be the people watching the video) performs a trigger operation on the link data, the link data leads to the corresponding page, so that the video drives traffic to that page. The link data may be the URL or IP address of the page, and so on.
In one feasible approach, the link data may be determined by the application provider. For example, if link data A corresponds to the purchase page of commodity A, the preset keyword corresponding to link data A is the name of commodity A. During playback, if it is detected that the video includes an image of commodity A, or that the name of commodity A is mentioned in the subtitles or audio, link data A can be determined to match the video.
In this way, any video containing the preset keyword can serve as a promotion carrier for the link data. Any suitable video can thus serve advertisers and meet their need to promote the target content object through video.
Step S104: During playback of the video, detect the information of the preset keyword in at least some image frames and/or at least some audio data of the played video.
Those skilled in the art may detect the image frames or audio data in any appropriate way. Note that detection may be performed on the image frames and/or audio data currently being played, or pre-detection may be performed on the image frames and/or audio data that follow those being played at the current moment; this embodiment does not limit this.
As one specific detection approach, if the information of the preset keyword is the name of commodity A, then when detecting image frames, an image recognition algorithm or a neural network model trained to recognize commodity A (such as a convolutional neural network model) may be applied to the image frames of the video to determine whether any image frame contains commodity A.
Alternatively, a neural network model capable of text recognition (such as a convolutional neural network model) may be applied to the image frames of the video to determine whether the text in any image frame contains the name of commodity A.
As a further alternative, a speech recognition algorithm (such as ASR, Automatic Speech Recognition) may be applied to the audio data to determine whether the audio data contains an audio segment that includes the name of commodity A.
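The matching stage that follows text recognition or speech recognition can be sketched as below. This is a minimal sketch under stated assumptions: the recognizers themselves (OCR model, ASR engine) are out of scope, so the function takes already-recognized strings as input; the name `detect_preset_keywords` and the case-insensitive substring match are illustrative choices, not the method specified by the application.

```python
def detect_preset_keywords(frame_text, audio_text, preset_keywords):
    """Return the preset keywords found in recognized frame text or speech.

    frame_text: text assumed to come from text recognition on an image frame.
    audio_text: transcript assumed to come from speech recognition.
    Either may be empty when that channel is not being checked.
    """
    haystacks = [t.casefold() for t in (frame_text, audio_text) if t]
    return [
        kw for kw in preset_keywords
        if any(kw.casefold() in h for h in haystacks)
    ]


preset = ["Acme cup", "Acme phone"]
found_in_subtitle = detect_preset_keywords("New ACME cup on sale", "", preset)
found_in_speech = detect_preset_keywords("", "today we look at the acme phone", preset)
found_nothing = detect_preset_keywords("", "", preset)
```

Checking both channels in one call reflects the "and/or" in step S104; a real implementation would likely need fuzzier matching to tolerate recognition errors.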
Step S106: If it is determined from the detection result that the information of the preset keyword is detected, display the corresponding link data based on the played video.
For example, if the information of the preset keyword is the name of commodity A, and the name of commodity A is recognized in the subtitles contained in an image frame, it is determined that the information of the preset keyword is detected, and the corresponding link data is displayed.
In this way, any video being played can be checked automatically for the information of a preset keyword, and when that information is detected, that is, when it is played, the corresponding link data is displayed automatically. On the one hand, the link data improves interactivity and lets the audience learn about a target content object of interest more conveniently. On the other hand, the link data is displayed more flexibly and blends better into the video, and the approach applies to any video.
Those skilled in the art may determine whether the preset keyword information is detected in any appropriate way, which this embodiment does not limit.
For example, whether a preset voice keyword is being played at the current moment may be determined from the time information of the voice keyword in the audio data (that is, its playback time, such as 1 minute 20 seconds).
Alternatively, whether the information of the preset keyword is detected may be determined by a speech detection algorithm, an image recognition algorithm, a deep learning network model, or the like.
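The time-information approach can be sketched as a lookup against known keyword time spans. The sketch assumes the spans (in seconds) are already available, for instance from earlier pre-detection; the `occurrences` layout and the function name `keywords_playing_at` are hypothetical.

```python
def keywords_playing_at(position_s, occurrences):
    """Return the keywords whose time span covers the playback position.

    occurrences: dict mapping a keyword to a list of (start_s, end_s) spans
    during which the keyword is spoken (assumed to be known in advance).
    """
    return sorted(
        kw for kw, spans in occurrences.items()
        if any(start <= position_s <= end for start, end in spans)
    )


# "Acme cup" is spoken at 1 minute 20 seconds, i.e. from 80 s into the video.
occurrences = {
    "Acme cup": [(80.0, 82.5)],
    "Acme pen": [(10.0, 11.0), (95.0, 96.0)],
}
at_80 = keywords_playing_at(80.0, occurrences)
at_50 = keywords_playing_at(50.0, occurrences)
```

A player could call such a function on each playback-position update and show the link data for whatever keywords are returned.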
When the preset keyword is played, a control corresponding to the link data (for example, a floating window, an overlay mask, or a pop-up window) may be displayed at an appropriate position in the video playback interface to show the link data, and the control can receive audience operations such as clicks. The link data is therefore not shown throughout the video; instead, as the video plays, it is shown intelligently depending on whether the information of the preset keyword is detected. The link data thus blends well into the video without appearing abrupt or forced, which improves the effect of promotion through video and preserves the viewing experience.
Those skilled in the art may display the link data based on the played video in any appropriate way, which this embodiment does not limit. For example, a control corresponding to the link data may be drawn in advance when the video starts playing, with its attribute set to "hidden"; when the preset keyword information is detected, the attribute of the pre-drawn control is changed to "display", and so on.
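The pre-drawn, initially hidden control can be sketched independently of any UI toolkit. The classes `LinkControl` and `PlayerOverlay`, the `visible` flag standing in for the "hidden"/"display" attribute, and the example keyword and URL are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkControl:
    """A control that shows one piece of link data (hypothetical)."""
    link: str
    visible: bool = False  # drawn in advance, hidden when playback starts


class PlayerOverlay:
    """Holds one pre-drawn control per preset keyword."""

    def __init__(self, keyword_to_link):
        self.controls = {
            kw: LinkControl(link) for kw, link in keyword_to_link.items()
        }

    def on_keyword_detected(self, keyword):
        """Switch the matching control from hidden to displayed."""
        control = self.controls.get(keyword)
        if control is not None:
            control.visible = True
        return control


overlay = PlayerOverlay({"Acme cup": "https://example.com/item/3"})
hidden_at_start = overlay.controls["Acme cup"].visible
shown = overlay.on_keyword_detected("Acme cup")
unknown = overlay.on_keyword_detected("Acme phone")
```

Creating the controls up front keeps the detection callback cheap, since it only flips a flag rather than building UI during playback.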
The displayed link data both enables interaction between the video and the audience and, when the audience operates on it, leads to the corresponding page, which provides further and more detailed information and makes it convenient for the audience to learn more or make a purchase.
The implementation of the video data processing method is described below with reference to Fig. 1b and Fig. 1c, taking a specific usage scenario as an example:
In this usage scenario, as shown in Fig. 1b, a terminal device (such as a mobile phone, a personal computer, or a mobile personal computer) is connected to a server side (a server or the cloud) through a network. When a member of the audience browses a webpage through a browser on the terminal device, the terminal device obtains webpage data from the server side, and the webpage data contains the video to be played (for example, the video may be played in the small window that often appears in the lower right corner of a webpage).
The video contains not only the image frame sequence data and audio data of a conventional video, but also the information of the preset keyword.
In addition, the link data of the target content object indicated by the preset keyword can be obtained. The link data may be contained in the video or may exist separately from it.
When the video is played in the small window, the interface is as shown in interface 1 of Fig. 1c. When the information of the preset keyword is detected during playback, as shown in interface 2 of Fig. 1c, the corresponding link data is displayed in the video playback interface; the interface displaying the link data is shown in interface 3 of Fig. 1c.
Suppose the voice keyword or text keyword indicated by the information of the preset keyword is the name of commodity A (for example, "** cup"), and the target content object is the ** cup. One way to detect the information of the preset keyword is to perform image recognition on the current image frame to determine the content objects it contains; if commodity A (for example, the ** cup) is among them, it is determined that the information of the preset keyword is detected, and the corresponding link data can be displayed.
Another way is to perform speech recognition on the audio data currently being played; if the voice keyword (for example, the name of commodity A) is recognized, it is determined that the information of the preset keyword is detected, and the corresponding link data can be displayed. Of course, other detection methods may also be used, which are not described one by one here.
Subsequently, if a member of the audience clicks the displayed link data, the browser jumps to the webpage corresponding to the link data and displays it; the interface after the jump is shown in interface 4 of Fig. 1c.
With this embodiment, the video used to promote a target content object includes information of a preset keyword; during playback of the video, when the information of the preset keyword is detected, link data corresponding to the target content object indicated by that information is displayed based on the played video. This not only lets the video drive traffic to the page corresponding to the link data, but also interacts well with the audience and gives the audience a way to learn detailed information about the target content object.
Embodiment 2
Referring to Fig. 2a, a flowchart of the steps of a video data processing method according to Embodiment 2 of the present invention is shown.
In this embodiment, the terminal device is still the executing entity, and the description focuses on how the link data is displayed in the video data processing method. The video data processing method of this embodiment includes the following steps:
Step S100: Obtain an identifier corresponding to the target content object, and generate, according to the identifier, the information of the preset keyword corresponding to the target content object.
The target content object may be selected by the advertiser, determined through big data analysis, and so on. For example, the target content object may be a physical product, such as a cup or a mobile phone, or a non-physical product, such as a cleaning service or virtual currency.
The identifier may be the name, model, or category of the target content object, a preset code name, another word selected by the advertiser, and so on.
The generated information of the preset keyword may indicate only one preset keyword, or it may indicate more than one.
For example, the information of the preset keyword includes "**cup", "**lipstick", and "**mobile phone".
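Step S100 can be pictured, purely as an illustrative sketch, as collecting the identifiers of a target content object into one record. The record layout, the function name, and the sample identifiers below are assumptions and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class PresetKeywordInfo:
    target: str      # the target content object being promoted
    keywords: tuple  # one or more preset keywords that indicate it


def build_keyword_info(target, identifiers):
    """Collect the non-empty, de-duplicated identifiers as preset keywords."""
    keywords = []
    for identifier in identifiers:
        word = identifier.strip()
        if word and word not in keywords:
            keywords.append(word)
    if not keywords:
        raise ValueError("at least one identifier is required")
    return PresetKeywordInfo(target=target, keywords=tuple(keywords))


# Name, model, and a repeated name; the repeat and the empty entry are dropped.
info = build_keyword_info("cup-001", ["brand-x cup", " BX-200 ", "brand-x cup", ""])
```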
Step S102: Obtain the video to be played and the link data.
Step S102 can be implemented in the same way as step S102 in Embodiment 1, so it is not described again.
Step S104: During playback of the video, detect the information of the preset keyword in at least some of the image frames and/or at least some of the audio data of the video being played.
The process of detecting the information of the preset keyword can follow the implementation in Embodiment 1, so it is not described again.
Step S106: If the detection result indicates that the information of the preset keyword has been detected, display the corresponding link data based on the video being played.
Step S106 can be implemented as in Embodiment 1.
Alternatively, in one feasible manner, step S106 can be implemented as: matching, from the copy data input by the application provider side of the target content object, the copy data corresponding to the link data; generating the link data to be displayed according to the link data and the matched copy data; and displaying the link data to be displayed based on the playback interface of the video.
In this way, the application provider side can input personalized copy data as needed, and the copy data is automatically matched with the link data, so that the corresponding copy is shown together with the link data, which increases the audience's interest in clicking the link data.
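The copy-matching variant of step S106 might look like the following sketch. The embodiment does not state how copy data is matched to link data; here it is assumed, for illustration only, that both are keyed by the name of the target content object. All names and URLs are hypothetical.

```python
def build_link_to_display(link, copy_entries):
    """Attach the provider's copy to a link, falling back to the target name.

    link:         {"target": ..., "url": ...} for one target content object.
    copy_entries: copy text supplied by the application provider side,
                  keyed by target name (an assumed matching rule).
    """
    copy_text = copy_entries.get(link["target"])
    return {"url": link["url"], "text": copy_text if copy_text else link["target"]}


copy_entries = {"brand-x cup": "Keeps drinks hot for 12 hours. Tap to see more."}
with_copy = build_link_to_display(
    {"target": "brand-x cup", "url": "https://shop.example/cup"}, copy_entries)
without_copy = build_link_to_display(
    {"target": "brand-x phone", "url": "https://shop.example/phone"}, copy_entries)
```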
Alternatively, in another feasible manner, in order to adapt to image changes in the interface during playback, a display duration can be set for the link data, that is, how long it stays in the playback interface of the video.
To make the link data more convenient for the audience to operate, in step S106, displaying the corresponding link data based on the video being played can be implemented as: displaying the link data for a preset display duration, so that the audience of the target content object can operate on the link data within that duration. The preset display duration, that is, how long the link data stays on screen, can be set as needed, for example 20 seconds or 1 minute; this embodiment places no limit on it.
If no trigger operation on the link data is received from the audience before the preset display duration is reached, the link data is hidden or destroyed.
If an operation on the link data is received from the audience while it is displayed, the method may further include jumping to the page corresponding to the link data.
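The timed display described above can be sketched as a small state holder: the link stays visible for the preset duration, is hidden when the duration elapses without a trigger, and yields its URL if it is clicked in time. The class and method names are hypothetical, and timestamps are passed in explicitly so that the sketch does not depend on any particular player or UI framework.

```python
class LinkOverlay:
    """Link data shown over the video for a preset display duration."""

    def __init__(self, url, shown_at, duration):
        self.url = url
        self.shown_at = shown_at  # playback time (seconds) when first shown
        self.duration = duration  # preset display duration in seconds
        self.visible = True

    def tick(self, now):
        """Hide the overlay once the preset display duration has elapsed."""
        if self.visible and now - self.shown_at >= self.duration:
            self.visible = False
        return self.visible

    def click(self, now):
        """Return the URL to jump to, or None if the overlay is already hidden."""
        return self.url if self.tick(now) else None


overlay = LinkOverlay("https://shop.example/cup", shown_at=0.0, duration=20.0)
clicked_in_time = overlay.click(5.0)   # still visible, so the jump target is returned
still_visible = overlay.tick(19.9)
hidden = not overlay.tick(20.0)        # duration reached, overlay is hidden
clicked_late = overlay.click(21.0)
```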
Alternatively, in this embodiment, displaying the corresponding link data based on the video being played in step S106 can be implemented as follows: in the playback interface of the video, a display control is added to display the corresponding link data, where the display control includes at least one of the following: a floating window, a mask, or a pop-up window.
Because the position of a display control can be adjusted easily, the layout over the video can be arranged when the link data is displayed, adapting to the positions of the content objects in the image frames so that the link data is displayed at a suitable position.
Preferably, so that the display control blends better into the video, being conspicuous enough for the audience to notice while reducing the sense of forced insertion, adding a display control in the playback interface of the video to display the corresponding link data in step S106 includes the following sub-steps:
Sub-step S1061: Based on the information of the preset keyword being played, perform image recognition on a preset number of image frames after the current image frame, and determine the position information of the content objects in the image frames according to the recognition result.
When a display control is blended into the video, how well it fits into the image directly determines the blending effect. Therefore, in this embodiment, methods such as a neural network model or foreground/background segmentation can be used to perform image recognition on a preset number of image frames after the current image frame. The position information of the content objects in the image frames is determined from the recognition result; from this position information, blank areas in the image frames, or areas that do not block the main content objects, can then be identified and taken as display positions suitable for the display control.
The content objects in an image frame may be people, objects, buildings, text, and so on. The preset number can be set as needed, and this embodiment places no limit on it. It should be noted that if the preset number is 5, for example, the preset number of image frames after the current image frame may be 5 consecutive image frames after the current image frame, or 5 image frames taken at intervals. In the latter case, the number of image frames skipped between two adjacent selected frames can be set as needed.
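The choice between consecutive and spaced frames can be illustrated with a short index computation. This is only a sketch; the function name and the `stride` parameter (1 for consecutive frames, larger values for frames taken at intervals) are assumptions.

```python
def frames_to_analyze(current_index, count, stride=1, total_frames=None):
    """Indices of the `count` frames after `current_index`, `stride` frames apart."""
    indices = [current_index + stride * k for k in range(1, count + 1)]
    if total_frames is not None:
        # Drop indices that would fall past the end of the video.
        indices = [i for i in indices if i < total_frames]
    return indices


consecutive = frames_to_analyze(100, 5)           # 5 consecutive frames
spaced = frames_to_analyze(100, 5, stride=3)      # every 3rd frame
near_end = frames_to_analyze(100, 5, stride=3, total_frames=110)
```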
Sub-step S1062: Determine the display position of the link data according to the position information of the content objects in the image frames.
In a specific implementation, sub-step S1062 can be implemented as: determining the blank positions in each image frame according to the position information of the content objects in the image frames; and determining the display position of the link data according to the blank positions in each image frame. This reduces the extent to which the displayed link data blocks content objects.
Specifically, in one case, the information of the preset keyword starts playing at the 20th second of the video. Image recognition is performed on the 5 image frames after the image frame corresponding to the 20th second, and the blank positions in each of these frames are determined. From these, the blank position with the highest overlap rate across the frames can be determined and used as the display position of the link data. Displayed there, the link data does not block the target content object in the image frames, or blocks it as little as possible, which improves blending.
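One possible reading of "the blank position with the highest overlap rate" is sketched below: a few candidate slots are tested against the object bounding boxes recognized in each analyzed frame, and the slot that is free in the largest share of frames is chosen. The candidate slots, the `(x, y, width, height)` box format, and the function names are assumptions; the embodiment does not prescribe a particular computation.

```python
def intersects(a, b):
    """True if two (x, y, width, height) rectangles overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def pick_display_slot(frames, slots):
    """Choose the candidate slot that is blank in the most frames.

    frames: one list of object boxes per analyzed frame.
    slots:  candidate display positions, name -> rectangle.
    Returns (slot name, share of frames in which the slot is blank).
    """
    if not frames or not slots:
        return None, 0.0
    best_name, best_free = None, -1
    for name, rect in slots.items():
        free = sum(1 for boxes in frames
                   if not any(intersects(rect, box) for box in boxes))
        if free > best_free:
            best_name, best_free = name, free
    return best_name, best_free / len(frames)


slots = {"top_left": (0, 0, 200, 60), "bottom_right": (440, 300, 200, 60)}
frames = [
    [(100, 20, 200, 200)],   # object covers the top-left slot
    [(300, 100, 200, 250)],  # object covers the bottom-right slot
    [(50, 10, 100, 100)],    # object covers the top-left slot
]
slot, rate = pick_display_slot(frames, slots)
```

Here the bottom-right slot is blank in two of the three frames, so it is selected.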
Of course, the display position can also be determined from the recognition result in other ways. For example, according to the position information of the content objects and a preset layout rule, the target content object corresponding to the link data is identified among the content objects, and a position at a certain distance from the target content object is taken as the display position.
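A layout rule of this kind could, for instance, place the display control a fixed gap to the right of the target content object and fall back to its left side when the frame edge is reached. The sketch below shows only this one hypothetical rule; the gap value, the box format, and the function name are assumptions.

```python
def place_beside(target_box, overlay_size, frame_size, gap=16):
    """Return the top-left corner for an overlay placed next to the target.

    target_box:   (x, y, width, height) of the target content object.
    overlay_size: (width, height) of the display control.
    frame_size:   (width, height) of the video frame.
    """
    x, y, w, h = target_box
    ow, oh = overlay_size
    fw, fh = frame_size
    left = x + w + gap              # preferred: to the right of the target
    if left + ow > fw:
        left = x - gap - ow         # fallback: to the left of the target
    left = max(0, min(left, fw - ow))  # keep the overlay inside the frame
    top = max(0, min(y, fh - oh))
    return left, top


right_side = place_beside((100, 50, 120, 200), (160, 48), (640, 360))
left_side = place_beside((500, 300, 120, 50), (160, 48), (640, 360))
```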
Sub-step S1063: Display the corresponding link data through the display control at the display position.
After the display position is determined, the display control can be shown in different ways depending on its structure.
For example, sub-step S1063 can be implemented as: displaying the display control at the display position, and displaying a first sub-control and a second sub-control within the display control.
The first sub-control displays the text and/or image information corresponding to the target content object indicated by the information of the preset keyword. The second sub-control includes a trigger control corresponding to the link data; when triggered, it causes a jump from the playback interface of the video to the page linked by the link data.
The text and/or image information corresponding to the target content object may be preset, or it may be added by the business owner.
In this way, the information about the target content object and the trigger control for the link data are displayed together, so that the audience can clearly identify the target content object and its corresponding trigger control. Multiple different display controls can thus be shown in one interface. Different display controls can have different functions and be displayed in different ways, which reduces implementation cost.
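The two-part display control can be modeled, as an illustrative sketch only, with one record for the information sub-control and one for the trigger sub-control. The class names, field names, and the `navigate` callback are assumptions standing in for whatever UI toolkit is actually used.

```python
from dataclasses import dataclass


@dataclass
class InfoSubControl:
    """First sub-control: text and/or image for the target content object."""
    text: str
    image_url: str = ""


@dataclass
class TriggerSubControl:
    """Second sub-control: triggers the jump to the linked page."""
    url: str

    def activate(self, navigate):
        # `navigate` is supplied by the host application (assumed callback).
        navigate(self.url)


@dataclass
class DisplayControl:
    position: tuple  # display position chosen in sub-step S1062
    info: InfoSubControl
    jump: TriggerSubControl


visited = []
control = DisplayControl(
    position=(236, 50),
    info=InfoSubControl(text="XXX hand cream"),
    jump=TriggerSubControl(url="https://shop.example/hand-cream"),
)
control.jump.activate(visited.append)  # simulate the audience tapping the trigger
```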
Of course, in other embodiments, if the display control consists of a single control, that control can be displayed directly at the display position; this embodiment places no limit on this.
While the display control is shown, an audience member who wants to learn more about the target content object can operate the display control.
Optionally, in this embodiment, the method further includes:
Step S108: Receive an operation on the displayed link data, and jump from the playback interface of the video to the page linked by the link data according to the operation.
In this embodiment, receiving an operation on the displayed link data indicates that the audience wants to learn more about the target content object or view more information related to it. Therefore, according to the operation, the playback interface of the video jumps to the page linked by the link data, which shows more information about the target content object.
This video data processing method can be applied to any suitable usage scenario. For example, it can be applied to an e-commerce website by adding a video playback window to its home page, product display pages, search result pages, and similar pages. Related link data is displayed during playback, so that the audience can click the link data and jump to the corresponding page, thereby driving traffic to that page.
Of course, besides e-commerce websites, the method can be applied to any other scenario in which video can be played.
The video data processing method is described below with reference to a usage scenario in which a video is played in a web page:
The terminal device obtains web page data over the network from the server side, which may be a server or the cloud (see Fig. 1b for a schematic diagram of the connection between the terminal device and the server side). The web page data includes a video, and the video includes image frame sequence data, audio data, and the information of the preset keyword. The web page data may also include link data.
The interface in which the video is played within the web page is shown as interface 1 of Fig. 2b. The video may be one that introduces a target content object (such as a product).
As shown in interface 2 of Fig. 2b, when the information of the preset keyword is detected (in this usage scenario, when playback reaches the audio clip corresponding to the voice keyword), the text keyword appears in the subtitles, a semi-transparent mask is displayed in the interface, and a semi-transparent first sub-control and second sub-control are displayed within the mask. The first sub-control displays the name of target content object 1 (such as XXX hand cream), and the second sub-control is the trigger control for the link data (such as a trigger button or a trigger pop-up window).
As shown in interface 3 of Fig. 2b, when the audience clicks the second sub-control, the interface jumps to the product introduction page of the target content object, which displays its details (for a hand cream, for example, its appearance, volume, ingredients, and so on).
Optionally, in this embodiment, the aforementioned video may be generated automatically by a video generation tool. For example, material videos are obtained in advance and analyzed to obtain the target content object and preset keywords corresponding to each material video. When a business owner needs a video for promoting a target content object, an algorithm automatically matches the search information entered by the business owner against the material videos, associates the matching material video with the link data provided by the business owner, and generates the video from the material video and the link data.
The image frame sequence data in the generated video may be that of the material video; the audio data may be the audio data of the material video or audio generated automatically from the material copy; and the link data is the link data provided by the business owner. In this way, advertising videos can be produced more accurately.
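The matching step of such a video generation tool is not specified in detail. As a loose sketch under that caveat, the code below scores pre-analyzed material videos by how many search terms coincide with their preset keywords and bundles the best match with the business owner's link data. The scoring rule, field names, and sample data are all assumptions.

```python
def match_material(search_terms, materials):
    """Return the material video sharing the most keywords with the search terms."""
    wanted = {term.lower() for term in search_terms}
    best, best_score = None, 0
    for material in materials:
        score = len(wanted & {k.lower() for k in material["keywords"]})
        if score > best_score:
            best, best_score = material, score
    return best  # None when nothing overlaps


def generate_promo_video(search_terms, materials, link_data):
    """Associate the matched material video with the provided link data."""
    material = match_material(search_terms, materials)
    if material is None:
        return None
    return {
        "frames": material["frames"],      # image frame sequence of the material
        "audio": material["audio"],        # material audio (or generated audio)
        "keywords": material["keywords"],  # preset keywords from prior analysis
        "link_data": link_data,            # supplied by the business owner
    }


materials = [
    {"frames": "cup.frames", "audio": "cup.audio", "keywords": ["cup", "thermos"]},
    {"frames": "cream.frames", "audio": "cream.audio", "keywords": ["hand cream", "skincare"]},
]
video = generate_promo_video(["Hand Cream"], materials, "https://shop.example/hand-cream")
no_match = generate_promo_video(["lipstick"], materials, "https://shop.example/lipstick")
```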
In this embodiment, the video used to promote the target content object includes the information of the preset keyword. During playback, when the information of the preset keyword is detected, the link data corresponding to the target content object indicated by that information is displayed based on the video being played. This not only drives traffic through the video to the page corresponding to the link data, but also supports interaction with the audience and gives the audience a way to learn the details of the target content object.
By determining the display position of the display control intelligently, the link data can be displayed intelligently and blended well into the video.
In addition, the video can be generated automatically, so that business owners (such as advertisers) who are unfamiliar with video production tools or lack video production skills can still produce the videos they need. Link data that blends better is displayed during playback, and its display position changes as the images in the video change. This provides better intelligence and adaptability and can directly improve the conversion rate of the video.
Embodiment 3
Referring to Fig. 3, a schematic flowchart of the steps of a display method according to Embodiment 3 of the present invention is shown.
In this embodiment, the display method is described with the terminal device as the executing entity.
The display method of this embodiment includes the following steps:
Step S300: During video playback, when the information of a preset keyword is detected, display in the video playback interface the link data corresponding to the target content object indicated by the detected information.
The link data can be displayed as described in Embodiment 1 or Embodiment 2 above, so the process is not described again here.
Step S302: Obtain a trigger operation on the link data corresponding to the target content object displayed in the video playback interface.
The link data indicates a page associated with the target content object; it may be a URL (Uniform Resource Locator), an IP address, and so on.
Specifically, in this embodiment, the link data is the link data corresponding to the information of the preset keyword, and its display is triggered when playback reaches that information.
The target content object may be any suitable object, such as a product, a person, or a scenic spot. A product may be tangible or intangible.
The trigger operation may be a click, long press, swipe, double click, or similar operation performed by the audience on the link data (for example, on the display control showing the link data).
Step S304: According to the trigger operation, jump from the video playback interface to the page that is linked by the link data and used to display the target content object.
In one feasible manner, a request to access the page indicated by the link data is generated according to the trigger operation and sent to the corresponding server side, so as to obtain the data of that page and display it.
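Generating the page request from link data might be sketched as follows with Python's standard `urllib.request` module. The request is only constructed here, not sent. Treating link data without a scheme (for example a bare IP address) as plain HTTP is an assumption made for the sketch, as are the function name and the sample addresses.

```python
import urllib.request


def build_page_request(link_data):
    """Build a GET request for the page indicated by the link data.

    Link data given without a scheme, such as an IP address, is assumed
    to be reachable over http:// (an assumption of this sketch).
    """
    url = link_data if "://" in link_data else "http://" + link_data
    return urllib.request.Request(url, headers={"Accept": "text/html"}, method="GET")


from_url = build_page_request("https://shop.example/item/1")
from_ip = build_page_request("203.0.113.7/item/1")
# Sending would be a separate step, e.g. urllib.request.urlopen(from_url).
```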
In this embodiment, when playback reaches the information of a preset keyword, the corresponding link data is displayed for the audience to trigger. If a trigger operation is received, the page corresponding to the link data is displayed, showing the details of the target content object (such as a product) for the audience to view.
Embodiment 4
Referring to Fig. 4a, a schematic flowchart of the steps of a video data processing method according to Embodiment 4 of the present invention is shown.
In this embodiment, the video data processing method is described in the context of a live video sales scenario. In such a scenario, the live broadcaster can recommend and introduce products to viewers through the live broadcast and can also demonstrate the effect of trying them out, thereby selling the products online. The method can be executed by a terminal device acting as the playback end.
The video data processing method of this embodiment includes:
Step S402: Obtain and play a live video stream.
The live video stream may be real-time video obtained from the live broadcast server side, or real-time video obtained directly from the broadcasting end.
The live video stream may be a video in which the live broadcaster introduces products, but it is not limited to this; it may be a video of any other content.
Step S404: During playback of the live video stream, perform content detection on the image frames in the live video stream, and/or perform content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream.
A content object may be a person, object, building, or the like that appears in the image; a person, object, building, or the like indicated by a text keyword that appears in the text or subtitles of an image frame; or a person, object, building, or the like indicated by a voice keyword that appears in the audio.
In a specific implementation, performing content detection on the image frames in the live video stream to obtain the content objects contained in the live video stream can be implemented as: during playback of the live video stream, performing image recognition on a preset position in the image frames of the live video stream, and obtaining, according to the recognition result, the content object indicated by a text keyword in the image frame and/or the content object indicated by the image in the image frame.
For example, a neural network model capable of recognizing the relevant content objects is used to examine the preset position in the image frame and identify the content objects located there.
The preset position may be a position configured by default, such as the entire image frame, or it may be part or all of the image frame selected by the live broadcaster during the broadcast, for example with a selection box. Detecting at a preset position gives more freedom in detection and improves adaptability.
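Resolving the preset position to a concrete region can be sketched as below: with no selection the whole frame is scanned, and a selection box drawn by the live broadcaster is clipped to the frame. Falling back to the whole frame when the selection lies entirely outside it is an assumption of this sketch, as are the function name and the `(x, y, width, height)` format.

```python
def region_to_scan(frame_size, selection=None):
    """Return the (x, y, width, height) region on which to run recognition."""
    fw, fh = frame_size
    if selection is None:
        return (0, 0, fw, fh)       # default: the entire image frame
    x, y, w, h = selection
    x0, y0 = max(0, x), max(0, y)   # clip the selection box to the frame
    x1, y1 = min(fw, x + w), min(fh, y + h)
    if x1 <= x0 or y1 <= y0:
        return (0, 0, fw, fh)       # assumed fallback for an off-frame box
    return (x0, y0, x1 - x0, y1 - y0)


whole = region_to_scan((640, 360))
clipped = region_to_scan((640, 360), selection=(600, 300, 100, 100))
```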
Performing content detection on the audio in the live video stream to obtain the content objects contained in the live video stream includes: during playback of the live video stream, performing audio recognition on the audio in the live video stream, and obtaining the content object indicated by a voice keyword in the audio.
For example, voice keywords such as product names and product models mentioned in the audio can be detected, and the content objects indicated by these voice keywords can then be determined.
By detecting the live video stream through image recognition and/or audio recognition, at least some of the content objects contained in the live video stream can be obtained. It can then be determined whether a target content object is among them, and hence whether link data needs to be displayed.
Step S406: Determine whether corresponding link data exists for the content object.
In one feasible manner, step S406 includes: searching a preset product database to determine whether corresponding link data exists for the content object.
The product database stores product identifiers (for example, product names) and their corresponding link data (purchase links for the products). For a detected content object, the preset keyword corresponding to the content object is matched against the product identifiers to determine whether corresponding link data exists in the product database. If it does, the content object is a target content object, and the link data can be displayed in the playback interface for viewers to operate as needed.
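The database lookup in step S406 can be sketched with an in-memory mapping standing in for the product database. Exact, case-insensitive name matching is an assumption; a real product database and matching rule may differ. The function name and sample entries are hypothetical.

```python
def find_links(detected_objects, product_db):
    """Map each detected content object to its link data, if any exists.

    product_db: product identifier (here, the product name) -> purchase link.
    Objects found in the result are the target content objects.
    """
    index = {name.lower(): link for name, link in product_db.items()}
    return {obj: index[obj.lower()]
            for obj in detected_objects if obj.lower() in index}


product_db = {
    "Brand-X Cup": "https://shop.example/cup",
    "XXX hand cream": "https://shop.example/hand-cream",
}
targets = find_links(["person", "brand-x cup", "table"], product_db)
```

Only the cup has an entry in the database, so only its link data would be displayed.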
In another feasible manner, the live video stream includes the information of preset keywords indicating the target content objects to be recognized, together with the corresponding link data. This information and link data may be chosen by the live broadcaster at the broadcasting end. For example, the broadcasting end provides a settings interface through which the live broadcaster can configure the information of the preset keywords to indicate the target content objects, and configure the corresponding link data. This gives the live broadcaster more autonomy and control over which link data is displayed.
In this case, step S406 can be implemented as: determining whether the detected content objects include a content object that matches a target content object to be recognized, and if so, determining that corresponding link data exists.
For example, the preset keyword corresponding to a detected content object is matched against the preset keywords of the target content objects to be recognized to determine whether a matching content object exists. If one exists, it is determined that corresponding link data exists.
Step S408: Take the content object for which corresponding link data exists as the target content object, and display the link data corresponding to the target content object in the playback interface of the live video stream.
When a target content object is detected, its corresponding link data is displayed, so that viewers can, while watching the live broadcast, operate the displayed link data to jump to the corresponding page and view its content or purchase the product corresponding to the target content object. This enriches the live broadcast function and lets viewers conveniently view information about the target content object.
Besides being displayed at the playback end, the link data can also be displayed synchronously at the broadcasting end, so that the live broadcaster can see promptly whether the link data is displayed and how it appears, making it easier to monitor the effect of the broadcast.
To improve the viewing experience, an animation or similar effect can be used when the link data is displayed to draw attention to it, so that viewers notice the link data more easily.
下面结合一具体使用场景,对直播过程中进行说明:The following describes the live broadcast process in conjunction with a specific usage scenario:
在直播过程中,如图4b中直播端界面1所示,直播主可以通过直播端配置至少一个预设关键词及对应的链接数据,根据配置的预设关键词可以生成用于指示待识别的目标内容对象的预设关键词的信息,并将直播视频流、预设关键词的信息和链接数据发送给播放端。During the live broadcast process, as shown in the live broadcast terminal interface 1 in Figure 4b, the live broadcast host can configure at least one preset keyword and the corresponding link data through the live broadcast terminal. According to the configured preset keywords, a command to indicate to be recognized can be generated. The preset keyword information of the target content object, and the live video stream, the preset keyword information and link data are sent to the playback terminal.
如图4b中播放端界面1所示,其示出了播放端播放直播视频流的界面。在播放直播视频流的过程中,可以对其中的图像帧和/或音频进行内容识别,以确定直播视频流中包含的内容对象。进而查找检测到的内容对象是否存在对应的链接数据,如果存在对应的链接数据,则在播放端的播放界面和直播端的播放界面展示该链接数据,图4b中播放端界面2示出了展示链接数据的界面示意图。As shown in the interface 1 of the player terminal in FIG. 4b, it shows the interface of the player terminal to play the live video stream. In the process of playing the live video stream, content recognition can be performed on the image frames and/or audio therein to determine the content objects contained in the live video stream. Then find out whether the detected content object has corresponding link data. If there is corresponding link data, the link data will be displayed on the playback interface of the player end and the playback interface of the live broadcast end. In Figure 4b, the interface 2 of the player end shows the display link data. Schematic diagram of the interface.
观看者通过操作该链接数据可以跳转到对应的页面中,以查看页面中的内容(如图4b中播放端界面3所示)。The viewer can jump to the corresponding page by operating the link data to view the content on the page (as shown in the player interface 3 in Figure 4b).
Embodiment Five
Referring to FIG. 5, a structural block diagram of a video data processing apparatus according to Embodiment Five of the present invention is shown.
The video data processing apparatus of this embodiment includes: a first acquisition module 502, configured to acquire a video to be played and link data, where the video includes information of a preset keyword and the link data corresponds to the target content object indicated by the preset keyword; a first detection module 504, configured to detect, during playback of the video, the information of the preset keyword in at least some image frames and/or at least some audio data of the played video; and a first display module 506, configured to display the corresponding link data based on the played video if it is determined from the detection result that the information of the preset keyword has been detected.
Optionally, the information of the preset keyword includes at least one of the following: a preset voice keyword and a preset text keyword.
Optionally, the apparatus further includes: an information generation module 500, configured to obtain an identifier corresponding to the target content object and to generate, according to the identifier, the information of the preset keyword corresponding to the target content object, where the identifier includes at least one of the following: the name, model, and category of the target content object.
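As a rough illustration of how preset keyword information might be generated from an identifier, the following Python sketch collects whichever of the name, model, and category are available. The function name and the de-duplication rule are assumptions for illustration; the specification only states that the keyword information is generated according to the identifier.

```python
def build_preset_keywords(name=None, model=None, category=None):
    """Build an ordered, de-duplicated keyword list from the available identifiers.

    At least one of name, model, or category must be provided, mirroring
    "at least one of the following" in the description.
    """
    keywords = []
    for value in (name, model, category):
        if value is None:
            continue
        cleaned = value.strip()
        if cleaned and cleaned not in keywords:
            keywords.append(cleaned)
    if not keywords:
        raise ValueError("at least one identifier is required")
    return keywords


phone_keywords = build_preset_keywords(name="Acme Phone", model="X1")
```

The resulting keywords could then be embedded in the video as voice or text keywords, as described above.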
Optionally, the apparatus further includes: a receiving module 508, configured to receive an operation on the displayed link data and, according to the operation, jump from the playback interface of the video to the page linked by the link data.
Optionally, the first display module 506 is configured to add, to the playback interface of the video, a display control for displaying the corresponding link data, where the display control includes at least one of the following: a floating window, a mask, and a pop-up window.
Optionally, the first display module 506 is configured to: perform, based on the information of the preset keyword being played, image recognition on a preset number of image frames following the current image frame, and determine position information of the content object in the image frames according to the recognition result; determine the display position of the link data according to the position information of the content object in the image frames; and display the corresponding link data at the display position through the display control.
Optionally, when displaying the corresponding link data at the display position through the display control, the first display module 506 is configured to display the display control at the display position and to display a first sub-control and a second sub-control in the display control, where the first sub-control is used to display text and/or image information corresponding to the target content object indicated by the information of the preset keyword, and the second sub-control includes a trigger control corresponding to the link data which, when triggered, causes a jump from the playback interface of the video to the page linked by the link data.
Optionally, when determining the display position of the link data according to the position information of the content object in the image frames, the first display module 506 is configured to: determine the blank position in each image frame according to the position information of the content object in each image frame indicated by the recognition result; and determine the display position of the link data according to the blank positions in the image frames.
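One possible way to derive a display position from the blank areas of several frames is sketched below in Python. The choice of four corner candidates, the (x, y, width, height) box format, and the function names are assumptions for illustration; the specification does not fix how blank positions are computed.

```python
def overlap_area(a, b):
    """Area of intersection of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    width = min(ax + aw, bx + bw) - max(ax, bx)
    height = min(ay + ah, by + bh) - max(ay, by)
    return max(width, 0) * max(height, 0)


def choose_display_position(frame_size, boxes_per_frame, control_size):
    """Pick the corner where the control overlaps recognized objects the least.

    boxes_per_frame: for each recognized frame, the list of content-object boxes.
    Returns the top-left (x, y) of the chosen position.
    """
    frame_w, frame_h = frame_size
    ctrl_w, ctrl_h = control_size
    candidates = [
        (0, 0),                                  # top-left
        (frame_w - ctrl_w, 0),                   # top-right
        (0, frame_h - ctrl_h),                   # bottom-left
        (frame_w - ctrl_w, frame_h - ctrl_h),    # bottom-right
    ]

    def cost(position):
        region = (position[0], position[1], ctrl_w, ctrl_h)
        return sum(
            overlap_area(region, box)
            for boxes in boxes_per_frame
            for box in boxes
        )

    # min() keeps the first candidate among those with equal cost.
    return min(candidates, key=cost)


frames = [[(0, 0, 1000, 600)], [(100, 100, 1000, 600)]]
position = choose_display_position((1920, 1080), frames, (400, 200))
```

Summing the overlap over all recognized frames favors a position that stays clear of the content object for the whole time the control is shown, rather than in a single frame only.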
Optionally, when displaying the corresponding link data based on the played video, the first display module 506 is configured to display the link data for a preset display duration, so that the audience of the target content object can operate on the link data within the preset display duration.
Optionally, when displaying the corresponding link data based on the played video, the first display module 506 is configured to: match, from the copy data entered by the application provider of the target content object, the copy data corresponding to the link data; generate the link data to be displayed according to the link data and the matched copy data; and display the link data to be displayed on the playback interface of the video.
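The combination of link data with matched copy data might look like the following Python sketch. The dictionary keys ("object_id", "url", "text") and the fallback to an empty text are hypothetical; the specification does not define the data layout.

```python
def build_display_link(link_data, copy_entries):
    """Attach the first matching copy text to the link data.

    link_data: dict with hypothetical keys "object_id" and "url".
    copy_entries: iterable of dicts with hypothetical keys "object_id" and "text",
    standing in for the copy data entered by the application provider.
    """
    for entry in copy_entries:
        if entry["object_id"] == link_data["object_id"]:
            return {"url": link_data["url"], "text": entry["text"]}
    # No copy matched: fall back to showing the bare link (an assumption).
    return {"url": link_data["url"], "text": ""}


copy_data = [
    {"object_id": "p-1", "text": "Limited offer today"},
    {"object_id": "p-2", "text": "New arrival"},
]
link = {"object_id": "p-2", "url": "https://example.com/item/2"}
display_item = build_display_link(link, copy_data)
```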
The video data processing apparatus of this embodiment is used to implement the corresponding video data processing methods in the foregoing method embodiments and has the beneficial effects of those method embodiments, which are not repeated here. In addition, for the functional implementation of each module in the video data processing apparatus of this embodiment, reference may be made to the description of the corresponding parts of the foregoing method embodiments, which is likewise not repeated here.
Embodiment Six
Referring to FIG. 6, a structural block diagram of a display apparatus according to Embodiment Six of the present invention is shown.
The display apparatus of this embodiment includes: a link data display module 600, configured to display in the video playback interface, when information of a preset keyword is detected during video playback, the link data corresponding to the target content object indicated by the detected information of the preset keyword; a trigger acquisition module 602, configured to acquire a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and an interface jump module 604, configured to jump, according to the trigger operation, from the video playback interface to the page that is linked by the link data and used to display the target content object.
The display apparatus of this embodiment is used to implement the corresponding display methods in the foregoing method embodiments and has the beneficial effects of those method embodiments, which are not repeated here. In addition, for the functional implementation of each module in the display apparatus of this embodiment, reference may be made to the description of the corresponding parts of the foregoing method embodiments, which is likewise not repeated here.
Embodiment Seven
Referring to FIG. 7, a structural block diagram of a video data processing apparatus according to Embodiment Seven of the present invention is shown.
The video data processing apparatus of this embodiment includes: a second acquisition module 702, configured to acquire and play a live video stream; a second detection module 704, configured to perform, during playback of the live video stream, content detection on the image frames in the live video stream and/or content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream; a matching module 706, configured to check whether corresponding link data exists for the content objects; and a second display module 708, configured to take the content object for which corresponding link data exists as the target content object and to display the link data corresponding to the target content object in the playback interface of the live video stream.
Optionally, the matching module 706 is specifically configured to search a preset product database to determine whether corresponding link data exists for the content object.
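A lookup against a preset product database can be sketched as follows, using Python's built-in sqlite3 module with an in-memory database. The table name, column names, and sample rows are assumptions for illustration; the specification does not describe the database schema.

```python
import sqlite3


def open_demo_database():
    """Create an in-memory stand-in for the preset product database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (keyword TEXT PRIMARY KEY, link TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO products (keyword, link) VALUES (?, ?)",
        [
            ("lipstick", "https://example.com/products/lipstick"),
            ("headphones", "https://example.com/products/headphones"),
        ],
    )
    return conn


def lookup_link(conn, content_object):
    """Return the link for a detected content object, or None if none exists."""
    row = conn.execute(
        "SELECT link FROM products WHERE keyword = ?",
        (content_object.strip().lower(),),
    ).fetchone()
    return row[0] if row else None


conn = open_demo_database()
found = lookup_link(conn, "Lipstick")
missing = lookup_link(conn, "sofa")
```

A return value of None corresponds to a content object for which no link data exists, in which case nothing is displayed for that object.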
Optionally, the live video stream includes information indicating the preset keyword of the target content object to be identified, together with the corresponding link data; the matching module 706 is specifically configured to determine whether the detected content objects include a content object that matches the target content object to be identified and, if so, to determine that corresponding link data exists.
Optionally, when performing content detection on the image frames in the live video stream during playback of the live video stream to obtain the content objects contained in the live video stream, the second detection module 704 is specifically configured to: perform, during playback of the live video stream, image recognition on a preset position in the image frames of the live video stream, and obtain, according to the recognition result, the content object indicated by a text keyword in the image frames and/or the content object indicated by an image in the image frames.
Optionally, when performing content detection on the audio in the live video stream during playback of the live video stream to obtain the content objects contained in the live video stream, the second detection module 704 is specifically configured to: perform, during playback of the live video stream, audio recognition on the audio in the live video stream, and obtain the content object indicated by a voice keyword in the audio.
The video data processing apparatus of this embodiment is used to implement the corresponding video data processing methods in the foregoing method embodiments and has the beneficial effects of those method embodiments, which are not repeated here. In addition, for the functional implementation of each module in the video data processing apparatus of this embodiment, reference may be made to the description of the corresponding parts of the foregoing method embodiments, which is likewise not repeated here.
Embodiment Eight
Referring to FIG. 8, a schematic structural diagram of an electronic device according to Embodiment Eight of the present invention is shown. The specific embodiments of the present invention do not limit the specific implementation of the electronic device.
As shown in FIG. 8, the electronic device may include: a processor 802, a communications interface 804, a memory 806, and a communication bus 808.
Where:
The processor 802, the communications interface 804, and the memory 806 communicate with one another through the communication bus 808.
The communications interface 804 is used to communicate with other electronic devices, such as terminal devices or servers.
The processor 802 is configured to execute a program 810 and may specifically perform the relevant steps in the foregoing embodiments of the video data processing or display methods.
Specifically, the program 810 may include program code, and the program code includes computer operation instructions.
The processor 802 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the electronic device may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 806 is used to store the program 810. The memory 806 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The program 810 may specifically be used to cause the processor 802 to perform the following operations: acquire a video to be played and link data, where the video includes information of a preset keyword and the link data corresponds to the target content object indicated by the preset keyword; during playback of the video, detect the information of the preset keyword in at least some image frames and/or at least some audio data of the played video; and, if it is determined from the detection result that the information of the preset keyword has been detected, display the corresponding link data based on the played video.
In an optional implementation, the information of the preset keyword includes at least one of the following: a preset voice keyword and a preset text keyword.
In an optional implementation, the program 810 is further used to cause the processor 802 to obtain an identifier corresponding to the target content object and to generate, according to the identifier, the information of the preset keyword corresponding to the target content object, where the identifier includes at least one of the following: the name, model, and category of the target content object.
In an optional implementation, the program 810 is further used to cause the processor 802 to receive an operation on the displayed link data and, according to the operation, jump from the playback interface of the video to the page linked by the link data.
In an optional implementation, the program 810 is further used to cause the processor 802, when displaying the corresponding link data based on the played video, to add to the playback interface of the video a display control for displaying the corresponding link data, where the display control includes at least one of the following: a floating window, a mask, and a pop-up window.
In an optional implementation, the program 810 is further used to cause the processor 802, when adding to the playback interface of the video a display control for displaying the corresponding link data, to: perform, based on the information of the preset keyword being played, image recognition on a preset number of image frames following the current image frame, and determine position information of the content object in the image frames according to the recognition result; determine the display position of the link data according to the position information of the content object in the image frames; and display the corresponding link data at the display position through the display control.
In an optional implementation, the program 810 is further used to cause the processor 802, when displaying the corresponding link data at the display position through the display control, to display the display control at the display position and to display a first sub-control and a second sub-control in the display control, where the first sub-control is used to display text and/or image information corresponding to the target content object indicated by the information of the preset keyword, and the second sub-control includes a trigger control corresponding to the link data which, when triggered, causes a jump from the playback interface of the video to the page linked by the link data.
In an optional implementation, the program 810 is further used to cause the processor 802, when determining the display position of the link data according to the position information of the content object in the image frames, to: determine the blank position in each image frame according to the position information of the content object in each image frame indicated by the recognition result; and determine the display position of the link data according to the blank positions in the image frames.
In an optional implementation, the program 810 is further used to cause the processor 802, when displaying the corresponding link data based on the played video, to display the link data for a preset display duration, so that the audience of the target content object can operate on the link data within the preset display duration.
In an optional implementation, the program 810 is further used to cause the processor 802, when displaying the corresponding link data based on the played video, to: match, from the copy data entered by the application provider of the target content object, the copy data corresponding to the link data; generate the link data to be displayed according to the link data and the matched copy data; and display the link data to be displayed on the playback interface of the video.
Alternatively, the program 810 may specifically be used to cause the processor 802 to perform the following operations: during video playback, when information of a preset keyword is detected, display in the video playback interface the link data corresponding to the target content object indicated by the detected information of the preset keyword; acquire a trigger operation on the link data corresponding to the target content object displayed in the video playback interface; and, according to the trigger operation, jump from the video playback interface to the page that is linked by the link data and used to display the target content object.
Alternatively, the program 810 may specifically be used to cause the processor 802 to perform the following operations: acquire and play a live video stream; during playback of the live video stream, perform content detection on the image frames in the live video stream and/or content detection on the audio in the live video stream, to obtain the content objects contained in the live video stream; check whether corresponding link data exists for the content objects; and take the content object for which corresponding link data exists as the target content object, and display the link data corresponding to the target content object in the playback interface of the live video stream.
In an optional implementation, the program 810 is further used to cause the processor 802, when checking whether corresponding link data exists for the content object, to search a preset product database to determine whether corresponding link data exists for the content object.
In an optional implementation, the live video stream includes information indicating the preset keyword of the target content object to be identified, together with the corresponding link data; the program 810 is further used to cause the processor 802, when checking whether corresponding link data exists for the content object, to determine whether the detected content objects include a content object that matches the target content object to be identified and, if so, to determine that corresponding link data exists.
In an optional implementation, the program 810 is further used to cause the processor 802, when performing content detection on the image frames in the live video stream during playback of the live video stream to obtain the content objects contained in the live video stream, to: perform, during playback of the live video stream, image recognition on a preset position in the image frames of the live video stream, and obtain, according to the recognition result, the content object indicated by a text keyword in the image frames and/or the content object indicated by an image in the image frames.
In an optional implementation, the program 810 is further used to cause the processor 802, when performing content detection on the audio in the live video stream during playback of the live video stream to obtain the content objects contained in the live video stream, to: perform, during playback of the live video stream, audio recognition on the audio in the live video stream, and obtain the content object indicated by a voice keyword in the audio.
For the specific implementation of each step in the program 810, reference may be made to the descriptions of the corresponding steps and units in the foregoing embodiments of the video data processing or display methods, which are not repeated here. Those skilled in the art will clearly understand that, for convenience and brevity of description, for the specific working processes of the devices and modules described above, reference may be made to the corresponding process descriptions in the foregoing method embodiments, which are not repeated here.
It should be noted that, depending on implementation needs, each component/step described in the embodiments of the present invention may be split into more components/steps, and two or more components/steps, or partial operations of components/steps, may be combined into new components/steps to achieve the purpose of the embodiments of the present invention.
The methods according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and to be stored in a local recording medium, so that the methods described herein can be carried out by such software stored on a recording medium, using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes a storage component (for example, RAM, ROM, flash memory, and the like) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the video data processing or display methods described herein are implemented. In addition, when a general-purpose computer accesses code for implementing the video data processing or display methods shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for performing those methods.
A person of ordinary skill in the art will appreciate that the units and method steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the embodiments of the present invention.
The above implementations are only used to illustrate the embodiments of the present invention and are not intended to limit them. A person of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention; therefore, all equivalent technical solutions also fall within the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention shall be defined by the claims.

Claims (21)

  1. 一种视频数据的处理方法,包括:A method for processing video data, including:
    获取待播放的视频以及链接数据,其中,所述视频中包括预设关键词的信息,所述链接数据与所述预设关键词指示的目标内容对象对应;Acquiring a video to be played and link data, where the video includes information about a preset keyword, and the link data corresponds to a target content object indicated by the preset keyword;
    在所述视频的播放过程中,对播放的所述视频的至少部分图像帧和/或至少部分音频数据进行所述预设关键词的信息的检测;During the playing process of the video, perform detection of the preset keyword information on at least part of the image frames and/or at least part of the audio data of the played video;
    若根据检测结果确定检测到所述预设关键词的信息,则基于播放的所述视频显示对应的所述链接数据。If it is determined according to the detection result that the information of the preset keyword is detected, the corresponding link data is displayed based on the played video.
  2. 根据权利要求1所述的方法,其中,所述预设关键词的信息包括下列至少之一:预设的语音关键词和预设的文字关键词。The method according to claim 1, wherein the information of the preset keyword includes at least one of the following: a preset voice keyword and a preset text keyword.
  3. 根据权利要求2所述的方法,其中,所述方法还包括:获取目标内容对象的对应的标识,并根据标识生成与所述目标内容对象对应的所述预设关键词的信息,其中,所述标识包括下列至少之一:目标内容对象的名称、型号和类别。The method according to claim 2, wherein the method further comprises: obtaining a corresponding identification of the target content object, and generating information of the preset keyword corresponding to the target content object according to the identification, wherein The identification includes at least one of the following: the name, model, and category of the target content object.
  4. The method according to any one of claims 1-3, wherein the method further comprises:
    receiving an operation on the displayed link data, and jumping, according to the operation, from the playback interface of the video to the page linked by the link data.
  5. The method according to any one of claims 1-3, wherein displaying the corresponding link data based on the played video comprises:
    adding, on the playback interface of the video, a display control for displaying the corresponding link data, wherein the display control includes at least one of the following: a floating window, a mask, and a pop-up window.
  6. The method according to claim 5, wherein adding, on the playback interface of the video, a display control for displaying the corresponding link data comprises:
    based on the information of the preset keyword being played, performing image recognition on a preset number of image frames following the current image frame, and determining position information of a content object in the image frames according to the recognition result;
    determining a display position of the link data according to the position information of the content object in the image frames;
    displaying the corresponding link data at the display position through the display control.
  7. The method according to claim 6, wherein displaying the corresponding link data at the display position through the display control comprises:
    displaying the display control at the display position, and displaying a first sub-control and a second sub-control in the display control;
    wherein the first sub-control is used to display text and/or image information corresponding to the target content object indicated by the information of the preset keyword; and the second sub-control includes a trigger control corresponding to the link data which, when triggered, causes a jump from the playback interface of the video to the page linked by the link data.
  8. The method according to claim 6, wherein determining the display position of the link data according to the position information of the content object in the image frames comprises:
    determining a blank position in each of the image frames according to the position information of the content object in the image frames;
    determining the display position of the link data according to the blank position in each of the image frames.
  9. The method according to claim 1, wherein displaying the corresponding link data based on the played video comprises:
    displaying the link data for a preset display duration, so that an audience of the target content object can operate on the link data within the preset display duration.
  10. The method according to claim 1, wherein displaying the corresponding link data based on the played video comprises:
    matching, from copy data input by an application provider of the target content object, the copy data corresponding to the link data;
    generating link data to be displayed according to the link data and the matched copy data;
    displaying the link data to be displayed based on the playback interface of the video.
  11. A display method, comprising:
    during video playback, when information of a preset keyword is detected, displaying, in a video playback interface, link data corresponding to a target content object indicated by the detected information of the preset keyword;
    acquiring a trigger operation on the link data corresponding to the target content object displayed in the video playback interface;
    according to the trigger operation, jumping from the video playback interface to a page that is linked by the link data and used for displaying the target content object.
  12. A method for processing video data, comprising:
    acquiring and playing a live video stream;
    during playback of the live video stream, performing content detection on image frames in the live video stream, and/or performing content detection on audio in the live video stream, to obtain content objects contained in the live video stream;
    searching whether corresponding link data exists for the content objects;
    taking a content object for which corresponding link data exists as a target content object, and displaying the link data corresponding to the target content object in a playback interface of the live video stream.
  13. The method according to claim 12, wherein searching whether corresponding link data exists for the content objects comprises:
    searching a preset product database to determine whether corresponding link data exists for the content objects.
  14. The method according to claim 12, wherein the live video stream includes information of a preset keyword indicating a target content object to be identified, and corresponding link data;
    searching whether corresponding link data exists for the content objects comprises:
    determining whether the detected content objects include a content object that matches the target content object to be identified, and if so, determining that corresponding link data exists.
  15. The method according to claim 12, wherein, during playback of the live video stream, performing content detection on the image frames in the live video stream to obtain the content objects contained in the live video stream comprises:
    during playback of the live video stream, performing image recognition on a preset position in the image frames of the live video stream, and obtaining, according to the recognition result, a content object indicated by a text keyword in the image frames and/or a content object indicated by an image in the image frames.
  16. The method according to claim 12, wherein, during playback of the live video stream, performing content detection on the audio in the live video stream to obtain the content objects contained in the live video stream comprises:
    during playback of the live video stream, performing audio recognition on the audio in the live video stream, and obtaining a content object indicated by a voice keyword in the audio.
  17. An apparatus for processing video data, comprising:
    a first acquisition module, configured to acquire a video to be played and link data, wherein the video includes information of a preset keyword, and the link data corresponds to a target content object indicated by the preset keyword;
    a first detection module, configured to detect, during playback of the video, the information of the preset keyword in at least some image frames and/or at least some audio data of the played video;
    a first display module, configured to display, based on the played video, the corresponding link data of the target content object if it is determined from the detection result that the information of the preset keyword has been detected.
  18. A display apparatus, comprising:
    a video playback module, configured to display, in a video playback interface during video playback, when information of a preset keyword is detected, link data corresponding to a target content object indicated by the detected information of the preset keyword;
    a trigger acquisition module, configured to acquire a trigger operation on the link data corresponding to the target content object displayed in the video playback interface;
    an interface jump module, configured to jump, according to the trigger operation, from the video playback interface to a page that is linked by the link data and used for displaying the target content object.
  19. An apparatus for processing video data, comprising:
    a second acquisition module, configured to acquire and play a live video stream;
    a second detection module, configured to perform, during playback of the live video stream, content detection on image frames in the live video stream, and/or content detection on audio in the live video stream, to obtain content objects contained in the live video stream;
    a matching module, configured to search whether corresponding link data exists for the content objects;
    a second display module, configured to take a content object for which corresponding link data exists as a target content object, and to display the link data corresponding to the target content object in a playback interface of the live video stream.
  20. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
    the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the method for processing video data according to any one of claims 1-10, or operations corresponding to the display method according to claim 11, or operations corresponding to the method for processing video data according to any one of claims 12-16.
  21. A computer storage medium storing a computer program, wherein the program, when executed by a processor, implements the method for processing video data according to any one of claims 1-10, or the display method according to claim 11, or the method for processing video data according to any one of claims 12-16.
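
For illustration only, and not as part of the claims, the flow of claims 1 and 9 can be sketched in Python. The sketch assumes that speech recognition and on-screen text recognition have already reduced each slice of the played video to text. The names LinkData, Segment, detect_keyword and schedule_link_display, the case-insensitive substring matching, and the 5-second default display duration are all hypothetical choices made for this example; the claims do not prescribe a particular detection technique.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class LinkData:
    """Link data for one target content object (hypothetical structure)."""
    keyword: str  # preset keyword that indicates the target content object
    title: str    # text shown in the display control
    url: str      # page opened when the viewer triggers the link


@dataclass(frozen=True)
class Segment:
    """One slice of the played video, already reduced to text (assumption)."""
    timestamp: float   # playback position in seconds
    speech: str = ""   # transcript of the audio data
    caption: str = ""  # text recognized in the image frames


def detect_keyword(segment: Segment, links: Iterable[LinkData]) -> Optional[LinkData]:
    """Return the first link whose preset keyword occurs in the segment, if any."""
    text = f"{segment.speech} {segment.caption}".lower()
    for link in links:
        if link.keyword.lower() in text:
            return link
    return None


def schedule_link_display(
    segments: Iterable[Segment],
    links: List[LinkData],
    display_seconds: float = 5.0,
) -> Iterator[Tuple[float, float, LinkData]]:
    """Yield (show_at, hide_at, link) for each detection during playback.

    A link that is still on screen is not shown again until its preset
    display duration has elapsed.
    """
    visible_until: Dict[str, float] = {}
    for segment in segments:
        link = detect_keyword(segment, links)
        if link is None:
            continue
        if segment.timestamp < visible_until.get(link.keyword, float("-inf")):
            continue  # the link for this keyword is already displayed
        hide_at = segment.timestamp + display_seconds
        visible_until[link.keyword] = hide_at
        yield segment.timestamp, hide_at, link
```

A player built along these lines would render each yielded link in a display control (for example a floating window, as in claim 5) between show_at and hide_at, and would navigate to link.url when the viewer triggers it, as in claim 4.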
PCT/CN2020/141337 2019-12-31 2020-12-30 Video data processing and display methods and apparatuses, electronic device, and storage medium WO2021136363A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911412396.X 2019-12-31
CN201911412396.XA CN113129045A (en) 2019-12-31 2019-12-31 Video data processing method, video data display method, video data processing device, video data display device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021136363A1 (en) 2021-07-08

Family

ID=76685949

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141337 WO2021136363A1 (en) 2019-12-31 2020-12-30 Video data processing and display methods and apparatuses, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN113129045A (en)
WO (1) WO2021136363A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115243062A (en) * 2022-06-16 2022-10-25 科大讯飞股份有限公司 Scene display method and device, screen display equipment, electronic equipment and storage medium
WO2023045939A1 (en) * 2021-09-24 2023-03-30 北京沃东天骏信息技术有限公司 Live broadcast processing method, live broadcast platform, storage medium and electronic device
CN116074596A (en) * 2023-03-06 2023-05-05 成都光合信号科技有限公司 Information display method and device
CN116347178A (en) * 2022-12-14 2023-06-27 北京优酷科技有限公司 Video playing method, device and equipment

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN113596496A (en) * 2021-07-28 2021-11-02 广州博冠信息科技有限公司 Interaction control method, device, medium and electronic equipment for virtual live broadcast room
CN115878838A (en) * 2021-09-27 2023-03-31 北京有竹居网络技术有限公司 Video-based information display method and device, electronic equipment and storage medium
CN115334346A (en) * 2022-08-08 2022-11-11 北京达佳互联信息技术有限公司 Interface display method, video publishing method, video editing method and device
CN115720279B (en) * 2022-11-18 2023-09-15 杭州面朝信息科技有限公司 Method and device for showing arbitrary special effects in live broadcast scene

Citations (4)

Publication number Priority date Publication date Assignee Title
CN105187866A (en) * 2015-09-15 2015-12-23 百度在线网络技术(北京)有限公司 Advertisement putting method and apparatus
CN107180055A (en) * 2016-03-11 2017-09-19 阿里巴巴集团控股有限公司 The methods of exhibiting and device of business object
CN108833952A (en) * 2018-06-20 2018-11-16 北京优酷科技有限公司 The advertisement placement method and device of video
US20190251144A1 (en) * 2018-02-13 2019-08-15 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and apparatus for launching application page, and electronic device

Cited By (6)

Publication number Priority date Publication date Assignee Title
WO2023045939A1 (en) * 2021-09-24 2023-03-30 北京沃东天骏信息技术有限公司 Live broadcast processing method, live broadcast platform, storage medium and electronic device
CN115243062A (en) * 2022-06-16 2022-10-25 科大讯飞股份有限公司 Scene display method and device, screen display equipment, electronic equipment and storage medium
CN116347178A (en) * 2022-12-14 2023-06-27 北京优酷科技有限公司 Video playing method, device and equipment
CN116347178B (en) * 2022-12-14 2024-01-30 北京优酷科技有限公司 Video playing method, device and equipment
CN116074596A (en) * 2023-03-06 2023-05-05 成都光合信号科技有限公司 Information display method and device
CN116074596B (en) * 2023-03-06 2024-02-20 成都光合信号科技有限公司 Information display method and device

Also Published As

Publication number Publication date
CN113129045A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
WO2021136363A1 (en) Video data processing and display methods and apparatuses, electronic device, and storage medium
TWI744368B (en) Play processing method, device and equipment
US9930311B2 (en) System and method for annotating a video with advertising information
US10643264B2 (en) Method and computer readable medium for presentation of content items synchronized with media display
CN111436006B (en) Method, device, equipment and storage medium for displaying information on video
US8732766B2 (en) Video object tag creation and processing
US9123061B2 (en) System and method for personalized dynamic web content based on photographic data
US8315423B1 (en) Providing information in an image-based information retrieval system
US20120167146A1 (en) Method and apparatus for providing or utilizing interactive video with tagged objects
CN109155136A (en) The computerized system and method for highlight are detected and rendered automatically from video
US9043828B1 (en) Placing sponsored-content based on images in video content
US20140075274A1 (en) Method for Publishing Composite Media Content and Publishing System to Perform the Method
CN106462874A (en) Methods, systems, and media for presenting commerece information relating to video content
JP2003157288A (en) Method for relating information, terminal equipment, server device, and program
US8156001B1 (en) Facilitating bidding on images
US20170213248A1 (en) Placing sponsored-content associated with an image
US20150317319A1 (en) Enhanced search results associated with a modular search object framework
US20200250369A1 (en) System and method for transposing web content
US20150294370A1 (en) Target Area Based Monetization Using Sensory Feedback
US20170287000A1 (en) Dynamically generating video / animation, in real-time, in a display or electronic advertisement based on user data
JP2008146492A (en) Information providing device, information providing method, and computer program
KR101538593B1 (en) Advertisement insertion apparatus using an image editing in a mobile terminal
JP2015176597A (en) Method for publishing composite media content and publishing system to perform the method
EP2919179A1 (en) Method for publishing composite media content and publishing system to perform the method
CN116886948A (en) Information display method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20911088; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20911088; Country of ref document: EP; Kind code of ref document: A1)