WO2020078215A1 - Video-based information acquisition method and device - Google Patents

Video-based information acquisition method and device

Info

Publication number
WO2020078215A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
terminal device
video
information
server
Prior art date
Application number
PCT/CN2019/109446
Other languages
English (en)
French (fr)
Inventor
王群
董维山
马春洋
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司
Priority to JP2020547082A (patent JP7231638B2, ja)
Priority to EP19874167.0A (patent EP3869810A4, en)
Priority to KR1020207024019A (patent KR102370699B1, ko)
Publication of WO2020078215A1 (zh)
Priority to US17/013,686 (patent US20200404378A1, en)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/426Internal components of the client ; Characteristics thereof
    • H04N21/42653Internal components of the client ; Characteristics thereof for processing graphics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/239Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N21/2393Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/414Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41407Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4316Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4622Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4722End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/482End-user interface for program selection
    • H04N21/4826End-user interface for program selection using recommendation lists, e.g. of programs or channels sorted out according to their score
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/482End-user interface for program selection
    • H04N21/4828End-user interface for program selection for searching program descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8126Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
    • H04N21/8133Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/44Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/445Receiver circuitry for the reception of television signals according to analogue transmission standards for displaying additional information

Definitions

  • the present application relates to the field of video technology, and in particular, to a video-based information acquisition method and device.
  • the user can only interrupt the current video playback to query through a search engine or the like, or use another device to query, so the operation is cumbersome and time-consuming.
  • the user may face the problem of not knowing what to query. For example, the user may be interested in a person in the video, but does not know who the person is, so that he cannot enter accurate keywords in the search engine to search.
  • the present application provides a method and device for acquiring information based on video, which can actively recommend related content of the subject in the video to the user without user triggering, thereby improving user experience.
  • the first aspect of this application provides a video-based information acquisition method, including:
  • the terminal device detects the subject in the currently playing video picture;
  • the terminal device intercepts the image of the subject from the video picture;
  • the terminal device obtains the relevant information of the subject according to the image of the subject;
  • the terminal device displays related information of the subject and the video picture on the same screen.
  • by actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display, the terminal device can recommend related content of the subject in the video to the user without any operation by the user, which improves the user experience.
  • the terminal device acquiring the relevant information of the subject according to the image of the subject includes:
  • the terminal device sends the image of the subject to a server, so that the server recognizes the subject according to the image of the subject;
  • the terminal device receives information about the subject sent by the server.
  • before the terminal device receives the relevant information of the subject sent by the server, the method further includes:
  • the terminal device receives the identification result of the subject sent by the server;
  • the terminal device determines whether the subject has been detected according to the recognition result;
  • when the subject has not been detected, the terminal device sends a data request to the server, and the data request is used to request information about the subject.
  • the terminal device judges, according to the recognition result sent by the server, whether the subject has already been detected, and if so ends the search recommendation process. This avoids repeatedly recommending related content of the same subject to the user, which improves the user experience, and avoids wasting resources on repeated requests to the server for the same content.
  • the terminal device acquiring the relevant information of the subject according to the image of the subject includes:
  • the terminal device recognizes the subject according to the image of the subject to obtain a recognition result;
  • the terminal device determines whether the subject has been detected according to the recognition result;
  • when the subject has not been detected, the terminal device sends a data request to the server, and the data request is used to request information about the subject;
  • the terminal device receives information about the subject sent by the server.
  • the terminal device recognizes the subject itself and judges, according to the recognition result, whether the subject has already been detected, and if so ends the search recommendation process. This avoids repeatedly recommending related content of the same subject to the user, which improves the user experience, and avoids wasting resources on repeated requests to the server for the same content.
  • the method further includes:
  • the terminal device displays prompt information on the screen, and the prompt information indicates that the related information shown on the screen is the related information of the subject.
  • the terminal device displays related information of the subject and the video screen on the same screen, including:
  • the terminal device superimposes and displays the relevant content of the subject on a preset position of the video content, and the display window of the relevant content of the subject is less than half of the display window of the video.
  • the related content of the subject and the video content can be better merged together to bring the user a better experience.
  • the terminal device displaying related information of the subject and the video screen on the same screen includes:
  • the terminal device displays the content of the subject in a preset area outside the display window of the video.
  • the terminal device detects the main body in the played video frame, including:
  • the terminal device detects the outline of the detection object in the video picture
  • the terminal device determines the subject according to the outline of the detection object in the video picture.
  • a second aspect of the present application provides a video-based information acquisition device, including:
  • the detection module is used to detect the main body in the video picture currently playing on the terminal device;
  • An interception module used to intercept the image of the subject from the video picture
  • An obtaining module configured to obtain related information of the subject according to the image of the subject
  • the display module is used for displaying related information of the subject and the video picture on the same screen.
  • the acquisition module is specifically used to:
  • before receiving the relevant information of the subject sent by the server, the acquisition module is also used to:
  • a data request is sent to the server, and the data request is used to request to obtain relevant information of the subject.
  • the acquisition module is specifically used to:
  • a data request is sent to the server, and the data request is used to request to obtain relevant information of the subject;
  • the display module is further configured to: display prompt information on the screen, and the prompt information is used to prompt the related information on the screen to be related information of the subject.
  • the display module is specifically used for:
  • the relevant content of the subject is superimposed and displayed on a preset position of the video content, and the display window of the relevant content of the subject is smaller than half of the display window of the video.
  • the display module is specifically used for:
  • the content of the main body is displayed in a preset area outside the display window of the video.
  • the detection module is specifically used to:
  • the subject is determined according to the outline of the detection object in the video picture.
  • a third aspect of the present application provides a terminal device, including a processor, a memory, and a transceiver. The memory is used to store instructions, the transceiver is used to communicate with other devices, and the processor is used to execute the instructions stored in the memory, so that the terminal device performs the method described in the first aspect of the present application.
  • a fourth aspect of the present application provides a computer-readable storage medium that stores instructions, and when the instructions are executed, causes the computer to execute the method according to the first aspect of the application.
  • the terminal device detects the subject in the currently playing video picture, intercepts the image of the subject from the video picture, obtains the related information of the subject according to the image of the subject, and displays the related information of the subject and the video picture on the same screen.
  • by actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display, the terminal device can recommend related content of the subject in the video to the user without any operation by the user, which improves the user experience.
  • FIG. 1 is a schematic diagram of a network architecture applicable to this application.
  • FIG. 2 is a flowchart of a video-based information acquisition method provided in Embodiment 1 of the present application;
  • FIG. 3 is a schematic diagram of displaying related information of a video and a subject
  • FIG. 4 is another schematic diagram of displaying related information of the video and the main body
  • FIG. 5 is a signaling flowchart of a video-based information acquisition method provided in Embodiment 2 of the present application.
  • FIG. 6 is a schematic structural diagram of a video-based information acquisition device provided in Embodiment 3 of the present application.
  • FIG. 7 is a schematic structural diagram of a terminal device according to Embodiment 4 of the present application.
  • FIG. 1 is a schematic diagram of a network architecture applicable to this application.
  • the network architecture includes at least one terminal device 11 and at least one server 12.
  • the terminal device 11 can play a video through an installed video player or a browser.
  • the terminal device 11 is also called a terminal (terminal), user equipment (UE), access terminal, user unit, mobile device, user terminal, wireless communication device, user agent, or user device.
  • the terminal device may be a personal digital assistant (PDA), a smart TV, a handheld device with a wireless communication function (such as a smartphone or a tablet), a computing device (such as a personal computer (PC)), in-vehicle equipment, or a wearable device.
  • the server 12 can be used for image recognition. Image features of a large number of people, objects, landscapes, etc. are pre-stored on the server 12, so that the feature parameters of an image sent by the terminal device can later be matched against the pre-stored image features to identify the people, objects, or landscapes in the image. The server 12 can also be used to generate related content of the subject of the image.
  • the server 12 can store related content of people, objects, landscapes, etc., and the related content of people, objects, landscapes, etc. can also be stored on other servers.
  • FIG. 2 is a flowchart of a video-based information acquisition method provided in Embodiment 1 of the present application. As shown in FIG. 2, the method in this embodiment includes the following steps:
  • Step S101 The terminal device detects the main body in the currently playing video picture.
  • the terminal device can play the video through the installed video player or the browser, and the video can be a TV series, movie, or other program.
  • the terminal device may periodically detect the main body in the currently playing video picture, for example, every 5 minutes.
  • there are on and off buttons for the search recommendation function on the video playback page. If the user turns on the search recommendation function, the terminal device periodically detects the subject in the currently playing video picture; if the user does not turn on the search recommendation function, the subject in the currently playing video picture is not detected.
  • users can also turn on or off the search recommendation function at any time according to their needs. For example, when a user sees an actor that he does not know, the search recommendation function is turned on, and after obtaining relevant information of the actor, the search recommendation function is turned off.
  • the subject in the video picture can be a person, such as a character in a TV series or a contestant in a competition; the subject can also be an object, such as a car, a home appliance, or a building; the subject can also be a scenic spot.
  • different detection objects may have a priority order.
  • when detecting the subject in the video picture, the terminal device selects the detection objects with the highest priority as candidate objects and determines the subject from among the candidates.
  • for example, people have the highest priority, followed by objects, and finally landscapes.
  • accordingly, when people are present in the video picture, the terminal device selects people as the candidate objects.
  • the terminal device detects the outlines of the detection objects in the video picture and determines the subject according to those outlines. For example, it can first identify the people in the video picture from the outlines; when multiple people are identified, it determines from the outlines whether each person's face is shown from the front, the side, or the back.
  • if any person's face is shown from the front, the people shown from the side or the back are excluded. If only one person's face is shown from the front, that person is determined to be the subject of the video picture. If several people's faces are shown from the front, one of the front-facing people can be selected as the subject, or the person located in the middle of the picture can be selected, or the person with the largest outline area can be selected.
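The selection rules above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the `Detection` record, its `kind`, `orientation`, and `area` fields, and the `select_subject` function are hypothetical names, and the tie-break used here (largest outline area) is only one of the options the text lists. Falling back to all people when nobody faces front is an assumption the source does not address.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower number means higher priority: people, then objects, then landscapes.
PRIORITY = {"person": 0, "object": 1, "landscape": 2}


@dataclass
class Detection:
    """Hypothetical record for one detected outline in the video picture."""
    kind: str          # "person", "object", or "landscape"
    orientation: str   # "front", "side", or "back"; only meaningful for people
    area: float        # outline area in pixels


def select_subject(detections: List[Detection]) -> Optional[Detection]:
    """Pick one subject using priority, then face orientation, then area."""
    if not detections:
        return None
    # Keep only the detections of the highest-priority kind present.
    best = min(PRIORITY[d.kind] for d in detections)
    candidates = [d for d in detections if PRIORITY[d.kind] == best]
    if candidates[0].kind == "person":
        # Exclude side and back views when at least one face is frontal.
        # Assumption: if no face is frontal, all people remain candidates.
        frontal = [d for d in candidates if d.orientation == "front"]
        if frontal:
            candidates = frontal
    # Tie-break by largest outline area (one of the options in the text).
    return max(candidates, key=lambda d: d.area)
```

Choosing the person nearest the center of the picture instead would only change the final `max` key, given a bounding-box field on `Detection`.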
  • Step S102 The terminal device intercepts the image of the subject from the video screen.
  • the terminal device can capture one or more images of the subject.
  • the terminal device can take a screenshot of the entire video screen and then crop the screenshot to obtain the image of the subject.
  • when the subject is a person, the captured image of the subject must include the person's face.
  • the terminal device can also capture only the image of the subject, without taking a screenshot of the entire video screen.
  • Step S103 The terminal device acquires the relevant information of the subject according to the image of the subject.
  • the terminal device sends the image of the subject to the server, so that the server recognizes the subject according to the image of the subject, and the terminal device receives the relevant information of the subject sent by the server.
  • after receiving the subject's image, the server obtains the feature parameters of the subject's image.
  • the feature parameters may include any one of the following parameters or a combination thereof: color feature, shape feature, and texture feature.
  • the server may obtain the characteristic parameters of the image of the subject through at least one method of horizontal and vertical projection, edge detection result, shape analysis, or color analysis.
  • the server matches the feature parameters of the subject's image with the feature parameters of a large number of template images saved locally or in the database.
  • the subject in each template image is known. If the feature parameters of the subject's image successfully match those of a template image, the subject can be identified. For example, the feature parameters of a large number of celebrity images are stored locally on the server or in the database, and the subject can be identified as a particular celebrity through matching.
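As a rough illustration of template matching on feature parameters, the sketch below compares a feature vector with stored template vectors and accepts the best match above a threshold. The source does not specify the similarity measure or threshold; cosine similarity, the value `0.9`, and the names `cosine_similarity` and `identify_subject` are all assumptions made for this example.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_subject(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # assumed acceptance threshold, not from the source
) -> Optional[str]:
    """Return the name of the best-matching template, or None if none matches."""
    best_name: Optional[str] = None
    best_score = threshold
    for name, template in templates.items():
        score = cosine_similarity(features, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

In practice the vectors would be built from the color, shape, and texture features mentioned above; here they are plain lists of numbers.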
  • the server further queries related information of the subject.
  • the related information of the subject may be a brief introduction of the subject (such as Baidu Encyclopedia content), the latest news of the subject, or other related videos of the subject.
  • the server sends the identification result of the subject to the terminal device.
  • the identification result of the subject includes the name of the subject and may also include a brief description of the subject. For example, when the subject is a person, the identification result may include the person's name and may also include gender, occupation, and age.
  • the terminal device receives the recognition result of the subject sent by the server and determines, according to the recognition result, whether the subject has been detected. Each time the terminal device recognizes a new subject, it saves that subject's recognition result. Subsequently, when a recognition result is received, the terminal device checks whether that result has already been saved: if it has, the subject has been detected; if it has not, the subject has not been detected.
  • the terminal device sends a data request to the server.
  • the data request is used to request information about the subject.
  • the data request may include a keyword of the subject.
  • the keyword may be a person's name, gender, or occupation, or an object's name or attribute, etc.
  • the server queries related content of the subject according to the subject's keywords and sends it to the terminal device. If the subject has been detected, the search recommendation process is ended.
  • in another implementation, the terminal device recognizes the subject based on the subject's image to obtain a recognition result, and determines whether the subject has been detected according to the recognition result. If the subject has not been detected, the terminal device sends a data request to the server; the data request is used to request the relevant information of the subject, and the server sends the relevant information of the subject to the terminal device. Unlike the previous method, the terminal device itself recognizes the subject in this method, and the recognition method used by the terminal device may be the same as that used by the server.
  • the terminal device determines whether the subject has been detected according to the recognition result, to avoid repeatedly recommending the relevant content of the same subject to the user, which improves the user experience and avoids wasting resources by repeatedly requesting the same content from the server.
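The duplicate check described in step S103 can be sketched as follows. This is a simplified model under stated assumptions: `RecommendationSession` and `on_recognition_result` are hypothetical names, the subject's name stands in for the full recognition result, and the `request_info` callable stands in for the data request sent to the server.

```python
from typing import Callable, Optional, Set


class RecommendationSession:
    """Tracks which subjects were already recommended during playback."""

    def __init__(self, request_info: Callable[[str], str]) -> None:
        # request_info is a stand-in for the data request to the server.
        self._request_info = request_info
        self._seen: Set[str] = set()

    def on_recognition_result(self, subject_name: str) -> Optional[str]:
        """Return related info for a new subject, or None for a repeat."""
        if subject_name in self._seen:
            # Subject already detected: end the search recommendation flow.
            return None
        # Save the new recognition result, then request related information.
        self._seen.add(subject_name)
        return self._request_info(subject_name)
```

The same check applies whether the recognition result comes from the server or from recognition performed on the terminal device; only the source of `subject_name` differs.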
  • Step S104 The terminal device displays the related information of the subject and the video picture on the same screen.
  • the terminal device can display the relevant content of the main body and the video screen on the same screen according to the pre-designed template style.
  • the terminal device superimposes and displays the relevant content of the subject on the preset position of the video content, and the display window of the relevant content of the subject is smaller than half of the display window of the video.
  • the preset position may be the upper-right corner, the lower-right corner, the upper-left corner, or the lower-left corner of the display window of the video, so as to prevent the display window of the relevant content of the subject from blocking the video and affecting the user to watch the video.
  • the display window of the relevant content of the subject should not be too large to avoid blocking the video and affecting the user to watch the video.
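One way to compute such an overlay window is sketched below. The function name `overlay_rect`, the corner labels, and the default per-side `scale` of 0.3 are assumptions for illustration; the only constraint taken from the text is that the overlay must be smaller than half of the video display window, which is checked here by area.

```python
from typing import Tuple

CORNERS = {"top_left", "top_right", "bottom_left", "bottom_right"}


def overlay_rect(
    video_w: int,
    video_h: int,
    corner: str = "top_right",
    scale: float = 0.3,  # assumed per-side scale; area fraction is scale squared
) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the overlay inside the video window."""
    if corner not in CORNERS:
        raise ValueError(f"unknown corner: {corner}")
    # Enforce the rule that the overlay is smaller than half of the video window.
    if scale <= 0 or scale * scale >= 0.5:
        raise ValueError("overlay must be smaller than half of the video window")
    w = int(video_w * scale)
    h = int(video_h * scale)
    x = 0 if corner.endswith("left") else video_w - w
    y = 0 if corner.startswith("top") else video_h - h
    return x, y, w, h
```

For a 1920x1080 video window, `overlay_rect(1920, 1080, "top_right", 0.25)` gives a 480x270 window anchored at the upper-right corner.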
  • FIG. 3 is a schematic diagram of displaying related information of a video and a subject. As shown in FIG. 3, the display window of the related information of the subject is located in the upper right corner of the display window of the video.
  • the size of the display window of the relevant content of the subject can be adjusted, and the position can also be moved.
  • the user can move the display window of the relevant content of the subject according to his own needs, and adjust the size of the display window.
  • the shape of the display window of the relevant content of the main body may be rectangular, circular, or polygonal. In order to increase the interest, it may also be an animal outline shape, which is not limited in this embodiment.
  • the display window of the relevant content of the main body may also be displayed in a translucent manner.
  • FIG. 4 is another schematic diagram of displaying related information of the video and the main body. As shown in FIG. 4, the display window of the related information of the main body is located below the display window of the video.
  • the terminal device displays prompt information on the screen, and the prompt information is used to prompt that the related information on the screen is related information of the subject.
  • the prompt information may be text, for example, prompting the relevant information in text beside the main body to belong to the main body.
  • the prompt information may also be a graphic, for example, the main body is framed by a dotted frame, or the main body is pointed by a floating arrow.
  • the terminal device detects the main body in the currently playing video screen, intercepts the main body image from the video screen, obtains the main body related information according to the main body image, and displays the main body related information and the video screen on the same screen.
  • the terminal device can proactively recommend the relevant content of the main body in the video to the user by actively detecting the main body in the video screen and triggering to obtain the relevant information of the main body to display to the user, without any operation by the user, which improves the user experience.
  • FIG. 5 is a signaling flowchart of the video-based information acquisition method provided in Embodiment 2 of the present application.
  • This embodiment is described using image recognition performed by the server as an example.
  • As shown in FIG. 5, the method provided in this embodiment includes the following steps:
  • Step S201: The terminal device detects the subject in the currently playing video picture.
  • Step S202: The terminal device captures the image of the subject from the video picture.
  • Step S203: The terminal device sends the image of the subject to the server.
  • Step S204: The server recognizes the subject according to the image of the subject and obtains a recognition result.
  • Step S205: The server sends the recognition result of the subject to the terminal device.
  • Step S206: The terminal device determines from the recognition result whether the subject has already been detected.
  • If the subject has not been detected, step S207 is executed; if the subject has been detected, the flow ends.
  • Step S207: The terminal device sends a data request to the server, where the data request is used to request the related information of the subject.
  • Step S208: The server queries the related information of the subject according to the data request.
  • Step S209: The server sends the related information of the subject to the terminal device.
  • Step S210: The terminal device displays the related information of the subject and the video picture on the same screen.
  • FIG. 6 is a schematic structural diagram of the video-based information acquisition device provided in Embodiment 3 of the present application.
  • The device may be integrated in a terminal device. As shown in FIG. 6, the device includes:
  • a detection module 21, configured to detect the subject in the video picture currently playing on the terminal device;
  • a capture module 22, configured to capture the image of the subject from the video picture;
  • an obtaining module 23, configured to obtain the related information of the subject according to the image of the subject;
  • a display module 24, configured to display the related information of the subject and the video picture on the same screen.
  • In an exemplary manner, the obtaining module 23 is specifically configured to: send the image of the subject to a server, so that the server recognizes the subject according to the image of the subject; and receive the related information of the subject sent by the server.
  • In an exemplary manner, before receiving the related information of the subject sent by the server, the obtaining module 23 is further configured to: receive the recognition result of the subject sent by the server; determine from the recognition result whether the subject has already been detected; and,
  • if the subject has not been detected, send a data request to the server, where the data request is used to request the related information of the subject.
  • In another exemplary manner, the obtaining module 23 is specifically configured to: recognize the subject according to the image of the subject to obtain a recognition result; determine from the recognition result whether the subject has already been detected;
  • if the subject has not been detected, send a data request to the server, where the data request is used to request the related information of the subject; and receive the related information of the subject sent by the server.
  • In an exemplary manner, the display module 24 is further configured to display prompt information on the screen, where the prompt information indicates that the related information on the screen belongs to the subject.
  • In an exemplary manner, the display module 24 is specifically configured to:
  • superimpose the related content of the subject at a preset position on the video content, where the display window of the related content is smaller than half of the display window of the video.
  • In another exemplary manner, the display module 24 is specifically configured to:
  • display the content of the subject in a preset area outside the display window of the video.
  • In an exemplary manner, the detection module 21 is specifically configured to:
  • detect the contours of the detection objects in the video picture, and determine the subject according to those contours.
  • The device provided in this embodiment may be used to execute the method performed by the terminal device in Embodiment 1 and Embodiment 2.
  • The specific implementation and technical effects are similar and are not repeated here.
  • FIG. 7 is a schematic structural diagram of the terminal device provided in Embodiment 4 of the present application. As shown in FIG. 7, the terminal device provided in this embodiment includes a processor 31, a memory 32, and a transceiver 33.
  • The memory 32 is used to store instructions.
  • The transceiver 33 is used to communicate with other devices, and the processor 31 is used to execute the instructions stored in the memory 32, so that the terminal device executes the method described in Embodiment 1 or Embodiment 2, which is not repeated here.
  • The processor 31 may be a microcontroller unit (MCU), also called a single-chip microcomputer. The processor 31 may also be a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, or a discrete gate or transistor logic device.
  • The memory 32 may be a storage medium that is mature in the art, such as random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory, electrically erasable programmable memory, or registers.
  • Embodiment 5 of the present application provides a computer-readable storage medium that stores instructions; when the instructions are executed, a computer is caused to execute the method performed by the terminal device in Embodiment 1 or Embodiment 2.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

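The present application provides a video-based information acquisition method and device. The method includes: a terminal device detects a subject in a currently playing video picture, captures an image of the subject from the video picture, obtains related information of the subject according to the image of the subject, and displays the related information of the subject and the video picture on the same screen. By actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display to the user, the terminal device can proactively recommend content related to the subject in the video without any user operation, which improves the user experience.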

Description

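Video-based information acquisition method and device
This application claims priority to Chinese patent application No. 2018112151335, filed with the Chinese Patent Office on October 18, 2018 and entitled "Video-based information acquisition method and device", the entire contents of which are incorporated herein by reference.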
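Technical Field
The present application relates to the field of video technologies, and in particular to a video-based information acquisition method and device.
Background
With the spread of smart terminals such as smartphones, tablet computers, smart TVs, and smart home devices, watching videos on smart terminals has become an important way for people to find entertainment or obtain information in daily life. At present, while a video is playing on a smart terminal, the user cannot interact based on the content of the video picture.
If the user becomes interested in a person, an object, or even a scene in the video during playback, the user can only interrupt playback and query through a search engine or the like, or query using another device, which is cumbersome and time-consuming. In addition, the user may not know what to query: for example, the user may be interested in a person in the video but not know who that person is, and therefore cannot enter accurate keywords into a search engine.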
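Summary
The present application provides a video-based information acquisition method and device that can proactively recommend content related to a subject in a video to the user without being triggered by the user, which improves the user experience.
A first aspect of the present application provides a video-based information acquisition method, including:
a terminal device detects a subject in a currently playing video picture;
the terminal device captures an image of the subject from the video picture;
the terminal device obtains related information of the subject according to the image of the subject;
the terminal device displays the related information of the subject and the video picture on the same screen.
By actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display to the user, the terminal device can proactively recommend content related to the subject in the video without any user operation, which improves the user experience.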
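In an exemplary manner, the terminal device obtaining the related information of the subject according to the image of the subject includes:
the terminal device sends the image of the subject to a server, so that the server recognizes the subject according to the image of the subject;
the terminal device receives the related information of the subject sent by the server.
In an exemplary manner, before the terminal device receives the related information of the subject sent by the server, the method further includes:
the terminal device receives the recognition result of the subject sent by the server;
the terminal device determines from the recognition result whether the subject has already been detected;
if the subject has not been detected, the terminal device sends a data request to the server, where the data request is used to request the related information of the subject.
The terminal device determines from the recognition result sent by the server whether the subject has already been detected and, if it has, ends the search-and-recommendation process. This avoids repeatedly recommending the related content of the same subject to the user, improves the user experience, and avoids wasting resources on repeated requests to the server for the same content.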
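In another exemplary manner, the terminal device obtaining the related information of the subject according to the image of the subject includes:
the terminal device recognizes the subject according to the image of the subject to obtain a recognition result;
the terminal device determines from the recognition result whether the subject has already been detected;
if the subject has not been detected, the terminal device sends a data request to a server, where the data request is used to request the related information of the subject;
the terminal device receives the related information of the subject sent by the server.
The terminal device recognizes the subject itself, determines from the recognition result whether the subject has already been detected and, if it has, ends the search-and-recommendation process. This avoids repeatedly recommending the related content of the same subject to the user, improves the user experience, and avoids wasting resources on repeated requests to the server for the same content.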
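In an exemplary manner, the method further includes:
the terminal device displays prompt information on the screen, where the prompt information indicates that the related information on the screen belongs to the subject.
In an exemplary manner, the terminal device displaying the related information of the subject and the video picture on the same screen includes:
the terminal device superimposes the related content of the subject at a preset position on the video content, where the display window of the related content is smaller than half of the display window of the video.
Superimposing the related content of the subject on the video content blends the two together well and gives the user a better experience.
In another exemplary manner, the terminal device displaying the related information of the subject and the video picture on the same screen includes:
the terminal device displays the content of the subject in a preset area outside the display window of the video.
In an exemplary manner, the terminal device detecting the subject in the playing video picture includes:
the terminal device detects the contours of the detection objects in the video picture;
the terminal device determines the subject according to the contours of the detection objects in the video picture.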
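A second aspect of the present application provides a video-based information acquisition device, including:
a detection module, configured to detect a subject in a video picture currently playing on a terminal device;
a capture module, configured to capture an image of the subject from the video picture;
an obtaining module, configured to obtain related information of the subject according to the image of the subject;
a display module, configured to display the related information of the subject and the video picture on the same screen.
In an exemplary manner, the obtaining module is specifically configured to:
send the image of the subject to a server, so that the server recognizes the subject according to the image of the subject;
receive the related information of the subject sent by the server.
In an exemplary manner, before receiving the related information of the subject sent by the server, the obtaining module is further configured to:
receive the recognition result of the subject sent by the server;
determine from the recognition result whether the subject has already been detected;
if the subject has not been detected, send a data request to the server, where the data request is used to request the related information of the subject.
In another exemplary manner, the obtaining module is specifically configured to:
recognize the subject according to the image of the subject to obtain a recognition result;
determine from the recognition result whether the subject has already been detected;
if the subject has not been detected, send a data request to a server, where the data request is used to request the related information of the subject;
receive the related information of the subject sent by the server.
In an exemplary manner, the display module is further configured to display prompt information on the screen, where the prompt information indicates that the related information on the screen belongs to the subject.
In an exemplary manner, the display module is specifically configured to:
superimpose the related content of the subject at a preset position on the video content, where the display window of the related content is smaller than half of the display window of the video.
In another exemplary manner, the display module is specifically configured to:
display the content of the subject in a preset area outside the display window of the video.
In an exemplary manner, the detection module is specifically configured to:
detect the contours of the detection objects in the video picture;
determine the subject according to the contours of the detection objects in the video picture.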
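A third aspect of the present application provides a terminal device including a processor, a memory, and a transceiver. The memory is used to store instructions, the transceiver is used to communicate with other devices, and the processor is used to execute the instructions stored in the memory, so that the terminal device executes the method according to the first aspect of the present application.
A fourth aspect of the present application provides a computer-readable storage medium that stores instructions; when the instructions are executed, a computer is caused to execute the method according to the first aspect of the present application.
In the video-based information acquisition method and device provided by the present application, the terminal device detects a subject in the currently playing video picture, captures an image of the subject from the video picture, obtains related information of the subject according to the image of the subject, and displays the related information of the subject and the video picture on the same screen. By actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display to the user, the terminal device can proactively recommend content related to the subject in the video without any user operation, which improves the user experience.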
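Brief Description of the Drawings
FIG. 1 is a schematic diagram of a network architecture to which the present application applies;
FIG. 2 is a flowchart of the video-based information acquisition method provided in Embodiment 1 of the present application;
FIG. 3 is a schematic diagram of displaying a video and the related information of a subject;
FIG. 4 is another schematic diagram of displaying a video and the related information of a subject;
FIG. 5 is a signaling flowchart of the video-based information acquisition method provided in Embodiment 2 of the present application;
FIG. 6 is a schematic structural diagram of the video-based information acquisition device provided in Embodiment 3 of the present application;
FIG. 7 is a schematic structural diagram of the terminal device provided in Embodiment 4 of the present application.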
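Detailed Description
The present application provides a video-based information acquisition method. FIG. 1 is a schematic diagram of a network architecture to which the present application applies. As shown in FIG. 1, the network architecture includes at least one terminal device 11 and at least one server 12. The terminal device 11 can play videos, either through an installed video player or through a browser. The terminal device 11 is also called a terminal, user equipment (UE), an access terminal, a subscriber unit, a mobile device, a user terminal, a wireless communication device, a user agent, or a user apparatus. The terminal device may be a personal digital assistant (PDA) device, a smart TV, a handheld device with a wireless communication function (for example a smartphone or tablet computer), a computing device (for example a personal computer (PC)), an in-vehicle device, a wearable device, and so on.
The server 12 can be used for image recognition. Image features of a large number of persons, objects, scenery, and the like are stored on the server 12 in advance; an image sent by the terminal device can later be matched against the feature parameters of these stored images to recognize the person, object, or scenery in the image. The server 12 can also be used to generate the related content of the subject of an image. The related content of persons, objects, scenery, and the like may be stored on the server 12 or on other servers.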
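FIG. 2 is a flowchart of the video-based information acquisition method provided in Embodiment 1 of the present application. As shown in FIG. 2, the method of this embodiment includes the following steps:
Step S101: The terminal device detects a subject in the currently playing video picture.
The terminal device may play the video through an installed video player or through a browser; the video may be a TV series, a movie, or another program. The terminal device may periodically detect the subject in the currently playing video picture, for example once every 5 minutes.
Optionally, the video playback page provides buttons for turning a search-and-recommendation function on and off. If the user has turned the function on, the terminal device periodically detects the subject in the currently playing video picture; if the user has not turned it on, no detection takes place. During playback, the user can turn the function on or off at any time as needed. For example, when the user sees an unfamiliar actor, the user turns the function on and, after obtaining the actor's related information, turns it off.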
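The periodic detection and the on/off switch described above can be sketched as a small scheduler. This is only an illustration under stated assumptions: the class name `DetectionScheduler`, its methods, and the choice to detect immediately after the function is turned on are hypothetical; the source only gives "once every 5 minutes" as an example interval.

```python
class DetectionScheduler:
    """Decides when the terminal should run subject detection (illustrative only)."""

    def __init__(self, interval_s=300):
        # 300 s mirrors the "every 5 minutes" example in the text.
        self.interval_s = interval_s
        self.enabled = False
        self._last_detection_s = None

    def toggle(self, on):
        """Turn the search-and-recommendation function on or off."""
        self.enabled = on
        if not on:
            self._last_detection_s = None

    def should_detect(self, now_s):
        """Return True if detection should run at playback time now_s (seconds)."""
        if not self.enabled:
            return False
        # Assumption: the first check after enabling triggers detection at once.
        if (self._last_detection_s is None
                or now_s - self._last_detection_s >= self.interval_s):
            self._last_detection_s = now_s
            return True
        return False
```

A player loop would call `should_detect` with the current playback time and run step S101 only when it returns True.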
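The subject in the video picture may be a person, for example a character in a TV series or a contestant in a competition; the subject may also be an object, for example a car, a household appliance, or a building; the subject may also be a scenic spot. Optionally, different detection objects may have a priority order. When the video picture contains persons, objects, and scenery at the same time, the terminal device, when detecting the subject, selects the detection objects with the highest priority as candidates and determines the subject from the candidates. Usually persons have the highest priority, followed by objects and then scenery. When persons, objects, and scenery are all present, the terminal device selects persons as candidates; the video may contain several persons, and one or more of them need to be selected as the subject. Alternatively, the subject may be configured to be a person only, in which case the detection object can only be a person.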
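The priority rule in this paragraph (persons first, then objects, then scenery) can be illustrated with a short sketch. The `Detection` type, the kind labels, and the sample frame are hypothetical; how the detections themselves are produced is outside the sketch.

```python
from dataclasses import dataclass

# Priority order taken from the text: person, then object, then scenery.
PRIORITY = ("person", "object", "scenery")


@dataclass(frozen=True)
class Detection:
    kind: str   # "person", "object" or "scenery" (hypothetical labels)
    label: str  # free-form identifier used only for illustration


def candidates_by_priority(detections, priority=PRIORITY):
    """Return the detections of the highest-priority kind present in the frame."""
    for kind in priority:
        matches = [d for d in detections if d.kind == kind]
        if matches:
            return matches
    return []


frame = [
    Detection("scenery", "mountain"),
    Detection("person", "actor A"),
    Detection("object", "car"),
    Detection("person", "actor B"),
]
chosen = candidates_by_priority(frame)  # both persons; the car and mountain are skipped
```

Restricting the subject to persons only, as the last sentence of the paragraph allows, would correspond to passing `priority=("person",)`.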
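For example, the terminal device detects the contours of the detection objects in the video picture and determines the subject according to those contours. Persons in the video picture can first be identified from the contours; when several persons are identified, the contours are used to determine which persons are shown from the front, from the side, and from the back. If any person's face is frontal, the persons shown from the side or back are discarded. If only one person's face is frontal, that person is determined to be the subject of the video picture. If several persons' faces are frontal, all of them may be taken as subjects, or the person located in the middle of the picture, or the person with the largest contour area, may be selected as the subject.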
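A minimal sketch of this selection rule follows, assuming the contour analysis has already produced a bounding box and a face orientation for each person. The `Person` type, the `rule` parameter, and the fallback used when nobody faces the camera are assumptions, not part of the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    box: tuple        # (x, y, width, height) of the contour's bounding box
    orientation: str  # "front", "side" or "back", assumed to be known already


def pick_subject(people, frame_w, frame_h, rule="center"):
    """Prefer frontal faces, then pick one person by position or by area."""
    if not people:
        return None
    frontal = [p for p in people if p.orientation == "front"]
    # Assumption: if no face is frontal, fall back to all detected persons.
    pool = frontal or people
    if rule == "largest":
        return max(pool, key=lambda p: p.box[2] * p.box[3])
    cx, cy = frame_w / 2, frame_h / 2

    def dist_sq(p):
        x, y, w, h = p.box
        return (x + w / 2 - cx) ** 2 + (y + h / 2 - cy) ** 2

    return min(pool, key=dist_sq)


people = [
    Person("a", (0, 0, 300, 400), "front"),
    Person("b", (900, 400, 120, 240), "front"),
    Person("c", (800, 300, 400, 600), "back"),
]
```

With this sample data, person "c" is ignored because a frontal face exists; the "center" rule then prefers "b" and the "largest" rule prefers "a". The text also allows keeping every frontal person as a subject, which this single-result sketch does not cover.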
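Step S102: The terminal device captures an image of the subject from the video picture.
The terminal device may capture one or more images of the subject. It may take a screenshot of the whole video picture and then crop the screenshot to obtain the image of the subject; when the subject is a person, the captured image must include the person's face. The terminal device may also capture only the image of the subject, without taking a screenshot of the whole video picture.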
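The "screenshot, then crop" variant can be sketched on a frame represented as a list of pixel rows. The representation and the clamping of the box to the frame bounds are assumptions made for illustration; a real player would work on its own image type.

```python
def crop(frame, box):
    """Cut the region box = (x, y, w, h) out of frame, a list of pixel rows.

    The box is clamped to the frame so a partly off-screen box still works.
    """
    x, y, w, h = box
    height = len(frame)
    width = len(frame[0]) if frame else 0
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [row[x0:x1] for row in frame[y0:y1]]


# A 5x4 dummy frame whose "pixels" encode their own row and column.
frame = [[r * 10 + c for c in range(5)] for r in range(4)]
subject_image = crop(frame, (1, 1, 2, 2))
```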
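Step S103: The terminal device obtains related information of the subject according to the image of the subject.
In one manner, the terminal device sends the image of the subject to the server, so that the server recognizes the subject according to the image of the subject, and the terminal device receives the related information of the subject sent by the server.
In this manner, after receiving the image of the subject, the server obtains the feature parameters of the image. The feature parameters may include any one or a combination of the following: color features, shape features, and texture features. The server may obtain the feature parameters of the image through at least one of horizontal and vertical projection, edge detection results, shape analysis, or color analysis.
The server matches the feature parameters of the image of the subject against the feature parameters of a large number of template images stored locally or in a database. The subjects of the template images are known, so if the feature parameters of the image match those of a template image, the subject can be recognized. For example, the server stores, locally or in a database, the feature parameters of images of a large number of celebrities, and matching can identify the subject as a particular celebrity. The server then queries the related information of the subject, which may be a brief introduction to the subject (for example Baidu Encyclopedia content), the latest news about the subject, or other videos related to the subject.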
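One way to picture the template matching is a nearest-neighbour search over feature vectors. The source does not specify the matching metric, so the Euclidean distance, the `max_distance` threshold, and the sample feature values below are all assumptions chosen for illustration.

```python
import math


def match_subject(features, templates, max_distance=0.5):
    """Return the name of the closest template, or None if nothing is close enough.

    features:  tuple of numbers describing the subject image
    templates: dict mapping a known subject name to its feature tuple
    """
    best_name, best_dist = None, math.inf
    for name, template in templates.items():
        dist = math.dist(features, template)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Made-up feature vectors standing in for color, shape and texture features.
templates = {"Actor A": (0.1, 0.9, 0.3), "Car": (0.8, 0.2, 0.5)}
result = match_subject((0.12, 0.88, 0.31), templates)
```

The threshold keeps an image that resembles no template from being labelled with whichever template happens to be nearest.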
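Optionally, after recognizing the subject, the server sends the recognition result of the subject to the terminal device. The recognition result includes the name of the subject and may also include a brief description of the subject; for example, when the subject is a person, the recognition result may include the person's name and may also include gender, occupation, and age.
The terminal device receives the recognition result sent by the server and determines from it whether the subject has already been detected. Each time a subject is recognized, the terminal device saves the recognition result of the new subject. Later, when a recognition result is received, the terminal device checks whether it has saved a recognition result for that subject: if it has, the subject has already been detected; if not, the subject has not been detected.
If the subject has not been detected, the terminal device sends a data request to the server. The data request is used to request the related information of the subject and may include keywords of the subject, such as a person's name, gender, and occupation, or an object's name and attributes. The server queries the related content of the subject according to the keywords and sends it to the terminal device. If the subject has already been detected, the current search-and-recommendation process ends.
In another manner, the terminal device recognizes the subject according to the image of the subject to obtain a recognition result and determines from the recognition result whether the subject has already been detected. If the subject has not been detected, the terminal device sends a data request to the server, where the data request is used to request the related information of the subject, and the server sends the related information of the subject to the terminal device. Unlike the previous manner, here the terminal device recognizes the subject, and the terminal device may use the same recognition method as the server.
In this embodiment, the terminal device determines from the recognition result whether the subject has already been detected, which avoids repeatedly recommending the related content of the same subject to the user, improves the user experience, and avoids wasting resources on repeated requests to the server for the same content.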
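The "already detected" check amounts to a small cache of saved recognition results. The sketch below assumes the recognition result is a dictionary with a `name` field and that names are compared case-insensitively; both are illustrative choices, as are the names `SeenSubjects` and `maybe_request`.

```python
class SeenSubjects:
    """Remembers recognition results so each subject is recommended only once."""

    def __init__(self):
        self._seen = set()

    def is_new(self, recognition_result):
        # Normalising the name is an assumption; the source only says the
        # terminal saves the recognition result of each new subject.
        key = recognition_result["name"].strip().lower()
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def maybe_request(seen, recognition_result, send_request):
    """Call send_request only for subjects that have not been detected before."""
    if seen.is_new(recognition_result):
        return send_request(recognition_result["name"])
    return None  # already detected: the search-and-recommendation flow ends
```

Here `send_request` stands in for whatever call sends the data request to the server, so a repeated subject costs no network round trip.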
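Step S104: The terminal device displays the related information of the subject and the video picture on the same screen.
The terminal device can display the related content of the subject and the video picture on the same screen according to a pre-designed template style. In one manner, the terminal device superimposes the related content of the subject at a preset position on the video content, and the display window of the related content is smaller than half of the display window of the video.
The preset position may be the upper-right, lower-right, upper-left, or lower-left corner of the display window of the video, so that the display window of the related content does not block the video and interfere with viewing. In addition, the display window of the related content should not be too large, for the same reason. FIG. 3 is a schematic diagram of displaying a video and the related information of a subject. As shown in FIG. 3, the display window of the related information is located in the upper-right corner inside the display window of the video.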
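The corner placement can be worked through with a small geometry helper. It reads "smaller than half of the display window" as a limit on area, which is an interpretation rather than something the source states; `scale_percent` and `margin` are hypothetical parameters.

```python
def overlay_rect(video_w, video_h, corner="top-right", scale_percent=30, margin=16):
    """Return (x, y, w, h) of the info window inside the video window.

    scale_percent is the share of each video dimension used by the overlay.
    At most 70 is allowed, since 0.7 * 0.7 = 0.49 keeps the area below half.
    """
    if not 0 < scale_percent <= 70:
        raise ValueError("scale_percent must be in 1..70")
    w = video_w * scale_percent // 100
    h = video_h * scale_percent // 100
    x = margin if "left" in corner else video_w - w - margin
    y = margin if "top" in corner else video_h - h - margin
    return x, y, w, h


rect = overlay_rect(1920, 1080)  # (1328, 16, 576, 324) for a 1920x1080 video
```

The accepted `corner` values are "top-left", "top-right", "bottom-left", and "bottom-right", matching the four preset positions listed in the text.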
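Optionally, the size of the display window of the related content can be adjusted and its position can be moved; the user can move the display window and adjust its size as needed. The display window of the related content may be rectangular, circular, or polygonal; to make it more engaging, it may also take the shape of an animal outline. This embodiment places no limitation on the shape. The display window of the related content may also be displayed semi-transparently.
In another manner, the terminal device displays the content of the subject in a preset area outside the display window of the video. FIG. 4 is another schematic diagram of displaying a video and the related information of a subject. As shown in FIG. 4, the display window of the related information is located below, and outside, the display window of the video.
Optionally, the terminal device displays prompt information on the screen, and the prompt information indicates that the related information on the screen belongs to the subject. Associating the subject with the related information avoids the situation where several persons or objects are on screen and the user cannot tell which of them the related information belongs to. The prompt information may be text, for example text beside the subject stating that the related information belongs to the subject. The prompt information may also be a graphic, for example a dashed frame around the subject or a floating arrow pointing at the subject.
In this embodiment, the terminal device detects the subject in the currently playing video picture, captures the image of the subject from the video picture, obtains the related information of the subject according to the image of the subject, and displays the related information of the subject and the video picture on the same screen. By actively detecting the subject in the video picture and triggering retrieval of the subject's related information for display to the user, the terminal device can proactively recommend content related to the subject in the video without any user operation, which improves the user experience.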
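FIG. 5 is a signaling flowchart of the video-based information acquisition method provided in Embodiment 2 of the present application. This embodiment is described using image recognition performed by the server as an example. As shown in FIG. 5, the method provided in this embodiment includes the following steps:
Step S201: The terminal device detects the subject in the currently playing video picture.
Step S202: The terminal device captures the image of the subject from the video picture.
Step S203: The terminal device sends the image of the subject to the server.
Step S204: The server recognizes the subject according to the image of the subject and obtains a recognition result.
Step S205: The server sends the recognition result of the subject to the terminal device.
Step S206: The terminal device determines from the recognition result whether the subject has already been detected.
If the subject has not been detected, step S207 is executed; if the subject has been detected, the flow ends.
Step S207: The terminal device sends a data request to the server, where the data request is used to request the related information of the subject.
Step S208: The server queries the related information of the subject according to the data request.
Step S209: The server sends the related information of the subject to the terminal device.
Step S210: The terminal device displays the related information of the subject and the video picture on the same screen.
For the specific implementation of this embodiment, refer to the description of Embodiment 1; details are not repeated here.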
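The message exchange of steps S203 to S210 can be simulated in-process to make the control flow concrete. Everything here is a stand-in: the `Server` lookup tables hold made-up data, network calls are replaced by method calls, and ending the flow when the server recognizes nothing is an assumption the source does not address.

```python
class Server:
    """Stand-in server; the lookup tables are made-up illustration data."""

    def __init__(self):
        self._known = {"img-actor-a": "Actor A"}
        self._info = {"Actor A": "Short profile of Actor A"}

    def recognize(self, image):            # S204 and S205
        return {"name": self._known.get(image)}

    def query(self, keyword):              # S208 and S209
        return self._info.get(keyword)


class Terminal:
    def __init__(self, server):
        self.server = server
        self.seen = set()
        self.log = []

    def on_subject_image(self, image):
        """Run S203 to S210 for an image produced by S201 and S202."""
        result = self.server.recognize(image)      # S203 to S205
        name = result["name"]
        # S206; treating an unrecognized subject like a seen one is an assumption.
        if name is None or name in self.seen:
            self.log.append("end")
            return None
        self.seen.add(name)
        info = self.server.query(name)             # S207 to S209
        self.log.append("display:" + name)         # S210
        return info
```

Feeding the same image twice shows the effect of step S206: the first call returns the related information, the second ends without a data request.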
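FIG. 6 is a schematic structural diagram of the video-based information acquisition device provided in Embodiment 3 of the present application. The device may be integrated in a terminal device. As shown in FIG. 6, the device includes:
a detection module 21, configured to detect a subject in the video picture currently playing on the terminal device;
a capture module 22, configured to capture an image of the subject from the video picture;
an obtaining module 23, configured to obtain related information of the subject according to the image of the subject;
a display module 24, configured to display the related information of the subject and the video picture on the same screen.
In an exemplary manner, the obtaining module 23 is specifically configured to:
send the image of the subject to a server, so that the server recognizes the subject according to the image of the subject;
receive the related information of the subject sent by the server.
In an exemplary manner, before receiving the related information of the subject sent by the server, the obtaining module 23 is further configured to:
receive the recognition result of the subject sent by the server;
determine from the recognition result whether the subject has already been detected;
if the subject has not been detected, send a data request to the server, where the data request is used to request the related information of the subject.
In another exemplary manner, the obtaining module 23 is specifically configured to:
recognize the subject according to the image of the subject to obtain a recognition result;
determine from the recognition result whether the subject has already been detected;
if the subject has not been detected, send a data request to a server, where the data request is used to request the related information of the subject;
receive the related information of the subject sent by the server.
In an exemplary manner, the display module 24 is further configured to display prompt information on the screen, where the prompt information indicates that the related information on the screen belongs to the subject.
In an exemplary manner, the display module 24 is specifically configured to:
superimpose the related content of the subject at a preset position on the video content, where the display window of the related content is smaller than half of the display window of the video.
In another exemplary manner, the display module 24 is specifically configured to:
display the content of the subject in a preset area outside the display window of the video.
In an exemplary manner, the detection module 21 is specifically configured to:
detect the contours of the detection objects in the video picture;
determine the subject according to the contours of the detection objects in the video picture.
The device provided in this embodiment may be used to execute the method performed by the terminal device in Embodiment 1 and Embodiment 2. The specific implementation and technical effects are similar and are not repeated here.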
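FIG. 7 is a schematic structural diagram of the terminal device provided in Embodiment 4 of the present application. As shown in FIG. 7, the terminal device provided in this embodiment includes a processor 31, a memory 32, and a transceiver 33. The memory 32 is used to store instructions, the transceiver 33 is used to communicate with other devices, and the processor 31 is used to execute the instructions stored in the memory 32, so that the terminal device executes the method described in Embodiment 1 or Embodiment 2, which is not repeated here.
The processor 31 may be a microcontroller unit (MCU), also called a single-chip microcomputer. The processor 31 may also be a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, or a discrete gate or transistor logic device.
The memory 32 may be a storage medium that is mature in the art, such as random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory, electrically erasable programmable memory, or registers.
Embodiment 5 of the present application provides a computer-readable storage medium that stores instructions; when the instructions are executed, a computer is caused to execute the method performed by the terminal device in Embodiment 1 or Embodiment 2.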

Claims (14)

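  1. A video-based information acquisition method, characterized by comprising:
    a terminal device detects a subject in a currently playing video picture;
    the terminal device captures an image of the subject from the video picture;
    the terminal device obtains related information of the subject according to the image of the subject;
    the terminal device displays the related information of the subject and the video picture on the same screen.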
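  2. The method according to claim 1, characterized in that the terminal device obtaining the related information of the subject according to the image of the subject comprises:
    the terminal device sends the image of the subject to a server, so that the server recognizes the subject according to the image of the subject;
    the terminal device receives the related information of the subject sent by the server.
  3. The method according to claim 2, characterized in that, before the terminal device receives the related information of the subject sent by the server, the method further comprises:
    the terminal device receives a recognition result of the subject sent by the server;
    the terminal device determines from the recognition result whether the subject has already been detected;
    if the subject has not been detected, the terminal device sends a data request to the server, where the data request is used to request the related information of the subject.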
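  4. The method according to claim 1, characterized in that the terminal device obtaining the related information of the subject according to the image of the subject comprises:
    the terminal device recognizes the subject according to the image of the subject to obtain a recognition result;
    the terminal device determines from the recognition result whether the subject has already been detected;
    if the subject has not been detected, the terminal device sends a data request to a server, where the data request is used to request the related information of the subject;
    the terminal device receives the related information of the subject sent by the server.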
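  5. The method according to any one of claims 1-4, characterized by further comprising:
    the terminal device displays prompt information on the screen, where the prompt information indicates that the related information on the screen belongs to the subject.
  6. The method according to any one of claims 1-5, characterized in that the terminal device displaying the related information of the subject and the video picture on the same screen comprises:
    the terminal device superimposes the related content of the subject at a preset position on the video content, where the display window of the related content of the subject is smaller than half of the display window of the video.
  7. The method according to any one of claims 1-5, characterized in that the terminal device displaying the related information of the subject and the video picture on the same screen comprises:
    the terminal device displays the content of the subject in a preset area outside the display window of the video.
  8. The method according to any one of claims 1-7, characterized in that the terminal device detecting the subject in the playing video picture comprises:
    the terminal device detects the contours of the detection objects in the video picture;
    the terminal device determines the subject according to the contours of the detection objects in the video picture.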
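  9. The method according to any one of claims 1-8, characterized in that, before the terminal device detects the subject in the currently playing video picture, the method further comprises:
    the terminal device displays a recommendation function button on a user interface;
    the terminal device receives a first operation performed by a user on the recommendation function button;
    the terminal device turns on a recommendation function according to the first operation;
    and the terminal device detecting the subject in the currently playing video picture comprises:
    the terminal device detects the subject in the currently playing video picture while the recommendation function is turned on.
  10. The method according to claim 9, characterized in that, after the terminal device displays the related information of the subject and the video picture on the same screen, the method further comprises:
    the terminal device receives a second operation performed by the user on the recommendation function button;
    the terminal device turns off the recommendation function according to the second operation.
  11. The method according to any one of claims 1-9, characterized in that the terminal device detecting the subject in the currently playing video picture comprises:
    the terminal device detects the objects in the video picture in descending order of preset priorities of detection objects;
    when a detection object corresponding to the current priority is detected among the objects in the video picture, the subject is determined from the detected detection objects.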
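  12. A terminal device, characterized by comprising a processor, a memory, and a transceiver, where the memory is used to store instructions, the transceiver is used to communicate with other devices, and the processor is used to execute the instructions stored in the memory, so that the terminal device executes the method according to any one of claims 1-11.
  13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions, and when the instructions are executed, a computer is caused to execute the method according to any one of claims 1-11.
  14. A computer program, characterized by comprising program code, where, when a computer runs the computer program, the program code executes the method according to any one of claims 1-11.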
PCT/CN2019/109446 2018-10-18 2019-09-30 基于视频的信息获取方法和装置 WO2020078215A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2020547082A JP7231638B2 (ja) 2018-10-18 2019-09-30 映像に基づく情報取得方法及び装置
EP19874167.0A EP3869810A4 (en) 2018-10-18 2019-09-30 Video-based information acquisition method and device
KR1020207024019A KR102370699B1 (ko) 2018-10-18 2019-09-30 영상에 기반한 정보 획득 방법 및 장치
US17/013,686 US20200404378A1 (en) 2018-10-18 2020-09-07 Video-based information acquisition method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811215133.5A CN109525877B (zh) 2018-10-18 2018-10-18 基于视频的信息获取方法和装置
CN201811215133.5 2018-10-18

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/013,686 Continuation US20200404378A1 (en) 2018-10-18 2020-09-07 Video-based information acquisition method and device

Publications (1)

Publication Number Publication Date
WO2020078215A1 true WO2020078215A1 (zh) 2020-04-23

Family

ID=65772515

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/109446 WO2020078215A1 (zh) 2018-10-18 2019-09-30 基于视频的信息获取方法和装置

Country Status (6)

Country Link
US (1) US20200404378A1 (zh)
EP (1) EP3869810A4 (zh)
JP (1) JP7231638B2 (zh)
KR (1) KR102370699B1 (zh)
CN (1) CN109525877B (zh)
WO (1) WO2020078215A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109525877B (zh) * 2018-10-18 2021-04-20 百度在线网络技术(北京)有限公司 基于视频的信息获取方法和装置
CN111836093B (zh) * 2019-04-16 2022-05-31 百度在线网络技术(北京)有限公司 视频播放方法、装置、设备和介质
CN110582014A (zh) * 2019-10-17 2019-12-17 深圳创维-Rgb电子有限公司 电视机及其电视控制方法、控制装置和可读存储介质
CN112601116A (zh) * 2020-12-11 2021-04-02 海信视像科技股份有限公司 一种显示设备及内容显示方法
CN113434729B (zh) * 2021-08-04 2024-01-30 深圳墨世科技有限公司 视频相关信息聚合获取方法、装置和终端设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301755A (zh) * 2013-07-19 2015-01-21 联想(北京)有限公司 一种电视信息获取方法、电视、后台服务器及系统
CN107315844A (zh) * 2017-08-17 2017-11-03 广州视源电子科技股份有限公司 一种基于图片的检索方法、装置、设备及存储介质
JP2018066823A (ja) * 2016-10-18 2018-04-26 株式会社日立システムズ 情報表示装置、及びその処理制御方法
CN108471551A (zh) * 2018-03-23 2018-08-31 上海哔哩哔哩科技有限公司 基于主体识别的视频主体信息显示方法、装置、系统和介质
CN108491419A (zh) * 2018-02-06 2018-09-04 北京奇虎科技有限公司 一种基于视频实现推荐的方法和装置
CN109525877A (zh) * 2018-10-18 2019-03-26 百度在线网络技术(北京)有限公司 基于视频的信息获取方法和装置

Family Cites Families (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR900004954B1 (ko) * 1986-12-10 1990-07-12 삼성전자 주식회사 실시간 영상경계 검출회로
US7467131B1 (en) * 2003-09-30 2008-12-16 Google Inc. Method and system for query data caching and optimization in a search engine system
JP2006185320A (ja) * 2004-12-28 2006-07-13 Ricoh Co Ltd 画像検索装置
JP2006209657A (ja) * 2005-01-31 2006-08-10 Bandai Co Ltd オーサリング装置、オーサリング方法およびコンピュータプログラム
US8861898B2 (en) * 2007-03-16 2014-10-14 Sony Corporation Content image search
JP2009232250A (ja) * 2008-03-24 2009-10-08 Panasonic Corp 番組情報表示装置および番組情報表示方法
US8239896B2 (en) * 2008-05-28 2012-08-07 Sony Computer Entertainment America Inc. Integration of control data into digital broadcast content for access to ancillary information
JP2010152744A (ja) * 2008-12-25 2010-07-08 Toshiba Corp 再生装置
US8839306B2 (en) * 2009-11-20 2014-09-16 At&T Intellectual Property I, Lp Method and apparatus for presenting media programs
US9015139B2 (en) * 2010-05-14 2015-04-21 Rovi Guides, Inc. Systems and methods for performing a search based on a media content snapshot image
KR101708646B1 (ko) * 2010-05-26 2017-03-08 엘지전자 주식회사 영상표시기기, 그 시스템 및 그 영상표시기기에 표시된 오브젝트 검색방법
KR101357262B1 (ko) * 2010-08-13 2014-01-29 주식회사 팬택 필터 정보를 이용한 객체 인식 장치 및 방법
US8818025B2 (en) * 2010-08-23 2014-08-26 Nokia Corporation Method and apparatus for recognizing objects in media content
JP5594672B2 (ja) * 2011-04-14 2014-09-24 株式会社 日立産業制御ソリューションズ 物体認識装置および物体認識方法
JP5834541B2 (ja) * 2011-06-29 2015-12-24 三菱電機株式会社 デジタル放送受信装置及びデジタル放送受信方法
US20130036442A1 (en) * 2011-08-05 2013-02-07 Qualcomm Incorporated System and method for visual selection of elements in video content
CN103729614A (zh) * 2012-10-16 2014-04-16 上海唐里信息技术有限公司 基于视频图像的人物识别方法及人物识别装置
US9409081B2 (en) * 2012-11-16 2016-08-09 Rovi Guides, Inc. Methods and systems for visually distinguishing objects appearing in a media asset
US9247309B2 (en) * 2013-03-14 2016-01-26 Google Inc. Methods, systems, and media for presenting mobile content corresponding to media content
CN103297810A (zh) * 2013-05-23 2013-09-11 深圳市爱渡飞科技有限公司 一种电视画面关联信息的显示方法、装置及系统
CN104066009B (zh) * 2013-10-31 2015-10-14 腾讯科技(深圳)有限公司 节目识别方法、装置、终端、服务器及系统
CN104184923B (zh) * 2014-08-27 2018-01-09 天津三星电子有限公司 用于视频中检索人物信息的系统和方法
KR102365393B1 (ko) * 2014-12-11 2022-02-21 엘지전자 주식회사 이동단말기 및 그 제어방법
JP2016119508A (ja) * 2014-12-18 2016-06-30 株式会社東芝 方法、システム及びプログラム
EP3065067A1 (en) * 2015-03-06 2016-09-07 Captoria Ltd Anonymous live image search
CN106162355A (zh) * 2015-04-10 2016-11-23 北京云创视界科技有限公司 视频交互方法及终端
US10440435B1 (en) * 2015-09-18 2019-10-08 Amazon Technologies, Inc. Performing searches while viewing video content
CN106686404B (zh) * 2016-12-16 2021-02-02 中兴通讯股份有限公司 一种视频分析平台、匹配方法、精准投放广告方法及系统
US10477277B2 (en) * 2017-01-06 2019-11-12 Google Llc Electronic programming guide with expanding cells for video preview
CN108171207A (zh) * 2018-01-17 2018-06-15 百度在线网络技术(北京)有限公司 基于视频序列的人脸识别方法和装置
CN108399349B (zh) * 2018-03-22 2020-11-10 腾讯科技(深圳)有限公司 图像识别方法及装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301755A (zh) * 2013-07-19 2015-01-21 联想(北京)有限公司 一种电视信息获取方法、电视、后台服务器及系统
JP2018066823A (ja) * 2016-10-18 2018-04-26 株式会社日立システムズ 情報表示装置、及びその処理制御方法
CN107315844A (zh) * 2017-08-17 2017-11-03 广州视源电子科技股份有限公司 一种基于图片的检索方法、装置、设备及存储介质
CN108491419A (zh) * 2018-02-06 2018-09-04 北京奇虎科技有限公司 一种基于视频实现推荐的方法和装置
CN108471551A (zh) * 2018-03-23 2018-08-31 上海哔哩哔哩科技有限公司 基于主体识别的视频主体信息显示方法、装置、系统和介质
CN109525877A (zh) * 2018-10-18 2019-03-26 百度在线网络技术(北京)有限公司 基于视频的信息获取方法和装置

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Baidu Encyclopedia"
See also references of EP3869810A4

Also Published As

Publication number Publication date
EP3869810A4 (en) 2022-06-29
CN109525877A (zh) 2019-03-26
JP2021516501A (ja) 2021-07-01
KR102370699B1 (ko) 2022-03-04
CN109525877B (zh) 2021-04-20
EP3869810A1 (en) 2021-08-25
US20200404378A1 (en) 2020-12-24
JP7231638B2 (ja) 2023-03-01
KR20200110407A (ko) 2020-09-23

Similar Documents

Publication Publication Date Title
WO2020078215A1 (zh) 基于视频的信息获取方法和装置
TWI744368B (zh) 播放處理方法、裝置和設備
US11074436B1 (en) Method and apparatus for face recognition
US10204264B1 (en) Systems and methods for dynamically scoring implicit user interaction
US9788065B2 (en) Methods and devices for providing a video
WO2017092360A1 (zh) 多媒体播放时的交互方法及装置
JP6263263B2 (ja) 関連ユーザー確定方法および装置
US10701301B2 (en) Video playing method and device
CN111897507A (zh) 投屏方法、装置、第二终端和存储介质
CN111629247B (zh) 一种信息显示方法、装置及电子设备
WO2015062224A1 (en) Tv program identification method, apparatus, terminal, server and system
WO2021004137A1 (zh) 基于人脸识别的信息推送方法、装置、计算机设备
JP2017509090A (ja) 画像分類方法及び装置
CN112672208B (zh) 视频播放方法、装置、电子设备、服务器及系统
CN112312215B (zh) 基于用户识别的开机内容推荐方法、智能电视及存储介质
US10956763B2 (en) Information terminal device
US20230316529A1 (en) Image processing method and apparatus, device and storage medium
WO2019119643A1 (zh) 移动直播的互动终端、方法及计算机可读存储介质
CN107463681B (zh) 一种待搜题目的识别方法及装置
CN112866577B (zh) 图像的处理方法、装置、计算机可读介质及电子设备
CN107247794B (zh) 直播中的话题引导方法、直播装置及终端设备
US20170171462A1 (en) Image Collection Method, Information Push Method and Electronic Device, and Mobile Phone
CN109151599B (zh) 视频处理方法和装置
CN107391661B (zh) 推荐词显示方法及装置
US20210377454A1 (en) Capturing method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19874167

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20207024019

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020547082

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019874167

Country of ref document: EP

Effective date: 20210518