WO2022028177A1 - Information pushing method, video processing method, and device - Google Patents

Information pushing method, video processing method, and device

Info

Publication number
WO2022028177A1
Authority
WO
WIPO (PCT)
Prior art keywords
item
appearing
information
video frame
video stream
Prior art date
Application number
PCT/CN2021/104450
Other languages
French (fr)
Chinese (zh)
Inventor
崔英林
Original Assignee
上海连尚网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海连尚网络科技有限公司 filed Critical 上海连尚网络科技有限公司
Publication of WO2022028177A1 publication Critical patent/WO2022028177A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/73 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/74 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • The embodiments of the present application relate to the field of computer technology, and in particular to information push and video processing methods and devices.
  • With the rapid development of the Internet, video applications support increasingly diverse functions, such as live streaming and on-demand playback.
  • As a result, more and more users are attracted to watching videos, and viewing time keeps growing.
  • Various items, such as clothes, decorations and food, often appear in videos. If a user is interested in one of these items, the user has to move the video application to the background, open a search or shopping application, and enter the item name to search before obtaining detailed information about the item.
  • The embodiments of the present application propose information push and video processing methods and devices.
  • In a first aspect, an embodiment of the present application provides an information push method, including: performing code stream conversion on video data to obtain the video stream and identification information of items appearing in the video stream; playing the video stream on a playback device; in response to determining that an item of interest of the user exists in the current video frame of the video stream, determining the identification information of the item of interest; and, based on the identification information of the item of interest, querying push information of the item of interest and presenting the push information.
  • In some embodiments, determining that there is an item of interest of the user in the current video frame of the video stream includes: collecting the user's voice information; recognizing the voice information and determining the item name contained in the voice information; and, if the item name matches an item appearing in the current video frame, determining the matched appearing item as the item of interest.
  • In some embodiments, determining that there is an item of interest of the user in the current video frame of the video stream includes: setting a trigger area in the video frame where an item appearing in the video stream is located; and, in response to detecting that the user confirms a trigger area of the current video frame, determining the appearing item corresponding to the confirmed trigger area as the item of interest.
  • In some embodiments, the identification information includes coordinate information, and setting the trigger area in the video frame where the appearing item is located includes: setting the area corresponding to the coordinate information as the trigger area.
  • In some embodiments, the coordinate information is a percentage coordinate, and setting the area corresponding to the coordinate information as the trigger area includes: calculating the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame; and setting the area corresponding to the lattice coordinates as the trigger area.
  • In some embodiments, calculating the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item includes: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiplying the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the percentage coordinates of the appearing item, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
  • In some embodiments, calculating the lattice coordinates of the appearing item further includes: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, converting the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiplying the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
  • In some embodiments, detecting that the user confirms the trigger area of the current video frame includes: if the user touches the trigger area of the current video frame, determining that the user confirms the trigger area.
  • In some embodiments, detecting that the user confirms the trigger area of the current video frame includes: capturing the focus of the user's eyes; and, in response to determining that the focus is on the trigger area of the current video frame, determining that the user confirms the trigger area.
  • In some embodiments, capturing the focus of the user's eyes includes: using a camera of the playback device to emit a light beam toward the eyes; using a photosensitive material on the screen of the playback device to sense the intensity of the light beam reflected from the eyes; and determining a dark spot on the screen based on the light beam intensity as the focus.
  • In a second aspect, an embodiment of the present application provides a video processing method, including: performing item identification on a video stream to determine items appearing in the video stream; acquiring identification information of the appearing items; and adding the identification information of the appearing items to the corresponding video frame protocol to generate video data.
  • In some embodiments, acquiring the identification information of the appearing item includes: performing position recognition on the video stream to determine the coordinate information of the appearing item; and adding the coordinate information of the appearing item to the identification information of the appearing item.
  • In some embodiments, performing position recognition on the video stream to determine the coordinate information of the appearing item includes: simulating a pilot broadcast of the video stream on a pilot device; performing position recognition on the video stream to obtain the lattice coordinates of the appearing item; and determining the coordinate information of the appearing item based on the lattice coordinates of the appearing item.
  • In some embodiments, determining the coordinate information of the appearing item based on the lattice coordinates of the appearing item includes: dividing the horizontal and vertical coordinate values of the lattice coordinates of the appearing item by the horizontal and vertical pixel values of the pilot device's resolution, correspondingly, to obtain the percentage coordinates of the appearing item.
  • In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame where the item first appears includes the item name, coordinate information, brief information and/or a web page link, while the identification information added to the video frame protocol of frames where the item does not appear for the first time includes the item name and coordinate information.
  • In some embodiments, adding the identification information of the appearing item to the corresponding video frame protocol includes: extending the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing item.
  • In a third aspect, an embodiment of the present application provides an information push device, including: a conversion unit configured to perform code stream conversion on video data to obtain the video stream and identification information of items appearing in the video stream; a playback unit configured to play the video stream on a playback device; a determining unit configured to determine the identification information of an item of interest in response to determining that the user's item of interest exists in the current video frame of the video stream; and a presenting unit configured to query push information of the item of interest based on the identification information of the item of interest and to present the push information.
  • In some embodiments, the determining unit is further configured to: collect the user's voice information; recognize the voice information and determine the item name contained in the voice information; and, if the item name matches an item appearing in the current video frame, determine the matched appearing item as the item of interest.
  • In some embodiments, the determining unit includes: a setting subunit configured to set a trigger area in the video frame where an item appearing in the video stream is located; and a determining subunit configured to, in response to detecting that the user confirms a trigger area of the current video frame, determine the appearing item corresponding to the confirmed trigger area as the item of interest.
  • In some embodiments, the identification information includes coordinate information, and the setting subunit includes: a setting module configured to set the area corresponding to the coordinate information as the trigger area.
  • In some embodiments, the coordinate information is a percentage coordinate, and the setting module includes: a calculation submodule configured to calculate the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame; and a setting submodule configured to set the area corresponding to the lattice coordinates as the trigger area.
  • In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiply the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the percentage coordinates of the appearing item in the current video frame, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
  • In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, convert the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiply the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item in the current video frame, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
  • In some embodiments, the determining subunit is further configured to: if the user touches the trigger area of the current video frame, determine that the user confirms the trigger area.
  • In some embodiments, the determining subunit includes: a capture module configured to capture the focus of the user's eyes; and a determination module configured to, in response to determining that the focus is on the trigger area of the current video frame, determine that the user confirms the trigger area.
  • In some embodiments, the capture module is further configured to: use the camera of the playback device to emit a light beam toward the eyes; use a photosensitive material on the screen of the playback device to sense the intensity of the light beam reflected from the eyes; and determine a dark spot on the screen based on the light beam intensity as the focus.
  • An embodiment of the present application provides a video processing device, including: a determining unit configured to perform item identification on a video stream to determine items appearing in the video stream; an obtaining unit configured to obtain identification information of the appearing items; and an adding unit configured to add the identification information of the appearing items to the corresponding video frame protocol to generate video data.
  • In some embodiments, the obtaining unit includes: a determining subunit configured to perform position recognition on the video stream and determine the coordinate information of the appearing item; and an adding subunit configured to add the coordinate information of the appearing item to the identification information of the appearing item.
  • In some embodiments, the determining subunit includes: a pilot broadcast module configured to simulate a pilot broadcast of the video stream on a pilot device; an identification module configured to perform position recognition on the video stream to obtain the lattice coordinates of the appearing item; and a determining module configured to determine the coordinate information of the appearing item based on the lattice coordinates of the appearing item.
  • In some embodiments, the determining module is further configured to: divide the horizontal and vertical coordinate values of the lattice coordinates of the appearing item by the horizontal and vertical pixel values of the pilot device's resolution, correspondingly, to obtain the percentage coordinates of the appearing item.
  • In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame where the item first appears includes the item name, coordinate information, brief information and/or a web page link, while the identification information added to the video frame protocol of frames where the item does not appear for the first time includes the item name and coordinate information.
  • In some embodiments, the adding unit is further configured to: extend the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing item.
  • An embodiment of the present application provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
  • An embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, it implements the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
  • In the information push method provided by the embodiments of the present application, code stream conversion is first performed on the video data to obtain the video stream and the identification information of items appearing in the video stream; then the video stream is played on the playback device; next, in response to determining that there is an item of interest of the user in the current video frame of the video stream, the identification information of the item of interest is determined; finally, based on the identification information of the item of interest, the push information of the item of interest is queried and presented.
  • FIG. 1 is an exemplary system architecture to which the present application may be applied;
  • FIG. 2 is a flowchart of an embodiment of an information push method according to the present application.
  • FIG. 3 is a flowchart of another embodiment of the information push method according to the present application.
  • FIG. 4 is a flowchart of another embodiment of an information push method according to the present application.
  • FIG. 5 is a flowchart of an embodiment of a video processing method according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the computer device of the embodiment of the present application.
  • FIG. 1 shows an exemplary system architecture 100 to which embodiments of the information push and video processing methods of the present application may be applied.
  • the system architecture 100 may include devices 101, 102 and a network 103.
  • the network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the devices 101, 102 may be hardware devices or software that support network connections to provide various network services.
  • When the devices are hardware, they may be various electronic devices, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, servers, and so on.
  • When implemented as a hardware device, a device may be implemented as a distributed device group composed of multiple devices, or as a single device.
  • When a device is software, it may be installed in the electronic devices listed above.
  • When implemented as software, it may be implemented as multiple pieces of software or software modules for providing distributed services, or as a single piece of software or software module. No specific limitation is imposed here.
  • A device can provide corresponding network services by installing a corresponding client application or server application.
  • After a client application is installed, a device can be embodied as a client in network communication.
  • After a server application is installed, a device can be embodied as a server in network communication.
  • device 101 is embodied as a client, and device 102 is embodied as a server.
  • the device 101 may be a client of a video application, and the device 102 may be a server of the video application.
  • The information push method and the video processing method provided by the embodiments of the present application may be executed by the device 101.
  • When the device 101 executes the information push method, it may be a playback device.
  • When the device 102 executes the video processing method, it may be a pilot device.
  • FIG. 2 shows a process 200 of an embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 201: Perform code stream conversion on the video data to obtain the video stream and the identification information of items appearing in the video stream.
  • The execution body of the information push method may acquire video data from the background server of the video application (for example, the device 102 shown in FIG. 1), and perform code stream conversion on the video data to obtain the video stream and the identification information of items appearing in the video stream.
  • the video data may include the video stream and the identification information of the items appearing in the video stream.
  • Video streams are playable data, including but not limited to TV series, movies, live broadcasts, short videos, and so on.
  • the identification information of the item appearing in the video stream is unplayable data, which is used to identify the item appearing in the video stream, including but not limited to the item name, coordinate information, brief information, and web page link.
  • Appearing items may be items that appear in the video stream, such as clothing, decorations, food, and the like.
  • the code stream conversion may adopt a static transcoding method or a dynamic transcoding method.
  • The video frame protocol includes network abstraction layer (NAL) information; the NAL can include a NAL Header, a NAL Extension and a NAL payload.
  • NAL Header can be used to store basic information of video frames.
  • the NAL payload can be used to store a binary stream of video frames.
  • NAL Extension can be used to store identification information. It should be noted that since the video frame itself is a highly compressed data body, the NAL Extension also needs to have high compression.
  • the same item can appear in multiple consecutive video frames.
  • In this case, the identification information added to the video frame protocol of the frame where the item first appears may be detailed information, including the item name, coordinate information, brief information and/or a web page link, and the corresponding video frame is called a detailed frame;
  • the identification information added to the video frame protocol of frames where the item does not appear for the first time may be abbreviated information, including the item name and coordinate information, and the corresponding video frame is called an abbreviated frame. In this way, space can be saved.
  • The detailed information can be decoded and cached; when an abbreviated frame is played later, if it is detected that there is an item of interest of the user in the current video frame, the cache can be queried with the abbreviated information to obtain the detailed information of the item of interest.
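As an illustration only, the following Python sketch shows how a player-side cache for the detailed-frame/abbreviated-frame scheme described above might look. The ItemInfo fields and the use of the item name as the cache key are assumptions made for the example; the patent does not prescribe a concrete data structure.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ItemInfo:
    """Illustrative identification info carried in a NAL Extension."""
    name: str
    coords: Tuple[float, float]        # percentage coordinates, e.g. (0.42, 0.67)
    brief: Optional[str] = None        # only carried by a detailed frame
    link: Optional[str] = None         # only carried by a detailed frame


class ItemCache:
    """Caches detailed info from detailed frames; abbreviated frames carry
    only name + coordinates and are resolved against this cache."""

    def __init__(self):
        self._by_name: Dict[str, ItemInfo] = {}

    def on_frame(self, items):
        for item in items:
            if item.brief is not None or item.link is not None:
                # detailed frame: decode and cache the full record
                self._by_name[item.name] = item

    def lookup(self, abbreviated: ItemInfo) -> ItemInfo:
        # abbreviated frame: query the cache by item name
        return self._by_name.get(abbreviated.name, abbreviated)
```

With such a cache, an abbreviated frame only needs to carry the item name and coordinates, and the detailed record cached from the earlier detailed frame can be looked up when the user shows interest.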
  • Step 202: Play the video stream on the playback device.
  • the above-mentioned execution body may play the video stream on the playback device.
  • the above-mentioned execution body may be a playback device on which a player is installed for playing the video stream.
  • the playback device usually plays the video stream while converting the code stream. Therefore, during the playback of the video stream, the identification information of the items appearing in the video stream can be successively obtained.
  • Step 203: In response to determining that there is an item of interest of the user in the current video frame of the video stream, determine the identification information of the item of interest.
  • the above-mentioned execution subject may determine whether there is an item of interest of the user in the current video frame of the video stream. If there is an item of interest of the user, determine the identification information of the item of interest; if there is no item of interest of the user, continue to play the video stream.
  • The item of interest of the user may be determined by the above-mentioned execution body based on the user's reaction when watching the video stream.
  • For example, the user may say the name of the item of interest.
  • In this case, the above-mentioned execution body may collect the user's voice information, recognize the voice information, and determine the item name contained in the voice information. If the item name matches an item appearing in the current video frame, the matched appearing item is determined as the item of interest; if the item name does not match any appearing item in the current video frame, the user's voice information continues to be collected.
  • Here, the current video frame is the video frame currently being played. Multiple items may appear in the same video frame, and the item that matches the item name contained in the user's voice information is the user's item of interest. For example, the user says "watch", and the items appearing in the current video frame include a watch of brand A, clothes of brand B and shoes of brand C; only the watch of brand A matches "watch" and is the user's item of interest.
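A minimal sketch of the name-matching step, assuming a speech recognizer has already turned the user's voice information into an item name; the loose containment check stands in for whatever matching rule an implementation would actually use.

```python
from typing import List, Optional


def find_item_of_interest(spoken_item_name: str, frame_item_names: List[str]) -> Optional[str]:
    """Match the item name recognised from the user's speech against the
    names of items appearing in the current video frame."""
    spoken = spoken_item_name.lower()
    for name in frame_item_names:
        # loose containment check; e.g. "watch" matches "brand A watch"
        if spoken in name.lower() or name.lower() in spoken:
            return name
    return None  # no match: continue collecting the user's voice information
```

For the example above, find_item_of_interest("watch", ["brand A watch", "brand B clothes", "brand C shoes"]) would return "brand A watch".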
  • Step 204: Based on the identification information of the item of interest, query the push information of the item of interest, and present the push information.
  • the above-mentioned execution subject may, based on the identification information of the item of interest, query the push information of the item of interest, and present the push information.
  • the push information may be a link for the user to browse the detailed information of the item of interest or a link to purchase the item of interest.
  • the push information can be presented on the current video frame, especially in the vicinity of the item of interest in the current video frame. Subsequently, the user can perform corresponding operations based on the push information to view the detailed information of the item of interest or purchase the item of interest.
  • The above-mentioned execution body can query the push information of the item of interest in various ways. For example, when the push information of a large number of items is stored locally, the push information of the item of interest is searched locally. For another example, in the case where the video application integrates a search function or a shopping function, a push information acquisition request is sent to the background server of the video application based on the identification information of the item of interest, and the push information of the item of interest returned by the background server of the video application is received. For another example, based on the identification information of the item of interest, a push information acquisition request is sent to the background server of a search application or shopping application, and the push information of the item of interest returned by the background server of the search application or shopping application is received.
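The following sketch illustrates the query paths described above (a local store first, then a backend server); the endpoint URL, the query parameter and the JSON response shape are illustrative assumptions, not an API defined by the patent.

```python
import json
from urllib import parse, request


def query_push_info(item_name: str, local_store: dict, backend_url: str) -> dict:
    """Look up push information for the item of interest.

    item_name:   name from the item's identification information
    local_store: locally stored push information keyed by item name
    backend_url: hypothetical endpoint of a video, search or shopping backend
    """
    if item_name in local_store:
        # push information for many items is stored locally
        return local_store[item_name]
    # otherwise send a push information acquisition request to a backend server
    query = parse.urlencode({"item": item_name})
    with request.urlopen(f"{backend_url}?{query}") as resp:
        return json.load(resp)
```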
  • In the method provided by this embodiment, code stream conversion is first performed on the video data to obtain the video stream and the identification information of items appearing in the video stream; then the video stream is played on the playback device; next, in response to determining that there is an item of interest of the user in the current video frame of the video stream, the identification information of the item of interest is determined; finally, based on the identification information of the item of interest, the push information of the item of interest is queried and presented.
  • FIG. 3 shows a process 300 of another embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 301: Perform code stream conversion on the video data to obtain the video stream and the identification information of items appearing in the video stream.
  • Step 302: Play the video stream on the playback device.
  • steps 301-302 have been described in detail in steps 201-202 in the embodiment shown in FIG. 2, and are not repeated here.
  • Step 303: Set a trigger area in the video frame where an item appearing in the video stream is located.
  • the execution body of the information push method may set a trigger area in the video frame where the item appears in the video stream.
  • the trigger area can be set in the vicinity of the present item in the video frame.
  • In some embodiments, the identification information includes coordinate information, and the area corresponding to the coordinate information is set as the trigger area. It should be understood that when multiple items appear in a video frame, multiple trigger areas may be set, with one trigger area corresponding to one appearing item.
  • Step 304: In response to detecting that the user confirms a trigger area of the current video frame, determine the appearing item corresponding to the confirmed trigger area as the item of interest.
  • The above-mentioned execution body can detect whether the user confirms a trigger area of the current video frame. If it is detected that the user confirms a trigger area of the current video frame, the appearing item corresponding to the confirmed trigger area is determined as the item of interest; if no such confirmation is detected, the video stream continues to be played and detection continues.
  • the playback device needs to have corresponding hardware or plug-ins to detect the user's operation on the trigger area, while the video stream itself has no monitoring and network connection capabilities.
  • When the playback device has a touch screen, if the user touches the trigger area of the current video frame, it is determined that the user confirms the trigger area.
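A minimal sketch of the touch check, assuming the trigger area is represented as an axis-aligned rectangle in lattice (pixel) coordinates; the patent itself only says that the area corresponding to the coordinate information is used.

```python
def touch_confirms_trigger(touch_xy, trigger_rect) -> bool:
    """Return True if a touch point falls inside the trigger area.

    touch_xy:     (x, y) lattice coordinates of the touch on the screen
    trigger_rect: (x0, y0, x1, y1) bounds of the trigger area in the same coordinates
    """
    x, y = touch_xy
    x0, y0, x1, y1 = trigger_rect
    return x0 <= x <= x1 and y0 <= y <= y1
```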
  • When the playback device has a camera, if it is captured that the focus of the user's eyes falls on the trigger area of the current video frame, it is determined that the user confirms the trigger area.
  • the above-mentioned executive body may analyze the angle of view of the user's eyes in the user image collected by the camera to determine whether the focus of the user's eyes falls on the trigger area.
  • The above-mentioned execution body can first use the camera to emit a light beam toward the user's eyes; then use the photosensitive material on the screen of the playback device to sense the intensity of the light beam reflected from the eyes; and finally determine a dark spot on the screen based on the beam intensity as the focus of the user's eyes.
  • When the light beam hits the pupil of the eye, most of the light beam is absorbed by the pupil, so the intensity of the light reflected onto the screen is lower and a dark spot appears.
  • When the light beam irradiates a part other than the pupil, most of the light beam is reflected onto the screen, the reflected light intensity is higher, and a bright spot appears.
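Purely as an illustration of the dark-spot idea, the sketch below assumes the photosensitive readings are exposed to software as a 2D grid of intensity samples over the screen; that representation, and treating the single darkest cell as the focus, are assumptions of the example rather than details given by the patent.

```python
import numpy as np


def locate_focus(intensity_map: np.ndarray) -> tuple:
    """Return the (x, y) cell with the lowest reflected-beam intensity,
    i.e. the dark spot treated as the focus of the user's eyes."""
    row, col = np.unravel_index(np.argmin(intensity_map), intensity_map.shape)
    return int(col), int(row)


def focus_confirms_trigger(intensity_map: np.ndarray, trigger_rect) -> bool:
    """trigger_rect = (x0, y0, x1, y1) in the same screen-cell coordinates."""
    x, y = locate_focus(intensity_map)
    x0, y0, x1, y1 = trigger_rect
    return x0 <= x <= x1 and y0 <= y <= y1
```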
  • Step 305: Determine the identification information of the item of interest.
  • Step 306: Based on the identification information of the item of interest, query the push information of the item of interest, and present the push information.
  • steps 305-306 have been described in detail in steps 203-204 in the embodiment shown in FIG. 2, and are not repeated here.
  • The process 300 of the information push method in this embodiment highlights the step of determining the user's item of interest. Thus, in the solution described in this embodiment, a trigger area is set in the video frame where the appearing item is located, and the item of interest is determined based on the user's operation on the trigger area, thereby improving the accuracy of determining the item of interest.
  • FIG. 4 shows a process 400 of yet another embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 401: Perform code stream conversion on the video data to obtain the video stream and the identification information of items appearing in the video stream.
  • Step 402: Play the video stream on the playback device.
  • steps 401-402 have been described in detail in steps 301-302 in the embodiment shown in FIG. 3, and are not repeated here.
  • Step 403: Based on the resolution of the playback device and the percentage coordinates of the items appearing in the current video frame, calculate the lattice coordinates of the items appearing in the current video frame.
  • The execution body of the information push method may calculate the lattice coordinates of the items appearing in the current video frame based on the resolution of the playback device and the percentage coordinates of the items appearing in the current video frame.
  • Here, the coordinate information in the identification information is a percentage coordinate.
  • Lattice coordinates are required on the playback device, so the percentage coordinates need to be converted into the corresponding lattice coordinates.
  • Specifically, the above-mentioned execution body can multiply the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the percentage coordinates of the item appearing in the current video frame, correspondingly, to obtain the lattice coordinates of the item appearing in the current video frame.
  • For example, a video stream is played on a playback device with a resolution of A*B. If the percentage coordinates of an appearing item are (x/a, y/b), then the lattice coordinates of the appearing item are (x*A/a, y*B/b), where a, b, A and B are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, x/a and y/b are positive numbers not greater than 1, and x*A/a and y*B/b are positive integers.
  • Here, the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device: both take the upper left corner as the origin, rightward as the positive direction of the horizontal axis, and downward as the positive direction of the vertical axis.
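A small sketch of the conversion used in Steps 403 and 404. Rounding the products to integers and passing an optional convert callback for the case where the coordinate systems differ are choices made for the example; the patent only specifies the corresponding multiplication.

```python
def percent_to_lattice(percent_xy, resolution, convert=None):
    """Convert an item's percentage coordinates into lattice (pixel)
    coordinates on the playback device.

    percent_xy: (x/a, y/b), each value in (0, 1]
    resolution: (A, B) horizontal and vertical pixel values of the device
    convert:    optional mapping applied first when the percentage coordinate
                system differs from the screen coordinate system
    """
    if convert is not None:
        percent_xy = convert(percent_xy)
    px, py = percent_xy
    width, height = resolution
    return round(width * px), round(height * py)


# Example: on a 1920x1080 playback device, percentage coordinates (0.25, 0.5)
# give lattice coordinates (480, 540).
```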
  • Step 404: Set the area corresponding to the lattice coordinates as the trigger area.
  • the above-mentioned execution body may set the area corresponding to the lattice coordinates as the trigger area.
  • Step 405: In response to detecting that the user confirms a trigger area of the current video frame, determine the appearing item corresponding to the confirmed trigger area as the item of interest.
  • Step 406: Determine the identification information of the item of interest.
  • Step 407: Based on the identification information of the item of interest, query the push information of the item of interest, and present the push information.
  • steps 405-407 have been described in detail in steps 304-306 in the embodiment shown in FIG. 3, and are not repeated here.
  • The process 400 of the information push method in this embodiment highlights the step of setting the trigger area. In the solution described in this embodiment, the coordinate information in the identification information is a percentage coordinate, and the corresponding lattice coordinates are obtained through coordinate conversion, thereby adapting to the different screen resolutions of different playback devices.
  • FIG. 5 shows a process of an embodiment of the video processing method according to the present application. The video processing method includes the following steps:
  • Step 501: Perform item identification on the video stream to determine the items appearing in the video stream.
  • the execution body of the video processing method (for example, the device 101 shown in FIG. 1 ) can perform item identification on the video stream, and determine the items appearing in the video stream.
  • The above-mentioned execution body can determine the appearing items of the video stream in various ways. In some embodiments, those skilled in the art can perform item recognition on the video stream and input the recognition result to the above-mentioned execution body. In some embodiments, the above-mentioned execution body may split the video stream into a series of video frames and perform item identification on each video frame to determine the items appearing in the video stream, as sketched below.
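A schematic sketch of the frame-by-frame variant; detect_items is an explicit placeholder for whatever item-recognition model an implementation would use, since the patent does not prescribe one.

```python
def identify_appearing_items(video_frames, detect_items):
    """Run item identification frame by frame and collect appearing items.

    video_frames: iterable of decoded frames
    detect_items: placeholder callable that takes a frame and returns item names
    """
    appearing = {}
    for index, frame in enumerate(video_frames):
        for name in detect_items(frame):
            # remember the frames in which each item appears
            appearing.setdefault(name, []).append(index)
    return appearing
```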
  • Step 502: Acquire the identification information of the appearing item.
  • the above-mentioned execution subject may acquire the identification information of the appearing item.
  • The identification information of the appearing item is unplayable data, which is used to identify the item appearing in the video stream.
  • In some embodiments, the identification information may include coordinate information.
  • In this case, the above-mentioned execution body can perform position recognition on the video stream, determine the coordinate information of the appearing item, and add the coordinate information of the appearing item to the identification information of the appearing item.
  • In some embodiments, the coordinate information may be determined by simulating a pilot broadcast of the video stream on a pilot device. Specifically, a pilot broadcast of the video stream is first simulated on the pilot device; then position recognition is performed on the video stream to obtain the lattice coordinates of the appearing item; finally, the coordinate information of the appearing item is determined based on the lattice coordinates of the appearing item.
  • In some embodiments, the coordinate information may be the lattice coordinates themselves.
  • In some embodiments, the coordinate information in the identification information is a percentage coordinate. Specifically, the horizontal and vertical coordinate values of the lattice coordinates of the appearing item are divided by the horizontal and vertical pixel values of the pilot device's resolution, correspondingly, to obtain the percentage coordinates of the appearing item.
  • For example, if the resolution of the pilot device is a*b and the lattice coordinates of the appearing item are (x, y), the percentage coordinates of the appearing item are (x/a, y/b), where a and b are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, and x/a and y/b are positive numbers not greater than 1.
  • It should be noted that the resolution of the pilot device needs to match the resolution of the video; for example, a 16:9 device is selected for 720p and above, and a 4:3 device below that. In this way, the error can be reduced as much as possible.
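The inverse of the playback-side conversion, as a brief sketch: lattice coordinates measured on the pilot device are divided by the pilot device's resolution to obtain the resolution-independent percentage coordinates.

```python
def lattice_to_percent(lattice_xy, pilot_resolution):
    """Convert lattice coordinates obtained on the pilot device into the
    percentage coordinates stored in the identification information.

    lattice_xy:       (x, y) pixel coordinates of the appearing item
    pilot_resolution: (a, b) horizontal and vertical pixel values of the pilot device
    """
    x, y = lattice_xy
    a, b = pilot_resolution
    return x / a, y / b


# Example: lattice coordinates (480, 540) on a 1920x1080 pilot device give
# percentage coordinates (0.25, 0.5), which any playback device can map back
# to its own lattice coordinates.
```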
  • Step 503: Add the identification information of the appearing item to the corresponding video frame protocol to generate video data.
  • the above-mentioned execution body may add the identification information of the appearing item to the corresponding video frame protocol to generate video data.
  • The identification information can be added to the original video frame protocol by performing code stream encoding processing on the video frame where the item with identification information is located and by transforming the video frame protocol.
  • The transformation methods differ for different protocol formats.
  • Here, the network abstraction layer (NAL) information of the corresponding video frame protocol is extended based on the identification information of the appearing item to support adding the identification information.
  • NAL can include NAL Header, NAL Extension and NAL payload.
  • NAL Header can be used to store basic information of video frames.
  • the NAL payload can be used to store a binary stream of video frames.
  • NAL Extension can be used to store identification information. It should be noted that since the video frame itself is a highly compressed data body, the NAL Extension also needs to have high compression.
  • the same item can appear in multiple consecutive video frames.
  • In this case, the identification information added to the video frame protocol of the frame where the item first appears may be detailed information, including the item name, coordinate information, brief information and/or a web page link, and the corresponding video frame is called a detailed frame;
  • the identification information added to the video frame protocol of frames where the item does not appear for the first time may be abbreviated information, including the item name and coordinate information, and the corresponding video frame is called an abbreviated frame. In this way, space can be saved.
  • On the playback side, the detailed information can be decoded and cached; when an abbreviated frame is played later, if it is detected that there is an item of interest of the user in the current video frame, the cache can be queried with the abbreviated information to obtain the detailed information of the item of interest.
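To make the detailed/abbreviated split concrete on the generating side, the sketch below chooses which identification fields to attach to a frame; the dictionary field names are illustrative, since the patent only lists the kinds of information carried by detailed and abbreviated frames.

```python
def build_identification_info(item: dict, first_appearance: bool) -> dict:
    """Select detailed or abbreviated identification information for an item
    that appears in consecutive video frames."""
    info = {"name": item["name"], "coords": item["coords"]}
    if first_appearance:
        # detailed frame: also carry brief information and/or a web page link
        info["brief"] = item.get("brief")
        info["link"] = item.get("link")
    return info  # to be carried in the frame's NAL Extension
```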
  • FIG. 6 shows a schematic structural diagram of a computer system 600 suitable for implementing a computer device (eg, the device 101 shown in FIG. 1 ) according to an embodiment of the present application.
  • the computer device shown in FIG. 6 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present application.
  • As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603.
  • In the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 610 as needed so that a computer program read therefrom is installed into the storage section 608 as needed.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication portion 609 and/or installed from the removable medium 611 .
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • The computer readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or electronic device.
  • The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the described unit may also be provided in the processor, for example, it may be described as: a processor includes a converting unit, a playing unit, a determining unit and a presenting unit.
  • The names of these units do not constitute a limitation on the units themselves in this case; for example, the conversion unit can also be described as "a unit that performs code stream conversion on video data to obtain the video stream and the identification information of items appearing in the video stream".
  • a processor includes a determination unit, an acquisition unit, and an addition unit.
  • the names of these units do not constitute a limitation of the unit itself, for example, the determination unit may also be described as "a unit for identifying items in a video stream and determining items appearing in a video stream".
  • the present application also provides a computer-readable medium.
  • The computer-readable medium may be included in the computer device described in the above embodiments, or it may exist independently without being assembled into the computer device.
  • The above-mentioned computer-readable medium carries one or more programs. When the one or more programs are executed by the computer device, the computer device is caused to: perform code stream conversion on the video data to obtain the video stream and the identification information of items appearing in the video stream; play the video stream on the playback device; in response to determining that there is an item of interest of the user in the current video frame of the video stream, determine the identification information of the item of interest; and, based on the identification information of the item of interest, query the push information of the item of interest and present the push information. Alternatively, the computer device is caused to: perform item identification on the video stream to determine the items appearing in the video stream; obtain the identification information of the appearing items; and add the identification information of the appearing items to the corresponding video frame protocol to generate video data.

Abstract

An information pushing method, a video processing method, and a device. The information pushing method comprises: performing data rate conversion on video data, so as to obtain a video stream and identification information of an item that appears in the video stream (201); playing the video stream on a playing device (202); in response to determining that there is an item, which a user is concerned about, in the current video frame of the video stream, determining identification information of the item of concern (203); and on the basis of the identification information of the item of concern, querying pushing information of the item of concern, and presenting the pushing information (204). By means of the method, an item of interest to a user is discovered from numerous items that appear in a video stream, and pushing information of said item is automatically presented, such that the requirement of the user for knowing the information of said item in detail is met, so as to purchase said item, thereby saving on operation costs of the user.

Description

Information push and video processing methods and devices
Technical field
The embodiments of the present application relate to the field of computer technology, and in particular to information push and video processing methods and devices.
Background
With the rapid development of the Internet, video applications support increasingly diverse functions, such as live streaming and on-demand playback. As a result, more and more users are attracted to watching videos, and viewing time keeps growing. Various items, such as clothes, decorations and food, often appear in videos. If a user is interested in one of these items, the user has to move the video application to the background, open a search or shopping application, and enter the item name to search before obtaining detailed information about the item.
Summary of the invention
The embodiments of the present application propose information push and video processing methods and devices.
In a first aspect, an embodiment of the present application provides an information push method, including: performing code stream conversion on video data to obtain the video stream and identification information of items appearing in the video stream; playing the video stream on a playback device; in response to determining that an item of interest of the user exists in the current video frame of the video stream, determining the identification information of the item of interest; and, based on the identification information of the item of interest, querying push information of the item of interest and presenting the push information.
In some embodiments, determining that there is an item of interest of the user in the current video frame of the video stream includes: collecting the user's voice information; recognizing the voice information and determining the item name contained in the voice information; and, if the item name matches an item appearing in the current video frame, determining the matched appearing item as the item of interest.
In some embodiments, determining that there is an item of interest of the user in the current video frame of the video stream includes: setting a trigger area in the video frame where an item appearing in the video stream is located; and, in response to detecting that the user confirms a trigger area of the current video frame, determining the appearing item corresponding to the confirmed trigger area as the item of interest.
In some embodiments, the identification information includes coordinate information, and setting the trigger area in the video frame where the appearing item is located includes: setting the area corresponding to the coordinate information as the trigger area.
In some embodiments, the coordinate information is a percentage coordinate, and setting the area corresponding to the coordinate information as the trigger area includes: calculating the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame; and setting the area corresponding to the lattice coordinates as the trigger area.
In some embodiments, calculating the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item includes: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiplying the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the percentage coordinates of the appearing item, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
In some embodiments, calculating the lattice coordinates of the appearing item further includes: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, converting the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiplying the horizontal and vertical pixel values of the playback device's resolution by the horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item, correspondingly, to obtain the lattice coordinates of the appearing item in the current video frame.
In some embodiments, detecting that the user confirms the trigger area of the current video frame includes: if the user touches the trigger area of the current video frame, determining that the user confirms the trigger area.
In some embodiments, detecting that the user confirms the trigger area of the current video frame includes: capturing the focus of the user's eyes; and, in response to determining that the focus is on the trigger area of the current video frame, determining that the user confirms the trigger area.
In some embodiments, capturing the focus of the user's eyes includes: using a camera of the playback device to emit a light beam toward the eyes; using a photosensitive material on the screen of the playback device to sense the intensity of the light beam reflected from the eyes; and determining a dark spot on the screen based on the light beam intensity as the focus.
In a second aspect, an embodiment of the present application provides a video processing method, including: performing item recognition on a video stream to determine the items appearing in the video stream; acquiring identification information of the appearing items; and adding the identification information of the appearing items to the corresponding video frame protocols to generate video data.
In some embodiments, acquiring the identification information of the appearing item includes: performing position recognition on the video stream to determine coordinate information of the appearing item; and adding the coordinate information of the appearing item to the identification information of the appearing item.
In some embodiments, performing position recognition on the video stream to determine the coordinate information of the appearing item includes: simulating trial playback of the video stream on a pilot device; performing position recognition on the video stream to obtain the lattice coordinates of the appearing item; and determining the coordinate information of the appearing item based on the lattice coordinates of the appearing item.
In some embodiments, determining the coordinate information of the appearing item based on the lattice coordinates of the appearing item includes: dividing the horizontal and vertical coordinate values of the lattice coordinates of the appearing item by the corresponding horizontal and vertical pixel values of the resolution of the pilot device to obtain the percentage coordinates of the appearing item.
In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information and/or a web page link, while the identification information added to the video frame protocols of the frames in which the item does not appear for the first time includes the item name and coordinate information.
In some embodiments, adding the identification information of the appearing item to the corresponding video frame protocol includes: extending the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing item.
In a third aspect, an embodiment of the present application provides an information pushing apparatus, including: a conversion unit configured to perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream; a playback unit configured to play the video stream on a playback device; a determining unit configured to determine identification information of an item of interest in response to determining that an item of interest of the user exists in the current video frame of the video stream; and a presenting unit configured to query push information of the item of interest based on the identification information of the item of interest and to present the push information.
In some embodiments, the determining unit is further configured to: collect voice information of the user; recognize the voice information to determine an item name contained in the voice information; and if the item name matches an item appearing in the current video frame, determine the matched appearing item as the item of interest.
In some embodiments, the determining unit includes: a setting subunit configured to set a trigger area in the video frame in which an item appearing in the video stream is located; and a determining subunit configured to determine, in response to detecting that the user confirms a trigger area of the current video frame, the appearing item corresponding to the confirmed trigger area as the item of interest.
In some embodiments, the identification information includes coordinate information; and the setting subunit includes: a setting module configured to set the area corresponding to the coordinate information as the trigger area.
In some embodiments, the coordinate information is percentage coordinates; and the setting module includes: a calculation submodule configured to calculate the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame; and a setting submodule configured to set the area corresponding to the lattice coordinates as the trigger area.
In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiply the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the appearing item in the current video frame to obtain the lattice coordinates of the appearing item in the current video frame.
In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates is different from the screen coordinate system of the playback device, convert the coordinate system of the percentage coordinates to obtain converted percentage coordinates in the screen coordinate system; and multiply the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item in the current video frame to obtain the lattice coordinates of the appearing item in the current video frame.
In some embodiments, the determining subunit is further configured to: if the user touches a trigger area of the current video frame, determine that the user confirms the trigger area.
In some embodiments, the determining subunit includes: a capture module configured to capture the focus of the user's eyes; and a determining module configured to determine that the user confirms the trigger area in response to determining that the focus is on a trigger area of the current video frame.
In some embodiments, the capture module is further configured to: emit a light beam toward the eyes using a camera of the playback device; sense the intensity of the light beam reflected from the eyes using a photosensitive material on the screen of the playback device; and determine a dark spot on the screen based on the beam intensity as the focus.
In a fourth aspect, an embodiment of the present application provides a video processing apparatus, including: a determining unit configured to perform item recognition on a video stream to determine the items appearing in the video stream; an acquiring unit configured to acquire identification information of the appearing items; and an adding unit configured to add the identification information of the appearing items to the corresponding video frame protocols to generate video data.
In some embodiments, the acquiring unit includes: a determining subunit configured to perform position recognition on the video stream to determine coordinate information of the appearing item; and an adding subunit configured to add the coordinate information of the appearing item to the identification information of the appearing item.
In some embodiments, the determining subunit includes: a trial playback module configured to simulate trial playback of the video stream on a pilot device; a recognition module configured to perform position recognition on the video stream to obtain the lattice coordinates of the appearing item; and a determining module configured to determine the coordinate information of the appearing item based on the lattice coordinates of the appearing item.
In some embodiments, the determining module is further configured to: divide the horizontal and vertical coordinate values of the lattice coordinates of the appearing item by the corresponding horizontal and vertical pixel values of the resolution of the pilot device to obtain the percentage coordinates of the appearing item.
In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information and/or a web page link, while the identification information added to the video frame protocols of the frames in which the item does not appear for the first time includes the item name and coordinate information.
In some embodiments, the adding unit is further configured to: extend the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing item.
In a fifth aspect, an embodiment of the present application provides a computer device, including: one or more processors; and a storage apparatus on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, and the computer program, when executed by a processor, implements the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
In the information pushing and video processing methods and devices provided by the embodiments of the present application, code stream conversion is first performed on the video data to obtain a video stream and identification information of the items appearing in the video stream; the video stream is then played on a playback device; next, in response to determining that an item of interest of the user exists in the current video frame of the video stream, identification information of the item of interest is determined; finally, based on the identification information of the item of interest, push information of the item of interest is queried and presented. Items that interest the user are discovered among the many items appearing in the video stream, and their push information is presented automatically, which satisfies the user's need to learn about the items of interest in detail, facilitates purchasing them, and saves the user operating cost.
Description of the drawings
Other features, objects and advantages of the present application will become more apparent by reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is an exemplary system architecture to which the present application may be applied;
Fig. 2 is a flowchart of an embodiment of an information pushing method according to the present application;
Fig. 3 is a flowchart of another embodiment of the information pushing method according to the present application;
Fig. 4 is a flowchart of a further embodiment of the information pushing method according to the present application;
Fig. 5 is a flowchart of an embodiment of a video processing method according to the present application;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a computer device according to the embodiments of the present application.
Detailed description
The present application will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are used only to explain the relevant invention and not to limit that invention. It should also be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings.
It should be noted that, in the case of no conflict, the embodiments of the present application and the features of the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the information pushing and video processing methods of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include devices 101 and 102 and a network 103. The network 103 is the medium used to provide communication links between the devices 101 and 102. The network 103 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The devices 101 and 102 may be hardware devices or software that support network connections to provide various network services. When a device is hardware, it may be any of various electronic devices, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, servers, and so on. In this case, as a hardware device, it may be implemented as a distributed device cluster composed of multiple devices, or as a single device. When a device is software, it may be installed in the electronic devices listed above. In this case, as software, it may be implemented, for example, as multiple pieces of software or software modules for providing distributed services, or as a single piece of software or a single software module. No specific limitation is imposed here.
In practice, a device may provide corresponding network services by installing a corresponding client application or server application. After a client application is installed, the device may act as a client in network communication. Correspondingly, after a server application is installed, it may act as a server in network communication.
As an example, in Fig. 1, the device 101 acts as a client and the device 102 acts as a server. For example, the device 101 may be a client of a video application, and the device 102 may be a server of the video application.
It should be noted that the information pushing method and the video processing method provided by the embodiments of the present application may be executed by the device 101 or the device 102. When the device 101 executes the information pushing method, it may be a playback device. When the device 102 executes the video processing method, it may be a pilot device.
It should be understood that the numbers of networks and devices in Fig. 1 are merely illustrative. There may be any number of networks and devices according to implementation needs.
With continued reference to Fig. 2, a flow 200 of an embodiment of the information pushing method according to the present application is shown. The information pushing method includes the following steps:
Step 201: perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream.
In this embodiment, the execution body of the information pushing method (for example, the device 101 shown in Fig. 1) may acquire video data from a backend server of a video application (for example, the device 102 shown in Fig. 1) and perform code stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
The video data may include the video stream and the identification information of the items appearing in the video stream. The video stream is playable data, including but not limited to TV series, movies, live broadcasts, short videos, and so on. The identification information of the items appearing in the video stream is non-playable data used to identify the items appearing in the video stream, including but not limited to item names, coordinate information, brief information, web page links, and so on. An appearing item may be any item that appears in the video stream, such as clothes, decorations, food, and the like.
Not every video frame in the video stream contains an item, and not every item that appears has identification information. Therefore, code stream encoding is performed only on the video frames in which items with identification information are located: the video frame protocol is modified, and the identification information is added to the original video frame protocol. Video data to which non-playable data has been added cannot be played directly, so code stream conversion needs to be performed on the video data to separate the playable video stream from the non-playable identification information. The code stream conversion may adopt a static transcoding method or a dynamic transcoding method.
It should be noted that when the video frame protocol is modified, the modification differs for different protocol formats. Taking H.264 as an example, adding identification information is supported by extending the NAL (Network Abstraction Layer) information of the video frame protocol. The NAL may include an NAL Header, an NAL Extension and an NAL payload. The NAL Header may be used to store basic information of the video frame. The NAL payload may be used to store the binary stream of the video frame. The NAL Extension may be used to store the identification information. It should also be noted that, since the video frame itself is a highly compressed data body, the NAL Extension likewise needs to be highly compressed.
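Purely as an illustration of the structure just described, the following Python sketch models an extended NAL-like unit in memory. It is not part of the claimed method and does not follow the actual H.264 bitstream syntax; the field names and the JSON-plus-zlib encoding of the identification information are assumptions made for the example.

    # Illustrative sketch only: a hypothetical in-memory layout for an "extended" NAL unit.
    import json
    import zlib
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class ExtendedNalUnit:
        header: bytes                      # basic information of the video frame (NAL Header)
        payload: bytes                     # compressed binary stream of the video frame (NAL payload)
        extension: Optional[bytes] = None  # compressed identification information (NAL Extension)

        @staticmethod
        def compress_identification(info: dict) -> bytes:
            # The extension must stay small, so the identification information is
            # serialized and compressed before being attached to the frame.
            return zlib.compress(json.dumps(info, ensure_ascii=False).encode("utf-8"))

        def identification(self) -> Optional[dict]:
            # Recover the identification information carried by this frame, if any.
            if self.extension is None:
                return None
            return json.loads(zlib.decompress(self.extension).decode("utf-8"))

    # Example: attaching identification information to one frame and reading it back.
    nal = ExtendedNalUnit(
        header=b"\x65",              # placeholder header byte
        payload=b"...frame data...",  # placeholder payload
        extension=ExtendedNalUnit.compress_identification(
            {"name": "watch", "coords": [0.42, 0.37],
             "brief": "brand A watch", "link": "https://example.com/watch"}
        ),
    )
    print(nal.identification())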
In practical applications, the same item may appear in multiple consecutive video frames. For an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information and/or a web page link, and the corresponding video frame is called a detailed frame; the identification information added to the video frame protocols of the frames in which the item does not appear for the first time may be abbreviated information, including the item name and coordinate information, and the corresponding video frame is called an abbreviated frame. In this way, space can be saved. While the playback device plays the video stream, the detailed information can be decoded and cached when a detailed frame is played; when an abbreviated frame is played later, if it is detected that an item of interest of the user exists in the current video frame, the detailed information of the item of interest can be obtained by querying the cache based on the abbreviated information of the item of interest.
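The caching behaviour described above can be sketched minimally as follows; the dictionary-based records and helper functions are hypothetical and stand in for whatever decoding structures a real player would use.

    # Illustrative sketch: cache detailed information from detailed frames and resolve
    # abbreviated frames against that cache. All structures here are hypothetical.
    from typing import Dict, List, Optional

    detail_cache: Dict[str, dict] = {}  # item name -> detailed identification information

    def on_frame_identification(frame_entries: List[dict]) -> None:
        # Called with the identification entries decoded from the current frame's protocol.
        for entry in frame_entries:
            if "brief" in entry or "link" in entry:
                # Entry from a detailed frame: keep the full record for later lookups.
                detail_cache[entry["name"]] = entry

    def lookup_detailed_info(name: str) -> Optional[dict]:
        # On an abbreviated frame, recover the detailed record from the cache by item name.
        return detail_cache.get(name)

    # A detailed frame is decoded first ...
    on_frame_identification([{"name": "watch", "coords": [0.42, 0.37],
                              "brief": "brand A watch", "link": "https://example.com/watch"}])
    # ... later frames only carry the abbreviated information (name and coordinates).
    on_frame_identification([{"name": "watch", "coords": [0.45, 0.35]}])
    print(lookup_detailed_info("watch"))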
Step 202: play the video stream on the playback device.
In this embodiment, the above execution body may play the video stream on the playback device.
Generally, in the case where the above execution body is hardware, it may be a playback device on which a player is installed for playing the video stream.
It should be noted that the playback device usually plays the video stream while performing the code stream conversion. Therefore, during playback of the video stream, the identification information of the items appearing in the video stream can be obtained successively.
Step 203: in response to determining that an item of interest of the user exists in the current video frame of the video stream, determine identification information of the item of interest.
In this embodiment, the above execution body may determine whether an item of interest of the user exists in the current video frame of the video stream. If an item of interest of the user exists, the identification information of the item of interest is determined; if no item of interest of the user exists, the video stream continues to be played.
The user's item of interest may be determined by the above execution body based on the user's reaction while watching the video stream. Generally, a user reacts in a particular way upon seeing an item of interest; for example, when the user's item of interest appears in the video stream, the user may say the name of that item. At this point, the above execution body may collect the user's voice information, recognize the voice information, and determine the item name contained in the voice information. If the item name matches an item appearing in the current video frame, the matched appearing item is determined as the item of interest; if the item name does not match any item appearing in the current video frame, the user's voice information continues to be collected. The current video frame is the video frame currently being played. Multiple items may appear in the same video frame, and only the item that matches the item name contained in the user's voice information is the user's item of interest. For example, the user says "watch", and the items appearing in the current video frame include a watch of brand A, clothes of brand B and shoes of brand C; only the watch of brand A matches "watch" and is the user's item of interest.
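As a hedged sketch of the matching step just described, the example below matches a recognized item name against the items of the current frame. The recognize_speech function is a placeholder for whatever speech-recognition service is actually used, and the containment-based matching rule is an assumption, not the claimed matching logic.

    # Illustrative sketch: match a recognized item name against the items in the current frame.
    from typing import List, Optional

    def recognize_speech(audio: bytes) -> str:
        # Placeholder for a real speech-recognition call; returns a fixed result for the example.
        return "watch"

    def find_item_of_interest(audio: bytes, frame_items: List[dict]) -> Optional[dict]:
        spoken_name = recognize_speech(audio)
        for item in frame_items:
            # Simple containment test; a real matcher might use synonyms or fuzzy matching.
            if spoken_name in item["name"] or item["name"] in spoken_name:
                return item
        return None  # no match: keep collecting voice information

    items = [{"name": "brand A watch"}, {"name": "brand B clothes"}, {"name": "brand C shoes"}]
    print(find_item_of_interest(b"<audio>", items))  # -> the brand A watch entry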
Step 204: query push information of the item of interest based on the identification information of the item of interest, and present the push information.
In this embodiment, the above execution body may query the push information of the item of interest based on the identification information of the item of interest, and present the push information. The push information may be a link for the user to browse detailed information of the item of interest or a link for purchasing the item of interest. Generally, the push information may be presented on the current video frame, in particular near the item of interest in the current video frame. The user may subsequently perform corresponding operations based on the push information to view the detailed information of the item of interest or to purchase the item of interest.
Generally, the above execution body may query the push information of the item of interest in various ways. For example, in the case where push information of a large number of items is stored locally, the push information of the item of interest is looked up locally. For another example, in the case where the video application integrates a search function or a shopping function, a push information acquisition request is sent to the backend server of the video application based on the identification information of the item of interest, and the push information of the item of interest returned by that backend server is received. For yet another example, a push information acquisition request is sent to the backend server of a search application or a shopping application based on the identification information of the item of interest, and the push information of the item of interest returned by that backend server is received.
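One possible ordering of these query paths is sketched below; every helper function is a hypothetical stub (the real lookups would be a local store query and HTTP requests to the respective backends), and the fallback order shown is only an example.

    # Illustrative sketch of the query paths described above; all helpers are hypothetical stubs.
    from typing import Optional

    def local_store_lookup(name: str) -> Optional[dict]:
        return None  # pretend nothing is cached locally

    def request_video_backend(identification: dict) -> Optional[dict]:
        return None  # stand-in for a request to the video application's backend server

    def request_shopping_backend(identification: dict) -> Optional[dict]:
        return {"name": identification["name"], "link": "https://example.com/buy"}  # stub response

    def query_push_info(identification: dict) -> Optional[dict]:
        # Try the local store first, then the video application's backend,
        # and finally a search/shopping application's backend.
        return (local_store_lookup(identification["name"])
                or request_video_backend(identification)
                or request_shopping_backend(identification))

    print(query_push_info({"name": "brand A watch"}))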
In the information pushing method provided by the embodiments of the present application, code stream conversion is first performed on the video data to obtain a video stream and identification information of the items appearing in the video stream; the video stream is then played on a playback device; next, in response to determining that an item of interest of the user exists in the current video frame of the video stream, identification information of the item of interest is determined; finally, based on the identification information of the item of interest, push information of the item of interest is queried and presented. Items that interest the user are discovered among the many items appearing in the video stream, and their push information is presented automatically, which satisfies the user's need to learn about the items of interest in detail, achieves fast pushing of the items of interest, and saves the user operating cost.
With further reference to Fig. 3, a flow 300 of another embodiment of the information pushing method according to the present application is shown. The information pushing method includes the following steps:
Step 301: perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream.
Step 302: play the video stream on the playback device.
In this embodiment, the specific operations of steps 301-302 have been described in detail in steps 201-202 of the embodiment shown in Fig. 2, and are not repeated here.
Step 303: set a trigger area in the video frame in which an item appearing in the video stream is located.
In this embodiment, the execution body of the information pushing method (for example, the device 101 shown in Fig. 1) may set a trigger area in the video frame in which an item appearing in the video stream is located.
Generally, the trigger area may be set near the appearing item in the video frame. For example, in the case where the identification information includes coordinate information, the area corresponding to the coordinate information is set as the trigger area. It should be understood that when multiple items appear in a video frame, multiple trigger areas may be set, with one trigger area corresponding to one appearing item.
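A small sketch of how such a trigger area might be registered and hit-tested is given below; the fixed-size rectangle centred on the item's coordinates, and its half-size of 40 pixels, are assumptions made only for the example.

    # Illustrative sketch: register a rectangular trigger area around an appearing item
    # and test whether a point (for example a touch point) falls inside it.
    from dataclasses import dataclass

    @dataclass
    class TriggerArea:
        item_name: str
        left: int
        top: int
        right: int
        bottom: int

        def contains(self, x: int, y: int) -> bool:
            # True when the point lies inside this trigger area.
            return self.left <= x <= self.right and self.top <= y <= self.bottom

    def make_trigger_area(item_name: str, cx: int, cy: int, half_size: int = 40) -> TriggerArea:
        # half_size of 40 pixels is an arbitrary example value.
        return TriggerArea(item_name, cx - half_size, cy - half_size, cx + half_size, cy + half_size)

    area = make_trigger_area("brand A watch", cx=500, cy=300)
    print(area.contains(520, 310))  # True: a touch here would confirm the watch's trigger area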
Step 304: in response to detecting that the user confirms a trigger area of the current video frame, determine the appearing item corresponding to the confirmed trigger area as the item of interest.
In this embodiment, the above execution body may detect whether the user confirms a trigger area of the current video frame. If it is detected that the user confirms a trigger area of the current video frame, the appearing item corresponding to the confirmed trigger area is determined as the item of interest; if it is not detected that the user confirms a trigger area of the current video frame, the video stream continues to be played and detection continues.
When the user operates on a trigger area, the trigger area may be considered confirmed. The playback device needs to have corresponding hardware or plug-ins to detect the user's operation on the trigger area, since the video stream itself has no monitoring or network connection capability.
In some embodiments, in the case where the playback device has a touch screen, if the user touches a trigger area of the current video frame, it is determined that the user confirms the trigger area.
In some embodiments, in the case where the playback device has a camera, if it is captured that the focus of the user's eyes is on a trigger area of the current video frame, it is determined that the user confirms the trigger area. For example, the above execution body may analyze the viewing angle of the user's eyes in the user image collected by the camera to determine whether the focus of the user's eyes falls on the trigger area. For another example, in the case where the screen of the playback device is covered with a photosensitive material, the above execution body may first emit a light beam toward the user's eyes using the camera; then sense the intensity of the light beam reflected from the eyes using the photosensitive material on the screen of the playback device; and finally determine a dark spot on the screen based on the beam intensity as the focus of the user's eyes. When the light beam irradiates the pupil of the eye, the pupil absorbs most of the beam, so that the intensity of the beam reflected onto the screen is relatively low and a dark spot appears. When the light beam irradiates a part other than the pupil, most of the beam is reflected onto the screen, the reflected beam intensity is relatively high, and a bright spot appears.
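As a rough sketch of the dark-spot idea only, the readout of the photosensitive screen is modelled here as a plain two-dimensional intensity grid; this representation, and treating the minimum-intensity cell as the focus, are assumptions for illustration and not a description of the actual hardware.

    # Illustrative sketch: find the darkest point of a reflected-beam intensity grid and
    # treat it as the focus of the user's eyes.
    from typing import List, Tuple

    def find_focus(intensity_grid: List[List[float]]) -> Tuple[int, int]:
        best = (0, 0)
        best_value = float("inf")
        for row_index, row in enumerate(intensity_grid):
            for col_index, value in enumerate(row):
                if value < best_value:   # lower reflected intensity -> darker spot
                    best_value = value
                    best = (col_index, row_index)
        return best  # (x, y) position of the dark spot on the screen

    grid = [[0.9, 0.8, 0.9],
            [0.7, 0.1, 0.8],   # the 0.1 cell is where the pupil absorbed most of the beam
            [0.9, 0.8, 0.9]]
    print(find_focus(grid))  # -> (1, 1)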
Step 305: determine the identification information of the item of interest.
Step 306: query push information of the item of interest based on the identification information of the item of interest, and present the push information.
In this embodiment, the specific operations of steps 305-306 have been described in detail in steps 203-204 of the embodiment shown in Fig. 2, and are not repeated here.
As can be seen from Fig. 3, compared with the embodiment corresponding to Fig. 2, the flow 300 of the information pushing method in this embodiment highlights the step of determining the user's item of interest. Thus, in the solution described in this embodiment, a trigger area is set in the video frame in which the appearing item is located, and the item of interest is determined based on the user's operation on the trigger area, thereby improving the accuracy of determining the item of interest.
With further reference to Fig. 4, a flow 400 of a further embodiment of the information pushing method according to the present application is shown. The information pushing method includes the following steps:
Step 401: perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream.
Step 402: play the video stream on the playback device.
In this embodiment, the specific operations of steps 401-402 have been described in detail in steps 301-302 of the embodiment shown in Fig. 3, and are not repeated here.
Step 403: calculate the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame.
In this embodiment, in the case where the coordinate information in the identification information is percentage coordinates, the execution body of the information pushing method (for example, the device 101 shown in Fig. 1) may calculate the lattice coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame.
Since different playback devices have different screen resolutions, the coordinate information in the identification information is percentage coordinates in order to adapt to different screen resolutions. However, lattice coordinates are needed to determine the trigger area, so the percentage coordinates need to be converted into the corresponding lattice coordinates. Specifically, the above execution body may multiply the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the appearing item in the current video frame to obtain the lattice coordinates of the appearing item in the current video frame.
For example, a playback device with a resolution of A*B is used to play the video stream. If the percentage coordinates of the appearing item are (x/a, y/b), the lattice coordinates of the appearing item are (x*A/a, y*B/b), where a, b, A and B are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, x/a and y/b are positive numbers not greater than 1, and x*A/a and y*B/b are positive integers.
Generally, the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device: the upper left corner is the origin, rightward is the positive direction of the horizontal axis, and downward is the positive direction of the vertical axis. In this case, the horizontal and vertical pixel values of the resolution of the playback device are directly multiplied by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the appearing item in the current video frame to obtain the lattice coordinates of the appearing item in the current video frame. In special cases, if the coordinate system of the percentage coordinates is different from the screen coordinate system of the playback device, the coordinate system of the percentage coordinates needs to be converted first to obtain converted percentage coordinates in the screen coordinate system; the horizontal and vertical pixel values of the resolution of the playback device are then multiplied by the corresponding horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item in the current video frame to obtain the lattice coordinates of the appearing item in the current video frame.
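The conversion performed in step 403 can be sketched minimally as follows; the coordinate-system conversion branch is shown only as a vertical flip, which is an assumed example of what such a conversion might look like rather than a prescribed one.

    # Illustrative sketch of step 403: convert percentage coordinates to lattice coordinates
    # for the playback device's resolution.
    from typing import Tuple

    def to_lattice(percent_x: float, percent_y: float,
                   width_px: int, height_px: int,
                   same_coordinate_system: bool = True) -> Tuple[int, int]:
        if not same_coordinate_system:
            # Example conversion only: a coordinate system with its origin at the bottom-left
            # is flipped vertically into the screen coordinate system (origin at top-left).
            percent_y = 1.0 - percent_y
        return round(percent_x * width_px), round(percent_y * height_px)

    # A 1920*1080 playback device and an item at percentage coordinates (0.42, 0.37):
    print(to_lattice(0.42, 0.37, 1920, 1080))  # -> (806, 400)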
Step 404: set the area corresponding to the lattice coordinates as the trigger area.
In this embodiment, the above execution body may set the area corresponding to the lattice coordinates as the trigger area.
Step 405: in response to detecting that the user confirms a trigger area of the current video frame, determine the appearing item corresponding to the confirmed trigger area as the item of interest.
Step 406: determine the identification information of the item of interest.
Step 407: query push information of the item of interest based on the identification information of the item of interest, and present the push information.
In this embodiment, the specific operations of steps 405-407 have been described in detail in steps 304-306 of the embodiment shown in Fig. 3, and are not repeated here.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 3, the flow 400 of the information pushing method in this embodiment highlights the step of setting the trigger area. Thus, in the solution described in this embodiment, the coordinate information in the identification information is percentage coordinates, and the corresponding lattice coordinates are obtained through coordinate conversion, so as to adapt to the different screen resolutions of different playback devices.
With continued reference to Fig. 5, a flow 500 of an embodiment of the video processing method according to the present application is shown. The video processing method includes the following steps:
Step 501: perform item recognition on a video stream to determine the items appearing in the video stream.
In this embodiment, the execution body of the video processing method (for example, the device 102 shown in Fig. 1) may perform item recognition on the video stream to determine the items appearing in the video stream.
Generally, the above execution body may determine the items appearing in the video stream in various ways. In some embodiments, a person skilled in the art may perform item recognition on the video stream and input the recognition results to the above execution body. In some embodiments, the above execution body may split the video stream into a series of video frames and perform item recognition on each video frame to determine the items appearing in the video stream.
Step 502: acquire identification information of the appearing items.
In this embodiment, the above execution body may acquire the identification information of the appearing items. The identification information of an appearing item is non-playable data used to identify the item appearing in the video stream.
In some embodiments, the identification information may include coordinate information. Specifically, the above execution body may perform position recognition on the video stream to determine the coordinate information of the appearing item, and add the coordinate information of the appearing item to the identification information of the appearing item. The coordinate information may be determined by simulating trial playback of the video stream on a pilot device. Specifically, trial playback of the video stream is first simulated on the pilot device; then position recognition is performed on the video stream to obtain the lattice coordinates of the appearing item; finally, the coordinate information of the appearing item is determined based on the lattice coordinates of the appearing item.
Generally, in the case where the screen resolutions of most playback devices and of the pilot device are unified, the coordinate information may be lattice coordinates. In practical applications, however, different playback devices have different screen resolutions; in order to adapt to different screen resolutions, the coordinate information in the identification information is percentage coordinates. Specifically, the horizontal and vertical coordinate values of the lattice coordinates of the appearing item are divided by the corresponding horizontal and vertical pixel values of the resolution of the pilot device to obtain the percentage coordinates of the appearing item.
For example, a standard device with a resolution of a*b is used for trial playback of the video stream. If the lattice coordinates of the appearing item captured on the pilot device are (x, y), the percentage coordinates of the appearing item are (x/a, y/b), where a and b are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, and x/a and y/b are positive numbers not greater than 1.
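Conversely to the playback-side conversion of step 403, the pilot-device side can produce the percentage coordinates as in the following sketch; the specific numbers are example values only.

    # Illustrative sketch: derive percentage coordinates from the lattice coordinates captured
    # on the pilot device, so that any playback resolution can later recover pixel positions.
    from typing import Tuple

    def to_percentage(x: int, y: int, pilot_width: int, pilot_height: int) -> Tuple[float, float]:
        return x / pilot_width, y / pilot_height

    # A 1280*720 (16:9) pilot device and an item captured at lattice coordinates (538, 266):
    print(to_percentage(538, 266, 1280, 720))  # -> roughly (0.4203, 0.3694)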
It should be noted that the resolution of the pilot device needs to be selected to match the video resolution, for example 16:9 for 720p and above, and 4:3 below that. In this way, the error can be reduced as much as possible.
Step 503: add the identification information of the appearing items to the corresponding video frame protocols to generate video data.
In this embodiment, the above execution body may add the identification information of the appearing items to the corresponding video frame protocols to generate the video data.
Generally, by performing code stream encoding on the video frames in which items with identification information are located and modifying the video frame protocol, the identification information can be added to the original video frame protocol. When the video frame protocol is modified, the modification differs for different protocol formats. Taking H.264 as an example, the NAL information of the corresponding video frame protocol is extended based on the identification information of the appearing item to support adding the identification information. The NAL may include an NAL Header, an NAL Extension and an NAL payload. The NAL Header may be used to store basic information of the video frame. The NAL payload may be used to store the binary stream of the video frame. The NAL Extension may be used to store the identification information. It should also be noted that, since the video frame itself is a highly compressed data body, the NAL Extension likewise needs to be highly compressed.
In practical applications, the same item may appear in multiple consecutive video frames. For an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information and/or a web page link, and the corresponding video frame is called a detailed frame; the identification information added to the video frame protocols of the frames in which the item does not appear for the first time may be abbreviated information, including the item name and coordinate information, and the corresponding video frame is called an abbreviated frame. In this way, space can be saved. While the playback device plays the video stream, the detailed information can be decoded and cached when a detailed frame is played; when an abbreviated frame is played later, if it is detected that an item of interest of the user exists in the current video frame, the detailed information of the item of interest can be obtained by querying the cache based on the abbreviated information of the item of interest.
In the video processing method provided by the embodiments of the present application, item recognition is first performed on the video stream to determine the items appearing in the video stream; the identification information of the appearing items is then acquired; finally, the identification information of the appearing items is added to the corresponding video frame protocols to generate video data, thereby adding non-playable data to the video stream.
下面参考图6,其示出了适于用来实现本申请实施例的计算机设备(例如图1所示的设备101)的计算机系统600的结构示意图。图6示出的计算机设备仅仅是一个示例,不应对本申请实施例的功能和使用范围带来任何限制。Referring to FIG. 6 below, it shows a schematic structural diagram of a computer system 600 suitable for implementing a computer device (eg, the device 101 shown in FIG. 1 ) according to an embodiment of the present application. The computer device shown in FIG. 6 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present application.
如图6所示,计算机系统600包括中央处理单元(CPU)601,其可以根据存储在只读存储器(ROM)602中的程序或者从存储部分608加载到随机访问存储器(RAM)603中的程序而执行各种适当的动作和处理。在RAM 603中, 还存储有系统600操作所需的各种程序和数据。CPU 601、ROM 602以及RAM603通过总线604彼此相连。输入/输出(I/O)接口605也连接至总线604。As shown in FIG. 6, a computer system 600 includes a central processing unit (CPU) 601, which can be loaded into a random access memory (RAM) 603 according to a program stored in a read only memory (ROM) 602 or a program from a storage section 608 Instead, various appropriate actions and processes are performed. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604 .
以下部件连接至I/O接口605:包括键盘、鼠标等的输入部分606;包括诸如阴极射线管(CRT)、液晶显示器(LCD)等以及扬声器等的输出部分607;包括硬盘等的存储部分608;以及包括诸如LAN卡、调制解调器等的网络接口卡的通信部分609。通信部分609经由诸如因特网的网络执行通信处理。驱动器610也根据需要连接至I/O接口605。可拆卸介质611,诸如磁盘、光盘、磁光盘、半导体存储器等等,根据需要安装在驱动器610上,以便于从其上读出的计算机程序根据需要被安装入存储部分608。The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 608 including a hard disk, etc. ; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 610 as needed so that a computer program read therefrom is installed into the storage section 608 as needed.
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信部分609从网络上被下载和安装,和/或从可拆卸介质611被安装。在该计算机程序被中央处理单元(CPU)601执行时,执行本申请的方法中限定的上述功能。In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 609 and/or installed from the removable medium 611 . When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present application are performed.
需要说明的是,本申请所述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本申请中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本申请中,计算机可读的信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读的信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输 用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:无线、电线、光缆、RF等等,或者上述的任意合适的组合。It should be noted that the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two. The computer readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above. In this application, a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In this application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device . Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
可以以一种或多种程序设计语言或其组合来编写用于执行本申请的操作的计算机程序代码,所述程序设计语言包括面向目标的程序设计语言-诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言-诸如”C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或电子设备上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)-连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。Computer program code for performing the operations of the present application may be written in one or more programming languages, including object-oriented programming languages - such as Java, Smalltalk, C++, but also conventional Procedural programming language - such as "C" language or similar programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or electronic device. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (eg, using an Internet service provider through Internet connection).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order shown in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of this application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a conversion unit, a playing unit, a determination unit, and a presentation unit. The names of these units do not, in some cases, limit the units themselves; for example, the conversion unit may also be described as "a unit that performs bitstream conversion on video data to obtain a video stream and identification information of items appearing in the video stream". As another example, a processor may be described as including a determination unit, an acquisition unit, and an addition unit, where the determination unit may also be described as "a unit that performs item recognition on a video stream and determines the items appearing in the video stream".
As another aspect, this application further provides a computer-readable medium, which may be included in the computer device described in the above embodiments or may exist separately without being assembled into that computer device. The computer-readable medium carries one or more programs which, when executed by the computer device, cause the computer device to: perform bitstream conversion on video data to obtain a video stream and identification information of items appearing in the video stream; play the video stream on a playback device; in response to determining that an item of interest of the user exists in the current video frame of the video stream, determine identification information of the item of interest; and query push information of the item of interest based on the identification information of the item of interest and present the push information. Alternatively, the programs cause the computer device to: perform item recognition on a video stream to determine the items appearing in the video stream; obtain identification information of the appearing items; and add the identification information of the appearing items to the corresponding video frame protocol to generate video data.
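For illustration only, the first of the two program flows just described can be sketched in Python as follows. All names here (ItemRecord, Frame, PUSH_INFO, find_item_of_interest) are hypothetical helpers introduced for the sketch, not structures defined by this application; it is a minimal outline under the assumption that the frame protocol has already been decoded into per-frame item records.

```python
# Minimal sketch of the playback-side flow described above.
# All names (ItemRecord, Frame, PUSH_INFO, find_item_of_interest) are
# illustrative assumptions, not APIs defined by this application.
from dataclasses import dataclass

@dataclass
class ItemRecord:
    name: str       # item name carried in the frame's identification info
    region: tuple   # percentage coordinates (x0, y0, x1, y1), each in 0.0-1.0
    link: str = ""  # optional web link (first occurrence only)

@dataclass
class Frame:
    index: int
    items: list     # identification info decoded from the frame protocol

# Hypothetical push-information store keyed by item name.
PUSH_INFO = {"coat": {"price": "$59", "link": "https://example.com/coat"}}

def find_item_of_interest(frame: Frame, spoken_name: str):
    """Step 3: decide whether the current frame contains an item the user cares about."""
    for item in frame.items:
        if item.name == spoken_name:
            return item
    return None

def present_push_info(item: ItemRecord):
    """Step 4: query and present push information for the item of interest."""
    info = PUSH_INFO.get(item.name)
    if info:
        print(f"[push] {item.name}: {info['price']} -> {info['link']}")

def play(frames: list, spoken_name: str):
    for frame in frames:  # Step 2: play the stream frame by frame
        item = find_item_of_interest(frame, spoken_name)
        if item:
            present_push_info(item)

if __name__ == "__main__":
    frames = [Frame(0, []), Frame(1, [ItemRecord("coat", (0.1, 0.2, 0.4, 0.6))])]
    play(frames, "coat")
```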
The above description is only a preferred embodiment of this application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) this application.

Claims (18)

  1. An information push method, comprising:
    performing bitstream conversion on video data to obtain a video stream and identification information of items appearing in the video stream;
    playing the video stream on a playback device;
    in response to determining that an item of interest of a user exists in a current video frame of the video stream, determining identification information of the item of interest; and
    querying push information of the item of interest based on the identification information of the item of interest, and presenting the push information.
  2. The method according to claim 1, wherein the determining that an item of interest of the user exists in the current video frame of the video stream comprises:
    collecting voice information of the user;
    recognizing the voice information and determining an item name contained in the voice information;
    if the item name matches an item appearing in the current video frame, determining the matched appearing item as the item of interest.
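As an illustration of the matching step in claim 2 only: the sketch below assumes speech capture and recognition are provided by an external recognizer whose output is plain text, and that the current frame's item names are already available; both assumptions and all names are hypothetical.

```python
# Sketch of claim 2's matching step. Speech capture and recognition are assumed
# to be handled elsewhere; `recognized_text` is their text output.
def item_of_interest_from_speech(recognized_text: str, frame_item_names: list):
    """Return the first appearing item whose name occurs in the recognized speech."""
    text = recognized_text.lower()
    for name in frame_item_names:
        if name.lower() in text:  # item name contained in the voice information
            return name
    return None

print(item_of_interest_from_speech("how much is that red coat", ["lamp", "coat"]))  # -> "coat"
```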
  3. The method according to claim 1, wherein the determining that an item of interest of the user exists in the current video frame of the video stream comprises:
    setting a trigger area in the video frame in which an appearing item of the video stream is located;
    in response to detecting that the user confirms a trigger area of the current video frame, determining the appearing item corresponding to the confirmed trigger area as the item of interest.
  4. The method according to claim 3, wherein the identification information comprises coordinate information; and
    the setting a trigger area in the video frame in which the appearing item of the video stream is located comprises:
    setting the area corresponding to the coordinate information as the trigger area.
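For illustration of claims 3-4: once the trigger area is taken to be the region given by the item's coordinate information, confirming it reduces to a point-in-region test. A minimal sketch, assuming the confirmation point and the region are expressed in the same coordinate space:

```python
# Sketch of claims 3-4: the trigger area is simply the region given by the item's
# coordinate information; a user confirmation is a point falling inside it.
def in_trigger_area(point, region):
    """point = (x, y); region = (x0, y0, x1, y1) in the same coordinate space."""
    x, y = point
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1

print(in_trigger_area((120, 300), (100, 250, 400, 600)))  # True -> item becomes the item of interest
```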
  5. The method according to claim 4, wherein the coordinate information is percentage coordinates; and
    the setting the area corresponding to the coordinate information as the trigger area comprises:
    calculating pixel coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame;
    setting the area corresponding to the pixel coordinates as the trigger area.
  6. The method according to claim 5, wherein the calculating pixel coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame comprises:
    if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiplying the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the appearing item in the current video frame to obtain the pixel coordinates of the appearing item in the current video frame.
  7. The method according to claim 6, wherein the calculating pixel coordinates of the appearing item in the current video frame based on the resolution of the playback device and the percentage coordinates of the appearing item in the current video frame further comprises:
    if the coordinate system of the percentage coordinates is different from the screen coordinate system of the playback device, converting the coordinate system of the percentage coordinates to obtain converted percentage coordinates in the screen coordinate system;
    multiplying the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the converted percentage coordinates of the appearing item in the current video frame to obtain the pixel coordinates of the appearing item in the current video frame.
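For illustration of the arithmetic in claims 5-7: percentage coordinates are scaled by the playback device's resolution; if the two coordinate systems differ, the percentages are converted first. The sketch below assumes, purely as an example, that the mismatch is a vertically flipped origin; the actual conversion would depend on the coordinate systems involved.

```python
# Sketch of the arithmetic in claims 5-7: percentage coordinates scaled by the
# playback resolution, with an example coordinate-system conversion when needed.
def to_pixel_coords(percent_xy, resolution, same_coordinate_system=True):
    px, py = percent_xy         # values in [0.0, 1.0]
    width, height = resolution  # e.g. (1920, 1080)
    if not same_coordinate_system:
        # Example conversion only: flip the vertical axis so the origin matches
        # the screen coordinate system; the real conversion is system-dependent.
        py = 1.0 - py
    return round(width * px), round(height * py)

print(to_pixel_coords((0.25, 0.4), (1920, 1080)))         # (480, 432)
print(to_pixel_coords((0.25, 0.4), (1920, 1080), False))  # (480, 648)
```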
  8. The method according to claim 3, wherein the detecting that the user confirms the trigger area of the current video frame comprises:
    if the user touches the trigger area of the current video frame, determining that the user confirms the trigger area.
  9. The method according to claim 3, wherein the detecting that the user confirms the trigger area of the current video frame comprises:
    capturing the focus of the user's eyes;
    in response to determining that the focus is on the trigger area of the current video frame, determining that the user confirms the trigger area.
  10. The method according to claim 9, wherein the capturing the focus of the user's eyes comprises:
    emitting a light beam toward the eyes using a camera of the playback device;
    sensing the intensity of the light beam reflected from the eyes using a photosensitive material on the screen of the playback device;
    determining a dark spot on the screen based on the beam intensity as the focus.
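For illustration of the last step of claim 10 only: given a grid of sensed reflected-beam intensities across the screen, the darkest cell can be taken as the gaze focus. The optical capture itself is hardware-specific and not modelled; the grid and its values below are purely hypothetical.

```python
# Sketch of picking the dark spot (gaze focus) from a grid of intensity readings.
def focus_from_intensity(grid):
    """grid: 2D list of intensity readings; returns (row, col) of the darkest cell."""
    best = None
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if best is None or value < best[0]:
                best = (value, (r, c))
    return best[1]

print(focus_from_intensity([[9, 8, 9], [7, 2, 8], [9, 9, 9]]))  # (1, 1)
```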
  11. A video processing method, comprising:
    performing item recognition on a video stream to determine items appearing in the video stream;
    obtaining identification information of the appearing items;
    adding the identification information of the appearing items to the corresponding video frame protocol to generate video data.
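A minimal sketch of the pipeline in claim 11, under two stated assumptions: `detect_items` is a placeholder standing in for any object-recognition model, and the "video frame protocol" is represented as a plain per-frame dictionary rather than an actual bitstream structure.

```python
# Sketch of claim 11's pipeline with a placeholder item recognizer.
def detect_items(frame_pixels):
    """Placeholder recognizer; a real system would run an object-detection model."""
    return [{"name": "coat"}] if frame_pixels else []

def process_stream(frames):
    video_data = []
    for index, pixels in enumerate(frames):
        items = detect_items(pixels)                              # item recognition
        identification = [{"name": it["name"]} for it in items]  # identification info
        video_data.append({"frame": index, "items": identification})  # added per frame
    return video_data

print(process_stream([None, [0, 1, 2]]))
```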
  12. The method according to claim 11, wherein the obtaining identification information of the appearing items comprises:
    performing position recognition on the video stream to determine coordinate information of the appearing items;
    adding the coordinate information of the appearing items to the identification information of the appearing items.
  13. The method according to claim 12, wherein the performing position recognition on the video stream to determine the coordinate information of the appearing items comprises:
    simulating trial playback of the video stream on a trial-playback device;
    performing position recognition on the video stream to obtain pixel coordinates of the appearing items;
    determining the coordinate information of the appearing items based on the pixel coordinates of the appearing items.
  14. The method according to claim 13, wherein the determining the coordinate information of the appearing items based on the pixel coordinates of the appearing items comprises:
    dividing the horizontal and vertical coordinate values of the pixel coordinates of the appearing items by the corresponding horizontal and vertical pixel values of the resolution of the trial-playback device to obtain percentage coordinates of the appearing items.
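For illustration of claim 14's arithmetic: pixel coordinates found on the trial-playback device are divided by that device's resolution to give resolution-independent percentage coordinates (the inverse of the scaling sketched after claim 7). The resolution values are example assumptions.

```python
# Sketch of claim 14: pixel coordinates -> percentage coordinates.
def to_percentage_coords(pixel_xy, trial_resolution):
    x, y = pixel_xy
    width, height = trial_resolution
    return x / width, y / height

print(to_percentage_coords((480, 432), (1920, 1080)))  # (0.25, 0.4)
```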
  15. The method according to claim 12, wherein, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information and/or a web link, and the identification information added to the video frame protocols of subsequent frames includes the item name and coordinate information.
  16. The method according to any one of claims 11-15, wherein the adding the identification information of the appearing items to the corresponding video frame protocol comprises:
    extending network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing items.
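For illustration of claims 15-16 only: a full identification record is attached to the frame in which an item first appears, while later frames of the same appearance carry only the name and coordinates. The record layout is an illustrative assumption, not a codec specification, and how such records would be packed into extended network abstraction layer units is not modelled here; continuity tracking is also simplified (any later occurrence is treated as non-first).

```python
# Sketch of claims 15-16: full record on first appearance, name + coordinates afterwards.
def build_frame_records(frames_with_items):
    seen = set()
    annotated = []
    for index, items in enumerate(frames_with_items):
        records = []
        for item in items:
            if item["name"] not in seen:  # first occurrence: full record
                seen.add(item["name"])
                records.append({"name": item["name"], "coords": item["coords"],
                                "brief": item.get("brief", ""), "link": item.get("link", "")})
            else:                          # subsequent occurrence: name + coordinates only
                records.append({"name": item["name"], "coords": item["coords"]})
        annotated.append({"frame": index, "records": records})
    return annotated

frames = [[{"name": "coat", "coords": (0.1, 0.2, 0.4, 0.6), "link": "https://example.com/coat"}],
          [{"name": "coat", "coords": (0.12, 0.2, 0.42, 0.6)}]]
print(build_frame_records(frames))
```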
  17. A computer device, comprising:
    one or more processors;
    a storage apparatus on which one or more programs are stored;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-10 or the method according to any one of claims 11-16.
  18. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-10 or the method according to any one of claims 11-16.
PCT/CN2021/104450 2020-08-05 2021-07-05 Information pushing method, video processing method, and device WO2022028177A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010777098.7 2020-08-05
CN202010777098.7A CN111859158A (en) 2020-08-05 2020-08-05 Information pushing method, video processing method and equipment

Publications (1)

Publication Number Publication Date
WO2022028177A1 true WO2022028177A1 (en) 2022-02-10

Family

ID=72971071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104450 WO2022028177A1 (en) 2020-08-05 2021-07-05 Information pushing method, video processing method, and device

Country Status (2)

Country Link
CN (1) CN111859158A (en)
WO (1) WO2022028177A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859158A (en) * 2020-08-05 2020-10-30 上海连尚网络科技有限公司 Information pushing method, video processing method and equipment
CN115334346A (en) * 2022-08-08 2022-11-11 北京达佳互联信息技术有限公司 Interface display method, video publishing method, video editing method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013040904A1 (en) * 2011-09-22 2013-03-28 中兴通讯股份有限公司 Advertisement processing method and terminal
CN105100944A (en) * 2014-04-30 2015-11-25 广州市动景计算机科技有限公司 Article information outputting method and device
CN107704076A (en) * 2017-09-01 2018-02-16 广景视睿科技(深圳)有限公司 A kind of trend projected objects display systems and its method
CN110288400A (en) * 2019-06-25 2019-09-27 联想(北京)有限公司 Information processing method, information processing unit and information processing system
CN111859158A (en) * 2020-08-05 2020-10-30 上海连尚网络科技有限公司 Information pushing method, video processing method and equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591553A (en) * 2011-01-13 2012-07-18 京宏科技股份有限公司 Method, system and device for video interaction and device and method for generating video related volume labels
US20150193446A1 (en) * 2014-01-07 2015-07-09 Microsoft Corporation Point(s) of interest exposure through visual interface
CN109120954B (en) * 2018-09-30 2021-09-07 武汉斗鱼网络科技有限公司 Video message pushing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111859158A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN112261424B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
US20210136455A1 (en) Communication apparatus, communication control method, and computer program
CN107846561B (en) Method and system for determining and displaying contextually targeted content
WO2022028177A1 (en) Information pushing method, video processing method, and device
US9854232B2 (en) Systems and methods for picture quality monitoring
CN111523566A (en) Target video clip positioning method and device
CN111095939B (en) Identifying previously streamed portions of media items to avoid repeated playback
US20090129755A1 (en) Method and Apparatus for Generation, Distribution and Display of Interactive Video Content
US11321946B2 (en) Content entity recognition within digital video data for dynamic content generation
KR20190050864A (en) System and method for recognition of items in media data and delivery of information related thereto
US10999640B2 (en) Automatic embedding of information associated with video content
CN108235004B (en) Video playing performance test method, device and system
US20040250297A1 (en) Method, apparatus and system for providing access to product data
US20230291772A1 (en) Filtering video content items
JP2006285654A (en) Article information retrieval system
WO2022012273A1 (en) Method for item price comparison, and device
CN109241344B (en) Method and apparatus for processing information
CN114374853A (en) Content display method and device, computer equipment and storage medium
US11531700B2 (en) Tagging an image with audio-related metadata
JP2023522092A (en) INTERACTION RECORD GENERATING METHOD, APPARATUS, DEVICE AND MEDIUM
CN113298589A (en) Commodity information processing method and device, and information acquisition method and device
CN114143568B (en) Method and device for determining augmented reality live image
US11700285B2 (en) Filtering video content items
WO2023098576A1 (en) Image processing method and apparatus, device, and medium
CN111859159A (en) Information pushing method, video processing method and equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21852709

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.07.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21852709

Country of ref document: EP

Kind code of ref document: A1