CN110909209B - Live video searching method and device, equipment, server and storage medium - Google Patents

Live video searching method and device, equipment, server and storage medium

Info

Publication number
CN110909209B
CN110909209B (granted publication of application CN201911175053A)
Authority
CN
China
Prior art keywords
live
image
videos
picture
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911175053.6A
Other languages
Chinese (zh)
Other versions
CN110909209A (en)
Inventor
Du Hui (杜辉)
Jia Hongyi (贾弘毅)
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201911175053.6A
Publication of CN110909209A
Application granted
Publication of CN110909209B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval using metadata automatically derived from the content
    • G06F16/7837: Retrieval using objects detected or recognised in the video content
    • G06F16/7847: Retrieval using low-level visual features of the video content
    • G06F16/785: Retrieval using low-level visual features, namely colour or luminescence
    • G06F16/7854: Retrieval using low-level visual features, namely shape
    • G06F16/7857: Retrieval using low-level visual features, namely texture

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the application provides a live video searching method and device, equipment, a server and a storage medium, relates to the technical field of information communication, and can search for a live video related to a target picture. The specific scheme comprises the following steps: acquiring a target picture for searching a live video; identifying the target picture to obtain first picture information of the target picture, wherein the first picture information comprises first label information used for representing the type of the target picture; extracting first image features from the target picture, wherein the first image features are used for representing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture; sending the first picture information and the first image features to a server to request the server to search for K live videos matched with the target picture, wherein K ≥ 1 and K is a positive integer; and receiving a search result comprising the K live videos from the server, and displaying the K live videos.

Description

Live video searching method and device, equipment, server and storage medium
Technical Field
The embodiment of the application relates to the technical field of information communication, in particular to a live video searching method and device, equipment, a server and a storage medium.
Background
In the existing technical field of internet search, various application programs provide search services for users. For example, a video application can receive keywords input by a user and return search results, which are mostly videos, through text search. For another example, a search application such as Baidu or Google can receive a keyword input by a user and return search results through keyword search.
In recent years, with the development of multimedia networks, network operators have provided various network live platforms for users. In a live application, a user may search for relevant live broadcasts by text search. However, when a user is interested in a picture and wishes to search for live videos related to that picture, live applications generally do not support picture search. The user must enter text related to the content of the picture in order to search for related live videos, but the text input by the user may not accurately express the picture content. As a result, live videos related to the picture cannot be accurately found, and the matching degree between the searched live videos and the picture content is low.
Disclosure of Invention
The embodiment of the application provides a live video searching method and device, equipment, a server and a storage medium, providing a search mode capable of searching for live videos related to a target picture and improving the matching degree between the searched live videos and the target picture.
In order to achieve the technical purpose, the following technical scheme is adopted in the application:
in a first aspect, an embodiment of the present application provides a method for searching a live video, which is applied to a user equipment, and the method may include:
acquiring a target picture for searching a live video; identifying the target picture to obtain first picture information of the target picture, wherein the first picture information comprises first label information used for representing the type of the target picture; extracting first image features from the target picture, wherein the first image features are used for representing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture; sending the first picture information and the first image features to a server to request the server to search for K live videos matched with the target picture according to the first picture information and the first image features, wherein K ≥ 1 and K is a positive integer; and receiving a search result from the server, wherein the search result comprises the K live videos, and displaying the K live videos.
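As a concrete illustration of the feature-extraction step above, a minimal sketch of a first image feature is a normalized per-channel color histogram. The histogram form, bin count, and function names here are assumptions of this sketch, not part of the claimed method:

```python
import numpy as np

def extract_first_image_features(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Compute a simple color feature for a target picture: a per-channel
    histogram, normalized so that features from pictures of different sizes
    are comparable. `image` is an H x W x 3 uint8 RGB array."""
    feats = []
    for ch in range(3):
        hist, _ = np.histogram(image[:, :, ch], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())  # normalize each channel histogram
    return np.concatenate(feats)         # feature vector of length 3 * bins

# Example: a solid red picture concentrates all mass in the top red bin.
red = np.zeros((32, 32, 3), dtype=np.uint8)
red[:, :, 0] = 255
vec = extract_first_image_features(red)
```

A texture or shape feature (the other options named in the claim) would replace the histogram with, e.g., edge or gradient statistics, but would be sent to the server in the same way.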
In a possible implementation, when the text information is included in the target picture, the first picture information may further include the text information in the target picture.
In another possible implementation, the acquiring of a target picture for searching for a live video includes: in response to a screenshot instruction for the image of a target area, capturing the image of the target area to obtain the target picture.
In another possible embodiment, K ≥ 2; the search result further comprises a score of each of the K live videos, and the score of a live video is used for representing the matching degree between the live video and the target picture. Displaying the K live videos comprises: displaying the K live videos in order of their scores from high to low.
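The display rule in this embodiment amounts to a descending sort of the search result by score; a minimal sketch, with hypothetical (video_id, score) pairs:

```python
def order_search_results(results):
    """Order live-video search results for display, highest score first.
    Each result is a (video_id, score) pair, where the score represents
    the degree of match between the live video and the target picture."""
    return sorted(results, key=lambda r: r[1], reverse=True)

results = [("live_a", 0.62), ("live_b", 0.91), ("live_c", 0.75)]
ordered = order_search_results(results)  # live_b first, live_a last
```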
In a second aspect, the present application further provides a method for searching a live video, which is applied to a server, and the method may include: receiving first picture information and first image features of a target picture sent by user equipment; the first picture information comprises first label information used for representing the type of the target picture; the first image features are used for characterizing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture.
Comparing the first picture information with second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M ≥ 1 and M is a positive integer; the second picture information of the M first live videos matches the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video. Comparing the first image features with second image features of the plurality of live videos to determine N second live videos, wherein N ≥ 1 and N is a positive integer; the second image features of the N second live videos match the first image features; the second image features of a second live video are used for representing at least one of color features, texture features, shape features and spatial relationship features of the images in the second live video.
Obtaining scores of the M first live videos according to the matching degree between the second picture information of the M first live videos and the first picture information; obtaining scores of the N second live videos according to the matching degree between the second image features of the N second live videos and the first image features; the score is used for representing the matching degree between a live video and the target picture. Selecting K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos, wherein K ≥ 1 and K is a positive integer; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos. Sending a search result to the user equipment, wherein the search result comprises the K live videos.
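The two matching passes described above (label comparison yielding the M first live videos, feature comparison yielding the N second live videos) can be sketched as follows. The cosine-similarity measure, the threshold, and the in-memory catalog are illustrative assumptions of this sketch, not the patented implementation:

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity between two image-feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def find_candidates(first_label, first_features, catalog, sim_threshold=0.8):
    """Run the two matching passes over the stored live videos:
    the label pass yields the M first live videos (second label information
    equals the first label information); the feature pass yields the N second
    live videos (feature similarity above a threshold). `catalog` maps a
    video id to its stored (second_label, second_features) entry."""
    label_matches = {}    # the M first live videos, scored by label match
    feature_matches = {}  # the N second live videos, scored by similarity
    for vid, (label, features) in catalog.items():
        if label == first_label:
            label_matches[vid] = 1.0
        sim = cosine_similarity(first_features, np.asarray(features))
        if sim >= sim_threshold:
            feature_matches[vid] = sim
    return label_matches, feature_matches

# Hypothetical stored catalog of live videos and a "music"-labelled query.
catalog = {
    "live_a": ("game", [1.0, 0.0]),
    "live_b": ("music", [0.9, 0.1]),
    "live_c": ("music", [0.0, 1.0]),
}
labels, feats = find_candidates("music", np.array([1.0, 0.0]), catalog)
```

Note that a video can appear in both candidate sets (here `live_b`), which is why the scoring step that follows must combine the two sets.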
In a possible implementation manner, the obtaining of the scores of the M first live videos according to the matching degree between their second picture information and the first picture information, and of the scores of the N second live videos according to the matching degree between their second image features and the first image features, comprises: scoring the M first live videos respectively according to the matching degree between their second picture information and the first picture information, and multiplying the scores of the M first live videos by a first preset coefficient to obtain the scores of the M first live videos, wherein the score is used for representing the matching degree between the second picture information of a first live video and the first picture information; scoring the N second live videos respectively according to the matching degree between their second image features and the first image features, and multiplying the scores of the N second live videos by a second preset coefficient to obtain the scores of the N second live videos, wherein the score is used for representing the matching degree between the second image features of a second live video and the first image features. The K live videos are then selected from the M first live videos and the N second live videos in descending order of their scores; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos.
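The weighted-scoring step above can be sketched as follows. The coefficient values, and the choice to sum the two weighted scores when a video appears in both candidate sets, are assumptions of this sketch:

```python
def rank_candidates(label_scores, feature_scores,
                    first_coefficient=0.6, second_coefficient=0.4, k=2):
    """Multiply the label-match scores of the M first live videos by a first
    preset coefficient and the feature-match scores of the N second live
    videos by a second preset coefficient, then keep the K highest-scoring
    videos. Summing the weighted scores of a video present in both candidate
    sets is an assumption; the coefficient values are illustrative."""
    combined = {}
    for vid, s in label_scores.items():
        combined[vid] = combined.get(vid, 0.0) + first_coefficient * s
    for vid, s in feature_scores.items():
        combined[vid] = combined.get(vid, 0.0) + second_coefficient * s
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:k]

# live_b appears in both sets, so its two weighted scores are summed.
top = rank_candidates({"live_a": 0.9, "live_b": 0.5}, {"live_b": 0.8, "live_c": 0.7})
```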
In another possible implementation, when the target picture includes text information, the first picture information further includes the text information in the target picture; when the first live video comprises the text information, the second picture information of the first live video also comprises the text information in the first live video.
In another possible embodiment, the server stores second picture information of a plurality of live videos in a message queue.
In another possible implementation, before receiving the first picture information and the first image features of the target picture sent by the user equipment, the method further includes: in response to an uploading instruction for a third live video, identifying each frame image of the third live video and determining a plurality of key frame images of the third live video, wherein the image difference between a key frame image and its previous frame image is greater than a preset difference threshold. Performing, for each of the plurality of key frame images: identifying the key frame image to obtain third picture information of the key frame image, and extracting image features of the key frame image from the key frame image; the third picture information comprises label information of the key frame image, the label information being used for representing the type of the key frame image; when the key frame image comprises text information, the third picture information further comprises the text information in the key frame image; the image features of the key frame image are used to characterize at least one of color features, texture features, shape features and spatial relationship features of the key frame image. Counting the third picture information of the plurality of key frame images to obtain the second picture information of the third live video, and counting the image features of the plurality of key frame images to obtain the second image features of the third live video. Saving the second picture information and the second image features of the third live video in the message queue, wherein the third live video is included in the plurality of live videos.
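The key-frame criterion above (image difference from the previous frame greater than a preset difference threshold) can be sketched with a mean absolute pixel difference; the difference measure and the threshold value are illustrative assumptions:

```python
import numpy as np

def find_key_frames(frames, diff_threshold=10.0):
    """Flag a frame as a key frame when its mean absolute pixel difference
    from the previous frame exceeds a preset difference threshold.
    `frames` is a list of equally shaped uint8 arrays."""
    key_indices = []
    for i in range(1, len(frames)):
        # work in a signed type so the uint8 subtraction cannot wrap around
        diff = np.mean(np.abs(frames[i].astype(np.int16) - frames[i - 1].astype(np.int16)))
        if diff > diff_threshold:
            key_indices.append(i)
    return key_indices

# A static scene followed by an abrupt scene change: only the change is a key frame.
frames = [np.zeros((4, 4), np.uint8), np.zeros((4, 4), np.uint8),
          np.full((4, 4), 255, np.uint8)]
keys = find_key_frames(frames)
```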
In another possible implementation manner, the second picture information of the third live video further includes tag information of the third live video, the tag information being used for representing the type of the third live video. Before saving the second picture information and the second image features of the third live video in the message queue, the method further comprises: in response to an uploading instruction for the live video, extracting a video segment of a second preset duration at intervals of a first preset duration in the live video; identifying label information of a first image in each extracted video segment, wherein the label information of the first image is used for representing the type of the first image, and the first image is the first frame image in the video segment or any frame image in the video segment; and counting the label information of the first images in the video segments to obtain the tag information of the third live video.
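The segment-sampling and tag-counting steps above can be sketched as follows; the majority-vote statistic and the helper names are assumptions of this sketch:

```python
from collections import Counter

def segment_start_times(duration_s, first_preset_s, second_preset_s):
    """Start times of the extracted segments: a segment of the second preset
    duration is taken at intervals of the first preset duration, for as long
    as a whole segment still fits inside the video."""
    return [t for t in range(0, duration_s, first_preset_s)
            if t + second_preset_s <= duration_s]

def video_tag_from_segments(segment_tags):
    """Derive the tag information of the whole live video from the tags
    recognised on the first image of each segment; a simple majority vote
    is assumed here as the counting statistic."""
    tag, _ = Counter(segment_tags).most_common(1)[0]
    return tag

# A 60 s video sampled every 20 s with 5 s segments, then tagged by vote.
starts = segment_start_times(60, 20, 5)          # segments at 0 s, 20 s, 40 s
video_tag = video_tag_from_segments(["game", "game", "music"])
```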
In another possible implementation, the search result further includes a score for each of the K live videos.
In a third aspect, the present application further provides a method for searching a live video, which can be applied to a server, and includes: receiving a target picture from user equipment; wherein the target picture is used for searching the live video. And identifying the target picture to obtain first picture information of the target picture, wherein the first picture information comprises first label information, and the first label information is used for representing the type of the target picture. And extracting first image features from the target picture, wherein the first image features are used for characterizing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture.
Comparing the first picture information with second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M ≥ 1 and M is a positive integer; the second picture information of the M first live videos matches the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video. Comparing the first image features with second image features of the plurality of live videos to determine N second live videos, wherein N ≥ 1 and N is a positive integer; the second image features of the N second live videos match the first image features; the second image features of a second live video are used for representing at least one of color features, texture features, shape features and spatial relationship features of the images in the second live video.
Obtaining scores of the M first live videos according to the matching degree between the second picture information of the M first live videos and the first picture information; obtaining scores of the N second live videos according to the matching degree between the second image features of the N second live videos and the first image features; the score is used for representing the matching degree between a live video and the target picture. Selecting K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos, wherein K ≥ 1 and K is a positive integer; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos. Sending a search result to the user equipment, wherein the search result comprises the K live videos.
With the live video searching method, the server obtains the target picture, identifies it, and obtains the first picture information of the target picture. Compared with the user equipment, the server has higher computing capability: it identifies the target picture, searches out a plurality of live videos matched with the target picture, and sends a search result comprising K live videos to the user equipment. Compared with text input by the user that relates to the picture content, in the embodiment of the application the first picture information obtained by identifying the target picture and the first image features extracted from the target picture have a higher degree of association with the picture content of the target picture and can better embody that content.
In a fourth aspect, the present application further provides a search apparatus for live video, including: an acquisition module configured to acquire a target picture for searching live video; a first identification module configured to identify the target picture acquired by the acquisition module to obtain first picture information of the target picture, the first picture information comprising first label information used for representing the type of the target picture; an extraction module configured to extract first image features from the target picture acquired by the acquisition module, the first image features being used for representing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture; a first sending module configured to send the first picture information obtained by the first identification module and the first image features obtained by the extraction module to the server so as to request the server to search for K live videos matched with the target picture according to the first picture information and the first image features, wherein K ≥ 1 and K is a positive integer; and a first receiving module configured to receive a search result from the server, the search result comprising the K live videos, and to display the K live videos.
In one possible implementation, when the text information is included in the target picture, the first picture information further includes the text information in the target picture.
In another possible implementation manner, the obtaining module is specifically configured to intercept the image of the target area to obtain the target picture in response to a screenshot instruction for intercepting the image of the target area.
In another possible embodiment, K ≥ 2; the search result further comprises a score of each of the K live videos, the score of a live video being used for representing the matching degree between the live video and the target picture; the first receiving module is further configured to display the K live videos in order of their scores from high to low.
In a fifth aspect, the present application further provides a device for searching for live videos, including: the second receiving module is configured to receive first picture information and first image characteristics of a target picture sent by user equipment; the first picture information comprises first label information used for representing the type of the target picture; the first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture.
The first matching module is configured to compare the first picture information obtained by the second receiving module with second picture information of a plurality of live videos stored in the live video search apparatus to determine M first live videos, wherein M ≥ 1 and M is a positive integer; the second picture information of the M first live videos matches the first picture information; and the second picture information of a first live video comprises second label information for representing the type of the first live video.
The second matching module is configured to compare the first image features obtained by the second receiving module with second image features of the plurality of live videos to determine N second live videos, wherein N ≥ 1 and N is a positive integer; the second image features of the N second live videos match the first image features; and the second image features of a second live video are used for representing at least one of color features, texture features, shape features and spatial relationship features of the images in the second live video.
The first scoring module is configured to obtain scores of the M first live videos according to the matching degree between the second picture information of the M first live videos determined by the first matching module and the first picture information, and to obtain scores of the N second live videos according to the matching degree between the second image features of the N second live videos determined by the second matching module and the first image features; the score is used for representing the matching degree between a live video and the target picture.
The first selection module is configured to select K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos obtained by the first scoring module, wherein K ≥ 1 and K is a positive integer; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos.
And the second sending module is configured to send the search result to the user equipment, wherein the search result comprises K live videos.
In a possible implementation manner, the first scoring module is specifically configured to: score the M first live videos according to the matching degree between their second picture information and the first picture information, and multiply the scores of the M first live videos by a first preset coefficient to obtain the scores of the M first live videos, the score being used for representing the matching degree between the second picture information of a first live video and the first picture information of the target picture; score the N second live videos respectively according to the matching degree between their second image features and the first image features, and multiply the scores of the N second live videos by a second preset coefficient to obtain the scores of the N second live videos, the score being used for representing the matching degree between the second image features of a second live video and the first image features of the target picture; and select K live videos from the M first live videos and the N second live videos in descending order of their scores, the scores of the K live videos being higher than the scores of the other live videos among the M first live videos and the N second live videos.
In another possible implementation manner, when the target picture includes text information, the first picture information further includes text information in the target picture; when the first live video comprises the text information, the second picture information of the first live video also comprises the text information in the first live video.
In another possible embodiment, the live video search apparatus saves the second picture information of the plurality of live videos in a message queue.
In another possible embodiment, the apparatus for searching for live video further includes: the determining module is configured to respond to an uploading instruction of the third live video, identify each frame image of the third live video, and determine a plurality of key frame images of the third live video; the image difference between the key frame image and the previous frame image of the key frame image is greater than a preset difference threshold value.
A second identification module configured to perform, for each of the plurality of key frame images determined by the determination module: identifying the key frame image to obtain third picture information of the key frame image; extracting image characteristics of the key frame images from the key frame images; the third picture information comprises label information of the key frame image, and the label information of the key frame image is used for representing the type of the key frame image; when the key frame image comprises the text information, the third picture information also comprises the text information in the key frame image; the image features of the key frame image are used to characterize at least one of color features, texture features, shape features, and spatial relationship features of the key frame image.
The statistical module is configured to count the third picture information of the plurality of key frame images identified by the second identification module to obtain the second picture information of the third live video, and to count the image features of the plurality of key frame images identified by the second identification module to obtain the second image features of the third live video.
The saving module is configured to save second picture information and second image characteristics of the third live video in the message queue; wherein the third live video is contained in the plurality of live videos.
In another possible implementation manner, the response module is further configured to extract a video segment with a second preset time length at intervals of a first preset time length in the live video in response to an uploading instruction of the live video;
the second identification module is also configured to identify the label information of the first image in each extracted video segment, wherein the label information of the first image is used for representing the type of the first image, and the first image is the first frame image in the video segment or any frame image in the video segment; and the statistical module is also configured to count the label information of the first image in each video segment to obtain the label information of the third live video.
In another possible implementation, the search result further includes a score for each of the K live videos.
In a sixth aspect, the present application further provides a device for searching for live video, including:
a third receiving module configured to receive a target picture from a user equipment; wherein the target picture is used for searching the live video. Identifying a target picture to obtain first picture information of the target picture, wherein the first picture information comprises first label information, and the first label information is used for representing the type of the target picture. And extracting first image features from the target picture, wherein the first image features are used for characterizing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture.
The third matching module is configured to compare the first picture information obtained by the third receiving module with the second picture information of the plurality of live videos stored in the server to determine M first live videos, wherein M ≥ 1 and M is a positive integer; the second picture information of the M first live videos matches the first picture information; and the second picture information of a first live video comprises second label information used for representing the type of the first live video.
The fourth matching module is configured to compare the first image features obtained by the third receiving module with the second image features of the plurality of live videos to determine N second live videos, wherein N ≥ 1 and N is a positive integer; the second image features of the N second live videos match the first image features; and the second image features of a second live video are used for representing at least one of the color features, texture features, shape features and spatial relationship features of the images in the second live video.
The second scoring module is configured to obtain the scores of the M first live videos according to the matching degree between the second picture information of the M first live videos determined by the third matching module and the first picture information, and to obtain the scores of the N second live videos according to the matching degree between the second image features of the N second live videos determined by the fourth matching module and the first image features; the score is used for representing the matching degree between a live video and the target picture.
The second selection module is configured to select K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos obtained by the second scoring module, wherein K ≥ 1 and K is a positive integer; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos.
And the third sending module is configured to send the search result to the user equipment, wherein the search result comprises K live videos.
In a seventh aspect, the present application further provides a user equipment, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method of the first aspect and any of its possible embodiments described above.
In an eighth aspect, the present application further provides a server, including a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method of the second or third aspect and any possible implementation thereof.
In a ninth aspect, the present application further provides a computer-readable storage medium having stored thereon computer instructions for implementing the method of the first aspect and any possible implementation manner thereof when the computer instructions are executed on a user equipment.
In a tenth aspect, the present application further provides a computer-readable storage medium having stored thereon computer instructions for implementing the method of the second or third aspect and any possible implementation thereof when the computer instructions are executed on a server.
In an eleventh aspect, the present application further provides a computer program product, which includes one or more instructions executable on a computer, so that the computer executes the live video searching method according to the first aspect and any possible implementation manner thereof.
It is understood that a live video may include a plurality of frames of images, each frame of image including picture information (which may also be referred to as image information) of the corresponding image. In the embodiment of the application, after a target picture for searching a live video is acquired, the target picture can be identified to obtain picture information (namely first picture information) of the target picture, and first image features are extracted from the target picture; then, the picture information (namely the second picture information) of the live video and the first picture information of the target picture can be compared to obtain M first live videos matched with the target picture; obtaining N second live videos matched with the target pictures by comparing the image characteristics (namely the second image characteristics) of the live videos with the first image characteristics; determining K live videos with higher matching degree with the target picture from the M first live videos and the N second live videos according to the matching degree of the M first live videos and the target picture and the matching degree of the N second live videos and the target picture; and finally, displaying the searched K live videos to the user.
In one aspect, the first picture information of the target picture comprises first label information used for representing the type of the target picture; the first image feature of the target picture is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image included in the target picture. That is to say, the first picture information and the first image feature of the target picture may be used to characterize the type and the image feature of the target picture, that is, may be used to embody the picture content of the target picture.
On the other hand, the second picture information of the live video comprises second label information used for representing the type of the live video; the second image features of the live video are used for characterizing at least one of color features, texture features, shape features and spatial relationship features of images in the second live video. That is to say, the second picture information and the second image feature of the live video may be used to characterize the type and the image feature of the live video, that is, may be used to embody the picture content of the live video.
Therefore, in the embodiment of the application, by comparing the second picture information of the live video with the first picture information of the target picture and comparing the second image characteristic of the live video with the first image characteristic of the target picture, the live video matched with the target picture can be obtained, and the live video related to the target picture can be searched.
In addition, compared with the text related to the picture content input by the user, in the embodiment of the application, the first picture information obtained by identifying the target picture and the first image feature extracted from the target picture have higher association degree with the picture content of the target picture, and the picture content of the target picture can be embodied better. Therefore, compared with the method for searching the live video according to the characters which are input by the user and related to the picture content, the matching degree of the searched live video and the target picture can be improved by adopting the method of the embodiment of the application.
Drawings
Fig. 1 is a schematic application environment diagram of a search method for live videos in an embodiment of the present application;
fig. 2 is a flowchart of a search method for live video in an embodiment of the present application;
FIG. 3 is a schematic diagram of a display interface of a user equipment in an embodiment of the present application;
FIG. 4A is a schematic diagram of another display interface of a user equipment in the embodiment of the present application;
FIG. 4B is a diagram of another display interface of a user equipment in an embodiment of the present application;
fig. 5 is another flowchart of a search method for live video in an embodiment of the present application;
fig. 6 is another flowchart of a search method for live video in an embodiment of the present application;
fig. 7 is another flowchart of a search method for live video in the embodiment of the present application;
FIG. 8 is a block diagram of a UE in an embodiment of the present application;
fig. 9 is a block diagram of a searching apparatus for live video in an embodiment of the present application;
fig. 10 is a schematic hardware structure diagram of a user equipment in an embodiment of the present application;
fig. 11 is a schematic structural diagram of a server in an embodiment of the present application.
Detailed Description
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified.
The embodiment of the application provides a live video searching method, through which live video related to a target picture can be searched, and the matching degree between the searched live video and the target picture can be improved.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Please refer to fig. 1, which illustrates an application environment of the live video searching method provided by an embodiment of the present application; the method may be applied to the search system shown in fig. 1. As shown in fig. 1, the implementation environment may include a server 101 and a user device 102.
The user equipment 102 may obtain a target picture for searching a live video, and analyze the target picture to obtain picture feature information (such as first picture information and first image features described in the following embodiments) of the target picture; then, the picture feature information (such as the first picture information and the first image feature) of the target picture is sent to the server 101. The server 101 may search for a plurality of live videos matching the target picture according to the first picture information and the first image feature from the user equipment 102; search results including the plurality of live videos are then sent to the user device 102. User device 102 may receive search results from server 101 and present a plurality of live videos in the search results.
For example, the user equipment in the embodiment of the present application may be a mobile phone, a tablet computer, a desktop computer, a laptop, a handheld computer, a notebook computer, a vehicle-mounted device, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR)/virtual reality (VR) device, and the like; the embodiment of the present application does not particularly limit the specific form of the terminal equipment.
Please refer to fig. 2, which is a flowchart of a live video searching method according to an embodiment of the present disclosure. When the method is applied to the user equipment 102, as shown in fig. 2, the method may comprise steps 201 to 205.
Step 201: and acquiring a target picture for searching the live video.
The method for acquiring the target picture may include: receiving an operation of uploading a picture of a user to obtain a target picture; or, in response to a screenshot instruction of the image of the target area, the image of the target area is intercepted to obtain a target image.
The target picture may be a target picture obtained in response to the screenshot instruction. Particularly, in the process of playing a video or a live video by the user equipment, if the user is interested in an article in the video or the live video, the user equipment can respond to the screenshot instruction to obtain the target picture.
Illustratively, in a search interface of a live application, the user device receives a picture uploading operation of the user. In the search interface of a live application shown in fig. 3 (a), the user device receives a click operation of the user on the picture in the search box. In response to the click operation, the user equipment displays the interface shown in (b) in fig. 3, which includes a plurality of pictures, and the user equipment obtains the target picture according to the user's selection of a picture in the interface in (b) in fig. 3. For example, the user equipment receives a click operation of the user on picture 1 and obtains the target picture according to that click operation, where the target picture is picture 1.
Also illustratively, an arbitrary interface is displayed on the user device, such as the interface for playing live video in a live application shown in fig. 4A (a). If the user equipment receives a screenshot instruction of the user on the live video interface shown in (a) in fig. 4A, the user equipment may capture the image of the target area in response to the screenshot instruction to obtain the target picture; the user device may then display the interface shown in fig. 4A (b), indicating that it is searching for live video matching the target picture.
In the above example, after the user equipment receives the screenshot instruction, the user equipment takes the captured image of the target area as the target picture. In some cases, however, when the user equipment receives a screenshot instruction from the user, the user's intent may not be to have the user equipment search for live video matching the captured image of the target area. To provide a better experience, when the user equipment displays an interface for playing a live video in a live application and, after receiving the user's screenshot instruction, captures the image of the target area, the user equipment may display a first interface as shown in fig. 4B, which includes the captured image 401 of the target area and a plurality of options, for example "save screenshot picture", "search screenshot picture" and "share screenshot picture". If the user equipment receives a selection operation on the option "search screenshot picture", the user equipment uses the captured image of the target area as the target picture, displays the interface shown in (b) in fig. 4A, and searches for live video related to the target picture.
Step 202: and identifying the target picture to obtain first picture information of the target picture.
The first picture information comprises first label information, and the first label information is used for representing the type of the target picture.
It is to be understood that, if the target picture includes text information (i.e., the target picture includes text), the text information in the target picture may also be included in the first picture information. For example, the text information may be a user name, a commodity name, or the like.
The user equipment can be preconfigured with a label classification algorithm, and when the user equipment identifies a target picture, the label classification algorithm is operated to obtain first label information of the target picture. For example, the picture content in the target picture identified by the user equipment includes: mountains, water, figures, fishing rods, fish baskets, and the like. The user device recognizes that text information is not included in the target picture, and the first tag information of the target picture may be "landscape" and "fishing". The number of tags in the first tag information of the target picture may be one or more.
It is understood that, if text information is included in the target picture, that text information is also included in the first picture information. The more first picture information is acquired from the target picture, the better the picture content of the target picture can be embodied. When the server performs matching according to the first picture information, the matching degree between the searched live video and the target picture can therefore be improved.
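A small sketch of how recognized picture content could be mapped to first label information, using the "landscape"/"fishing" example from the text. The rule table and function name are hypothetical; a real system would use a trained label classification model as the patent describes.

```python
# Hypothetical object-to-tag rules; a real system would run a trained
# label classification algorithm rather than a fixed lookup table.
TAG_RULES = {
    "landscape": {"mountain", "water", "tree"},
    "fishing": {"fishing rod", "fish basket", "fish"},
}

def classify_tags(detected_objects):
    """Derive first label information from objects recognized in the picture."""
    objects = set(detected_objects)
    # A tag applies when any of its cue objects appears in the picture.
    return sorted(tag for tag, cues in TAG_RULES.items() if objects & cues)
```

For the example in the text (mountains, water, fishing rod, fish basket), this yields both the "landscape" and "fishing" labels, matching the described multi-label case.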
Step 203: and extracting first image features from the target picture, wherein the first image features are used for characterizing at least one of color features, texture features, shape features and spatial relationship features of the image included in the target picture.
Illustratively, the target picture may be processed by using a scale-invariant feature transform (SIFT) algorithm to perform feature vectorization on the target picture, reduce its dimension to 1096, and extract the first image feature of the target picture.
The first image characteristics of the target picture are extracted, so that the target picture is better in distinguishability, and when the characteristics of the target picture are matched, the picture related to the target picture can be accurately and quickly matched.
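As a simplified stand-in for the SIFT pipeline the text mentions (OpenCV exposes SIFT via `cv2.SIFT_create`, but it is not used here), the sketch below extracts a fixed-length color feature with a normalized intensity histogram; like the first image feature in the text, it characterizes the color distribution of the picture and can be compared across images.

```python
import numpy as np

def histogram_feature(gray_image, bins=32):
    """Extract a fixed-length color feature vector from a grayscale image.

    This is a deliberately simple illustration, not the patent's SIFT-based
    method: a normalized intensity histogram still yields a fixed-dimension
    vector usable for image-to-image comparison.
    """
    hist, _ = np.histogram(gray_image, bins=bins, range=(0, 256))
    hist = hist.astype(np.float64)
    total = hist.sum()
    # Normalize so features from images of different sizes are comparable.
    return hist / total if total else hist
```

Two pictures can then be compared by, for example, the cosine similarity or Euclidean distance of their feature vectors.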
Step 204: and sending the first picture information and the first image characteristics to a server to request the server to search K live videos matched with the target picture according to the first picture information and the first image characteristics.
Wherein K ≥ 1 and K is a positive integer.
Step 205: and receiving a search result from the server, wherein the search result comprises K live videos, and displaying the K live videos.
Illustratively, taking K = 4 as an example, if the search result received by the user device includes 4 live videos, the user device may present those 4 live videos. As shown in fig. 4A (c), the user device displays the 4 live videos matching the target picture. If the search result received by the user equipment includes 20 live videos, the user equipment displays 4 of them, and when the user equipment receives a screen-sliding operation of the user (or an operation of switching the displayed content), the user equipment can display another 4 live videos.
It is understood that the degree of matching between each of the K live videos and the target picture may be different. For example, the matching degree of the live video 1 and the target picture is 95%, the matching degree of the live video 2 and the target picture is 90%, the matching degree of the live video 3 and the target picture is 70%, and the matching degree of the live video 4 and the target picture is 60%. The user equipment can display the K live videos according to the matching degree of the live videos and the target pictures.
Illustratively, the search result received by the user equipment includes scores of K live videos, and the scores of the live videos are used for representing the matching degree of the live videos and the target pictures. When the user equipment displays K live videos, the K live videos can be displayed according to the sequence from high to low of the scores of the K live videos. Taking the above 4 live videos as an example, as shown in (c) in fig. 4A, the user equipment may display the 4 live videos in an order from high to low according to matching degrees (i.e., scores) of the live video 1, the live video 2, the live video 3, and the live video 4 with the target picture.
It will be appreciated that the degree of match between each live video in the search results and the target picture may differ, and the score of a live video reflects that degree of match: the higher the score, the higher the matching degree between the live video and the target picture. The user equipment displays the K live videos in descending order of their scores, so that the live videos with a high matching degree to the target picture are shown first. When viewing the live videos displayed on the user equipment, the user can quickly identify the live video with the highest matching degree, which improves the user experience.
In the embodiment of the application, the first picture information and the first image feature of the target picture are features used for representing the type and the image of the target picture, that is, picture content of the target picture can be embodied. Compared with characters which are input by a user and are related to the picture content, the first picture information and the first image characteristics of the target picture can better embody the picture content of the target picture. The first picture information and the first image characteristics are sent to the server, so that the server can obtain accurate picture content of the target image according to the first picture information and the first image characteristics, and matching degree of the searched live video and the target picture can be improved.
Referring to fig. 5, an embodiment of the present application further provides a method for searching a live video. When the method can be applied to a server, as shown in fig. 5, the method can include steps 501 to 506.
Step 501: receiving first picture information and first image characteristics of a target picture sent by user equipment.
The first picture information comprises first label information used for representing the type of the target picture; the first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture.
Step 502: comparing, by using the first picture information, the second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M ≥ 1 and M is a positive integer; the second picture information of the M first live videos matches the first picture information.
And the second picture information of the first live video comprises second label information for representing the type of the first live video.
Illustratively, when the text information is included in the target picture, the first picture information further includes the text information in the target picture; when the first live video comprises the text information, the second picture information of the first live video also comprises the text information in the first live video.
It is understood that, if the live video includes text information, the second picture information of the live video also includes that text information. When the server matches the first picture information with the second picture information, this additional text information can improve the matching degree between the live video and the target picture.
It is understood that the second picture information of the live video is picture information of an image in the live video. The live video is composed of multiple frames of images, and the second picture information of the live video in the server can be second picture information obtained by extracting the multiple frames of images from the live video and identifying the images.
The live video may include a plurality of frames of images, and the third picture information of a key frame image is obtained by identifying that key frame image. The server can thus obtain more accurate live video information from the third picture information of the key frame images.
Illustratively, the server maintains second picture information for a plurality of live videos in a message queue. And the server compares the received first picture information with the second picture information in the message queue to determine M live videos.
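A minimal in-process sketch of this save-and-compare flow. The `deque` stands in for the server's message queue, and matching here is a simple label-overlap test; both choices are illustrative assumptions, since the patent specifies neither a concrete queue implementation nor the exact comparison rule.

```python
from collections import deque

# In-process stand-in for the server's message queue; a real deployment
# would use a dedicated message-queue service.
picture_info_queue = deque()

def save_picture_info(video_id, second_tags):
    """Save a live video's second picture information into the queue."""
    picture_info_queue.append({"video_id": video_id, "tags": set(second_tags)})

def find_matching_videos(first_tags):
    """Determine the first live videos whose second label information
    overlaps the first label information of the target picture."""
    wanted = set(first_tags)
    return [entry["video_id"] for entry in picture_info_queue
            if entry["tags"] & wanted]
```

The videos returned here correspond to the M first live videos of step 502.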
In one possible embodiment, identifying every frame of image in the live video to obtain the second picture information would involve excessive computation, and because the picture contents of consecutive frames are similar, their picture features would also be similar. Instead, the server may identify only the key frames in the live video and obtain the second picture information from those key frames. In this way, when the server compares the first picture information with the second picture information, the computation load of the server can be reduced.
Illustratively, the server responds to an uploading instruction of the third live video, identifies each frame image of the third live video, and determines a plurality of key frame images of the third live video; and the image difference between the key frame image and the previous frame image of the key frame image is greater than a preset difference threshold value. Taking the key frame image as P (x, y) and the previous frame image of the key frame image as Q (x, y) as an example, the difference between the key frame image and the previous frame image of the key frame image is expressed by formula 1:
S = Σ (x = 1..w) Σ (y = 1..h) | P_xy − Q_xy |    (formula 1)
where S represents the difference between the key frame image and the previous frame image. Suppose the key frame image P(x, y) and the previous frame image Q(x, y) are located in the same coordinate system, and (x, y) represents the coordinates of each pixel point in the image, where x takes values in [1, w] and w represents the image width in pixels, and y takes values in [1, h] and h represents the image height in pixels. P_xy represents the gray value of the pixel point with coordinates (x, y) in the key frame image, and Q_xy represents the gray value of the pixel point with coordinates (x, y) in the previous frame image.
And if S is larger than a preset threshold value, determining that the image represented by P (x, y) is the key frame image. Wherein the preset threshold may be set empirically.
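A direct NumPy rendering of this key-frame test, computing S as the sum of absolute gray-value differences over all pixel coordinates. Note that the original formula appears only as an image in the patent, so whether it additionally normalizes by the image size is an open assumption; no normalization is applied here.

```python
import numpy as np

def is_key_frame(frame, prev_frame, threshold):
    """Decide whether `frame` is a key frame relative to `prev_frame`.

    S is the sum of absolute gray-value differences over all pixels (x, y),
    matching the textual description of formula 1. `threshold` is the
    empirically chosen preset value from the text.
    """
    # Cast to a signed type first so the subtraction cannot wrap around.
    s = np.abs(frame.astype(np.int64) - prev_frame.astype(np.int64)).sum()
    return bool(s > threshold)
```

For video decoding, each frame could be obtained as a grayscale array before calling this function; frames that pass the test become the key frame images whose picture information is identified.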
Illustratively, identifying a key frame image yields the third picture information of that key frame image; the third picture information comprises the label information of the key frame image, which is used for representing the type of the key frame image. When the key frame image includes text information, the third picture information further includes that text information. The third picture information of the plurality of key frame images is then counted to obtain the second picture information of the third live video.
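The counting step can be sketched as follows. The aggregation rule used here (keep every label seen in any key frame, ranked by how many key frames it appears in, and collect all text) is an illustrative assumption; the patent only states that the key-frame information is "counted" to obtain the video-level second picture information.

```python
from collections import Counter

def aggregate_keyframe_info(keyframe_infos):
    """Count per-key-frame third picture information into video-level
    second picture information.

    keyframe_infos: list of dicts like {"tags": [...], "text": [...]}.
    """
    tag_counts = Counter()
    texts = []
    for info in keyframe_infos:
        tag_counts.update(info.get("tags", []))
        texts.extend(info.get("text", []))
    # Labels seen in more key frames are listed first.
    return {"tags": [t for t, _ in tag_counts.most_common()], "text": texts}
```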
It should be noted that the manner of identifying the second picture information in the key frame image is the same as the manner of determining the first picture information in the target picture, and is not repeated here.
In some implementations, the second picture information further includes label information of the live video. Illustratively, in response to an uploading instruction of a live video, video segments of a second preset duration are extracted at intervals of a first preset duration from the live video; the label information of the first image in each extracted video segment is identified, wherein the label information of the first image is used for representing the type of the first image, and the first image is the first frame image in the video segment or any frame image in the video segment; and the label information of the first image in each video segment is counted to obtain the label information of the third live video.
For example, after the live video is uploaded to the server, the server can extract one minute of video from every three minutes of the live video, take the first frame image of that one-minute segment as its key frame image, identify the image content in the key frame image, and set label information according to that image content; this label information serves as a label of the live video.
It is understood that, in the above implementation, any frame of image may be extracted from every three minutes of live video as a key frame image of the live video, and tag information may be set according to image content of the key frame image, where the tag information is a tag of the live video.
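The sampling schedule above can be sketched as a small helper that lists segment start times; the 180-second interval and 60-second segment length follow the example in the text, and both are configurable preset durations rather than fixed values.

```python
def segment_start_times(video_length_s, interval_s=180, segment_s=60):
    """Start times (in seconds) of the segments sampled from a live video.

    One segment of `segment_s` seconds is taken from every `interval_s`
    seconds of video; a segment is kept only if it fits fully in the video.
    """
    starts = []
    t = 0
    while t + segment_s <= video_length_s:
        starts.append(t)
        t += interval_s
    return starts
```

The first frame at each start time (or any frame inside the segment) would then serve as the key frame image whose label information is identified.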
Step 503: comparing, by using the first image features, the second image features of the plurality of live videos to determine N second live videos, wherein N ≥ 1 and N is a positive integer.
Second image features of the N second live videos are matched with the first image features; and the second image characteristics of the second live video are used for representing at least one of color characteristics, texture characteristics, shape characteristics and spatial relationship characteristics of the images in the second live video.
Illustratively, the image features of a key frame image are extracted from the key frame image; the image features of the key frame image are used to characterize at least one of the color features, texture features, shape features and spatial relationship features of the key frame image. The image features of the plurality of key frame images are counted to obtain the second image features of the third live video. The second picture information and second image features of the third live video are stored in the message queue, wherein the third live video is included in the plurality of live videos.
Step 504: obtaining the scores of the M first live videos according to the matching degree between the second picture information of the M first live videos and the first picture information; and obtaining the scores of the N second live videos according to the matching degree between the second image features of the N second live videos and the first image features.
And the score is used for representing the matching degree of the live video and the target picture.
When the score of each live video is included in the search result, the user equipment can display the live videos according to the scores after receiving the search result.
It can be understood that the scores of the M first live videos are obtained by scoring the degree of matching between the first picture information and the second picture information, while the scores of the N second live videos are obtained by scoring the degree of matching between the first image features and the second image features. Because the two groups are scored on different bases, different preset coefficients need to be set for the M first live videos and the N second live videos so that the resulting scores can represent the degree of matching between each live video and the target picture.
In some embodiments, step 504 described above may include steps 504 a-504 c.
Step 504a: the M first live videos are each scored according to the degree of matching between their second picture information and the first picture information, and the scores are multiplied by a first preset coefficient to obtain the final scores of the M first live videos. The score is used to represent the degree of matching between the second picture information of a first live video and the first picture information of the target picture.
Illustratively, the server obtains 10 first live videos matching the target picture. Because the scoring basis for the first live videos is the degree of matching between the first picture information and the second picture information, each first live video matches the target picture to a different degree. For example, suppose the target picture contains no text information and its first tag information is "landscape" and "fishing". If a tag in the second tag information of a live video is identical to a tag in the first tag information of the target picture, that tag is worth 5 points; if it is merely similar, that tag is worth 4 points.
If the second tag information in the second picture information of a first live video includes "landscape" and "fishing", the score of that first live video is 10 points, indicating a 100% degree of matching with the target picture. If the second tag information of a first live video is only close to the first tag information, for example the tags "beautiful scene" and "leisure", where "beautiful scene" is similar to "landscape", the score of that first live video is 4 points, indicating a 40% degree of matching with the target picture.
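The tag-based scoring in the example above can be sketched as follows, with a tiny hand-made tag-similarity table standing in for whatever similarity judgment the server actually applies; all names are illustrative.

```python
# Sketch of the worked example: an identical tag is worth 5 points and a
# merely similar tag 4 points, so the tags "landscape" and "fishing" yield
# 10 points (100% match) against a video tagged the same way. The SIMILAR
# table is a made-up stand-in for a real tag-similarity model.

SIMILAR = {("landscape", "beautiful scene"), ("beautiful scene", "landscape")}

def tag_score(picture_tags, video_tags):
    score = 0
    for pt in picture_tags:
        if pt in video_tags:
            score += 5                                   # identical tag
        elif any((pt, vt) in SIMILAR for vt in video_tags):
            score += 4                                   # similar tag
    return score

print(tag_score(["landscape", "fishing"], ["landscape", "fishing"]))        # 10
print(tag_score(["landscape", "fishing"], ["beautiful scene", "leisure"]))  # 4
```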
Step 504b: the N second live videos are each scored according to the degree of matching between their second image features and the first image features, and the scores are multiplied by a second preset coefficient to obtain the final scores of the N second live videos. The score is used to represent the degree of matching between the second image features of a second live video and the first image features of the target picture.
Illustratively, when the first image features of the target picture are matched against the second image features, a live video needs to be scored across several image features: color features, texture features, shape features, and spatial relationship features. As a result, a second live video may have a score of 10 while its similarity to the target picture is only 40%.
Step 504c: K live videos are selected from the M first live videos and the N second live videos in descending order of their scores. The scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos.
Therefore, even when the score of a first live video equals that of a second live video, their degrees of matching with the target picture may differ. By setting different preset coefficients for the scores of the M first live videos and the scores of the N second live videos, final scores are obtained that can represent each live video's degree of similarity to the target picture, so that the M first live videos and the N second live videos can be ranked by their degree of matching with the target picture.
Step 505: K live videos are selected from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos, where K is a positive integer greater than or equal to 1.
The scores of the K live videos are higher than the scores of other live videos in the M first live videos and the N second live videos.
Illustratively, the server obtains 10 first live videos and 20 second live videos and ranks the 30 live videos by score from high to low. The server may send the top 20 live videos in this ranking to the user equipment as the search result, or it may send the top 15.
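Steps 504 and 505 together amount to weighting each group of scores by its preset coefficient, merging the two groups, sorting, and taking the top K. A minimal sketch, assuming illustrative coefficient values and video identifiers:

```python
# Sketch of the coefficient-weighted top-K selection: tag-based scores and
# feature-based scores are produced on different bases, so each group is
# scaled by its own preset coefficient before merging and ranking. The
# coefficients 1.0 and 0.8 are assumptions for illustration only.

def select_top_k(first_scores, second_scores, k, coef1=1.0, coef2=0.8):
    scored = [(s * coef1, vid) for vid, s in first_scores.items()]
    scored += [(s * coef2, vid) for vid, s in second_scores.items()]
    scored.sort(reverse=True)          # highest weighted score first
    return [vid for _, vid in scored[:k]]

first = {"live_a": 10, "live_b": 4}    # scored on picture/tag information
second = {"live_c": 10, "live_d": 6}   # scored on image features
print(select_top_k(first, second, k=3))  # ['live_a', 'live_c', 'live_d']
```

With these assumed coefficients, a feature-based 10 (weighted to 8.0) ranks below a tag-based 10, reflecting the point above that equal raw scores need not mean equal degrees of matching.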
Step 506: and sending a search result to the user equipment, wherein the search result comprises K live videos.
The live video searching method provided by the embodiment of the application can be applied to a searching system shown in fig. 1. In the search system, the implementation flow of the method is shown in fig. 6, and the search method includes S601-S609.
S601: the user equipment acquires a target picture for searching live video.
S602: the user equipment identifies the target picture to obtain first picture information of the target picture.
The first picture information comprises first label information, and the first label information is used for representing the type of the target picture.
S603: the user equipment extracts first image features from the target picture.
The first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture.
S604: the user equipment sends the first picture information and the first image characteristics to the server to request the server to search K live videos matched with the target picture according to the first picture information and the first image characteristics, wherein K is greater than or equal to 1,K and is a positive integer.
It is understood that S601 to S604 in the present application are similar to the contents of step 201 to step 204, and the detailed description may refer to the related description of step 201 to step 204, which is not repeated herein.
S605: the server compares the first picture information with second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M is not less than 1,M and is a positive integer.
S606: the server compares second image characteristics of the live videos by adopting the first image characteristics to determine N second live videos, wherein N is not less than 1,N and is a positive integer.
S607: the server obtains scores of the M first direct-playing videos according to the matching degree of the second picture information and the first picture information of the M first direct-playing videos; and obtaining scores of the N second live broadcast videos according to the matching degree of the second image characteristics of the N second live broadcast videos and the first image.
S608: the server sends a search result to the user equipment, wherein the search result comprises K live videos.
S609: the user equipment displays the K live videos.
It can be understood that, in the present application, the contents of S605 to S608 are similar to the contents of step 502 to step 505, and the detailed description may refer to the related description of step 502 to step 505, which is not repeated here.
In the embodiment of the application, the user equipment identifies first picture information obtained by a target picture and first image characteristics extracted from the target picture. And the server matches the live broadcast video according to the first picture information and the first image characteristics. Therefore, compared with the method for searching the live video according to the characters which are input by the user and are related to the picture content, the matching degree of the searched live video and the target picture can be improved by adopting the method of the embodiment of the application.
The embodiment of the application also provides a live video searching method, which is applied to the searching system shown in fig. 1, wherein the user equipment can send the target picture to the server, and the server identifies the target picture. As shown in fig. 7, the implementation flow of the method includes S701 to S709.
S701: the user equipment acquires a target picture for searching live video.
S702: and the user equipment sends the target picture to the server.
S703: the server identifies the target picture to obtain first picture information of the target picture.
The first picture information comprises first label information, and the first label information is used for representing the type of the target picture.
S704: the server extracts a first image feature from the target picture.
The first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture.
S705: the server compares the first picture information with second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M is not less than 1,M and is a positive integer.
S706: the server compares second image characteristics of the plurality of live videos by adopting the first image characteristics to determine N second live videos, wherein N is greater than or equal to 1,N and is a positive integer.
S707: the server obtains scores of the M first direct-playing videos according to the matching degree of the second picture information and the first picture information of the M first direct-playing videos; and obtaining scores of the N second live broadcast videos according to the matching degree of the second image characteristics of the N second live broadcast videos and the first image.
S708: the server sends a search result to the user equipment, wherein the search result comprises K live videos.
S709: the user equipment displays the K live videos.
It can be understood that the contents of S701 to S704 are similar to those of step 201 to step 204; for the specific implementation, reference may be made to the above description of step 201 to step 204, which is not repeated here. S705 to S708 in this application are similar to the contents of step 502 to step 505; for the specific implementation, reference may be made to the above description of step 502 to step 505, which is not repeated here.
In the embodiment of the application, the user equipment identifies first picture information obtained by a target picture and first image characteristics extracted from the target picture. And the server matches the live broadcast video according to the first picture information and the first image characteristics. Therefore, compared with the method for searching the live video according to the characters which are input by the user and related to the picture content, the matching degree of the searched live video and the target picture can be improved by adopting the method of the embodiment of the application.
It can be understood that the search method for live video applied to the user equipment can also be implemented by a search apparatus for live video. In order to implement the above functions, the search apparatus for live video includes a hardware structure and/or a software module for performing each function. Those skilled in the art will readily appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as combinations of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.
In the embodiment of the present application, the search apparatus for the live video and the like may be divided into function modules according to the method example, for example, each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
In the case of dividing each function module according to each function, fig. 8 shows a schematic diagram of a possible structure of the search apparatus for live video according to the above embodiment, which corresponds to the method in fig. 2. The apparatus 800 for searching a live video includes: an obtaining module 801, a first identifying module 802, an extracting module 803, a first sending module 804 and a first receiving module 805.
An obtaining module 801 configured to obtain a target picture for searching for a live video.
Illustratively, the target picture acquired by the obtaining module may be acquired in response to a screenshot instruction. In that case, the obtaining module is specifically configured to, in response to a screenshot instruction for an image of a target area, capture the image of the target area to obtain the target picture.
A first identifying module 802, configured to identify the target picture acquired by the acquiring module 801 to obtain first picture information of the target picture; the first picture information comprises first label information, and the first label information is used for representing the type of the target picture.
It can be understood that, in the process of identifying the target picture, if the target picture includes text information, the first picture information may also include the text information in the target picture.
An extracting module 803 configured to extract first image features from the target picture obtained by the obtaining module 801, wherein the first image features are used to characterize at least one of the color features, texture features, shape features, and spatial relationship features of an image included in the target picture.
A first sending module 804 configured to send the first picture information obtained by the first identifying module 802 and the first image features obtained by the extracting module 803 to the server, to request the server to search for K live videos matching the target picture according to the first picture information and the first image features, where K is a positive integer greater than or equal to 1.
The first receiving module 805 is configured to receive a search result from the server, where the search result includes K live videos, and presents the K live videos.
For example, among the K live videos, each live video's degree of matching with the target picture may differ. If the search result includes scores of the K live videos, the score of a live video is used to represent its degree of matching with the target picture, and the first receiving module is further configured to display the K live videos in descending order of their scores.
In the case of dividing each function module according to each function, fig. 9 shows a schematic diagram of a possible structure of the search apparatus for live video according to the above embodiment, which corresponds to the method in fig. 5. The apparatus 900 for searching for live video includes: a second receiving module 901, a first matching module 902, a second matching module 903, a scoring module 904, a selecting module 905, and a second sending module 906.
A second receiving module 901, configured to receive first picture information and first image characteristics of a target picture sent by a user equipment; the first picture information comprises first label information used for representing the type of the target picture; the first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture.
The first matching module 902 is configured to compare the first picture information obtained by the second receiving module with second picture information of a plurality of live videos stored in the search apparatus for live videos to determine M first live videos, where M is a positive integer greater than or equal to 1; the second picture information of the M first live videos matches the first picture information; the second picture information of a first live video includes second tag information used to represent the type of the first live video.
It is understood that, if the text information is included in the target picture, the first picture information further includes the text information of the target picture. And if the first live video comprises the text information, the second picture information of the first live video also comprises the text information in the first live video.
The second matching module 903 is configured to compare the first image features obtained by the second receiving module with second image features of the plurality of live videos to determine N second live videos, where N is a positive integer greater than or equal to 1; the second image features of the N second live videos match the first image features; and the second image features of a second live video are used to characterize at least one of the color features, texture features, shape features, and spatial relationship features of the images in the second live video.
The scoring module 904 is configured to obtain scores of the M first live videos according to the degree of matching between the second picture information of the M first live videos determined by the first matching module and the first picture information, and to obtain scores of the N second live videos according to the degree of matching between the second image features of the N second live videos determined by the second matching module and the first image features; the score is used to represent the degree of matching between a live video and the target picture.
The selecting module 905 is configured to select K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos obtained by the scoring module, where K is a positive integer greater than or equal to 1; the scores of the K live videos are higher than the scores of the other live videos among the M first live videos and the N second live videos.
A second sending module 906 configured to send search results to the user device, the search results including K live videos.
Illustratively, when the search apparatus for live videos obtains the live videos, the scoring basis of the M first live videos is the degree of matching between the first picture information and the second picture information, while the scoring basis of the N second live videos is the degree of matching between the first image features and the second image features. The scoring module is specifically configured to score the M first live videos according to the degree of matching between their second picture information and the first picture information and multiply the scores by a first preset coefficient to obtain the final scores of the M first live videos; and to score the N second live videos according to the degree of matching between their second image features and the first image features and multiply the scores by a second preset coefficient to obtain the final scores of the N second live videos. K live videos are then selected from the M first live videos and the N second live videos in descending order of their scores, where the scores of the K live videos are higher than those of the other live videos among the M first live videos and the N second live videos.
It is understood that the search apparatus for live video may receive live video uploaded by the user equipment. The searching device of the live video further comprises: the device comprises a determining module, a second identifying module, a counting module and a storing module.
The determining module is configured to respond to an uploading instruction of the third live video, identify each frame image of the third live video, and determine a plurality of key frame images of the third live video; and the image difference between the key frame image and the previous frame image of the key frame image is greater than a preset difference threshold value.
A second identification module configured to perform, for each of the plurality of key frame images determined by the determination module: identifying the key frame image to obtain third picture information of the key frame image; extracting image characteristics of the key frame image from the key frame image; the third picture information comprises label information of the key frame image, and the label information of the key frame image is used for representing the type of the key frame image; when the key frame image comprises text information, the third picture information also comprises the text information in the key frame image; the image features of the key frame image are used to characterize at least one of color features, texture features, shape features, and spatial relationship features of the key frame image.
The statistical module is configured to count third picture information of the plurality of key frame images identified by the second identification module to obtain second picture information of a third live broadcast video; and counting the image characteristics of the plurality of key frame images identified by the second identification module to obtain second image characteristics of the third live broadcast video.
The saving module is configured to save second picture information and second image characteristics of the third live video in the message queue; wherein the third live video is contained in the plurality of live videos.
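The determining module's key-frame selection (a frame whose difference from its predecessor exceeds a preset threshold) can be sketched as follows; frames are modeled as flat lists of gray values, mean absolute pixel difference is an assumed difference measure, and treating the first frame as a key frame is also an assumption.

```python
# Sketch of threshold-based key-frame detection: a frame becomes a key
# frame when its difference from the previous frame exceeds a preset
# difference threshold (i.e., a likely scene change).

def frame_difference(a, b):
    """Mean absolute difference between two equally sized gray frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def detect_key_frames(frames, threshold=30.0):
    """Indices of frames whose difference from the previous frame exceeds the threshold."""
    keys = [0]  # assumption: the first frame is always kept as a key frame
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            keys.append(i)
    return keys

frames = [[10, 10, 10], [12, 11, 10], [200, 190, 180]]
print(detect_key_frames(frames))  # [0, 2]: a scene change at frame 2
```

Each detected key frame would then be passed to the second identification module for tag identification and feature extraction as described above.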
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again.
In some embodiments, the determining module is further configured to, in response to an uploading instruction for the live video, extract a video segment of a second preset duration at intervals of a first preset duration from the live video. The second identification module is further configured to identify the tag information of the first image in each extracted video segment, where the tag information of the first image is used to represent the type of the first image, and the first image is the first frame of the video segment or any frame in the video segment. The counting module is further configured to aggregate the tag information of the first image in each video segment to obtain the tag information of the third live video.
In case of using integrated units, fig. 10 shows a possible structural diagram of the user equipment involved in the above embodiments. As shown in fig. 10, the user equipment 102 includes a processor 1001 and a memory 1002.
It can be understood that the user device 102 shown in fig. 10 can implement all the functions of the search apparatus 800 for live video described above. The functions of the modules in the search apparatus 800 for live video may be implemented in the processor 1001 of the user equipment 102. For example, the functions of the obtaining module 801, the first identifying module 802, the extracting module 803, the first sending module 804, and the first receiving module 805 described above may be implemented in the processor 1001. The storage module of the search apparatus 800 for live video corresponds to the memory 1002 of the user device 102.
The processor 1001 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1001 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a Neural-network Processing Unit (NPU). The different processing units may be independent devices or may be integrated into one or more processors.
Memory 1002 may include one or more computer-readable storage media, which may be non-transitory. The memory 1002 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 1002 is used to store at least one instruction for execution by the processor 1001 to implement the live video searching method provided by the method embodiments of the present application.
In some embodiments, the user device 102 may further optionally include: a peripheral interface 1003 and at least one peripheral. The processor 1001, memory 1002 and peripheral interface 1003 may be connected by a bus or signal line. Various peripheral devices may be connected to peripheral interface 1003 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1004, display screen 1005, camera assembly 1006, audio circuitry 1007, positioning assembly 1008, and power supply 1009.
The peripheral interface 1003 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 1001 and the memory 1002. In some embodiments, processor 1001, memory 1002, and peripheral interface 1003 are integrated on the same chip or circuit board; in some other embodiments, any one or both of the processor 1001, the memory 1002, and the peripheral interface 1003 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 1004 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 1004 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1004 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1004 may communicate with other user equipment via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or Wi-Fi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1004 may also include NFC (Near Field Communication) related circuits, which are not limited by this disclosure.
A display screen 1005 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1005 is a touch display screen, the display screen 1005 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 1001 as a control signal for processing. At this point, the display screen 1005 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1005, provided on the front panel of the user device 102; the display screen 1005 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 1006 is used to capture images or video. Optionally, the camera assembly 1006 includes a front camera and a rear camera. Generally, the front camera is disposed on a front panel of the user equipment, and the rear camera is disposed on a rear surface of the user equipment. The audio circuit 1007 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1001 for processing or inputting the electric signals to the radio frequency circuit 1004 for realizing voice communication. For stereo capture or noise reduction purposes, the microphones may be multiple and located at different locations on user device 102. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1001 or the radio frequency circuit 1004 into sound waves. The loudspeaker can be a traditional film loudspeaker and can also be a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuit 1007 may also include a headphone jack.
The positioning component 1008 is used to locate the current geographic location of the user equipment 102 to implement navigation or LBS (Location Based Service). The positioning component 1008 may be a positioning component based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 1009 is used to supply power to various components in the user equipment 102. The power source 1009 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 1009 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery can also be used to support fast charge technology.
In some embodiments, user device 102 also includes one or more sensors 1010. The one or more sensors 1010 include, but are not limited to: acceleration sensors, gyroscope sensors, pressure sensors, fingerprint sensors, optical sensors, and proximity sensors.
The acceleration sensor may detect acceleration magnitudes on three coordinate axes of a coordinate system established with the user device 102. The gyroscope sensor may detect the body direction and rotation angle of the user equipment 102, and may cooperate with the acceleration sensor to capture the user's 3D gestures on the user device 102. The pressure sensors may be disposed on a side bezel of the user device 102 and/or on a lower layer of the display screen 1005. When the pressure sensor is disposed on a side bezel of the user device 102, the user's grip signal on the user device 102 may be detected. The fingerprint sensor is used to collect the user's fingerprint. The optical sensor is used to collect the intensity of ambient light. The proximity sensor, also known as a distance sensor, is typically provided on the front panel of the user device 102 and is used to capture the distance between the user and the front of the user device 102.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is not limiting of user device 102 and may include more or fewer components than shown, or combine certain components, or employ a different arrangement of components.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium includes computer instructions, and when the computer instructions are run on the user equipment, the user equipment is caused to perform various functions or steps in the foregoing method embodiment. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In the case of an integrated unit, fig. 11 shows a schematic diagram of a possible structure of the server 101 referred to in the above embodiments. The server 101 may include: a processing module 1101, a storage module 1102 and a communication module 1103. The processing module 1101 is configured to control and manage the operation of the server. The storage module 1102 is configured to store program codes and data of the server, such as a method for identifying a live video, a method for extracting image features of a live video, and the like. The communication module 1103 is used for supporting the server to communicate with other network entities to implement functions such as data interaction; for example, the communication module 1103 supports the server to communicate with the user equipment to implement data interaction functions.
The processing module 1101 may be a processor or a controller. The communication module 1103 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 1102 may be a memory.
Embodiments of the present application further provide a computer program product, which when run on a computer causes the computer to execute each function or step in the above method embodiments.
Through the description of the above embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is only one type of logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed to a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented as a software functional unit and sold or used as a separate product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (26)

1. A method for searching for a live video, applied to user equipment, the method comprising the following steps:
acquiring a target picture for searching a live video;
identifying the target picture to obtain first picture information of the target picture; the first picture information comprises first label information, and the first label information is used for representing the type of the target picture;
extracting first image features from the target picture, wherein the first image features are used for representing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture;
sending the first picture information and the first image feature to a server to request the server to search for K live videos matched with the target picture according to the first picture information and the first image feature, wherein K is greater than or equal to 1, and K is a positive integer;
the scores of the K live videos are higher than the scores of other live videos except the K live videos in M first live videos and N second live videos; the score is used for representing the matching degree between a live video and the target picture; M is greater than or equal to 1, and M is a positive integer; N is greater than or equal to 1, and N is a positive integer; second picture information of the M first live videos is matched with the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video; second image features of the N second live videos are matched with the first image feature; wherein the second image feature of a second live video is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image in the second live video; the second picture information is obtained according to statistics of third picture information of a plurality of key frame images in the first live video, wherein the image difference between a key frame image and its preceding frame image is greater than a preset difference threshold, the third picture information comprises label information of the key frame images, and the label information of a key frame image is used for representing the type of the key frame image; the second image feature is obtained according to statistics of image features of a plurality of key frame images in the second live video, and the image feature of a key frame image is used for representing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of the key frame image;
receiving a search result from the server, wherein the search result comprises the K live videos; and displaying the K live videos.
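As an illustration only (not part of the claim), the "first image feature" recited above could include a color feature such as a coarse RGB histogram. The function name and bin count below are hypothetical choices; a real implementation would also cover the texture, shape and spatial-relationship features the claim mentions.

```python
# Hypothetical sketch of a color feature for the "first image feature"
# in claim 1: a coarse, normalized RGB histogram. The bin count and
# names are illustrative; texture/shape/spatial descriptors are omitted.
def color_histogram(pixels, bins_per_channel=4):
    """pixels: iterable of (r, g, b) tuples with 8-bit channels.
    Returns a normalized histogram of length bins_per_channel ** 3."""
    step = 256 // bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    n = 0
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
        n += 1
    return [count / n for count in hist] if n else hist
```

A uniformly red picture, for example, concentrates all of its histogram mass in a single bin, so two pictures sharing a dominant color produce similar feature vectors.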
2. The method according to claim 1, wherein when text information is included in the target picture, the first picture information further includes text information in the target picture.
3. The method for searching for the live video according to claim 1 or 2, wherein the obtaining of the target picture for searching for the live video comprises:
in response to a screenshot instruction for an image of a target area, capturing the image of the target area to obtain the target picture.
4. The method for searching for a live video according to claim 1 or 2, wherein K ≧ 2; the search results further include a score for each of the K live videos;
the presenting the K live videos includes:
and displaying the K live videos according to the sequence of the scores of the K live videos from high to low.
5. A method for searching for a live video, applied to a server, the method comprising the following steps:
receiving first picture information and a first image feature of a target picture sent by user equipment; the first picture information comprises first label information used for representing the type of the target picture; the first image feature is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image included in the target picture;
comparing, by using the first picture information, second picture information of a plurality of live videos stored in the server to determine M first live videos, wherein M is greater than or equal to 1, and M is a positive integer; the second picture information of the M first live videos is matched with the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video; the second picture information is obtained according to statistics of third picture information of a plurality of key frame images in a third live video, wherein the image difference between a key frame image and its preceding frame image is greater than a preset difference threshold, the third picture information comprises label information of the key frame images, and the label information of a key frame image is used for representing the type of the key frame image; wherein the third live video is contained in the plurality of live videos;
comparing, by using the first image feature, second image features of the plurality of live videos to determine N second live videos, wherein N is greater than or equal to 1, and N is a positive integer; the second image features of the N second live videos are matched with the first image feature; wherein the second image feature of a second live video is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image in the second live video; the second image feature is obtained according to statistics of image features of a plurality of key frame images in the third live video, and the image feature of a key frame image is used for representing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of the key frame image;
obtaining scores of the M first live videos according to the matching degree between the second picture information of the M first live videos and the first picture information; obtaining scores of the N second live videos according to the matching degree between the second image features of the N second live videos and the first image feature; the score is used for representing the matching degree between a live video and the target picture;
selecting K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos, wherein K is greater than or equal to 1, and K is a positive integer; the scores of the K live videos are higher than the scores of other live videos except the K live videos in the M first live videos and the N second live videos;
and sending a search result to the user equipment, wherein the search result comprises the K live videos.
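For illustration, the two candidate sets in claim 5 — the tag-matched "M first live videos" and the feature-matched "N second live videos" — could be computed as sketched below. The similarity measure (cosine) and the 0.8 threshold are assumptions; the patent does not specify either.

```python
import math

# Hypothetical sketch of claim 5's two matching passes: tag overlap
# yields the "M first live videos"; feature similarity yields the
# "N second live videos". Cosine similarity and the threshold value
# are illustrative choices, not taken from the patent.
def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def match_candidates(query_tags, query_feature, videos, threshold=0.8):
    """videos: list of dicts with a 'tags' set and a 'feature' vector."""
    tag_matches = [v for v in videos if query_tags & v["tags"]]
    feature_matches = [v for v in videos
                       if cosine_similarity(query_feature, v["feature"]) >= threshold]
    return tag_matches, feature_matches
```

The same stored video may appear in both candidate sets, which is consistent with the claim's separate M and N counts.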
6. The method for searching for a live video according to claim 5, wherein obtaining the scores of the M first live videos according to the matching degree between the second picture information of the M first live videos and the first picture information, and obtaining the scores of the N second live videos according to the matching degree between the second image features of the N second live videos and the first image feature, comprises:
scoring the M first live videos respectively according to the matching degree between the second picture information of the M first live videos and the first picture information, and multiplying the scores of the M first live videos respectively by a first preset coefficient to obtain the scores of the M first live videos; the score is used for representing the matching degree between the second picture information of a first live video and the first picture information;
scoring the N second live videos respectively according to the matching degree between the second image features of the N second live videos and the first image feature, and multiplying the scores of the N second live videos respectively by a second preset coefficient to obtain the scores of the N second live videos; wherein the score is used for representing the matching degree between the second image feature of a second live video and the first image feature;
selecting the K live videos from the M first live videos and the N second live videos in descending order of the scores of the M first live videos and the scores of the N second live videos;
and the scores of the K live videos are higher than the scores of other live videos in the M first live videos and the N second live videos.
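The coefficient-weighted scoring and top-K selection of claim 6 can be sketched as follows. How a video appearing in both candidate sets is combined is not specified by the patent; in this illustrative version the two weighted scores simply compete independently.

```python
import heapq

# Hypothetical sketch of claim 6's scoring: raw tag-match and
# feature-match scores are each multiplied by a preset coefficient,
# and the K highest weighted scores are kept. Deduplication of a
# video present in both lists is an open detail in the patent text.
def top_k_live_videos(tag_scores, feature_scores, c1, c2, k):
    """tag_scores / feature_scores: lists of (video_id, raw_score).
    c1, c2: the first and second preset coefficients."""
    weighted = [(vid, s * c1) for vid, s in tag_scores]
    weighted += [(vid, s * c2) for vid, s in feature_scores]
    return heapq.nlargest(k, weighted, key=lambda pair: pair[1])
```

For example, with c1 = 1.0 and c2 = 0.5, tag matches outrank feature matches of equal raw score, reflecting a design in which the coefficients encode the relative trust placed in each matching channel.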
7. The method according to claim 5 or 6, wherein when text information is included in the target picture, the first picture information further includes text information in the target picture;
when the first live video comprises text information, the second picture information of the first live video also comprises the text information in the first live video.
8. The method for searching for live video according to claim 7, wherein the server stores the second picture information of the plurality of live videos in a message queue.
9. The method for searching for live video according to claim 8, wherein before receiving the first picture information and the first image feature of the target picture sent by the user equipment, the method further comprises:
in response to an uploading instruction of a third live video, identifying each frame image of the third live video, and determining a plurality of key frame images of the third live video;
performing, for each key frame image of the plurality of key frame images: identifying the key frame image to obtain third picture information of the key frame image; extracting image features of the key frame images from the key frame images; the third picture information comprises label information of the key frame image, and the label information of the key frame image is used for representing the type of the key frame image; when the key frame image comprises text information, the third picture information also comprises the text information in the key frame image;
counting third picture information of the plurality of key frame images to obtain second picture information of the third live broadcast video;
counting image features of the plurality of key frame images to obtain a second image feature of the third live broadcast video;
and storing second picture information and second image characteristics of the third live video in the message queue.
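The key-frame selection recited in claim 9 (a frame whose image difference from the preceding frame exceeds a preset threshold) can be sketched as below. The difference metric (mean absolute pixel difference) is an assumption — the patent does not fix one — and "preceding frame" is read here as the immediately previous frame.

```python
# Hypothetical sketch of claim 9's key-frame selection: a frame is a
# key frame when its difference from the immediately preceding frame
# exceeds a preset threshold. Mean absolute pixel difference is an
# illustrative metric. The first frame is kept so the video always
# yields at least one key frame.
def select_key_frames(frames, diff_threshold):
    """frames: list of equal-length grayscale pixel lists.
    Returns the indices of the selected key frames."""
    key_indices = []
    for i, frame in enumerate(frames):
        if i == 0:
            key_indices.append(i)
            continue
        prev = frames[i - 1]
        diff = sum(abs(a - b) for a, b in zip(frame, prev)) / len(frame)
        if diff > diff_threshold:
            key_indices.append(i)
    return key_indices
```

A scene cut (large pixel change) is selected, while near-duplicate consecutive frames are skipped, which keeps the per-video statistics in the message queue compact.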
10. The method according to claim 9, wherein the second picture information of the third live video further includes tag information of the third live video; the tag information of the third live video is used for representing the type of the third live video;
before saving the second picture information and the second image characteristics of the third live video in the message queue, the method further comprises:
in response to an uploading instruction of the third live video, extracting a video segment of a second preset duration at intervals of a first preset duration from the third live video;
identifying and extracting label information of a first image in each video segment; wherein the label information of the first image is used for representing the type of the first image; the first image is a first frame image in the video segment, or any frame image in the video segment;
and counting the tag information of the first image in each video segment to obtain the tag information of the third live video.
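The periodic sampling and tag statistics of claim 10 — one segment of a second preset duration taken every first preset duration, with per-segment tags aggregated into a video tag — can be sketched as follows. Majority voting is an assumed reading of "counting the tag information"; the patent does not specify the aggregation rule.

```python
from collections import Counter

# Hypothetical sketch of claim 10: sample one segment of segment_len
# seconds every interval seconds, then aggregate the per-segment tags.
# Majority voting is an illustrative aggregation choice.
def segment_starts(video_duration, interval, segment_len):
    """Start times (in seconds) of the sampled segments."""
    starts = []
    t = 0
    while t + segment_len <= video_duration:
        starts.append(t)
        t += interval
    return starts

def aggregate_tags(per_segment_tags):
    """The most frequent per-segment tag becomes the video's tag."""
    return Counter(per_segment_tags).most_common(1)[0][0]
```

With a 60-second video, a 10-second interval and 2-second segments, six segments are sampled; if most of them are tagged "game", the whole live video is tagged "game".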
11. A method for searching live video according to any one of claims 6 and 8-10, wherein the search result further comprises a score of each live video of the K live videos.
12. A search apparatus for live video, comprising:
the acquisition module is configured to acquire a target picture for searching live video;
the first identification module is configured to identify the target picture acquired by the acquisition module to obtain first picture information of the target picture; the first picture information comprises first label information, and the first label information is used for representing the type of the target picture;
an extraction module configured to extract a first image feature from the target picture acquired by the acquisition module, wherein the first image feature is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image included in the target picture;
the first sending module is configured to send the first picture information obtained by the first identification module and the first image feature obtained by the extraction module to a server so as to request the server to search for K live videos matched with the target picture according to the first picture information and the first image feature, wherein K is greater than or equal to 1, and K is a positive integer;
the scores of the K live videos are higher than the scores of other live videos except the K live videos in M first live videos and N second live videos; the score is used for representing the matching degree between a live video and the target picture; M is greater than or equal to 1, and M is a positive integer; N is greater than or equal to 1, and N is a positive integer; second picture information of the M first live videos is matched with the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video; second image features of the N second live videos are matched with the first image feature; wherein the second image feature of a second live video is used for characterizing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image in the second live video; the second picture information is obtained according to statistics of third picture information of a plurality of key frame images in the first live video, wherein the image difference between a key frame image and its preceding frame image is greater than a preset difference threshold, the third picture information comprises label information of the key frame images, and the label information of a key frame image is used for representing the type of the key frame image; the second image feature is obtained according to statistics of image features of a plurality of key frame images in the second live video, and the image feature of a key frame image is used for representing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of the key frame image;
a first receiving module configured to receive a search result from the server, where the search result includes the K live videos, and to display the K live videos.
13. The apparatus for searching a live video according to claim 12, wherein when text information is included in the target picture, the first picture information further includes text information in the target picture.
14. The apparatus for searching for a live video according to claim 12 or 13, wherein the obtaining module is specifically configured to, in response to a screenshot instruction for capturing an image of a target area, capture the image of the target area to obtain the target picture.
15. The apparatus for searching a live video according to claim 12 or 13, wherein K ≧ 2; the search results further include a score for each of the K live videos;
the first receiving module is further configured to display the K live videos in an order from high scores to low scores of the K live videos.
16. A search apparatus for live video, comprising:
the second receiving module is configured to receive first picture information and first image characteristics of a target picture sent by user equipment; the first picture information comprises first label information used for representing the type of the target picture; the first image feature is used for characterizing at least one of color features, texture features, shape features and spatial relationship features of an image included in the target picture;
the first matching module is configured to compare the first picture information obtained by the second receiving module with second picture information of a plurality of live videos stored in the search apparatus for live video to determine M first live videos, wherein M is greater than or equal to 1, and M is a positive integer; the second picture information of the M first live videos is matched with the first picture information; the second picture information of a first live video comprises second label information used for representing the type of the first live video; the second picture information is obtained according to statistics of third picture information of a plurality of key frame images in a third live video, wherein the image difference between a key frame image and its preceding frame image is greater than a preset difference threshold, the third picture information comprises label information of the key frame images, and the label information of a key frame image is used for representing the type of the key frame image; wherein the third live video is contained in the plurality of live videos;
the second matching module is configured to compare the first image feature obtained by the second receiving module with second image features of the plurality of live videos to determine N second live videos, wherein N is greater than or equal to 1, and N is a positive integer; the second image features of the N second live videos are matched with the first image feature; the second image feature of a second live video is used for representing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of an image in the second live video; the second image feature is obtained according to statistics of image features of a plurality of key frame images in the third live video, and the image feature of a key frame image is used for representing at least one of a color feature, a texture feature, a shape feature and a spatial relationship feature of the key frame image;
the scoring module is configured to obtain scores of the M first live videos according to the matching degree of the second picture information of the M first live videos and the first picture information determined by the first matching module; obtaining scores of the N second live videos according to the matching degree of the first image features and the second image features of the N second live videos determined by the second matching module; the score is used for representing the matching degree of the live video and the target picture;
the selecting module is configured to select K live videos from the M first live videos and the N second live videos according to the scores of the M first live videos and the scores of the N second live videos obtained by the scoring module, wherein K is greater than or equal to 1, and K is a positive integer; the scores of the K live videos are higher than the scores of other live videos except the K live videos in the M first live videos and the N second live videos;
a second sending module configured to send search results to the user device, the search results including the K live videos.
17. The apparatus for searching for live video according to claim 16,
the scoring module is specifically configured to score the M first live videos respectively according to the matching degree between the second picture information of the M first live videos and the first picture information, and multiply the scores of the M first live videos by a first preset coefficient to obtain the scores of the M first live videos; the score is used for representing the matching degree between the second picture information of a first live video and the first picture information of the target picture; score the N second live videos respectively according to the matching degree between the second image features of the N second live videos and the first image feature, and multiply the scores of the N second live videos by a second preset coefficient to obtain the scores of the N second live videos; wherein the score is used for characterizing the matching degree between the second image feature of a second live video and the first image feature of the target picture; and select the K live videos from the M first live videos and the N second live videos in descending order of the scores of the M first live videos and the scores of the N second live videos;
and the scores of the K live videos are higher than the scores of other live videos in the M first live videos and the N second live videos.
18. A search apparatus of a live video according to claim 16 or 17, wherein when text information is included in the target picture, the first picture information further includes text information in the target picture;
when the first live video comprises text information, the second picture information of the first live video also comprises the text information in the first live video.
19. The apparatus for searching for live video according to claim 18, wherein the apparatus for searching for live video stores the second picture information of the plurality of live videos in a message queue.
20. The apparatus for searching a live video according to claim 19, further comprising:
the determining module is configured to respond to an uploading instruction of a third live video, identify each frame image of the third live video, and determine a plurality of key frame images of the third live video;
a second identification module configured to perform, for each of the plurality of key frame images determined by the determination module: identifying the key frame image to obtain third picture information of the key frame image; extracting image features of the key frame images from the key frame images; the third picture information comprises label information of the key frame image, and the label information of the key frame image is used for representing the type of the key frame image; when the key frame image comprises text information, the third picture information also comprises the text information in the key frame image;
the statistical module is configured to count third picture information of the plurality of key frame images identified by the second identification module to obtain second picture information of the third live broadcast video; counting the image characteristics of the plurality of key frame images identified by the second identification module to obtain second image characteristics of the third live broadcast video;
a saving module configured to save second picture information and second image features of the third live video in the message queue.
21. The apparatus according to claim 20, wherein the determining module is further configured to extract, in response to an instruction to upload the live video, video segments of a second preset duration at intervals of a first preset duration in the live video;
the second identification module is further configured to identify the label information of the first image in each extracted video segment; wherein the label information of the first image is used for representing the type of the first image; the first image is a first frame image in the video segment, or any frame image in the video segment;
the statistic module is further configured to count tag information of the first image in each video segment to obtain tag information of the third live video.
22. A device for searching live videos as claimed in any one of claims 17 and 19-21, wherein the search result further comprises a score of each live video of the K live videos.
23. A user device, comprising: a processor; a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement a search method of live video as claimed in any of claims 1-4.
24. A server, comprising: a processor; a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement a search method of live video as claimed in any one of claims 5 to 11.
25. A computer readable storage medium having computer instructions stored thereon, which when executed on a user equipment implement the method of any one of claims 1-4.
26. A computer-readable storage medium having computer instructions stored thereon, which when executed on a server implement the method of any one of claims 5-11.
CN201911175053.6A 2019-11-26 2019-11-26 Live video searching method and device, equipment, server and storage medium Active CN110909209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911175053.6A CN110909209B (en) 2019-11-26 2019-11-26 Live video searching method and device, equipment, server and storage medium

Publications (2)

Publication Number Publication Date
CN110909209A CN110909209A (en) 2020-03-24
CN110909209B true CN110909209B (en) 2022-12-27

Family

ID=69819659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911175053.6A Active CN110909209B (en) 2019-11-26 2019-11-26 Live video searching method and device, equipment, server and storage medium

Country Status (1)

Country Link
CN (1) CN110909209B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111475677A * 2020-04-30 2020-07-31 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method, image processing device, storage medium and electronic equipment
CN111787348B * 2020-07-28 2022-10-04 China United Network Communications Group Co., Ltd. Live broadcast-based video pushing method and device and terminal equipment
CN111954017B * 2020-08-14 2022-03-25 Beijing Dajia Internet Information Technology Co., Ltd. Live broadcast room searching method and device, server and storage medium
CN111984825A * 2020-08-28 2020-11-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Method and apparatus for searching video
CN112866762A * 2020-12-31 2021-05-28 Beijing Dajia Internet Information Technology Co., Ltd. Processing method and device for acquiring video associated information, electronic equipment and server
CN114866800A * 2022-03-28 2022-08-05 Guangzhou Boguan Information Technology Co., Ltd. Video playing control method and device and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942337A * 2014-05-08 2014-07-23 Beihang University Video search system based on image recognition and matching
CN107704525A * 2017-09-04 2018-02-16 Youku Network Technology (Beijing) Co., Ltd. Video searching method and device
CN108255970A * 2017-12-26 2018-07-06 Nubia Technology Co., Ltd. Video retrieval method, terminal and computer-readable storage medium
CN108833973A * 2018-06-28 2018-11-16 Tencent Technology (Shenzhen) Co., Ltd. Video feature extraction method, device and computer equipment
CN109271552A * 2018-08-22 2019-01-25 Beijing Dajia Internet Information Technology Co., Ltd. Method and apparatus for retrieving videos by picture, electronic equipment and storage medium
CN109348275A * 2018-10-30 2019-02-15 Baidu Online Network Technology (Beijing) Co., Ltd. Video processing method and device
CN109982106A * 2019-04-29 2019-07-05 Baidu Online Network Technology (Beijing) Co., Ltd. Video recommendation method, server, client and electronic equipment
CN110446063A * 2019-07-26 2019-11-12 Tencent Technology (Shenzhen) Co., Ltd. Video cover generation method and device, and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106708823A * 2015-07-20 2017-05-24 Alibaba Group Holding Ltd. Search processing method, apparatus and system
US20170060867A1 * 2015-08-31 2017-03-02 Adfamilies Publicidade, SA Video and image match searching
CN108255922A * 2017-11-06 2018-07-06 UCWeb Technology Co., Ltd. Video identification method, equipment, client device, electronic equipment and server
CN108401189A * 2018-03-16 2018-08-14 Baidu Online Network Technology (Beijing) Co., Ltd. Video search method, apparatus and server
CN110110146A * 2019-04-12 2019-08-09 Shenzhen OneConnect Smart Technology Co., Ltd. Video clip searching method, device, medium and equipment based on artificial intelligence
CN110362714B * 2019-07-25 2023-05-02 Tencent Technology (Shenzhen) Co., Ltd. Video content searching method and device

Also Published As

Publication number Publication date
CN110909209A (en) 2020-03-24

Similar Documents

Publication Publication Date Title
CN110909209B (en) Live video searching method and device, equipment, server and storage medium
CN110121118B (en) Video clip positioning method and device, computer equipment and storage medium
CN109086709B (en) Feature extraction model training method and device and storage medium
CN110134804B (en) Image retrieval method, device and storage medium
CN110807361B (en) Human body identification method, device, computer equipment and storage medium
CN110222789B (en) Image recognition method and storage medium
CN110650379B (en) Video abstract generation method and device, electronic equipment and storage medium
CN111127509B (en) Target tracking method, apparatus and computer readable storage medium
CN112084811B (en) Identity information determining method, device and storage medium
JP7210089B2 (en) RESOURCE DISPLAY METHOD, APPARATUS, DEVICE AND COMPUTER PROGRAM
CN113627413B (en) Data labeling method, image comparison method and device
CN113032587B (en) Multimedia information recommendation method, system, device, terminal and server
CN110991445B (en) Vertical text recognition method, device, equipment and medium
CN113918767A (en) Video clip positioning method, device, equipment and storage medium
CN113987326B (en) Resource recommendation method and device, computer equipment and medium
CN111325220A (en) Image generation method, device, equipment and storage medium
CN112989198B (en) Push content determination method, device, equipment and computer-readable storage medium
CN112053360B (en) Image segmentation method, device, computer equipment and storage medium
CN111611414B (en) Vehicle searching method, device and storage medium
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition
CN114817709A (en) Sorting method, device, equipment and computer readable storage medium
CN115221888A (en) Entity mention identification method, device, equipment and storage medium
CN111068333B (en) Video-based carrier abnormal state detection method, device, equipment and medium
CN111488899B (en) Feature extraction method, device, equipment and readable storage medium
CN113763486B (en) Dominant hue extraction method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant