CN114297433A - Method, device and equipment for searching question and answer results and storage medium - Google Patents

Method, device and equipment for searching question and answer results and storage medium

Info

Publication number
CN114297433A
Authority
CN
China
Prior art keywords
video, key frame, answer, image, key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111620699.8A
Other languages
Chinese (zh)
Other versions
CN114297433B (en)
Inventor
汪忠超
王艳丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202111620699.8A priority Critical patent/CN114297433B/en
Publication of CN114297433A publication Critical patent/CN114297433A/en
Priority to PCT/CN2022/137552 priority patent/WO2023124874A1/en
Application granted granted Critical
Publication of CN114297433B publication Critical patent/CN114297433B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/73: Querying
    • G06F16/732: Query formulation
    • G06F16/735: Filtering based on additional data, e.g. user or group profiles
    • G06F16/738: Presentation of query results
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The present disclosure provides a method, an apparatus, a device and a storage medium for searching question and answer results, relating to the technical field of video processing. The method comprises the following steps: acquiring video information corresponding to a search question, wherein the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image contains text content and image content answering the search question, and the number of key frame images matches the number of answer key points for the search question; displaying the video cover in a search result page; and playing the video in response to detecting a trigger operation on the video cover. Video search efficiency can thereby be improved.

Description

Method, device and equipment for searching question and answer results and storage medium
Technical Field
The embodiments of the present disclosure relate to the technical field of video processing, and in particular to a method, an apparatus, a device, and a storage medium for searching question and answer results.
Background
Video applications provided by the related art offer a video search function and a video playback function: a user can search for relevant videos through the search function and then play them. However, the inventors have found that, within video search results, a user still needs to spend time locating the content they actually want to watch, which reduces search efficiency.
Disclosure of Invention
In order to solve the technical problems described above or at least partially solve the technical problems, the present disclosure provides a method, an apparatus, a device, and a storage medium for searching question and answer results.
The present disclosure provides a method for searching question and answer results, including:
acquiring video information corresponding to a search question; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image contains text content and image content answering the search question, and the number of key frame images matches the number of answer key points for the search question; displaying the video cover in a search result page; and playing the video in response to detecting a trigger operation on the video cover.
Optionally, the arrangement order of the plurality of key frame images in the video cover matches the order of the answer key points corresponding to the search question.
Optionally, the key frame image is an image, cropped from a video frame of the video, that contains the text content and image content of an answer key point.
Optionally, the text content of the answer key point contained in the key frame image is recognized from the video frames of the video, and the image content of the answer key point contained in the key frame image is matched from the video frames of the video based on that text content.
Optionally, playing the video in response to detecting the trigger operation on the video cover includes: in response to detecting a trigger operation on a first key frame image on the video cover, determining the time identifier of the first key frame image on the playing time axis of the video; jumping to the playing page of the video, and starting to play the video from the position indicated by the time identifier.
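The optional tap-to-seek behavior above can be sketched minimally in Python. The data shapes and names below (`video_info`, `time_map`, key frame ids) are illustrative assumptions, not structures defined by the patent; the point is only that the video information must carry a key-frame-to-time mapping for the client to resolve a seek position.

```python
# Minimal sketch of the client-side seek behavior described above.
# All names are illustrative stand-ins for whatever the video
# information actually carries; the patent text does not fix a schema.

def resolve_seek_time(tapped_key_frame_id, key_frame_time_map, default=0.0):
    """Return the playback position (seconds) for a tapped key frame.

    Falls back to the start of the video when the tapped image has no
    recorded time identifier.
    """
    return key_frame_time_map.get(tapped_key_frame_id, default)

# Example: video info fed back by the server pairs key frames with times.
video_info = {
    "cover_key_frames": ["kf-1", "kf-2", "kf-3"],
    "time_map": {"kf-1": 12.5, "kf-2": 47.0, "kf-3": 95.25},
}

seek = resolve_seek_time("kf-2", video_info["time_map"])
# The player would then jump to the playing page and start from `seek`.
```

The lookup itself is trivial; the design point is that the server resolves each key frame image to a time identifier once, so the client never has to re-match images against the video.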
The present disclosure also provides a method for searching question and answer results, including:
receiving a search request from a terminal device, wherein the search request includes a search question;
searching for and obtaining a video containing an answer to the search question, based on the search question;
determining a plurality of answer key points of the answer based on the audio signal or text content of the video;
for each answer key point, processing the video to obtain a key frame image containing the text content and image content of that answer key point;
generating a video cover of the video based on a plurality of key frame images corresponding to the plurality of answer key points;
and feeding back the video information containing the video cover to the terminal equipment.
Optionally, processing the video to obtain a key frame image containing the text content and image content of an answer key point includes:
performing character recognition processing on the video to obtain the text content of the answer key point; identifying, from the video frames of the video, image content matching that text content; cropping the image content from the video frame; and generating a key frame image containing the answer key point based on the text content and the image content.
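The recognize/match/crop/compose steps above can be illustrated with a toy composition routine. No real OCR or vision call is made here; the crop box, frame size, and record layout are hypothetical stand-ins, since the patent does not prescribe an implementation.

```python
# Illustrative sketch of pairing an answer key point's recognized text
# with a cropped region of a matched video frame. Everything here is a
# hypothetical stand-in; no actual OCR or image processing occurs.

def clamp_box(frame_w, frame_h, box):
    """Clip an (x, y, w, h) crop box so it stays inside the frame."""
    x, y, w, h = box
    x = max(0, min(x, frame_w))
    y = max(0, min(y, frame_h))
    w = min(w, frame_w - x)
    h = min(h, frame_h - y)
    return (x, y, w, h)

def build_key_frame(answer_text, frame_size, crop_box):
    """Compose a key frame record from text content and a crop region."""
    fw, fh = frame_size
    return {"text": answer_text, "crop": clamp_box(fw, fh, crop_box)}

kf = build_key_frame("Step 1: whisk the eggs", (1920, 1080),
                     (1500, 900, 600, 400))
# crop is clamped to (1500, 900, 420, 180) so it fits a 1920x1080 frame
```

Clamping matters in practice because detected text or object boxes can extend past the frame edge; a real implementation would crop pixels with an imaging library rather than return coordinates.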
Optionally, the generating a video cover of the video based on a plurality of key frame images corresponding to the plurality of answer key points includes:
determining the arrangement order of the plurality of key frame images corresponding to the plurality of answer key points in the video cover, based on the order of the answer key points in the answer; determining a corresponding splicing template based on the sizes of the key frame images and their arrangement order; scaling each key frame image proportionally based on the size of its corresponding region in the splicing template; and inserting each scaled key frame image into its corresponding region of the splicing template to obtain the video cover of the video.
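The equal-scale (aspect-preserving) fitting step above reduces to taking the smaller of the two axis ratios. The sketch below is a minimal illustration under assumed template regions; the three-region layout is a made-up example, not one from the patent.

```python
# Sketch of proportionally fitting key frame images into the regions of
# a splicing template. The template layout is a hypothetical example.

def fit_scale(img_w, img_h, region_w, region_h):
    """Equal-scale factor that fits an image inside a template region
    without distorting its aspect ratio."""
    return min(region_w / img_w, region_h / img_h)

def compose_cover(key_frame_sizes, template_regions):
    """Scale each key frame image into its region, in answer key point
    order; returns the scaled (w, h) per region."""
    placed = []
    for (w, h), (rw, rh) in zip(key_frame_sizes, template_regions):
        s = fit_scale(w, h, rw, rh)
        placed.append((round(w * s), round(h * s)))
    return placed

# Mixed landscape/portrait key frames placed into a 3-region template:
cover = compose_cover([(1920, 1080), (1080, 1920), (1280, 720)],
                      [(640, 360), (360, 640), (640, 360)])
# → [(640, 360), (360, 640), (640, 360)]
```

Using `min` of the two ratios is what keeps a portrait key frame from being stretched into a landscape region, which is exactly the image-quality problem the description attributes to naive cover resizing.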
Optionally, before feeding back the video information containing the video cover to the terminal device, the method further includes: determining the time identifier of each key frame image on the playing time axis of the video; and adding the correspondence between each key frame image and its time identifier to the video information.
Optionally, determining the time identifier of each key frame image on the playing time axis of the video includes:
for each key frame image, matching the key frame image against the video frames in the video, and determining the target video frame that matches it; and determining the time identifier corresponding to the target video frame as the time identifier of the key frame image on the playing time axis of the video.
Optionally, determining the time identifier of each key frame image on the playing time axis of the video includes: determining the time identifier of each key frame image according to the mapping relationship between each video frame and its playing time on the playing time axis of the video.
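For a constant-frame-rate video, the frame-to-time mapping just mentioned reduces to dividing the matched frame's index by the frame rate. The helper below is an illustrative sketch under that constant-fps assumption; variable-frame-rate video would need per-frame timestamps instead.

```python
# Sketch of deriving time identifiers from matched target frame indices,
# assuming a constant frame rate. Names and values are illustrative.

def time_identifier(frame_index, fps):
    """Play time (seconds) of a video frame under a constant frame rate."""
    return frame_index / fps

def time_map(matched_frames, fps):
    """Map each key frame image id to its time identifier, given the
    index of the target video frame it was matched to."""
    return {kf: time_identifier(idx, fps) for kf, idx in matched_frames.items()}

tm = time_map({"kf-1": 300, "kf-2": 1175}, fps=25)
# → {"kf-1": 12.0, "kf-2": 47.0}
```

This map is what would be added to the video information as the key-frame/time-identifier correspondence before it is fed back to the terminal device.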
The present disclosure also provides a device for searching question and answer results, including:
the information acquisition module is used for acquiring video information corresponding to a search question; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image contains text content and image content answering the search question, and the number of key frame images matches the number of answer key points for the search question;
the cover display module is used for displaying the video cover in a search result page;
and the video playing module is used for responding to the detection of the triggering operation of the video cover to play the video.
Optionally, the arrangement order of the plurality of key frame images in the video cover matches the order of the answer key points corresponding to the search question.
Optionally, the key frame image is an image, cropped from a video frame of the video, that contains the text content and image content of an answer key point.
Optionally, the text content of the answer key point contained in the key frame image is recognized from the video frames of the video, and the image content of the answer key point contained in the key frame image is matched from the video frames of the video based on that text content.
Optionally, the video playing module is configured to: in response to detecting a trigger operation on a first key frame image on the video cover, determine the time identifier of the first key frame image on the playing time axis of the video; jump to the playing page of the video, and start playing the video from the position indicated by the time identifier.
The present disclosure also provides a device for searching question and answer results, including:
the device comprises a request receiving module, a search processing module and a search processing module, wherein the request receiving module is used for receiving a search request of terminal equipment, and the search request comprises a search problem;
the video searching module is used for searching for and obtaining a video containing an answer to the search question, based on the search question;
the key point determining module is used for determining a plurality of answer key points of the answer based on the audio signal or the text content of the video;
the image obtaining module is used for processing the video, for each answer key point, to obtain a key frame image containing the text content and image content of that answer key point;
the cover generation module is used for generating a video cover of the video based on a plurality of key frame images corresponding to the plurality of answer key points;
and the information feedback module is used for feeding back the video information containing the video cover to the terminal equipment.
Optionally, the image obtaining module includes:
a text content obtaining unit, configured to perform character recognition processing on the video to obtain the text content of the answer key point;
the image content identification unit is used for identifying image content matched with the text content from video frames of the video based on the text content of the answer key point;
an image content cropping unit, configured to crop the image content from the video frame;
and the image generating unit is used for generating a key frame image containing the answer key points based on the text content and the image content of the answer key points.
Optionally, the cover generation module includes:
the order determining unit is used for determining the arrangement order of a plurality of key frame images corresponding to a plurality of answer key points in the video cover based on the order of the answer key points in the answer;
the template determining unit is used for determining a corresponding splicing template based on the sizes of the plurality of key frame images and the arrangement sequence among the plurality of key frame images;
the scaling processing unit is used for proportionally scaling the key frame image corresponding to each region, based on the size of that region in the splicing template;
and the image inserting unit is used for inserting each scaled key frame image into its corresponding region of the splicing template to obtain the video cover of the video.
Optionally, the apparatus further comprises:
the time identifier determining module is used for determining the corresponding time identifier of each key frame image on the playing time axis of the video;
and the relationship adding module is used for adding the corresponding relationship between each key frame image and the time identifier into the video information.
Optionally, the time identifier determining module includes:
the matching unit is used for matching the key frame images with video frames in the video aiming at each key frame image and determining a target video frame matched with the key frame image in the video frames;
and the first identification determining unit is used for determining the time identification corresponding to the target video frame as the time identification corresponding to the key frame image on the playing time axis of the video.
Optionally, the time identifier determining module includes: and the second identifier determining unit is used for determining the corresponding time identifier of each key frame image on the playing time axis of the video according to the mapping relation between each video frame and the playing time on the playing time axis of the video.
The present disclosure also provides a terminal device, including: a memory and a processor, wherein the memory stores a computer program, and the processor performs the method described above when executing the computer program.
The present disclosure also provides a computer-readable storage medium having a computer program stored therein which, when executed by a processor, performs the method described above.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages:
the present disclosure provides a method, apparatus, device and storage medium for searching question and answer results, the method obtains video information corresponding to a search question; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image comprises text content and image content for answering a search question, and the number of the key frame images is matched with the number of key points for answering the search question; displaying a video cover page in a search result page; and playing the video in response to detecting the triggering operation on the video cover. The method for searching the question and answer results can be exemplarily applied to video search scenes and the like of the terminal equipment. In the technical scheme, each key frame image forming the video cover comprises text content and image content for solving the search problem, wherein the text content is more in line with the daily reading habit of a user, the intuition of the search result can be improved, the image content has stronger picture feeling, and the vividness of the search result can be improved; the number of the key frame images is matched with the number of the key points of the answers, so that the answers under the search questions can be more comprehensively covered; based on this, the video cover displayed in the search result page can comprehensively, intuitively and accurately express the video searching desire of the user, so that the user can quickly find the content needing to be watched by the user from the video cover, and then the video is played in response to the detection of the triggering operation of the video cover, thereby effectively improving the video searching efficiency.
On the server side, the method first receives a search request containing a search question from a terminal device; then searches for and obtains a video containing an answer to the search question, based on the question; determines a plurality of answer key points of the answer based on the audio signal or text content of the video; processes the video, for each answer key point, to obtain a key frame image containing the text content and image content of that key point; generates a video cover of the video based on the key frame images corresponding to the answer key points; and finally feeds video information containing the video cover back to the terminal device. This method can be applied, for example, to video search scenarios on a server. In this technical solution, because the video is retrieved as one containing an answer to the search question, the obtained video matches the answer closely, and the key frame images determined from the video contain the text content and image content of the answer key points, so they represent the answer key points intuitively and vividly. Furthermore, the video cover generated from the key frame images can comprehensively, intuitively and accurately express the user's video search intent; feeding video information containing this cover back to the terminal device lets the user quickly find the content they want to watch through the cover, significantly improving the user experience of the video search function.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
To more clearly illustrate the technical solutions in the embodiments of the present disclosure or in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below; for those of ordinary skill in the art, other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flowchart of a method for searching question and answer results according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating a method for generating a video cover according to an embodiment of the disclosure;
FIG. 3 is a schematic view of a video cover according to one embodiment of the present disclosure;
fig. 4 is a flowchart of a method for searching question and answer results according to a second embodiment of the present disclosure;
fig. 5 is a block diagram illustrating a structure of a device for searching question and answer results according to a fourth embodiment of the present disclosure;
fig. 6 is a block diagram of a device for searching question and answer results according to a fifth embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
In a video application provided by the related art, a user can search for a relevant video and play it. However, within the video search results, the user still needs to spend time locating the content they want to watch, which reduces search efficiency. Based on this, the embodiments of the present disclosure provide a method, an apparatus, a device, and a storage medium for searching question and answer results, described in detail below.
The first embodiment is as follows:
referring to a flowchart of a method for searching question and answer results shown in fig. 1, the method is applicable to terminal devices with video applications, such as mobile phones, tablet computers and the like. The method for searching the question and answer results comprises the following steps:
step S102, video information corresponding to the search question is obtained; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of the video, each key frame image comprises text content and image content of the answering search question, and the number of the key frame images is matched with the number of answer key points of the answering search question.
In one embodiment, a user sends a search request including a search question to a server through a terminal device; wherein, the search question may contain at least one keyword.
For ease of understanding, one embodiment in which the server feeds back video information corresponding to the search question is given below.
First, the server searches for and obtains a video containing an answer to the search question, based on the search question; in practice, the server may find an answer matching the keywords in the search question, and then retrieve a video containing that answer from a preset video resource library.
Then, a plurality of answer key points of the answer are determined based on the audio signal or text content of the video. The server may recognize content of the video, such as its text content, audio signal, image content, and play time, and determine the answer key points from the recognized content. For each answer key point, the video is processed to obtain a key frame image containing the text content and image content of that key point; specifically, for example, a key frame image matching the answer key point is determined from the video using the recognized text content and image content.
Then, a video cover of the video is generated based on the key frame images corresponding to the answer key points, and video information containing the video cover is fed back to the terminal device. The video cover may be the result of splicing the key frame images together; the splicing layout is flexible, so a cover composed of key frame images can present a good page display effect.
Step S104, the video cover is displayed in the search result page.
The terminal device receives the video information and displays a video cover page included in the video information in a search result page.
Step S106, the video is played in response to detecting a trigger operation on the video cover.
In this embodiment, the terminal device determines whether a trigger operation on the video cover is detected; if the trigger operation is detected, the video is played in response to it.
According to the above method for searching question and answer results, video information corresponding to a search question is obtained, the video information comprising a video cover composed of a plurality of key frame images of the video, where each key frame image contains text content and image content answering the search question and the number of key frame images matches the number of answer key points; the video cover is displayed in a search result page; and the video is played in response to detecting a trigger operation on the cover. In this technical solution, each key frame image composing the video cover contains both text content and image content answering the search question: the text content fits users' everyday reading habits and improves the intuitiveness of the search result, while the image content is more visual and improves its vividness. Because the number of key frame images matches the number of answer key points, the answer to the search question is covered more comprehensively. On this basis, the video cover displayed in the search result page can comprehensively, intuitively and accurately express the user's video search intent, so the user can quickly find the content to watch from the cover, and the video is then played upon a detected trigger operation, effectively improving video search efficiency.
The arrangement of the key frame images has a great influence on the display effect of the resulting video cover. Based on this, this embodiment provides a display mode for the video cover: the key frame images are arranged in the cover in the order of the answer key points corresponding to the search question. The orderly arranged key frame images highlight the sequence of the answer key points and improve the match between the video cover and the answer to the search question.
In practical applications, a search generally returns multiple videos, and their aspect ratios are not necessarily the same; for example, some are landscape videos and some are portrait videos. Video covers are usually taken from frames of the videos themselves, so covers for different videos differ in size, and when covers for multiple videos are shown on the same search result page, the interface layout becomes messy. Simply forcing covers of different aspect ratios to a uniform size, for example displaying a portrait video with a landscape cover, produces a poor display effect and makes image quality hard to guarantee.
Therefore, to improve how video covers are displayed in the search result page and make the interface friendlier, this embodiment provides a key frame image that is cropped from a video frame of the video and contains the text content and image content of an answer key point. The text content of the answer key point is recognized from the video frames, and the image content of the answer key point is matched from the video frames based on that text content.
In practical applications, the key frame image in this embodiment is generated by the server. For better understanding, the manner of obtaining a key frame image for each answer key point is described here. Referring to fig. 2, the process of obtaining from the video a key frame image containing the text content and image content of an answer key point includes the following steps S202 to S208:
step S202, character recognition processing is carried out on the video to obtain text content of key points of the answer.
Most videos have subtitles, which reflect the video content fairly accurately. In a specific embodiment, character recognition may therefore be performed on each video frame through OCR (Optical Character Recognition) technology or a detection model, yielding a candidate character recognition result for each frame. Each candidate result is then checked against the answer key points: if a candidate result can be successfully matched with some answer key point, it matches the search question and can express that answer key point, so it is determined to be a target character recognition result, and the target result (or the keywords within it) is taken as the text content of that answer key point.
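A toy version of this matching step is sketched below, substituting simple keyword containment for a real matcher; the hard-coded OCR strings and the keyword-tuple representation of answer key points are illustrative assumptions.

```python
# Toy matcher between candidate OCR results and answer key points.
# Real systems would use fuzzier text matching; keyword containment
# is used here purely for illustration.

def match_answer_point(candidate_text, answer_points):
    """Return the index of the first answer key point all of whose
    keywords appear in the candidate OCR text, or None if no key point
    matches."""
    text = candidate_text.lower()
    for i, keywords in enumerate(answer_points):
        if all(k.lower() in text for k in keywords):
            return i
    return None

# Answer key points for a hypothetical baking-tutorial search question:
answer_points = [("knead", "dough"), ("whisk", "eggs"), ("bake",)]

hit = match_answer_point("Step 2: whisk the eggs until foamy", answer_points)
# → 1, i.e. the candidate subtitle expresses the second answer key point
```

A frame whose subtitle matches no key point (opening credits, filler shots) simply yields `None` and is skipped, which is the filtering behavior the paragraph above describes.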
Generally, a video presents the complete process of an activity, and an answer determined based on the video should likewise include multiple, logically ordered answer key points. For example, for tutorial videos such as gourmet videos, handicraft videos, and fitness videos, the displayed content usually includes a plurality of answer key points, each corresponding to a key step. Accordingly, the text content obtained according to the above embodiment is also generally plural, in correspondence with the answer key points.
And step S204, based on the text content of the key points of the answer, identifying the image content matched with the text content from the video frames of the video.
When the total number of frames of the video is small, image contents matching the text contents can be identified one by one from the video frames of the video.
When the total number of frames of the video is large, the information contained in the video is richer and more diversified. In order to improve recognition efficiency, this embodiment may segment the video into a plurality of video segments, and then identify the image content matched with the text content from the video frames of at least some of the video segments. Specific embodiments can refer to the following steps (1) and (2).
(1) Calculate the correlation between every two consecutive video frames; when the calculated correlation is smaller than a preset correlation threshold, segment the video at the position of those two consecutive video frames, so as to obtain a plurality of video segments.
(2) And identifying image content matched with the text content from the video frames of the video clip based on the text content of the key points of the answer.
Considering that the playing order of the video is consistent with the order of the answer key points, the temporal order of the video segments is likewise related to the order of the answer key points. On this basis, at least one video segment may be assigned to each answer key point, and the image content matched with the text content of the current answer key point is then identified from the video frames of the video segments assigned to it. A specific implementation may include: identifying the image content in each video frame of the video segment one by one, judging whether the image recognition result matches the text content of the current answer key point, and if so, taking it as the image content matched with the text content.
Of course, if image content matching the text content of the current answer key point cannot be identified from the assigned video segments, it is identified from the other video segments. In this way, not every video frame of the video needs to be compared with the text content of every answer key point: each answer key point only needs to be compared with the video frames in some of the video segments, and since each answer key point is assigned its own video segments, image content can be identified for multiple answer key points in parallel, which effectively improves the recognition efficiency of the image content.
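Steps (1) and (2) above can be sketched as follows. This is an illustrative sketch, not the patented implementation: frames are reduced to flat numeric vectors, `frame_corr` is a plain normalized dot product standing in for a real inter-frame correlation measure, and segments are distributed to answer key points evenly in temporal order.

```python
import math

def frame_corr(a, b):
    # Cosine-style correlation between two flattened frame vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def segment_video(frames, threshold=0.9):
    """Cut between consecutive frames whose correlation < threshold."""
    segments, current = [], [0]
    for i in range(1, len(frames)):
        if frame_corr(frames[i - 1], frames[i]) < threshold:
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return segments  # each segment is a list of frame indices

def assign_segments(segments, answer_points):
    """Segment order follows answer-point order, so distribute the
    segments to the key points proportionally."""
    per = max(1, len(segments) // len(answer_points))
    return {p: segments[i * per:(i + 1) * per] or segments[-1:]
            for i, p in enumerate(answer_points)}

frames = [[10, 10], [10, 11], [90, 1], [91, 2], [2, 80], [1, 81]]
segs = segment_video(frames, threshold=0.9)
print(segs)  # → [[0, 1], [2, 3], [4, 5]]
print(assign_segments(segs, ["mince meat", "beat eggs", "steam"]))
```

Each answer key point then searches only its own segments first, falling back to the remaining segments on a miss.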
Step S206, intercepting image content from the video frame.
In this embodiment, after the image content matched with the text content is identified from the video frame, the corresponding video frame is intercepted to obtain an intercepted image containing the image content.
In step S208, a key frame image including the answer key is generated based on the text content and the image content of the answer key.
In order to more completely preserve the text content and the image content of the answer key points, the embodiment may generate the key frame image containing the answer key points by the following method, including:
Target detection is performed on the intercepted image to obtain a bounding box, within the intercepted image, around the text content and the image content of the answer key point. The bounding box is then expanded and adjusted according to a preset length-width ratio. The length-width ratio is set to facilitate stitching of the key frame images and to avoid size mismatches between images during stitching. Expansion adjustment means that, when a size parameter of the bounding box makes its ratio smaller than the preset length-width ratio, the smaller size parameter is increased so that the length-width ratio of the adjusted bounding box equals the preset length-width ratio. For example, when the width of the bounding box is too small relative to the preset length-width ratio, the bounding box is resized by increasing its width. Because the adjustment only expands the box, no local text content or image content is cut off, which guarantees the completeness of the text content and the image content. Finally, based on the position parameters of the adjusted bounding box in the intercepted image, the region determined by those position parameters is cut out from the intercepted image to obtain the key frame image containing the answer key point.
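A minimal sketch of the expand-adjust step, under the assumption that the comparison of "size parameters" reduces to comparing the box's width/height ratio with the preset ratio, and that the box is grown symmetrically about its center. Box coordinates and the preset ratio are illustrative; a real system would additionally clamp the result to the image boundaries.

```python
def expand_to_ratio(x, y, w, h, target_ratio):
    """Grow (never shrink) a bounding box so that w / h == target_ratio,
    keeping it centered so no text or image content is cut off."""
    if w / h < target_ratio:
        new_w = h * target_ratio          # box too narrow: widen it
        x -= (new_w - w) / 2
        w = new_w
    else:
        new_h = w / target_ratio          # box too flat: heighten it
        y -= (new_h - h) / 2
        h = new_h
    return x, y, w, h

# A 100x300 box adjusted to a preset 2:1 ratio becomes 600x300, centered.
print(expand_to_ratio(50, 20, 100, 300, 2.0))   # → (-200.0, 20, 600.0, 300)
# A 600x100 box adjusted to the same ratio becomes 600x300.
print(expand_to_ratio(0, 0, 600, 100, 2.0))     # → (0, -100.0, 600, 300.0)
```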
According to any one of the above embodiments, after a plurality of key frame images corresponding to the plurality of answer key points are extracted from the video, a video cover of the video is generated based on those key frame images. Specific implementations of this step are as follows:
and determining the arrangement sequence of a plurality of key frame images corresponding to the plurality of answer key points in the video cover based on the sequence of the plurality of answer key points in the answers. Specifically, according to the sequence of the answer key points in the answers, the image contents corresponding to the text contents of the answer key points are sequenced, and the sequencing result of the image contents is used as the arrangement sequence among the key frame images.
A corresponding stitching template is determined based on the sizes of the plurality of key frame images and the arrangement order among them. In a specific implementation, the stitching template may be determined based on the number, arrangement order, and sizes of the key frame images. For example, if there are two key frame images with a front-to-back arrangement order, a stitching template is selected that can stitch the two images while expressing their order in an up-down, left-right, or other layout. The width-height ratio of each key frame image is determined from its size, and a stitching template matching these width-height ratios may be selected. A person skilled in the art may also determine the stitching template based on other image parameters according to actual needs, which is not limited here. The stitching templates in this embodiment may be preset in a stitching template library for calling.
Based on the size of each region in the stitching template, the key frame image corresponding to each region is scaled in equal proportion, and the scaled key frame image is inserted into the corresponding region of the template to obtain the video cover of the video. Specifically, for the current region, a scaling factor is calculated according to the size of the current region and the size of its corresponding key frame image; the key frame image is scaled in equal proportion by the calculated factor; and the scaled key frame image is inserted into the current region.
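The equal-scale fitting described above can be sketched as follows. The region and image sizes are illustrative; a real implementation would go on to paste the scaled pixels into the template region.

```python
def fit_scale(region_w, region_h, img_w, img_h):
    """Single scaling factor so the scaled image fits inside the region
    while preserving its aspect ratio (equal-scale scaling)."""
    return min(region_w / img_w, region_h / img_h)

def compose_cover(regions, images):
    """Scale each key frame image into its template region; return the
    scaled (width, height) that would be inserted into each region."""
    placed = []
    for (rw, rh), (iw, ih) in zip(regions, images):
        s = fit_scale(rw, rh, iw, ih)
        placed.append((iw * s, ih * s))
    return placed

# Two-region template (left/right halves of a 640x360 cover):
regions = [(320, 360), (320, 360)]
images = [(640, 480), (160, 360)]
print(compose_cover(regions, images))  # → [(320.0, 240.0), (160.0, 360.0)]
```

Using `min` of the two per-axis ratios guarantees the scaled image never overflows its region in either dimension.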
In accordance with the above embodiments, fig. 3 provides several examples of the presentation of video covers in a search result page. The left image in fig. 3 shows a video cover obtained by cutting out and stitching key frame images after three key frame images were extracted in the text-content-based key frame extraction manner. It can be seen that each key frame image of the video cover shown in the left image of fig. 3 includes an answer key point for solving the search question in the search box (how to delete mobile phone spam), together with the text content (such as the subtitle "correctly clear the mobile phone memory") and image content corresponding to that key point; meanwhile, the left-to-right, top-to-bottom arrangement order of the key frame images in the video cover matches the order of the answer key points corresponding to the search question. Each key frame image of the video cover shown in the right image of fig. 3 includes image content associated with an answer key point of the search question, which are, in order: minced meat, beaten egg, and steamed egg with minced meat.
In this embodiment, the video information corresponding to the search question acquired by the terminal device may further include a time identifier, where the time identifier is used to indicate the time of the key frame image in the video cover on the playing time axis of the video.
Based on this, the present embodiment provides a method for playing a video in response to detecting a trigger operation on a video cover, including:
in response to detecting the triggering operation of the first key frame image on the video cover, determining a corresponding time identifier of the first key frame image on the playing time axis of the video; and jumping to a playing page of the video, and starting to play the video from the time identifier.
The terminal equipment judges whether the triggering operation is detected; the triggering operation is an operation for a first key frame image of a video cover on a search result solution page, for example, when a display screen of the terminal device is a touch display screen, the triggering operation may be a touch operation of an operation body such as a finger or a stylus; when the input device of the terminal device is a mouse, the triggering operation may be a clicking operation of the user on the search result page through the mouse.
If the triggering operation is detected, responding to the triggering operation, and determining a corresponding time identifier of the first key frame image on a playing time axis of the video; and jumping to a playing page of the video, and starting to play the video from the time identifier, namely starting to play the video by taking the first key frame as a starting frame based on the time identifier.
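The jump-to-position behavior above can be sketched as follows. The layout of `video_info` and the `on_cover_tap` / player-call names are assumptions for illustration, not the patent's actual data format.

```python
video_info = {
    "video_id": "v123",
    "cover_frames": ["kf_0", "kf_1", "kf_2"],
    # time identifier of each key frame image on the playing time axis
    "time_ids": {"kf_0": 12.5, "kf_1": 47.0, "kf_2": 88.3},  # seconds
}

def on_cover_tap(frame_id, video_info):
    """Look up the tapped key frame image's time identifier and start
    playback there (the return string stands in for a player call)."""
    t = video_info["time_ids"][frame_id]
    return f"play {video_info['video_id']} from {t}s"

print(on_cover_tap("kf_1", video_info))  # → play v123 from 47.0s
```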
To facilitate understanding, an embodiment of server-side generation of the time identifiers is given below: first, the time identifier corresponding to each key frame image on the playing time axis of the video is determined.
The embodiment may determine the time identifier corresponding to each key frame image in various implementation manners, of which the following two are taken as examples.
Implementation mode one: for each key frame image, match the key frame image against the video frames in the video, and determine the target video frame that matches the key frame image; the time identifier corresponding to the target video frame is then determined as the time identifier corresponding to the key frame image on the playing time axis of the video.
Implementation mode two: determine the time identifier corresponding to each key frame image on the playing time axis of the video according to the mapping relation between each video frame and its playing time on the playing time axis of the video.
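Implementation mode two can be sketched with the common frame-index-to-time mapping (index divided by frame rate); the frame indices and frame rate below are illustrative values.

```python
def time_identifiers(key_frame_indices, fps):
    """Map each key frame image's source frame index to its time
    identifier (in seconds) on the playing time axis."""
    return {idx: idx / fps for idx in key_frame_indices}

# Three key frame images drawn from a 25 fps video:
print(time_identifiers([75, 750, 1500], 25))  # → {75: 3.0, 750: 30.0, 1500: 60.0}
```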
And after the time identifier corresponding to each key frame image is determined according to the mode, adding the corresponding relation between each key frame image and the time identifier into the video information.
In summary, in the method for searching question and answer results provided by the above disclosure, the terminal device obtains from the server the video information corresponding to the search question. The video information includes a video cover that can be presented in the search result page, and time identifiers that can specify video playing positions. The video cover is composed of a plurality of key frame images, and the composition can be flexible and varied, enriching the display mode of the video cover; the arrangement order of the key frame images matches the answer key points corresponding to the search question, embodying a certain logic. Because each key frame image composing the video cover includes both text content and image content answering the search question, and the text content fits users' daily reading habits while the image content has a stronger visual sense, both the intuitiveness and the vividness of the search result are improved. Therefore, the video cover displayed in the search result page improves the display friendliness of the search result page and expresses the user's video search intention comprehensively, intuitively, and accurately, so that the user can quickly find the content to be watched from the video cover and play the video with a trigger operation, effectively improving video search efficiency. The time identifiers in the video information further allow the user to conveniently and quickly jump to the desired video position, improving the user experience.
Example two:
according to the first embodiment, the present embodiment may further provide a method for searching a question and answer result, where the method is applied to a server; as shown in fig. 4, the method includes:
step S402, receiving a search request of the terminal equipment, wherein the search request comprises a search problem;
step S404, searching and obtaining a video containing answers to the search questions based on the search questions;
step S406, determining a plurality of answer key points of the answers based on the audio signals or the text contents of the videos;
step S408, processing each answer key point from the video to obtain a key frame image containing text content and image content of the answer key point;
step S410, generating a video cover of the video based on a plurality of key frame images corresponding to a plurality of answer key points;
and step S412, feeding back the video information containing the video cover to the terminal equipment.
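The server-side flow S402 to S412 can be sketched end to end as follows. All helper functions are stubbed lambdas supplied by the caller; their names and the returned dict layout are illustrative, not the patent's API.

```python
def handle_search_request(request, search_video, extract_points,
                          build_key_frame, stitch_cover):
    question = request["search_question"]                       # S402
    video = search_video(question)                              # S404
    points = extract_points(video)                              # S406
    key_frames = [build_key_frame(video, p) for p in points]    # S408
    cover = stitch_cover(key_frames)                            # S410
    return {"video": video["id"], "cover": cover}               # S412

result = handle_search_request(
    {"search_question": "how to steam eggs"},
    search_video=lambda q: {"id": "v42"},
    extract_points=lambda v: ["mince meat", "beat eggs", "steam"],
    build_key_frame=lambda v, p: f"kf:{p}",
    stitch_cover=lambda kfs: "|".join(kfs),
)
print(result)  # → {'video': 'v42', 'cover': 'kf:mince meat|kf:beat eggs|kf:steam'}
```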
In one embodiment, the step of processing the video to obtain a key frame image containing the text content and the image content of the answer key points comprises:
performing character recognition processing on the video to obtain text content of key points of the answer; identifying image content matched with the text content from video frames of the video based on the text content of the answer key points; intercepting image content from a video frame; and generating a key frame image containing the key points of the answer based on the text content and the image content of the key points of the answer.
In one embodiment, the step of generating a video cover of the video based on a plurality of key frame images corresponding to a plurality of answer key points comprises:
determining the arrangement sequence of a plurality of key frame images corresponding to a plurality of answer key points in a video cover based on the sequence of the answer key points in the answers; determining a corresponding splicing template based on the sizes of the plurality of key frame images and the arrangement sequence among the plurality of key frame images; based on the size of each region in the splicing template, carrying out equal-scale scaling processing on the key frame image corresponding to each region; and inserting the zoomed key frame image into a corresponding area of the template to obtain a video cover of the video.
In one embodiment, before the step of feeding back video information including the video cover to the terminal device, the method further comprises:
determining a corresponding time identifier of each key frame image on a playing time axis of the video; and adding the corresponding relation between each key frame image and the time mark into the video information.
In one embodiment, the step of determining the corresponding time identifier of each key frame image on the playing time axis of the video includes:
for each key frame image, matching the key frame image with a video frame in a video, and determining a target video frame matched with the key frame image in the video frame; and determining the time identifier corresponding to the target video frame as the time identifier corresponding to the key frame image on the playing time axis of the video.
In one embodiment, the step of determining the corresponding time identifier of each key frame image on the playing time axis of the video includes:
and determining the corresponding time identifier of each key frame image on the playing time axis of the video according to the mapping relation between each video frame and the playing time on the playing time axis of the video.
According to the method for searching question and answer results provided by this embodiment of the disclosure, the server searches for a video containing the answer to the search question, so that the obtained video matches the answer to the search question well, and generating the video information from this video improves the matching degree between the video information and the answer. The key frame images determined from the video include the text content and the image content of the answer key points, and thus embody the answer key points intuitively and vividly. Furthermore, the video cover generated from the plurality of key frame images can express the user's video search intention comprehensively, intuitively, and accurately; feeding the video information containing this video cover back to the terminal device enables the user to quickly find the content to be watched through the video cover, which significantly improves the user experience of the video search function.
The method provided by the embodiment has the same implementation principle and technical effect as the first embodiment, and for the sake of brief description, reference may be made to the corresponding contents in the first embodiment for the part of this embodiment that is not mentioned.
Example three:
according to the first and second embodiments, this embodiment may further provide a method for searching question and answer results, where the method includes:
step 1, the terminal equipment sends a search request to a server, wherein the search request comprises a search problem.
And 2, the server receives the search request from the terminal equipment.
And 3, searching and obtaining the video containing the answer of the search question by the server based on the search question.
And 4, the server determines a plurality of answer key points of the answers based on the audio signals or the text contents of the videos.
And 5, for each answer key point, the server processes the video to obtain a key frame image containing the text content and the image content of that key point.
Step 6, the server generates a video cover of the video based on a plurality of key frame images corresponding to a plurality of answer key points; the video cover is composed of a plurality of key frame images of the video, each key frame image comprises text content and image content of the answering search question, and the number of the key frame images is matched with the number of answer key points of the answering search question.
Step 7, the server determines the corresponding time identification of each key frame image on the playing time axis of the video; and adding the corresponding relation between each key frame image and the time mark into the video information.
And 8, the server feeds back the video information containing the video cover to the terminal equipment.
And 9, the terminal equipment acquires the video information corresponding to the search problem.
And step 10, the terminal equipment displays the video cover in the search result page.
Step 11, the terminal device determines a corresponding time identifier of a first key frame image on a playing time axis of the video in response to detecting a trigger operation on the first key frame image on the video cover;
and step 12, the terminal equipment jumps to a playing page of the video and starts to play the video from the time identifier.
The method provided by the embodiment has the same implementation principle and technical effect as the first embodiment and the second embodiment, and for the sake of brief description, corresponding contents in the first embodiment and the second embodiment may be referred to where not mentioned in this embodiment.
Example four:
according to the first embodiment, the present disclosure also provides a device for searching question and answer results, which is applicable to a terminal device, as shown in fig. 5, and includes:
an information obtaining module 502, configured to obtain video information corresponding to the search question; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of the video, each key frame image comprises text content and image content of the answering search question, and the number of the key frame images is matched with the number of answer key points of the answering search question;
a cover display module 504 for displaying a video cover in the search result page;
a video playing module 506, configured to play the video in response to detecting the triggering operation on the video cover.
In an embodiment, the video playing module 506 is specifically configured to:
in response to detecting the triggering operation of the first key frame image on the video cover, determining a corresponding time identifier of the first key frame image on the playing time axis of the video; and jumping to a playing page of the video, and starting to play the video from the time identifier.
Example five:
according to the second embodiment, the present disclosure also provides a device for searching for a question and answer result, which is applicable to a server, as shown in fig. 6, and comprises:
a request receiving module 602, configured to receive a search request of a terminal device, where the search request includes a search question;
a video searching module 604, configured to search for a video that includes an answer to the search question based on the search question;
a gist determining module 606 for determining a plurality of answer gist of the answer based on the audio signal or the text content of the video;
an image obtaining module 608, configured to, for each answer key point, process a video to obtain a key frame image including text content and image content of the answer key point;
a cover generation module 610, configured to generate a video cover of the video based on a plurality of key frame images corresponding to the plurality of answer key points;
and an information feedback module 612, configured to feed back video information including the video cover to the terminal device.
In one embodiment, the image acquisition module 608 includes:
the text content obtaining unit is used for carrying out character recognition processing on the video to obtain text content of answer key points;
the image content identification unit is used for identifying image content matched with the text content from video frames of the video based on the text content of the answer key points;
the image content intercepting unit is used for intercepting image content from the video frame;
and the image generating unit is used for generating a key frame image containing the key points of the answer based on the text content and the image content of the key points of the answer.
In one embodiment, the image acquisition module 608 includes:
the order determining unit is used for determining the arrangement order of a plurality of key frame images corresponding to a plurality of answer key points in the video cover based on the order of the answer key points in the answers;
the template determining unit is used for determining a corresponding splicing template based on the sizes of the key frame images and the arrangement sequence among the key frame images;
the zooming processing unit is used for carrying out equal-scale zooming processing on the key frame images corresponding to the regions based on the sizes of the regions in the splicing template;
and the image inserting unit is used for inserting the zoomed key frame image into the corresponding area of the template to obtain a video cover of the video.
In one embodiment, the apparatus further comprises:
the time identifier determining module is used for determining the corresponding time identifier of each key frame image on the playing time axis of the video;
and the relationship adding module is used for adding the corresponding relationship between each key frame image and the time identifier into the video information.
In one embodiment, the time identification determination module comprises:
the matching unit is used for matching the key frame images with video frames in the video aiming at each key frame image and determining a target video frame matched with the key frame images in the video frames;
and the first identification determining unit is used for determining the time identification corresponding to the target video frame as the time identification corresponding to the key frame image on the playing time axis of the video.
In one embodiment, the time identification determination module comprises: and the second identifier determining unit is used for determining the corresponding time identifier of each key frame image on the playing time axis of the video according to the mapping relation between each video frame and the playing time on the playing time axis of the video.
The device provided in this embodiment has the same implementation principle and technical effects as those of the first to third embodiments, and for the sake of brief description, reference may be made to corresponding contents of the first to third embodiments for a part not mentioned in this embodiment.
Based on the foregoing embodiments, this embodiment provides a terminal device, including: a memory in which a computer program is stored, and a processor which, when executing the computer program, performs the method described in embodiments one to three above.
The present embodiment also provides a computer-readable storage medium, in which a computer program is stored, which, when executed by a processor, performs the method as in the above embodiments one to three.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (24)

1. A method for searching question and answer results is characterized by comprising the following steps:
acquiring video information corresponding to the search problem; the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image comprises text content and image content for answering the search question, and the number of the key frame images is matched with the number of key points for answering the search question;
displaying the video cover in a search results page;
and responding to the detected triggering operation of the video cover to play the video.
2. The method of claim 1, wherein the plurality of key frame images are arranged in the video cover in an order that matches an answer gist order corresponding to the search question.
3. The method according to claim 1, wherein the key frame image is an image obtained by cutting out a video frame of the video and containing text content and image content of the answer key point.
4. The method according to claim 1, wherein the text content of the answer key points included in the key frame images is identified from video frames of the video, and the image content of the answer key points included in the key frame images is matched from the video frames of the video based on the text content of the answer key points.
5. The method of claim 1, wherein the playing the video in response to detecting the triggering of the video cover comprises:
in response to detecting a trigger operation on a first key frame image on the video cover, determining a corresponding time identifier of the first key frame image on a playing time axis of the video;
jumping to a playing page of the video, and starting to play the video from the time identifier.
6. A method for searching question and answer results is characterized by comprising the following steps:
receiving a search request of terminal equipment, wherein the search request comprises a search problem;
searching and obtaining a video containing answers to the search questions based on the search questions;
determining a plurality of answer key points of the answer based on the audio signal or text content of the video;
processing each answer key point from the video to obtain a key frame image containing text content and image content of the answer key point;
generating a video cover of the video based on a plurality of key frame images corresponding to the plurality of answer key points;
and feeding back the video information containing the video cover to the terminal equipment.
7. The method of claim 6, wherein processing the video to obtain the key frame image containing the text content and the image content of the answer key point comprises:
performing character recognition on the video to obtain the text content of the answer key point;
identifying, from video frames of the video, image content matching the text content of the answer key point;
cropping the image content from the video frame;
and generating a key frame image containing the answer key point based on the text content and the image content of the answer key point.
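Claim 7's recognize-match-crop sequence can be shown on toy data. In this sketch, "frames" are lists of text rows and "character recognition" is a plain substring scan standing in for a real OCR engine; every name is illustrative, not the patented implementation.

```python
# Sketch of claim 7 on toy data: "frames" are lists of text rows and
# "OCR" is a substring scan, standing in for a real recognizer.

def find_frame_with_text(frames, text):
    """Return (frame_index, row_index) of the first row containing `text`."""
    for i, frame in enumerate(frames):
        for r, row in enumerate(frame):
            if text in row:
                return i, r
    return None

def crop_region(frame, row, width):
    """'Crop' a region: the matching row trimmed to `width` characters."""
    return frame[row][:width]

frames = [
    ["no answer here", "just scenery"],
    ["step one: boil water", "a kettle on a stove"],
]
hit = find_frame_with_text(frames, "boil water")
print(hit)                                       # (1, 0)
print(crop_region(frames[hit[0]], hit[1], 8))    # 'step one'
```

A production system would replace the substring scan with OCR over decoded frames and the row crop with a bounding-box crop of the matched region, but the control flow is the same.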
8. The method according to claim 6 or 7, wherein generating the video cover of the video based on the plurality of key frame images corresponding to the plurality of answer key points comprises:
determining an arrangement order, in the video cover, of the plurality of key frame images based on the order of the answer key points in the answer;
determining a corresponding splicing template based on the sizes of the plurality of key frame images and the arrangement order among them;
proportionally scaling the key frame image corresponding to each region based on the size of that region in the splicing template;
and inserting each scaled key frame image into its corresponding region of the template to obtain the video cover of the video.
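The proportional ("equal-scale") scaling step of claim 8 reduces to picking one uniform scale factor per image so it fits its template region without distortion. A minimal sketch, with hypothetical names and `(width, height)` tuples standing in for real image objects:

```python
# Sketch of the proportional scaling in claim 8: each key frame image
# is scaled by a single factor so it fits inside its template region
# without distortion. Sizes are (width, height) tuples.

def fit_scale(image_size, region_size):
    """Largest uniform scale at which the image fits the region."""
    iw, ih = image_size
    rw, rh = region_size
    return min(rw / iw, rh / ih)

def scaled_size(image_size, region_size):
    s = fit_scale(image_size, region_size)
    return (round(image_size[0] * s), round(image_size[1] * s))

# A 1920x1080 key frame placed into a 400x300 template region:
print(scaled_size((1920, 1080), (400, 300)))  # (400, 225)
```

Taking the minimum of the two per-axis ratios is what keeps the aspect ratio intact: the image fills the region along its tighter dimension and leaves padding along the other.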
9. The method of claim 6, further comprising, before feeding back the video information containing the video cover to the terminal device:
determining a corresponding time identifier of each key frame image on the playing time axis of the video;
and adding the correspondence between each key frame image and its time identifier to the video information.
10. The method according to claim 9, wherein determining the corresponding time identifier of each key frame image on the playing time axis of the video comprises:
for each key frame image, matching the key frame image against video frames of the video and determining a target video frame that matches the key frame image;
and taking the time identifier corresponding to the target video frame as the time identifier of the key frame image on the playing time axis of the video.
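Claim 10's matching approach can be sketched as a nearest-frame search: compare the key frame image against every video frame and take the best match's timestamp. Here frames are flat pixel lists and similarity is mean absolute difference; both choices, and all names, are illustrative.

```python
# Sketch of claim 10: find the video frame that best matches a key
# frame image, then take that frame's timestamp. Frames are flat
# pixel lists; similarity is mean absolute difference (illustrative).

def frame_distance(a, b):
    """Mean absolute per-pixel difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def time_identifier_for(key_frame, video_frames, fps=25.0):
    """Return the timestamp (s) of the closest-matching video frame."""
    best = min(range(len(video_frames)),
               key=lambda i: frame_distance(key_frame, video_frames[i]))
    return best / fps

video_frames = [[0, 0, 0, 0], [10, 20, 30, 40], [200, 200, 200, 200]]
key_frame = [12, 18, 33, 41]  # near-duplicate of frame 1
print(time_identifier_for(key_frame, video_frames))  # 0.04
```

A real implementation would use a perceptual hash or feature descriptors rather than raw pixel differences, since the cropped key frame image generally differs from the full source frame.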
11. The method according to claim 9, wherein determining the corresponding time identifier of each key frame image on the playing time axis of the video comprises:
determining the time identifier of each key frame image according to the mapping between each video frame and its playing time on the playing time axis of the video.
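In contrast to claim 10, claim 11 uses the frame-to-time mapping directly: when the key frame's source frame index is already known, its timestamp is just a lookup. A minimal sketch assuming a constant frame rate (function name and parameters are illustrative):

```python
# Sketch of claim 11: the time identifier comes straight from the
# frame-index-to-play-time mapping (here, a constant frame rate).

def play_time(frame_index, fps):
    """Playback-axis time (seconds) of a given frame index."""
    return frame_index / fps

# Frame 750 of a 30 fps video sits 25 s into the timeline:
print(play_time(750, 30))  # 25.0
```

For variable-frame-rate video the division would be replaced by a lookup into the container's per-frame timestamp table, but the claim's structure is the same: each frame already carries its play time.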
12. A question-and-answer result search apparatus, comprising:
an information obtaining module, configured to obtain video information corresponding to a search question, wherein the video information comprises a video cover, the video cover is composed of a plurality of key frame images of a video, each key frame image comprises text content and image content answering the search question, and the number of key frame images matches the number of key points of the answer to the search question;
a cover display module, configured to display the video cover in a search result page;
and a video playing module, configured to play the video in response to detecting a trigger operation on the video cover.
13. The apparatus of claim 12, wherein the plurality of key frame images are arranged in the video cover in an order that matches the order of the answer key points corresponding to the search question.
14. The apparatus according to claim 12, wherein each key frame image is an image cropped from a video frame of the video and containing the text content and image content of an answer key point.
15. The apparatus according to claim 12, wherein the text content of the answer key point included in each key frame image is recognized from video frames of the video, and the image content of the answer key point included in each key frame image is matched from the video frames of the video based on that text content.
16. The apparatus of claim 12, wherein the video playing module is configured to:
in response to detecting a trigger operation on a first key frame image on the video cover, determine a corresponding time identifier of the first key frame image on a playing time axis of the video;
and jump to a playing page of the video and start playback of the video from the time identifier.
17. A question-and-answer result search apparatus, comprising:
a request receiving module, configured to receive a search request from a terminal device, wherein the search request comprises a search question;
a video search module, configured to search for and obtain, based on the search question, a video containing an answer to the search question;
a key point determining module, configured to determine a plurality of answer key points of the answer based on an audio signal or text content of the video;
an image obtaining module, configured to process the video to obtain, for each answer key point, a key frame image containing the text content and image content of that answer key point;
a cover generating module, configured to generate a video cover of the video based on the plurality of key frame images corresponding to the plurality of answer key points;
and an information feedback module, configured to feed back video information containing the video cover to the terminal device.
18. The apparatus of claim 17, wherein the image obtaining module comprises:
a text content obtaining unit, configured to perform character recognition on the video to obtain the text content of the answer key point;
an image content identifying unit, configured to identify, from video frames of the video, image content matching the text content of the answer key point;
an image content cropping unit, configured to crop the image content from the video frame;
and an image generating unit, configured to generate a key frame image containing the answer key point based on the text content and the image content of the answer key point.
19. The apparatus of claim 17 or 18, wherein the cover generating module comprises:
an order determining unit, configured to determine an arrangement order, in the video cover, of the plurality of key frame images based on the order of the answer key points in the answer;
a template determining unit, configured to determine a corresponding splicing template based on the sizes of the plurality of key frame images and the arrangement order among them;
a scaling unit, configured to proportionally scale the key frame image corresponding to each region based on the size of that region in the splicing template;
and an image inserting unit, configured to insert each scaled key frame image into its corresponding region of the template to obtain the video cover of the video.
20. The apparatus of claim 17, further comprising:
a time identifier determining module, configured to determine a corresponding time identifier of each key frame image on the playing time axis of the video;
and a relationship adding module, configured to add the correspondence between each key frame image and its time identifier to the video information.
21. The apparatus of claim 20, wherein the time identifier determining module comprises:
a matching unit, configured to match, for each key frame image, the key frame image against video frames of the video and determine a target video frame that matches the key frame image;
and a first identifier determining unit, configured to take the time identifier corresponding to the target video frame as the time identifier of the key frame image on the playing time axis of the video.
22. The apparatus of claim 20, wherein the time identifier determining module comprises:
a second identifier determining unit, configured to determine the time identifier of each key frame image according to the mapping between each video frame and its playing time on the playing time axis of the video.
23. A terminal device, comprising:
a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, performs the method of any one of claims 1-5 or the method of any one of claims 6-11.
24. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1-5 or the method of any one of claims 6-11.
CN202111620699.8A 2021-12-28 2021-12-28 Method, device, equipment and storage medium for searching question and answer result Active CN114297433B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111620699.8A CN114297433B (en) 2021-12-28 2021-12-28 Method, device, equipment and storage medium for searching question and answer result
PCT/CN2022/137552 WO2023124874A1 (en) 2021-12-28 2022-12-08 Question and answer result searching method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111620699.8A CN114297433B (en) 2021-12-28 2021-12-28 Method, device, equipment and storage medium for searching question and answer result

Publications (2)

Publication Number Publication Date
CN114297433A true CN114297433A (en) 2022-04-08
CN114297433B CN114297433B (en) 2024-04-19

Family

ID=80969482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111620699.8A Active CN114297433B (en) 2021-12-28 2021-12-28 Method, device, equipment and storage medium for searching question and answer result

Country Status (2)

Country Link
CN (1) CN114297433B (en)
WO (1) WO2023124874A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422398A (en) * 2022-08-30 2022-12-02 北京字跳网络技术有限公司 Comment information processing method and device and storage medium
WO2023124874A1 (en) * 2021-12-28 2023-07-06 北京字节跳动网络技术有限公司 Question and answer result searching method and apparatus, device, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180136265A (en) * 2017-06-14 2018-12-24 주식회사 핀인사이트 Apparatus, method and computer-readable medium for searching and providing sectional video
CN109492087A (en) * 2018-11-27 2019-03-19 北京中熙正保远程教育技术有限公司 A kind of automatic answer system and method for online course learning
CN110337011A (en) * 2019-07-17 2019-10-15 百度在线网络技术(北京)有限公司 Method for processing video frequency, device and equipment
CN111400553A (en) * 2020-04-26 2020-07-10 Oppo广东移动通信有限公司 Video searching method, video searching device and terminal equipment
CN112447073A (en) * 2020-12-11 2021-03-05 北京有竹居网络技术有限公司 Explanation video generation method, explanation video display method and device
CN112883235A (en) * 2021-03-11 2021-06-01 深圳市一览网络股份有限公司 Video content searching method and device, computer equipment and storage medium
CN113392288A (en) * 2020-03-11 2021-09-14 阿里巴巴集团控股有限公司 Visual question answering and model training method, device, equipment and storage medium thereof

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019933A (en) * 2018-01-02 2019-07-16 阿里巴巴集团控股有限公司 Video data handling procedure, device, electronic equipment and storage medium
CN109905782B (en) * 2019-03-31 2021-05-18 联想(北京)有限公司 Control method and device
CN110381368A (en) * 2019-07-11 2019-10-25 北京字节跳动网络技术有限公司 Video cover generation method, device and electronic equipment
CN111694984B (en) * 2020-06-12 2023-06-20 百度在线网络技术(北京)有限公司 Video searching method, device, electronic equipment and readable storage medium
CN114297433B (en) * 2021-12-28 2024-04-19 抖音视界有限公司 Method, device, equipment and storage medium for searching question and answer result

Also Published As

Publication number Publication date
WO2023124874A1 (en) 2023-07-06
CN114297433B (en) 2024-04-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Tiktok vision (Beijing) Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Douyin Vision Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: Tiktok vision (Beijing) Co.,Ltd.

GR01 Patent grant