CN110582016A - Video information display method, device, server and storage medium


Info

Publication number
CN110582016A
Authority
CN
China
Prior art keywords
video
target video
target
frame
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910844556.1A
Other languages
Chinese (zh)
Inventor
王少丽
栾富君
周凤雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910844556.1A priority Critical patent/CN110582016A/en
Publication of CN110582016A publication Critical patent/CN110582016A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234381Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The disclosure relates to a video information display method, a device, a server and a storage medium. The method includes: receiving a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video; detecting whether a content frame extraction result of the target video is stored locally; if the content frame extraction result of the target video is stored locally, returning it to the terminal device; if not, performing frame extraction on the target video to obtain the content frame extraction result and returning it to the terminal device. The content frame extraction result of the target video carries the content elements of the target video. By implementing the method, the video content can be obtained by viewing the content frame extraction result of the video without playing the video, which saves time and improves efficiency.

Description

Video information display method, device, server and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for displaying video information, a server, and a storage medium.
Background
With the development of multimedia and network technologies, video resources have become increasingly abundant. Faced with massive numbers of videos, it is very important to quickly grasp the content and tone of each one. At present, the content of a video is mainly learned by playing it, but playing a video takes a long time and is inefficient.
Disclosure of Invention
The present disclosure provides a video information display method, apparatus, server and storage medium, so as to at least solve the technical problems of long time consumption and low efficiency in the related art. The technical scheme of the disclosure is as follows:
According to a first aspect of the embodiments of the present disclosure, there is provided a video information display method applied to a server, the method including:
Receiving a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video;
Detecting whether a content frame extraction result of the target video is stored locally;
If the content frame extraction result of the target video is stored locally, returning the content frame extraction result of the target video to the terminal device;
If the content frame extraction result of the target video is not stored locally, performing frame extraction on the target video to obtain the content frame extraction result of the target video, and returning the content frame extraction result of the target video to the terminal device;
Wherein the content frame extraction result of the target video carries the content elements of the target video.
Optionally, as an embodiment, the step of performing frame extraction on the target video to obtain a content frame extraction result of the target video includes:
Acquiring metadata of the target video, wherein the metadata records type information and duration information of the target video;
Determining a target frame extraction manner corresponding to the target video according to the metadata;
And performing frame extraction on the target video according to the target frame extraction manner to obtain a content frame extraction result of the target video.
Optionally, as an embodiment, the step of determining, according to the metadata, the target frame extraction manner corresponding to the target video includes:
Determining whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata;
If so, determining that the target frame extraction manner corresponding to the target video is to extract all video frames;
Otherwise, determining the target frame extraction manner corresponding to the target video according to the duration information recorded in the metadata.
Optionally, as an embodiment, the step of determining, according to the duration information recorded in the metadata, the target frame extraction manner corresponding to the target video includes:
Extracting the duration of the target video from the duration information recorded in the metadata;
Under the condition that the duration of the target video is smaller than a preset first duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
Under the condition that the duration of the target video is greater than a preset second duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
Optionally, as an embodiment, the method further includes:
In the process of performing frame extraction on the target video, performing frame-by-frame detection on the target video to obtain the content characteristics of each video frame in the target video;
And extracting the video frames containing specific content characteristics in the target video to obtain the content frame extraction result of the target video.
According to a second aspect of the embodiments of the present disclosure, there is provided a video information display apparatus applied to a server, the apparatus including:
A receiving unit configured to receive a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video;
A first detection unit configured to detect whether a content frame extraction result of the target video is stored locally;
A first sending unit configured to return the content frame extraction result of the target video to the terminal device under the condition that the content frame extraction result of the target video is stored locally;
A first frame extracting unit configured to perform frame extraction on the target video to obtain a content frame extraction result of the target video under the condition that the content frame extraction result of the target video is not stored locally;
A second sending unit configured to return the content frame extraction result of the target video to the terminal device after the frame extracting unit finishes frame extraction of the target video;
Wherein the content frame extraction result of the target video carries the content elements of the target video.
Optionally, as an embodiment, the first frame extracting unit includes:
A metadata obtaining subunit configured to obtain metadata of the target video, wherein the metadata records type information and duration information of the target video;
A frame extraction mode determining subunit configured to determine, according to the metadata, a target frame extraction mode corresponding to the target video;
And a frame extracting subunit configured to perform frame extraction on the target video according to the target frame extraction mode to obtain a content frame extraction result of the target video.
Optionally, as an embodiment, the frame extraction mode determining subunit includes:
A video type determining module configured to determine whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata;
A first frame extracting mode determining module configured to determine that the target frame extracting mode corresponding to the target video is to extract all video frames when the determination result of the video type determining module is yes;
And a second frame extracting mode determining module configured to determine the target frame extracting mode corresponding to the target video according to the duration information recorded in the metadata when the determination result of the video type determining module is negative.
Optionally, as an embodiment, the second frame extracting mode determining module includes:
A video duration determination sub-module configured to extract the duration of the target video from the duration information recorded in the metadata;
A first frame extracting mode determining submodule configured to determine, when the duration of the target video is smaller than a preset first duration threshold, that the target frame extracting mode corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
And a second frame extracting mode determining submodule configured to determine, when the duration of the target video is greater than a preset second duration threshold, that the target frame extracting mode corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
Optionally, as an embodiment, the apparatus further includes:
A second detection unit configured to perform frame-by-frame detection on the target video in the process of performing frame extraction on the target video, to obtain the content characteristics of each video frame in the target video;
And a second frame extracting unit configured to extract video frames containing specific content characteristics in the target video to obtain a content frame extraction result of the target video.
According to a third aspect of the embodiments of the present disclosure, there is provided a server, including:
A processor;
A memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video information presentation method according to the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a storage medium, wherein instructions of the storage medium, when executed by a processor of a server, enable the server to perform the video information presentation method according to the first aspect.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer program product, wherein instructions of the computer program product, when executed by a processor of a server, enable the server to perform the video information presentation method according to the first aspect.
The technical scheme provided by the embodiments of the present disclosure brings at least the following beneficial effects:
In the embodiment of the present disclosure, when a video information acquisition request from a terminal device is received, whether a content frame extraction result of the target video is stored locally can be detected; if so, the content frame extraction result of the target video is directly returned to the terminal device; otherwise, the content frame extraction result of the target video is obtained by performing frame extraction on the target video and is then returned to the terminal device. Because the content frame extraction result of the target video carries the content elements of the target video, that is, it can reflect the main content/information of the target video, the embodiment of the present disclosure uses the content frame extraction result as auxiliary information in a display mode that combines the video and its content frame extraction result, so that the video content can be obtained by viewing the content frame extraction result without playing the video, which saves time and improves efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a flowchart illustrating a video information presentation method according to an exemplary embodiment;
Fig. 2 is a flowchart illustrating one implementation of step 104 according to an exemplary embodiment;
Fig. 3 is a diagram illustrating an application scenario of a video information presentation method according to an exemplary embodiment;
Fig. 4 is a block diagram illustrating a video information presentation apparatus according to an exemplary embodiment;
Fig. 5 is a schematic diagram illustrating the configuration of a server according to an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
At present, the content of a video is generally learned by playing it. Faced with today's massive numbers of videos, quickly mastering the content and tone of each one by playing the videos one by one consumes a long time and is inefficient.
In order to solve this technical problem, embodiments of the present disclosure provide a video information display method, apparatus, server and storage medium.
First, the video information display method provided by the embodiments of the present disclosure is described below.
Fig. 1 is a flowchart illustrating a video information presentation method performed by a server according to an exemplary embodiment. As shown in Fig. 1, the method may include the following steps: step 101, step 102, step 103, step 104 and step 105, wherein,
In step 101, a video information acquisition request sent by a terminal device is received, wherein the video information acquisition request is used for requesting to display the content of a target video.
In the embodiment of the present disclosure, the target video may be UGC (User Generated Content) or PGC (Professional Generated Content); or the target video can be a long video or a short video; or the target video can be a recorded video or a video formed by splicing pictures.
In the embodiment of the present disclosure, when a user wishes to view the content of a target video, a specific operation can be input on a terminal device to trigger the terminal device to send a video information acquisition request to the server.
In the embodiment of the present disclosure, the terminal device may be a mobile terminal such as a smart phone, a tablet computer or a personal digital assistant, or may be a computer device such as a notebook computer or a desktop computer.
In step 102, it is detected whether a content frame extraction result of the target video is stored locally; if so, step 103 is performed; otherwise, step 104 is performed.
In the embodiment of the present disclosure, the content frame extraction result of the target video carries the content elements of the target video. In practical application, the content frame extraction result of the target video may be a subset of the video frames of the target video, or a dynamic image generated based on those video frames.
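For concreteness, the two forms of the result can be modeled as a single record, as in the following Python sketch. The class and field names (ContentFrameResult, frames, gif) are illustrative assumptions, not structures defined by the disclosure.

```python
# A purely illustrative Python model of the two forms a content frame
# extraction result can take; the names are assumptions of this description,
# not structures defined by the disclosure.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ContentFrameResult:
    video_id: str
    frames: List[bytes] = field(default_factory=list)  # extracted frames, e.g. encoded as JPEG
    gif: Optional[bytes] = None                        # dynamic image built from the frames, if any
```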
In step 103, the content frame extraction result of the target video is returned to the terminal device.
In the embodiment of the present disclosure, if the server locally stores the content frame extraction result of the target video, the content frame extraction result of the target video is directly returned to the terminal device.
In step 104, the target video is subjected to frame extraction to obtain a content frame extraction result of the target video.
In the embodiment of the present disclosure, if the content frame extraction result of the target video is not stored locally on the server, frame extraction is first performed on the target video, and after the content frame extraction result of the target video is obtained, it is returned to the terminal device.
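The flow of steps 102 to 105 can be summarized in a short sketch. The following Python code is a minimal illustration assuming a file-based local cache; the cache layout, the video_id key and the extract_frames() helper are hypothetical, and a production server would additionally handle concurrency and errors.

```python
# A minimal sketch of steps 102-105 in Python, assuming a file-based local
# cache. The cache layout and extract_frames() are hypothetical details of
# this illustration, not the disclosure's concrete design.
import os
import pickle

CACHE_DIR = "/var/cache/video_frames"  # assumed location of the local store

def extract_frames(video_id: str):
    """Placeholder for step 104; see the strategy-selection sketch below."""
    raise NotImplementedError

def handle_video_info_request(video_id: str):
    """Serve a video information acquisition request for video_id."""
    cache_path = os.path.join(CACHE_DIR, f"{video_id}.pkl")
    if os.path.exists(cache_path):            # step 102: result stored locally?
        with open(cache_path, "rb") as f:     # step 103: return the cached result
            return pickle.load(f)
    result = extract_frames(video_id)         # step 104: perform frame extraction
    with open(cache_path, "wb") as f:         # keep the result for later requests
        pickle.dump(result, f)
    return result                             # step 105: return to the terminal device
```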
At present, video frames can accurately express video content. However, different videos usually differ in duration and in how their content is presented; if the same frame extraction strategy (i.e., a single strategy) is applied to all videos, the extracted frames are often untargeted, express the video information incompletely, or miss effective information. Deciding which method to use to extract frames from a video, whether all frames need to be extracted, and how many frames suffice to effectively describe the video is therefore an urgent problem when facing massive numbers of videos with diverse presentation forms.
In view of the above problems, in the embodiment of the present disclosure, a plurality of frame extraction strategies may be provided, and different strategies are adopted for videos with different durations, types and content presentation forms, so as to perform frame extraction in a more targeted manner, thereby ensuring that the extracted video frames accurately and comprehensively reflect the video information.
Accordingly, in an embodiment provided by the present disclosure, considering that the type and duration of a video are usually recorded in its metadata, the metadata of the video may be acquired, the duration and type of the video extracted from it, a frame extraction strategy matching the video selected according to that duration and type, and frame extraction performed according to the selected strategy. As shown in Fig. 2, step 104 may specifically include the following steps: step 1041, step 1042, and step 1043, wherein,
In step 1041, metadata of the target video is obtained, wherein the metadata records type information and duration information of the target video.
In step 1042, a target frame extraction mode corresponding to the target video is determined according to the metadata.
In the embodiment of the present disclosure, whether to perform full frame extraction on the video (i.e., extract all video frames in the video) may be determined according to the type of the video. In this case, step 1042 may specifically include the following steps (not shown in the figure): step 10421, step 10422, and step 10423, wherein,
In step 10421, it is determined whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata. Optionally, the type information refers to the type of the video generation manner, and the types of video generation manners may include: generation by picture splicing, generation from continuously shot pictures, and the like.
The step of determining whether the target video is a video formed by picture splicing according to the type information recorded in the metadata may be implemented by reading the type of the video generation manner recorded in the type information and judging whether it is the picture-splicing type.
In the embodiment of the present disclosure, a video obtained by splicing pictures refers to a segment of video synthesized from multiple pictures.
In the embodiment of the present disclosure, considering that a video formed by splicing pictures is composed of a plurality of pictures whose degree of association is uncertain (some of the pictures may be related to each other while others are not related at all), it is necessary to extract all the video frames of such a video in order to comprehensively understand its video information. For other videos (i.e., recorded videos), the degree of correlation between video frames is usually high, so only a portion of the video frames needs to be extracted.
In step 10422, if the target video is a video obtained by splicing pictures, it is determined that the target frame extracting manner corresponding to the target video is to extract all video frames;
In step 10423, if the target video is not a video obtained by splicing pictures, the target frame extraction manner corresponding to the target video is determined according to the duration information recorded in the metadata.
In the embodiment of the present disclosure, when the target video is a recorded video, a frame extraction strategy can be selected according to the video duration. For videos with shorter durations, a fixed number of video frames can usually cover the main content, so a fixed number of frames is extracted; for videos with longer durations, a greater number of video frames is extracted. In this case, step 10423 may specifically include the following steps:
Extracting the duration of the target video from the duration information recorded in the metadata;
Under the condition that the duration of the target video is smaller than a preset first duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
Under the condition that the duration of the target video is greater than a preset second duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
In the embodiment of the present disclosure, if the duration of the target video is less than the preset first duration threshold, the target video is a video with a shorter duration, and the target frame extraction manner is determined to be extracting a fixed number of video frames, i.e., n video frames; when extracting the n video frames, frames may be taken at a fixed time interval. In practical application, the preset first duration threshold may be 1 minute, and n may be 10.
In the embodiment of the present disclosure, if the duration of the target video is greater than the preset second duration threshold, the target video is a video with a longer duration. Considering that more video frames are extracted from a long video and viewing them one by one would take a long time, the extracted video frames may be combined into a GIF animation to shorten the viewing time. The target frame extraction manner is therefore determined to be extracting one video frame at every preset time interval and generating a dynamic image from all the extracted video frames, or extracting m video frames within a preset time period and generating a dynamic image from them. In practical application, the preset second duration threshold may be 15 minutes.
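As an illustration of steps 10421 to 10423, the selection logic might look as follows in Python, using the example values given above (a first threshold of 1 minute, a second threshold of 15 minutes, n = 10). The metadata keys, the 30-second sampling interval and the handling of durations between the two thresholds are assumptions, since the disclosure leaves them open.

```python
# A sketch of steps 10421-10423 using the example values from the text
# (first threshold 1 minute, second threshold 15 minutes, n = 10). The
# metadata keys, the 30-second interval and the treatment of the middle
# duration range are assumptions; the disclosure leaves them open.
FIRST_THRESHOLD_S = 60        # preset first duration threshold: 1 minute
SECOND_THRESHOLD_S = 15 * 60  # preset second duration threshold: 15 minutes

def choose_strategy(metadata: dict) -> dict:
    if metadata.get("type") == "picture_splicing":
        return {"mode": "all_frames"}                 # step 10422: extract every frame
    duration = metadata["duration_seconds"]           # step 10423: decide by duration
    if duration < FIRST_THRESHOLD_S:
        return {"mode": "fixed_count", "n": 10}       # short video: n fixed frames
    if duration > SECOND_THRESHOLD_S:
        return {"mode": "interval_gif", "interval_s": 30.0}  # long video: sample, then build a GIF
    return {"mode": "fixed_count", "n": 10}           # middle range: unspecified; assume fixed count
```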
In step 1043, frame extraction is performed on the target video according to the target frame extraction manner to obtain a content frame extraction result of the target video.
In the embodiment of the present disclosure, the content frame extraction result extracted in the target frame extraction manner may be a sequence of video frames (i.e., individual video frames) or a dynamic image (i.e., a GIF animation) generated based on the sequence of video frames.
In the embodiment of the present disclosure, when the target video is a video formed by splicing pictures, the content frame extraction result of the target video is a video frame sequence comprising all video frames of the target video; when the target video is a recorded video of short duration, the content frame extraction result is a video frame sequence comprising some of the video frames of the target video; when the target video is a recorded video of long duration, the content frame extraction result is a dynamic image generated based on the extracted video frames.
In one example, for a video A, the metadata of video A is first checked, basic information such as its type and duration is acquired, and a frame extraction strategy is selected accordingly. If video A is a video formed by splicing pictures, all of its video frames are extracted. If video A is a recorded video with a duration of less than 1 minute, 10 frames are extracted, which can basically cover the main content of the video; if video A is a recorded video with a duration of more than 15 minutes, frames can be extracted at equal time intervals, or several frames can be extracted within a specified time period, and all the extracted frames are then combined into a GIF image, as in the sketch below.
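For instance, the two recorded-video strategies of this example could be realized with the third-party OpenCV and imageio libraries. The sketch below is one possible implementation under those assumptions, not the disclosure's concrete implementation.

```python
# One possible realization of the two recorded-video strategies, using the
# third-party OpenCV (cv2) and imageio libraries; an illustrative sketch only.
import cv2
import imageio

def extract_fixed_count(path: str, n: int = 10):
    """Short video: extract n frames at equal spacing over the whole video."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(n):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * total // n)  # jump to the i-th sample point
        ok, frame = cap.read()
        if ok:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return frames

def extract_to_gif(path: str, gif_path: str, interval_s: float = 30.0):
    """Long video: sample one frame every interval_s seconds and combine into a GIF."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(1, int(fps * interval_s))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        idx += 1
    cap.release()
    imageio.mimsave(gif_path, frames, duration=0.5)  # 0.5 s per GIF frame, an arbitrary choice
    return gif_path
```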
In the embodiment of the present disclosure, after frame extraction is performed on the target video to obtain its content frame extraction result, the result may be stored locally directly, or, to save storage space, it may be compressed and the compressed result stored locally, so that when another terminal device requests the content frame extraction result of the target video, it can be provided directly.
In the embodiment of the present disclosure, in order to avoid missing effective information in the target video, an image detection technology may be adopted to detect the video frames in the target video, find the video frames containing effective information, and extract those frames from the target video. In this case, the method shown in Fig. 1 may further include the following step (not shown in the figure): step 106, wherein,
In step 106, in the process of performing frame extraction on the target video, frame-by-frame detection is performed on the target video to obtain the content characteristics of each video frame in the target video; and the video frames containing specific content characteristics in the target video are extracted to obtain the content frame extraction result of the target video.
In the embodiment of the present disclosure, a video frame containing a specific content characteristic may be a video frame containing a specific object, or a video frame whose content changes significantly between frames.
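Since the disclosure leaves the concrete image detection technique open, one simple possibility is to approximate "significant change between frames" by gray-level frame differencing, as in the following sketch (OpenCV and NumPy assumed; the threshold value is arbitrary).

```python
# An assumed, simple "significant change" detector for step 106 based on
# gray-level frame differencing; the disclosure does not prescribe this
# technique, and the threshold is an arbitrary illustrative value.
import cv2
import numpy as np

def extract_changed_frames(path: str, threshold: float = 30.0):
    """Keep frames whose mean absolute difference to the previous frame exceeds threshold."""
    cap = cv2.VideoCapture(path)
    kept, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None and np.mean(cv2.absdiff(gray, prev)) > threshold:
            kept.append(frame)  # marked content change: treat as carrying new information
        prev = gray
    cap.release()
    return kept
```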
In step 105, the content frame extraction result of the target video is returned to the terminal device.
In the embodiment of the present disclosure, the content frame extraction result of the target video may be the frame extraction result of step 104, or the combination of the frame extraction results of step 104 and step 106.
In the embodiment of the present disclosure, the server may compress and store the frame extraction results of step 104 and step 106. In practical application, these frame extraction results are retrieved and combined with the video as auxiliary elements for displaying the video information, so that a user can quickly learn the approximate content of each video among a large number of videos without playing them, and quickly locate the useful information needed.
For ease of understanding, the technical solution of the present disclosure is described with reference to the application scenario shown in Fig. 3. As shown in Fig. 3, the scenario includes a server 30 and a terminal device 31. The terminal device 31 requests the content frame extraction result of a video 1 by sending a video information acquisition request to the server 30. After receiving the request, the server 30 detects whether the content frame extraction result of video 1 is stored locally; if it is, the result is sent directly to the terminal device 31; if not, frame extraction is performed on video 1, and after the content frame extraction result of video 1 is obtained, it is provided to the terminal device 31.
After receiving the content frame extraction result of video 1 from the server, the terminal device 31 displays video 1 together with its content frame extraction result on the display screen 311, so that the user can learn the approximate content of video 1 by viewing the content frame extraction result without playing video 1.
As can be seen from the above embodiment, when a video information acquisition request from a terminal device is received, whether a content frame extraction result of the target video is stored locally can be detected; if so, the content frame extraction result of the target video is directly returned to the terminal device; otherwise, the content frame extraction result of the target video is obtained by performing frame extraction on the target video and is then returned to the terminal device. Because the content frame extraction result of the target video carries the content elements of the target video, that is, it can reflect the main content/information of the target video, the embodiment of the present disclosure uses the content frame extraction result as auxiliary information in a display mode that combines the video and its content frame extraction result, so that the video content can be obtained by viewing the content frame extraction result without playing the video, which saves time and improves efficiency.
Fig. 4 is a block diagram illustrating a video information presentation apparatus according to an exemplary embodiment. As shown in Fig. 4, the video information presentation apparatus 400 may include: a receiving unit 401, a first detecting unit 402, a first sending unit 403, a first frame extracting unit 404 and a second sending unit 405, wherein,
A receiving unit 401 configured to receive a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video;
A first detecting unit 402 configured to detect whether a content frame extraction result of the target video is stored locally;
A first sending unit 403 configured to return the content frame extraction result of the target video to the terminal device in the case that the content frame extraction result of the target video is stored locally;
A first frame extracting unit 404 configured to perform frame extraction on the target video to obtain a content frame extraction result of the target video in the case that the content frame extraction result of the target video is not stored locally;
A second sending unit 405 configured to return the content frame extraction result of the target video to the terminal device after the frame extracting unit finishes frame extraction of the target video;
Wherein the content frame extraction result of the target video carries the content elements of the target video.
As can be seen from the above embodiment, when a video information acquisition request from a terminal device is received, whether a content frame extraction result of the target video is stored locally can be detected; if so, the content frame extraction result of the target video is directly returned to the terminal device; otherwise, the content frame extraction result of the target video is obtained by performing frame extraction on the target video and is then returned to the terminal device. Because the content frame extraction result of the target video carries the content elements of the target video, that is, it can reflect the main content/information of the target video, the embodiment of the present disclosure uses the content frame extraction result as auxiliary information in a display mode that combines the video and its content frame extraction result, so that the video content can be obtained by viewing the content frame extraction result without playing the video, which saves time and improves efficiency.
Optionally, as an embodiment, the first frame extracting unit 404 may include:
A metadata obtaining subunit configured to obtain metadata of the target video, wherein the metadata records type information and duration information of the target video;
A frame extraction mode determining subunit configured to determine, according to the metadata, a target frame extraction mode corresponding to the target video;
And a frame extracting subunit configured to perform frame extraction on the target video according to the target frame extraction mode to obtain a content frame extraction result of the target video.
Optionally, as an embodiment, the frame extraction mode determining subunit may include:
A video type determining module configured to determine whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata;
A first frame extracting mode determining module configured to determine that the target frame extracting mode corresponding to the target video is to extract all video frames when the determination result of the video type determining module is yes;
And a second frame extracting mode determining module configured to determine the target frame extracting mode corresponding to the target video according to the duration information recorded in the metadata when the determination result of the video type determining module is negative.
Optionally, as an embodiment, the second frame extracting mode determining module may include:
A video duration determination sub-module configured to extract the duration of the target video from the duration information recorded in the metadata;
A first frame extracting mode determining submodule configured to determine, when the duration of the target video is smaller than a preset first duration threshold, that the target frame extracting mode corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
And a second frame extracting mode determining submodule configured to determine, when the duration of the target video is greater than a preset second duration threshold, that the target frame extracting mode corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
Optionally, as an embodiment, the video information display apparatus 400 may further include:
A second detection unit configured to perform frame-by-frame detection on the target video in the process of performing frame extraction on the target video, to obtain the content characteristics of each video frame in the target video;
And a second frame extracting unit configured to extract video frames containing specific content characteristics in the target video to obtain a content frame extraction result of the target video.
The specific manner in which each module of the apparatus in the above embodiments performs its operations has been described in detail in the embodiments of the method and is not repeated here; for the relevant points, reference may be made to the corresponding description of the method embodiments.
Fig. 5 is a block diagram illustrating a server according to an exemplary embodiment. As shown in Fig. 5, the server 500 includes a processing component 522, which further includes one or more processors, and memory resources, represented by a memory 532, for storing instructions executable by the processing component 522, such as application programs. The application programs stored in the memory 532 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 522 is configured to execute the instructions to perform the video information presentation method applied to the server described above.
The server 500 may also include a power component 526 configured to perform power management for the server 500, a wired or wireless network interface 550 configured to connect the server 500 to a network, and an input/output (I/O) interface 558. The server 500 may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
According to an embodiment of the present disclosure, there is also provided a storage medium, wherein instructions, when executed by a processor of a server, enable the server to perform the video information presentation method according to any one of the above method embodiments. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
according to an embodiment of the present disclosure, there is also provided a computer program product, wherein instructions of the computer program product, when executed by a processor of a server, enable the server to perform the video information presentation method according to any one of the above method embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
A1, A video information display method, applied to a server, the method comprising:
Receiving a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video;
Detecting whether a content frame extraction result of the target video is stored locally;
If the content frame extraction result of the target video is stored locally, returning the content frame extraction result of the target video to the terminal device;
If the content frame extraction result of the target video is not stored locally, performing frame extraction on the target video to obtain the content frame extraction result of the target video, and returning the content frame extraction result of the target video to the terminal device;
Wherein the content frame extraction result of the target video carries the content elements of the target video.
A2, According to the method of A1, the step of performing frame extraction on the target video to obtain the content frame extraction result of the target video includes:
Acquiring metadata of the target video, wherein the metadata records type information and duration information of the target video;
Determining a target frame extraction manner corresponding to the target video according to the metadata;
And performing frame extraction on the target video according to the target frame extraction manner to obtain a content frame extraction result of the target video.
A3, According to the method of A2, the step of determining the target frame extraction manner corresponding to the target video according to the metadata includes:
Determining whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata;
If so, determining that the target frame extraction manner corresponding to the target video is to extract all video frames;
Otherwise, determining the target frame extraction manner corresponding to the target video according to the duration information recorded in the metadata.
A4, According to the method of A3, the step of determining the target frame extraction manner corresponding to the target video according to the duration information recorded in the metadata includes:
Extracting the duration of the target video from the duration information recorded in the metadata;
Under the condition that the duration of the target video is smaller than a preset first duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
Under the condition that the duration of the target video is greater than a preset second duration threshold, determining that the target frame extraction manner corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
A5, The method of any one of A1 to A4, further comprising:
In the process of performing frame extraction on the target video, performing frame-by-frame detection on the target video to obtain the content characteristics of each video frame in the target video;
And extracting the video frames containing specific content characteristics in the target video to obtain the content frame extraction result of the target video.
A6, A video information display device, applied to a server, the device comprising:
A receiving unit configured to receive a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting to display the content of a target video;
A first detection unit configured to detect whether a content frame extraction result of the target video is stored locally;
A first sending unit configured to return the content frame extraction result of the target video to the terminal device under the condition that the content frame extraction result of the target video is stored locally;
A first frame extracting unit configured to perform frame extraction on the target video to obtain a content frame extraction result of the target video under the condition that the content frame extraction result of the target video is not stored locally;
A second sending unit configured to return the content frame extraction result of the target video to the terminal device after the frame extracting unit finishes frame extraction of the target video;
Wherein the content frame extraction result of the target video carries the content elements of the target video.
A7, The apparatus of A6, wherein the first frame extracting unit comprises:
A metadata obtaining subunit configured to obtain metadata of the target video, wherein the metadata records type information and duration information of the target video;
A frame extraction mode determining subunit configured to determine, according to the metadata, a target frame extraction mode corresponding to the target video;
And a frame extracting subunit configured to perform frame extraction on the target video according to the target frame extraction mode to obtain a content frame extraction result of the target video.
A8, According to the apparatus of A7, the frame extraction mode determining subunit includes:
A video type determining module configured to determine whether the target video is a video formed by splicing pictures according to the type information recorded in the metadata;
A first frame extracting mode determining module configured to determine that the target frame extracting mode corresponding to the target video is to extract all video frames when the determination result of the video type determining module is yes;
And a second frame extracting mode determining module configured to determine the target frame extracting mode corresponding to the target video according to the duration information recorded in the metadata when the determination result of the video type determining module is negative.
A9, The apparatus of A8, wherein the second frame extracting mode determining module comprises:
A video duration determination sub-module configured to extract the duration of the target video from the duration information recorded in the metadata;
A first frame extracting mode determining submodule configured to determine, when the duration of the target video is smaller than a preset first duration threshold, that the target frame extracting mode corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
And a second frame extracting mode determining submodule configured to determine, when the duration of the target video is greater than a preset second duration threshold, that the target frame extracting mode corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
A10, The apparatus of any one of A6 to A9, further comprising:
A second detection unit configured to perform frame-by-frame detection on the target video in the process of performing frame extraction on the target video, to obtain the content characteristics of each video frame in the target video;
And a second frame extracting unit configured to extract video frames containing specific content characteristics in the target video to obtain a content frame extraction result of the target video.

Claims (10)

1. A video information display method applied to a server, characterized by comprising the following steps:
receiving a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting display of the content of a target video;
detecting whether a content frame extraction result of the target video is stored locally;
if the content frame extraction result of the target video is stored locally, returning the content frame extraction result of the target video to the terminal device;
if the content frame extraction result of the target video is not stored locally, performing frame extraction on the target video to obtain the content frame extraction result of the target video, and returning the content frame extraction result of the target video to the terminal device;
wherein the content frame extraction result of the target video carries content elements of the target video.
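As a non-authoritative illustration of the flow in claim 1, the sketch below returns a locally stored content frame extraction result when one exists and extracts frames only on a miss. The in-memory store and the function names are assumptions standing in for the server's actual storage and extraction logic.

```python
# Sketch of claim 1: serve a cached content frame extraction result,
# extracting frames first when no result is stored locally.

_local_store: dict = {}  # stands in for the server's local storage

def extract_frames(video_id: str) -> bytes:
    # Placeholder for the frame extraction of claims 2-4;
    # returns dummy data here so the sketch runs end to end.
    return f"frames-of-{video_id}".encode()

def handle_video_info_request(video_id: str) -> bytes:
    """Serve the content frame extraction result for the target video."""
    cached = _local_store.get(video_id)
    if cached is not None:
        return cached                   # result already stored locally
    result = extract_frames(video_id)   # extract frames on a miss
    _local_store[video_id] = result     # store for later requests
    return result
```

Storing the extraction result means the cost of decoding the video is paid once per target video rather than once per request.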
2. The method according to claim 1, wherein the step of performing frame extraction on the target video to obtain the content frame extraction result of the target video comprises:
acquiring metadata of the target video, wherein type information and duration information of the target video are recorded in the metadata;
determining a target frame extraction mode corresponding to the target video according to the metadata;
and performing frame extraction on the target video according to the target frame extraction mode to obtain the content frame extraction result of the target video.
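Claim 2 does not prescribe how the metadata is obtained; probing the file with ffprobe is one common option, sketched below under that assumption. Note that whether a video was formed by splicing pictures (claim 3) is application-specific and would not come directly from the container format name used here.

```python
# Sketch: reading duration and format metadata with ffprobe
# (an assumed tool; the claim only requires that type and
# duration information be recorded in the metadata).
import json
import subprocess

def probe_metadata(path: str) -> dict:
    """Return duration (seconds) and container format name for a video file."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_format", path],
        capture_output=True, check=True, text=True,
    ).stdout
    fmt = json.loads(out)["format"]
    return {"duration": float(fmt["duration"]),
            "type": fmt.get("format_name", "unknown")}
```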
3. The method according to claim 2, wherein the step of determining the target frame extraction mode corresponding to the target video according to the metadata comprises:
determining, according to the type information recorded in the metadata, whether the target video is a video formed by splicing pictures;
if so, determining that the target frame extraction mode corresponding to the target video is to extract all video frames;
otherwise, determining the target frame extraction mode corresponding to the target video according to the duration information recorded in the metadata.
4. The method according to claim 3, wherein the step of determining the target frame extraction mode corresponding to the target video according to the duration information recorded in the metadata comprises:
extracting the duration of the target video from the duration information recorded in the metadata;
when the duration of the target video is less than a preset first duration threshold, determining that the target frame extraction mode corresponding to the target video is to extract n video frames, wherein n is greater than or equal to 1 and less than or equal to N, and N is the number of video frames in the target video;
when the duration of the target video is greater than a preset second duration threshold, determining that the target frame extraction mode corresponding to the target video is to extract one video frame at every preset time interval and generate a dynamic image from all the extracted video frames, or to extract m video frames within a preset time period and generate a dynamic image from all the extracted video frames, wherein the preset second duration threshold is not less than the preset first duration threshold, and m is greater than or equal to 1 and less than or equal to N.
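The second branch of claim 4 (one frame per preset time interval, assembled into a dynamic image) might be sketched as follows; OpenCV and Pillow are assumed tools, and a GIF stands in for the unspecified dynamic image format.

```python
# Sketch of claim 4's long-video branch: sample one frame per interval
# and assemble the samples into a dynamic image (a GIF here).
import cv2
from PIL import Image

def interval_frames_to_gif(path: str, interval_s: float, out_path: str) -> None:
    cap = cv2.VideoCapture(path)
    frames = []
    t = 0.0
    while True:
        cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000)  # seek to the next sample point
        ok, bgr = cap.read()
        if not ok:
            break
        # OpenCV decodes to BGR; Pillow expects RGB.
        frames.append(Image.fromarray(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)))
        t += interval_s
    cap.release()
    if frames:
        frames[0].save(out_path, save_all=True, append_images=frames[1:],
                       duration=int(interval_s * 1000), loop=0)
```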
5. The method according to any one of claims 1 to 4, further comprising:
performing frame-by-frame detection on the target video during frame extraction to obtain the content features of each video frame in the target video;
and extracting the video frames of the target video that contain specific content features to obtain the content frame extraction result of the target video.
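Claim 5 leaves the "specific content features" open. Purely as an assumed stand-in, the sketch below keeps the frames in which an OpenCV face detector fires; any other per-frame feature detector could be substituted.

```python
# Sketch of claim 5: frame-by-frame detection, keeping frames that
# contain a specific feature (a detected face in this assumed example).
import cv2

def extract_feature_frames(path: str) -> list:
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(path)
    kept = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if len(cascade.detectMultiScale(gray)) > 0:  # feature present
            kept.append(frame)
    cap.release()
    return kept
```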
6. A video information display device applied to a server, characterized by comprising:
a receiving unit configured to receive a video information acquisition request sent by a terminal device, wherein the video information acquisition request is used for requesting display of the content of a target video;
a first detection unit configured to detect whether a content frame extraction result of the target video is stored locally;
a first sending unit configured to return the content frame extraction result of the target video to the terminal device when the content frame extraction result of the target video is stored locally;
a first frame extraction unit configured to perform frame extraction on the target video to obtain the content frame extraction result of the target video when the content frame extraction result of the target video is not stored locally;
and a second sending unit configured to return the content frame extraction result of the target video to the terminal device after the first frame extraction unit finishes extracting frames from the target video;
wherein the content frame extraction result of the target video carries content elements of the target video.
7. The apparatus according to claim 6, wherein the first frame extraction unit comprises:
a metadata acquiring subunit configured to acquire metadata of the target video, wherein type information and duration information of the target video are recorded in the metadata;
a frame extraction mode determining subunit configured to determine, according to the metadata, a target frame extraction mode corresponding to the target video;
and a frame extraction subunit configured to perform frame extraction on the target video according to the target frame extraction mode to obtain the content frame extraction result of the target video.
8. The apparatus according to claim 7, wherein the frame extraction mode determining subunit comprises:
a video type determining module configured to determine, according to the type information recorded in the metadata, whether the target video is a video formed by splicing pictures;
a first frame extraction mode determining module configured to determine, when the determination result of the video type determining module is yes, that the target frame extraction mode corresponding to the target video is to extract all video frames;
and a second frame extraction mode determining module configured to determine, when the determination result of the video type determining module is no, the target frame extraction mode corresponding to the target video according to the duration information recorded in the metadata.
9. A server, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the video information display method of any one of claims 1 to 5.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of a server, enable the server to perform the video information display method of any one of claims 1 to 5.
CN201910844556.1A 2019-09-06 2019-09-06 video information display method, device, server and storage medium Pending CN110582016A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910844556.1A CN110582016A (en) 2019-09-06 2019-09-06 video information display method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN110582016A true CN110582016A (en) 2019-12-17

Family

ID=68812687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910844556.1A Pending CN110582016A (en) 2019-09-06 2019-09-06 video information display method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN110582016A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070237225A1 (en) * 2006-03-30 2007-10-11 Eastman Kodak Company Method for enabling preview of video files
CN104540000A (en) * 2014-12-04 2015-04-22 广东欧珀移动通信有限公司 Method for generating dynamic thumbnail and terminal
CN104618679A (en) * 2015-03-13 2015-05-13 南京知乎信息科技有限公司 Method for extracting key information frame from monitoring video
CN106375862A (en) * 2016-09-22 2017-02-01 维沃移动通信有限公司 GIF picture acquisition method and apparatus, and terminal
CN106572380A (en) * 2016-10-19 2017-04-19 上海传英信息技术有限公司 User terminal and video dynamic thumbnail generating method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573061A (en) * 2020-04-29 2021-10-29 安徽华米健康科技有限公司 Video frame extraction method, device and equipment
CN114079804A (en) * 2020-08-13 2022-02-22 北京达佳互联信息技术有限公司 Multimedia resource detection method, device, terminal and storage medium
CN114079804B (en) * 2020-08-13 2024-03-26 北京达佳互联信息技术有限公司 Method, device, terminal and storage medium for detecting multimedia resources
CN112911306A (en) * 2021-01-15 2021-06-04 北京奇艺世纪科技有限公司 Video processing method and device, electronic equipment and storage medium
CN112911306B (en) * 2021-01-15 2023-04-07 北京奇艺世纪科技有限公司 Video processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US10425679B2 (en) Method and device for displaying information on video image
CN106658199B (en) Video content display method and device
CN107509107B (en) Method, device and equipment for detecting video playing fault and readable medium
US10043079B2 (en) Method and apparatus for providing multi-video summary
CN110582016A (en) video information display method, device, server and storage medium
CN108062507B (en) Video processing method and device
US20150104149A1 (en) Video summary apparatus and method
CN109446025B (en) Operation behavior playback method and device, electronic equipment and readable medium
CN110740290A (en) Monitoring video previewing method and device
CN112235632A (en) Video processing method and device and server
CN109032911B (en) Frame rate detection method and device for mobile device and electronic device
CN113824987A (en) Method, medium, device and computing equipment for determining time consumption of first frame of live broadcast room
US9076207B1 (en) Image processing method, system and electronic device
CN112764838A (en) Target content display method and device and electronic equipment
CN111475677A (en) Image processing method, image processing device, storage medium and electronic equipment
CN107995538B (en) Video annotation method and system
JPWO2011135664A1 (en) Information processing apparatus, information processing method, and program
CN111881734A (en) Method and device for automatically intercepting target video
CN110490101A (en) A kind of picture intercept method, device and computer storage medium
CN114390204B (en) Intelligent medical treatment method and system
CN111683215B (en) Video playback method and device, electronic equipment and computer readable storage medium
CN110347597B (en) Interface testing method and device of picture server, storage medium and mobile terminal
CN110704294B (en) Method and apparatus for determining response time
CN113596582A (en) Video preview method and device and electronic equipment
CN113252678A (en) Appearance quality inspection method and equipment for mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191217