CN110740344B - Video extraction method and device and storage device - Google Patents


Info

Publication number
CN110740344B
CN110740344B (granted patent; application CN201910877519.0A)
Authority
CN
China
Prior art keywords
frame
target
video
image group
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910877519.0A
Other languages
Chinese (zh)
Other versions
CN110740344A (en)
Inventor
樊中财
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN201910877519.0A priority Critical patent/CN110740344B/en
Publication of CN110740344A publication Critical patent/CN110740344A/en
Application granted granted Critical
Publication of CN110740344B publication Critical patent/CN110740344B/en

Classifications

    All classifications fall under H (ELECTRICITY) → H04 (ELECTRIC COMMUNICATION TECHNIQUE) → H04N (PICTORIAL COMMUNICATION, e.g. TELEVISION) → H04N 21/00 (Selective content distribution, e.g. interactive television or video on demand [VOD]):
    • H04N 21/23418 — Server-side processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/234309 — Server-side reformatting of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N 21/234345 — Server-side reformatting performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N 21/44008 — Client-side processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/440218 — Client-side reformatting of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N 21/440245 — Client-side reformatting performed only on part of the stream, e.g. a region of the image or a time segment

Abstract

The application discloses a video extraction method and a related device. The video extraction method comprises: obtaining, in a video to be extracted, a target image group corresponding to a query time input by a user; and, based on the positional relation between the I frame in the target image group and the target frame corresponding to the query time, extracting the target video from the video to be extracted using the target frame as the starting frame and an extraction strategy matched to that positional relation. This scheme improves the precision of video extraction.

Description

Video extraction method and device and storage device
Technical Field
The present application relates to the field of information technology, and in particular, to a video extraction method and a related apparatus.
Background
With the development of storage technology, video can be stored on a large scale and for a long time. Stored video plays an important role in industries such as big data, intelligence, and security. When people use stored video, they typically query it by time and then download or play it. However, owing to the video encoding and decoding principle, the first frame of an extracted video must be an I frame, so existing video extraction methods locate the I frame preceding the target frame that corresponds to the query time and begin extraction there. Extracting video in this way introduces an error of up to one GOP (Group of Pictures) between the target frame and the first frame of the extracted video, and when the GOP length is large, the gap between the start time of the extracted video and the query time is correspondingly large. In view of the above, how to extract video accurately is an urgent problem to be solved.
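As a quick illustration of the scale of this error (the helper below is ours, not part of the patent): at a fixed frame rate, the worst-case offset of I-frame-aligned extraction is simply the GOP length divided by the frame rate.

```python
def worst_case_offset_seconds(gop_length_frames: int, fps: float) -> float:
    """Upper bound on how far the first frame of an I-frame-aligned
    extraction can lie before the requested query time."""
    return gop_length_frames / fps

# e.g. a 50-frame GOP at 25 fps can start up to 2 seconds before the query time
print(worst_case_offset_seconds(50, 25.0))
```

With surveillance-style streams, GOP lengths of several seconds are common, so the start of a naively extracted clip can noticeably precede the moment the user asked for.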
Disclosure of Invention
The technical problem mainly solved by the application is to provide a video extraction method and a related device, which can improve the precision of video extraction.
In order to solve the above problem, a first aspect of the present application provides a video extraction method, including obtaining a target image group corresponding to a query time input by a user in a video to be extracted; and based on the position relation between the I frame in the target image group and the target frame corresponding to the query time, extracting the target video from the video to be extracted by using the target frame as a starting frame and adopting an extraction strategy matched with the position relation.
In order to solve the above problem, a second aspect of the present application provides a video extraction apparatus, including an obtaining module and an extraction module, where the obtaining module is configured to obtain a target image group corresponding to a query time input by a user in a video to be extracted; the extraction module is used for extracting a target video from a video to be extracted by taking the target frame as an initial frame and adopting an extraction strategy matched with the position relation based on the position relation between the I frame in the target image group and the target frame corresponding to the query time.
In order to solve the above problem, a third aspect of the present application provides a video extracting apparatus, which includes a memory and a processor coupled to each other, wherein the processor is configured to execute program instructions stored in the memory to implement the method in the first aspect.
In order to solve the above problem, a fourth aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being for implementing the method of the first aspect.
According to the above scheme, the target image group corresponding to the query time input by the user is obtained from the video to be extracted, and the target video is extracted using the target frame as the starting frame with an extraction strategy matched to the positional relation between the I frame of the target image group and the target frame corresponding to the query time. As a result, there is no error between the starting frame of the extracted target video and the target frame corresponding to the query time: extraction is accurate to the frame, and the precision of video extraction is improved.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a video extraction method of the present application;
FIG. 2 is a schematic diagram of a frame of an embodiment of an image group;
FIG. 3 is a schematic diagram of a frame of another embodiment of a group of images;
FIG. 4 is a schematic flow chart of one embodiment of step S12 in FIG. 1;
FIG. 5 is a flowchart illustrating an embodiment of step S123 in FIG. 4;
FIG. 6 is a block diagram of an embodiment of a video extraction device according to the present application;
FIG. 7 is a block diagram of another embodiment of a video extraction device according to the present application;
FIG. 8 is a block diagram of an embodiment of a storage device of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the preceding and following objects. Further, the term "plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a video extraction method according to an embodiment of the present application. Specifically, the following steps may be included:
step S11: and acquiring a target image group corresponding to the query time input by the user in the video to be extracted.
In one implementation scenario, the query time may be determined by the user's selection of a point on a time axis. In another implementation scenario, it may be determined by the time the user selects in a drop-down list. In yet another implementation scenario, it may be a time value entered directly by the user. In addition, the query time may be obtained by recognizing voice information input by the user. This embodiment is not particularly limited herein.
Referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of an image group. A Group of Pictures (GOP) in the conventional sense starts with an I frame and ends with the frame before the next I frame. In a closed group of pictures (Closed GOP), all frames are independent of the preceding and succeeding GOPs; that is, frames may reference only other frames within the same GOP.
Generally, the first frame after a scene change is an I frame, which is transmitted as a complete frame. In terms of degree of compression, the I frame is compressed the least, the P frame more, and the B frame the most. I frames, P frames, and B frames are prior art in the field and are not described further in this application.
Referring to fig. 2 in combination, for example, if the query time input by the user is ":09", the target image group is the group of pictures (GOP) shown in fig. 2. Other situations can be analogized, and the embodiment is not illustrated one by one here.
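A minimal sketch of this lookup, using an illustrative model in which a frame is a (timestamp, frame-type) pair and a GOP is a list of such pairs in display order (the names and data layout are assumptions for illustration, not from the patent):

```python
def find_target_gop(gops, query_time):
    """Return the group of pictures whose time span covers query_time."""
    for gop in gops:
        if gop[0][0] <= query_time <= gop[-1][0]:
            return gop
    return None  # query time lies outside the stored video

# Two closed GOPs, frames as (timestamp, frame_type) pairs in display order
gops = [
    [(3, "I"), (5, "B"), (7, "B"), (9, "P")],    # GOP covering seconds 3-9
    [(11, "I"), (13, "B"), (15, "B"), (17, "P")],  # GOP covering seconds 11-17
]
print(find_target_gop(gops, 9))   # the first GOP contains second :09
```

In a real system the GOP index would come from the container or a recording database rather than a Python list, but the containment test is the same.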
Step S12: and based on the position relation between the I frame in the target image group and the target frame corresponding to the query time, extracting the target video from the video to be extracted by using the target frame as a starting frame and adopting an extraction strategy matched with the position relation.
According to the position relation between the I frame in the target image group and the target frame corresponding to the query time, an extraction strategy matched with the position relation can be determined, so that the target video is extracted from the video to be extracted by taking the target frame as a starting frame. The extraction policy in this embodiment may be specifically set based on whether the time corresponding to the I frame in the target image group is the same as the time corresponding to the target frame, that is, whether the query time is the same as the time of the I frame in the target image group. Referring to fig. 2, when the query time is ":03", and the target frame is the I frame in the target image group, one extraction strategy may be determined, or, when the query time is ":09", and the target frame is not the same frame as the I frame, another extraction strategy may be determined. The specific extraction strategy in this embodiment is not described herein again.
In this embodiment, the determined extraction strategy is used to extract video from the video to be extracted with the target frame as the starting frame, so as to obtain the target video. In one implementation scenario, based on the user's specific requirements, the target video may be played after it is extracted; in another implementation scenario, it may instead be downloaded. This embodiment is not limited in this respect.
According to the above scheme, the target image group corresponding to the query time input by the user is obtained from the video to be extracted, and the target video is extracted using the target frame as the starting frame with an extraction strategy matched to the positional relation between the I frame of the target image group and the target frame corresponding to the query time. As a result, there is no error between the starting frame of the extracted target video and the target frame corresponding to the query time: extraction is accurate to the frame, and the precision of video extraction is improved.
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating an embodiment of step S12 in fig. 1. Specifically, the step S12 may include:
step S121: and judging whether the I frame and the target frame in the target image group are the same frame, if so, executing a step S122, otherwise, executing a step S123.
In one implementation scenario, the query time input by the user may be compared with the time of the I frame in the target image group, and if the two are the same, it indicates that the I frame and the target frame in the target image group are the same frame, and if the two are different, it indicates that the I frame and the target frame in the target image group are not the same frame.
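The judgment in step S121 reduces to a timestamp comparison. A hedged sketch, again modelling frames as (timestamp, frame-type) pairs in display order (names are illustrative):

```python
def target_is_i_frame(gop, query_time):
    """The GOP's I frame is its first frame in display order; the target
    frame coincides with it exactly when the timestamps are equal."""
    i_frame_time = gop[0][0]
    return i_frame_time == query_time

gop = [(3, "I"), (5, "B"), (7, "B"), (9, "P")]
print(target_is_i_frame(gop, 3))  # True  -> take the direct path (step S122)
print(target_is_i_frame(gop, 9))  # False -> take the re-encode path (step S123)
```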
Step S122: and directly extracting a frame sequence after the target frame in the video to be extracted as the target video.
If the I frame and the target frame in the target image group are the same frame, because the frames after the I frame in the target image group are coded based on the I frame, the frame sequence after the target frame in the video to be extracted can be directly extracted as the target video.
Referring to fig. 2, if the query time input by the user is ":03", the target frame corresponding to the query time is the I frame of the target image group, and the frame sequence after the target frame can be extracted directly as the target video.
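A sketch of this direct extraction under the same illustrative frame model ((timestamp, frame-type) pairs; the function name is ours, not from the patent):

```python
def extract_from_i_frame(frames, query_time):
    """When the target frame is a GOP's I frame, the target video is simply
    the tail of the stream from that frame onward (target frame included).
    No re-encoding is needed because every later frame decodes from it."""
    return [f for f in frames if f[0] >= query_time]

frames = [(1, "P"), (3, "I"), (5, "B"), (7, "B"), (9, "P"), (11, "I")]
print(extract_from_i_frame(frames, 3))
```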
Step S123: and respectively recoding the original frames behind the target frames in the target image group, and extracting the frame sequence behind the recoded target frames as the target video.
In an implementation scenario, the target frame is located after the I frame in the target image group. As shown in fig. 2, if the query time input by the user is ":05", the target frame corresponding to the query time lies after the I frame. Because the frames after the target frame are coded with reference to the I frame, the original B frames and P frames cannot be used directly as frames of the target video, and the frame sequence after the target frame must be re-encoded to serve as the target video. In this example and all embodiments described below, both "extracting the frame sequence after the target frame in the video to be extracted" and "extracting the re-encoded frame sequence after the target frame" include the target frame itself.
According to the above scheme, when the I frame in the target image group and the target frame are the same frame, the frame sequence after the target frame is extracted directly as the target video; when they are not the same frame, the original frames after the target frame in the target image group are re-encoded and the re-encoded frame sequence after the target frame is extracted as the target video. The code stream characteristics of the extracted target video therefore match those of the original video to be extracted, and subsequent use of the target video is unaffected.
In addition, because only the frames in the target image group are subjected to encoding and decoding processing, the secondary calculation amount generated by extracting the target video is controllable, and the influence on the whole system is reduced as much as possible.
Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of step S123 in fig. 4. In this embodiment, so that the re-encoded frame sequence after the target frame has the same code stream characteristics as the original video to be extracted, the following steps may be adopted:
step S1231: and determining the encoding strategy of the video to be extracted based on the target image group.
In one implementation scenario, the encoding strategy includes a plurality of encoding parameters. In this embodiment, a parameter definition set in the target image group may be extracted, and a plurality of encoding parameters may be acquired based on the parameter definition set.
In a specific embodiment, if the video to be extracted is an H.264 code stream, a Sequence Parameter Set (SPS) and a Picture Parameter Set (PPS) may be extracted from the target image group and then parsed to obtain a plurality of encoding parameters, for example: resolution, encoding level, frame rate, number of consecutive B frames, group-of-pictures length, and the like. The embodiment is not particularly limited herein.
The SPS and PPS are prior art in the field and are not described here. In addition, when the video to be extracted is another code stream such as H.265, RealVideo, or VC-1, the encoding strategy of the video to be extracted can be obtained by analogy; this embodiment does not illustrate these one by one.
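Assuming the SPS and PPS have already been parsed into dictionaries by a real H.264 bitstream parser (the field names below are illustrative assumptions, not actual parser output), collecting the encoding parameters for later re-encoding could look like:

```python
def encoding_params(sps: dict, pps: dict) -> dict:
    """Gather the encoder settings named in the text from already-parsed
    sequence/picture parameter sets, so the re-encoder can be configured
    identically to the original stream. Field names are illustrative."""
    return {
        "width": sps["width"],
        "height": sps["height"],
        "profile": sps["profile_idc"],
        "level": sps["level_idc"],
        "frame_rate": sps["frame_rate"],
    }

sps = {"width": 1920, "height": 1080, "profile_idc": 100,
       "level_idc": 40, "frame_rate": 25}
print(encoding_params(sps, {}))
```

The point of the step is only that the re-encoder is driven by parameters read out of the stream itself, never by defaults, so the re-encoded GOP remains interchangeable with the original ones.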
Step S1232: and decoding the target image group based on the decoding strategy matched with the coding strategy to acquire the original frame of the target image group.
In one implementation scenario, the decoding strategy includes a plurality of decoding parameters. In this embodiment, a plurality of decoding parameters matched with the encoding parameters may be acquired based on the acquired encoding parameters, so that the target image group is decoded by using the decoding parameters, and an original frame of the target image group, for example, an original frame in YUV format, is acquired.
In addition, please refer to fig. 3, which is a schematic frame diagram of another embodiment of an image group. In practice, to obtain a higher compression ratio, there is also the open group of pictures (Open GOP). An open GOP starts with one or more B frames that are encoded with reference to the last P frame of the previous GOP and the first I frame of the current GOP; equivalently, the previous GOP can be regarded as ending with B frames rather than a P frame, those B frames referencing the last P frame of their own GOP and the starting I frame of the next GOP. When the image group is the open image group shown in fig. 3, the first two B frames of the group of pictures following the target image group (i.e., the two B frames at times ":19" and ":20" in fig. 3) are additionally decoded to obtain their original frames.
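Under the illustrative (timestamp, frame-type) model used earlier, the extra frames that an open GOP forces into the decode set are exactly the leading B frames of the next group. A hedged sketch (names are ours):

```python
def extra_frames_for_open_gop(next_gop):
    """In an open GOP, the B frames at the head of the next group reference
    the previous GOP, so they must be decoded (and later re-encoded) along
    with the target image group. For a closed GOP this set is empty."""
    extra = []
    for t, kind in next_gop:
        if kind != "B":
            break  # first non-B frame: no further cross-GOP references
        extra.append((t, kind))
    return extra

next_gop = [(19, "B"), (20, "B"), (21, "I"), (23, "P")]
print(extra_frames_for_open_gop(next_gop))  # the two leading B frames
```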
Step S1233: and re-encoding the original frame after the target frame by adopting an encoding strategy.
In one implementation scenario, the original frame before the target frame (excluding the target frame) in the target image group may be discarded, so that the original frame after the target frame (including the target frame) may be re-encoded by adopting the obtained encoding policy.
In an implementation scenario, the original frames after the target frame may be re-encoded using the plurality of encoding parameters obtained above.
In addition, in an implementation scenario, when the group of pictures is the open group of pictures shown in fig. 3, the first two B frames (i.e. the two B frames with time ":19" and ":20" in fig. 3) in the next group of pictures of the target group of pictures are further decoded to obtain their original frames, and are re-encoded. That is, when the open group of pictures is used, two more B frames need to be additionally encoded/decoded, and therefore, the amount of secondary calculation resulting from extracting the target video is also controllable, and the influence on the overall system can be reduced as much as possible. Therefore, the video extraction method is suitable for both closed image groups and open image groups, and has good compatibility.
Step S1234: and extracting a frame sequence after the target frame after the recoding as the target video.
In one implementation scenario, the original frame group after the target frame after re-encoding may be used as a new target image group, and the original target image group in the video to be extracted is replaced by the new target image group, so as to extract a new target image group and a subsequent image group as the target video.
Referring to fig. 2, when the query time input by the user is ":07", the target frame corresponding to the query time is not the same frame as the I frame in the target image group, so the frame sequence after the target frame is re-encoded and packaged as a new target image group; that is, the re-encoded frame sequence between times ":07" and ":17" in fig. 2 constitutes a new target image group, which replaces the original one, so that the new target image group and the image groups after it serve as the target video.
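A sketch of this replacement step, treating each GOP as an opaque list (names illustrative, not from the patent):

```python
def assemble_target_video(gops, target_index, reencoded_gop):
    """Swap the original target GOP for its re-encoded replacement and keep
    every following GOP unchanged; the result starts exactly at the target
    frame, while all later GOPs are passed through untouched."""
    return [reencoded_gop] + gops[target_index + 1:]

gops = [["gop0"], ["gop1_original"], ["gop2"]]
new_gop = ["gop1_reencoded_from_target"]
print(assemble_target_video(gops, 1, new_gop))
```

Because only the target image group (plus, for open GOPs, two B frames) is touched, the cost of the operation is bounded regardless of how long the extracted video is.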
Referring to fig. 6, fig. 6 is a schematic block diagram of a video extraction device 60 according to an embodiment of the present disclosure. The video extraction device 60 specifically includes an acquisition module 61 and an extraction module 62. The acquisition module 61 is configured to acquire, in a video to be extracted, a target image group corresponding to a query time input by a user; the extraction module 62 is configured to extract the target video from the video to be extracted, using the target frame as the starting frame and an extraction strategy matched to the positional relation between the I frame in the target image group and the target frame corresponding to the query time.
According to the above scheme, the target image group corresponding to the query time input by the user is obtained from the video to be extracted, and the target video is extracted using the target frame as the starting frame with an extraction strategy matched to the positional relation between the I frame of the target image group and the target frame corresponding to the query time. As a result, there is no error between the starting frame of the extracted target video and the target frame corresponding to the query time: extraction is accurate to the frame, and the precision of video extraction is improved.
In some embodiments, the extraction module 62 includes a first extraction sub-module configured to, when the I frame in the target image group and the target frame are determined to be the same frame, directly extract the frame sequence after the target frame in the video to be extracted as the target video; and a second extraction sub-module configured to, when they are determined not to be the same frame, re-encode the original frames after the target frame in the target image group and extract the re-encoded frame sequence after the target frame as the target video.
In some embodiments, the second extraction sub-module comprises: a policy determination unit configured to determine the encoding strategy of the video to be extracted based on the target image group; a decoding unit configured to decode the target image group based on a decoding strategy matched to the encoding strategy and obtain the original frames of the target image group; an encoding unit configured to re-encode the original frames after the target frame with the encoding strategy; and an extraction sub-unit configured to extract the re-encoded frame sequence after the target frame as the target video.
Different from the above embodiments, in this embodiment the original frames after the target frame in the target image group are re-encoded based on the encoding strategy of the video to be extracted, and the re-encoded frame sequence after the target frame is extracted as the target video, so that the code stream characteristics of the target video remain unchanged and subsequent use of the target video is unaffected.
In addition, because only the frames in the target image group are subjected to encoding and decoding processing, the secondary calculation amount generated by extracting the target video is controllable, and the influence on the whole system is reduced as much as possible.
In some embodiments, the encoding policy includes a plurality of encoding parameters and the decoding policy includes a plurality of decoding parameters. The policy determination unit is specifically configured to extract a parameter definition set from the target image group and obtain the plurality of encoding parameters from it; the decoding unit is specifically configured to obtain a plurality of decoding parameters matching the encoding parameters and to decode the target image group using those decoding parameters, obtaining the original frames.
In some embodiments, the video to be extracted is an H.264 code stream, and the policy determination unit is specifically configured to extract the sequence parameter set and the picture parameter set from the target image group and parse them to obtain the plurality of encoding parameters. In one implementation scenario, the encoding parameters include resolution, encoding level, encoding profile, frame rate, number of consecutive B frames, and group-of-pictures length.
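As an illustration of locating those parameter sets, the sketch below scans an H.264 Annex-B byte stream for SPS and PPS NAL units: per ITU-T H.264, the low five bits of the byte following each `0x000001` start code give the NAL unit type (7 = SPS, 8 = PPS). This is a deliberately minimal example; actually parsing the parameter-set payloads (Exp-Golomb-coded fields for resolution, profile, level, and so on) is considerably more involved and is not shown.

```python
# NAL unit types defined by ITU-T H.264.
NAL_SPS, NAL_PPS = 7, 8

def find_parameter_sets(stream: bytes) -> dict:
    """Return byte offsets of the SPS and PPS NAL unit headers, if present."""
    found = {}
    i = 0
    while i < len(stream) - 3:
        # A 4-byte start code (00 00 00 01) also matches here at offset +1,
        # so both Annex-B start-code lengths are handled.
        if stream[i:i + 3] == b"\x00\x00\x01":
            nal_type = stream[i + 3] & 0x1F  # low 5 bits of the NAL header
            if nal_type == NAL_SPS:
                found["sps_offset"] = i + 3
            elif nal_type == NAL_PPS:
                found["pps_offset"] = i + 3
            i += 3
        else:
            i += 1
    return found
```

In a stream captured from a camera, the SPS/PPS typically precede each IDR frame, which is what lets a single GOP be decoded and re-encoded in isolation as the embodiment describes.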
In some embodiments, the video extraction device 60 further comprises a playing module for playing the target video.
In some embodiments, the video extraction device 60 further comprises a download module for downloading the target video.
Referring to fig. 7, fig. 7 is a block diagram illustrating a video extraction device 70 according to another embodiment of the present application. The video extraction device 70 includes a memory 71 and a processor 72 coupled to each other, and the processor 72 is configured to execute program instructions stored in the memory 71 to implement the steps in any of the above-described video extraction method embodiments.
In particular, the processor 72 is configured to control itself and the memory 71 to implement the steps of any of the above-described video extraction method embodiments. The processor 72 may also be referred to as a CPU (Central Processing Unit) and may be an integrated circuit chip with signal processing capability. The processor 72 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor or any conventional processor. Additionally, the processor 72 may be implemented jointly by a plurality of integrated circuit chips.
In this embodiment, the processor 72 is configured to obtain, from the video to be extracted, the target image group corresponding to the query time input by the user, and, based on the positional relationship between the I frame in the target image group and the target frame corresponding to the query time, to extract the target video from the video to be extracted using the target frame as the start frame and an extraction policy matched to that positional relationship.
With the above scheme, the target image group corresponding to the user's query time is obtained from the video to be extracted, and, based on the positional relationship between the I frame in the target image group and the target frame corresponding to the query time, the target video is extracted using the target frame as the start frame and an extraction policy matched to that relationship. There is therefore no error between the start frame of the extracted target video and the target frame corresponding to the query time: extraction is frame-accurate, improving video extraction precision.
In some embodiments, the processor 72 is further configured to, when the I frame in the target group of pictures and the target frame are determined to be the same frame, directly extract the frame sequence following the target frame in the video to be extracted as the target video; and, when they are determined not to be the same frame, to re-encode the original frames following the target frame in the target group of pictures and extract the frame sequence starting from the re-encoded target frame as the target video.
In some embodiments, the processor 72 is further configured to determine the encoding policy of the video to be extracted based on the target image group, to decode the target image group with a decoding policy matching the encoding policy to obtain the original frames of the target image group, to re-encode the original frames following the target frame with the encoding policy, and to extract the frame sequence starting from the re-encoded target frame as the target video.
Unlike the foregoing embodiments, in this embodiment the original frames following the target frame in the target image group are re-encoded according to the encoding policy of the video to be extracted, and the frame sequence starting from the re-encoded target frame is extracted as the target video. The code-stream characteristics of the target video therefore remain unchanged, and subsequent use of the target video is unaffected.
In addition, because only frames within the target image group undergo decoding and re-encoding, the additional computation incurred by extracting the target video is bounded, keeping the impact on the overall system as small as possible.
In some embodiments, the encoding policy includes a plurality of encoding parameters and the decoding policy includes a plurality of decoding parameters. The processor 72 is further configured to extract a parameter definition set from the target image group, to obtain the plurality of encoding parameters from it, to obtain a plurality of decoding parameters matching the encoding parameters, and to decode the target group of pictures using those decoding parameters, obtaining the original frames.
In some embodiments, the video to be extracted is an H.264 code stream, and the processor 72 is further configured to extract the sequence parameter set and the picture parameter set from the target group of pictures and parse them to obtain the plurality of encoding parameters. In one implementation scenario, the encoding parameters include resolution, encoding level, encoding profile, frame rate, number of consecutive B frames, and group-of-pictures length.
In some embodiments, the video extraction device 70 further comprises a human-computer interaction circuit for playing the target video.
In some embodiments, the video extraction device 70 further comprises communication circuitry for downloading the target video.
Referring to fig. 8, fig. 8 is a schematic diagram of a storage device 80 according to an embodiment of the present disclosure. The storage device 80 stores program instructions 81 executable by a processor, the program instructions 81 being for implementing the steps in any of the video extraction method embodiments described above.
With the above scheme, the target image group corresponding to the user's query time is obtained from the video to be extracted, and, based on the positional relationship between the I frame in the target image group and the target frame corresponding to the query time, the target video is extracted using the target frame as the start frame and an extraction policy matched to that relationship. There is therefore no error between the start frame of the extracted target video and the target frame corresponding to the query time: extraction is frame-accurate, improving video extraction precision.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a logical division, and an actual implementation may divide them differently; multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, in essence, or the part contributing over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (11)

1. A method for video extraction, comprising:
acquiring a target image group corresponding to query time input by a user in a video to be extracted;
based on the position relation between the I frame in the target image group and the target frame corresponding to the query time, extracting a target video from the video to be extracted by using the target frame as an initial frame and adopting an extraction strategy matched with the position relation;
the method for extracting the target video from the video to be extracted by using the target frame as a starting frame and adopting an extraction strategy matched with the position relation based on the position relation between the I frame in the target image group and the target frame corresponding to the query time comprises the following steps:
if the I frame in the target image group and the target frame are determined not to be the same frame, re-encoding the original frames following the target frame in the target image group, and extracting the frame sequence starting from the re-encoded target frame as the target video; and decoding the B frame before the I frame in the next image group of the target image group to obtain its original frame, and performing the re-encoding on the original frame of the B frame before the I frame in the next image group.
2. The video extraction method according to claim 1, wherein the extracting a target video from the video to be extracted by using an extraction policy that matches the position relationship with the target frame as a starting frame based on the position relationship between the I frame in the target image group and the target frame corresponding to the query time further comprises:
and if the I frame in the target image group and the target frame are determined to be the same frame, directly extracting a frame sequence behind the target frame in the video to be extracted as the target video.
3. The video extraction method according to claim 1, wherein said re-encoding the original frames after the target frame in the target image group and extracting the frame sequence after the re-encoded target frame as the target video respectively comprises:
determining an encoding strategy of the video to be extracted based on the target image group;
decoding the target image group based on a decoding strategy matched with the coding strategy to acquire an original frame of the target image group;
re-encoding the original frame after the target frame by adopting the encoding strategy;
and extracting a frame sequence after the target frame after the recoding as the target video.
4. The method of claim 3, wherein the encoding strategy comprises a plurality of encoding parameters, wherein the decoding strategy comprises a plurality of decoding parameters, and wherein the determining the encoding strategy of the video to be extracted based on the target group of pictures comprises:
extracting a parameter definition set in the target image group;
acquiring the plurality of encoding parameters based on the parameter definition set;
the decoding the target image group based on the decoding strategy matched with the coding strategy, and the acquiring of the original frame of the target image group comprises:
obtaining the plurality of decoding parameters matched with the plurality of coding parameters;
and decoding the target image group by using the plurality of decoding parameters to obtain the original frame.
5. The video extraction method according to claim 4, wherein the video to be extracted is an H.264 code stream, and the extracting the parameter definition set in the target image group includes:
extracting a sequence parameter set and an image parameter set in the target image group;
the obtaining the plurality of encoding parameters based on the parameter definition set comprises:
and analyzing the sequence parameter set and the image parameter set to acquire the plurality of encoding parameters.
6. The method of claim 5, wherein the plurality of encoding parameters comprise resolution, encoding level, frame rate, number of consecutive B frames, and group of pictures length.
7. The video extraction method according to claim 3, wherein the extracting, as the target video, a sequence of frames following the re-encoded target frame comprises:
packing the original frame group after the target frame after recoding into a new target image group;
replacing the original target image group in the video to be extracted with the new target image group;
and extracting the new target image group and the subsequent image group as the target video.
8. The video extraction method according to claim 1, wherein after the target video is extracted from the video to be extracted by using an extraction policy that matches the positional relationship with the target frame as a starting frame based on the positional relationship between the I frame in the target image group and the target frame corresponding to the query time, the method further comprises:
playing the target video; and/or,
and downloading the target video.
9. A video extraction apparatus, comprising:
the acquisition module is used for acquiring a target image group corresponding to the query time input by the user in the video to be extracted;
the extraction module is used for extracting a target video from the video to be extracted by taking the target frame as an initial frame and adopting an extraction strategy matched with the position relation based on the position relation between the I frame in the target image group and the target frame corresponding to the query time;
the type of the target image group comprises an open image group, the extraction module comprises a second extraction sub-module, and the second extraction sub-module is used for respectively re-encoding an original frame behind the target frame in the target image group and extracting a frame sequence behind the re-encoded target frame as the target video when the I frame in the target image group is determined not to be the same as the target frame; and further decoding the B frame before the I frame in the next image group of the target image group to obtain the original frame thereof, and performing the re-encoding on the original frame of the B frame before the I frame in the next image group.
10. A video extraction apparatus comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the video extraction method of any one of claims 1 to 8.
11. A storage device storing program instructions executable by a processor to implement the video extraction method of any one of claims 1 to 8.
CN201910877519.0A 2019-09-17 2019-09-17 Video extraction method and device and storage device Active CN110740344B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910877519.0A CN110740344B (en) 2019-09-17 2019-09-17 Video extraction method and device and storage device


Publications (2)

Publication Number Publication Date
CN110740344A CN110740344A (en) 2020-01-31
CN110740344B true CN110740344B (en) 2022-10-04

Family

ID=69267974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910877519.0A Active CN110740344B (en) 2019-09-17 2019-09-17 Video extraction method and device and storage device

Country Status (1)

Country Link
CN (1) CN110740344B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111405288A (en) * 2020-03-19 2020-07-10 北京字节跳动网络技术有限公司 Video frame extraction method and device, electronic equipment and computer readable storage medium


Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN100353750C (en) * 2000-09-15 2007-12-05 北京算通数字技术研究中心有限公司 Edition method for non-linear edition system based on MPEG-2 code stream
JP4411220B2 (en) * 2005-01-18 2010-02-10 キヤノン株式会社 Video signal processing apparatus and video signal processing method thereof
CN103024394A (en) * 2012-12-31 2013-04-03 传聚互动(北京)科技有限公司 Video file editing method and device
CN104967862A (en) * 2015-07-22 2015-10-07 东方网力科技股份有限公司 Video storage method and device, and video searching method and device
EP3185564A1 (en) * 2015-12-22 2017-06-28 Harmonic Inc. Video stream splicing of groups of pictures (gop)
CN106254869A (en) * 2016-08-25 2016-12-21 腾讯科技(深圳)有限公司 The decoding method of a kind of video data, device and system
CN106803992B (en) * 2017-02-14 2020-05-22 北京时间股份有限公司 Video editing method and device

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
EP0989756A2 (en) * 1998-09-25 2000-03-29 Sarnoff Corporation Splicing information streams
US6912251B1 (en) * 1998-09-25 2005-06-28 Sarnoff Corporation Frame-accurate seamless splicing of information streams
CN110121071A (en) * 2018-02-05 2019-08-13 广东欧珀移动通信有限公司 Method for video coding and Related product
CN108989846A (en) * 2018-07-09 2018-12-11 武汉斗鱼网络科技有限公司 A kind of video transformation assay method, apparatus, equipment and medium

Non-Patent Citations (1)

Title
GOP video quality evaluation mechanism based on frame-damage position awareness; Cheng Deqiang et al.; Video Engineering (《电视技术》); 2017-12-31; full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant