CN112770116A - Method for extracting video key frame by using video compression coding information - Google Patents

Info

Publication number
CN112770116A
Authority
CN
China
Prior art keywords
video
frame
value
extracting
shot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011642920.5A
Other languages
Chinese (zh)
Other versions
CN112770116B (en)
Inventor
艾达
梁嘉倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications
Priority to CN202011642920.5A
Publication of CN112770116A
Application granted
Publication of CN112770116B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Compression Or Coding Systems Of Tv Signals

Abstract

A method for extracting video key frames by using video compression coding information comprises three steps: extracting depth and frame bit number features, detecting shot switching, and extracting key frames. The invention uses two compressed-domain features of the video bitstream, the coding unit depth information and the number of frame bits, to detect shot switching, obtain shot segments, and extract key frames. Because the compressed-domain video is processed without decompression, the amount of computation is reduced, the processing time is shortened, and the processing speed is increased. Experimental results show that, compared with the existing method, the accuracy is improved by 12.1%, the recall rate by 5.3%, and the F value by 8.4%, and the extracted key frames express the main content of the original video well. The method has a small computational load, high efficiency, high accuracy, and high processing speed, and can be used for processing video images.

Description

Method for extracting video key frame by using video compression coding information
Technical Field
The invention belongs to the technical field of digital video retrieval, and particularly relates to a method for extracting video key frames by using video compression coding information.
Background
With the rapid development of multimedia and network technology, the volume of video data has grown to an unprecedented scale, and how to manage videos effectively and quickly obtain the important information in them has become a research hotspot. Against this background, key frame extraction has become an effective way to address the problem: extracting key frames greatly reduces the data volume of a video while still expressing the important information of the original video, which saves retrieval time and improves video retrieval efficiency.
Scholars at home and abroad have carried out a great deal of research on key frame extraction. According to the video data being processed, the methods can be divided into key frame extraction in the pixel domain and key frame extraction in the compressed domain. Pixel-domain methods operate after the video has been completely decompressed; they require a large amount of computation, are inefficient, and have difficulty meeting real-time requirements. Compressed-domain video processing works directly on the compressed video data, whose volume is small, and processes the video with no decompression or only partial decompression, so the processing speed can be greatly improved. Research on key frame extraction in the compressed domain has therefore drawn wide attention.
Ali Reza et al. proposed a method for extracting key frames in the H.265/HEVC compressed domain, which uses a normalized histogram of the intra-frame prediction modes extracted from the H.265/HEVC coded video to detect similar frames, classifies the similar frames using fuzzy c-means clustering, and extracts key frames. Zhu Zhiming et al. proposed a key frame extraction method for video summarization in the video coding compressed domain, which counts the number of luma prediction modes of the intra-coded PU blocks at the decoding end, constructs a mode feature vector, clusters the mode feature vectors using an adaptive clustering algorithm that incorporates the iterative self-organizing data analysis algorithm (ISODATA) to obtain candidate key frames, and then filters the candidate key frames by similarity to remove redundant frames and obtain the final key frames.
What these methods have in common is that they use the intra-frame prediction mode value as the feature and their experiments target only the all-intra mode, so the video frames are processed slowly, the processing time is long, and the methods lack practicality.
Disclosure of Invention
The technical problem to be solved by the present invention is to overcome the shortcomings of the above video frame processing methods and to provide a method for extracting video key frames by using video compression coding information that requires no decoding, has a small computational load, a high processing speed, and a high extraction efficiency.
The technical solution adopted to solve the above technical problem comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): formula image not reproduced in the source text]
where D_{x,y} and R_{x,y} denote the distortion and the number of coding bits of the (x, y)-th pixel in the coding unit, respectively, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ ≥ 0 is the Lagrange coefficient, W and H are finite positive integers, and W > H.
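The role of the rate-distortion cost can be illustrated with a minimal Python sketch. Because the image of equation (1) is not available in this text, the sketch assumes the conventional Lagrangian form J = D + λR summed over the candidate blocks; the function names `rd_cost` and `choose_split` and the sample numbers are hypothetical and are not taken from the patent.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lam * R (assumed form)."""
    if lam < 0:
        raise ValueError("the Lagrange coefficient must be non-negative")
    return distortion + lam * bits


def choose_split(unsplit, split_children, lam):
    """Compare one coding unit against its four sub-units.

    unsplit        -- (distortion, bits) of the undivided coding unit
    split_children -- list of (distortion, bits) pairs, one per sub-unit
    Returns ("split", cost) if splitting is strictly cheaper, else ("keep", cost).
    """
    j_unsplit = rd_cost(unsplit[0], unsplit[1], lam)
    j_split = sum(rd_cost(d, r, lam) for d, r in split_children)
    if j_split < j_unsplit:
        return "split", j_split
    return "keep", j_unsplit
```

An encoder that splits a coding unit moves one level deeper in the quadtree, which is why the resulting depth value (0 to 3) reflects the local complexity of the picture content.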
Determine the depth feature vector F_n of the coded frame according to equation (2):
F_n = {f_1, f_2, …, f_α} (2)
[Formula image accompanying equation (2) not reproduced in the source text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer, round() is an upward rounding function, and f_α is the depth value of a coding unit, taking any one of the values 0, 1, 2, and 3.
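As a rough illustration of equation (2), the sketch below flattens a grid of coding unit depths into a feature vector. It assumes that some bitstream parser (not shown, and not specified in this text) has already produced the per-frame depth grid in raster order; the function name `depth_feature_vector` is hypothetical.

```python
def depth_feature_vector(depth_grid):
    """Flatten a 2-D grid of coding unit depths into F_n = [f_1, ..., f_alpha].

    depth_grid -- list of rows, each a list of depth values in raster order
                  (assumed to come from a bitstream parser that is not shown).
    """
    features = [depth for row in depth_grid for depth in row]
    # The text restricts every depth value to 0, 1, 2 or 3.
    if any(depth not in (0, 1, 2, 3) for depth in features):
        raise ValueError("coding unit depth must be 0, 1, 2 or 3")
    return features
```

For example, `depth_feature_vector([[0, 1], [2, 3]])` yields a four-element vector for a frame covered by four depth entries.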
Determine the number of frame bits R_n according to equation (3):
[Equation (3): formula image not reproduced in the source text]
(2) Shot switching detection
Count the number of frame bits R_n of each coded frame and plot it as a line chart for analysis. Mark each position where the value gradually increases and then gradually decreases as a shot switch. One shot segment lies between two adjacent shot switches; the length of a shot segment is M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer.
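The rise-then-fall rule of step (2) can be sketched as a simple peak search over the per-frame bit counts. The text describes the analysis only as reading a line chart, so the relative threshold `min_rise` and both function names below are illustrative assumptions rather than part of the patented method.

```python
def detect_shot_switches(frame_bits, min_rise=1.5):
    """Return indices where the bit count rises and then falls.

    A frame is flagged when it exceeds both neighbours and is at least
    min_rise times the mean bit count (the threshold is an assumption).
    """
    if not frame_bits:
        return []
    mean_bits = sum(frame_bits) / len(frame_bits)
    switches = []
    for n in range(1, len(frame_bits) - 1):
        rises = frame_bits[n - 1] < frame_bits[n]
        falls = frame_bits[n] > frame_bits[n + 1]
        if rises and falls and frame_bits[n] >= min_rise * mean_bits:
            switches.append(n)
    return switches


def segment_lengths(switches, total_frames):
    """Lengths M of the shot segments delimited by the detected switches."""
    bounds = [0] + list(switches) + [total_frames]
    return [end - start for start, end in zip(bounds, bounds[1:])]
```

With the toy series `[10, 11, 10, 12, 60, 11, 10, 9, 55, 12, 10]`, the two large peaks are flagged and the small bump at index 1 is ignored because it stays below the threshold.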
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its auxiliary definitions: formula images not reproduced in the source text]
where F_i and F_j denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
L × y = β × D × y (5)
Y = [y_1, y_2, …, y_K] (6)
where y_1, y_2, …, y_K are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues.
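Equations (5) and (6) describe a generalized eigenproblem of the kind used in spectral clustering. The sketch below is one way to solve it with NumPy. Since the images defining equation (4) are not available here, it assumes a Gaussian similarity between depth feature vectors, a diagonal degree matrix D, and L = D - W, and it takes "the first K eigenvalues" to mean the K smallest; the function name `spectral_embedding` and the parameter `sigma` are hypothetical.

```python
import numpy as np


def spectral_embedding(features, k, sigma=1.0):
    """Solve L y = beta D y and return the K smallest eigenpairs.

    features -- N x alpha array of depth feature vectors F_1 ... F_N
    Returns (betas, Y, L, D) where Y is the N x K matrix of equation (6).
    """
    F = np.asarray(features, dtype=float)
    # Pairwise squared distances between feature vectors (N x N).
    sq_dist = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))  # assumed Gaussian similarity
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)                        # must be positive for every frame
    D = np.diag(deg)
    L = D - W                                  # assumed unnormalized Laplacian
    # Rewrite L y = beta D y as a symmetric problem with z = D^(1/2) y.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    betas, Z = np.linalg.eigh(L_sym)           # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * Z[:, :k]         # map back: y = D^(-1/2) z
    return betas[:k], Y, L, D
```

The pairwise-distance step builds an N × N × α intermediate array, which is acceptable for a sketch but would need a more memory-frugal formulation for long videos.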
Perform K-means clustering on the matrix Y, and determine the distance d_m between the cluster center μ and each frame in the shot according to equation (7):
d_m = ||y_m - μ||_2 (7)
where m ∈ {1, 2, …, M}, M is the length of each shot and is a finite positive integer, and M < N.
The frame with the smallest distance d_m is taken as the key frame.
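The final selection in equation (7) amounts to picking the frame whose embedded row lies closest to the cluster center. The sketch below shows only that step, under the simplifying assumption that the frames of one shot form one cluster whose center is their mean; the K-means iteration itself is not shown, and the function name `key_frame_index` is hypothetical.

```python
import math


def key_frame_index(rows):
    """Pick the frame closest to the cluster center.

    rows -- list of embedded vectors y_m, one per frame of a single shot
            (treated here as one cluster, which is an assumption).
    Returns (index of the key frame, list of distances d_m).
    """
    dim = len(rows[0])
    # Cluster center mu taken as the mean of the shot's rows.
    mu = [sum(row[c] for row in rows) / len(rows) for c in range(dim)]
    dists = [math.dist(row, mu) for row in rows]
    best = min(range(len(rows)), key=dists.__getitem__)
    return best, dists
```

Choosing the frame nearest the center, rather than the center itself, guarantees that the key frame is an actual frame of the video.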
In step (1), extracting depth and frame bit number features, W ranges from 176 to 7680, H ranges from 144 to 4320, and N ranges from 1000 to 7000.
In step (2), shot switching detection, K ranges from 5 to 20.
The invention uses two compressed-domain features of the video bitstream, the CU depth value and the number of frame bits, to detect shot switching, obtain shot segments, and extract key frames. Because the compressed-domain video is processed without decompression, the amount of computation is reduced, the processing time is shortened, and the processing speed is increased. Experimental results show that, compared with the existing method, the accuracy is improved by 12.1%, the recall rate by 5.3%, and the F value by 8.4%, and the extracted key frames express the main content of the original video well. The method has a small computational load, high efficiency, high accuracy, and high processing speed, and can be used for processing video images.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following drawings and examples, but the present invention is not limited to these examples.
Example 1
Taking segment 02 of the video sequence A New Horizon in the international VSUMM dataset as an example, the method of this embodiment for extracting video key frames by using video compression coding information comprises the following steps (see FIG. 1):
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): formula image not reproduced in the source text]
where D_{x,y} and R_{x,y} denote the distortion and the number of coding bits of the (x, y)-th pixel in the coding unit, respectively, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ ≥ 0 is the Lagrange coefficient, W and H are finite positive integers, and W > H. In this embodiment, W is 352 and H is 240.
Determine the depth feature vector F_n of the coded frame according to equation (2):
F_n = {f_1, f_2, …, f_α} (2)
[Formula image accompanying equation (2) not reproduced in the source text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, and N is the total number of frames in the video and is a finite positive integer; in this embodiment N is 1797. round() is an upward rounding function, and f_α is the depth value of a coding unit, taking any one of the values 0, 1, 2, and 3; the specific value of f_α is determined according to the value of n.
Determine the number of frame bits R_n according to equation (3):
[Equation (3): formula image not reproduced in the source text]
(2) Shot switching detection
Count the number of frame bits R_n of each coded frame and plot it as a line chart for analysis. Mark each position where the value gradually increases and then gradually decreases as a shot switch. One shot segment lies between two adjacent shot switches; the length of a shot segment is M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment K is 13, and the values of M are 376, 232, 128, 108, 80, 76, 72, 80, 116, 120, 68, 72, and 108.
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its auxiliary definitions: formula images not reproduced in the source text]
where F_i and F_j denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
L × y = β × D × y (5)
Y = [y_1, y_2, …, y_K] (6)
where y_1, y_2, …, y_K are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues; the value of K in this step is the same as K in step (2), and the value of N is the same as N in step (1).
Perform K-means clustering on the matrix Y, and determine the distance d_m between the cluster center μ and each frame in the shot according to equation (7):
d_m = ||y_m - μ||_2 (7)
where m ∈ {1, 2, …, M}, M is the length of each shot and is a finite positive integer, M < N, and the specific values of M are the same as in step (2).
The frame with the smallest distance d_m is taken as the key frame.
Example 2
Taking the video sequence Ocean Floor Legacy as an example, the method of this embodiment for extracting video key frames by using video compression coding information comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): formula image not reproduced in the source text]
where D_{x,y} and R_{x,y} denote the distortion and the number of coding bits of the (x, y)-th pixel in the coding unit, respectively, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ ≥ 0 is the Lagrange coefficient, W and H are finite positive integers, and W > H. In this embodiment, W is 176 and H is 144.
Determine the depth feature vector F_n of the coded frame according to equation (2):
F_n = {f_1, f_2, …, f_α} (2)
[Formula image accompanying equation (2) not reproduced in the source text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, and N is the total number of frames in the video and is a finite positive integer; in this embodiment N is 1000. round() is an upward rounding function, and f_α is the depth value of a coding unit, taking any one of the values 0, 1, 2, and 3; the specific value of f_α is determined according to the value of n.
Determine the number of frame bits R_n according to equation (3):
[Equation (3): formula image not reproduced in the source text]
(2) Shot switching detection
Count the number of frame bits R_n of each coded frame and plot it as a line chart for analysis. Mark each position where the value gradually increases and then gradually decreases as a shot switch. One shot segment lies between two adjacent shot switches; the length of a shot segment is M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment K is 5, and the values of M are 336, 216, 112, 96, and 296.
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its auxiliary definitions: formula images not reproduced in the source text]
where F_i and F_j denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
L × y = β × D × y (5)
Y = [y_1, y_2, …, y_K] (6)
where y_1, y_2, …, y_K are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues; the value of K in this step is the same as K in step (2), and the value of N is the same as N in step (1).
Perform K-means clustering on the matrix Y, and determine the distance d_m between the cluster center μ and each frame in the shot according to equation (7):
d_m = ||y_m - μ||_2 (7)
where m ∈ {1, 2, …, M}, M is the length of each shot and is a finite positive integer, M < N, and the specific values of M are the same as in step (2).
The frame with the smallest distance d_m is taken as the key frame.
Example 3
Taking the video sequence Exceptional Terrane as an example, the method of this embodiment for extracting video key frames by using video compression coding information comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): formula image not reproduced in the source text]
where D_{x,y} and R_{x,y} denote the distortion and the number of coding bits of the (x, y)-th pixel in the coding unit, respectively, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ ≥ 0 is the Lagrange coefficient, W and H are finite positive integers, and W > H. In this embodiment, W is 7680 and H is 4320.
Determine the depth feature vector F_n of the coded frame according to equation (2):
F_n = {f_1, f_2, …, f_α} (2)
[Formula image accompanying equation (2) not reproduced in the source text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, and N is the total number of frames in the video and is a finite positive integer; in this embodiment N is 7000. round() is an upward rounding function, and f_α is the depth value of a coding unit, taking any one of the values 0, 1, 2, and 3; the specific value of f_α is determined according to the value of n.
Determine the number of frame bits R_n according to equation (3):
[Equation (3): formula image not reproduced in the source text]
(2) Shot switching detection
Count the number of frame bits R_n of each coded frame and plot it as a line chart for analysis. Mark each position where the value gradually increases and then gradually decreases as a shot switch. One shot segment lies between two adjacent shot switches; the length of a shot segment is M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment K is 20, and the values of M are 156, 196, 596, 1068, 316, 452, 196, 96, 468, 240, 496, 176, 152, 376, 192, 112, 412, 336, 240, and 396.
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its auxiliary definitions: formula images not reproduced in the source text]
where F_i and F_j denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
L × y = β × D × y (5)
Y = [y_1, y_2, …, y_K] (6)
where y_1, y_2, …, y_K are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues; the value of K in this step is the same as K in step (2), and the value of N is the same as N in step (1).
Perform K-means clustering on the matrix Y, and determine the distance d_m between the cluster center μ and each frame in the shot according to equation (7):
d_m = ||y_m - μ||_2 (7)
where m ∈ {1, 2, …, M}, M is the length of each shot and is a finite positive integer, M < N, and the specific values of M are the same as in step (2).
The frame with the smallest distance d_m is taken as the key frame.
To verify the beneficial effects of the present invention, the inventors carried out a comparative experiment between the method of Embodiment 1 for extracting video key frames by using video compression coding information and the method of "HEVC intra-frame based compressed-domain video summarization" (hereinafter "comparison document 1"). The accuracy, recall rate, and F value of the two methods were determined as comprehensive indicators for evaluating the quality of the video summary. The experimental and calculated results are shown in Table 1.
The accuracy is determined as follows:
[Accuracy formula: image not reproduced in the source text]
where N_m is the number of key frames extracted by the experimental method that match the user summary, and N_AS is the number of key frames extracted by the experimental method.
The recall rate is determined as follows:
[Recall formula: image not reproduced in the source text]
where N_US is the number of key frames in the user summary.
The F value is determined as follows:
[F value formula: image not reproduced in the source text]
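The three indicators can be computed as in the following sketch. The formula images are not available in this text, so the code assumes the standard definitions that fit the quantities named above: accuracy (commonly called precision) as N_m / N_AS, recall as N_m / N_US, and the F value as their harmonic mean. The function name `summary_metrics` and the sample counts are hypothetical and are not experimental data from the patent.

```python
def summary_metrics(n_matched, n_extracted, n_user):
    """Accuracy, recall and F value of an extracted key frame set.

    n_matched   -- N_m, extracted key frames that match the user summary
    n_extracted -- N_AS, key frames extracted by the method
    n_user      -- N_US, key frames in the user summary
    """
    accuracy = n_matched / n_extracted if n_extracted else 0.0
    recall = n_matched / n_user if n_user else 0.0
    total = accuracy + recall
    # Harmonic mean of accuracy and recall (assumed definition of the F value).
    f_value = 2 * accuracy * recall / total if total else 0.0
    return accuracy, recall, f_value
```

For instance, 8 matched frames out of 10 extracted, against a user summary of 16 frames, gives an accuracy of 0.8 and a recall of 0.5.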
Table 1. Experimental results
[Table 1: table image not reproduced in the source text]
As can be seen from Table 1, compared with the method of comparison document 1, the method of the present invention achieves a markedly better result: the accuracy is improved by 12.1%, the recall rate by 5.3%, and the F value by 8.4%.

Claims (3)

1. A method for extracting video key frames by using video compression coding information, characterized by comprising the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): formula image not reproduced in the source text]
where D_{x,y} and R_{x,y} denote the distortion and the number of coding bits of the (x, y)-th pixel in the coding unit, respectively, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ ≥ 0 is the Lagrange coefficient, W and H are finite positive integers, and W > H;
determine the depth feature vector F_n of the coded frame according to equation (2):
F_n = {f_1, f_2, …, f_α} (2)
[Formula image accompanying equation (2) not reproduced in the source text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer, round() is an upward rounding function, and f_α is the depth value of a coding unit, taking any one of the values 0, 1, 2, and 3;
determine the number of frame bits R_n according to equation (3):
[Equation (3): formula image not reproduced in the source text]
(2) Shot switching detection
Count the number of frame bits R_n of each coded frame and plot it as a line chart for analysis; mark each position where the value gradually increases and then gradually decreases as a shot switch, one shot segment lying between two adjacent shot switches, the length of a shot segment being M, where M is a finite positive integer and M < N, to obtain K shot segments, where K is a finite positive integer;
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its auxiliary definitions: formula images not reproduced in the source text]
where F_i and F_j denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N};
determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
L × y = β × D × y (5)
Y = [y_1, y_2, …, y_K] (6)
where y_1, y_2, …, y_K are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues;
perform K-means clustering on the matrix Y, and determine the distance d_m between the cluster center μ and each frame in the shot according to equation (7):
d_m = ||y_m - μ||_2 (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment and is a finite positive integer, and M < N;
the frame with the smallest distance d_m is taken as the key frame.
2. The method for extracting video key frames by using video compression coding information according to claim 1, characterized in that: in step (1), extracting depth and frame bit number features, W ranges from 176 to 7680, H ranges from 144 to 4320, and N ranges from 1000 to 7000.
3. The method for extracting video key frames by using video compression coding information according to claim 1, characterized in that: in step (2), shot switching detection, K ranges from 5 to 20.
CN202011642920.5A 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information Active CN112770116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011642920.5A CN112770116B (en) 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information

Publications (2)

Publication Number Publication Date
CN112770116A true CN112770116A (en) 2021-05-07
CN112770116B CN112770116B (en) 2021-12-07

Family

ID=75698646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011642920.5A Active CN112770116B (en) 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information

Country Status (1)

Country Link
CN (1) CN112770116B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697761A (en) * 2022-04-07 2022-07-01 脸萌有限公司 Processing method, processing device, terminal equipment and medium
CN116723335A (en) * 2023-06-29 2023-09-08 西安邮电大学 Method for extracting video key frame by video compression coding information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6690725B1 (en) * 1999-06-18 2004-02-10 Telefonaktiebolaget Lm Ericsson (Publ) Method and a system for generating summarized video
CN101453649A (en) * 2008-12-30 2009-06-10 浙江大学 Key frame extracting method for compression domain video stream
US20110225196A1 (en) * 2008-03-19 2011-09-15 National University Corporation Hokkaido University Moving image search device and moving image search program
CN105979267A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video compression and play method and device
GB201802317D0 (en) * 2015-08-29 2018-03-28 Univ Warwick Image compression
CN108632625A (en) * 2017-03-21 2018-10-09 华为技术有限公司 A kind of method for video coding, video encoding/decoding method and relevant device
CN111984942A (en) * 2020-07-23 2020-11-24 西安理工大学 Robust video zero-watermarking method based on extremely complex exponential transformation and residual error neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
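SUDEEP D. THEPADE et al.: "An Optimized Key Frame Extraction for Detection of Near Duplicates in Content Based Video Retrieval", 《2014 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING》 *
贺鹏 (He Peng): "Research on an Adaptive-Threshold-Based Key Frame Extraction Algorithm for MPEG Video in the Compressed Domain", 《China Master's Theses Full-text Database (Electronic Journal)》 *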

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697761A (en) * 2022-04-07 2022-07-01 脸萌有限公司 Processing method, processing device, terminal equipment and medium
CN114697761B (en) * 2022-04-07 2024-02-13 脸萌有限公司 Processing method, processing device, terminal equipment and medium
CN116723335A (en) * 2023-06-29 2023-09-08 西安邮电大学 Method for extracting video key frame by video compression coding information

Also Published As

Publication number Publication date
CN112770116B (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN107657228B (en) Video scene similarity analysis method and system, and video encoding and decoding method and system
Wang et al. Towards analysis-friendly face representation with scalable feature and texture compression
Duan et al. Compact descriptors for visual search
CN112770116B (en) Method for extracting video key frame by using video compression coding information
KR20020031015A (en) Non-linear quantization and similarity matching methods for edge histogram bins
CN103020138A (en) Method and device for video retrieval
CN106777159B (en) Video clip retrieval and positioning method based on content
CN107547902B (en) Adaptive rate distortion optimization method for surveillance video coding
CN110188625B (en) Video fine structuring method based on multi-feature fusion
CN108833928B (en) Traffic monitoring video coding method
CN109359530B (en) Intelligent video monitoring method and device
CN107682699B (en) A kind of nearly Lossless Image Compression method
CN112565793B (en) Image lossless compression method based on prediction difference value classification entropy coding
Dai et al. HEVC video steganalysis based on PU maps and multi-scale convolutional residual network
Ouyang et al. The comparison and analysis of extracting video key frame
Khmelevskiy et al. Model of Transformation of the Alphabet of the Encoded Data as a Tool to Provide the Necessary Level of Video Image Quality in Aeromonitoring Systems.
CN116896638A (en) Data compression coding technology for transmission operation detection scene
CN115858855A (en) Video data query method based on scene characteristics
WO2005046213A1 (en) Document image encoding/decoding
CN113784147B (en) Efficient video coding method and system based on convolutional neural network
KR20220045920A (en) Method and apparatus for processing images/videos for machine vision
CN107194961A (en) The determination method of multiple reference images in colony's Image Coding
CN114005069A (en) Video feature extraction and retrieval method
CN116723335B (en) Method for extracting video key frame by video compression coding information
Monteiro et al. Clustering based binary descriptor coding for efficient transmission in visual sensor networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant