WO2009138037A1 - Video service system, video service apparatus and extracting method of key frame thereof - Google Patents

Video service system, video service apparatus and extracting method of key frame thereof Download PDF

Info

Publication number
WO2009138037A1
WO2009138037A1 (PCT Application No. PCT/CN2009/071783)
Authority
WO
WIPO (PCT)
Prior art keywords
frame
vector
motion vector
feature vector
motion
Prior art date
Application number
PCT/CN2009/071783
Other languages
French (fr)
Chinese (zh)
Inventor
邸佩云
胡昌启
元辉
马彦卓
常义林
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2009138037A1 publication Critical patent/WO2009138037A1/en

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H04N 5/147 Scene change detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136 Incoming video signal characteristics or properties
    • H04N 19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N 19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Definitions

  • the embodiments of the present invention relate to the field of communications technologies, and in particular, to a video service system, a video service device, and a method for extracting key frames thereof.
  • for an object, an abnormal state means that its motion state changes significantly: for example, from rest to motion, from motion to rest, a change in the direction of motion, or a significant change in speed.
  • the change of the video scene is a reflection of the significant change of the motion state of the object in the scene.
  • scene change also includes scene switching; a switch can be regarded as the objects of the original scene suddenly moving off to infinity while the objects of the new scene move in from infinity, so that the motion state of the objects changes drastically.
  • frames are used to describe video information, and key frames are frames that best represent video information.
  • the so-called key frame refers to the frame in which the object in the scene has abnormal motion.
  • the scenes of the other frames between abnormal frames remain in a normal state.
  • a scene refers to a set of several shots with related content.
  • the inter-frame difference method is applied to the kth frame and the (k-1)th frame to obtain a rough outline of the moving objects (the first contour); a multi-level edge detection algorithm is then used to obtain the contours of all objects in the kth frame (the second contour), and the second contour is ANDed with the first contour to obtain a contour clearer than the first (the third contour); a rectangular box is then added around each moving object on the basis of the third contour, and the edge contour of the moving object is obtained with the Geodesic Active Contour Model of the Level Set Method; key frames are finally selected by judging the appearance, disappearance, displacement, and shape changes of the edge contours of the moving objects.
  • the contour information of all moving objects must be extracted and processed; because the contour extraction algorithm is complex, the computational load of the first prior art is heavy;
  • the first prior art extracts key frames where motion changes from rest to motion or from motion to rest, but it cannot extract key frames where the speed changes suddenly from a uniform speed.
  • the second prior art does not take direction into account, that is, it cannot capture a moving object whose speed is uniform but whose direction of motion changes, and therefore cannot extract the corresponding key frame.
  • when the kth frame contains multiple moving objects, some with large motion vectors and others with very small ones, the averaging may yield a small perceived motion energy value for the kth frame that does not properly reflect its change, leading to misjudgment, that is, the kth frame cannot be selected as a key frame.
  • an embodiment of the present invention provides a key frame extraction method to solve the problem that prior-art solutions cannot accurately extract key frames in which motion is uniform in speed but changes in direction.
  • a key frame extraction method including:
  • a video service device comprising:
  • a video key frame extraction module, configured to obtain a feature vector set of the motion vectors of each frame according to the motion vectors of each frame in an acquired video data stream; determine whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change; and extract key frames from the video data stream according to the result of that determination.
  • a video service system comprising the video service device and a user terminal device; the video service device obtains a feature vector set of the motion vectors of each frame according to the motion vectors of each frame in an acquired video data stream, determines whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change, extracts key frames from the video data stream according to the result of that determination, and then provides the key frames to the user terminal device.
  • the video service device, video service system, and key frame extraction method use the motion vectors of a frame to obtain a feature vector set and extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change; frames with sudden speed changes, and frames with uniform speed but changing direction, can thus be extracted effectively, which reduces the error rate and complexity of key frame extraction.
  • FIG. 1 is a system diagram of a video service system according to Embodiment 1 of the present invention.
  • FIG. 2 is a block diagram of a video service device according to Embodiment 1 of the present invention.
  • FIG. 3 is a block diagram of a video service apparatus according to Embodiment 2 of the present invention.
  • FIG. 4 is a block diagram of a video service device according to Embodiment 3 of the present invention.
  • FIG. 5 is a flowchart of a method for extracting a key frame according to Embodiment 4 of the present invention.
  • FIG. 6 is a histogram of x-component vectors in a key frame extraction method according to Embodiment 4 of the present invention.
  • FIG. 7 is a histogram of y-component vectors in a key frame extraction method according to Embodiment 4 of the present invention.
  • FIG. 1 is a system diagram of a video service system 10 according to Embodiment 1 of the present invention.
  • the video service system 10 includes: a video service device 20 and a user terminal device 30.
  • the video service device 20 and the user terminal device 30 are connected for communication through a network (not shown), or are placed together in the same video terminal device.
  • the video service device 20 is configured to extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames in the video stream change, to classify the extracted key frames by level, and to provide them to the user terminal device 30.
  • the video service device 20 can be a video retrieval service device or a video transmission service device or a video encoding service device.
  • FIG. 2 is a block diagram of a video service device 20 according to Embodiment 1 of the present invention.
  • the video service device 20 is a video retrieval service device and is used to provide a video retrieval information service for the user terminal device 30.
  • the video service device 20 includes a video storage module 200 and a video key frame extraction module 210.
  • the user terminal device 30 includes a key frame temporary storage unit 300 and a user search and play interface 310.
  • the video storage module 200, the key frame temporary storage unit 300, and the user search and play interface 310 are all well-known techniques, and their functions are not described in detail here.
  • the key frame extraction module 210 is configured to acquire a motion vector of each frame in the video data stream, and acquire a feature vector set of the motion vector.
  • the key frame extraction module 210 groups motion vectors having the same value into motion vector sets and takes the set containing the largest number of motion vectors as the feature vector set.
  • the video key frame extraction module 210 decomposes each motion vector into an x-axis component vector and a y-axis component vector; it first extracts the x-component value that is identical and occurs most often (or the y-component value that is identical and occurs most often), then extracts, among the y-component values corresponding one-to-one to that x-component value, the one that occurs most often (or, correspondingly, the most frequent x-component value corresponding to that y-component value); the set of motion vectors (x, y) formed in this way is taken as the feature vector set.
  • in other embodiments, the feature vector set may also be acquired by combining amplitude and angle, or the motion vector sets of the background and the foreground may be separated by a clustering method, in which case the motion vector set of the foreground is the feature vector set.
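  • A minimal sketch of this selection rule, assuming each motion vector has already been decomposed into integer (x, y) components; the function name and data layout are illustrative and not taken from the patent:

```python
from collections import Counter
from typing import List, Tuple

def feature_vector_set(motion_vectors: List[Tuple[int, int]]) -> Tuple[Tuple[int, int], int]:
    """Group motion vectors with identical (x, y) values and return the value that
    occurs most often, i.e. the set containing the largest number of motion vectors."""
    value, count = Counter(motion_vectors).most_common(1)[0]
    return value, count

# Example: the dominant vector (2, 0) is taken as the feature vector set of the frame.
vectors = [(2, 0), (2, 0), (2, 0), (-1, 3), (0, 0)]
print(feature_vector_set(vectors))  # ((2, 0), 3)
```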
  • the key frame extraction module 210 is further configured to extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change. In this embodiment, the key frame extraction module 210 extracts key frames by determining whether the direction and amplitude of the motion vector of the feature vector set of the kth frame differ from those of the (k-1)th frame.
  • the key frame extraction module 210 decomposes the direction of the motion vector corresponding to the feature vector set into the x-axis direction and the y-axis direction, and expresses the amplitude of the motion vector as the sum of the sizes of the x-component vector and the y-component vector. In this embodiment, the key frame extraction module 210 takes the kth frame as a key frame when the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, or the direction of the y-component vector has changed relative to that of the (k-1)th frame, or the amplitude of the motion vector of the feature vector set of the kth frame differs from that of the (k-1)th frame by more than a predetermined threshold.
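  • A minimal sketch of this decision rule, with the direction of a component represented by its sign and the amplitude by the sum of the component sizes; the 60-unit default mirrors the threshold given in Embodiment 4 below, and all names are illustrative:

```python
from typing import Tuple

Vector = Tuple[int, int]

def sign(v: int) -> int:
    """Direction of a component vector: +1 (positive), 0, or -1 (negative)."""
    return (v > 0) - (v < 0)

def amplitude(vec: Vector) -> int:
    """Amplitude expressed as the sum of the x-component and y-component sizes."""
    return abs(vec[0]) + abs(vec[1])

def is_key_frame(prev_vec: Vector, curr_vec: Vector, threshold: int = 60) -> bool:
    """Frame k is a key frame if the direction of either component of its feature
    motion vector changes relative to frame k-1, or if the amplitudes differ by
    more than the predetermined threshold."""
    direction_changed = (sign(prev_vec[0]) != sign(curr_vec[0])
                         or sign(prev_vec[1]) != sign(curr_vec[1]))
    amplitude_changed = abs(amplitude(curr_vec) - amplitude(prev_vec)) > threshold
    return direction_changed or amplitude_changed
```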
  • the key frame extraction module 210 is further configured to determine the category of the extracted key frame after extracting the key frame.
  • the key frame categories are classified into a first type of key frame, a second type of key frame, and a third type of key frame.
  • the first type of key frames are excellent level key frames
  • the second type of key frames are good level key frames
  • the third type of key frames are general level key frames.
  • the key frame extraction module 210 determines the category of an extracted key frame, that is, its level, by judging whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of the two adjacent frames change.
  • when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that the direction of the y-component vector has also changed relative to that of the (k-1)th frame, and that the amplitudes of the two motion vectors differ by more than a predetermined threshold, the category of the kth frame is the first type of key frame, that is, the kth frame is classified as an excellent-level key frame.
  • when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that the direction of the y-component vector has also changed relative to that of the (k-1)th frame, and that the amplitudes of the two motion vectors differ by no more than the predetermined threshold, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • in other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, or that the direction of the y-component vector has changed relative to that of the (k-1)th frame, and that the amplitudes of the two motion vectors differ by more than the predetermined threshold, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • in other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame and that the direction of the y-component vector has also changed relative to that of the (k-1)th frame, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, or that the direction of the y-component vector has changed relative to that of the (k-1)th frame, and that the amplitudes of the two motion vectors differ by no more than the predetermined threshold, the category of the kth frame is the third type of key frame, that is, the kth frame is classified as a general-level key frame.
  • in other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, or that the direction of the y-component vector has changed relative to that of the (k-1)th frame, the category of the kth frame is the third type of key frame, that is, the kth frame is classified as a general-level key frame.
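  • The combinations above can be summarized in a short sketch that reuses sign() and amplitude() from the earlier sketch; a key frame matched by none of the listed combinations is returned as unclassified, which corresponds to the key frame transmitted without a category in step S314 below. All names are illustrative:

```python
from typing import Optional, Tuple

Vector = Tuple[int, int]

def classify_key_frame(prev_vec: Vector, curr_vec: Vector,
                       threshold: int = 60) -> Optional[int]:
    """Return 1 (excellent), 2 (good) or 3 (general) for an extracted key frame,
    or None when none of the listed direction/amplitude combinations applies."""
    x_changed = sign(prev_vec[0]) != sign(curr_vec[0])
    y_changed = sign(prev_vec[1]) != sign(curr_vec[1])
    amp_changed = abs(amplitude(curr_vec) - amplitude(prev_vec)) > threshold

    if x_changed and y_changed and amp_changed:
        return 1  # first type: excellent level
    if (x_changed and y_changed) or ((x_changed or y_changed) and amp_changed):
        return 2  # second type: good level
    if x_changed or y_changed:
        return 3  # third type: general level
    return None   # key frame extracted on amplitude alone, left unclassified
```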
  • the video key frame extraction module 210 extracts key frames from the video data stream of the video storage module 200 and transmits the classified key frames to the key frame temporary storage unit 300 of the user terminal device 30, so that the user search and play interface 310 can play the key frame information.
  • the video key frame extraction module 210 classifies key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change. When network communication quality is poor, non-key frames are discarded first; if the quality deteriorates further, lower-level key frames are discarded, so that the information of interest to the user is better protected.
  • FIG. 3 is a block diagram of a video service device 20 according to Embodiment 2 of the present invention.
  • the video service device 20 is a video transmission service device, and further includes a video collection module 220, a video encoding module 230, and a scalable network transmission module 240.
  • the video collection module 220 is connected to the video key frame extraction module 210 and the video encoding module 230
  • the video encoding module 230 is connected to the video key frame extraction module 210 and the scalable network transmission module 240.
  • the scalable network transmission module 240 is coupled to the video encoding module 230, the video keyframe extraction module 210, and the video storage module 200.
  • the video key frame extraction module 210 directly extracts key frames from the compressed data stream transmitted by the video storage module 200, and then sends the key frame location and level information, together with the compressed data stream, to the scalable network transmission module 240.
  • the scalable network transmission module 240 transmits the data stream by selecting a corresponding protection policy according to the key frame information, or a frame-dropping policy when the rate is limited.
  • alternatively, the key frame extraction module 210 extracts the key frame information from the original video data stream transmitted by the video collection module 220, and, at the same time, the video encoding module 230 encodes the original video data stream into a compressed video data stream, which is then passed, together with the key frame information, to the scalable network transmission module 240.
  • the video service device 20 is a video encoding service device, further including a variable Group of Pictures (GOP) video encoding module 250.
  • the video collection module 220 is connected to the video key frame extraction module 210 and the variable GOP video encoding module 250; the variable GOP video encoding module 250 is connected to the video key frame extraction module 210, the video storage module 200, the video collection module 220, and the scalable network transmission module 240.
  • the variable GOP video encoding module 250 encodes the key frames as I frames, thereby implementing unequal-length GOP encoding, which can improve encoding efficiency. Since the key frames are graded, when two high-level key frames are far apart, one or several low-level key frames can be inserted between them, so that video played from a random access point does not lose too many frames.
  • the variable GOP video encoding module 250 divides the stream into a Group of Pictures (GOP) between every two key frames. This GOP division gives the code stream robust transmission characteristics, facilitating unequal protection and convenient frame-dropping strategies during transmission, and also provides high compression efficiency and good access characteristics: the correlation inside a GOP is strong and easy to remove, and each access point is a key frame, which matches the characteristics of the human eye.
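  • A small sketch of this variable-length GOP division, assuming the key frames have already been identified by frame index; the function name is illustrative and no particular encoder is implied:

```python
from typing import List

def split_into_gops(num_frames: int, key_frame_indices: List[int]) -> List[List[int]]:
    """Divide a sequence into variable-length GOPs: each GOP starts at a key frame
    (encoded as an I frame) and runs until the frame before the next key frame."""
    starts = sorted(set([0] + key_frame_indices))
    gops = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else num_frames
        gops.append(list(range(start, end)))
    return gops

# Example: key frames at 0, 37 and 90 in a 120-frame clip give three unequal GOPs.
print([len(g) for g in split_into_gops(120, [0, 37, 90])])  # [37, 53, 30]
```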
  • FIG. 5 is a flowchart of a method for extracting a key frame according to Embodiment 4 of the present invention.
  • step S300 a video data stream is received.
  • step S302 a motion vector of each frame is acquired from the video data stream.
  • the motion vectors of each frame are decomposed separately; the decomposition can be done along the coordinate axes, each motion vector being decomposed into an x-direction component vector and a y-direction component vector, so that each motion vector can be expressed as (xi, yi).
  • step S304 a feature vector set of motion vectors for each frame is acquired.
  • motion vectors having the same motion vector value are grouped into a motion vector set, and a motion vector set having the largest number of motion vectors is used as a feature vector set.
  • specifically: first extract the x-component vector value that is identical and occurs most often, or the y-component vector value that is identical and occurs most often; then, among the y-component values corresponding one-to-one to that x-component value, extract the y-component value that occurs most often, or, among the x-component values corresponding one-to-one to that y-component value, extract the x-component value that occurs most often.
  • in this embodiment, the method of establishing a one-dimensional histogram is used for explanation.
  • the most frequent x-component value xi_most of the motion vectors is extracted first, and then the y-component value yi_most corresponding to xi_most is extracted.
  • alternatively, the y-component value yi_most may be extracted first, and then the corresponding x-component value xi_most.
  • the feature vector set may also be acquired by using a combination of amplitude and angle, and the motion vector set of the background and the foreground may be extracted by a clustering method, and the motion vector set of the foreground is a feature vector set.
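  • The one-dimensional histogram procedure can be sketched as follows; it first picks the most frequent x-component value xi_most (as in FIG. 6) and then the most frequent y-component value among the vectors sharing that x value (as in FIG. 7). The function name is illustrative:

```python
from collections import Counter
from typing import List, Tuple

def histogram_feature_vector(motion_vectors: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick xi_most from the histogram of x components, then yi_most from the
    histogram of y components of the vectors whose x component equals xi_most."""
    xi_most, _ = Counter(x for x, _ in motion_vectors).most_common(1)[0]
    yi_most, _ = Counter(y for x, y in motion_vectors if x == xi_most).most_common(1)[0]
    return xi_most, yi_most

print(histogram_feature_vector([(2, 0), (2, 1), (2, 1), (-1, 3)]))  # (2, 1)
```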
  • in step S306 it is determined whether each frame is a key frame. In this embodiment, each frame is judged by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of the two adjacent frames change; that is, it is determined whether the kth frame is a key frame by judging whether the direction and amplitude of the motion vector of the feature vector set of the kth frame differ from those of the (k-1)th frame.
  • if it is determined that the kth frame is not a key frame, the method continues by determining whether the (k+1)th frame is a key frame, that is, step S306 is repeated; if the kth frame is determined to be a key frame, it is extracted as a key frame and step S308 is performed.
  • the direction of the motion vector can be decomposed into an x-axis direction and a y-axis direction, and the amplitude of the motion vector is represented by the sum of the sizes of the x-component vector and the y-component vector.
  • for the x-component vector, if the x value is positive its direction is represented by +, if it is 0 it is represented by 0, and if it is negative it is represented by -. The same applies to the direction of the y-component vector.
  • if it is determined that the direction of the x-component vector of the motion vector of the feature vector set of the kth frame differs from that of the (k-1)th frame, that is, the direction of the x-component vector changes from + to -, from - to +, from 0 to non-zero, or from non-zero to 0, then the kth frame is a key frame and is extracted as a key frame.
  • likewise, if the direction of the y-component vector of the motion vector of the feature vector set of the kth frame has changed, that is, from + to -, from - to +, from 0 to non-zero, or from non-zero to 0, the kth frame is extracted as a key frame.
  • similarly, if the amplitude of the motion vector of the feature vector set of the kth frame differs from that of the (k-1)th frame by more than a predetermined threshold, the kth frame is a key frame and is extracted as a key frame.
  • the threshold value is 60. In other embodiments of the invention, the threshold value may also be other values.
  • step S308 it is determined whether the category of the key frame is the first type of key frame.
  • the categories of the key frames are divided into a first type of key frame, a second type of key frame, and a third type of key frame.
  • the first type of key frames are excellent level key frames
  • the second type of key frames are good class key frames
  • the third type of key frames are general class key frames.
  • the category of the key frame is judged by judging whether the direction and the amplitude of the motion vector corresponding to the feature vector set of the two adjacent frames are changed, that is, the key frame is classified.
  • if the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, the direction of the y-component vector has also changed, and the amplitudes differ by more than the predetermined threshold, the category of the key frame is the first type of key frame and step S316 is performed; otherwise, step S310 is performed.
  • step S310 it is determined whether the category of the key frame is the second type of key frame. If it is determined that the category of the key frame is the second type of key frame, step S316 is performed; otherwise, step S312 is performed.
  • if the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, the direction of the y-component vector has also changed, and the amplitudes differ by no more than the predetermined threshold, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • if the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that is, the direction of the x-component vector changes from + to - or from - to +, and the amplitudes of the motion vectors of the feature vector sets of the kth frame and the (k-1)th frame differ by more than the predetermined threshold, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • if the direction of the y-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that is, the direction of the y-component vector changes from + to - or from - to +, and the amplitudes of the motion vectors of the feature vector sets of the kth frame and the (k-1)th frame differ by more than the predetermined threshold, the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • if the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that is, from 0 to non-zero or from non-zero to 0, and the direction of the y-component vector has also changed relative to that of the (k-1)th frame, that is, from 0 to non-zero or from non-zero to 0, then the category of the kth frame is the second type of key frame, that is, the kth frame is classified as a good-level key frame.
  • step S312 it is determined whether the category of the key frame is a third type of key frame. If it is determined that the category of the key frame is the third type of key frame, step S316 is performed; otherwise, step S314 is performed.
  • if the direction of the y-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, that is, the direction of the y-component vector changes from + to - or from - to +, and the amplitudes of the motion vectors of the feature vector sets of the kth frame and the (k-1)th frame differ by no more than the predetermined threshold, the category of the kth frame is the third type of key frame, that is, the kth frame is classified as a general-level key frame.
  • in step S314, the key frame that has not been assigned a category is transmitted to the user terminal device 30.
  • in step S316, the key frame together with its category is transmitted to the user terminal device 30.
  • the "predetermined threshold value" in the above embodiment of the present invention may be a constant value or a value that varies depending on the scene.
  • the video service device 20, the video service system 10, and the key frame extraction method provided by the embodiments use the motion vectors of a frame to obtain a feature vector set and extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change. Frames with sudden speed changes, and frames with uniform speed but changing direction, can thus be extracted effectively, which reduces the error rate, complexity, and amount of computation of key frame extraction. Likewise, by classifying the key frames according to the direction and amplitude of the motion vectors, non-key frames can be discarded first when network communication quality is poor, and lower-level key frames can be discarded if the quality deteriorates further, so that the information of interest to the user is better protected.
  • the present invention can be implemented by hardware, or by means of software plus a necessary general-purpose hardware platform.
  • the technical solution can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A key frame extraction method, a video service apparatus, and a video service system are disclosed; the method is used to extract key frames from a video data stream in the video service system. The method comprises: obtaining the motion vectors of each frame in the video data stream and obtaining a feature vector set from the motion vectors; determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change; and extracting key frames using the result of the determination. Frames whose speed changes abruptly can thus be extracted effectively.

Description

Description: Video service system, video service apparatus, and method for extracting key frames thereof

[1] This application claims priority to Chinese Patent Application No. 200810067177.8, filed on May 13, 2008 and entitled "Video service system, video service apparatus and method for extracting key frames thereof", which is incorporated herein by reference in its entirety.
[2] Technical Field

[3] The embodiments of the present invention relate to the field of communications technologies, and in particular to a video service system, a video service apparatus, and a method for extracting key frames thereof.

[4] Background Art

[5] When observing the objective world, people are usually most interested in abnormal events and obtain a great deal of information from them. For an object, an abnormal state means that its motion state changes significantly: for example, from rest to motion, from motion to rest, a change in the direction of motion, or a significant change in speed. Similarly, when watching video, people focus on changes in the scene, and a change in the video scene reflects a significant change in the motion state of the objects in the scene. Scene change also includes scene switching: a switch can be regarded as the objects of the original scene suddenly moving off to infinity while the objects of the new scene move in from infinity, so that the motion state of the objects changes drastically.
[6] Frames are used to describe video information, and key frames are the frames that best represent the video information. A key frame is a frame in which an object in the scene moves abnormally; the scenes of the other frames between abnormal frames remain in a normal state, where a scene refers to a set of several shots with related content.

[7] In a first prior-art approach to key frame extraction, an inter-frame difference is computed between the kth frame and the (k-1)th frame to obtain a rough outline of the moving objects (the first contour). A multi-level edge detection algorithm is then used to obtain the contours of all objects in the kth frame (the second contour), and the second contour is ANDed with the first contour to obtain a contour clearer than the first (the third contour). A rectangular box is then added around each moving object on the basis of the third contour, the edge contour of the moving object is obtained with the Geodesic Active Contour Model of the Level Set Method, and key frames are finally selected by judging the appearance, disappearance, displacement, and shape changes of the edge contours of the moving objects.

[8] However, in the course of implementing the present invention, the inventors found that the first prior art has at least the following problems. First, it must extract the contour information of all moving objects and perform computations on that information; because the contour extraction algorithm is complex, the computational load is heavy. Second, it extracts key frames where motion changes from rest to motion or from motion to rest, but it cannot extract key frames where the speed changes suddenly from a uniform speed.

[9] In a second prior-art approach, the motion vector information of the moving objects in each frame of the video stream, that is, the magnitude of the velocity, is extracted and averaged per frame; each frame's motion vector information is then represented by a perceived motion energy value, and a perceived motion energy map of the moving objects over all frames is formed, in which a rising perceived motion energy value indicates acceleration and a falling value indicates deceleration. A triangle model analyzer then divides the perceived motion energy map into motion units, with boundaries placed at the minima of the perceived motion energy values marking the start and end of each triangle; each motion unit is adjusted with the triangle model, and key frames are selected.

[10] However, in the course of implementing the present invention, the inventors found that the second prior art has at least the following problems. First, it does not take direction into account, so it cannot capture a moving object whose speed is uniform but whose direction of motion changes, and therefore cannot extract the corresponding key frames. Second, when the kth frame contains multiple moving objects, some with large motion vectors and others with very small ones, the averaging may yield a small perceived motion energy value for the kth frame that does not properly reflect its change, leading to misjudgment, that is, the kth frame may not be selected as a key frame.
[11] Summary of the Invention

[12] The embodiments of the present invention provide a key frame extraction method to solve the problem that prior-art solutions cannot accurately extract key frames in which motion is uniform in speed but changes in direction.

[13] A key frame extraction method includes:

[14] obtaining a feature vector set of the motion vectors of each frame according to the motion vectors of each frame in an acquired video data stream;

[15] determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change; and

[16] extracting key frames from the video data stream according to the result of the determination.

[17] A video service apparatus includes:

[18] a video key frame extraction module, configured to obtain a feature vector set of the motion vectors of each frame according to the motion vectors of each frame in an acquired video data stream; determine whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change; and extract key frames from the video data stream according to the result of the determination.

[19] A video service system includes the video service apparatus and a user terminal device. The video service apparatus obtains a feature vector set of the motion vectors of each frame according to the motion vectors of each frame in an acquired video data stream, determines whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change, extracts key frames from the video data stream according to the result of the determination, and then provides the key frames to the user terminal device.

[20] The video service apparatus, video service system, and key frame extraction method provided by the present invention use the motion vectors of a frame to obtain a feature vector set and extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change. Frames with sudden speed changes, and frames with uniform speed but changing direction, can thus be extracted effectively, reducing the error rate and complexity of key frame extraction.
[21] Brief Description of the Drawings

[22] FIG. 1 is a system diagram of a video service system according to Embodiment 1 of the present invention;

[23] FIG. 2 is a block diagram of a video service apparatus according to Embodiment 1 of the present invention;

[24] FIG. 3 is a block diagram of a video service apparatus according to Embodiment 2 of the present invention;

[25] FIG. 4 is a block diagram of a video service apparatus according to Embodiment 3 of the present invention;

[26] FIG. 5 is a flowchart of a key frame extraction method according to Embodiment 4 of the present invention;

[27] FIG. 6 is a histogram of x-component vectors in the key frame extraction method according to Embodiment 4 of the present invention;

[28] FIG. 7 is a histogram of y-component vectors in the key frame extraction method according to Embodiment 4 of the present invention.
[29] Specific Embodiments

[30] FIG. 1 is a system diagram of a video service system 10 according to Embodiment 1 of the present invention. In this embodiment, the video service system 10 includes a video service device 20 and a user terminal device 30; the video service device 20 and the user terminal device 30 are connected for communication through a network (not shown), or are placed together in the same video terminal device. In this embodiment, the video service device 20 is configured to extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames in the video stream change, to classify the extracted key frames by level, and to provide them to the user terminal device 30. In this embodiment, the video service device 20 may be a video retrieval service device, a video transmission service device, or a video encoding service device.

[31] FIG. 2 is a block diagram of the video service device 20 according to Embodiment 1 of the present invention. In this embodiment, the video service device 20 is a video retrieval service device used to provide a video retrieval information service for the user terminal device 30. In this embodiment, the video service device 20 includes a video storage module 200 and a video key frame extraction module 210. The user terminal device 30 includes a key frame temporary storage unit 300 and a user search and play interface 310. The video storage module 200, the key frame temporary storage unit 300, and the user search and play interface 310 are all well-known techniques, and their functions are not described in detail here.

[32] In this embodiment, the key frame extraction module 210 is configured to acquire the motion vectors of each frame in the video data stream and obtain a feature vector set of the motion vectors. In this embodiment, the key frame extraction module 210 groups motion vectors having the same value into motion vector sets and takes the set containing the largest number of motion vectors as the feature vector set. In this embodiment, the video key frame extraction module 210 decomposes each motion vector into an x-axis component vector and a y-axis component vector; it first extracts the x-component value that is identical and occurs most often (or the y-component value that is identical and occurs most often), then extracts, among the y-component values corresponding one-to-one to that x-component value, the one that occurs most often (or, correspondingly, the most frequent x-component value corresponding to that y-component value); the set of motion vectors (x, y) formed in this way is taken as the feature vector set. In other embodiments of the present invention, the feature vector set may also be obtained by combining amplitude and angle, or the motion vector sets of the background and foreground may be separated by clustering, in which case the foreground motion vector set is the feature vector set.

[33] In this embodiment, the key frame extraction module 210 is further configured to extract key frames by determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of two adjacent frames change. In this embodiment, the key frame extraction module 210 extracts key frames by determining whether the direction and amplitude of the motion vector of the feature vector set of the kth frame differ from those of the (k-1)th frame.

[34] In this embodiment, the key frame extraction module 210 decomposes the direction of the motion vector corresponding to the feature vector set into the x-axis direction and the y-axis direction, and expresses the amplitude of the motion vector as the sum of the sizes of the x-component vector and the y-component vector. In this embodiment, the key frame extraction module 210 takes the kth frame as a key frame when the direction of the x-component vector of the motion vector of the feature vector set of the kth frame has changed relative to that of the (k-1)th frame, or the direction of the y-component vector has changed relative to that of the (k-1)th frame, or the amplitude of the motion vector of the feature vector set of the kth frame differs from that of the (k-1)th frame by more than a predetermined threshold.

[35] In this embodiment, the key frame extraction module 210 is further configured to determine the category of an extracted key frame after extraction. The categories are a first type, a second type, and a third type of key frame. In this embodiment, first-type key frames are excellent-level key frames, second-type key frames are good-level key frames, and third-type key frames are general-level key frames. In this embodiment, the key frame extraction module 210 determines the category of the extracted key frame, that is, its level, by judging whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of the two adjacent frames change.
[36] 在本实施例中, 关键帧提取模块 210通过判断第 k帧的特征矢量集合的运动矢量 的 X分矢量的方向相对第 k - 1帧的特征矢量集合的运动矢量的 X分矢量的方向发 生了变化, 并且通过判断第 k帧的特征矢量集合的运动矢量的 y分矢量的方向相 对第 k 1帧的特征矢量集合的运动矢量的 y分矢量的方向相比方向发生了变化, 同吋通过判断第 k帧的特征矢量集合的运动矢量的幅度与第 k - 1帧的特征矢量集 合的运动矢量的幅度相差大于一个预定的门限吋, 则第 k  [36] In this embodiment, the key frame extraction module 210 determines the direction of the X-segment vector of the motion vector of the feature vector set of the k-th frame relative to the X-segment vector of the motion vector of the feature vector set of the k-1th frame. The direction has changed, and the direction of the y-divided vector of the motion vector of the feature vector set of the k-th frame is changed from the direction of the y-divided vector of the motion vector of the feature vector set of the k-th frame,吋 By judging that the magnitude of the motion vector of the feature vector set of the kth frame differs from the magnitude of the motion vector of the feature vector set of the k-1 frame by more than a predetermined threshold, then k
帧的类别为第一类关键帧, 即将第 k帧的等级划分为优等级。  The class of the frame is the first type of key frame, that is, the level of the k-th frame is divided into excellent levels.
[37] 在本实施例中, 关键帧提取模块 210通过判断第 k帧的特征矢量集合的运动矢量 的 X分矢量的方向相对第 k - 1帧的特征矢量集合的运动矢量的 X分矢量的方向发 生了变化, 并且通过判断第 k帧的特征矢量集合的运动矢量的 y分矢量的方向相 对所述第 k - 1帧的特征矢量集合的运动矢量的 y分矢量的方向发生了变化, 及通 过判断第 k帧的特征矢量集合的运动矢量的幅度与第 k - 1帧的特征矢量集合的运 动矢量的幅度相差不大于预定的门限吋, 则第 k  [37] In this embodiment, the key frame extraction module 210 determines the direction of the X-segment vector of the motion vector of the feature vector set of the k-th frame relative to the X-segment vector of the motion vector of the feature vector set of the k-1th frame. The direction is changed, and the direction of the y-divided vector of the motion vector of the feature vector set of the k-th frame is changed by determining the direction of the y-divided vector of the motion vector of the feature vector set of the k-th frame, and By judging that the magnitude of the motion vector of the feature vector set of the kth frame is different from the magnitude of the motion vector of the feature vector set of the k-1 frame by no more than a predetermined threshold, then k
帧的类别为第二类关键帧, 即将第 k帧的等级划分为良等级。  The category of the frame is the second type of key frame, that is, the level of the k-th frame is divided into good levels.
[38] In other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the direction of the y component has changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by more than the predetermined threshold, the k-th frame is categorized as a second-class key frame, that is, it is graded as good.
[39] In other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame and that the direction of the y component has also changed, the k-th frame is categorized as a second-class key frame, that is, it is graded as good.
[40] In this embodiment, when the key frame extraction module 210 determines that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the direction of the y component has changed, and that the magnitudes of the two motion vectors differ by no more than the predetermined threshold, the k-th frame is categorized as a third-class key frame, that is, it is graded as general.
[41] In other embodiments of the present invention, when the key frame extraction module 210 determines that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the direction of the y component has changed, the k-th frame is categorized as a third-class key frame, that is, it is graded as general.
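Purely as an illustration of the grading rule of paragraphs [35] to [41], a minimal Python sketch is given below; the function name, the representative-vector arguments and the grade labels are assumptions made for this sketch only, not part of the claimed module:

```python
def _sign(v):
    """Direction of a component: +1 if positive, 0 if zero, -1 if negative."""
    return (v > 0) - (v < 0)


def classify_key_frame(prev_vec, curr_vec, threshold):
    """Grade frame k from the feature-set motion vectors of frames k-1 and k.

    prev_vec, curr_vec: the (x, y) motion vector represented by the feature
    vector set of frame k-1 and frame k; threshold is the predetermined
    magnitude threshold.
    """
    x_changed = _sign(curr_vec[0]) != _sign(prev_vec[0])
    y_changed = _sign(curr_vec[1]) != _sign(prev_vec[1])
    prev_mag = abs(prev_vec[0]) + abs(prev_vec[1])   # magnitude = |x| + |y|
    curr_mag = abs(curr_vec[0]) + abs(curr_vec[1])
    mag_changed = abs(curr_mag - prev_mag) > threshold
    if x_changed and y_changed and mag_changed:
        return "excellent"   # first-class key frame
    if (x_changed and y_changed) or ((x_changed or y_changed) and mag_changed):
        return "good"        # second-class key frame
    return "general"         # third-class key frame
```

The function is meant to be called only on frames that have already been detected as key frames, so the fall-through "general" case corresponds to a single direction change with a magnitude difference within the threshold.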
[42] In this embodiment, the video key frame extraction module 210 extracts key frames from the video data stream of the video storage module 200 and transmits the graded key frames to the key frame temporary storage unit 300 of the user terminal device 30, so that the user search and playback interface 310 can play the key frame information.
[43] In this embodiment, the video key frame extraction module 210 grades the key frames by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed. When the network communication quality is poor, non-key frames can be discarded first; if the communication quality deteriorates further, the lower-grade key frames are discarded, so that the information the user is interested in is better protected.
[44] FIG. 3 is a block diagram of the video service device 20 provided by the second embodiment of the present invention. In this embodiment, the video service device 20 is a video transmission service device and further comprises a video capture module 220, a video encoding module 230 and a scalable network transmission module 240. The video capture module 220 is connected to the video key frame extraction module 210 and the video encoding module 230; the video encoding module 230 is connected to the video key frame extraction module 210, the scalable network transmission module 240 and the video capture module 220; and the scalable network transmission module 240 is connected to the video encoding module 230, the video key frame extraction module 210 and the video storage module 200.
[45] In this embodiment, if the video service device 20 transmits a compressed video data stream, the video key frame extraction module 210 extracts key frames directly from the compressed data stream delivered by the video storage module 200, and then sends the positions and grade information of the key frames, together with the compressed data stream, to the scalable network transmission module 240. The scalable network transmission module 240 uses the key frame information to select a corresponding protection policy, or a frame-dropping policy when the bit rate is limited, for transmitting the data stream.
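As a toy illustration only of how such a frame-dropping policy might use the grades (the policy levels, names and data layout below are assumptions, not the device's actual interface):

```python
def frames_to_send(frames, congestion_level):
    """Select which frames to transmit as congestion increases.

    frames: list of (frame_id, grade), where grade is None for a non-key
    frame or one of 'general', 'good', 'excellent' for graded key frames.
    congestion_level: 0 = send everything, 1 = drop non-key frames,
    2 = also drop general-grade key frames, 3 = keep only excellent ones.
    """
    importance = {None: 0, "general": 1, "good": 2, "excellent": 3}
    return [(fid, grade) for fid, grade in frames
            if importance[grade] >= congestion_level]
```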
[46] If the video service device 20 transmits an original (uncompressed) video data stream, the key frame extraction module 210 extracts the key frame information from the original video data stream delivered by the video capture module 220 while the video encoding module 230 simultaneously encodes the original video data stream into a compressed video data stream, which is then passed, together with the key frame information, to the scalable network transmission module 240.
[47] FIG. 4 is a block diagram of the video service device 20 provided by the third embodiment of the present invention. In this embodiment, the video service device 20 is a video encoding service device and further comprises a variable group-of-pictures (GOP) layer video encoding module 250. The video capture module 220 is connected to the video key frame extraction module 210 and the variable GOP video encoding module 250, and the variable GOP video encoding module 250 is connected to the video key frame extraction module 210, the video storage module 200, the video capture module 220 and the scalable network transmission module 240. In this embodiment, the variable GOP video encoding module 250 encodes each key frame as an I frame, thereby achieving unequal-length GOP encoding, which can improve coding efficiency. Because the key frames have been graded, when two high-grade key frames are far apart, one or several lower-grade key frames can be inserted between them, so that not too many frames are lost when playback starts from a randomly accessed instant.
[48] In this embodiment, after the video key frame extraction module 210 has extracted the key frames, the variable GOP layer video encoding module 250 treats the interval between every two key frames as one group of pictures (GOP). This partitioning gives the bit stream robust transmission characteristics, making unequal-error-protection transmission and a convenient frame-dropping policy easy to implement, and it also yields high compression efficiency and good access characteristics: the strong correlation inside a GOP makes temporal redundancy easy to remove, and the access points are key frames, which matches the characteristics of the human eye.
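A rough sketch of such a GOP partition is shown below; the max_gop limit, the grade labels and the promotion rule for long gaps are illustrative assumptions only, not the encoder's actual behaviour:

```python
def plan_i_frames(num_frames, key_frames, max_gop=60):
    """Decide which frames become I frames (GOP starts).

    key_frames: list of (frame_index, grade).  Every high-grade ('excellent')
    key frame opens a GOP; when the gap to the next GOP boundary would exceed
    max_gop, a lower-grade key frame inside the gap is promoted to an I frame,
    or a plain frame is used as a fallback.
    """
    high = sorted(i for i, g in key_frames if g == "excellent")
    low = sorted(i for i, g in key_frames if g != "excellent")
    i_frames = sorted(set([0] + high))
    bounds = i_frames + [num_frames]
    for start, end in zip(bounds, bounds[1:]):
        last = start
        while end - last > max_gop:
            # prefer the latest lower-grade key frame within max_gop of the
            # current GOP start, otherwise fall back to a plain frame
            inside = [i for i in low if last < i <= last + max_gop]
            last = inside[-1] if inside else last + max_gop
            i_frames.append(last)
    return sorted(set(i_frames))
```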
[49] FIG. 5 is a flowchart of the key frame extraction method provided by the fourth embodiment of the present invention.
[50] In step S300, a video data stream is received.
[51] In step S302, the motion vector of each frame is obtained from the video data stream. In this embodiment, the motion vectors of each frame are decomposed; the decomposition may be performed along the coordinate axes, splitting each motion vector into a component in the x direction and a component in the y direction, so that each motion vector can be written as (xi, yi).
[52] In step S304, the feature vector set of the motion vectors of each frame is obtained. In this embodiment, motion vectors with identical values are grouped into motion vector sets, and the motion vector set containing the largest number of motion vectors is taken as the feature vector set.
[53] In this embodiment, this is done as follows: first the x component value shared by the largest number of motion vectors is extracted (or, alternatively, the most frequent y component value); then, under the condition that the x component equals that value, the most frequent value of the corresponding y components is extracted (or, in the alternative order, the most frequent x component under the fixed y component value). This is illustrated here with one-dimensional histograms. The x components of the motion vectors are analysed first and a histogram of the x components, i.e. a one-dimensional histogram, is built. The most frequent x component value in this histogram is denoted xi_most, where i = 1, ..., n, as shown in FIG. 6. Then the y components of the motion vectors whose x component equals xi_most are analysed: in the histogram of the y components, under the condition that the x component equals xi_most, the most frequent y component value yi_most is found, as shown in FIG. 7. In this embodiment, the set of the many motion vectors (xi_most, yi_most), i = 1, ..., n, is called the feature vector set.
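A minimal sketch of this histogram step, assuming the per-block motion vectors of one frame are available as (x, y) pairs (the function name and data layout are assumptions, and ties between equally frequent values are broken arbitrarily here):

```python
from collections import Counter


def feature_vector_set(motion_vectors):
    """Return the feature vector set of one frame.

    motion_vectors: list of (x, y) component pairs, one per block.
    Returns the list of motion vectors equal to the dominant (x_most, y_most)
    pair found via the two one-dimensional histograms described above.
    """
    if not motion_vectors:
        return []
    # 1-D histogram over the x components: most frequent x value.
    x_hist = Counter(x for x, _ in motion_vectors)
    x_most = x_hist.most_common(1)[0][0]
    # Among vectors whose x component equals x_most, histogram the y
    # components and keep the most frequent y value.
    y_hist = Counter(y for x, y in motion_vectors if x == x_most)
    y_most = y_hist.most_common(1)[0][0]
    return [(x, y) for x, y in motion_vectors if (x, y) == (x_most, y_most)]
```

As the next paragraph notes, the same routine can equally well start from the y histogram and then condition the x histogram on the chosen y value.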
[54] In other embodiments, the y component value yi_most may be extracted first and the x component value xi_most afterwards. In other embodiments, the feature vector set may also be obtained by a method combining magnitude and angle, or the motion vector sets of the background and of the foreground may be separated by clustering, in which case the motion vector set of the foreground is the feature vector set.
[55] In step S306, it is determined whether each frame is a key frame. In this embodiment, this is done by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed; specifically, whether the k-th frame is a key frame is determined by judging whether the direction and the magnitude of the motion vector of the feature vector set of the k-th frame have changed relative to those of the motion vector of the feature vector set of the (k-1)-th frame.
[56] If the k-th frame is judged not to be a key frame, the method continues by judging whether the (k+1)-th frame is a key frame, that is, step S306 is repeated; if the k-th frame is judged to be a key frame, the k-th frame is extracted as a key frame and step S308 is performed.
[57] In this embodiment, specifically, the direction of a motion vector is considered separately along the x axis and the y axis, and the magnitude of the motion vector is represented by the sum of the sizes of the x component and the y component. For the x component, if the x value is positive its direction is denoted +1, if it is 0 it is denoted 0, and if it is negative it is denoted -1; the direction of the y component is denoted in the same way.
[58] In this embodiment, if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. the direction of the x component changes from +1 to -1, from -1 to +1, from 0 to non-zero or from non-zero to 0, the k-th frame is a key frame and is extracted as such. In other embodiments of the present invention, if the direction of the y component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. the direction of the y component changes from +1 to -1, from -1 to +1, from 0 to non-zero or from non-zero to 0, the k-th frame is extracted as a key frame. In other embodiments of the present invention, if the magnitude of the motion vector of the feature vector set of the k-th frame differs from that of the (k-1)-th frame by more than a predetermined threshold, the k-th frame is a key frame and is extracted as such.
[59] In this embodiment, the threshold value is 60. In other embodiments of the present invention, the threshold may take other values.
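Combining paragraphs [57] to [59], the key frame test can be sketched as follows; the function name and the representative-vector arguments are assumptions made for this sketch, and the default threshold of 60 is simply the value given above:

```python
def is_key_frame(prev_vec, curr_vec, threshold=60):
    """Decide whether frame k is a key frame.

    prev_vec, curr_vec: the (x, y) motion vector represented by the feature
    vector set of frame k-1 and frame k.
    """
    def sign(v):                         # +1 positive, 0 zero, -1 negative
        return (v > 0) - (v < 0)

    x_dir_changed = sign(curr_vec[0]) != sign(prev_vec[0])
    y_dir_changed = sign(curr_vec[1]) != sign(prev_vec[1])
    prev_mag = abs(prev_vec[0]) + abs(prev_vec[1])   # magnitude = |x| + |y|
    curr_mag = abs(curr_vec[0]) + abs(curr_vec[1])
    magnitude_changed = abs(curr_mag - prev_mag) > threshold
    return x_dir_changed or y_dir_changed or magnitude_changed
```

In this sketch a frame whose dominant motion keeps its direction but whose magnitude jumps by more than the threshold is still reported as a key frame, which corresponds to the "sudden speed change" case mentioned later in paragraph [75].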
[60] In step S308, it is determined whether the category of the key frame is the first class. In this embodiment, the categories of key frames are the first-class key frame, the second-class key frame and the third-class key frame, where the first class is the excellent grade, the second class is the good grade and the third class is the general grade. The category of a key frame, i.e. its grade, is determined by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed.
[61] If the category of the key frame is judged to be the first class, step S316 is performed; otherwise step S310 is performed. [62] In this embodiment, specifically: if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, that the direction of the y component has likewise changed, i.e. from +1 to -1 or from -1 to +1, and that the magnitude of the motion vector of the feature vector set of the k-th frame differs from that of the (k-1)-th frame by more than a predetermined threshold, the k-th frame belongs to the first class of key frames, that is, it is graded as excellent.
[63] In step S310, it is determined whether the category of the key frame is the second class. If so, step S316 is performed; otherwise step S312 is performed.
[64] In this embodiment, specifically: if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, that the direction of the y component has likewise changed, i.e. from +1 to -1 or from -1 to +1, and that the magnitudes of the two motion vectors differ by no more than the predetermined threshold, the k-th frame belongs to the second class of key frames, that is, it is graded as good.
[65] In other embodiments of the present invention, if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, and that the magnitudes of the two motion vectors differ by more than the predetermined threshold, the k-th frame belongs to the second class of key frames, that is, it is graded as good.
[66] In other embodiments of the present invention, if it is determined that the direction of the y component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, and that the magnitudes of the two motion vectors differ by more than the predetermined threshold, the k-th frame belongs to the second class of key frames, that is, it is graded as good.
[67] In other embodiments of the present invention, if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from 0 to non-zero or from non-zero to 0, and that the direction of the y component has likewise changed, i.e. from 0 to non-zero or from non-zero to 0, the k-th frame belongs to the second class of key frames, that is, it is graded as good.
[68] In step S312, it is determined whether the category of the key frame is the third class. If so, step S316 is performed; otherwise step S314 is performed.
[69] In this embodiment, specifically: if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, and that the magnitudes of the two motion vectors differ by no more than the predetermined threshold, the k-th frame belongs to the third class of key frames, that is, it is graded as general.
[70] In other embodiments of the present invention, if it is determined that the direction of the y component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from +1 to -1 or from -1 to +1, and that the magnitudes of the two motion vectors differ by no more than the predetermined threshold, the k-th frame belongs to the third class of key frames, that is, it is graded as general.
[71] In other embodiments of the present invention, if it is determined that the direction of the x component of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, i.e. from 0 to non-zero or from non-zero to 0, or that the direction of the y component has changed, i.e. from 0 to non-zero or from non-zero to 0, the k-th frame belongs to the third class of key frames, that is, it is graded as general. [72] In step S314, the ungraded key frame is transmitted to the user terminal device 30.
[73] In step S316, the key frame of the determined category is transmitted to the user terminal device 30.
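Tying steps S300 to S316 together, a simplified driver might look like the sketch below; it reuses the illustrative helpers feature_vector_set, is_key_frame and classify_key_frame from the earlier sketches and is not the claimed implementation:

```python
def extract_and_grade_key_frames(per_frame_motion_vectors, threshold=60):
    """per_frame_motion_vectors: one list of (x, y) motion vectors per frame.

    Returns a list of (frame_index, grade) for the detected key frames.
    """
    key_frames = []
    prev_vec = None
    for k, vectors in enumerate(per_frame_motion_vectors):
        feature_set = feature_vector_set(vectors)            # step S304
        curr_vec = feature_set[0] if feature_set else (0, 0)  # representative vector
        if prev_vec is not None and is_key_frame(prev_vec, curr_vec, threshold):  # S306
            grade = classify_key_frame(prev_vec, curr_vec, threshold)             # S308-S312
            key_frames.append((k, grade))                                         # S314/S316
        prev_vec = curr_vec
    return key_frames
```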
[74] The "predetermined threshold" in the above embodiments of the present invention may be a constant value, or a value that varies as the scene changes.
[75] The video service device 20, the video service system 10 and the key frame extraction method provided by the present invention use the motion vectors of a frame to obtain a feature vector set of the motion vectors, and extract key frames by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed. Frames in which the speed changes suddenly, and frames in which the speed is constant but the direction changes, can therefore be extracted effectively, which lowers the error rate and complexity of key frame extraction and reduces the amount of computation. Furthermore, by grading the key frames according to the direction and magnitude of the motion vector, non-key frames can be discarded first when the network communication quality is poor, and lower-grade key frames can be discarded if the communication quality deteriorates further, so that the information the user is interested in is better protected.
[76] From the description of the above embodiments, a person skilled in the art will clearly understand that the present invention can be implemented in hardware, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any change or replacement that can readily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.

Claims

[1] A method for extracting a key frame, comprising:
obtaining a feature vector set of the motion vector of each frame according to the motion vector of each frame in an acquired video data stream;
determining whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed; and
extracting a key frame from the video data stream using the result of the determination of whether a change has occurred.
[2] The method for extracting a key frame according to claim 1, wherein obtaining the feature vector set of the motion vector of each frame according to the motion vector of each frame in the acquired video data stream comprises:
forming, for each frame, at least one motion vector set from the motion vectors of that frame that have the same motion vector value, and taking the motion vector set of each frame that contains the largest number of motion vectors as the feature vector set of the motion vectors of that frame.
[3] The method for extracting a key frame according to claim 2, wherein obtaining the feature vector set of the motion vector of each frame according to the motion vector of each frame in the acquired video data stream further comprises:
decomposing the motion vector of each frame in the acquired video data stream into an x component in the x-axis direction and a y component in the y-axis direction; and
extracting, for each frame, the x component value shared by the largest number of motion vectors and, under the condition that the x component has that value, extracting the most frequent value of the corresponding y components, the motion vectors formed by that x component and that y component constituting the feature vector set of the motion vectors of the frame; or extracting, for each frame, the y component value shared by the largest number of motion vectors and, under the condition that the y component has that value, extracting the most frequent value of the corresponding x components, the motion vectors formed by that y component and that x component constituting the feature vector set of the motion vectors of the frame.
[4] The method for extracting a key frame according to any one of claims 1 to 3, wherein determining whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed, and extracting the key frame from the video data stream using the result of the determination, comprises:
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the x component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, or if it is determined that the y component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the y component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, extracting the k-th frame as a key frame; or
if it is determined that the magnitude of the motion vector corresponding to the feature vector set of the k-th frame differs from the magnitude of the motion vector corresponding to the feature vector set of the (k-1)-th frame by more than a predetermined threshold, extracting the k-th frame as a key frame, the magnitude of the motion vector being represented by the sum of the size of the x component and the size of the y component.
[5] The method for extracting a key frame according to any one of claims 1 to 3, further comprising:
determining the category of the extracted key frame by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed, wherein the categories of key frames are a first-class key frame, a second-class key frame and a third-class key frame.
[6] The method for extracting a key frame according to claim 5, wherein determining the category of the extracted key frame by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed comprises:
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the x component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, that the y component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the y component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, and that the magnitude of the motion vector corresponding to the feature vector set of the k-th frame differs from the magnitude of the motion vector corresponding to the feature vector set of the (k-1)-th frame by more than a predetermined threshold, determining that the k-th frame is a first-class key frame;
or,
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, that the y component direction has likewise changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by no more than a predetermined threshold, determining that the k-th frame is a second-class key frame;
or,
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by more than a predetermined threshold, determining that the k-th frame is a second-class key frame;
or,
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame and that the y component direction has likewise changed relative to that of the (k-1)-th frame, determining that the k-th frame is a second-class key frame;
or,
if it is determined that the x component direction of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by no more than a predetermined threshold, determining that the k-th frame is a third-class key frame;
or,
if it is determined that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame, determining that the k-th frame is a third-class key frame.
[7] A video service device, comprising:
a video key frame extraction module, configured to obtain a feature vector set of the motion vector of each frame according to the motion vector of each frame in an acquired video data stream, to determine whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed, and to extract a key frame from the video data stream using the result of the determination of whether a change has occurred.
[8] The video service device according to claim 7, wherein:
the video key frame extraction module is configured to form, for each frame, at least one motion vector set from the motion vectors of that frame that have the same motion vector value, and to take the motion vector set of each frame that contains the largest number of motion vectors as the feature vector set of the motion vectors of that frame, wherein the direction of the motion vector is considered along the x-axis direction and the y-axis direction, and the magnitude of the motion vector is represented by the sum of the size of the x component and the size of the y component.
[9] The video service device according to claim 8, wherein:
the video key frame extraction module is configured to extract the k-th frame as a key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the x component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, or that the y component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the y component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame;
or,
to extract the k-th frame as a key frame when it determines that the magnitude of the motion vector corresponding to the feature vector set of the k-th frame differs from the magnitude of the motion vector corresponding to the feature vector set of the (k-1)-th frame by more than a predetermined threshold.
[10] The video service device according to claim 7, 8 or 9, wherein the video key frame extraction module is further configured, after extracting a key frame, to determine the category of the extracted key frame by judging whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed, the categories of key frames being a first-class key frame, a second-class key frame and a third-class key frame.
[11] The video service device according to claim 10, wherein:
the video key frame extraction module is further configured to determine that the k-th frame is a first-class key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the x component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, that the y component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to the y component direction of the motion vector corresponding to the feature vector set of the (k-1)-th frame, and that the magnitude of the motion vector corresponding to the feature vector set of the k-th frame differs from the magnitude of the motion vector corresponding to the feature vector set of the (k-1)-th frame by more than a predetermined threshold;
or,
the video key frame extraction module is further configured to determine that the k-th frame is a second-class key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, that the y component direction has likewise changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by no more than a predetermined threshold;
or,
the video key frame extraction module is further configured to determine that the k-th frame is a second-class key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by more than a predetermined threshold;
or,
the video key frame extraction module is further configured to determine that the k-th frame is a second-class key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame and that the y component direction has likewise changed relative to that of the (k-1)-th frame;
or,
the video key frame extraction module is further configured to determine that the k-th frame is a third-class key frame when it determines that the x component direction of the motion vector of the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame, and that the magnitudes of the two motion vectors differ by no more than a predetermined threshold;
or,
the video key frame extraction module is further configured to determine that the k-th frame is a third-class key frame when it determines that the x component direction of the motion vector corresponding to the feature vector set of the k-th frame has changed relative to that of the (k-1)-th frame, or that the y component direction has changed relative to that of the (k-1)-th frame.
[12] A video service system, comprising a user terminal device and the video service device according to any one of claims 7 to 11, wherein the video service device obtains a feature vector set of the motion vector of each frame according to the motion vector of each frame in an acquired video data stream, determines whether the direction and the magnitude of the motion vector corresponding to the feature vector sets of two adjacent frames have changed, extracts a key frame from the video data stream using the result of the determination of whether a change has occurred, and then provides the key frame to the user terminal device.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810067177.8 2008-05-13
CN 200810067177 CN101582063A (en) 2008-05-13 2008-05-13 Video service system, video service device and extraction method for key frame thereof

Publications (1)

Publication Number Publication Date
WO2009138037A1 true WO2009138037A1 (en) 2009-11-19

Family

ID=41318372

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/071783 WO2009138037A1 (en) 2008-05-13 2009-05-13 Video service system, video service apparatus and extracting method of key frame thereof

Country Status (2)

Country Link
CN (1) CN101582063A (en)
WO (1) WO2009138037A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1898948A (en) * 2003-12-23 2007-01-17 皇家飞利浦电子股份有限公司 Method and system for stabilizing video data
CN1842165A (en) * 2005-03-31 2006-10-04 株式会社东芝 Method and apparatus for generating interpolation frame

Cited By (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10467872B2 (en) 2014-07-07 2019-11-05 Google Llc Methods and systems for updating an event timeline with event indicators
US11011035B2 (en) 2014-07-07 2021-05-18 Google Llc Methods and systems for detecting persons in a smart home environment
US9213903B1 (en) 2014-07-07 2015-12-15 Google Inc. Method and system for cluster-based video monitoring and event categorization
US9224044B1 (en) * 2014-07-07 2015-12-29 Google Inc. Method and system for video zone monitoring
US9354794B2 (en) 2014-07-07 2016-05-31 Google Inc. Method and system for performing client-side zooming of a remote video feed
US9420331B2 (en) 2014-07-07 2016-08-16 Google Inc. Method and system for categorizing detected motion events
US9449229B1 (en) 2014-07-07 2016-09-20 Google Inc. Systems and methods for categorizing motion event candidates
US9479822B2 (en) 2014-07-07 2016-10-25 Google Inc. Method and system for categorizing detected motion events
US9489580B2 (en) 2014-07-07 2016-11-08 Google Inc. Method and system for cluster-based video monitoring and event categorization
US9501915B1 (en) 2014-07-07 2016-11-22 Google Inc. Systems and methods for analyzing a video stream
US9544636B2 (en) 2014-07-07 2017-01-10 Google Inc. Method and system for editing event categories
US9602860B2 (en) 2014-07-07 2017-03-21 Google Inc. Method and system for displaying recorded and live video feeds
US9158974B1 (en) 2014-07-07 2015-10-13 Google Inc. Method and system for motion vector-based video monitoring and event categorization
US9609380B2 (en) 2014-07-07 2017-03-28 Google Inc. Method and system for detecting and presenting a new event in a video feed
US9674570B2 (en) 2014-07-07 2017-06-06 Google Inc. Method and system for detecting and presenting video feed
US9672427B2 (en) 2014-07-07 2017-06-06 Google Inc. Systems and methods for categorizing motion events
US9779307B2 (en) 2014-07-07 2017-10-03 Google Inc. Method and system for non-causal zone search in video monitoring
US9886161B2 (en) 2014-07-07 2018-02-06 Google Llc Method and system for motion vector-based video monitoring and event categorization
US9940523B2 (en) 2014-07-07 2018-04-10 Google Llc Video monitoring user interface for displaying motion events feed
US10108862B2 (en) 2014-07-07 2018-10-23 Google Llc Methods and systems for displaying live video and recorded video
US10127783B2 (en) 2014-07-07 2018-11-13 Google Llc Method and device for processing motion events
US10140827B2 (en) 2014-07-07 2018-11-27 Google Llc Method and system for processing motion event notifications
US10452921B2 (en) 2014-07-07 2019-10-22 Google Llc Methods and systems for displaying video streams
US10192120B2 (en) 2014-07-07 2019-01-29 Google Llc Method and system for generating a smart time-lapse video clip
US11250679B2 (en) 2014-07-07 2022-02-15 Google Llc Systems and methods for categorizing motion events
US11062580B2 (en) 2014-07-07 2021-07-13 Google Llc Methods and systems for updating an event timeline with event indicators
US10180775B2 (en) 2014-07-07 2019-01-15 Google Llc Method and system for displaying recorded and live video feeds
US10977918B2 (en) 2014-07-07 2021-04-13 Google Llc Method and system for generating a smart time-lapse video clip
US10867496B2 (en) 2014-07-07 2020-12-15 Google Llc Methods and systems for presenting video feeds
US10789821B2 (en) 2014-07-07 2020-09-29 Google Llc Methods and systems for camera-side cropping of a video feed
US9170707B1 (en) 2014-09-30 2015-10-27 Google Inc. Method and system for generating a smart time-lapse video clip
USD782495S1 (en) 2014-10-07 2017-03-28 Google Inc. Display screen or portion thereof with graphical user interface
USD893508S1 (en) 2014-10-07 2020-08-18 Google Llc Display screen or portion thereof with graphical user interface
US11599259B2 (en) 2015-06-14 2023-03-07 Google Llc Methods and systems for presenting alert event indicators
US11082701B2 (en) 2016-05-27 2021-08-03 Google Llc Methods and devices for dynamic adaptation of encoding bitrate for video streaming
US11587320B2 (en) 2016-07-11 2023-02-21 Google Llc Methods and systems for person detection in a video feed
US10192415B2 (en) 2016-07-11 2019-01-29 Google Llc Methods and systems for providing intelligent alerts for events
US10657382B2 (en) 2016-07-11 2020-05-19 Google Llc Methods and systems for person detection in a video feed
US10380429B2 (en) 2016-07-11 2019-08-13 Google Llc Methods and systems for person detection in a video feed
US10957171B2 (en) 2016-07-11 2021-03-23 Google Llc Methods and systems for providing event alerts
US11035517B2 (en) 2017-05-25 2021-06-15 Google Llc Compact electronic device with thermal management
US10972685B2 (en) 2017-05-25 2021-04-06 Google Llc Video camera assembly having an IR reflector
US11156325B2 (en) 2017-05-25 2021-10-26 Google Llc Stand assembly for an electronic device providing multiple degrees of freedom and built-in cables
US11689784B2 (en) 2017-05-25 2023-06-27 Google Llc Camera assembly having a single-piece cover element
US11680677B2 (en) 2017-05-25 2023-06-20 Google Llc Compact electronic device with thermal management
US11353158B2 (en) 2017-05-25 2022-06-07 Google Llc Compact electronic device with thermal management
US10685257B2 (en) 2017-05-30 2020-06-16 Google Llc Systems and methods of person recognition in video streams
US11386285B2 (en) 2017-05-30 2022-07-12 Google Llc Systems and methods of person recognition in video streams
US11783010B2 (en) 2017-05-30 2023-10-10 Google Llc Systems and methods of person recognition in video streams
US11710387B2 (en) 2017-09-20 2023-07-25 Google Llc Systems and methods of detecting and responding to a visitor to a smart home environment
US11356643B2 (en) 2017-09-20 2022-06-07 Google Llc Systems and methods of presenting appropriate actions for responding to a visitor to a smart home environment
US11256908B2 (en) 2017-09-20 2022-02-22 Google Llc Systems and methods of detecting and responding to a visitor to a smart home environment
US10664688B2 (en) 2017-09-20 2020-05-26 Google Llc Systems and methods of detecting and responding to a visitor to a smart home environment
US11893795B2 (en) 2019-12-09 2024-02-06 Google Llc Interacting with visitors of a connected home environment
CN113542868A (en) * 2021-05-26 2021-10-22 浙江大华技术股份有限公司 Video key frame selection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN101582063A (en) 2009-11-18

Similar Documents

Publication Publication Date Title
WO2009138037A1 (en) Video service system, video service apparatus and extracting method of key frame thereof
CN104618803B (en) Information-pushing method, device, terminal and server
US11023618B2 (en) Systems and methods for detecting modifications in a video clip
CN104754413B (en) Method and apparatus for identifying television signals and recommending information based on image search
US11748870B2 (en) Video quality measurement for virtual cameras in volumetric immersive media
JP2016527791A (en) Image processing method and apparatus
CN101273635A (en) Apparatus and method for encoding and decoding multi-view picture using camera parameter, and recording medium storing program for executing the method
EP1938208A1 (en) Face annotation in streaming video
CN104170374A (en) Modifying an appearance of a participant during a video conference
JP2014515225A (en) Target object-based image processing
WO2006025272A1 (en) Video classification device, video classification program, video search device, and videos search program
EP3513326B1 (en) Methods, systems, and media for detecting stereoscopic videos by generating fingerprints for multiple portions of a video frame
CN110312138B (en) High-embedding-capacity video steganography method and system based on time sequence residual convolution modeling
CN104378635B (en) The coding method of video interested region based on microphone array auxiliary
JP2013093840A (en) Apparatus and method for generating stereoscopic data in portable terminal, and electronic device
CN114641998A (en) Method and apparatus for machine video encoding
CN107277557B (en) Video segmentation method and system
US20240214443A1 (en) Methods, systems, and media for selecting video formats for adaptive video streaming
CN111615008B (en) Intelligent abstract generation and subtitle reading system based on multi-device experience
JP5880558B2 (en) Video processing system, viewer preference determination method, video processing apparatus, control method thereof, and control program
CN107733874A (en) Information processing method, device, computer equipment and storage medium
CN113395583A (en) Watermark detection method, watermark detection device, computer equipment and storage medium
JP2018206292A (en) Video summary creation device and program
Dedhia et al. Saliency prediction for omnidirectional images considering optimization on sphere domain
WO2021129444A1 (en) File clustering method and apparatus, and storage medium and electronic device

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09745421

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 09745421

Country of ref document: EP

Kind code of ref document: A1