CN103020138A - Method and device for video retrieval - Google Patents

Info

Publication number
CN103020138A
Authority
CN
China
Prior art keywords
video data
compressed video
obtains
motion
textural characteristics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104761657A
Other languages
Chinese (zh)
Inventor
宗竞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Original Assignee
JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Priority to CN2012104761657A
Publication of CN103020138A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for video retrieval. The method for video retrieval comprises the following steps: acquiring texture features of compressed video data; acquiring motion features of compressed video data; and performing similarity measure according to the feature fusion of the acquired texture features of the compressed video data and the acquired motion features of the compressed video data to judge the relevance of the compressed video data. The method and the device perform video retrieval based on compressed domain features, so as to improve the processing efficiency of video retrieval.

Description

Method and device for video retrieval
Technical field
The present invention relates to network technology, and in particular to a method and device for video retrieval.
Background art
With the rapid development of multimedia computing and the continuous improvement of network transmission technology, the amount of multimedia data that people can access has grown sharply. Video is the most complex media format among multimedia data. Thanks to its diverse forms of expression, rich semantic content, and convenient means of recording, it has been widely applied and developed.
Video retrieval means finding the required video clips within a large volume of video data. Automatically locating the required video clips from a supplied example or feature description constitutes content-based video retrieval.
The purpose of research on content-based video analysis and retrieval is to establish structure and indexes by processing, analyzing, and understanding video content by computer, so that video information can be obtained conveniently and effectively. Retrieval is performed over large-scale video data according to the content of the video and its contextual relationships. Content-based video retrieval involves many techniques, such as analysis of video structure (shot detection), automatic indexing of video data, and video clustering.
At present, research on content-based video retrieval, apart from recognizing and describing the color, texture, shape, and spatial relationships of images, concentrates mainly on video shot segmentation, feature extraction and description (including visual features such as color, texture, and shape, as well as motion information and object information), key-frame extraction, and structure analysis.
Depending on the video content submitted as a query, video retrieval is generally divided into shot retrieval and clip retrieval. In general, a clip is comparable to a scene: it also consists of a series of semantically related shots, the difference being that a clip may be part or all of a complete scene. Most current video retrieval research still focuses on shot retrieval, while research on clip retrieval has only just begun. In fact, from the user's point of view, queries to a video database are usually video clips and seldom single physical shots. In terms of information content, a video clip made up of several shots carries more semantics than a single shot and can represent an event of interest to the user, so the query results are also more meaningful. Examples include retrieving an event of interest in the news, a favorite plot in a film, or a favorite sport in a sports program, or checking whether a television station has broadcast a particular advertisement.
Most existing video retrieval systems process video data in decompressed form. Decompression requires a certain amount of computation time and a corresponding computational load, which lowers the processing efficiency of the video retrieval system.
Therefore, a video retrieval method and device based on compressed-domain features are needed to improve the processing efficiency of video retrieval.
Summary of the invention
According to one aspect of the present invention, a video retrieval method is provided, comprising the steps of: obtaining texture features of compressed video data; obtaining motion features of the compressed video data; and performing a similarity measurement based on a fusion of the obtained texture features and the obtained motion features, so as to judge the relevance of the compressed video data.
The compressed video data is a compressed video stream conforming to the MPEG-2 standard.
The step of obtaining the texture features of the compressed video data comprises extracting key frames of the compressed video and obtaining the texture features of those key frames.
Obtaining the motion features of the compressed video comprises: extracting the motion vector field and the DCT residual coefficient matrix from the compressed video data; performing a global motion analysis based on a four-parameter model to obtain the camera motion parameters and reliable background macroblocks; and performing motion compensation to obtain the absolute motion vector of each macroblock.
According to another aspect of the present invention, a video retrieval device is provided, comprising: a first feature acquisition module for obtaining texture features of compressed video data; a second feature acquisition module for obtaining motion features of the compressed video data; and a judgment module for performing a similarity measurement based on a fusion of the texture features obtained by the first feature acquisition module and the motion features obtained by the second feature acquisition module, so as to judge the relevance of the compressed video data.
Because the video retrieval method and device of the present invention perform retrieval based on compressed-domain features, the processing efficiency of video retrieval can be improved.
Description of drawings
Fig. 1 is a flowchart of the video retrieval method according to an embodiment of the present invention; and
Fig. 2 is a block diagram of the video retrieval device according to an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the video retrieval method and device of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that the following description is merely illustrative and does not limit the invention; those skilled in the art may make many variations on the basis of the disclosure below, all of which fall within the protection scope of the present invention.
In view of the processing-efficiency problem in existing video retrieval technology, embodiments of the present invention provide a content-based video retrieval solution: texture features and motion features are first extracted directly from the compressed video and these content feature values are analyzed; fuzzy decision theory is then introduced to fuse the features and perform video retrieval.
Digitizing video images generates a large amount of digital information. For example, one frame of a 720×576 digital image in 16-bit color occupies 1.35 MB of storage, so full-motion video (25 frames per second) requires a bandwidth of 33.75 MB per second. At this rate, a commonly used CD-R disc can store only 16 seconds of such video. For practical applications, the video signal must therefore be compressed.
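The arithmetic in this paragraph can be checked with a short sketch. The function names are illustrative only, and the disc capacity is an assumption: the text does not state which capacity it uses, and 16 seconds at 33.75 MB per second corresponds to 540 MB.

```python
def raw_video_rate(frame_mb, fps):
    """Uncompressed data rate in MB per second."""
    return frame_mb * fps


def seconds_on_disc(capacity_mb, frame_mb, fps):
    """Duration of uncompressed video that fits on a disc of the given capacity."""
    return capacity_mb / raw_video_rate(frame_mb, fps)


# Figures quoted in the text: 1.35 MB per frame at 25 frames per second.
rate = raw_video_rate(1.35, 25)

# Capacity implied by the 16-second figure (an inferred value, not from the text).
capacity_for_16_s = 16 * rate
```

Calling `seconds_on_disc` with a different capacity shows how sensitive the playable duration is to the assumed disc size.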
Video compression is divided into lossless compression and lossy compression.
Lossless compression means that the original data can be recovered exactly when the compressed file is played back. It is commonly used for compressing data files, for example ZIP files. The algorithms commonly used for lossless compression are Huffman coding and variable run-length coding. Huffman coding gathers statistics on the probability of each codeword and assigns shorter codes to more frequent symbols, which reduces the average code length and thus compresses the data. This algorithm requires the probabilities of the color values in the image to be counted in advance, the coding scheme differs for every image, and the coding efficiency is not high. Variable run-length coding uses a pair of parameters, color and length, to replace a run of identical color values stored consecutively, reducing the storage occupied by repeated colors. This algorithm is very useful for compressing black-and-white pictures but is impractical for moving color images: it is too strongly affected by image complexity, so the compression ratio is low and rarely exceeds 3:1.
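The two lossless schemes described above can be illustrated with a minimal sketch. The function names are hypothetical, and the code operates on a plain list of pixel values rather than on any real image file format; the Huffman routine returns only code lengths, which is enough to show that frequent symbols receive shorter codes.

```python
import heapq
from collections import Counter
from itertools import groupby


def rle_encode(pixels):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in pairs:
        out.extend([value] * length)
    return out


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


row = [0, 0, 0, 0, 255, 255, 0, 0, 0]
encoded = rle_encode(row)
restored = rle_decode(encoded)
```

Run-length coding shrinks `row` only because it contains long runs; on a row where neighboring pixels differ, the list of pairs would be longer than the input, which is the weakness noted above for complex color images.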
Lossy compression algorithms reduce the space occupied by a digital image by discarding a large amount of redundant information. The original image cannot be recovered completely on playback; some detail is selectively lost, and how much is lost depends on the compression ratio required. For a given compression algorithm, the higher the required compression ratio, the more image information is lost. The approach generally adopted is transform coding plus motion detection. The transform codings in general use are the DCT (discrete cosine transform) and the wavelet transform, and motion detection uses block-search algorithms. Other coding algorithms also exist, such as object-based coding, model-based coding, and fractal coding. Compression standards now in use, such as MPEG and H.263, are all based on transform coding plus motion detection and are all lossy.
The MPEG series is currently the most widely used family of compression standards. MPEG (Moving Picture Experts Group) is an expert group set up jointly in 1988 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It is responsible for developing standards for the coding, decoding, and synchronization of television image data and audio data.
Within the MPEG series, the MPEG-2 standard is the high-quality image and audio coding standard directly related to digital television broadcasting. MPEG-2 is an extension of MPEG-1: its basic coding algorithm is the same as that of MPEG-1, but MPEG-2 adds many functions that MPEG-1 lacks. For example: motion-vector accuracy is raised to half a pixel; error redundancy is extended through special vectors in key frames; the precision of the discrete cosine transform is selectable; advanced prediction modes are provided; quality scalability allows images of different quality within the same video stream; variable bit rate (VBR) is supported, providing bit-rate scalability; and coding of interlaced television is added.
The MPEG-2 systems specification mainly defines how television image data, audio data, and other data are combined into one or more data streams suitable for storage or transmission. There are two forms of data stream: the program stream (Program Stream, PS) and the transport stream (Transport Stream, TS). A program stream is generated by combining one or more packetized elementary streams (Packetized Elementary Streams, PES) and is used in environments with relatively few errors, where it suits applications processed in software. A transport stream is likewise generated by combining one or more PES, but is used in environments with relatively many errors, for example lossy or noisy transmission systems.
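As an illustration of the transport stream format mentioned above, the following sketch reads the fixed 4-byte header of a single TS packet. The 188-byte packet size, the 0x47 sync byte, and the bit layout follow the MPEG-2 systems specification; the function name and the returned dictionary keys are illustrative choices and are not part of the patent.

```python
TS_PACKET_SIZE = 188  # fixed transport stream packet length in bytes
SYNC_BYTE = 0x47      # first byte of every transport stream packet


def parse_ts_header(packet):
    """Decode the 4-byte header of one MPEG-2 transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # The 13-bit packet identifier spans the low 5 bits of byte 1 and all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }
```

The short fixed-size packets and the per-packet continuity counter are what make the transport stream suitable for the error-prone channels described above, since a receiver can resynchronize on the sync byte and detect lost packets.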
The video retrieval method and device according to the present invention are further described below in the context of an MPEG-2 video stream.
Fig. 1 is a flowchart of the video retrieval method according to an embodiment of the present invention.
As shown in Fig. 1, in step S101 the video retrieval method according to an embodiment of the present invention obtains the texture features of the compressed video data.
To obtain the texture features of the video data, video key frames are first extracted in the compressed domain. Video data is unordered and unstructured, and key-frame extraction makes it possible to organize, manage, index, and query such unstructured data effectively. Traditional key-frame extraction operates in the pixel domain and does not meet the needs of the present invention. Compressed-domain key-frame extraction, by contrast, is fast, uses few resources, and is time-efficient, and has become the preferred technique for structuring video.
Several compressed-domain key-frame extraction techniques have been proposed in the prior art. For example, the MPEG compressed video file is first partially decoded and the bitstream information is read out; the luminance DC coefficients of each I-frame are extracted as an image feature vector; the similarity between the feature vectors of adjacent I-frames is expressed as a Euclidean distance; and key frames are then identified by applying an adaptive threshold.
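The key-frame selection just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular prior-art system: the luminance DC coefficients are assumed to have already been read from the bitstream into one list per I-frame, and the adaptive threshold is assumed to be the mean of the adjacent-frame distances plus `k` standard deviations, a rule the text does not specify.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_key_frames(dc_vectors, k=1.0):
    """Return indices of I-frames chosen as key frames.

    dc_vectors: one luminance DC feature vector per I-frame (assumed input).
    k: hypothetical tuning parameter for the adaptive threshold.
    """
    if not dc_vectors:
        return []
    dists = [euclidean(dc_vectors[i - 1], dc_vectors[i])
             for i in range(1, len(dc_vectors))]
    if not dists:
        return [0]
    # Assumed adaptive rule: the threshold follows the statistics of this clip.
    threshold = mean(dists) + k * pstdev(dists)
    keys = [0]  # the first I-frame always opens a segment
    for i, d in enumerate(dists, start=1):
        if d > threshold:
            keys.append(i)
    return keys
```

Because the threshold is derived from the clip itself, a clip with generally large frame-to-frame changes needs a proportionally larger jump before a new key frame is declared.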
Next, the texture features of the key frames are obtained. The texture of an image is an image characteristic that is quantified in image computation. Image texture describes the spatial distribution of color and intensity in an image or in a small region of it. Texture feature extraction is divided into structure-based methods and statistics-based methods. Structure-based methods model the texture to be detected and search the image for repeated patterns. Existing texture feature extraction methods include the LBP (local binary patterns) method and the gray-level co-occurrence matrix method.
The LBP method extracts an LBP feature vector as follows. First, the detection window is divided into small regions (cells) of 16×16 pixels. Each pixel in a cell is compared, clockwise or counterclockwise, with the 8 points in its ring neighborhood (a ring neighborhood with a different number of points may also be used, as in the three neighborhood examples of Fig. 3-4): if the center pixel value is greater than that of the neighbor, the neighbor is assigned 1; otherwise it is assigned 0. Each pixel thus yields an 8-bit binary number, which is usually converted to decimal. Next, the histogram of each cell is computed, that is, the frequency with which each number (taken as decimal) occurs, which is a tally of the binary sequences recording how each pixel compares with the points in its neighborhood; the histogram is then normalized. Finally, the histograms of all cells are concatenated to give the LBP texture feature of the whole image, which can then be classified with an SVM or another machine learning algorithm.
The gray-level co-occurrence matrix is another texture feature extraction method. First, a direction (orientation) and a step size (step) in pixels are defined for an image. The gray-level co-occurrence matrix T (N×N) is then defined so that its entry T(i, j) is the frequency with which a pixel of gray level i occurs at a point while a pixel of gray level j occurs at the point one step away along the defined direction, where N is the number of gray levels. Because the co-occurrence matrix is defined for a combination of direction and step, one factor determining the frequencies is the number of pixels that contribute to the matrix; this number is smaller than the total number of pixels and decreases as the step increases. The resulting co-occurrence matrix is therefore sparse, so the number of gray levels N is usually reduced to 8. If, in the horizontal direction, pixels to both the left and the right are counted, the result is a symmetric co-occurrence matrix. Similarly, if only pixels in one direction from the current pixel (left or right) are considered, the result is called an asymmetric co-occurrence matrix.
As shown in Fig. 1, in step S102 the video retrieval method according to an embodiment of the present invention obtains the motion features of the compressed video data. Extraction of moving objects is an important part of video analysis. Traditional extraction methods work in the pixel domain; applying them to video stored in compressed form requires the compressed bitstream to be decoded first, which takes considerable time. To increase speed, the prior art already includes methods that, by analyzing the characteristics of the MPEG bitstream, extract moving objects directly in the compressed domain. For example, the motion vector field and the DCT residual coefficient matrix are first extracted from the compressed bitstream. Second, a global motion analysis based on a four-parameter model is performed to obtain the camera motion parameters and reliable background macroblocks, and motion compensation is applied to obtain the absolute motion vector of each macroblock. Then, motion detection based on fourth-order moments yields candidate regions in which moving objects are present. A correlation measure of motion magnitude and angle between macroblocks is defined from the motion-correlation characteristics; the candidate regions are scanned, the macroblocks that satisfy the threshold condition are clustered, and the clustering result is corrected using the residual DCT coefficients to complete the segmentation of moving targets. Finally, post-processing is applied to further improve segmentation accuracy.
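The global motion step can be illustrated with a least-squares sketch. The text does not give the form of the four-parameter model, so the code assumes one common choice in which the camera-induced motion vector (u, v) of a macroblock centered at (x, y) is u = p0·x − p1·y + p2 and v = p1·x + p0·y + p3 (zoom, rotation, and two translations). All function names are hypothetical, and the selection of reliable background macroblocks by iterative outlier rejection is omitted.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting (a must be nonsingular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_global_motion(blocks):
    """Least-squares fit of [p0, p1, p2, p3] from blocks given as (x, y, u, v)."""
    ata = [[0.0] * 4 for _ in range(4)]
    atb = [0.0] * 4
    for x, y, u, v in blocks:
        # Each macroblock contributes one equation for u and one for v.
        for row, rhs in (([x, -y, 1.0, 0.0], u), ([y, x, 0.0, 1.0], v)):
            for i in range(4):
                atb[i] += row[i] * rhs
                for j in range(4):
                    ata[i][j] += row[i] * row[j]
    return solve_linear(ata, atb)


def compensate(blocks, p):
    """Subtract the modeled camera motion, leaving each macroblock's own motion."""
    out = []
    for x, y, u, v in blocks:
        gu = p[0] * x - p[1] * y + p[2]
        gv = p[1] * x + p[0] * y + p[3]
        out.append((u - gu, v - gv))
    return out
```

After compensation, background macroblocks have residual vectors close to zero, while macroblocks on moving objects retain a nonzero vector, which is what the subsequent detection and clustering steps operate on.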
In step S103, the video retrieval method according to an embodiment of the present invention performs a similarity measurement based on a fusion of the texture features obtained in step S101 and the motion features obtained in step S102, so as to judge the relevance of the compressed video data.
With the research and development of image fusion technology, the advantages of feature fusion for image similarity measurement have become increasingly apparent. Each single image feature reflects the attributes of an image from a different angle. Feature fusion can exploit the useful information in multiple features while reducing, to some extent, the interference of subjective and objective factors, which makes it a very valuable approach. The weight of each feature can be set so as to achieve satisfactory retrieval results.
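One simple way to realize weighted feature fusion is sketched below. It is an assumption-laden illustration rather than the fusion rule of the invention: the text does not specify the per-feature similarity or the fuzzy decision step, so cosine similarity and a normalized weighted sum are used here, and the dictionary keys `texture` and `motion`, the function names, and the default weights are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def fused_similarity(query, candidate, w_texture=0.5, w_motion=0.5):
    """Weighted combination of texture similarity and motion similarity."""
    s_texture = cosine_similarity(query["texture"], candidate["texture"])
    s_motion = cosine_similarity(query["motion"], candidate["motion"])
    return (w_texture * s_texture + w_motion * s_motion) / (w_texture + w_motion)


def rank(query, database, **weights):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(database,
                  key=lambda name: fused_similarity(query, database[name], **weights),
                  reverse=True)
```

Raising `w_motion` relative to `w_texture` favors clips whose motion resembles the query even when their textures differ, which is the kind of weight adjustment the paragraph above refers to.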
Fig. 2 is a block diagram of the video retrieval device of the present invention. As shown in Fig. 2, the video retrieval device comprises: a first feature acquisition module 201 for obtaining texture features of compressed video data; a second feature acquisition module 202 for obtaining motion features of the compressed video data; and a judgment module 203 for performing a similarity measurement based on a fusion of the texture features obtained by the first feature acquisition module 201 and the motion features obtained by the second feature acquisition module 202, so as to judge the relevance of the compressed video data.
In summary, the video retrieval method and device according to embodiments of the present invention perform video retrieval based on compressed-domain features and can therefore improve the processing efficiency of video retrieval.

Claims (5)

1. A video retrieval method, characterized by comprising the steps of:
obtaining texture features of compressed video data (S101);
obtaining motion features of the compressed video data (S102); and
performing a similarity measurement based on a fusion of the obtained texture features and the obtained motion features to judge the relevance of the compressed video data (S103).
2. The video retrieval method of claim 1, wherein the compressed video data is a compressed video stream conforming to the MPEG-2 standard.
3. The video retrieval method of claim 1 or 2, wherein the step of obtaining the texture features of the compressed video data comprises extracting key frames of the compressed video and obtaining the texture features of those key frames.
4. The video retrieval method of claim 1 or 2, wherein obtaining the motion features of the compressed video comprises extracting the motion vector field and the DCT residual coefficient matrix from the compressed video data, performing a global motion analysis based on a four-parameter model to obtain the camera motion parameters and reliable background macroblocks, and performing motion compensation to obtain the absolute motion vector of each macroblock.
5. A video retrieval device, characterized by comprising:
a first feature acquisition module (201) for obtaining texture features of compressed video data;
a second feature acquisition module (202) for obtaining motion features of the compressed video data; and
a judgment module (203) for performing a similarity measurement based on a fusion of the texture features obtained by the first feature acquisition module (201) and the motion features obtained by the second feature acquisition module (202) to judge the relevance of the compressed video data.
CN2012104761657A 2012-11-22 2012-11-22 Method and device for video retrieval Pending CN103020138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012104761657A CN103020138A (en) 2012-11-22 2012-11-22 Method and device for video retrieval

Publications (1)

Publication Number Publication Date
CN103020138A true CN103020138A (en) 2013-04-03

Family

ID=47968742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012104761657A Pending CN103020138A (en) 2012-11-22 2012-11-22 Method and device for video retrieval

Country Status (1)

Country Link
CN (1) CN103020138A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101064846A (en) * 2007-05-24 2007-10-31 上海交通大学 Time-shifted television video matching method combining program content metadata and content analysis
US20110208744A1 (en) * 2010-02-24 2011-08-25 Sapna Chandiramani Methods for detecting and removing duplicates in video search results

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张文琪: "基于H.264、AVC压缩域的视频运动目标检测", 《中国优秀硕士学位论文全文数据库信息科技辑》, 29 February 2012 (2012-02-29), pages 4 - 24 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376003A (en) * 2013-08-13 2015-02-25 深圳市腾讯计算机系统有限公司 Video retrieval method and device
CN104376003B (en) * 2013-08-13 2019-07-05 深圳市腾讯计算机系统有限公司 A kind of video retrieval method and device
CN103810467A (en) * 2013-11-01 2014-05-21 中南民族大学 Method for abnormal region detection based on self-similarity number encoding
CN103905824A (en) * 2014-03-26 2014-07-02 深圳先进技术研究院 Video semantic retrieval and compression synchronization camera system and method
CN106611043A (en) * 2016-11-16 2017-05-03 深圳百科信息技术有限公司 Video searching method and system
CN106611043B (en) * 2016-11-16 2020-07-03 深圳市梦网视讯有限公司 Video searching method and system
CN107194364A (en) * 2017-06-02 2017-09-22 重庆邮电大学 A kind of Huffman LBP Pose-varied face recognition methods based on divide-and-conquer strategy
CN107194364B (en) * 2017-06-02 2020-08-04 重庆邮电大学 Huffman-L BP multi-pose face recognition method based on divide and conquer strategy
CN110414335A (en) * 2019-06-20 2019-11-05 北京奇艺世纪科技有限公司 Video frequency identifying method, device and computer readable storage medium
CN111159443A (en) * 2019-12-31 2020-05-15 深圳云天励飞技术有限公司 Image characteristic value searching method and device and electronic equipment
CN111159443B (en) * 2019-12-31 2022-03-25 深圳云天励飞技术股份有限公司 Image characteristic value searching method and device and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130403